Word Puzzles with Neo4j and Py2neo

Grant Paton-Simpson, software developer at sofastatistics.com
Presented by Grant Paton-Simpson
Overview
● Brief look at graph databases & Neo4j
● Introduction to word transformation game
● Getting suitable words
● Adding words and relationships into Neo4j
● Querying graph data to generate puzzles
Graph Databases – a NoSQL option
http://neo4j.com/books/graph-databases/
NoSQL – when is it a good fit?
● SQL has its origins in the 1970s and may not be fresh and shiny any more but ...
● … we shouldn't choose NoSQL for reasons of fashion.
● Venerable SQL often a better choice for standard hierarchies, e.g. countries that have cities that have suburbs etc.
https://twitter.com/edd/status/400190499585544192
Graph Databases
● Graph databases much, much better for related data with:
– lots of different links between same nodes
– different numbers of links between nodes, e.g. 3 hops to one peer and 7 hops to another
– lots of peer-to-peer links
Substantial Benefits
● Massive performance benefits (going exponential as number of links grows)
● Structural harmony
– between structure of data and structure of data storage (what you draw on the whiteboard might look very similar to how your data is actually structured)
– between questions of data and query language used to answer them
Word transformations
● Start with one word and get to the other by single-letter transformations, word by word
● E.g. starting with “stores” get to “slaked”
– BTW there are 96 alternative ways in 5 moves or fewer
stores → stored → stared → staked → slaked
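The rule above can be sketched in a few lines of Python. This is only an illustration of the game's rule, not code from the talk; the function names `differs_by_one_letter` and `is_valid_ladder` are made up for this example.

```python
from typing import List


def differs_by_one_letter(a: str, b: str) -> bool:
    """True when a and b have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def is_valid_ladder(words: List[str]) -> bool:
    """True when every consecutive pair of words differs by one letter."""
    return all(differs_by_one_letter(a, b) for a, b in zip(words, words[1:]))


# The example ladder from the slide.
ladder = ["stores", "stored", "stared", "staked", "slaked"]
print(is_valid_ladder(ladder))
```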
Puzzle taster
Get from 'sloven' to 'closed' in no more than 5 steps (there are 10 unique solutions)
sloven → ? → closed
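Counting solutions like this amounts to listing every path between two words that stays within a step limit. The sketch below does that in plain Python over an in-memory adjacency map, as a stand-in for the graph query used in the talk. The helper names and the tiny word set are assumptions for illustration only; it does not reproduce the 'sloven' to 'closed' result, which needs the full word list.

```python
from collections import defaultdict


def build_adjacency(pairs):
    """Map each word to the set of words one letter away (undirected)."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def find_solutions(adjacency, start, end, max_steps):
    """Return every path from start to end using at most max_steps moves.

    A word is never revisited within one path.
    """
    solutions = []

    def extend(path):
        current = path[-1]
        if current == end:
            solutions.append(list(path))
            return
        if len(path) - 1 == max_steps:  # step budget used up
            return
        for neighbour in sorted(adjacency.get(current, ())):
            if neighbour not in path:
                path.append(neighbour)
                extend(path)
                path.pop()

    extend([start])
    return solutions


# Toy data: only the one-letter pairs from the "stores" example.
pairs = [("stores", "stored"), ("stored", "stared"),
         ("stared", "staked"), ("staked", "slaked")]
adjacency = build_adjacency(pairs)
for solution in find_solutions(adjacency, "stores", "slaked", max_steps=5):
    print(" -> ".join(solution))
```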
Getting a simple word list
● How hard could it be?
● Lesson #1 – Scrabble lists and similar are useless – only want lists with standard words, otherwise puzzles too hard
● Lesson #2 – have to decide about taboo/profane words
● Lesson #3 – the number of words affects the number of ONE_LETTER_DIFF relationships a lot
● Lesson #4 – clever optimisation not needed if restricting self to ordinary words
SCOWL (Spell Checker Oriented Word Lists) http://wordlist.aspell.net/
Filtering words
● Needed to turn é to e
● Needed to eliminate possessives, e.g. cat's (as used in the phrase “the cat's whiskers”)
● Needed to leave out capitalised words
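A minimal sketch of the three filters above, using only the Python standard library. It is not the code from the talk: the function names are invented here, and treating every word with an ASCII apostrophe as a possessive is a simplifying assumption (it also drops contractions such as "don't").

```python
import unicodedata


def strip_accents(word):
    """Turn accented letters into their plain form, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_words(raw_words):
    """Apply the three filters and return a sorted list without duplicates."""
    cleaned = set()
    for word in raw_words:
        word = word.strip()
        if not word or "'" in word:   # assumption: apostrophe means possessive
            continue
        if word != word.lower():      # leave out capitalised words
            continue
        word = strip_accents(word)
        if word.isalpha():            # skip anything with digits or punctuation
            cleaned.add(word)
    return sorted(cleaned)


print(clean_words(["café", "cat's", "Paris", "stores", "cafe"]))
```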
For each word, identifying words different by one letter only
Disclaimer: the code worked but probably some super-smart optimisations would be possible involving n-dimensional space or something
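The code shown on this slide did not survive in the text. As a stand-in, here is one common way to find one-letter-different pairs without comparing every word with every other word: replace each letter in turn with a wildcard and group words that share the resulting pattern. This is a swapped-in technique (wildcard bucketing), not the author's method, and it assumes the words never contain an underscore.

```python
from collections import defaultdict
from itertools import combinations


def one_letter_pairs(words):
    """Return sorted (a, b) pairs of words that differ in exactly one position."""
    buckets = defaultdict(list)
    for word in set(words):
        for i in range(len(word)):
            # e.g. "stored" -> "_tored", "s_ored", "st_red", ...
            pattern = word[:i] + "_" + word[i + 1:]
            buckets[pattern].append(word)

    pairs = set()
    for group in buckets.values():
        # Words sharing a pattern are identical except at the wildcard.
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return sorted(pairs)


for a, b in one_letter_pairs(["stores", "stored", "stared", "staked", "slaked", "closed"]):
    print(a, b)
```

Each pair is reported once, in alphabetical order, so the direction of the relationship still has to be decided when the pairs are loaded into the database.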
Adding data to Neo4j
● Create nodes and relationships
● Lots of room for optimisations
● Only need to build database once, so the 15-minute build time is not worth reducing
● My Neo4j and Py2neo knowledge is beginner level but I was able to solve my problem
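A rough sketch of what "create nodes and relationships" could look like as Cypher statements, built here as plain Python strings so that no particular driver API is assumed. The `name` property, the `$name` parameter style (older Neo4j versions used `{name}`) and the helper names are all assumptions for illustration; the statements would still need to be sent to Neo4j through Py2neo or another client.

```python
def node_statement(word):
    """Cypher that creates a Word node if it does not already exist."""
    return "MERGE (w:Word {name: $name})", {"name": word}


def relationship_statement(a, b):
    """Cypher that links two existing Word nodes with ONE_OFF."""
    query = (
        "MATCH (a:Word {name: $a}), (b:Word {name: $b}) "
        "MERGE (a)-[:ONE_OFF]->(b)"
    )
    return query, {"a": a, "b": b}


def build_statements(words, pairs):
    """All node statements first, then all relationship statements."""
    statements = [node_statement(w) for w in sorted(set(words))]
    statements += [relationship_statement(a, b) for a, b in pairs]
    return statements


for query, params in build_statements(["stores", "stored"], [("stored", "stores")]):
    print(query, params)
```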
Py2neo and Cypher
Cypher Syntax as ASCII Art (Really!)
[Diagram: two Word nodes joined by a ONE_OFF relationship]
(Word) -[ONE_OFF]->(Word)
How cool is this?
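The same ASCII-art pattern extends to whole paths. The sketch below builds a Cypher query for "all paths between two words in at most N hops" as a Python string. It is an assumption about how such a query could be written, not the query from the talk: the `name` property and the `$start`/`$end` parameters are hypothetical, and in real Cypher the label and relationship type carry a leading colon.

```python
def path_query(max_steps):
    """Build a Cypher query for paths of at most max_steps ONE_OFF hops.

    The hop limit is written into the query text rather than passed as a
    parameter, so it is validated as a positive integer first.
    """
    if not isinstance(max_steps, int) or max_steps < 1:
        raise ValueError("max_steps must be a positive integer")
    return (
        "MATCH p = (a:Word {name: $start})"
        f"-[:ONE_OFF*..{max_steps}]-"
        "(b:Word {name: $end}) "
        "RETURN p"
    )


print(path_query(5))
```

The pattern has no arrowhead, so relationships are followed in either direction, which suits a puzzle where a one-letter change works both ways.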
Example Output
Matching chart
Live Demo – Suggestions for Start Word
“sloven” to “closed” solution(s)
Resources
● Neo4j
– http://neo4j.com/books/graph-databases/
– http://neo4j.com/graphacademy/
– http://graphgist.neo4j.com/#!/gists
– https://www.youtube.com/channel/UCvze3hU6OZBkB1vkhH2lH9Q
● Py2neo
– http://py2neo.org/2.0/
● SCOWL
– http://wordlist.aspell.net/
About Catalyst