OK, graph databases
• Instead of tables and SQL
• Nodes and relationships
• Specialized queries

• Not everything is a graph

(and this is not sponsored)
Install / Update Neo4j
• Neo4j
• http://localhost:7474

Community Edition 3.0.3

• Python, pip, and Py2Neo
• py2neo.__version__ = '3b1'
Step 0 - installing
• Install Neo4j - neo4j.com/install
• brew on Mac
• DigitalOcean has Linux instructions
• change default password
• Trouble installing locally?
• heroku addons:add graphene
Who uses graphs?
• Panama Papers
• IMDB / Six Degrees of Kevin Bacon
• Especially:
• social networks, research data, maps
• anywhere the number of joins is large, indefinite, or unbounded
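
The "Six Degrees" case is a shortest-path search over an unknown number of hops, which is why it is awkward to write as a fixed chain of joins. A minimal in-memory sketch in Python, assuming a tiny made-up cast list (film and actor names are placeholders, not real data); a graph database would run this kind of traversal for you:

```python
from collections import deque

# Hypothetical sample data: film -> cast. All names are placeholders.
CASTS = {
    "Film A": ["Actor 1", "Actor 2"],
    "Film B": ["Actor 2", "Actor 3"],
    "Film C": ["Actor 3", "Actor 4"],
}


def build_costars(casts):
    """Map each actor to the set of actors they share a film with."""
    costars = {}
    for cast in casts.values():
        for actor in cast:
            costars.setdefault(actor, set()).update(a for a in cast if a != actor)
    return costars


def degrees_of_separation(costars, start, goal):
    """Breadth-first search: number of hops from start to goal, or None."""
    if start not in costars or goal not in costars:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, hops = queue.popleft()
        if actor == goal:
            return hops
        for other in costars[actor]:
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    return None


COSTARS = build_costars(CASTS)
print(degrees_of_separation(COSTARS, "Actor 1", "Actor 4"))
```

The search stops at whatever depth the connection is found, so the number of "joins" never has to be known in advance.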
Cypher
MoMA.org
• PostgreSQL sync to “The Museum System” CMS
outside our control
Who uses MoMA.org?
• Tourists
• Researchers
• Distant art fans
• Members
The trouble with tables
• Many joins to get people, titles, photos, and
additional relationship info

• Speed of query
• Difficult to write new queries
Art Graph DB
• Did Picasso collaborate with other artists
in his lifetime?
• Are any artists credited as painter,
director, sculptor, etc.?

(maybe an art EGOT)
Let’s build that graph
• Artists and artworks
• Basic bio data, MoMA ID -> Artist node
• Future DB: all people connected
• Title, date, MoMA ID -> Artwork node
• ARTIST_OF relationship (include order)
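
One way the model above could be written out, as a sketch rather than the script from the repo: a Python helper that returns a parameterized Cypher statement plus its parameters for one artist/artwork pair. The property names (moma_id, name, title, date, credit_order) and the sample record are assumptions; only the Artist and Artwork labels and the ARTIST_OF relationship come from the slide. The `$name` parameter form is used here; older Neo4j releases wrote parameters as `{name}`.

```python
def artist_of_statement(artist, artwork, order):
    """Return (cypher, params) that link one artist to one artwork.

    MERGE is used so that re-running the import does not duplicate nodes.
    """
    cypher = (
        "MERGE (a:Artist {moma_id: $artist_id}) "
        "SET a.name = $artist_name "
        "MERGE (w:Artwork {moma_id: $artwork_id}) "
        "SET w.title = $title, w.date = $date "
        "MERGE (a)-[r:ARTIST_OF]->(w) "
        "SET r.credit_order = $order"
    )
    params = {
        "artist_id": artist["moma_id"],
        "artist_name": artist["name"],
        "artwork_id": artwork["moma_id"],
        "title": artwork["title"],
        "date": artwork["date"],
        "order": order,  # position of this artist in the credit line
    }
    return cypher, params


# Placeholder record, not real MoMA data.
cypher, params = artist_of_statement(
    {"moma_id": 1, "name": "Example Artist"},
    {"moma_id": 2, "title": "Example Work", "date": "1931"},
    order=1,
)
print(cypher)
print(params)
```

The pair would then be handed to whichever Neo4j client is in use; that call is left out because it depends on the driver and version.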
Let’s build that graph
• git clone https://github.com/mapmeld/graph
• Building a scraper for MoMA
Demolitions and Dalí
in a Graph Database
Nick Doiron - @mapmeld
Cypher
Cypher
On to OSM
If you’re interested
• Google: MapZen Extracts
• download a city
• for this script, download the OSM XML file
• if you like PostGIS, there is a download

(no import script)
Benefits of OSM
• Open to use / full data
• Open to edit / choose tags
• HOT community
• Civil e-mail lists (Crimea)
Benefits of OSM
Google on OSM
• "Our maps represent what you or I need to do on a day-to-day
basis in the developed part of the world"
• Google Maps Geospatial Technologist
(quoted in FastCompany)
In Haiti and worldwide
In Haiti and worldwide
XML data
XML data
• Nodes, ways, and relations
• Ways made up of multiple nodes
• Relations contain nodes and ways
• Practically:
• Multiple ways connect / combine
• Tags are a community construct
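
A minimal sketch of reading ways out of OSM XML with Python's standard library. The embedded sample document and its IDs, coordinates, and street names are made up for illustration; the element layout (`<way>` holding `<nd ref>` and `<tag k v>` children) follows the OSM XML format.

```python
import xml.etree.ElementTree as ET

# Made-up sample in OSM XML layout; IDs and names are placeholders.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="0.0" lon="0.0"/>
  <node id="2" lat="0.0" lon="0.001"/>
  <node id="3" lat="0.001" lon="0.001"/>
  <way id="100">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Example Street"/>
  </way>
  <way id="101">
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Sample Avenue"/>
  </way>
</osm>"""


def parse_ways(osm_xml):
    """Return way_id -> {"nodes": [node ids], "tags": {key: value}}."""
    root = ET.fromstring(osm_xml)
    ways = {}
    for way in root.iter("way"):
        ways[way.get("id")] = {
            "nodes": [nd.get("ref") for nd in way.findall("nd")],
            "tags": {t.get("k"): t.get("v") for t in way.findall("tag")},
        }
    return ways


WAYS = parse_ways(SAMPLE_OSM)
for way_id, way in WAYS.items():
    print(way_id, way["tags"].get("name"), way["nodes"])
```

For a full city extract, a streaming parser such as `ET.iterparse` would likely be a better fit than loading the whole file at once.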
Smart Renderer
• When is a <way> a line (cul-de-sac) or a
polygon (river, lake, parking lot)?
• Has to support world’s fonts
• Tag for real life, not for the renderer
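
The line-or-polygon question can be sketched as a small heuristic: a way can only be a polygon if it is closed (first node equals last node), and even then the tags decide. The list of area-like keys below is an assumption for illustration; real renderers carry much longer rule sets.

```python
# Assumed, deliberately short list of tag keys that usually mean "area".
AREA_KEYS = {"building", "landuse", "amenity", "natural"}


def is_polygon(node_refs, tags):
    """Guess whether a way should be drawn as a polygon instead of a line."""
    closed = len(node_refs) >= 4 and node_refs[0] == node_refs[-1]
    if not closed:
        return False
    # An explicit area tag overrides everything else.
    if tags.get("area") == "yes":
        return True
    if tags.get("area") == "no":
        return False
    # A closed road (loop, cul-de-sac turnaround) is still a line.
    if "highway" in tags:
        return False
    return any(key in tags for key in AREA_KEYS)


print(is_polygon(["1", "2", "3", "1"], {"natural": "water"}))
print(is_polygon(["1", "2", "3", "1"], {"highway": "residential"}))
```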
Building graph data
• Script adds all roads to Neo4j
• Includes an array of node ids (can mix content
types, similar to a document database)
• If two ways share a node with the same ID, link
them in both directions (<->)
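
The linking rule can be sketched without a database: index ways by node ID, then pair up every two ways that meet at the same node. This is an illustration of the idea under made-up IDs, not the import script itself.

```python
from collections import defaultdict
from itertools import combinations


def link_ways(ways):
    """ways: way_id -> list of node ids.

    Returns a set of (from_way, to_way) pairs, one for each direction,
    for every two ways that share at least one node ID.
    """
    ways_by_node = defaultdict(set)
    for way_id, node_ids in ways.items():
        for node_id in node_ids:
            ways_by_node[node_id].add(way_id)

    links = set()
    for way_ids in ways_by_node.values():
        for a, b in combinations(sorted(way_ids), 2):
            links.add((a, b))
            links.add((b, a))
    return links


# Placeholder IDs: ways 100 and 101 meet at node 2, way 102 is isolated.
SAMPLE_WAYS = {"100": ["1", "2"], "101": ["2", "3"], "102": ["4", "5"]}
LINKS = link_ways(SAMPLE_WAYS)
print(sorted(LINKS))
```

Each pair would become one relationship between two road nodes in the graph.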
Cypher + OSM
* you can put an index on schema fields now
Problem
Google Prediction API
• Prediction based on a CSV
• Categorization or numerical
• Google generates a model and estimates
accuracy
• Not allowed in Myanmar
Predicting Houses
• Format 60,000+ rows of database export
• Choose categories to predict 2-3 years
• Competing models determine how important
each column is
• Can it parse dates? Find patterns
• Edging up to ~74 percent accuracy
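
A sketch of the formatting step, assuming the training file puts the value to predict in the first column and has no header row. The column names and the two sample rows are hypothetical, not taken from the real export.

```python
import csv
import io

# Hypothetical feature columns; the real export had its own schema.
FEATURES = ["street", "year_built", "condition"]


def to_training_csv(rows, label_key="demolished"):
    """Write rows as CSV text with the label first and no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        label = "demolished" if row[label_key] else "standing"
        writer.writerow([label] + [row[name] for name in FEATURES])
    return buffer.getvalue()


# Placeholder rows for illustration only.
SAMPLE_ROWS = [
    {"street": "Example St", "year_built": 1920, "condition": "poor", "demolished": True},
    {"street": "Sample Ave", "year_built": 1955, "condition": "good", "demolished": False},
]
TRAINING_CSV = to_training_csv(SAMPLE_ROWS)
print(TRAINING_CSV)
```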
Network effect
• Adding network of streets
• Now tokens include not just my street and
neighbors, but shared streets
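
One way to picture the added tokens: start with the house's own street, then append every street the graph says is connected to it. The street names and the shape of the `connections` mapping are assumptions for illustration; spaces are replaced with underscores so each street stays a single token.

```python
def street_tokens(own_street, connections):
    """Return the token list for one house.

    connections: street name -> set of street names that share a node with it.
    """
    connected = sorted(connections.get(own_street, set()) - {own_street})
    return [name.replace(" ", "_") for name in [own_street] + connected]


# Placeholder street network.
CONNECTIONS = {"Example St": {"Sample Ave", "Third St"}}
print(" ".join(street_tokens("Example St", CONNECTIONS)))
```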
Network effect
• For most demolitions, only one house on the street
was demolished (the house itself)
Network effect
Network effect
• Google Prediction API reported 81% accuracy
• But is it good?
• Early optimization studies moved fire stations
and left neighborhoods vulnerable
• City can’t maintain it… hasn’t continued to
open their data
Looking forward
• Ideas for graph databases?
• Ways to release large graph data: as an API?
As JSON files? As a Neo4j dump?
• Ideas for statisticians / future research?
Demolitions and Dalí
in a Graph Database
Nick Doiron - @mapmeld

Demolitions and Dalí: Web Dev and Data in a Graph Database
