Grails Goes Graph
                         Stefan Armbruster, presales engineer @neotechnology
                                stefan.armbruster@neotechnology.com
                                        Twitter: @darthvader42



© 2012 SpringOne 2GX. All rights reserved. Do not distribute without permission.
about @self
This talk: Grails goes Graph

•   Intro to Graph (Databases)
•   Intro to Neo4j
•   Grails Neo4j plugin
•   Live demo
•   Case study




trend 1: data growth




    source: Digital Universe Study 2011 by IDC
trend 2: data connectedness
[chart, axis: Information connectivity; in ascending order: Text, Documents, Hypertext, Feeds, Blogs, UGC, Wikis, Tagging, Folksonomies, RDF, Ontologies, GGG]



trend 3: semi-structured information

• Individualisation of content
    – 1970’s salary lists, all elements exactly one job
    – 2000’s salary lists, we need many job columns!
• All encompassing “entire world views”
    – Store more data about each entity
• Trend accelerated by the decentralization of content
  generation
• Age of participation (“web 2.0”)

trend 4: architecture
• 1980's: mainframe
• 1990's: DB as integration platform
• 2000's: decoupling of services
• 2010: SOA
trend 4: scale for performance
[chart, in ascending order: Salary list, Most Web apps, Social Network, Location-based services]
data changes over time: 4 trends

1)   amount of data grows (bigdata)
2)   data gets more connected
3)   less structure – semi-structured
4)   architecture – massive horizontal scalability




NoSQL – what does that mean?

             NO to SQL ?

             not only SQL!

simplistic cartography of NoSQL




side note: aggregate oriented databases




         "There is a significant downside - the whole
         approach works really well when data access is
         aligned with the aggregates, but what if you want
         to look at the data in a different way? ... Order
         entry naturally stores orders as aggregates, but
         analyzing product sales cuts across the
         aggregate structure. This is why aggregate-oriented
         stores talk so much about map-reduce"
         Martin Fowler on
         http://martinfowler.com/bliki/AggregateOrientedDatabase.html
graphs are everywhere

●    Relationships in
      ●  Politics, Economics, History, Science, Transportation
●    Biology, Chemistry, Physics, Sociology
      ●  Body, Ecosphere, Reaction, Interactions
●    Internet
      ●  Hardware, Software, Interaction
●    Social Networks
      ●  Family, Friends
      ●  Work, Communities
      ●  Neighbours, Cities, Society
relationships

●    the world's data is rich, messy and related
●    relationships are at least as important as the things they connect
●    Graphs = Whole > Σ parts
●    complex interactions
●    always changing, change of structures as well
●    Graph: Relationships are part of the data
●    RDBMS: Relationships are part of the fixed schema




questions & answers

●    Complex Questions
●    Answers lie between the lines (things)
●    Locality of the information
●    Global searches / operations very expensive
●    constant query time, regardless of data volume




categories

●    Categories == Classes, Trees ?
●    What if more than one category fits?
●    Tags
●    Categories via relationships like „IS_A“
●    any number, easy change
●    „virtual“ Relationships - Traversals
●    Category dynamically derived from queries
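
A minimal sketch of "categories via relationships" in plain Java (no Neo4j API involved). The class name, the sample data and the adjacency-map storage are assumptions made for illustration. A thing may have any number of IS_A relationships, and its categories are derived by traversing them instead of being stored in a fixed class tree.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model: categories are derived by following IS_A relationships. */
final class CategorySketch {
    // child -> parents; a thing may sit in any number of categories
    private final Map<String, Set<String>> isA = new HashMap<>();

    void addIsA(String child, String parent) {
        isA.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
    }

    /** All categories reachable from the given thing via IS_A, direct or indirect. */
    Set<String> categoriesOf(String thing) {
        Set<String> found = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(thing);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            for (String parent : isA.getOrDefault(current, Collections.emptySet())) {
                if (found.add(parent)) {   // skip categories that were already seen
                    todo.push(parent);
                }
            }
        }
        return found;
    }

    /** Hypothetical sample data: a tomato fits more than one category. */
    static CategorySketch sample() {
        CategorySketch c = new CategorySketch();
        c.addIsA("tomato", "fruit");
        c.addIsA("tomato", "vegetable");
        c.addIsA("fruit", "food");
        c.addIsA("vegetable", "food");
        return c;
    }

    public static void main(String[] args) {
        System.out.println(sample().categoriesOf("tomato"));
    }
}
```

Adding or removing a category is then a single relationship change; no schema has to be touched.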




everyone is talking about graphs




Facebook Open Graph




Neo4j
example of a property graph
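
A rough sketch of the property graph model in plain Java, assuming a toy in-memory structure (this is not Neo4j's API; all class and method names are made up for illustration). The point it shows: nodes and relationships both carry key/value properties, and a relationship has a type, a start node and an end node.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory property graph: nodes and relationships both carry properties. */
final class PropertyGraphSketch {

    static final class Node {
        final long id;
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();
        Node(long id) { this.id = id; }
    }

    static final class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();
        Relationship(Node start, String type, Node end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    private final List<Node> nodes = new ArrayList<>();

    /** Creates a node with one initial property. */
    Node createNode(String key, Object value) {
        Node n = new Node(nodes.size());
        n.properties.put(key, value);
        nodes.add(n);
        return n;
    }

    /** Creates a typed, directed relationship from start to end. */
    Relationship relate(Node start, String type, Node end) {
        Relationship r = new Relationship(start, type, end);
        start.outgoing.add(r);
        return r;
    }

    /** Hypothetical sample: two cities joined by a road with its own "distance" property. */
    static Relationship sampleRoad() {
        PropertyGraphSketch g = new PropertyGraphSketch();
        Node a = g.createNode("name", "A-town");
        Node b = g.createNode("name", "B-ville");
        Relationship road = g.relate(a, "ROAD", b);
        road.properties.put("distance", 42);
        return road;
    }

    public static void main(String[] args) {
        Relationship road = sampleRoad();
        System.out.println(road.start.properties.get("name") + " -[" + road.type + " "
                + road.properties + "]-> " + road.end.properties.get("name"));
    }
}
```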
querying the graph: your choice

•    Simple way: navigate relationship paths by core API
•    More powerful: simple traversers with callbacks for (deprecated)
      – Where to end traversal
      – What should be in the result set
•    Even more powerful: Traversal API
      – Fluent interface for specifying traversals
•    Shell: mimics unix filesystem commands (ls, cd, ...)
•    Gremlin: graph traversal language (to be deprecated)

•    Cypher: “the SQL for Neo4j”
      – Declarative
      – Designed for Humans (Devs + Domain experts)
Cypher examples

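                  // friends of John's friends, with John looked up via the auto index
                  START john=node:node_auto_index(name = 'John')
                  MATCH john-[:friend]->()-[:friend]->fof
                  RETURN john, fof



                  // friends whose name starts with "S", starting from five nodes given by id
                  START user=node(5,4,1,2,3)
                  MATCH user-[:friend]->follower
                  WHERE follower.name =~ /S.*/
                  RETURN user, follower.name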
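
To make the first query's semantics concrete, here is a plain Java sketch of what it computes: every node reached by following two outgoing friend relationships from the start node. It runs on a toy adjacency map, not on Neo4j, and the class name and sample names are invented for illustration. Like the Cypher pattern, it yields one result per matching path, so a person reachable over two paths shows up twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy equivalent of: MATCH start-[:friend]->()-[:friend]->fof RETURN fof */
final class FriendsOfFriendsSketch {
    // person -> targets of that person's outgoing "friend" relationships
    private final Map<String, List<String>> friends = new HashMap<>();

    void addFriend(String from, String to) {
        friends.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** One entry per path start -> friend -> fof. */
    List<String> friendsOfFriends(String start) {
        List<String> result = new ArrayList<>();
        for (String friend : friends.getOrDefault(start, Collections.emptyList())) {
            result.addAll(friends.getOrDefault(friend, Collections.emptyList()));
        }
        return result;
    }

    /** Hypothetical sample graph. */
    static FriendsOfFriendsSketch sample() {
        FriendsOfFriendsSketch g = new FriendsOfFriendsSketch();
        g.addFriend("John", "Sara");
        g.addFriend("John", "Joe");
        g.addFriend("Sara", "Maria");
        g.addFriend("Joe", "Steve");
        g.addFriend("Joe", "Maria");
        return g;
    }

    public static void main(String[] args) {
        System.out.println(sample().friendsOfFriends("John"));
    }
}
```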


query performance
• a sample social graph
   – with ~1,000 persons
• average 50 friends per person
• pathExists(a,b) limited to depth 4
• caches warmed up to eliminate disk I/O
                            # persons   query time
            relational DB   1,000       2,000 ms
            Neo4j           1,000       2 ms
            Neo4j           1,000,000   2 ms
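
The pathExists(a,b) check with a depth limit can be sketched as a breadth-first search in plain Java. This is a toy on an in-memory adjacency map, with friendships assumed to be undirected; it does not reproduce the timings above. It only shows why the work depends on the neighbourhood that is actually visited (up to 4 hops) rather than on the total number of persons stored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy depth-limited reachability check on an undirected friendship graph. */
final class PathExistsSketch {
    private final Map<Integer, Set<Integer>> adjacency = new HashMap<>();

    void addFriendship(int a, int b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** True if 'to' can be reached from 'from' over at most maxDepth friendships. */
    boolean pathExists(int from, int to, int maxDepth) {
        if (from == to) {
            return true;
        }
        Set<Integer> visited = new HashSet<>();
        visited.add(from);
        List<Integer> frontier = new ArrayList<>();
        frontier.add(from);
        // expand one hop per iteration and stop once the depth limit is reached
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<Integer> next = new ArrayList<>();
            for (int person : frontier) {
                for (int friend : adjacency.getOrDefault(person, Collections.emptySet())) {
                    if (friend == to) {
                        return true;
                    }
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    /** Hypothetical sample: a chain 0 - 1 - 2 - ... - length. */
    static PathExistsSketch chain(int length) {
        PathExistsSketch g = new PathExistsSketch();
        for (int i = 0; i < length; i++) {
            g.addFriendship(i, i + 1);
        }
        return g;
    }

    public static void main(String[] args) {
        PathExistsSketch g = chain(6);
        System.out.println(g.pathExists(0, 4, 4)); // 4 hops away: within the limit
        System.out.println(g.pathExists(0, 5, 4)); // 5 hops away: beyond the limit
    }
}
```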
deployment options
• Embedded in JVM
   – Just drop a couple of jars into your application
   – Use EmbeddedGraphDatabase
   – Very fast → no marshalling/unmarshalling, no network overhead
• Neo4j as Server
   – Exposes rich REST interface
      • granular API → many requests, consider network overhead
      • use batching or Cypher if possible
   – Add custom modules to the server (plugins/unmanaged extensions)
• Both embedded and server can be run as HA!
   – One master, multiple slaves
   – Zookeeper for managing the cluster, about to change for upcoming versions
Neo4j HA architecture
Licensing Neo4j
  3 editions available:

• Community:
   – GPL
• Advanced
   – Community + enhanced Monitoring + enhanced Webadmin
   – AGPL or Commercial
• Enterprise
   – Advanced + HA + online backup + GCR-Cache
   – AGPL or Commercial
Neo4j - Overview
[diagram: Neo4j overview drawn as a graph; recoverable labels: Sharding, Master/Slave, TRAVERSALS, HIGH_AVAIL., INTEGRATES, RUNS_AS, SCALES_TO]
graphconnect.com, Nov 6 – 7
GORM
• Grails Object Relational Mapping (GORM) aka grails-data-mapping
    – Lib: https://github.com/SpringSource/grails-data-mapping
•   manages meta-model of domain classes
•   Common data persistence abstraction layer
•   Methods for domain classes (CRUD + finders + X)
•   Extensible
•   Access to low level API of the implementation
•   TCK for implementations, 200+ test cases
•   Existing implementations
    – Simple (In-Memory, hashmap based for unit testing)
    – Hibernate, JPA
    – MongoDB, SimpleDB, Dynamo, Redis, (Riak), Neo4j
some key abstractions in g-d-m
• MappingContext:
   – holds metainformation about mapping domain classes to the underlying
     datastore, does type conversion, holds list of EntityPersisters
• Datastore:
   – create sessions
   – manage connection to low-level storage
• Session:
   – similar to Hibernate's Session
• EntityPersister:
   – does the dirty work: interact with low level datastore
• Query:
   – knows how to query the datastore by criteria (criterion, projections,...)
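
How these abstractions relate can be sketched with deliberately simplified toy types in plain Java. These are NOT the grails-data-mapping interfaces: every type and method name below is hypothetical, type conversion and querying are left out, and only the division of labour from the list above is kept (the datastore hands out sessions, the mapping context knows the persisters, the persister talks to the storage).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the roles listed above; names and signatures are hypothetical. */
final class DataMappingSketch {

    /** Does the store-specific work for one entity type. */
    interface ToyEntityPersister {
        long persist(Map<String, Object> entity);
        Map<String, Object> retrieve(long id);
    }

    /** Hash-map backed persister, in the spirit of a simple in-memory store. */
    static final class InMemoryPersister implements ToyEntityPersister {
        private final Map<Long, Map<String, Object>> rows = new HashMap<>();
        private long nextId = 1;

        public long persist(Map<String, Object> entity) {
            long id = nextId++;
            rows.put(id, new HashMap<>(entity)); // store a copy of the properties
            return id;
        }

        public Map<String, Object> retrieve(long id) {
            return rows.get(id);
        }
    }

    /** Knows which persister handles which entity name. */
    static final class ToyMappingContext {
        private final Map<String, ToyEntityPersister> persisters = new HashMap<>();

        void register(String entityName, ToyEntityPersister persister) {
            persisters.put(entityName, persister);
        }

        ToyEntityPersister persisterFor(String entityName) {
            ToyEntityPersister p = persisters.get(entityName);
            if (p == null) {
                throw new IllegalArgumentException("unmapped entity: " + entityName);
            }
            return p;
        }
    }

    /** Unit of work handed out by the datastore; delegates to the persisters. */
    static final class ToySession {
        private final ToyMappingContext context;
        ToySession(ToyMappingContext context) { this.context = context; }

        long persist(String entityName, Map<String, Object> entity) {
            return context.persisterFor(entityName).persist(entity);
        }

        Map<String, Object> retrieve(String entityName, long id) {
            return context.persisterFor(entityName).retrieve(id);
        }
    }

    /** Owns the mapping context and creates sessions. */
    static final class ToyDatastore {
        private final ToyMappingContext context = new ToyMappingContext();
        ToyMappingContext mappingContext() { return context; }
        ToySession connect() { return new ToySession(context); }
    }

    /** Saves one hypothetical "Person" and reads its name back through a session. */
    static Object roundTrip(String name) {
        ToyDatastore datastore = new ToyDatastore();
        datastore.mappingContext().register("Person", new InMemoryPersister());
        ToySession session = datastore.connect();
        Map<String, Object> person = new HashMap<>();
        person.put("name", name);
        long id = session.persist("Person", person);
        return session.retrieve("Person", id).get("name");
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Stefan"));
    }
}
```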
GORM has a price tag ;-)
Grails Neo4j Integration
• Resources:
     – Lib: https://github.com/SpringSource/grails-data-mapping
     – Plugin: http://www.grails.org/plugin/neo4j
     – Plugin docs:
       http://springsource.github.com/grails-data-mapping/neo4j/manual/index.html
•   goal: use Neo4j as persistence layer for a standard Grails domain
    model
Mapping Grails domain model to the nodespace

      domain class              →  subreference node (attached to the reference node)
      domain class instance     →  instance node
      domain instance property  →  node properties
      association               →  relationship
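
The mapping can be sketched as a toy nodespace in plain Java. This is not the plugin's code: the class is invented, and the relationship type names SUBREFERENCE and INSTANCE are assumptions taken from the labels in the diagram. One node per domain class hangs off the reference node, and every saved instance becomes a node that carries the instance's properties and hangs off its class node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy nodespace: reference node -> one node per domain class -> instance nodes. */
final class NodespaceMappingSketch {

    static final class Node {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();
        // relationship type -> target nodes
        final Map<String, List<Node>> outgoing = new LinkedHashMap<>();
        Node(String label) { this.label = label; }

        void relate(String type, Node target) {
            outgoing.computeIfAbsent(type, k -> new ArrayList<>()).add(target);
        }
    }

    final Node referenceNode = new Node("reference");
    private final Map<String, Node> subreferences = new HashMap<>();

    /** Returns the node standing for a domain class, creating it on first use. */
    Node subreferenceFor(String domainClass) {
        Node existing = subreferences.get(domainClass);
        if (existing != null) {
            return existing;
        }
        Node created = new Node(domainClass);
        referenceNode.relate("SUBREFERENCE", created);
        subreferences.put(domainClass, created);
        return created;
    }

    /** Stores a domain instance as a node whose properties mirror the instance's properties. */
    Node save(String domainClass, Map<String, ?> instanceProperties) {
        Node instance = new Node(domainClass + " instance");
        instance.properties.putAll(instanceProperties);
        subreferenceFor(domainClass).relate("INSTANCE", instance);
        return instance;
    }

    public static void main(String[] args) {
        NodespaceMappingSketch ns = new NodespaceMappingSketch();
        Map<String, Object> props = new HashMap<>();
        props.put("name", "John");
        ns.save("Person", props);
        System.out.println(ns.subreferenceFor("Person").outgoing.get("INSTANCE").size());
    }
}
```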
2 “challenges” involved
• Locking of domain nodes in HA mode
• Category nodes become “super nodes”
   – causes potential bottleneck on traversals

Solutions:
• add intermediate category nodes
• use indexing instead
[diagram: reference node → domain node → instance nodes]
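
Why intermediate category nodes help can be shown with a small plain Java sketch. How the plugin actually picks an intermediate node is not stated on the slides; spreading instances by instance id modulo the number of intermediate nodes is purely an assumption for illustration, as are all names. With N intermediate nodes, a write touches (and would lock) one of N nodes instead of the single domain node, and the largest fan-out shrinks accordingly.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: instances hang off N intermediate nodes instead of one domain node. */
final class IntermediateCategorySketch {
    // one list of attached instance ids per intermediate node
    private final List<List<Long>> intermediateNodes = new ArrayList<>();

    IntermediateCategorySketch(int count) {
        for (int i = 0; i < count; i++) {
            intermediateNodes.add(new ArrayList<>());
        }
    }

    /** Attaches an instance and returns the index of the node that would be locked. */
    int attach(long instanceId) {
        // assumption: choose the intermediate node by id modulo the node count
        int index = (int) Math.floorMod(instanceId, (long) intermediateNodes.size());
        intermediateNodes.get(index).add(instanceId);
        return index;
    }

    /** Largest number of instances attached to a single node. */
    int maxFanOut() {
        int max = 0;
        for (List<Long> attached : intermediateNodes) {
            max = Math.max(max, attached.size());
        }
        return max;
    }

    /** Builds a sketch with the given node count and attaches instances 0..instances-1. */
    static IntermediateCategorySketch filled(int nodeCount, int instances) {
        IntermediateCategorySketch s = new IntermediateCategorySketch(nodeCount);
        for (long id = 0; id < instances; id++) {
            s.attach(id);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("single node fan-out: " + filled(1, 100).maxFanOut());
        System.out.println("four nodes, max fan-out: " + filled(4, 100).maxFanOut());
    }
}
```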
currently working in the neo4j plugin (1/2)
• passing >98% of GORM TCK (hurray!)
• accessing embedded, REST and HA datasources
   – and ImpermanentGraphDatabase for testing
• property type conversion
• support of schemaless properties
• access to native API
   – instance.getNode(), bean: graphDatabaseService
• GORM enhancements:
   – <DomainClass>.traverseStatic, <DomainClass>.cypherStatic
   – <instance>.traverse, <instance>.cypher
currently working in the neo4j plugin (2/2)
• prevention of locking exceptions by using intermediate
  category nodes
• Declarative Indexing
   – apply static mapping closure just the standard way
• convenience methods on Neo4j's nodes and relationships:
   – node.<prop> = <value>
• JSON marshalling for Neo4j's Node and Relationships
• embed Neo4j's webadmin into grails application
praying to the demo god...
looking into the crystal ball
• get rid of subreferences in favour of indexing
• migrate plugin to use Cypher only instead of core-API
• option for mapping domain classes as a relationship
   – think of roads between cities having a distance property
• fix open issues: http://bit.ly/KEmVX2
• maybe use Spring Data Neo4j internally

• … and more
case study
• back in 2010 a website to collect and aggregate opinions of
  soccer fans went live
• votes can be based on almost everything
   – players, teams, matches, events in matches
• hard to model with classic RDBMS
• Neo4j to the rescue, used in embedded mode
• as always: hard and very tight schedule
   – built up technical debt due to lack of automated tests
• Neo4j HA scales very well for reads
case study: lessons learned
•   massive amount of very small write transactions in HA mode caused
    trouble:
     – e.g. locking exceptions upon user registration
     – aggregate multiple write transactions using JMS queue
•   serious issues with full GCs
     – since app AND Neo4j reside in same JVM full GCs happen
     – if “stop-the-world” pause is too long: master switch
•   have loadbalancer with 2 setups (planned):
     – write-driven requests go to master node
     – read-driven requests go to slave nodes
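
The "aggregate multiple write transactions" idea can be sketched in plain Java. The case study used a JMS queue; the in-process BlockingQueue below is only a stand-in, the transaction is simulated by a counter, and all names are invented. Many small writes are queued, and a consumer drains them in batches so that each batch costs one transaction instead of one transaction per write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy write batcher: many queued writes, few (simulated) transactions. */
final class WriteBatcherSketch {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> applied = new ArrayList<>();
    private final int maxBatchSize;
    private int transactions = 0;

    WriteBatcherSketch(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    /** Producer side: enqueue a small write instead of opening a transaction for it. */
    void submit(String write) {
        pending.add(write);
    }

    /** Consumer side: apply up to maxBatchSize queued writes in one simulated transaction. */
    int flushOnce() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatchSize);
        if (batch.isEmpty()) {
            return 0;
        }
        transactions++;          // stands in for: begin tx, apply all writes, commit
        applied.addAll(batch);
        return batch.size();
    }

    /** Flushes until the queue is empty and returns the number of transactions used so far. */
    int flushAll() {
        while (flushOnce() > 0) {
            // keep draining
        }
        return transactions;
    }

    /** Convenience for experiments: how many transactions do 'writes' small writes cost? */
    static int transactionsFor(int writes, int batchSize) {
        WriteBatcherSketch batcher = new WriteBatcherSketch(batchSize);
        for (int i = 0; i < writes; i++) {
            batcher.submit("write-" + i);
        }
        return batcher.flushAll();
    }

    public static void main(String[] args) {
        System.out.println(transactionsFor(250, 100) + " transactions for 250 writes");
    }
}
```

The batch size trades latency of individual writes against the number of transactions the master has to handle.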
References
• general overview of nosql:
   – http://www.nosql-databases.org/
• Neo4j itself: http://www.neo4j.org
   – http://api.neo4j.org
   – http://doc.neo4j.org
• neo4j grails plugin:
   –   source: https://github.com/SpringSource/grails-data-mapping
   –   docs: http://springsource.github.com/grails-data-mapping/neo4j/
   –   issues: http://jira.grails.org/browse/GPNEO4J
   –   demo app: https://github.com/sarmbruster/neo4jsample
• Java REST driver: https://github.com/jexp/neo4j-java-rest-binding
• my blog: http://blog.armbruster-it.de
• twitter: @darthvader42

From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 

Grails goes Graph

  • 1. Grails Goes Graph Stefan Armbruster, presales engineer @neotechnology stefan.armbruster@neotechnology.com Twitter: @darthvader42 © 2012 SpringOne 2GX. All rights reserved. Do not distribute without permission.
  • 3. This talk: Grails goes Graph • Intro to Graph (Databases) • Intro to Neo4j • Grails Neo4j plugin • Live demo • case study
  • 4. trend 1: data growth (source: Digital Universe Study 2011 by IDC)
  • 5. trend 2: data connectedness (chart: information connectivity rising from Text Documents through Hypertext, Feeds, Blogs, UGC, Wikis, Tagging, Folksonomies, RDF and Ontologies to GGG)
  • 6. trend 3: semi-structured information • Individualisation of content – 1970’s salary lists, all elements exactly one job – 2000’s salary lists, we need many job columns! • All encompassing “entire world views” – Store more data about each entity • Trend accelerated by the decentralization of content generation • Age of participation (“web 2.0”)
  • 8. trend 4: architecture 1990's: DB as integration platform
  • 9. trend 4: architecture 2000's: decoupling of services
  • 11. trend 4: scale for performance (chart labels: Salary list, Most Web apps, Social Network, Location-based services)
  • 12. data changes over time: 4 trends 1) amount of data grows (big data) 2) data gets more connected 3) less structure (semi-structured) 4) architecture: massive horizontal scalability
  • 13. NoSQL – what does that mean? NO to SQL? No: not only SQL!
  • 15. side note: aggregate oriented databases "There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? ... Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. This is why aggregate-oriented stores talk so much about map-reduce" Martin Fowler on http://martinfowler.com/bliki/AggregateOrientedDatabase.html
  • 17. graphs everywhere ● Relationships in ● Politics, Economics, History, Science, Transportation ● Biology, Chemistry, Physics, Sociology ● Body, Ecosphere, Reaction, Interactions ● Internet ● Hardware, Software, Interaction ● Social Networks ● Family, Friends ● Work, Communities ● Neighbours, Cities, Society
  • 18. relationships ● the world is rich, messy and related data ● relationships are at least as important as the things they connect ● Graphs = Whole > Σ parts ● complex interactions ● always changing, change of structures as well ● Graph: Relationships are part of the data ● RDBMS: Relationships part of the fixed schema
  • 19. questions & answers ● Complex Questions ● Answers lie between the lines (things) ● Locality of the information ● Global searches / operations very expensive ● constant query time, regardless of data volume
  • 20. categories ● Categories == Classes, Trees ? ● What if more than one category fits? ● Tags ● Categories via relationships like “IS_A” ● any number, easy change ● “virtual” Relationships - Traversals ● Category dynamically derived from queries
  • 21. everyone is talking about graphs Facebook Open Graph
  • 22. Neo4j
  • 23. example of a property graph
  • 24. querying the graph: your choice • Simple way: navigate relationship paths by core API • More powerful: simple traversers with callbacks (deprecated) for – Where to end traversal – What should be in the result set • Even more powerful: Traversal API – Fluent interface for specifying traversals • Shell: mimics unix filesystem commands (ls, cd, ...) • Gremlin: graph traversal language (to be deprecated) • Cypher: “the SQL for Neo4j” – Declarative – Designed for Humans (Devs + Domain experts)
  • 25. Cypher examples
        START john=node:node_auto_index(name = 'John') MATCH john-[:friend]->()-[:friend]->fof RETURN john, fof
        START user=node(5,4,1,2,3) MATCH user-[:friend]->follower WHERE follower.name =~ /S.*/ RETURN user, follower.name
  • 26. query performance • a sample social graph with ~1,000 persons • average 50 friends per person • pathExists(a,b) limited to depth 4 • caches warmed up to eliminate disk I/O
        Database        # persons    query time
        relational DB   1,000        2,000 ms
        Neo4j           1,000        2 ms
        Neo4j           1,000,000    2 ms
  • 27. deployment options • Embedded in JVM – Just drop a couple of jars into your application – Use EmbeddedGraphDatabase – Very fast → no marshalling/unmarshalling, no network overhead • Neo4j as Server – Exposes rich REST interface • granular API → many requests, consider network overhead • use batching or Cypher if possible – Add custom modules to the server (plugins/unmanaged extensions) • Both embedded and server can be run in HA mode! – One master, multiple slaves – Zookeeper for managing the cluster (about to change in upcoming versions)
  • 29. Licensing Neo4j 3 editions available: • Community: – GPL • Advanced – Community + enhanced Monitoring + enhanced Webadmin – AGPL or Commercial • Enterprise – Advanced + HA + online backup + GCR-Cache – AGPL or Commercial
  • 30. Neo4j - Overview (graph diagram; recoverable labels include Sharding, Master/Slave, TRAVERSALS, HIGH_AVAIL., INTEGRATES, RUNS_AS, SCALES_TO)
  • 32. GORM • Grails Object Relational Mapping (GORM) aka grails-data-mapping – Lib: https://github.com/SpringSource/grails-data-mapping • manages meta-model of domain classes • Common data persistence abstraction layer • Methods for domain classes (CRUD + finders + X) • Extensible • Access to low level API of the implementation • TCK for implementations, 200+ test cases • Existing implementations – Simple (In-Memory, hashmap based for unit testing) – Hibernate, JPA – MongoDB, SimpleDB, Dynamo, Redis, (Riak), Neo4j
  • 33. some key abstractions in g-d-m • MappingContext: – holds meta-information about mapping domain classes to the underlying datastore, does type conversion, holds list of EntityPersisters • Datastore: – creates sessions – manages connection to low-level storage • Session: – similar to Hibernate's Session • EntityPersister: – does the dirty work: interacts with low level datastore • Query: – knows how to query the datastore by criteria (criterion, projections,...)
  • 34. GORM has a price tag ;-)
  • 35. Grails Neo4j Integration • Resources: – Lib: https://github.com/SpringSource/grails-data-mapping – Plugin: http://www.grails.org/plugin/neo4j – Plugin docs: http://springsource.github.com/grails-data-mapping/neo4j/manual/index.html • goal: use Neo4j as persistence layer for a standard Grails domain model
  • 36. Mapping Grails domain model to the nodespace (diagram labels: reference node, subreference, domain class, domain class instance, instance, domain instance property, properties, association)
  • 37. 2 “challenges” involved • Locking of domain nodes in HA mode • Category nodes become “super nodes” – causes potential bottleneck on domain node traversals • Solutions: – add intermediate category nodes – use indexing instead
  • 38. currently working in the neo4j plugin (1/2) • passing >98% of GORM TCK (hurray!) • accessing embedded, REST and HA datasources – and ImpermanentGraphDatabase for testing • property type conversion • support of schemaless properties • access to native API – instance.getNode(), bean: graphDatabaseService • GORM enhancements: – <DomainClass>.traverseStatic, <DomainClass>.cypherStatic – <instance>.traverse, <instance>.cypher
  • 39. currently working in the neo4j plugin (2/2) • prevention of locking exceptions by using intermediate category nodes • Declarative Indexing – apply static mapping closure just the standard way • convenience methods on Neo4j's nodes and relationships: – node.<prop> = <value> • JSON marshalling for Neo4j's Node and Relationships • embed Neo4j's webadmin into grails application
  • 40. praying to the demo god...
  • 41. looking into the crystal ball • get rid of subreferences in favour of indexing • migrate plugin to use Cypher only instead of core-API • option for mapping domain classes as a relationship – think of roads between cities having a distance property • fix open issues: http://bit.ly/KEmVX2 • maybe use Spring Data Neo4j internally • … and more
  • 42. case study • back in 2010 a website to collect and aggregate opinions of soccer fans went live • votes can be based on almost everything – players, teams, matches, events in matches • hard to model with classic RDBMS • Neo4j to the rescue, used in embedded mode • as always: hard and very tight schedule – built up technical debt due to lack of automated tests • Neo4j HA scales very well for reads
  • 43. case study: lessons learned • massive amount of very small write transactions in HA mode caused trouble: – e.g. locking exceptions upon user registration – aggregate multiple write transactions using JMS queue • serious issues with full GCs – since app AND Neo4j reside in same JVM full GCs happen – if “stop-the-world” pause is too large: master switch • have loadbalancer with 2 setups (planned): – write-driven requests go to master node – read-driven requests go to slave nodes
  • 44. References • general overview of nosql: – http://www.nosql-databases.org/ • Neo4j itself: http://www.neo4j.org – http://api.neo4j.org – http://doc.neo4j.org • neo4j grails plugin: – source: https://github.com/SpringSource/grails-data-mapping – docs: http://springsource.github.com/grails-data-mapping/neo4j/ – issues: http://jira.grails.org/browse/GPNEO4J – demo app: https://github.com/sarmbruster/neo4jsample • Java REST driver: https://github.com/jexp/neo4j-java-rest-binding • my blog: http://blog.armbruster-it.de • twitter: @darthvader42
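
As a rough illustration of the pathExists(a,b) check from slide 26, the sketch below runs a depth-limited breadth-first search over an in-memory friendship graph. It is not Neo4j's implementation and will not reproduce the timings on the slide; the names path_exists and sample_graph, the adjacency-dict representation and the random seed are all assumptions made for this example.

```python
from collections import deque
import random


def path_exists(friends, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops.

    `friends` maps a person id to an iterable of friend ids (directed edges).
    """
    if a == b:
        return True
    seen = {a}
    frontier = deque([(a, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not expand this person
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


def sample_graph(persons=1000, friends_per_person=50, seed=42):
    """Build a random graph shaped like the slide's sample (hypothetical data).

    A person may end up listed as their own friend; that is harmless here.
    """
    rng = random.Random(seed)
    ids = range(persons)
    return {p: rng.sample(ids, friends_per_person) for p in ids}


if __name__ == "__main__":
    graph = sample_graph()
    print(path_exists(graph, 0, 999))
```

The search only touches the neighbourhood of the start node, which is the locality argument made on slide 19: the work depends on the fan-out and the depth limit rather than on the total number of persons stored.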

Editor's notes

  1. Eric Schmidt: “Every two days we create as much information as we did up to 2003”