TITAN
THE RISE OF BIG GRAPH DATA



             MARKO A. RODRIGUEZ
            MATTHIAS BROECHELER




            http://THINKAURELIUS.COM
ABSTRACT
A graph is a data structure composed of vertices/dots and
edges/lines. A graph database is a software system used to
persist and process graphs. The common conception in today's
database community is that there is a tradeoff between the
scale of data and the complexity/interlinking of data. To
challenge this understanding, Aurelius has developed Titan
under the liberal Apache 2 license. Titan supports both the size
of modern data and the modeling power of graphs to usher in
the era of Big Graph Data. Novel techniques in edge
compression, data layout, and vertex-centric indices that
exploit significant orders are used to facilitate the
representation and processing of a single atomic graph
structure across a multi-machine cluster. To ensure ease of
adoption by the graph community, Titan natively implements
the TinkerPop 2 Blueprints API. This presentation will review
the graph landscape, Titan's techniques for scale by
distribution, and a collection of satellite graph technologies to
be released by Aurelius in the coming summer months of 2012.
SPEAKER BIOGRAPHIES

  Dr. Marko A. Rodriguez is the founder of the graph consulting firm Aurelius.
  He has focused his academic and commercial career on the theoretical
  and applied aspects of graphs. Marko is a cofounder of TinkerPop and the
  primary developer of the Gremlin graph traversal language.




  Dr. Matthias Broecheler has been researching and developing large-scale
  graph database systems for many years in both academia and in his role
  as a cofounder of the Aurelius graph consulting firm. He is the primary
  developer of the distributed graph database Titan. Matthias focuses most
  of his time and effort on novel OLTP and OLAP graph processing
  solutions.
SPONSORS

    As the leading education services company, Pearson is serious about evolving how
    the world learns. We apply our deep education experience and research, invest in
    innovative technologies, and promote collaboration throughout the education
    ecosystem. Real change is our commitment and its results are delivered through
    connecting capabilities to create actionable, scalable solutions that improve access,
    affordability, and achievement.




          Aurelius is a team of software engineers and scientists committed to applying
          graph theory and network science to problems in numerous domains. Aurelius
          develops the theory and technology whereby graphs can be used to model,
          understand, predict, and influence the behavior of complex, interrelated
          social, economic, and physical networks.




Jive is the pioneer and world's leading provider of social business solutions. Our products
apply powerful technology that helps people connect, communicate and collaborate to get
more work done and solve their biggest business challenges. Millions of users and many
of the world's most successful companies rely on Jive day in and day out to get work
done, serve their customers and stay ahead of their competitors.
OUTLINE
1. THE GRAPH LANDSCAPE
  An introduction to graph computing.
  Graph technologies on the market today.


2. INTRODUCTION TO TITAN
  Getting up and running with Titan.
  Titan's techniques for scalability.



3. THE FUTURE OF AURELIUS
  Satellite technologies and the OLAP story.
  The graph landscape reprise.
PART 1:
THE GRAPH LANDSCAPE




     MARKO A. RODRIGUEZ
GRAPH
VERTEX
EDGE

G = (V, E)
Graph = (Vertices, Edges)
Classic Textbook Graph Structure
A homogeneous set of vertices (V)...
...connected by a homogeneous set of edges (E).
RESTRICTED MODELING




People and follows relationships...   ...xor webpages and citations.
AN INTEGRATED MODEL
     IS TYPICALLY DESIRED

[Figure: a single graph that mixes several edge types: references, createdBy, follows, mentions.]

AN INTEGRATED MODEL
         IS USEFUL




Allows for more interesting/novel algorithms.
                                         (beyond "textbook" graph algorithms)


Allows for a universal model of things and their relationships.
                                (a single, unified model of a domain of interest)
THE PROPERTY GRAPH


                 G = (V, E, λ)
                         Current Popular Graph Structure




* Directed, attributed, edge-labeled graph
* Multi-relational graph with key/value pairs on the elements
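
As a rough illustration of the structure named above, here is a minimal in-memory sketch in plain Python. It is not Titan's or Blueprints' implementation; the class and method names (PropertyGraph, add_vertex, add_edge, out) are invented for this example, and it assumes that λ in G = (V, E, λ) stands for the labeling of edges.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)  # key/value pairs


@dataclass
class Edge:
    id: int
    out_v: int   # id of the tail vertex (edge leaves this vertex)
    in_v: int    # id of the head vertex (edge arrives at this vertex)
    label: str   # the edge label, e.g. "mother"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy container: directed, attributed, edge-labeled."""

    def __init__(self):
        self.vertices = {}
        self.edges = {}

    def add_vertex(self, **properties):
        v = Vertex(len(self.vertices), dict(properties))
        self.vertices[v.id] = v
        return v

    def add_edge(self, out_v, label, in_v, **properties):
        e = Edge(len(self.edges), out_v.id, in_v.id, label, dict(properties))
        self.edges[e.id] = e
        return e

    def out(self, v, *labels):
        # Follow outgoing edges from v; no labels means "any label".
        return [self.vertices[e.in_v] for e in self.edges.values()
                if e.out_v == v.id and (not labels or e.label in labels)]


g = PropertyGraph()
hercules = g.add_vertex(name="hercules")
alcmene = g.add_vertex(name="alcmene", type="human")
g.add_edge(hercules, "mother", alcmene)
print([v.properties["name"] for v in g.out(hercules, "mother")])
```

A linear scan over all edges is used only to keep the sketch short; a real graph database would index edges per vertex.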
VERTEX
PROPERTIES (KEY:VALUE)
   name:hercules

EDGE
LABEL
   name:hercules --mother--> name:alcmene, type:human

   name:hercules --father--> name:jupiter, type:god
IS HERCULES A DEMIGOD?

DEMIGOD = HALF HUMAN + HALF GOD

[Figure: name:hercules --father--> name:jupiter, type:god; name:hercules --mother--> name:alcmene, type:human]

gremlin> hercules
==>v[0]

gremlin> hercules.out('mother','father')
==>v[1]
==>v[2]

gremlin> hercules.out('mother','father').type
==>human
==>god

[Figure: the same graph, with type:demigod added to the hercules vertex]

    gremlin> hercules.type = 'demigod'
    ==>demigod
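
The inference above can be sketched outside Gremlin as well. The following plain-Python version is only an illustration under stated assumptions: the dictionaries, the edge tuples, and the out helper are made up for this example, and the vertex ids follow the REPL output shown (v[1] human, v[2] god).

```python
# Vertices keyed by id, each with its key/value properties.
vertices = {
    0: {"name": "hercules"},
    1: {"name": "alcmene", "type": "human"},
    2: {"name": "jupiter", "type": "god"},
}

# Directed, labeled edges as (tail, label, head).
edges = [(0, "mother", 1), (0, "father", 2)]


def out(vertex_id, *labels):
    """Heads of outgoing edges with one of the given labels."""
    return [head for tail, label, head in edges
            if tail == vertex_id and label in labels]


# Mirrors hercules.out('mother','father').type
parent_types = {vertices[p]["type"] for p in out(0, "mother", "father")}

# DEMIGOD = HALF HUMAN + HALF GOD
if parent_types == {"human", "god"}:
    vertices[0]["type"] = "demigod"

print(vertices[0]["type"])
```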
COMPUTING
PROCESS                 STRUCTURE

GRAPH-BASED COMPUTING
TRAVERSAL                GRAPH
WHY GRAPH-BASED COMPUTING?
     INTUITIVE MODELING
     EXPRESSIVE QUERYING
     NUMEROUS ANALYSES
       Mixing Patterns, Ranking, Inference, Motifs, Path Expressions,
       Centrality, Scoring, Geodesics
ANALYSES ARE THE
EPIPHENOMENA OF TRAVERSAL




   [Figure: a function f applied to a graph, yielding a derived result]
WHAT IS THE SIGNIFICANCE OF
     GRAPH ANALYSIS?
ANALYSES YIELD
INSIGHTS ABOUT THE MODEL


DATA PRODUCTS = DATA-DRIVEN DECISION SUPPORT
RECOMMENDATION


People you may know.                         SOCIAL GRAPH

Products you might like.                     RATINGS GRAPH

Movies you should watch and the friends
 you should watch them with.                 SOCIAL+RATINGS GRAPH
WHO ELSE MIGHT HERCULES KNOW?

[Figure: a "knows" graph. hercules (0) knows cerberus (1), nemean (2), and hydra (3). cerberus knows pluto (4) and neptune (5); nemean knows neptune (5); hydra knows neptune (5) and jupiter (6).]

gremlin> hercules
==>v[0]

gremlin> hercules.out('knows')
==>v[1]
==>v[2]
==>v[3]

gremlin> hercules.out('knows').out('knows')
==>v[4]
==>v[5]
==>v[5]
==>v[6]
==>v[5]

gremlin> hercules.out('knows').out('knows').groupCount.cap
==>v[4]=1
==>v[5]=3
==>v[6]=1
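
What groupCount does here can be sketched in plain Python: walk two knows hops from hercules and count how often each vertex is reached. This is an illustration only; the adjacency lists are read off the diagram and ordered to match the REPL output above, and the names knows, out_knows, and counts are invented for the example.

```python
from collections import Counter

names = {0: "hercules", 1: "cerberus", 2: "nemean", 3: "hydra",
         4: "pluto", 5: "neptune", 6: "jupiter"}

# Outgoing "knows" edges per vertex id (assumed from the diagram).
knows = {0: [1, 2, 3], 1: [4, 5], 2: [5], 3: [6, 5]}


def out_knows(vertex_ids):
    """One traversal step: follow 'knows' from every vertex in the stream."""
    return [head for v in vertex_ids for head in knows.get(v, [])]


friends = out_knows([0])                   # hercules.out('knows')
friends_of_friends = out_knows(friends)    # ...out('knows')
counts = Counter(friends_of_friends)       # ...groupCount.cap

best, paths = counts.most_common(1)[0]
print(dict(counts))
print(names[best], paths)
```

The count for a vertex is the number of distinct two-step paths that reach it, which is why neptune (three paths) ranks first.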
HERCULES PROBABLY KNOWS NEPTUNE

[Figure: the same "knows" graph, with an added knows edge from hercules (0) to neptune (5).]

THIS IS A "TEXTBOOK STYLE" GRAPH

HERCULES PROBABLY KNOWS NEPTUNE

[Figure: the same graph with two further edge types added: father and brother.]

 ...PROBABLY MORE SO WHEN OTHER
     TYPES OF EDGES ARE ANALYZED
[Figure: the graph is built up in stages. SOCIAL GRAPH: knows, father, and brother edges among hercules (0), cerberus (1), nemean (2), hydra (3), pluto (4), neptune (5), and jupiter (6). RATINGS GRAPH: likes and dislikes edges pointing at human flesh (7) and tartarus (8). PRODUCT GRAPH: smellsOf and composedOf edges.]

NEMEAN MIGHT LIKE TARTARUS
* Collaborative Filtering + Content-Based Recommendation
PATH FINDING


How is this person related to this film?      MOVIE GRAPH

Which authors of this book also wrote
 a New York Times bestseller?                 BOOK GRAPH

Which movies are based on a book by a
 New York Times bestselling author?           MOVIE+BOOK GRAPH
WHO PLAYED HERCULES
              IN WHAT MOVIE?


                 jupiter                   hercules

                   6                          0

                              depictedIn

                                                             role
                       role                    depictedIn
ernest                                                                              arnold
graves                                                                          schwarzenegger
         actor                hasActor                 hasActor         actor
 11               10                          7                     8                 9

                                         hercules in
                                          new york
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9

                                           hercules in
                                            new york




gremlin> hercules
==>v[0]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9

                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn')
==>v[7]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie')
==>v[7]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
==>v[8]
==>v[10]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role')
==>v[0]
==>v[6]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules)
==>v[0]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules).back(2)
==>v[8]
WHO PLAYED HERCULES
                IN WHAT MOVIE?


                   jupiter                   hercules

                     6                          0

                                depictedIn

                                                               role
                         role                    depictedIn
  ernest                                                                              arnold
  graves                                                                          schwarzenegger
           actor                hasActor                 hasActor         actor
   11               10                          7                     8                 9
                                                        movie
                                           hercules in
                                            new york




gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules).back(2).out('actor')
==>v[9]
[Figure: the same movie graph, with vertex 7 marked "movie" and vertex 9 marked "star".]

gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules).back(2).out('actor')
   .as('star')
==>v[9]
gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules).back(2).out('actor')
   .as('star').select
==>[movie:v[7], star:v[9]]
gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
   .out('role').retain(hercules).back(2).out('actor')
   .as('star').select{it.name}
==>[movie:hercules in new york, star:arnold schwarzenegger]
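
The traversal above can be mirrored outside of Gremlin to make each step explicit. The following is a minimal Python sketch, not Gremlin or Titan code: the dictionary layout and the helper names `out` and `who_played` are assumptions made for illustration, while the vertex ids and edge labels are taken from the figure.

```python
# Toy in-memory version of the movie graph from the slides.
# Vertex ids and edge labels follow the figure; the dict layout is an assumption.
names = {
    0: "hercules", 6: "jupiter", 7: "hercules in new york",
    9: "arnold schwarzenegger", 11: "ernest graves",
}
# (source vertex, edge label) -> list of target vertices
out_edges = {
    (0, "depictedIn"): [7],
    (6, "depictedIn"): [7],
    (7, "hasActor"): [8, 10],
    (8, "role"): [0],
    (8, "actor"): [9],
    (10, "role"): [6],
    (10, "actor"): [11],
}

def out(vertex, label):
    """Follow outgoing edges with the given label, like Gremlin's out(label)."""
    return out_edges.get((vertex, label), [])

def who_played(character):
    """Return (movie, star) name pairs for the given character vertex."""
    results = []
    for movie in out(character, "depictedIn"):        # .as('movie')
        for casting in out(movie, "hasActor"):
            if character in out(casting, "role"):     # .retain(hercules).back(2)
                for star in out(casting, "actor"):    # .as('star')
                    results.append((names[movie], names[star]))
    return results

print(who_played(0))
```

The nested loops play the role of the chained Gremlin steps; the `if` keeps only the casting vertex whose role is the starting character, which is what `retain` followed by `back(2)` achieves in the query.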
[Figure, built up over several slides: the movie graph is extended one vertex at a time until it spans several domains.
   - hercules (0) gains a second depictedIn edge, to the book "the arms of hercules" (12).
   - "the arms of hercules" (12) is joined by a writtenBy edge to fred saberhagen (13).
   - fred saberhagen (13) is joined by a livesIn edge to albuquerque (14).
   - albuquerque (14) is joined by a 25-North edge to santa fe (15).
   - marko rodriguez (16) is joined by a livesIn edge to santa fe (15).
   - A thinksHeIs edge is added at marko rodriguez (16).
   - The final slide labels the regions TRANSPORTATION GRAPH, BOOK GRAPH, PROFILE GRAPH, and MOVIE GRAPH.]
SOCIAL INFLUENCE

   Who are the most influential people in
    java, mathematics, art, surreal art, politics, ...?

   Which region of the social graph will propagate this
    advertisement the furthest?

   Which 3 experts should review this submitted article?


   Which people should I talk to at the upcoming
    conference and what topics should
    I talk to them about?


SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
PATTERN IDENTIFICATION


 This connectivity pattern is a sign of financial fraud.
  When this motif is found, a red flag will be raised.

                                     TRANSACTION GRAPH

 Healthy discourse is typified by a discussion board
  with a branch factor in this range and a concept
  clique score in this range.
                                     DISCUSSION GRAPH
KNOWLEDGE DISCOVERY


 The terms "ice", "fans", and "stanley cup"
  are classified as "sports".
                                     WIKIPEDIA GRAPH

 Given that all identified birds fly,
  it can be deduced that all birds fly.
  If contrary evidence is provided,
  then this "fact" can be retracted.
                                     EVIDENTIAL LOGIC GRAPH
[Figure, built up over three slides: a WORLD MODEL and the WORLD PROCESSES that run on it.]

A single world model and various types of traversers
   moving through that model to solve problems.
COMPUTING
PROCESS                    STRUCTURE

GRAPH-BASED COMPUTING
TRAVERSAL                   GRAPH
GRAPH COMPUTING ENGINES
MEMORY-BASED GRAPHS
[Figure: an Application using a Graph Framework.]

NetworkX    http://networkx.lanl.gov/
iGraph      http://igraph.sourceforge.net/
JUNG        http://jung.sourceforge.net/
DISK-BASED GRAPHS
[Figure: several Applications using one Graph Database.]

Neo4j            http://neo4j.org/
OrientDB         http://orientdb.org
InfiniteGraph    http://objectivity.com
DEX              http://www.sparsity-technologies.com/dex
CLUSTER-BASED GRAPHS
Bulk Synchronous Parallel Processing *

[Figure: three Applications, labeled 3, 2, and 1.]

Hama         http://incubator.apache.org/hama/
Giraph       http://incubator.apache.org/giraph/
GoldenOrb    http://goldenorbos.org/

* In the same spirit as Google's Pregel
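
To make the bulk synchronous parallel idea concrete, here is a small single-process Python sketch of vertex-centric supersteps. It is not code from Hama, Giraph, or GoldenOrb; the function name `bsp_max` and the dictionary-based graph layout are assumptions. Each vertex sends its value to its neighbors, a barrier separates supersteps, and the computation halts when no messages remain.

```python
def bsp_max(neighbors, values):
    """Propagate the maximum value to every vertex using synchronous supersteps.

    neighbors: dict vertex -> list of adjacent vertices (hypothetical layout;
               every vertex must appear as a key)
    values:    dict vertex -> initial numeric value
    """
    values = dict(values)
    # Superstep 0: every vertex sends its value to its neighbors.
    inbox = {v: [] for v in neighbors}
    for v, nbrs in neighbors.items():
        for n in nbrs:
            inbox[n].append(values[v])
    # Keep running supersteps until no messages are in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in neighbors}
        for v, messages in inbox.items():
            if messages and max(messages) > values[v]:
                values[v] = max(messages)
                # Only vertices whose value changed send new messages.
                for n in neighbors[v]:
                    outbox[n].append(values[v])
        inbox = outbox  # barrier: messages are delivered in the next superstep
    return values

chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(bsp_max(chain, {1: 3, 2: 6, 3: 2, 4: 1}))
```

In a cluster-based engine the vertices would be partitioned across machines and the barrier would be a network-wide synchronization point; the loop structure stays the same.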
MEMORY-BASED GRAPHS
Graph size is constrained by local machine's RAM.
Rich graph algorithm and visualization packages.
Oriented towards "textbook-style" graphs.


DISK-BASED GRAPHS
Graph size is constrained by local disk.
Optimized for local graph algorithms.
Oriented towards property graphs.


CLUSTER-BASED GRAPHS
Graph size is constrained by the cluster's total RAM.
Optimized for global graph algorithms.
Oriented towards "textbook-style" graphs.
                                           * Based on typical behavior
TINKERPOP

Open source graph product group *
Support for various graph vendors
Simple, well-defined products
Provides a vendor-agnostic graph framework

* Encompassing the various graph computing styles
* Based on future directions

http://tinkerpop.com
TINKERPOP

Graph Server
Graph Algorithms
Object-Graph Mapper
Traversal Language
Dataflow Processing
Generic Graph API

http://tinkerpop.com
http://${project.name}.tinkerpop.com
TINKERPOP INTEGRATION




http://tinkerpop.com
AND NOW
 THERE IS ANOTHER...
Titan: The Rise of Big Graph Data
TITAN
PART 2:
INTRODUCTION TO TITAN




     MATTHIAS BROECHELER
WHY CREATE TITAN?

     A number of Aurelius' clients...
         ...need to represent and process
        graphs at the 100+ billion edge
        scale w/ thousands of concurrent
        transactions.

        ...need both local graph traversals
        (OLTP) and batch graph
        processing (OLAP).


        ...desire a free, open source
        distributed graph database.
TITAN'S KEY FEATURES

       Titan provides...

          ..."infinite size" graphs and
          "unlimited" users by means of a
          distributed storage engine.

          ...real-time local traversals (OLTP)
          and support for global batch
          processing via Hadoop (OLAP).


          ...distribution via the liberal, free,
          open source Apache 2 license.
matthias$ wget http://thinkaurelius/titan.zip
  % Total    % Received % Xferd Average Speed    Time     Time
100 99999    0 99999    0     0 11078       0 --:--:--   0:01:01
matthias$ unzip titan.zip
Archive: titan.zip
   creating: titan/
   ...
matthias$ cd titan
titan$ bin/gremlin.sh

         ,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin>
LOCAL MACHINE DEMO

gremlin> g = TitanFactory.open('/tmp/local-titan')
==>titangraph[local:/tmp/local-titan]
gremlin> g.createKeyIndex('name',Vertex.class)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
[Figure: the Graph of the Gods.
   Vertices: saturn (type:titan); jupiter, neptune, pluto (type:god); hercules (type:demigod); alcmene (type:human); nemean, hydra, cerberus (type:monster); sky, sea, tartarus (type:location).
   Edges: jupiter father saturn; hercules father jupiter; hercules mother alcmene; brother edges among jupiter, neptune, and pluto; jupiter lives sky; neptune lives sea; pluto lives tartarus; cerberus lives tartarus; pluto pet cerberus; hercules battled nemean (time:1), hydra (time:2), and cerberus (time:12).]

gremlin> g.loadGraphML('data/graph-of-the-gods.xml')
==>null


* The Graph of the Gods is a toy dataset distributed with Titan
gremlin> hercules = g.V('name','hercules').next()
==>v[24]
gremlin> hercules.out('mother','father')
==>v[44]
==>v[16]
gremlin> hercules.out('mother','father').name
==>alcmene
==>jupiter
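
The demo creates a key index on 'name' before looking up hercules by name. The Python sketch below illustrates what such an index buys: a lookup by property value becomes a dictionary access instead of a scan over every vertex. It is an illustration only and says nothing about how Titan stores its indices; the class `TinyGraph` and its methods are hypothetical, and the vertex ids are the ones printed in the console session.

```python
from collections import defaultdict

class TinyGraph:
    """Toy vertex store with an optional key index (illustrative only)."""

    def __init__(self):
        self.vertices = {}   # vertex id -> property dict
        self.indices = {}    # key -> {value -> set of vertex ids}

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props
        # Keep every existing index up to date as vertices arrive.
        for key, index in self.indices.items():
            if key in props:
                index[props[key]].add(vid)

    def create_key_index(self, key):
        """Build an index for `key` over the vertices added so far."""
        index = defaultdict(set)
        for vid, props in self.vertices.items():
            if key in props:
                index[props[key]].add(vid)
        self.indices[key] = index

    def lookup(self, key, value):
        """Return ids of vertices whose property `key` equals `value`."""
        if key in self.indices:
            return set(self.indices[key].get(value, set()))  # direct index hit
        # No index on this key: fall back to scanning every vertex.
        return {vid for vid, p in self.vertices.items() if p.get(key) == value}

g = TinyGraph()
g.create_key_index("name")
g.add_vertex(24, name="hercules", type="demigod")
g.add_vertex(16, name="jupiter", type="god")
g.add_vertex(44, name="alcmene", type="human")
print(g.lookup("name", "hercules"))
```

Looking up by "type" still works in this sketch, but it takes the scan path because no index was created for that key.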
THAT WAS TITAN LOCAL.

        NEXT IS TITAN DISTRIBUTED.




Broecheler, M., Pugliese, A., Subrahmanian, V.S., "COSI: Cloud Oriented Subgraph Identification in Massive Social Networks,"
Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 248-255, 2010.
http://www.knowledgefrominformation.com/2010/08/01/cosi-cloud-oriented-subgraph-identification-in-massive-social-networks/
BACKEND AGNOSTIC

[Figure: Cassandra -OR- HBase]
TITAN DISTRIBUTED VIA CASSANDRA


titan$ bin/gremlin.sh

         ,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> conf = new BaseConfiguration();
==>org.apache.commons.configuration.BaseConfiguration@763861e6
gremlin> conf.setProperty("storage.backend","cassandra");
gremlin> conf.setProperty("storage.hostname","77.77.77.77");
gremlin> g = TitanFactory.open(conf);
==>titangraph[cassandra:77.77.77.77]
gremlin>




* There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
INHERITED FEATURES




      Continuously available with no single point of failure.


      No write bottlenecks to the graph as there is no master/slave architecture.


      Built-in replication ensures data is available during machine failure.


      Caching layer ensures that continuously accessed data is available in memory.


      Elastic scalability allows for the introduction and removal of machines.


Cassandra available at http://cassandra.apache.org/
TITAN DISTRIBUTED VIA HBASE


titan$ bin/gremlin.sh

         ,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> conf = new BaseConfiguration();
==>org.apache.commons.configuration.BaseConfiguration@763861e6
gremlin> conf.setProperty("storage.backend","hbase");
gremlin> conf.setProperty("storage.hostname","77.77.77.77");
gremlin> g = TitanFactory.open(conf);
==>titangraph[hbase:77.77.77.77]
gremlin>




INHERITED FEATURES




      Strictly consistent reads and writes.


      Linear scalability with the addition of machines.


      Base classes for backing Hadoop MapReduce jobs with HBase tables.


      HDFS-based data replication.


      Generally good integration with the tools in the Hadoop ecosystem.



HBase available at http://hbase.apache.org/
TITAN AND THE CAP THEOREM
[Figure: diagram labeled Consistency, Availability, and Partitionability]
Titan is all about numerous concurrent users...
                                      high availability....
                                             dynamic scalability...
THE HOW OF TITAN
DATA MANAGEMENT



  EDGE COMPRESSION



   VERTEX-CENTRIC INDICES
THE HOW OF TITAN
DATA MANAGEMENT
DATA MANAGEMENT
                MAIN DESIGN PRINCIPLES
Immutable, Atomic Edges                           Optimistic Concurrency Control
    hercules                     cerberus
                  battled
1

    hercules      time:12        cerberus

2
                  battled
                                                      +           +
                                                                          +
    hercules
                  time:12
               successful:true   cerberus
                                                              +       -
3
                  battled
                                                          +

                                 Fine-Grained Locking Control
DATA MANAGEMENT
                              TYPE DEFINITION
Datatype Constraints
  TitanKey timeKey =
    g.makeType().name("time")
     .dataType(Integer.class)

  [Figure: property values time:12 and time:"twelve"]

Edge Label Signatures
  TitanLabel battled =
    g.makeType().name("battled")
     .signature(timeKey)

  [Figure: hercules -battled (time:12)- cerberus]

Functional Declarations
  TitanLabel father =
    g.makeType().name("father")
     .functional()

  [Figure: hercules with father edges to jupiter and to mars]




Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
DATA MANAGEMENT
                                  TYPE DEFINITION


Endogenous Indices
  g.createKeyIndex("name",Vertex.class)

  [Figure: vertices name:hercules, name:hermes, name:jupiter]

Unique Property Key/Value Pairs
  TitanKey status =
    g.makeType().name("status")
     .unique()

  [Figure: name:jupiter and name:neptune, both with status:king of the gods]




DATA MANAGEMENT
               LOCKING SYSTEM
Ensures consistency over non-consistent storage backends.
       [Figure: two concurrent writes of a father edge from hercules, one to
        jupiter and one to neptune; the stored graph shows hercules -father- jupiter]

       1. Acquire lock at the end of the transaction.
          - locking mechanism depends on storage
             layer consistency guarantees.

       2. Verify original read.

       3. Fail transaction if any precondition is violated.
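
The three steps above can be sketched in plain Java. This is a minimal illustration, not Titan's locking code: `FunctionalEdgeStore` and `Tx` are hypothetical names, an in-memory map stands in for the storage backend, and a single `ReentrantLock` stands in for whatever locking mechanism the storage layer provides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical in-memory stand-in for a storage backend (assumption, not Titan code). */
final class FunctionalEdgeStore {
    // vertex name -> target of its single (functional) "father" edge
    private final Map<String, String> fatherOf = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    /** A transaction remembers what it read so the read can be re-checked at commit. */
    final class Tx {
        private final String vertex;
        private final String observed; // value seen at read time, null if no edge existed
        private String pending;

        Tx(String vertex) {
            this.vertex = vertex;
            this.observed = father(vertex);
        }

        void setFather(String target) {
            pending = target;
        }

        /** Returns true if the write was applied, false if the transaction failed. */
        boolean commit() {
            lock.lock();                                // 1. acquire lock at the end of the transaction
            try {
                String current = fatherOf.get(vertex);  // 2. verify original read
                if (!Objects.equals(current, observed)) {
                    return false;                       // 3. fail: a precondition was violated
                }
                fatherOf.put(vertex, pending);
                return true;
            } finally {
                lock.unlock();
            }
        }
    }

    Tx begin(String vertex) {
        return new Tx(vertex);
    }

    String father(String vertex) {
        lock.lock();
        try {
            return fatherOf.get(vertex);
        } finally {
            lock.unlock();
        }
    }
}
```

With two transactions that both start before either commits, the first commit succeeds and the second fails because its original read (no father edge) no longer holds.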
DATA MANAGEMENT
     ID MANAGEMENT

  Global ID Pool Maintained by Storage Engine
    [0,1,2,3,4,5,6,7,8,9,10,11]

  Pool Subsets Assigned to Individual Instances
    [0,1,2]   [3,4,5]   [6,7,8]   [9,10,11]
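
Block-wise ID assignment can be sketched as follows. This is an assumption-laden illustration rather than Titan's implementation: `GlobalIdPool` and `InstanceIdAllocator` are hypothetical names, and an `AtomicLong` plays the role of the pool that the slides say is maintained by the storage engine.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical global pool; stands in for the counter kept by the storage engine. */
final class GlobalIdPool {
    private final AtomicLong next = new AtomicLong(0);
    private final int blockSize;

    GlobalIdPool(int blockSize) {
        this.blockSize = blockSize;
    }

    /** Atomically reserves the next block and returns its first id. */
    long reserveBlock() {
        return next.getAndAdd(blockSize);
    }

    int blockSize() {
        return blockSize;
    }
}

/** One database instance: hands out ids from its own block, contacting the pool only when the block runs out. */
final class InstanceIdAllocator {
    private final GlobalIdPool pool;
    private long cursor = 0;
    private long limit = 0; // exclusive end of the current block; empty at start

    InstanceIdAllocator(GlobalIdPool pool) {
        this.pool = pool;
    }

    synchronized long nextId() {
        if (cursor == limit) {              // block exhausted (or never reserved)
            cursor = pool.reserveBlock();
            limit = cursor + pool.blockSize();
        }
        return cursor++;
    }
}
```

Because each instance only coordinates once per block, ID assignment needs no global round trip for most new elements; the cost is that IDs are not handed out in strictly increasing order across instances.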
THE HOW OF TITAN




EDGE COMPRESSION
EDGE COMPRESSION
              Natural graphs have a small world, community/cluster property.




                     Community 1                                         Community 2



                        High intra-connectivity within a community and
                        low inter-connectivity between communities.


Watts, D. J., Strogatz, S. H., "Collective Dynamics of 'Small-World' Networks,"
Nature 393 (6684), pp. 440–442, 1998.
EDGE COMPRESSION

  12345678 -knows- 12345683

  12345678     9     12345683   24 bytes

  12345678     9        +5

  12345678  9  +5               7 bytes
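
The idea on this slide, storing the neighbor as a small offset from the source ID and then writing small numbers in fewer bytes, can be sketched in Java. The slides do not specify Titan's byte layout, so the encoding below (7-bit variable-length integers plus ZigZag for signed deltas) is an assumption chosen for illustration, and `DeltaEdgeCodec` is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

final class DeltaEdgeCodec {
    /** Writes an unsigned value using 7 bits per byte; the high bit marks "more bytes follow". */
    static void writeVarLong(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarLong(ByteBuffer in) {
        long result = 0;
        int shift = 0;
        byte b;
        do {
            b = in.get();
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** ZigZag maps small positive and negative deltas to small unsigned values. */
    static long zigZag(long v) {
        return (v << 1) ^ (v >> 63);
    }

    static long unZigZag(long v) {
        return (v >>> 1) ^ -(v & 1);
    }

    /** Encodes (source id, label id, target id), storing the target as a delta from the source. */
    static byte[] encode(long source, long label, long target) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarLong(out, source);
        writeVarLong(out, label);
        writeVarLong(out, zigZag(target - source));
        return out.toByteArray();
    }

    /** Returns {source, label, target}. */
    static long[] decode(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        long source = readVarLong(in);
        long label = readVarLong(in);
        long target = source + unZigZag(readVarLong(in));
        return new long[] {source, label, target};
    }
}
```

For the slide's edge (12345678, 9, 12345683) this sketch emits 6 bytes instead of the 24 bytes of three fixed-width longs. The slide reports 7 bytes, so Titan's actual layout evidently differs in detail. The saving depends on neighbors having nearby IDs, which is where the community structure of natural graphs comes in.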
THE HOW OF TITAN




VERTEX-CENTRIC INDICES
VERTEX-CENTRIC INDICES
        THE SUPER NODE PROBLEM



Natural, real-world graphs contain
vertices of high degree.


Even if rare, their degree ensures that
they exist on many paths.


Traversing a high degree vertex
means touching numerous incident
edges and potentially touching most
of the graph in only a few steps.
VERTEX-CENTRIC INDICES
           A SUPER NODE SOLUTION


A "super node" only exists from the
vantage point of classic "textbook
style" graphs.


In the world of property graphs,
intelligent disk-level filtering can
interpret a "super node" as a more
manageable low-degree vertex.


Vertex-centric querying utilizes B-Trees
and sort orders for speedy lookup of
incident edges with particular qualities.
VERTEX-CENTRIC INDICES
  PUSHDOWN PREDICATES

  [Figure: a vertex with 8 incident edges: 5 likes edges (stars:5, stars:2,
   stars:2, stars:3, stars:3) and 3 knows edges]

  vertex.query()                                      8 edges

  vertex.query().direction(OUT)                       7 edges

  vertex.query().direction(OUT)
    .labels("likes")                                  5 edges

  vertex.query().direction(OUT)
    .labels("likes").has("stars",5)                   1 edge
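
To make the edge counts concrete, the sketch below applies the same three predicates to an edge set read off the figure. It only shows what each predicate selects: it filters in memory, which is exactly the work that pushdown moves to the disk level. `PredicateDemo` is a hypothetical name, and which of the three knows edges is incoming is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class PredicateDemo {
    enum Direction { IN, OUT }

    static final class Edge {
        final Direction direction;
        final String label;
        final Integer stars; // null when the edge carries no stars property

        Edge(Direction direction, String label, Integer stars) {
            this.direction = direction;
            this.label = label;
            this.stars = stars;
        }
    }

    // Edge set reconstructed from the slide's figure (5 likes, 3 knows).
    static final List<Edge> INCIDENT = Arrays.asList(
        new Edge(Direction.OUT, "likes", 5),
        new Edge(Direction.OUT, "likes", 2),
        new Edge(Direction.OUT, "likes", 2),
        new Edge(Direction.OUT, "likes", 3),
        new Edge(Direction.OUT, "likes", 3),
        new Edge(Direction.OUT, "knows", null),
        new Edge(Direction.OUT, "knows", null),
        new Edge(Direction.IN, "knows", null));

    static final Predicate<Edge> OUT = e -> e.direction == Direction.OUT;
    static final Predicate<Edge> LIKES = e -> e.label.equals("likes");
    static final Predicate<Edge> FIVE_STARS = e -> Integer.valueOf(5).equals(e.stars);

    /** Counts the incident edges that satisfy the given predicate. */
    static long count(Predicate<Edge> predicate) {
        return INCIDENT.stream().filter(predicate).count();
    }
}
```

Chaining the predicates with `Predicate.and` narrows the result from 8 to 7 to 5 to 1 edge, matching the counts on the slides.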
VERTEX-CENTRIC INDICES
             PUSHDOWN PREDICATES



PREDICATES
             Query   Query.direction(Direction)
             Query   Query.labels(String... labels)
             Query   Query.has(String, Object, Compare)
             Query   Query.has(String, Object)
             Query   Query.range(String, Object, Object)

GETTERS
             Iterable<Vertex> Query.vertices()
             Iterable<Edge> Query.edges()
VERTEX-CENTRIC INDICES
DISK-LEVEL SORTING/INDEXING

  [Figure: a vertex's incident edges in sorted order: battled (time:1),
   battled (time:2), battled (time:12), knows, knows. New battled and knows
   edges are placed next to edges of the same label. Bracketed ranges mark
   "battled w/ time 1-5" and "battled w/ time 5-10".]

  TitanLabel battled =
    g.makeType().name("battled")
     .primaryKey(time)
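
Why a sort order helps can be sketched with a sorted map: if incident edges are keyed by label and then by time, a query such as "battled with time 1-5" becomes one contiguous range scan instead of a pass over every edge. This is an in-memory analogy under stated assumptions, not Titan's on-disk format; `SortedAdjacency` and `Key` are hypothetical names, and the sketch assumes at most one edge per (label, time) pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.TreeMap;

final class SortedAdjacency {
    /** Composite sort key: label first, then time. */
    static final class Key implements Comparable<Key> {
        final String label;
        final long time;

        Key(String label, long time) {
            this.label = label;
            this.time = time;
        }

        @Override
        public int compareTo(Key other) {
            int byLabel = label.compareTo(other.label);
            return byLabel != 0 ? byLabel : Long.compare(time, other.time);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && compareTo((Key) o) == 0;
        }

        @Override
        public int hashCode() {
            return Objects.hash(label, time);
        }
    }

    // Incident edges of one vertex, kept in key order (value = neighbor name).
    private final TreeMap<Key, String> edges = new TreeMap<>();

    void addEdge(String label, long time, String target) {
        edges.put(new Key(label, time), target);
    }

    /** Neighbors reached by edges with the given label and time in [from, to], in time order. */
    List<String> range(String label, long from, long to) {
        return new ArrayList<>(
            edges.subMap(new Key(label, from), true, new Key(label, to), true).values());
    }
}
```

The range lookup touches only the matching slice of the map, so its cost grows with the size of the answer rather than with the degree of the vertex.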
VERTEX-CENTRIC INDICES
DISK-LEVEL SORTING/INDEXING

  [Figure: incident edges with labels brother, father, mother, knows, and
   battled; brother, father, and mother are bracketed as the group "family"]

  TypeGroup family =
    TypeGroup.of(2,"family");
  TitanLabel father =
    g.makeType().name("father")
     .group(family).makeEdgeLabel();
  TitanLabel mother =
    g.makeType().name("mother")
     .group(family).makeEdgeLabel();
  TitanLabel brother =
    g.makeType().name("brother")
     .group(family).makeEdgeLabel();

  vertex.query().group("family")...
THAT IS HOW TITAN WORKS
DATA MANAGEMENT



  EDGE COMPRESSION



   VERTEX-CENTRIC INDICES
WHAT IF YOU WANTED TO CREATE
  TWITTER FROM SCRATCH?




     SIMULATING TWITTER
3 BILLION EDGES
   100 MILLION VERTICES
10000 CONCURRENT USERS
         50 MACHINES
     1 GRAPH DATABASE




     COMING JULY 2012
PART 3:
THE FUTURE OF AURELIUS




MARKO A. RODRIGUEZ   MATTHIAS BROECHELER
AURELIUS' GRAPH
    COMPUTING STORY
Titan as the highly scalable, distributed graph database solution.

Titan as the source (and potential sink) for other graph
processing solutions.

   OLTP                                       OLAP
FAUNUS




GOD OF HERDS
FAUNUS
PATH ALGEBRA FOR HADOOP
  hercules -battled- cretan bull -battled- theseus

                 A · Aᵀ ◦ n(I)

  hercules -ally- theseus


Derived graphs are single-relational and are typically much smaller than
their multi-relational source. Therefore, derived graphs can be subjected to
textbook-style graph algorithms in both a meaningful and efficient manner.

  WHO IS THE MOST CENTRAL ALLY?
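
The derivation of the ally graph can be worked through on the three vertices in the figure. The sketch reads the expression as the battled adjacency matrix multiplied by its own transpose, with ◦ as the entry-wise product and n(I) as the matrix that is zero on the diagonal and one elsewhere; these readings of the slide's notation are assumptions, and `PathAlgebra` is a hypothetical name. Faunus is described as doing this with Map/Reduce; the dense in-memory arrays here are only for illustration.

```java
final class PathAlgebra {
    /** Ordinary matrix product of two square matrices of equal size. */
    static int[][] multiply(int[][] x, int[][] y) {
        int n = x.length;
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    result[i][j] += x[i][k] * y[k][j];
                }
            }
        }
        return result;
    }

    static int[][] transpose(int[][] x) {
        int n = x.length;
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                result[j][i] = x[i][j];
            }
        }
        return result;
    }

    /** Entry-wise product with n(I): keeps off-diagonal entries and zeroes the diagonal (no self-loops). */
    static int[][] dropSelfLoops(int[][] x) {
        int n = x.length;
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                result[i][j] = (i == j) ? 0 : x[i][j];
            }
        }
        return result;
    }

    /** Two vertices are allies when they battled a common opponent. */
    static int[][] allies(int[][] battled) {
        return dropSelfLoops(multiply(battled, transpose(battled)));
    }
}
```

With vertices indexed 0 = hercules, 1 = cretan bull, 2 = theseus and battled edges 0→1 and 2→1, the only non-zero entries of the result are (0,2) and (2,0): hercules and theseus are allies, and neither is his own ally.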
FAUNUS
PATH ALGEBRA FOR HADOOP
                       
  B = A · Aᵀ ◦ n(I)                          B · B ◦ n(I)

  [Figure: two graphs of ally edges, one under each expression]

                My allies' allies are my allies.

                      (A · Aᵀ)² ◦ n(I)
FAUNUS
    PATH ALGEBRA FOR HADOOP
                       Used for global graph operations.

                              Implements the multi-relational path algebra
                              as a collection of Map/Reduce operations


                       Reduce a massive property graph into a smaller
                       semantically-rich single-relational graph.

                                     Project codename: TinkerPoop




           Support for HadoopGraph and HDFS file formats

Rodriguez M.A., Shinavier, J., “Exposing Multi-Relational Networks to
Single-Relational Network Analysis Algorithms,” Journal of Informetrics,
4(1), pp. 29-41, 2009. http://arxiv.org/abs/0806.2274
FULGORA




GODDESS OF LIGHTNING
FULGORA
         AN EFFICIENT IN-MEMORY
              GRAPH ENGINE
                               Non-transactional, in-memory graph engine.
                               It is not a database.

                               Process ~90 billion edges in 68-Gigs of RAM
                                assuming a small world topology.


                           Perform complex graph algorithms in-memory.
                              global graph analysis
                              multi-relational graph analysis




Similar in spirit to Twitter's Cassovary: https://github.com/twitter/cassovary
THE AURELIUS OLAP FLOW

  1. Stores a massive-scale property graph
       |  Map/Reduce
       v
  2. Generates a large-scale single-relational graph
       |  Load into RAM on a single machine
       v
  3. Analyzes compressed, large-scale single or multi-relational graphs in memory
       -> to a stats package

  Results flow back into the stored graph:
    Update graph with derived edges                         (hercules -ally- theseus)
    Update element properties with algorithm results       (hercules ally_centrality:0.0123)
AURELIUS' USE OF BLUEPRINTS

    Aurelius products use the Blueprints API so any
    graph product can communicate with any other
    graph product.



    The code for graph databases, frameworks,
    algorithms, and batch-processing is written in terms
    of the Blueprints API.



    Aurelius encourages developers to use Blueprints/
    TinkerPop in order to grow a rich ecosystem of
    interoperable graph technologies.
THE GRAPH LANDSCAPE
                                      REPRISE
   [Figure: graph technologies plotted by Speed of Traversal/Process against Size of Graph/Structure]
* Not to scale. Did not want to overlap logos.
NEXT STEPS
                              Make use of and/or contribute to the
                               free, open source Titan product.

Learn about applying graph
theory and network science.




   http://thinkaurelius.com


                                 http://thinkaurelius.github.com/titan/
THANK YOU
CREDITS
    PRESENTERS
MARKO A. RODRIGUEZ
MATTHIAS BROECHELER

 FINANCIAL SUPPORT
 PEARSON EDUCATION
      AURELIUS

LOCATION PROVISIONS
   JIVE SOFTWARE

   MANY THANKS TO
    DAN LAROCQUE
TINKERPOP COMMUNITY
  STEPHEN MALLETTE
    BOBBY NORTON
     KETRINA YIM

More Related Content

What's hot

Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakHakka Labs
 
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...DataWorks Summit
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Josef A. Habdank
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...HostedbyConfluent
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...Databricks
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLScyllaDB
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flinkmxmxm
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Adrien Blind
 
Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014Claudio Martella
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowDatabricks
 
Advanced SQL For Data Scientists
Advanced SQL For Data ScientistsAdvanced SQL For Data Scientists
Advanced SQL For Data ScientistsDatabricks
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingYu Huang
 

What's hot (20)

Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe Crobak
 
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
 
Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014
 
Hadoop
HadoopHadoop
Hadoop
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
 
Advanced SQL For Data Scientists
Advanced SQL For Data ScientistsAdvanced SQL For Data Scientists
Advanced SQL For Data Scientists
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 

Viewers also liked

Corporate social responsibility by Monika Sukhija
Corporate social responsibility by Monika SukhijaCorporate social responsibility by Monika Sukhija
Corporate social responsibility by Monika SukhijaMonika Sukhija
 
Va rail update
Va rail updateVa rail update
Va rail updateJeff South
 
2015-Amtrak-Sustainability-Report
2015-Amtrak-Sustainability-Report2015-Amtrak-Sustainability-Report
2015-Amtrak-Sustainability-ReportCasey Luddy
 
Epgp (one year) 2009-10_cf_ assignment_#3_14jan10
Epgp (one year) 2009-10_cf_ assignment_#3_14jan10Epgp (one year) 2009-10_cf_ assignment_#3_14jan10
Epgp (one year) 2009-10_cf_ assignment_#3_14jan10Rajendra Inani
 
Amtrak -- Innovation Policy
Amtrak -- Innovation PolicyAmtrak -- Innovation Policy
Amtrak -- Innovation PolicyJeffreyBo
 
National rail road passenger corporation ( amtrak)
National rail road passenger corporation ( amtrak)National rail road passenger corporation ( amtrak)
National rail road passenger corporation ( amtrak)mario manurung
 
Amtrak Services Presentation
Amtrak  Services PresentationAmtrak  Services Presentation
Amtrak Services Presentationmjpiscadlo
 
Amtrak Marketing Presentation
Amtrak Marketing PresentationAmtrak Marketing Presentation
Amtrak Marketing PresentationBrent Aguilar
 
How to Improve Amtrak - America's Railroad
How to Improve Amtrak - America's RailroadHow to Improve Amtrak - America's Railroad
How to Improve Amtrak - America's Railroaddurangokid123
 
Amtrak - Meet Our Company
Amtrak - Meet Our CompanyAmtrak - Meet Our Company
Amtrak - Meet Our CompanySrlaupan
 
The Amtrak Funding Debate:Why Amtrak Should Continue to Receive Federal Subsi...
The Amtrak Funding Debate:Why Amtrak Should Continue to Receive Federal Subsi...The Amtrak Funding Debate:Why Amtrak Should Continue to Receive Federal Subsi...
The Amtrak Funding Debate:Why Amtrak Should Continue to Receive Federal Subsi...guest4330129
 
THE COMPANY I ADMIRE THE MOST FOR ITS CORPORATE SOCIAL RESPONSIBILITY
THE COMPANY I ADMIRE THE MOST FOR ITS CORPORATE SOCIAL RESPONSIBILITYTHE COMPANY I ADMIRE THE MOST FOR ITS CORPORATE SOCIAL RESPONSIBILITY
THE COMPANY I ADMIRE THE MOST FOR ITS CORPORATE SOCIAL RESPONSIBILITYSwarupa Rani Sahu
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in PythonImry Kissos
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learningjoshwills
 
A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)Prof. Dr. Diego Kuonen
 

Viewers also liked (20)

Corporate social responsibility by Monika Sukhija
Corporate social responsibility by Monika SukhijaCorporate social responsibility by Monika Sukhija
Corporate social responsibility by Monika Sukhija
 
Va rail update
Va rail updateVa rail update
Va rail update
 
vivek icbm 2011ppt
vivek icbm 2011pptvivek icbm 2011ppt
vivek icbm 2011ppt
 
2015-Amtrak-Sustainability-Report
Titan: The Rise of Big Graph Data

  • 1. TITAN THE RISE OF BIG GRAPH DATA MARKO A. RODRIGUEZ MATTHIAS BROECHELER http://THINKAURELIUS.COM
  • 2. ABSTRACT A graph is a data structure composed of vertices/dots and edges/lines. A graph database is a software system used to persist and process graphs. The common conception in today's database community is that there is a tradeoff between the scale of data and the complexity/interlinking of data. To challenge this understanding, Aurelius has developed Titan under the liberal Apache 2 license. Titan supports both the size of modern data and the modeling power of graphs to usher in the era of Big Graph Data. Novel techniques in edge compression, data layout, and vertex-centric indices that exploit significant orders are used to facilitate the representation and processing of a single atomic graph structure across a multi-machine cluster. To ensure ease of adoption by the graph community, Titan natively implements the TinkerPop 2 Blueprints API. This presentation will review the graph landscape, Titan's techniques for scale by distribution, and a collection of satellite graph technologies to be released by Aurelius in the coming summer months of 2012.
  • 3. SPEAKER BIOGRAPHIES Dr. Marko A. Rodriguez is the founder of the graph consulting firm Aurelius. He has focused his academic and commercial career on the theoretical and applied aspects of graphs. Marko is a cofounder of TinkerPop and the primary developer of the Gremlin graph traversal language. Dr. Matthias Broecheler has been researching and developing large-scale graph database systems for many years in both academia and in his role as a cofounder of the Aurelius graph consulting firm. He is the primary developer of the distributed graph database Titan. Matthias focuses most of his time and effort on novel OLTP and OLAP graph processing solutions.
  • 4. SPONSORS As the leading education services company, Pearson is serious about evolving how the world learns. We apply our deep education experience and research, invest in innovative technologies, and promote collaboration throughout the education ecosystem. Real change is our commitment and its results are delivered through connecting capabilities to create actionable, scalable solutions that improve access, affordability, and achievement. Aurelius is a team of software engineers and scientists committed to applying graph theory and network science to problems in numerous domains. Aurelius develops the theory and technology whereby graphs can be used to model, understand, predict, and influence the behavior of complex, interrelated social, economic, and physical networks. Jive is the pioneer and world's leading provider of social business solutions. Our products apply powerful technology that helps people connect, communicate and collaborate to get more work done and solve their biggest business challenges. Millions of users and many of the world's most successful companies rely on Jive day in and day out to get work done, serve their customers and stay ahead of their competitors.
  • 5. OUTLINE 1. THE GRAPH LANDSCAPE An introduction to graph computing. Graph technologies on the market today. 2. INTRODUCTION TO TITAN Getting up and running with Titan. Titan's techniques for scalability. 3. THE FUTURE OF AURELIUS Satellite technologies and the OLAP story. The graph landscape reprise.
  • 6. PART 1: THE GRAPH LANDSCAPE MARKO A. RODRIGUEZ
  • 8. EDGE VERTEX GRAPH
  • 9. EDGE VERTEX GRAPH G = (V, E) Graph Vertices Edges
  • 10. G = (V, E) Classic Textbook Graph Structure
  • 11. A homogeneous set of vertices... V
  • 12. ...connected by a homogeneous set of edges. E
  • 13. RESTRICTED MODELING People and follows relationships...
  • 14. RESTRICTED MODELING People and follows relationships... ...xor webpages and citations.
  • 15. AN INTEGRATED MODEL IS TYPICALLY DESIRED references createdBy follows references references follows mentions
  • 16. AN INTEGRATED MODEL IS USEFUL references createdBy follows references references follows mentions Allows for more interesting/novel algorithms. (beyond "textbook" graph algorithms) Allows for a universal model of things and their relationships. (a single, unified model of a domain of interest)
  • 17. THE PROPERTY GRAPH G = (V, E, λ) Current Popular Graph Structure * Directed, attributed, edge-labeled graph * Multi-relational graph with key/value pairs on the elements
  • 20. PROPERTIES KEY VALUE name:hercules VERTEX
  • 22. name:hercules mother name:alcmene type:human
  • 23. name:hercules LABEL mother EDGE name:alcmene type:human
  • 24. name:hercules mother name:alcmene type:human
  • 25. name:hercules mother father name:jupiter name:alcmene type:god type:human
  • 26. IS HERCULES A DEMIGOD? DEMIGOD = HALF HUMAN + HALF GOD name:hercules mother father name:jupiter name:alcmene type:god type:human
  • 27. name:hercules mother father name:jupiter name:alcmene type:god type:human gremlin> hercules ==>v[0]
  • 28. name:hercules mother father name:jupiter name:alcmene type:god type:human gremlin> hercules.out('mother','father') ==>v[1] ==>v[2]
  • 29. DEMIGOD = HALF HUMAN + HALF GOD name:hercules mother father name:jupiter name:alcmene type:god type:human gremlin> hercules.out('mother','father').type ==>human ==>god
  • 30. DEMIGOD = HALF HUMAN + HALF GOD name:hercules type:demigod mother father name:jupiter name:alcmene type:god type:human gremlin> hercules.type = 'demigod' ==>demigod
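The demigod check on slides 26 to 30 can be sketched outside of Gremlin. The following is a minimal plain-Python illustration, not Titan or Blueprints code: the dictionary layout and the `out` helper are hypothetical stand-ins for the property graph and the Gremlin `out` step, and the vertex ids 1 and 2 for alcmene and jupiter are assumed from the order of the console output on the slides.

```python
# A tiny in-memory property graph: vertices carry key/value properties,
# edges are directed and labeled. (Illustrative layout, not Titan's storage model.)
vertices = {
    0: {"name": "hercules"},
    1: {"name": "alcmene", "type": "human"},  # id 1 assumed from slide output order
    2: {"name": "jupiter", "type": "god"},    # id 2 assumed from slide output order
}
# Each edge is (tail vertex id, label, head vertex id).
edges = [
    (0, "mother", 1),
    (0, "father", 2),
]

def out(vertex_id, *labels):
    """Return ids of vertices reached over outgoing edges carrying one of the labels."""
    return [head for tail, label, head in edges
            if tail == vertex_id and label in labels]

# hercules.out('mother','father')
parents = out(0, "mother", "father")
# hercules.out('mother','father').type
parent_types = [vertices[p]["type"] for p in parents]

# DEMIGOD = HALF HUMAN + HALF GOD
if sorted(parent_types) == ["god", "human"]:
    vertices[0]["type"] = "demigod"

print(parents, parent_types, vertices[0]["type"])
```

The point of the sketch is that the property-graph model needs nothing more than labeled edges and key/value maps on the elements; the traversal language is a convenience layered on top.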
  • 31. COMPUTING PROCESS STRUCTURE
  • 32. COMPUTING PROCESS STRUCTURE TRAVERSAL GRAPH
  • 33. COMPUTING PROCESS STRUCTURE TRAVERSAL GRAPH GRAPH-BASED COMPUTING
  • 35. WHY GRAPH-BASED COMPUTING? INTUITIVE MODELING
  • 36. WHY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING
  • 37. WHY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING NUMEROUS ANALYSES Mixing Patterns Ranking Inference Motifs Path Expressions Centrality Scoring Geodesics
  • 38. ANALYSES ARE THE EPIPHENOMENA OF TRAVERSAL f( )→
  • 39. WHAT IS THE SIGNIFICANCE OF GRAPH ANALYSIS?
  • 40. ANALYSES YIELD INSIGHTS ABOUT THE MODEL DATA PRODUCTS = DATA-DRIVEN DECISION SUPPORT
  • 41. RECOMMENDATION People you may know. SOCIAL GRAPH Products you might like. RATINGS GRAPH Movies you should watch and the friends you should watch them with. SOCIAL+RATINGS GRAPH
  • 42. WHO ELSE MIGHT HERCULES KNOW? cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6
  • 43. cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 gremlin> hercules ==>v[0]
  • 44. cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 gremlin> hercules.out('knows') ==>v[1] ==>v[2] ==>v[3]
  • 45. cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 gremlin> hercules.out('knows').out('knows') ==>v[4] ==>v[5] ==>v[5] ==>v[6] ==>v[5]
  • 46. cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 gremlin> hercules.out('knows').out('knows').groupCount.cap ==>v[4]=1 ==>v[5]=3 ==>v[6]=1
  • 47. HERCULES PROBABLY KNOWS NEPTUNE cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 knows
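The friend-of-a-friend count on slides 43 to 47 can be reproduced with a short plain-Python sketch. This is an illustration rather than Gremlin: the `knows` adjacency map and the `out_knows` helper are hypothetical, and the second-hop edge assignment is an assumption chosen to be consistent with the counts printed on slide 46 (the flattened diagram does not show which creature knows which god).

```python
from collections import Counter

names = {0: "hercules", 1: "cerberus", 2: "nemean", 3: "hydra",
         4: "pluto", 5: "neptune", 6: "jupiter"}

# Outgoing 'knows' edges. The first hop matches slide 44; the second-hop
# assignment is assumed, chosen only to agree with the counts on slide 46.
knows = {
    0: [1, 2, 3],
    1: [4, 5],
    2: [5],
    3: [6, 5],
}

def out_knows(vertex_ids):
    """One 'knows' hop from every vertex in the list, keeping duplicates."""
    return [friend for v in vertex_ids for friend in knows.get(v, [])]

friends = out_knows([0])                # hercules.out('knows')
friends_of_friends = out_knows(friends) # hercules.out('knows').out('knows')
counts = Counter(friends_of_friends)    # the groupCount step

# The vertex reached over the most distinct paths is the best suggestion.
best_id, best_count = counts.most_common(1)[0]
print(names[best_id], best_count)
```

Keeping duplicates in the second hop is what makes the ranking work: each duplicate is one more path from hercules to the candidate.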
  • 48. HERCULES PROBABLY KNOWS NEPTUNE cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows hydra jupiter knows 3 6 knows THIS IS A "TEXTBOOK STYLE" GRAPH
  • 49. HERCULES PROBABLY KNOWS NEPTUNE cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father ...PROBABLY MORE SO WHEN OTHER TYPES OF EDGES ARE ANALYZED
  • 50. cerberus pluto knows 1 4 knows knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father
  • 51. cerberus pluto knows 1 4 knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father
  • 52. cerberus pluto knows 1 4 knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 53. human flesh 7 cerberus pluto knows 1 4 knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 54. likes human flesh 7 likes cerberus pluto knows 1 4 knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 55. tartarus 8 likes human flesh 7 likes cerberus pluto knows 1 4 knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 56. tartarus 8 likes human flesh likes likes 7 likes cerberus pluto knows 1 4 dislikes knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 57. tartarus 8 RATINGS GRAPH likes human flesh likes likes 7 likes cerberus pluto knows 1 4 dislikes knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH
  • 58. NEMEAN MIGHT LIKE TARTARUS PRODUCT GRAPH tartarus smellsOf 8 RATINGS GRAPH likes human flesh likes likes 7 likes cerberus pluto knows 1 4 dislikes composedOf knows likes knows hercules nemean neptune knows knows 0 2 5 knows knows brother hydra jupiter knows 3 6 father SOCIAL GRAPH * Collaborative Filtering + Content-Based Recommendation
  • 59. PATH FINDING How is this person related to this film? MOVIE GRAPH Which authors of this book also wrote a New York Times bestseller? BOOK GRAPH Which movies are based on a book by a New York Times bestseller? MOVIE+BOOK GRAPH
  • 60. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 61. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york gremlin> hercules ==>v[0]
  • 62. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york gremlin> hercules.out('depictedIn') ==>v[7]
  • 63. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie') ==>v[7]
  • 64. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') ==>v[8] ==>v[10]
  • 65. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role') ==>v[0] ==>v[6]
  • 66. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules) ==>v[0]
  • 67. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules).back(2) ==>v[8]
  • 68. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules).back(2).out('actor') ==>v[9]
  • 69. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie star hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules).back(2).out('actor') .as('star') ==>v[9]
  • 70. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie star hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules).back(2).out('actor') .as('star').select ==>[movie:v[7], star:v[9]]
  • 71. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 movie star hercules in new york gremlin> hercules.out('depictedIn').as('movie').out('hasActor') .out('role').retain(hercules).back(2).out('actor') .as('star').select{it.name} ==>[movie:hercules in new york, star:arnold schwarzenegger]
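The traversal built up on slides 61 to 71 depends on remembering where each traverser has been. A minimal plain-Python sketch of that idea follows; it is not Gremlin, the `step` helper is hypothetical, and the edge list is read off the console output on the slides (vertices 8 and 10 are the unnamed intermediate vertices linking the movie to an actor and a role). Each traverser is represented as the list of vertex ids it has visited, which is an illustrative simplification of what `as`, `back`, and `select` track.

```python
names = {0: "hercules", 6: "jupiter", 7: "hercules in new york",
         9: "arnold schwarzenegger", 11: "ernest graves"}

# (tail, label, head) edges, as implied by the query results on the slides.
edges = [
    (0, "depictedIn", 7), (6, "depictedIn", 7),
    (7, "hasActor", 8), (7, "hasActor", 10),
    (8, "role", 0), (10, "role", 6),
    (8, "actor", 9), (10, "actor", 11),
]

def step(paths, label):
    """Extend every path along each outgoing edge that carries the label."""
    return [path + [head]
            for path in paths
            for tail, lbl, head in edges
            if tail == path[-1] and lbl == label]

paths = [[0]]                             # hercules
paths = step(paths, "depictedIn")         # position 1 of each path is the 'movie'
paths = step(paths, "hasActor")
paths = step(paths, "role")
paths = [p for p in paths if p[-1] == 0]  # keep traversers that landed on hercules
paths = [p[:-1] for p in paths]           # return to the hasActor vertex, as back(2) does on the slide
paths = step(paths, "actor")              # the last position is the 'star'

# Project the two named positions to their names, like select{it.name}.
results = [{"movie": names[p[1]], "star": names[p[-1]]} for p in paths]
print(results)
```

Carrying the whole path makes the filter-then-backtrack pattern trivial, at the cost of more memory per traverser.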
  • 72. jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 73. jupiter hercules 6 0 depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 74. jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 75. fred saberhagen 13 writtenBy jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 76. fred albuquerque saberhagen livesIn 14 13 writtenBy jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 77. fred santa fe albuquerque saberhagen 25-North livesIn 15 14 13 writtenBy jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 78. marko fred rodriguez santa fe albuquerque saberhagen livesIn 25-North livesIn 16 15 14 13 writtenBy jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 79. marko fred rodriguez santa fe albuquerque saberhagen livesIn 25-North livesIn 16 15 14 13 thinksHeIs writtenBy jupiter hercules depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  • 80. TRANSPORTATION GRAPH marko fred rodriguez santa fe albuquerque saberhagen livesIn 25-North livesIn 16 15 14 13 thinksHeIs BOOK GRAPH writtenBy PROFILE jupiter hercules GRAPH depictedIn the arms of 6 0 12 hercules depictedIn role role depictedIn ernest arnold graves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york MOVIE GRAPH
  • 81. SOCIAL INFLUENCE Who are the most influential people in java, mathematics, art, surreal art, politics, ...? Which region of the social graph will propagate this advertisement the furthest? Which 3 experts should review this submitted article? Which people should I talk to at the upcoming conference and what topics should I talk to them about? SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
  • 82. PATTERN IDENTIFICATION This connectivity pattern is a sign of financial fraud. When this motif is found, a red flag will be raised. TRANSACTION GRAPH Healthy discourse is typified by a discussion board with a branch factor in this range and a concept clique score in this range. DISCUSSION GRAPH
  • 83. KNOWLEDGE DISCOVERY WIKIPEDIA GRAPH: The terms "ice", "fans", "stanley cup" are classified as "sports". EVIDENTIAL LOGIC GRAPH: Given that all identified birds fly, it can be deduced that all birds fly. If contrary evidence is provided, then this "fact" can be retracted.
  • 85. WORLD PROCESSES WORLD MODEL
  • 86. WORLD PROCESSES WORLD MODEL A single world model and various types of traversers moving through that model to solve problems.
  • 87. GRAPH-BASED COMPUTING STRUCTURE: GRAPH PROCESS: TRAVERSAL
  • 88. GRAPH COMPUTING ENGINES
  • 89. MEMORY-BASED GRAPHS [Diagram: an Application on top of a Graph Framework] NetworkX: http://networkx.lanl.gov/; iGraph: http://igraph.sourceforge.net/; JUNG: http://jung.sourceforge.net/
  • 90. DISK-BASED GRAPHS [Diagram: several Applications using one Graph Database] Neo4j: http://neo4j.org/; OrientDB: http://orientdb.org; InfiniteGraph: http://objectivity.com; DEX: http://www.sparsity-technologies.com/dex
  • 91. CLUSTER-BASED GRAPHS Bulk Synchronous Parallel Processing [Diagram: several Applications over elements numbered 1, 2, 3] Hama: http://incubator.apache.org/hama/; Giraph: http://incubator.apache.org/giraph/; GoldenOrb: http://goldenorbos.org/ * In the same spirit as Google's Pregel
  • 92. MEMORY-BASED GRAPHS Graph size is constrained by local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. * Based on typical behavior
  • 93. MEMORY-BASED GRAPHS Graph size is constrained by local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. DISK-BASED GRAPHS Graph size is constrained by local disk. Optimized for local graph algorithms. Oriented towards property graphs. * Based on typical behavior
  • 94. MEMORY-BASED GRAPHS Graph size is constrained by local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. DISK-BASED GRAPHS Graph size is constrained by local disk. Optimized for local graph algorithms. Oriented towards property graphs. CLUSTER-BASED GRAPHS Graph size is constrained by the cluster's total RAM. Optimized for global graph algorithms. Oriented towards "textbook-style" graphs. * Based on typical behavior
  • 95. TINKERPOP Support for various graph vendors Open source graph product group * Encompassing the various graph computing styles Simple, well-defined products Provides a vendor-agnostic graph framework http://tinkerpop.com * Based on future directions
  • 96. TINKERPOP Graph Server Graph Algorithms Object-Graph Mapper Traversal Language Dataflow Processing http://tinkerpop.com Generic Graph API http://${project.name}.tinkerpop.com
  • 98. AND NOW THERE IS ANOTHER...
  • 104. TITAN
  • 105. PART 2: INTRODUCTION TO TITAN MATTHIAS BROECHELER
  • 106. WHY CREATE TITAN? A number of Aurelius' clients... ...need to represent and process graphs at the 100+ billion edge scale w/ thousands of concurrent transactions. ...need both local graph traversals (OLTP) and batch graph processing (OLAP). ...desire a free, open source distributed graph database.
  • 107. TITAN'S KEY FEATURES Titan provides... ..."infinite size" graphs and "unlimited" users by means of a distributed storage engine. ...real-time local traversals (OLTP) and support for global batch processing via Hadoop (OLAP). ...distribution via the liberal, free, open source Apache 2 license.
  • 109. matthias$ wget http://thinkaurelius/titan.zip % Total % Received % Xferd Average Speed Time Time 100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01 matthias$
  • 110. matthias$ wget http://thinkaurelius/titan.zip % Total % Received % Xferd Average Speed Time Time 100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01 matthias$ unzip titan.zip Archive: titan.zip creating: titan/ ... matthias$
  • 111. matthias$ wget http://thinkaurelius/titan.zip % Total % Received % Xferd Average Speed Time Time 100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01 matthias$ unzip titan.zip Archive: titan.zip creating: titan/ ... matthias$ cd titan titan$
  • 112. matthias$ wget http://thinkaurelius/titan.zip % Total % Received % Xferd Average Speed Time Time 100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01 matthias$ unzip titan.zip Archive: titan.zip creating: titan/ ... matthias$ cd titan titan$ bin/gremlin.sh ,,,/ (o o) -----oOOo-(_)-oOOo----- gremlin>
  • 113. gremlin> g = TitanFactory.open('/tmp/local-titan') ==>titangraph[local:/tmp/local-titan]
  • 114. LOCAL MACHINE DEMO gremlin> g = TitanFactory.open('/tmp/local-titan') ==>titangraph[local:/tmp/local-titan]
  • 116. [Figure: the Graph of the Gods. Vertices (name, type): saturn (titan); jupiter, neptune, pluto (god); hercules (demigod); alcmene (human); sky, sea, tartarus (location); nemean, hydra, cerberus (monster). Edge labels: father, mother, brother, lives, pet, battled (with time:1, time:2, time:12).] gremlin> g.loadGraphML('data/graph-of-the-gods.xml') ==>null * The Graph of the Gods is a toy dataset distributed with Titan
  • 117. [Figure: the Graph of the Gods] gremlin> hercules = g.V('name','hercules').next() ==>v[24]
  • 118. [Figure: the Graph of the Gods] gremlin> hercules.out('mother','father') ==>v[44] ==>v[16]
  • 119. [Figure: the Graph of the Gods] gremlin> hercules.out('mother','father').name ==>alcmene ==>jupiter
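The traversal on slides 117 to 119 can be mimicked on a toy in-memory structure. The sketch below is a Python illustration only, not Titan or Gremlin code: the `TinyGraph` class and its methods are hypothetical names, the vertex ids 24, 44 and 16 are taken from the console output above, and the id 50 for cerberus is made up.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical in-memory property graph; not Titan's implementation."""

    def __init__(self):
        self.props = {}                      # vertex id -> property dict
        self.out_edges = defaultdict(list)   # vertex id -> [(label, target id)]

    def add_vertex(self, vid, **props):
        self.props[vid] = props

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))

    def out(self, vid, *labels):
        """Follow outgoing edges whose label is in `labels` (all labels if none given)."""
        return [t for (l, t) in self.out_edges[vid] if not labels or l in labels]


g = TinyGraph()
g.add_vertex(24, name="hercules", type="demigod")
g.add_vertex(44, name="alcmene", type="human")
g.add_vertex(16, name="jupiter", type="god")
g.add_vertex(50, name="cerberus", type="monster")   # id 50 is an assumption
g.add_edge(24, "mother", 44)
g.add_edge(24, "father", 16)
g.add_edge(24, "battled", 50)

# Rough equivalent of hercules.out('mother','father').name
parents = g.out(24, "mother", "father")
names = [g.props[v]["name"] for v in parents]
print(names)
```

The label filter is the important part: only edges labelled mother or father are followed, so the battled edge never enters the result.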
  • 120. THAT WAS TITAN LOCAL. NEXT IS TITAN DISTRIBUTED. Broecheler, M., Pugliese, A., Subrahmanian, V.S., "COSI: Cloud Oriented Subgraph Identification in Massive Social Networks," Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 248-255, 2010. http://www.knowledgefrominformation.com/2010/08/01/cosi-cloud-oriented-subgraph-identification-in-massive-social-networks/
  • 122. TITAN DISTRIBUTED VIA CASSANDRA titan$ bin/gremlin.sh ,,,/ (o o) -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","cassandra"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[cassandra:77.77.77.77] gremlin> * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
  • 123. INHERITED FEATURES Continuously available with no single point of failure. No write bottlenecks to the graph as there is no master/slave architecture. Built-in replication ensures data is available during machine failure. Caching layer ensures that continuously accessed data is available in memory. Elastic scalability allows for the introduction and removal of machines. Cassandra available at http://cassandra.apache.org/
  • 124. TITAN DISTRIBUTED VIA HBASE titan$ bin/gremlin.sh ,,,/ (o o) -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","hbase"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[hbase:77.77.77.77] gremlin> * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
  • 125. INHERITED FEATURES Strictly consistent reads and writes. Linear scalability with the addition of machines. Base classes for backing Hadoop MapReduce jobs with HBase tables. HDFS-based data replication. Generally good integration with the tools in the Hadoop ecosystem. HBase available at http://hbase.apache.org/
  • 126. TITAN AND THE CAP THEOREM [Figure: triangle with corners labelled Consistency, Availability, Partitionability]
  • 127. Titan is all about ...
  • 128. Titan is all about numerous concurrent users...
  • 129. Titan is all about numerous concurrent users... high availability...
  • 130. Titan is all about numerous concurrent users... high availability... dynamic scalability...
  • 131. THE HOW OF TITAN DATA MANAGEMENT EDGE COMPRESSION VERTEX-CENTRIC INDICES
  • 132. THE HOW OF TITAN DATA MANAGEMENT
  • 133. DATA MANAGEMENT MAIN DESIGN PRINCIPLES Immutable, Atomic Edges; Optimistic Concurrency Control; Fine-Grained Locking Control. [Figure: three successive versions of one edge, annotated with + and - markers: (1) hercules battled cerberus; (2) hercules battled cerberus, time:12; (3) hercules battled cerberus, time:12, successful:true]
  • 134. DATA MANAGEMENT TYPE DEFINITION Datatype Constraints: TitanKey timeKey = g.makeType().name("time").dataType(Integer.class) [Figure: time:12 versus time:"twelve"] Edge Label Signatures: TitanLabel battled = g.makeType().name("battled").signature(timeKey) [Figure: hercules battled cerberus, time:12] Functional Declarations: TitanLabel father = g.makeType().name("father").functional() [Figure: hercules father jupiter; hercules father mars] Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
  • 135. DATA MANAGEMENT TYPE DEFINITION Endogenous Indices: g.createKeyIndex("name",Vertex.class) [Figure: name:jupiter, name:hercules, name:hermes] Unique Property Key/Value Pairs: TitanKey status = g.makeType().name("status").unique() [Figure: name:jupiter and name:neptune, each shown with status:king of the gods] Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
  • 136. DATA MANAGEMENT LOCKING SYSTEM Ensures consistency over non-consistent storage backends. [Figure: two concurrent writes to the father edge of hercules, one pointing to jupiter and one to neptune] 1. Acquire lock at the end of the transaction (the locking mechanism depends on storage layer consistency guarantees). 2. Verify original read. 3. Fail transaction if any precondition is violated.
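The three steps on slide 136 follow the general shape of optimistic concurrency control. The following Python sketch illustrates that general pattern under stated assumptions; it is not Titan's locking code. `Store`, `Transaction` and `ConflictError` are hypothetical names, and an in-process dictionary stands in for the storage backend.

```python
import threading


class ConflictError(Exception):
    """Raised when a value changed between the original read and commit."""


class Store:
    """Hypothetical key-value store standing in for the storage backend."""

    def __init__(self):
        self.data = {}
        self.locks = {}
        self._guard = threading.Lock()

    def lock_for(self, key):
        with self._guard:
            return self.locks.setdefault(key, threading.Lock())


class Transaction:
    def __init__(self, store):
        self.store = store
        self.reads = {}    # key -> value observed at first read
        self.writes = {}   # key -> value buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value = self.store.data.get(key)
        self.reads.setdefault(key, value)
        return value

    def write(self, key, value):
        self.read(key)              # remember what we saw as a precondition
        self.writes[key] = value

    def commit(self):
        keys = sorted(self.writes)  # fixed lock order avoids deadlock
        held = [self.store.lock_for(k) for k in keys]
        for lock in held:
            lock.acquire()          # 1. acquire locks at the end of the transaction
        try:
            for k in keys:          # 2. verify the original read
                if self.store.data.get(k) != self.reads[k]:
                    raise ConflictError(k)   # 3. fail if a precondition is violated
            self.store.data.update(self.writes)
        finally:
            for lock in held:
                lock.release()


store = Store()
t1, t2 = Transaction(store), Transaction(store)
t1.write(("hercules", "father"), "jupiter")
t2.write(("hercules", "father"), "neptune")
t1.commit()
try:
    t2.commit()
    outcome = "committed"
except ConflictError:
    outcome = "rejected"
print(outcome, store.data)
```

Both transactions saw no father for hercules when they started. The first commit succeeds; the second finds that its original read is stale and is rejected, so the functional father edge keeps a single value.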
  • 137. DATA MANAGEMENT ID MANAGEMENT [0,1,2,3,4,5,6,7,8,9,10,11] Global ID Pool Maintained by Storage Engine
  • 138. DATA MANAGEMENT ID MANAGEMENT Global ID Pool Maintained by Storage Engine: [0,1,2,3,4,5,6,7,8,9,10,11] Pool Subsets Assigned to Individual Instances: [0,1,2] [3,4,5] [6,7,8] [9,10,11]
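Block-wise ID allocation, as pictured on slides 137 and 138, can be sketched in a few lines of Python. This is an illustration under assumptions, not Titan's implementation: `GlobalIdPool` and `Instance` are hypothetical names, a lock-protected counter stands in for the pool kept by the storage engine, and the block size of 3 is chosen to match the slide.

```python
import threading


class GlobalIdPool:
    """Hypothetical stand-in for the ID counter kept by the storage engine."""

    def __init__(self, block_size):
        self.block_size = block_size
        self._next = 0
        self._lock = threading.Lock()

    def claim_block(self):
        """Reserve the next contiguous block of ids for one instance."""
        with self._lock:
            start = self._next
            self._next += self.block_size
        return range(start, start + self.block_size)


class Instance:
    """One database instance that hands out ids from its private block."""

    def __init__(self, pool):
        self.pool = pool
        self._block = iter(())

    def next_id(self):
        try:
            return next(self._block)
        except StopIteration:
            # Block exhausted: this is the only point that touches the global pool.
            self._block = iter(self.pool.claim_block())
            return next(self._block)


pool = GlobalIdPool(block_size=3)
a, b = Instance(pool), Instance(pool)
ids_a = [a.next_id() for _ in range(3)]   # claims block [0, 1, 2]
ids_b = [b.next_id() for _ in range(3)]   # claims block [3, 4, 5]
ids_a.append(a.next_id())                 # block used up, claims [6, 7, 8]
print(ids_a, ids_b)
```

Each instance contacts the shared pool once per block rather than once per id, which keeps coordination between instances rare.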
  • 139. THE HOW OF TITAN EDGE COMPRESSION
  • 140. EDGE COMPRESSION Natural graphs have a small world, community/cluster property. Community 1 Community 2 High intra-connectivity within a community and low inter-connectivity between communities. Watts, D. J., Strogatz, S. H., "Collective Dynamics of 'Small-World' Networks," Nature 393 (6684), pp. 440–442, 1998.
  • 142. EDGE COMPRESSION knows 12345678 12345683
  • 144. EDGE COMPRESSION Edge: 12345678 knows 12345683. [12345678 | 9 | 12345683] (24 bytes)
  • 145. EDGE COMPRESSION Edge: 12345678 knows 12345683. [12345678 | 9 | 12345683] (24 bytes) → [12345678 | 9 | +5]
  • 146. EDGE COMPRESSION Edge: 12345678 knows 12345683. [12345678 | 9 | 12345683] (24 bytes) → [12345678 | 9 | +5] → [12345678 | 9 | 5] (7 bytes)
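The idea on slides 142 to 146 is that neighbouring vertices tend to have nearby ids, so storing the difference takes fewer bytes than storing the full id. The Python sketch below illustrates this with a generic variable-length integer encoding plus zigzag-coded deltas. That choice of encoding is an assumption made for illustration; the slides do not specify Titan's byte layout, so the compressed size here need not match the 7 bytes shown.

```python
import struct


def encode_varint(n):
    """Unsigned variable-length integer: 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag(n):
    """Map signed to unsigned so small negative deltas stay small."""
    return (n << 1) ^ (n >> 63)


def unzigzag(z):
    return (z >> 1) ^ -(z & 1)


def encode_fixed(src, label, dst):
    return struct.pack(">qqq", src, label, dst)   # three 8-byte longs


def encode_compressed(src, label, dst):
    return encode_varint(src) + encode_varint(label) + encode_varint(zigzag(dst - src))


def decode_compressed(buf):
    src, pos = decode_varint(buf, 0)
    label, pos = decode_varint(buf, pos)
    delta, pos = decode_varint(buf, pos)
    return src, label, src + unzigzag(delta)


fixed = encode_fixed(12345678, 9, 12345683)
packed = encode_compressed(12345678, 9, 12345683)
print(len(fixed), len(packed), decode_compressed(packed))
```

The benefit depends on the community structure described on slide 140: when most edges stay inside a community whose vertices have nearby ids, most deltas are small.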
  • 147. THE HOW OF TITAN VERTEX-CENTRIC INDICES
  • 148. VERTEX-CENTRIC INDICES THE SUPER NODE PROBLEM Natural, real-world graphs contain vertices of high degree. Even if rare, their degree ensures that they exist on many paths. Traversing a high degree vertex means touching numerous incident edges and potentially touching most of the graph in only a few steps.
  • 149. VERTEX-CENTRIC INDICES A SUPER NODE SOLUTION A "super node" only exists from the vantage point of classic "textbook style" graphs. In the world of property graphs, intelligent disk-level filtering can interpret a "super node" as a more manageable low-degree vertex. Vertex-centric querying utilizes B-Trees and sort orders for speedy lookup of incident edges with particular qualities.
  • 150. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES vertex.query() [Figure: incident edges: likes with stars:5, stars:2, stars:2, stars:3, stars:3; knows ×3] 8 edges
  • 151. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES vertex.query().direction(OUT) [Figure: likes with stars:5, stars:2, stars:2, stars:3, stars:3; knows ×2] 7 edges
  • 152. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES vertex.query().direction(OUT).labels("likes") [Figure: likes with stars:5, stars:2, stars:2, stars:3, stars:3] 5 edges
  • 153. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES vertex.query().direction(OUT).labels("likes").has("stars",5) [Figure: likes with stars:5] 1 edge
  • 154. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES PREDICATES: Query Query.direction(Direction); Query Query.labels(String... labels); Query Query.has(String, Object, Compare); Query Query.has(String, Object); Query Query.range(String, Object, Object). GETTERS: Iterable<Vertex> Query.vertices(); Iterable<Edge> Query.edges()
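The narrowing from 8 edges to 1 on slides 150 to 153 can be reproduced with a small fluent filter. The Python sketch below is illustrative only: `VertexQuery` and `Edge` are hypothetical stand-ins that loosely mirror the method names listed above, and the filtering happens over an in-memory list, whereas the slides describe Titan evaluating such predicates at the disk level.

```python
from dataclasses import dataclass

OUT, IN = "OUT", "IN"


@dataclass(frozen=True)
class Edge:
    direction: str
    label: str
    props: tuple = ()   # tuple of (key, value) pairs

    def get(self, key):
        return dict(self.props).get(key)


class VertexQuery:
    """Hypothetical fluent filter over one vertex's incident edges."""

    def __init__(self, edges):
        self._edges = edges
        self._filters = []

    def direction(self, d):
        self._filters.append(lambda e: e.direction == d)
        return self

    def labels(self, *names):
        self._filters.append(lambda e: e.label in names)
        return self

    def has(self, key, value):
        self._filters.append(lambda e: e.get(key) == value)
        return self

    def edges(self):
        return [e for e in self._edges if all(f(e) for f in self._filters)]


# Edge set chosen to match the counts on the slides (8, 7, 5, 1).
incident = (
    [Edge(OUT, "likes", (("stars", s),)) for s in (5, 2, 2, 3, 3)]
    + [Edge(OUT, "knows"), Edge(OUT, "knows"), Edge(IN, "knows")]
)

counts = [
    len(VertexQuery(incident).edges()),
    len(VertexQuery(incident).direction(OUT).edges()),
    len(VertexQuery(incident).direction(OUT).labels("likes").edges()),
    len(VertexQuery(incident).direction(OUT).labels("likes").has("stars", 5).edges()),
]
print(counts)
```

Each added predicate shrinks the candidate set before any edge is handed back to the caller, which is what lets a high-degree vertex behave like a low-degree one for a specific question.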
  • 155. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING [Figure: a vertex's incident edges: battled (time:1), battled (time:2), battled (time:12), knows, knows]
  • 156. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING [Figure: the same incident edges arranged by label: battled (time:1, time:2, time:12) and knows]
  • 157. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING [Figure: battled edges ordered by time, with the ranges "battled w/ time 1-5" and "battled w/ time 5-10" marked, followed by the knows edges] TitanLabel battled = g.makeType().name("battled").primaryKey(time)
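Keeping a vertex's edges sorted by label and then by a designated key turns a range question such as "battled with time between 1 and 5" into two binary searches and a slice. The Python sketch below shows that idea with the standard `bisect` module. `SortedEdgeList` is a hypothetical name, a Python list stands in for the on-disk order, and the knows edge to theseus is invented for the example.

```python
from bisect import bisect_left, bisect_right


class SortedEdgeList:
    """Hypothetical per-vertex edge list kept sorted by (label, time)."""

    def __init__(self):
        self._keys = []    # (label, time) tuples in sorted order
        self._edges = []   # neighbour names, parallel to _keys

    def add(self, label, time, neighbour):
        key = (label, time)
        i = bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._edges.insert(i, neighbour)

    def range(self, label, lo, hi):
        """Neighbours over edges with this label and lo <= time < hi."""
        start = bisect_left(self._keys, (label, lo))
        end = bisect_left(self._keys, (label, hi))
        return self._edges[start:end]


hercules = SortedEdgeList()
hercules.add("battled", 12, "cerberus")
hercules.add("knows", 0, "theseus")      # assumed edge, not from the dataset
hercules.add("battled", 1, "nemean")
hercules.add("battled", 2, "hydra")

early = hercules.range("battled", 1, 5)
middle = hercules.range("battled", 5, 10)
late = hercules.range("battled", 10, 13)
print(early, middle, late)
```

Only the matching slice is touched; the knows edge and the out-of-range battled edges are never inspected.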
  • 158. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING brother father mother knows battled
  • 160. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING [Figure: edge labels brother, father, mother bracketed as "family"; knows; battled] TypeGroup family = TypeGroup.of(2,"family"); TitanLabel father = g.makeType().name("father").group(family).makeEdgeLabel(); TitanLabel mother = g.makeType().name("mother").group(family).makeEdgeLabel(); TitanLabel brother = g.makeType().name("brother").group(family).makeEdgeLabel();
  • 161. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING brother father family mother knows battled vertex.query().group("family")...
  • 162. THAT IS HOW TITAN WORKS DATA MANAGEMENT EDGE COMPRESSION VERTEX-CENTRIC INDICES
  • 163. WHAT IF YOU WANTED TO CREATE TWITTER FROM SCRATCH? SIMULATING TWITTER
  • 164. 3 BILLION EDGES 100 MILLION VERTICES 10000 CONCURRENT USERS 50 MACHINES 1 GRAPH DATABASE COMING JULY 2012
  • 165. PART 3: THE FUTURE OF AURELIUS MARKO A. RODRIGUEZ MATTHIAS BROECHELER
  • 166. AURELIUS' GRAPH COMPUTING STORY Titan as the highly scalable, distributed graph database solution. OLTP
  • 167. AURELIUS' GRAPH COMPUTING STORY Titan as the highly scalable, distributed graph database solution. Titan as the source (and potential sink) for other graph processing solutions. OLTP OLAP
  • 169. FAUNUS PATH ALGEBRA FOR HADOOP [Figure: hercules battled cretan bull; theseus battled cretan bull; A · A ◦ n(I) yields the derived edge hercules ally theseus] Derived graphs are single-relational and are typically much smaller than their multi-relational source. Therefore, derived graphs can be subjected to textbook-style graph algorithms in both a meaningful and efficient manner. WHO IS THE MOST CENTRAL ALLY?
  • 170. FAUNUS PATH ALGEBRA FOR HADOOP B = A · A ◦ n(I); B · B ◦ n(I); (A · A)² ◦ n(I) [Figure: a network of derived ally edges] My allies' allies are my allies.
  • 171. FAUNUS PATH ALGEBRA FOR HADOOP Used for global graph operations. Implements the multi-relational path algebra as a collection of Map/Reduce operations Reduce a massive property graph into a smaller semantically-rich single-relational graph. Project codename: TinkerPoop Support for HadoopGraph and HDFS file formats Rodriguez M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms,” Journal of Informetrics, 4(1), pp. 29-41, 2009. http://arxiv.org/abs/0806.2274
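The ally derivation on slide 169 can be worked through on a three-vertex example. The Python sketch below makes two assumptions that the transcript does not confirm: that the second factor in the expression is the transpose of A (any superscript seems to have been lost in extraction), and that n(I) denotes the complement of the identity matrix, so the element-wise product removes self-loops. It uses plain nested lists, not Faunus or Map/Reduce.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def without_self_loops(m):
    """Element-wise product with n(I): zero out the diagonal."""
    return [[0 if i == j else v for j, v in enumerate(row)]
            for i, row in enumerate(m)]


vertices = ["hercules", "theseus", "cretan bull"]

# A[i][j] == 1 when vertices[i] battled vertices[j]
A = [
    [0, 0, 1],   # hercules battled the cretan bull
    [0, 0, 1],   # theseus battled the cretan bull
    [0, 0, 0],
]

# Two vertices are allies when they battled a common opponent.
ally = without_self_loops(matmul(A, transpose(A)))
allies = [(vertices[i], vertices[j])
          for i, row in enumerate(ally)
          for j, v in enumerate(row) if v]
print(allies)
```

The result is a single-relational ally graph over the same vertices, which is the kind of derived graph the slides propose handing to textbook-style algorithms such as centrality.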
  • 173. FULGORA AN EFFICIENT IN-MEMORY GRAPH ENGINE Non-transactional, in-memory graph engine. It is not a database. Process ~90 billion edges in 68-Gigs of RAM assuming a small world topology. Perform complex graph algorithms in-memory. global graph analysis multi-relational graph analysis Similar in spirit to Twitter's Cassovary: https://github.com/twitter/cassovary
  • 174. THE AURELIUS OLAP FLOW Stores a massive-scale property graph. Map/Reduce. Generates a large-scale single-relational graph. Load into RAM on a single machine. Analyzes compressed, large-scale single or multi-relational graphs in memory. To a stats package. Update graph with derived edges. Update element properties with algorithm results.
  • 175. THE AURELIUS OLAP FLOW Stores a massive-scale property graph. Map/Reduce. Generates a large-scale single-relational graph. Load into RAM on a single machine. Analyzes compressed, large-scale single or multi-relational graphs in memory. To a stats package. [Figure: derived edge hercules ally theseus; property ally_centrality:0.0123 on hercules]
  • 176. THE AURELIUS OLAP FLOW Stores a massive-scale property graph. Generates a large-scale single-relational graph. Analyzes compressed, large-scale single or multi-relational graphs in memory. To a stats package.
  • 177. AURELIUS' USE OF BLUEPRINTS Aurelius products use the Blueprints API so any graph product can communicate with any other graph product. The code for graph databases, frameworks, algorithms, and batch processing is written in terms of the Blueprints API. Aurelius encourages developers to use Blueprints/TinkerPop in order to grow a rich ecosystem of interoperable graph technologies.
  • 178. THE GRAPH LANDSCAPE REPRISE Speed of Traversal/Process Size of Graph/Structure * Not to scale. Did not want to overlap logos.
  • 179. NEXT STEPS Make use of and/or contribute to the free, open source Titan product. Learn about applying graph theory and network science. http://thinkaurelius.com http://thinkaurelius.github.com/titan/
  • 181. CREDITS PRESENTERS MARKO A. RODRIGUEZ MATTHIAS BROECHELER FINANCIAL SUPPORT PEARSON EDUCATION AURELIUS LOCATION PROVISIONS JIVE SOFTWARE MANY THANKS TO DAN LAROCQUE TINKERPOP COMMUNITY STEPHEN MALLETTE BOBBY NORTON KETRINA YIM