Using Graph Databases For Insights Into Connected Data

Gagan Agrawal

Xebia India

SOFTWARE DEVELOPMENT DONE RIGHT
Netherlands | USA | India | France | UK
Agenda

- High-level view of the graph space
- Comparison with RDBMS and other NoSQL stores
- Data modeling
- Cypher: graph query language
- Graph database internals
- Graphs in the real world

What is a Graph?

- A collection of vertices and edges: a set of nodes and the relationships that connect them.
- A graph represents:
  - Entities as NODES
  - The way those entities relate to the world as RELATIONSHIPS
- Graphs can model all kinds of scenarios:
  - Road systems
  - Medical histories
  - Supply chain management
  - Data centers

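As a rough illustration of the node/relationship model, here is a minimal Python sketch. The `Node` and `Relationship` classes, the town names and the distances are all hypothetical and chosen only for illustration; they are not taken from the deck or from any graph database API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with a label and free-form properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed connection between two nodes."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical road system: towns are nodes, roads are relationships.
# The distances are made-up illustrative values.
town_a = Node("Town", {"name": "A"})
town_b = Node("Town", {"name": "B"})
town_c = Node("Town", {"name": "C"})

roads = [
    Relationship(town_a, "ROAD_TO", town_b, {"km": 10}),
    Relationship(town_a, "ROAD_TO", town_c, {"km": 25}),
    Relationship(town_b, "ROAD_TO", town_c, {"km": 12}),
]


def neighbours(node, relationships):
    """Names of the nodes reachable over one outgoing relationship."""
    return [r.end.properties["name"] for r in relationships if r.start is node]


print(neighbours(town_a, roads))  # ['B', 'C']
```

Both nodes and relationships carry properties here, which is what lets the same structure describe roads, medical histories or supply chains.
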
High-Level View of the Graph Space

- Graph databases: technologies used primarily for transactional online graph persistence (OLTP).
- Graph compute engines: technologies used primarily for offline graph analytics (OLAP).

Graph Databases

- An online database management system with Create, Read, Update and Delete (CRUD) methods that expose a graph data model.
- Built for use with transactional (OLTP) systems.
- Used for richly connected data.
- Querying is performed through traversals.
- Can perform millions of traversal steps per second.
- A traversal step resembles a join in an RDBMS.

Graph Database Properties

- The underlying storage: native / non-native
- The processing engine: native / non-native

Graph DB – The Underlying Storage

- Native graph storage: optimized and designed for storing and managing graphs.
- Non-native graph storage: serializes the graph data into a relational database, an object-oriented database, or some other general-purpose data store.

Graph DB – The Processing Engine

- Index-free adjacency: connected nodes physically point to each other in the database.

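The difference between index-free adjacency and a lookup-based design can be sketched in a few lines of Python. This is only an analogy under simplifying assumptions: the `Node` class and `edge_table` are hypothetical, object references stand in for physical pointers, and a linear scan stands in for the index or join table a non-native engine would consult.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to connected Node objects

    def connect(self, other):
        self.out.append(other)


# Native style: following a relationship is a reference dereference.
alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)


def hop_native(node):
    """One traversal step: read the neighbours stored on the node itself."""
    return [n.name for n in node.out]


# Non-native style: relationships live in a separate structure keyed by name.
# The scan below is a stand-in for an index or join-table lookup; its cost
# grows with the total number of relationships, not with the node's degree.
edge_table = [("Alice", "Bob"), ("Bob", "Carol")]


def hop_via_table(name):
    """One traversal step: search the global relationship table."""
    return [dst for src, dst in edge_table if src == name]


print(hop_native(alice), hop_via_table("Alice"))  # ['Bob'] ['Bob']
```

Both functions return the same neighbours; what differs is how much data each hop has to touch.
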
Power of Graph Databases

- Performance
- Flexibility
- Agility

Comparison

- Relational databases
- NoSQL databases
- Graph databases

Relational Databases Lack Relationships

- Initially designed to codify paper forms and tabular structures.
- Deal poorly with relationships.
- The rise in connectedness translates into increased joins.
- Lower performance.
- Difficult to cater for changing business needs.

NoSQL Databases Also Lack Relationships

- NoSQL databases (e.g. key-value, document or column-oriented stores) store sets of disconnected values/documents/columns.
- This makes it difficult to use them for connected data and graphs.
- One solution is to embed an aggregate's identifier inside a field belonging to another aggregate:
  - This effectively introduces foreign keys.
  - It requires joining aggregates at the application level.

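To make the "join at the application level" point concrete, here is a small Python sketch. The `users` dictionary is a hypothetical stand-in for a document store, and the field name `friend_ids` is an assumption made for illustration.

```python
# Hypothetical document store: each user document embeds the ids of its friends.
users = {
    "u1": {"name": "Alice", "friend_ids": ["u2", "u3"]},
    "u2": {"name": "Bob", "friend_ids": ["u3"]},
    "u3": {"name": "Carol", "friend_ids": []},
}


def friends_of(user_id):
    """Application-level join: fetch a user, then fetch each referenced friend."""
    user = users[user_id]  # first lookup
    return [users[fid]["name"] for fid in user["friend_ids"]]  # one lookup per id


def friends_of_friends(user_id):
    """A second level of joining, again written by hand in application code."""
    names = []
    for fid in users[user_id]["friend_ids"]:
        for name in friends_of(fid):
            if name not in names:
                names.append(name)
    return names


print(friends_of("u1"))          # ['Bob', 'Carol']
print(friends_of_friends("u1"))  # ['Carol']
```

Every extra level of connectedness adds another hand-written loop and another round of lookups, which is the cost the slide refers to.
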
NoSQL DB

- Relationships between aggregates aren't first-class citizens in the data model.
- Foreign aggregate "links" are not reflexive: a link from one aggregate to another cannot be followed backwards.
- Answering such queries requires some external compute infrastructure, e.g. Hadoop.
- Do not maintain consistency of connected data.
- Do not support index-free adjacency.

Graph DB

- Find friends-of-friends in a social network, to a maximum depth of 5.
- Total records: 1,000,000
- Each with approximately 50 friends

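A depth-limited friends-of-friends search like the one described above can be sketched as a breadth-first traversal. This is a minimal Python illustration, not the benchmark from the slide: the function name `friends_within` and the four-person network are hypothetical.

```python
from collections import deque


def friends_within(graph, start, max_depth):
    """Return {person: depth} for everyone reachable within max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if depths[person] == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in graph.get(person, ()):
            if friend not in depths:
                depths[friend] = depths[person] + 1
                queue.append(friend)
    del depths[start]  # the starting person is not their own friend
    return depths


# Tiny hypothetical network (the slide's dataset had 1,000,000 people).
friends = {
    "Ann": ["Bob"],
    "Bob": ["Ann", "Cid"],
    "Cid": ["Bob", "Dee"],
    "Dee": ["Cid"],
}

print(friends_within(friends, "Ann", 2))  # {'Bob': 1, 'Cid': 2}
```

Raising `max_depth` to 5 only changes how far the queue is allowed to expand; the code stays the same, whereas a relational version would typically need one more self-join per level.
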
Data Modeling with Graph

Data Modeling

- "Whiteboard" friendly.
- The typical whiteboard view of a problem is a GRAPH.
- What we sketch in our creative and analytical modes maps closely to the data model inside the database.

Cypher: Graph Query Language

- Pattern-matching query language
- Humane language
- Expressive
- Declarative: say what you want, not how
- Borrows from well-known query languages
- Aggregation, ordering, limit
- Updates the graph

Cypher

Cypher representation (two equivalent ways of writing the same pattern):
(c)-[:KNOWS]->(b)-[:KNOWS]->(a), (c)-[:KNOWS]->(a)
(c)-[:KNOWS]->(b)-[:KNOWS]->(a)<-[:KNOWS]-(c)

Cypher

START c=node:user(name='Michael')
MATCH (c)-[:KNOWS]->(b)-[:KNOWS]->(a), (c)-[:KNOWS]->(a)
RETURN a, b

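To show what the pattern in this query asks for, here is a hand-written Python equivalent. It is a sketch only: the `knows` dictionary and the names other than Michael are hypothetical, and the code glosses over Cypher's own matching rules and query planning.

```python
# Hypothetical KNOWS relationships: person -> set of people they know.
knows = {
    "Michael": {"Sara", "Tom"},
    "Sara": {"Tom"},
    "Tom": set(),
}


def match_triangle(knows, c):
    """Find (a, b) such that c KNOWS b, b KNOWS a, and c KNOWS a."""
    results = []
    for b in sorted(knows.get(c, ())):       # (c)-[:KNOWS]->(b)
        for a in sorted(knows.get(b, ())):   # (b)-[:KNOWS]->(a)
            if a in knows.get(c, ()):        # (c)-[:KNOWS]->(a)
                results.append((a, b))
    return results


print(match_triangle(knows, "Michael"))  # [('Tom', 'Sara')]
```

The loops spell out the "how" that the declarative query leaves to the database: in Cypher only the shape of the pattern is written down.
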
Other Cypher Clauses

- WHERE: provides criteria for filtering pattern-matching results.
- CREATE and CREATE UNIQUE: create nodes and relationships.
- DELETE: removes nodes, relationships and properties.
- SET: sets property values.

Other Cypher Clauses

- FOREACH: performs an updating action for each element in a list.
- UNION: merges results from two or more queries.
- WITH: chains subsequent query parts and forwards results from one to the next, similar to piping commands in UNIX.

Comparison of Relational and Graph Modeling

Graph Database Internals

Non-Functional Characteristics

- Transactions
  - Fully ACID
- Recoverability
- Availability
- Scalability

Scalability

- Capacity (graph size)
- Latency (response time)
- Read and write throughput

Capacity

- The 1.9 release of Neo4j can support single graphs with tens of billions of nodes, relationships and properties.
- The Neo4j team has publicly expressed the intention to support 100B+ nodes/relationships/properties in a single graph.

Latency

- RDBMS: more data in tables/indexes results in longer join operations.
- Graph databases don't suffer the same latency problem.
- An index is used to find the starting node.
- Traversal uses a combination of pointer chasing and pattern matching to search the data.
- Performance does not depend on the total size of the dataset.
- It depends only on the data being queried.

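The claim that traversal cost depends on the data being queried rather than on total dataset size can be illustrated with a small Python experiment. This is a toy model with hypothetical data: a dictionary lookup stands in for the index that finds the starting node, and the count of touched nodes stands in for query cost.

```python
def count_visited(adjacency, start, hops):
    """Count the nodes touched when expanding `hops` steps out from `start`."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        # Follow outgoing relationships from the current frontier only.
        frontier = {n for node in frontier for n in adjacency.get(node, ())} - seen
        seen |= frontier
    return len(seen)


# A small hypothetical graph around node "a".
small = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}

# The same neighbourhood around "a", plus 100,000 unrelated nodes.
large = dict(small)
large.update({f"x{i}": [f"x{i + 1}"] for i in range(100_000)})

print(count_visited(small, "a", 2))  # 4
print(count_visited(large, "a", 2))  # 4
```

The second graph is far larger, yet the traversal touches the same four nodes, because it never leaves the neighbourhood of its starting point.
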
Throughput

- Constant performance irrespective of graph size.

Graphs in the Real World

Common Use Cases

- Social
- Recommendations
- Geo
- Logistics networks: package routing, finding the shortest path
- Financial transaction graphs: fraud detection
- Master data management
- Bioinformatics: Era7 uses graphs to relate a complex web of information that includes genes, proteins and enzymes
- Authorization and access control: Adobe Creative Cloud, Telenor

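As one example from the list above, shortest-path routing in a logistics network can be sketched with Dijkstra's algorithm. The Python below is an illustrative sketch, not how any particular graph database implements routing; the `routes` network, its node names and its costs are hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (total_cost, path) or (inf, []) if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node in settled:
            continue  # a cheaper route to this node was already processed
        settled.add(node)
        if node == target:
            return cost, path
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical delivery network: edge weights are made-up travel costs.
routes = {
    "Depot": {"HubA": 4, "HubB": 1},
    "HubB": {"HubA": 2, "Customer": 7},
    "HubA": {"Customer": 1},
}

print(shortest_path(routes, "Depot", "Customer"))
# (4, ['Depot', 'HubB', 'HubA', 'Customer'])
```

The cheapest route goes through both hubs even though a direct leg from HubB to the customer exists, which is the kind of answer that is awkward to express with fixed-depth joins.
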
Thank You

BigData & Real Time Analytics

Services
- Visualization (Tableau)
- Analytics Framework (Mahout)
- Integration (Sqoop, Flume, Storm)
- Hadoop Powered Solutions (Pig, Hive, Oozie, HBase, Impala) (Solr, Elasticsearch)
- Core Hadoop (HDFS, MapReduce, ZooKeeper, Cloudera

Trainings
- Cloudera Data Analyst / Developer / Admin Training

Products
- Divolte
- Wearable Sensors

Solutions
- Big data warehousing
- Scalable big data ETL
- High-volume web analytics

Contact us @

Websites
www.xebia.in
www.xebia.com
www.xebia.fr

Xebia India
infoindia@xebia.com

Thought Leadership
http://xebee.xebia.in
http://blog.xebia.com
http://podcast.xebia.com


Editor's notes

  1. Services should include hadoop consulting rather