1,2,3,4
Add Another Data Store
(And Other Rhymes)


Eric Lubow
@elubow
elubow@simplereach.com
#cassandra12
1,2,3,4 Add Another Data Store (And Other Rhymes)   Eric Lubow   @elubow
Overview
•   SimpleReach
•   Definitions and Data Stores
•   Evolution to Polyglottany
•   Tie It Together
•   Questions

Socially Intelligent

Size
•   100m events recorded per day and growing
•   500m pageviews per month and growing
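
To put these volumes in per-second terms, a back-of-the-envelope conversion can help. The sketch below assumes traffic is spread evenly over a 24-hour day and a 30-day month, which real traffic is not, so peak rates will be higher than these averages. The helper name `per_second` is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(total, days):
    """Average rate per second for `total` items spread evenly over `days` days."""
    return total / (days * SECONDS_PER_DAY)


# 100m events per day (figure from the slide)
events_per_sec = per_second(100_000_000, days=1)
# 500m pageviews per month (figure from the slide; 30-day month is an assumption)
pageviews_per_sec = per_second(500_000_000, days=30)

print(f"events/sec:    {events_per_sec:,.0f}")     # events/sec:    1,157
print(f"pageviews/sec: {pageviews_per_sec:,.0f}")  # pageviews/sec: 193
```

Even as an average, roughly a thousand writes per second is the kind of load that makes the read/write profile of each store matter.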




Polyglot Persistence
Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand.
http://www.sleberknight.com/blog/sleberkn/entry/polyglot_persistence

Right Tool For The Job

Why?
•   Heavier READ loads vs heavier write loads
•   Data relationships may be less important
•   Different aspects of a system have different requirements

No One Size Fits All

Tools

Free vs. Cost

Languages

Pre-Scale

Scale

SimpleReach Pre-Scale

SimpleReach

Cassandra
•   Large data volume ingestion
•   Really fast writes to many locations (eventual consistency)
•   Query by column groups within rows
•   Range queries in Hive (partial CF scans)
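
The "query by column groups within rows" point can be pictured with a toy model of a wide row whose columns are kept sorted by name, so a contiguous group of columns can be read as a slice. This is only a sketch in plain Python: `WideRow`, its methods, and the hourly column names are hypothetical and are not part of any Cassandra API.

```python
import bisect


class WideRow:
    """Toy model of one wide row: columns kept sorted by name (hypothetical)."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> value

    def insert(self, name, value):
        # Keep the name list sorted; overwriting an existing column keeps its slot.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in column order."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# Made-up sample data: one row per piece of content, one column per hour.
row = WideRow()
for hour, views in [("2012-08-08T10", 40), ("2012-08-08T09", 25),
                    ("2012-08-08T11", 31), ("2012-08-07T23", 7)]:
    row.insert(hour, views)

morning = row.slice("2012-08-08T09", "2012-08-08T10")
print(morning)  # [('2012-08-08T09', 25), ('2012-08-08T10', 40)]
```

Because the columns are ordered, the slice touches only the requested group instead of the whole row.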




mongoDB
•   Fast atomic increments (Node.js is native JSON)
•   Sharding for faster distributed increments
•   Solid ORM for Rails (Mongoid)
•   Fast access for pub/sub of durable/persisted documents
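
MongoDB exposes atomic increments through its `$inc` update operator. The sketch below does not talk to MongoDB; it is an in-memory stand-in (the `CounterStore` class and the key names are hypothetical) that shows why an atomic increment matters when many writers bump the same counter at once.

```python
import threading
from collections import defaultdict


class CounterStore:
    """In-memory stand-in for a store with an atomic increment (hypothetical)."""

    def __init__(self):
        self._counts = defaultdict(int)
        self._lock = threading.Lock()

    def inc(self, key, field, amount=1):
        # The whole read-add-write happens under one lock, so concurrent
        # callers cannot overwrite each other's updates.
        with self._lock:
            self._counts[(key, field)] += amount

    def get(self, key, field):
        with self._lock:
            return self._counts[(key, field)]


store = CounterStore()


def record_pageviews(n):
    for _ in range(n):
        store.inc("article-42", "pageviews")


# Eight concurrent writers, 1000 increments each.
threads = [threading.Thread(target=record_pageviews, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = store.get("article-42", "pageviews")
print(total)  # 8000
```

With the increment done inside the store, clients never need to read a value, add to it, and write it back, which is where lost updates come from.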




Redis
•   Supports hundreds of thousands of transactions per second
•   Great caching engine
•   Supports useful data types like sorted sets
•   Pay SerDe price on each access
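
The SerDe (serialization/deserialization) price can be illustrated without a Redis server. The sketch below assumes structured values are cached as JSON strings; `StringStore` is a hypothetical stand-in for a string-valued cache, and the article record is made-up sample data.

```python
import json


class StringStore:
    """Toy key-value store that only holds strings (hypothetical stand-in for a cache)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        if not isinstance(value, str):
            raise TypeError("values must be strings")
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


cache = StringStore()
article = {"id": 42, "title": "Polyglot Persistence", "shares": 17}

# Write path: serialize the structure before storing it.
cache.set("article:42", json.dumps(article))

# Read-modify-write: every access deserializes and re-serializes the whole value.
loaded = json.loads(cache.get("article:42"))
loaded["shares"] += 1
cache.set("article:42", json.dumps(loaded))

print(json.loads(cache.get("article:42"))["shares"])  # 18
```

Changing one field costs a full decode and encode of the value, which is the overhead the bullet refers to.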




InfiniDB and Infobright
•   Column Stores for ad-hoc analytics queries in SQL
•   Databases built for business intelligence
•   Heavy compression of data
•   Pre-aggregated data (Extents/Knowledge Grid)
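
One way to picture pre-aggregated data in a column store is to keep summary statistics (minimum, maximum, sum) for each block of a column, so a query can skip or answer from whole blocks without reading their values. The sketch below is a simplified illustration of that idea only; it does not reproduce InfiniDB's or Infobright's actual formats, and `Block` and `sum_in_range` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One chunk of a column plus summary statistics computed at load time."""
    values: list
    lo: int = 0
    hi: int = 0
    total: int = 0

    def __post_init__(self):
        self.lo, self.hi, self.total = min(self.values), max(self.values), sum(self.values)


def sum_in_range(blocks, low, high):
    """Sum values with low <= v <= high; return (sum, number of blocks scanned)."""
    result, scanned = 0, 0
    for b in blocks:
        if b.hi < low or b.lo > high:
            continue                  # no value can match: skip without reading
        if low <= b.lo and b.hi <= high:
            result += b.total         # every value matches: use the stored sum
            continue
        scanned += 1                  # partial overlap: read the actual values
        result += sum(v for v in b.values if low <= v <= high)
    return result, scanned


# Made-up sample column split into four blocks.
blocks = [Block([1, 3, 5]), Block([10, 12, 14]), Block([20, 25, 30]), Block([13, 40, 50])]
print(sum_in_range(blocks, 10, 20))  # (69, 2)
```

Only two of the four blocks are actually scanned here; the other two are resolved from their summaries alone.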




Ruby, Node.js, Python
•   Polyglottany doesn’t only apply to data stores
•   Each language has its own benefit to each data storage layer
•   Each language has its own individual benefits
•   JSON, APIs, Performance

Choice

Cons
•   Redis - Can only utilize a single core
•   MySQL Column Store - DELETE/UPDATEs are VERY expensive
•   Cassandra - No B-tree indexes
•   Mongo - Queries slow down when shard count increases. Indexes must fit in memory
•   Python - Whitespace. Community
•   Ruby - Not high performance enough for our standards
•   JavaScript (Node.js) - Bad for CPU or IO intensive workloads


Tying It Together
•   Built in the cloud
•   Service Oriented Architecture (Internal API)
•   Built Helenus (Cassandra Node.js driver)
•   Data accuracy checks: visual and programmatic
•   Built framework for testing out storage engines

Service Architecture
[Diagram: Analytics, Real-time, Internal API]

Helenus
•   Built Node.js driver for Cassandra
•   https://github.com/simplereach/helenus
•   CQL 2/3, Composite Column, Thrift Interface
•   More about Node.js and Cassandra

Points To Consider
•   Data consistency - Same in all data stores
•   How important is data durability?
•   Managing many servers (Chef, AWS, CSSH)
•   Managing and learning many different applications and tuning for them
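
Keeping data the same in all data stores usually calls for a programmatic check. A minimal sketch, assuming each store can be read into a mapping of key to count: `compare_counts`, the `tolerance` parameter, and the sample figures are all made up for illustration and do not describe SimpleReach's actual checks.

```python
def compare_counts(primary, secondary, tolerance=0):
    """Return {key: (primary_value, secondary_value)} for keys whose counts disagree.

    A key missing from one store is reported with None on that side.
    """
    mismatches = {}
    for key in sorted(primary.keys() | secondary.keys()):
        a, b = primary.get(key), secondary.get(key)
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches[key] = (a, b)
    return mismatches


# Made-up per-article pageview counts as read from two different stores.
realtime_counts = {"article-1": 120, "article-2": 75, "article-3": 9}
analytics_counts = {"article-1": 120, "article-2": 74, "article-4": 3}

print(compare_counts(realtime_counts, analytics_counts))
# {'article-2': (75, 74), 'article-3': (9, None), 'article-4': (None, 3)}
print(compare_counts(realtime_counts, analytics_counts, tolerance=1))
# {'article-3': (9, None), 'article-4': (None, 3)}
```

A tolerance is useful when one store is eventually consistent and small, short-lived differences are expected.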




Summary
•   Polyglottany is not a sin
•   Know your data read/write patterns
•   Know the tools available to you
•   Know your compromises

We’re Hiring

Questions are guaranteed in life.
Answers aren’t.
               Eric Lubow
               @elubow
               elubow@simplereach.com
               #cassandra12

               Thank you.

1, 2, 3, 4, Add Another Data Store

  • 1. 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow elubow@simplereach.com #cassandra12
  • 2. Overview 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 3. Overview • SimpleReach 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 4. Overview • SimpleReach • Definitions and Data Stores 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 5. Overview • SimpleReach • Definitions and Data Stores • Evolution to Polyglottany 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 6. Overview • SimpleReach • Definitions and Data Stores • Evolution to Polyglottany • Tie It Together 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 7. Overview • SimpleReach • Definitions and Data Stores • Evolution to Polyglottany • Tie It Together • Questions 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 8. Socially Intelligent 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 9. Size 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 10. Size • 100m events recorded per day and growing 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 11. Size • 100m events recorded per day and growing • 500m Pageviews per month and growing 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 12. Polyglot Persistence Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand. http://www.sleberknight.com/blog/sleberkn/entry/polyglot_persistence 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 13. Right Tool For The Job 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 14. Why? 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 15. Why? • Heavier READ loads vs heavier write loads 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 16. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 17. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important • Different aspects of a system have different requirements 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 18. No One Size Fits All 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 19. Tools 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 20. Free vs. Cost 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 21. Languages 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 22. Pre-Scale 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 23. Scale 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 24. SimpleReach Pre-Scale 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 25. SimpleReach 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 26. Cassandra 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 27. Cassandra • Large data volume ingestion 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 28. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 29. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 30. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows • Range queries in Hive (partial CF scans) 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 31. mongoDB 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 32. mongoDB • Fast atomic increments (Node.js is native JSON) 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 33. mongoDB • Fast atomic increments (Node.js is native JSON) • Sharding for faster distributed increments 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 34. mongoDB • Fast atomic increments (Node.js is native JSON) • Sharding for faster distributed increments • Solid ORM for Rails (MongoID) 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 35. mongoDB • Fast atomic increments (Node.js is native JSON) • Sharding for faster distributed increments • Solid ORM for Rails (MongoID) • Fast access for pub/sub of durable/persisted documents 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 36. Redis 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 37. Redis • Supports hundreds of thousands transactions per second 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 38. Redis • Supports hundreds of thousands transactions per second • Great caching engine 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 39. Redis • Supports hundreds of thousands transactions per second • Great caching engine • Supports useful variable types like sorted set 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 40. Redis • Supports hundreds of thousands transactions per second • Great caching engine • Supports useful variable types like sorted set • Pay SerDe price on each access 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 41. InfiniDB and Infobright 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 42. InfiniDB and Infobright • Column Stores for ad-hoc analytics queries in SQL 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 43. InfiniDB and Infobright • Column Stores for ad-hoc analytics queries in SQL • Databases built for business intelligence 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 44. InfiniDB and Infobright • Column Stores for ad-hoc analytics queries in SQL • Databases built for business intelligence • Heavy compression of data 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 45. InfiniDB and Infobright • Column Stores for ad-hoc analytics queries in SQL • Databases built for business intelligence • Heavy compression of data • Pre-aggregated data (Extents/Knowledge Grid) 1,2,3,4 Add Another Data Store (And Other Rhymes) Eric Lubow @elubow
  • 50. Ruby, Node.js, Python • Polyglottany doesn’t only apply to data stores • Each language has its own benefits for each data storage layer • Each language has its own individual benefits • JSON, APIs, Performance
  • 51. Choice
  • 59. Cons • Redis - Can only utilize a single core • MySQL Column Store - DELETEs/UPDATEs are VERY expensive • Cassandra - No btree indexes • Mongo - Queries slow down when shard count increases. Indexes must fit in memory • Python - Whitespace. Community • Ruby - Not high performance enough for our standards • JavaScript (Node.js) - Bad for CPU or IO intensive workloads
  • 65. Tying It Together • Built in the cloud • Service Oriented Architecture (Internal API) • Built Helenus (Cassandra Node.js driver) • Data accuracy checks: visual and programmatic • Built framework for testing out storage engines
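A programmatic data accuracy check of the kind mentioned on this slide might look something like the sketch below. The slides do not describe how the checks were actually implemented, so everything here is an assumption: the per-key counts are hard-coded sample data standing in for values read from two stores, and `compare_counts` is a hypothetical helper.

```python
def compare_counts(primary, secondary, tolerance=0):
    """Return {key: (primary_count, secondary_count)} for every key whose
    counts differ by more than `tolerance`. A key missing from one store
    is treated as a count of 0 there."""
    mismatches = {}
    for key in sorted(set(primary) | set(secondary)):
        a = primary.get(key, 0)
        b = secondary.get(key, 0)
        if abs(a - b) > tolerance:
            mismatches[key] = (a, b)
    return mismatches

# Sample data standing in for counts read from two different stores.
cassandra_counts = {"article:7": 230, "article:9": 58, "article:42": 17}
mongo_counts = {"article:7": 230, "article:9": 55}

strict_report = compare_counts(cassandra_counts, mongo_counts)
lenient_report = compare_counts(cassandra_counts, mongo_counts, tolerance=3)
```

The `tolerance` parameter reflects one possible design choice: counters written to different stores at slightly different times may legitimately drift by a small amount, so a check can flag only differences above a chosen threshold.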
  • 66. Service Architecture • Analytics • Real-time • Internal API
  • 71. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus • CQL 2/3, Composite Column, Thrift Interface • More about Node.js and Cassandra
  • 76. Points To Consider • Data consistency - Same in all data stores • How important is data durability? • Managing many servers (Chef, AWS, CSSH) • Managing and learning many different applications and tuning for them
  • 81. Summary • Polyglottany is not a sin • Know your data read/write patterns • Know the tools available to you • Know your compromises
  • 82. We’re Hiring
  • 83. Questions are guaranteed in life. Answers aren’t. Eric Lubow @elubow elubow@simplereach.com #cassandra12 Thank you.

Editor's notes

  7. SimpleReach is a social intelligence tool for content creators. We track every social action, on every major network, across the entire web in real time. That means every like, tweet, pin, stumble, and many more.