SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
August 8, 2012




Cassandra at eBay
    Time left: 29m 59s




                     Jay Patel
                     Architect, Platform Systems
                     @pateljay3001
eBay Marketplaces
 97 million active buyers and sellers
 200+ million items
 2 billion page views each day
 80 billion database calls each day
 5+ petabytes of site storage capacity
 80+ petabytes of analytics storage capacity

                                                2
How do we scale databases?
 Shard
   – Patterns: Modulus, lookup-based, range, etc.
   – Application sees only logical shard/database
 Replicate
   – Disaster recovery, read availability/scalability
 Big NOs
   – No transactions
   – No joins
   – No referential integrity constraints
                                                        3
We like Cassandra
 Multi-datacenter (active-active)    Write performance
 Availability - No SPOF              Distributed counters
 Scalability                         Hadoop support



We also utilize MongoDB & HBase




                                                              4
Are we replacing RDBMS with NoSQL?

          Not at all! But, complementing.
 Some use cases don’t fit well - sparse data, big data, schema
  optional, real-time analytics, …
 Many use cases don’t need top-tier set-ups - logging, tracking, …




                                                                  5
A glimpse on our Cassandra deployment
 Dozens of nodes across multiple clusters
 200 TB+ storage provisioned
 400M+ writes & 100M+ reads per day, and growing
 QA, LnP, and multiple Production clusters




                                                    6
Use Cases on Cassandra
      Social Signals on eBay product & item pages
      Hunch taste graph for eBay users & items
      Time series use cases (many):
     Mobile notification logging and tracking
     Tracking for fraud detection
     SOA request/response payload logging
     RedLaser server logs and analytics

                                                    7
Served by
Cassandra




            8
Manage signals via “Your Favorites”




                                      Whole page is
                                      served by
                                      Cassandra




                                                9
Why Cassandra for Social Signals?
 Need scalable counters
 Need real (or near) time analytics on collected social data
 Need good write performance
 Reads are not latency sensitive




                                                                10
Deployment
                 User request has no datacenter affinity


                           Non-sticky load balancing




Topology - NTS           Data is backed up periodically
RF - 2:2                 to protect against human or
Read CL - ONE            software error
Write CL – ONE

                                                       11
Data Model
             depends on query patterns




                                         12
Data Model (simplified)




                          13
Wait…



                    Duplicates!




        Oh, toggle button!
        Signal --> De-signal --> Signal…
                                       14
Yes, eventual consistency!
One scenario that produces duplicate signals in UserLike CF:
   1. Signal
   2. De-signal (1st operation is not propagated to all replica)
   3. Signal, again (1st operation is not propagated yet!)



 So, what’s the solution? Later…

                                                                   15
Social Signals, next phase: Real-time Analytics
 Most signaled or popular items per affinity groups (category, etc.)
 Aggregated item count per affinity group



                                                     Example affinity group




                                                                              16
Initial Data Model for real-time analytics

                                               Items in an affinitygroup
                                               is physically stored
                                               sorted by their signal
                                               count




                           Update counters for both individual item
                           and all the affinity groups that item
                           belongs to
Deployment, next phase




Topology - NTS
RF - 2:2:2
user1       bid
                                  item1
        buy

item2         watch               sell
                        user2




                                          19
Graph in Cassandra
Event consumers listen for site events (sell/bid/buy/watch) & populate graph in Cassandra




   30 million+ writes daily                Batch-oriented reads
   14 billion+ edges already                (for taste vector updates)
                                                                                    20
 Mobile notification logging and tracking
 Tracking for fraud detection
 SOA request/response payload logging
 RedLaser server logs and analytics




                                             21
A glimpse on Data Model
RedLaser tracking & monitoring console




                                         23
That’s all about the use cases..
Remember the duplicate problem in Use Case #1?




  Let’s see some options we considered to solve this…
                                                    24
Option 1 – Make ‘Like’ idempotent for UserLike
 Remove time (timeuuid) from the composite column name:
    Multiple signal operations are now Idempotent
    No need to read before de-signaling (deleting)




    X            Need timeuuid for ordering!
                 Already have a user with more than 1300 signals   25
Option 2 – Use strong consistency

 Local Quorum
  – Won’t help us. User requests are not geo-load balanced
    (no DC affinity).
 Quorum
  – Won’t survive during partition between DCs (or, one of the
    DC is down). Also, adds additional latency.

              X      Need to survive!
                                                             26
Option 3 – Adapt to eventual consistency
If desire survival!




                                                                              27
                      http://www.strangecosmos.com/content/item/101254.html
Adjustments to eventual consistency
 De-signal steps:
      – Don’t check whether item is already signaled by a user, or not
      – Read all (duplicate) signals from UserLike_unordered (new CF to avoid reading
        whole row from UserLike)
      – Delete those signals from UserLike_unordered and UserLike




Still, can get duplicate signals or false positives as there is a ‘read before delete’.
To shield further, do ‘repair on read’.                  Not a full story!
                                                                                     28
Lessons & Best Practices
• Choose proper Replication Factor and Consistency Level.
    – They alter latency, availability, durability, consistency and cost.
    – Cassandra supports tunable consistency, but remember strong consistency is not free.
• Consider all overheads in capacity planning.
    – Replicas, compaction, secondary indexes, etc.
• De-normalize and duplicate for read performance.
    – But don’t de-normalize if you don’t need to.
• Many ways to model data in Cassandra.
    – The best way depends on your use case and query patterns.
                More on http://ebaytechblog.com?p=1308
Thank You
  @pateljay3001
  #cassandra12
                  30

Contenu connexe

Tendances

Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theorem
Grisha Weintraub
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 
Big Data in Real-Time at Twitter
Big Data in Real-Time at TwitterBig Data in Real-Time at Twitter
Big Data in Real-Time at Twitter
nkallen
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
HostedbyConfluent
 

Tendances (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theorem
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary Differences
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Big Data in Real-Time at Twitter
Big Data in Real-Time at TwitterBig Data in Real-Time at Twitter
Big Data in Real-Time at Twitter
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Under the Hood of a Shard-per-Core Database Architecture
Under the Hood of a Shard-per-Core Database ArchitectureUnder the Hood of a Shard-per-Core Database Architecture
Under the Hood of a Shard-per-Core Database Architecture
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
 
Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
 

En vedette

Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
DataStax
 

En vedette (7)

Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseSolr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global Cassandra
 
Cql – cassandra query language
Cql – cassandra query languageCql – cassandra query language
Cql – cassandra query language
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 

Similaire à Cassandra at eBay - Cassandra Summit 2012

Stream Processing with CompletableFuture and Flow in Java 9
Stream Processing with CompletableFuture and Flow in Java 9Stream Processing with CompletableFuture and Flow in Java 9
Stream Processing with CompletableFuture and Flow in Java 9
Trayan Iliev
 
Real World Cassandra
Real World CassandraReal World Cassandra
Real World Cassandra
GiltTech
 
How to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and FastHow to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and Fast
MapR Technologies
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
Kognitio
 
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
IndicThreads
 

Similaire à Cassandra at eBay - Cassandra Summit 2012 (20)

Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
 
Cassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and LimitationsCassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and Limitations
 
Stream Processing with CompletableFuture and Flow in Java 9
Stream Processing with CompletableFuture and Flow in Java 9Stream Processing with CompletableFuture and Flow in Java 9
Stream Processing with CompletableFuture and Flow in Java 9
 
Real World Cassandra
Real World CassandraReal World Cassandra
Real World Cassandra
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
The 5 Stages of Scale
The 5 Stages of ScaleThe 5 Stages of Scale
The 5 Stages of Scale
 
How to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and FastHow to Make Hadoop Easy, Dependable and Fast
How to Make Hadoop Easy, Dependable and Fast
 
Dynamo Systems - QCon SF 2012 Presentation
Dynamo Systems - QCon SF 2012 PresentationDynamo Systems - QCon SF 2012 Presentation
Dynamo Systems - QCon SF 2012 Presentation
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Data Engineering for Data Scientists
Data Engineering for Data Scientists Data Engineering for Data Scientists
Data Engineering for Data Scientists
 
Hive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloud
Hive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloudHive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloud
Hive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloud
 
Azure and cloud design patterns
Azure and cloud design patternsAzure and cloud design patterns
Azure and cloud design patterns
 
Real time analytics
Real time analyticsReal time analytics
Real time analytics
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016
 
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
 
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
 
High-speed, Reactive Microservices 2017
High-speed, Reactive Microservices 2017High-speed, Reactive Microservices 2017
High-speed, Reactive Microservices 2017
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Cassandra at eBay - Cassandra Summit 2012

  • 1. August 8, 2012 Cassandra at eBay Time left: 29m 59s Jay Patel Architect, Platform Systems @pateljay3001
  • 2. eBay Marketplaces  97 million active buyers and sellers  200+ million items  2 billion page views each day  80 billion database calls each day  5+ petabytes of site storage capacity  80+ petabytes of analytics storage capacity 2
  • 3. How do we scale databases?  Shard – Patterns: Modulus, lookup-based, range, etc. – Application sees only logical shard/database  Replicate – Disaster recovery, read availability/scalability  Big NOs – No transactions – No joins – No referential integrity constraints 3
  • 4. We like Cassandra  Multi-datacenter (active-active)  Write performance  Availability - No SPOF  Distributed counters  Scalability  Hadoop support We also utilize MongoDB & HBase 4
  • 5. Are we replacing RDBMS with NoSQL? Not at all! But, complementing.  Some use cases don’t fit well - sparse data, big data, schema optional, real-time analytics, …  Many use cases don’t need top-tier set-ups - logging, tracking, … 5
  • 6. A glimpse on our Cassandra deployment  Dozens of nodes across multiple clusters  200 TB+ storage provisioned  400M+ writes & 100M+ reads per day, and growing  QA, LnP, and multiple Production clusters 6
  • 7. Use Cases on Cassandra Social Signals on eBay product & item pages Hunch taste graph for eBay users & items Time series use cases (many):  Mobile notification logging and tracking  Tracking for fraud detection  SOA request/response payload logging  RedLaser server logs and analytics 7
  • 9. Manage signals via “Your Favorites” Whole page is served by Cassandra 9
  • 10. Why Cassandra for Social Signals?  Need scalable counters  Need real (or near) time analytics on collected social data  Need good write performance  Reads are not latency sensitive 10
  • 11. Deployment User request has no datacenter affinity Non-sticky load balancing Topology - NTS Data is backed up periodically RF - 2:2 to protect against human or Read CL - ONE software error Write CL – ONE 11
  • 12. Data Model depends on query patterns 12
  • 14. Wait… Duplicates! Oh, toggle button! Signal --> De-signal --> Signal… 14
  • 15. Yes, eventual consistency! One scenario that produces duplicate signals in UserLike CF: 1. Signal 2. De-signal (1st operation is not propagated to all replica) 3. Signal, again (1st operation is not propagated yet!) So, what’s the solution? Later… 15
  • 16. Social Signals, next phase: Real-time Analytics  Most signaled or popular items per affinity groups (category, etc.)  Aggregated item count per affinity group Example affinity group 16
  • 17. Initial Data Model for real-time analytics Items in an affinitygroup is physically stored sorted by their signal count Update counters for both individual item and all the affinity groups that item belongs to
  • 19. user1 bid item1 buy item2 watch sell user2 19
  • 20. Graph in Cassandra Event consumers listen for site events (sell/bid/buy/watch) & populate graph in Cassandra  30 million+ writes daily  Batch-oriented reads  14 billion+ edges already (for taste vector updates) 20
  • 21.  Mobile notification logging and tracking  Tracking for fraud detection  SOA request/response payload logging  RedLaser server logs and analytics 21
  • 22. A glimpse on Data Model
  • 23. RedLaser tracking & monitoring console 23
  • 24. That’s all about the use cases.. Remember the duplicate problem in Use Case #1? Let’s see some options we considered to solve this… 24
  • 25. Option 1 – Make ‘Like’ idempotent for UserLike  Remove time (timeuuid) from the composite column name:  Multiple signal operations are now Idempotent  No need to read before de-signaling (deleting) X Need timeuuid for ordering! Already have a user with more than 1300 signals 25
  • 26. Option 2 – Use strong consistency  Local Quorum – Won’t help us. User requests are not geo-load balanced (no DC affinity).  Quorum – Won’t survive during partition between DCs (or, one of the DC is down). Also, adds additional latency. X Need to survive! 26
  • 27. Option 3 – Adapt to eventual consistency If desire survival! 27 http://www.strangecosmos.com/content/item/101254.html
  • 28. Adjustments to eventual consistency De-signal steps: – Don’t check whether item is already signaled by a user, or not – Read all (duplicate) signals from UserLike_unordered (new CF to avoid reading whole row from UserLike) – Delete those signals from UserLike_unordered and UserLike Still, can get duplicate signals or false positives as there is a ‘read before delete’. To shield further, do ‘repair on read’. Not a full story! 28
  • 29. Lessons & Best Practices • Choose proper Replication Factor and Consistency Level. – They alter latency, availability, durability, consistency and cost. – Cassandra supports tunable consistency, but remember strong consistency is not free. • Consider all overheads in capacity planning. – Replicas, compaction, secondary indexes, etc. • De-normalize and duplicate for read performance. – But don’t de-normalize if you don’t need to. • Many ways to model data in Cassandra. – The best way depends on your use case and query patterns. More on http://ebaytechblog.com?p=1308
  • 30. Thank You @pateljay3001 #cassandra12 30