From 100s to 100s of
Millions
July 2011
Erik Onnen
About Me
•   Director of Platform Engineering at Urban Airship (.75
    years)
•   Previously Principal Engineer at Jive Software (3 years)
•   12 years of large-scale, distributed systems experience going
    back to CORBA
•   Cassandra, HBase, Kafka and ZooKeeper contributor -
    most recently CASSANDRA-2463
In this Talk

•   About Urban Airship
•   Systems Overview
•   A Tale of Storage Engines
•   Our Cassandra Deployment
•   Battle Scars
    •   Development Lessons Learned
    •   Operations Lessons Learned
•   Looking Forward
What is an Urban Airship?
•   Hosting for mobile services that developers should not
    build themselves
•   Unified API for services across platforms
•   SLAs for throughput and latency
By The Numbers

•   Over 160 million active application installs use our system
    across over 80 million unique devices
•   Freemium API peaks at 700 requests/second, dedicated
    customer API 10K requests/second
    •   Over half of those are device check-ins
    •   Transactions - send push, check status, get content
•   At any given point in time, we have ~ 1.1 million secure
    socket connections into our transactional core
•   It took the company 6 months to deliver its first 1M messages;
    we just broke 4.2B
Transactional System

•   Edge Systems:
    •   API - Apache/Python/django+piston+pycassa
    •   Device negotiation - Java NIO + Hector
    •   Message Delivery - Python, Java NIO + Hector
    •   Device data - Java HTTPS endpoint
•   Persistence
    •   Sharded PostgreSQL
    •   Cassandra 0.7
    •   MongoDB 1.7
A Tale of Storage Engines

•   “Is there a NoSQL system you guys don’t use?”
    •   Riak :)
•   We do use:
    •   Cassandra
    •   HBase
    •   Redis
    •   MongoDB
•   We’re converging on Cassandra + PostgreSQL for
    transactional data and HBase for long-haul storage
A Tale of Storage Engines

•   PostgreSQL
    •   Bootstrapped the company on PostgreSQL in EC2
    •   Highly relational, large index model
    •   Layered in memcached (cache-aside; see the sketch after this slide)
    •   Writes weren’t scaling after ~ 6 months
    •   Continued to use it for several silos of data, but needed a
        way to grow more easily
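The memcached layering above is the standard cache-aside pattern: read from the
cache, fall back to PostgreSQL on a miss, then repopulate. A minimal Java sketch
of that pattern, assuming a hypothetical DeviceRecordCache class and key scheme;
an in-memory map stands in for a real memcached client so no client API is assumed:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class DeviceRecordCache {
        // Stand-in for a memcached client; a real deployment would use one.
        private final Map<String, String> cache = new ConcurrentHashMap<String, String>();

        public String getDeviceRecord(String deviceId) {
            String cached = cache.get(deviceId);        // 1. try the cache first
            if (cached != null) {
                return cached;
            }
            String fromDb = loadFromPostgres(deviceId); // 2. miss: hit the database
            if (fromDb != null) {
                cache.put(deviceId, fromDb);            // 3. repopulate for later reads
            }
            return fromDb;
        }

        private String loadFromPostgres(String deviceId) {
            // Placeholder for a JDBC query against the appropriate PostgreSQL shard.
            return null;
        }
    }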
A Tale of Storage Engines
•   MongoDB
    •   Initially, we loved Mongo
        •   Document databases are cool
        •   BSON is nice
    •   As data set grew, we learned a lot about MongoDB
    •   “MongoDB does not wait for a response by default when
        writing to the database.”
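For context on the quote above: with the 1.x/2.x-era MongoDB Java driver, writes
were fire-and-forget unless a stricter write concern was requested explicitly. A
minimal sketch assuming that driver generation; the database and collection names
are illustrative:

    import com.mongodb.BasicDBObject;
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.Mongo;
    import com.mongodb.WriteConcern;
    import com.mongodb.WriteResult;

    public class SafeWriteExample {
        public static void main(String[] args) throws Exception {
            Mongo mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("airship");                      // illustrative database name
            DBCollection devices = db.getCollection("devices");  // illustrative collection name

            BasicDBObject doc = new BasicDBObject("deviceId", "abc123").append("active", true);

            // The default write concern did not wait for acknowledgement; passing
            // WriteConcern.SAFE forces a getLastError round trip so failures surface.
            WriteResult result = devices.insert(doc, WriteConcern.SAFE);
            System.out.println("acknowledged, error = " + result.getError());
        }
    }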
A Tale of Storage Engines
•   MongoDB - Read/Write Problems
    •   Early days (1.2) one global lock (reads block
        writes and vice versa)
    •   Later, one read lock, one write lock per server
    •   Long-running queries were often devastating
        •   Replication would fall too far behind and stop
        •   No writes or updates
        •   Effectively a failure for most clients
    •   With sharding, queries for anything other than the shard
        key talk to every node in the cluster
A Tale of Storage Engines
•   MongoDB - Update Problems
    •   Simple updates (i.e. counters) were fine
    •   Bigger updates commonly resulted in large scans of the
        collection depending on position == heavy disk I/O
    •   Updates frequently spill to the end of the collection datafile,
        leaving “holes” but not sparse files
    •   Those “holes” get MMap’d even though they’re not used
    •   Updates that move data acquire multiple locks, commonly
        blocking other read/write operations
A Tale of Storage Engines
•   MongoDB - Optimization Problems
    •   Compacting a collection locks the entire collection
    •   Read slave was too busy to be a backup; needed moar
        RAMs but we were already on High-Memory EC2, with
        nowhere else to go
    •   Mongo MMaps everything - when your data set is
        bigger than RAM, you’d better have fast disks
    •   Until 1.8, no support for sparse indexes
A Tale of Storage Engines
•   MongoDB - Ops Issues
    •   Lots of good information in mongostat
    •   Recovering a crashed system was effectively impossible
        without disabling indexes first (not the default)
    •   Replica sets never worked for us in testing, lots of
        inconsistencies in failure scenarios
    •   Scattered records led to lots of I/O, which hurt on bad
        disks (EC2)
Cassandra at Urban Airship

•   Summer of 2010 - with no faith left in MongoDB, we started
    a migration to Cassandra
•   Lots of load & performance (L&P) testing, client analysis, etc.
•   December 2010 - Cassandra backed 85% of our Android
    stack’s persistence
    •   Six EC2 XLs, each serving:
        •   30GB data
        •   ~1000 reads/second/node
        •   ~750 writes/second/node
Cassandra at Urban Airship

•   Why Cassandra?
    •   Well suited for most of our data model (simple DAGs)
        •   Lots of UUIDs and hashes partition well
        •   Retrievals don’t need ordering beyond keys or TSD
    •   Rolling upgrades FTW
    •   Dynamic rebalancing and node addition
    •   Column TTLs are huge for us (see the sketch after this list)
    •   Awesome community :)
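A minimal sketch of the column TTLs called out above, written against a Hector
0.7-era client; the cluster, keyspace, column family, and row key are illustrative
and the exact factory signatures vary between Hector releases:

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class TtlColumnExample {
        public static void main(String[] args) {
            StringSerializer ss = StringSerializer.get();
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
            Keyspace keyspace = HFactory.createKeyspace("Airship", cluster);

            Mutator<String> mutator = HFactory.createMutator(keyspace, ss);
            int ttlSeconds = 86400; // the column silently expires after one day
            mutator.insert("device:abc123", "DeviceCheckins",
                    HFactory.createColumn("last_seen", "2011-07-01T00:00:00Z", ttlSeconds, ss, ss));
        }
    }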
Cassandra at Urban Airship

•   Why Cassandra cont’d?
    •   Particularly well suited to working around EC2 availability
        •   Needed a cross-AZ strategy - we had seen EBS issues
            in the past and didn’t trust fault containment within a zone
        •   Didn’t want locality of replication, so we needed to stripe
            across AZs
        •   Read repair and handoff generally did the right thing
            when a node would flap (Ubuntu #708920)
        •   No SPoF
    •   Ability to alter consistency levels (CLs) on a per-operation
        basis (see the sketch below)
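One way to vary consistency levels by operation type with a Hector 0.7-era client
is a ConfigurableConsistencyLevel policy; a minimal sketch, where the keyspace name
and the ONE/QUORUM choices are assumptions, not UA's production settings:

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class PerOperationConsistency {
        public static void main(String[] args) {
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");

            // Reads and writes get different consistency levels from one policy object.
            ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
            policy.setDefaultReadConsistencyLevel(HConsistencyLevel.ONE);     // cheap reads
            policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.QUORUM); // durable writes

            Keyspace keyspace = HFactory.createKeyspace("Airship", cluster, policy);
            // Mutators and queries created from this keyspace inherit those levels.
            HFactory.createMutator(keyspace, StringSerializer.get());
        }
    }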
Battle Scars - Development
•   Know your data model
    •   Creating indexes after the fact is a PITA
    •   Design around wide rows
        •   I/O problems
        •   Thrift problems
        •   Count problems
    •   Favor JSON over packed binaries if possible
•   Careful with Thrift in the stack
•   Don’t fear the StorageProxy
Battle Scars - Development
•   Assume failure in the client
    •   Read timeout vs. connection refused
    •   When maintaining your own indexes, try to clean up
        after failures
    •   Be ready to clean up inconsistencies anyway
    •   Verify client library assumptions and exception handling
        •   Retry now vs. retry later?
        •   Compensating action during failures?
•   Don’t avoid the Cassandra code
•   Embed Cassandra for testing (see the sketch below)
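A minimal sketch of embedding Cassandra for tests, assuming the 0.7-era
EmbeddedCassandraService and a test cassandra.yaml on disk; the config path is
illustrative and the exact bootstrap differs across Cassandra versions:

    import org.apache.cassandra.service.EmbeddedCassandraService;

    public class EmbeddedCassandraHarness {
        private static EmbeddedCassandraService cassandra;

        // Call once before the suite runs, e.g. from a JUnit @BeforeClass method.
        public static void startEmbedded() throws Exception {
            // Point Cassandra at a test config; the path below is an assumption.
            System.setProperty("cassandra.config", "file:src/test/resources/cassandra.yaml");
            cassandra = new EmbeddedCassandraService();
            cassandra.start(); // boots an in-process node tests can reach on localhost
        }
    }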
Battle Scars - Ops
•   Cassandra in EC2:
    •   Ensure Dynamic Snitch is enabled
    •   Disk I/O
        •   Avoid EBS except for snapshot backups or use S3
        •   Stripe ephemerals, not EBS volumes
        •   Avoid smaller instances altogether
    •   Don’t always assume traversing close-proximity AZs is
        more expensive
    •   Balance RAM cost vs. the cost of additional hosts and
        time spent with GC logs
Battle Scars - Ops
•   Java Best Practices:
    •   All Java services are managed via the same set of
        scripts
        •   In most cases, operators don’t treat Cassandra
            differently from HBase
        •   Simple mechanism to take thread or heap dump
        •   All logging is consistent - GC, application, stdout/stderr
        •   Init scripts use the same scripts operators do
    •   Bare metal will rock your world
    •   +UseLargePages will rock your world too
Battle Scars - Ops
[Charts: ParNew GC effectiveness (MB collected), mean ParNew GC time
(collection time in ms), and ParNew collection count, comparing Bare Metal
vs. EC2 XL]
Battle Scars - Ops
•   Java Best Practices cont’d:
    •   Get familiar with GC logs (-XX:+PrintGCDetails)
        •   Understand what degenerate CMS collection looks like
        •   We settled at -XX:CMSInitiatingOccupancyFraction=60
        •   Possibly experiment with tenuring threshold
    •   When in doubt take a thread dump
    •   TDA (http://java.net/projects/tda/)
    •   Eclipse MAT (http://www.eclipse.org/mat/)
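Pulling the flags from this slide together, an illustrative JVM argument set; only
CMSInitiatingOccupancyFraction=60 comes from the deck, while heap size, tenuring
threshold, and the GC log path are assumptions to experiment with:

    -Xms8G -Xmx8G
    -XX:+UseParNewGC
    -XX:+UseConcMarkSweepGC
    -XX:CMSInitiatingOccupancyFraction=60
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:MaxTenuringThreshold=2
    -XX:+UseLargePages
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -Xloggc:/var/log/cassandra/gc.log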
Battle Scars - Ops
•   Understand when to compact
•   Understand upgrade implications for datafiles
•   Watch hinted handoff closely
•   Monitor JMX religiously
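A minimal sketch of "monitor JMX religiously" from plain Java; the host, the
0.7-era default JMX port (8080), and the MBean/attribute names are assumptions to
adjust for your cluster (browse them in jconsole first):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxPendingTasksProbe {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://cassandra-host:8080/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // MBean and attribute names are illustrative for a 0.7-era node.
                ObjectName compaction = new ObjectName(
                        "org.apache.cassandra.db:type=CompactionManager");
                Object pending = mbs.getAttribute(compaction, "PendingTasks");
                System.out.println("compaction pending tasks: " + pending);
            } finally {
                connector.close();
            }
        }
    }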
Looking Forward
•   Cassandra is a great hammer but not everything is a nail
•   Coprocessors would be awesome (hint hint)
•   Still spend too much time worrying about GC
•   Glad to see the ecosystem around the product evolving
    •   CQL
    •   Pig
    •   Brisk
•   Guardedly optimistic about off-heap data management
Thanks to



•   jbellis, driftx
•   Datastax
•   Whoever wrote TDA
•   SAP
Thanks!




•   Urban Airship: http://urbanairship.com/
•   We’re hiring! http://urbanairship.com/company/jobs/
•   Me @eonnen or erik at

From 100s to 100s of Millions

  • 1. From 100s to 100s of Millions July 2011 Erik Onnen
  • 2. About Me • Director of Platform Engineering at Urban Airship (.75 years) • Previously Principal Engineer at Jive Software (3 years) • 12 years large scale, distributed systems experience going back to CORBA • Cassandra, HBase, Kafka and ZooKeeper contributor - most recently CASSANDRA-2463
  • 3. In this Talk • About Urban Airship • Systems Overview • A Tale of Storage Engines • Our Cassandra Deployment • Battle Scars • Development Lessons Learned • Operations Lessons Learned • Looking Forward
  • 4. What is an Urban Airship? • Hosting for mobile services that developers should not build themselves • Unified API for services across platforms • SLAs for throughput, latency
  • 6. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices
  • 7. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices • Freemium API peaks at 700 requests/second, dedicated customer API 10K requests/second
  • 8. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices • Freemium API peaks at 700 requests/second, dedicated customer API 10K requests/second • Over half of those are device check-ins
  • 9. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices • Freemium API peaks at 700 requests/second, dedicated customer API 10K requests/second • Over half of those are device check-ins • Transactions - send push, check status, get content
  • 10. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices • Freemium API peaks at 700 requests/second, dedicated customer API 10K requests/second • Over half of those are device check-ins • Transactions - send push, check status, get content • At any given point in time, we have ~ 1.1 million secure socket connections into our transactional core
  • 11. By The Numbers • Over 160 million active application installs use our system across over 80 million unique devices • Freemium API peaks at 700 requests/second, dedicated customer API 10K requests/second • Over half of those are device check-ins • Transactions - send push, check status, get content • At any given point in time, we have ~ 1.1 million secure socket connections into our transactional core • 6 months for the company to deliver 1M messages, just broke 4.2B
  • 13. Transactional System • Edge Systems: • API - Apache/Python/django+piston+pycassa • Device negotiation - Java NIO + Hector • Message Delivery - Python, Java NIO + Hector • Device data - Java HTTPS endpoint
  • 14. Transactional System • Edge Systems: • API - Apache/Python/django+piston+pycassa • Device negotiation - Java NIO + Hector • Message Delivery - Python, Java NIO + Hector • Device data - Java HTTPS endpoint • Persistence • Sharded PostgreSQL • Cassandra 0.7 • MongoDB 1.7
  • 15. A Tale of Storage Engines
  • 16. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?”
  • 17. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :)
  • 18. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use:
  • 19. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use: • Cassandra
  • 20. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use: • Cassandra • HBase
  • 21. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use: • Cassandra • HBase • Redis
  • 22. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use: • Cassandra • HBase • Redis • MongoDB
  • 23. A Tale of Storage Engines • “Is there a NoSQL system you guys don’t use?” • Riak :) • We do use: • Cassandra • HBase • Redis • MongoDB • We’re converging on Cassandra + PostgreSQL for transactional and HBase for long haul
  • 24. A Tale of Storage Engines
  • 25.
  • 26.
  • 27. A Tale of Storage Engines
  • 28. A Tale of Storage Engines • PostgreSQL
  • 29. A Tale of Storage Engines • PostgreSQL • Bootstrapped the company on PostgreSQL in EC2
  • 30. A Tale of Storage Engines • PostgreSQL • Bootstrapped the company on PostgreSQL in EC2 • Highly relational, large index model
  • 31. A Tale of Storage Engines • PostgreSQL • Bootstrapped the company on PostgreSQL in EC2 • Highly relational, large index model • Layered in memcached
  • 32. A Tale of Storage Engines • PostgreSQL • Bootstrapped the company on PostgreSQL in EC2 • Highly relational, large index model • Layered in memcached • Writes weren’t scaling after ~ 6 months
  • 33. A Tale of Storage Engines • PostgreSQL • Bootstrapped the company on PostgreSQL in EC2 • Highly relational, large index model • Layered in memcached • Writes weren’t scaling after ~ 6 months • Continued to use for several silos of data but needed a way to grow more easily
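
A minimal sketch of the kind of read-through memcached layer described in the slides above, assuming python-memcached and psycopg2; the table, key, and connection names are illustrative, not Urban Airship's actual schema. Note that this pattern only shields PostgreSQL from read traffic; every write still lands on the primary, which is consistent with writes becoming the bottleneck.

    import json
    import memcache          # python-memcached client
    import psycopg2          # PostgreSQL driver

    mc = memcache.Client(['10.0.0.5:11211'])
    pg = psycopg2.connect("dbname=airship user=app")   # hypothetical DSN

    def get_device(device_id):
        # Read-through cache: serve hot rows from memcached, fall back to
        # PostgreSQL on a miss and repopulate the cache entry.
        key = 'device:%s' % device_id
        cached = mc.get(key)
        if cached is not None:
            return json.loads(cached)
        cur = pg.cursor()
        cur.execute("SELECT alias, active FROM devices WHERE id = %s", (device_id,))
        row = cur.fetchone()
        cur.close()
        if row is None:
            return None
        doc = {'alias': row[0], 'active': row[1]}
        mc.set(key, json.dumps(doc), time=300)   # expire after 5 minutes
        return doc
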
  • 34. A Tale of Storage Engines
  • 35. A Tale of Storage Engines • MongoDB
  • 36. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo
  • 37. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo • Document databases are cool
  • 38. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo • Document databases are cool • BSON is nice
  • 39. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo • Document databases are cool • BSON is nice • As data set grew, we learned a lot about MongoDB
  • 40. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo • Document databases are cool • BSON is nice • As data set grew, we learned a lot about MongoDB • “MongoDB does not wait for a response by default when writing to the database.”
  • 41. A Tale of Storage Engines • MongoDB • Initially, we loved Mongo • Document databases are cool • BSON is nice • As data set grew, we learned a lot about MongoDB • “MongoDB does not wait for a response by default when writing to the database.”
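
For context, a hedged sketch of what that default meant in the PyMongo API of the era (collection and field names are made up): the bare insert returns as soon as the message is on the wire, while safe=True issues a getLastError round trip and raises if the server reports a failure. Later driver releases changed the default to acknowledged writes.

    from pymongo import Connection   # PyMongo 1.x / 2.x era API

    db = Connection('mongo01', 27017).airship   # hypothetical host and database

    # Fire-and-forget (the default at the time): the client never learns
    # about server-side errors such as duplicate keys or failed writes.
    db.devices.insert({'_id': 'device:abc', 'alias': 'user-42'})

    # Acknowledged write: safe=True triggers getLastError and raises
    # OperationFailure if the server reports a problem.
    db.devices.insert({'_id': 'device:def', 'alias': 'user-43'}, safe=True)

    # Stronger still: block until the write has replicated to 2 members.
    db.devices.insert({'_id': 'device:ghi', 'alias': 'user-44'}, safe=True, w=2)
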
  • 42. A Tale of Storage Engines
  • 43. A Tale of Storage Engines • MongoDB - Read/Write Problems
  • 44. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa)
  • 45. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server
  • 46. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server • Long running queries were often devastating
  • 47. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server • Long running queries were often devastating • Replication would fall too far behind and stop
  • 48. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server • Long running queries were often devastating • Replication would fall too far behind and stop • No writes or updates
  • 49. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server • Long running queries were often devastating • Replication would fall too far behind and stop • No writes or updates • Effectively a failure for most clients
  • 50. A Tale of Storage Engines • MongoDB - Read/Write Problems • Early days (1.2) one global lock (reads block writes and vice versa) • Later, one read lock, one write lock per server • Long running queries were often devastating • Replication would fall too far behind and stop • No writes or updates • Effectively a failure for most clients • With replication, queries for anything other than the shard key talk to every node in the cluster
  • 51. A Tale of Storage Engines
  • 52. A Tale of Storage Engines • MongoDB - Update Problems
  • 53. A Tale of Storage Engines • MongoDB - Update Problems • Simple updates (i.e. counters) were fine
  • 54. A Tale of Storage Engines • MongoDB - Update Problems • Simple updates (i.e. counters) were fine • Bigger updates commonly resulted in large scans of the collection depending on position == heavy disk I/O
  • 55. A Tale of Storage Engines • MongoDB - Update Problems • Simple updates (i.e. counters) were fine • Bigger updates commonly resulted in large scans of the collection depending on position == heavy disk I/O • Frequently spill to end of the collection datafile leaving “holes” but not sparse files
  • 56. A Tale of Storage Engines • MongoDB - Update Problems • Simple updates (i.e. counters) were fine • Bigger updates commonly resulted in large scans of the collection depending on position == heavy disk I/O • Frequently spill to end of the collection datafile leaving “holes” but not sparse files • Those “holes” get MMap’d even though they’re not used
  • 57. A Tale of Storage Engines • MongoDB - Update Problems • Simple updates (i.e. counters) were fine • Bigger updates commonly resulted in large scans of the collection depending on position == heavy disk I/O • Frequently spill to end of the collection datafile leaving “holes” but not sparse files • Those “holes” get MMap’d even though they’re not used • Updates moving data acquire multiple locks commonly blocking other read/write operations
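
A small illustration of that difference, again with hypothetical collection names: the $inc counter stays in place, while an update that grows a document can exceed its allocated space and force MongoDB to rewrite it elsewhere in the datafile.

    from pymongo import Connection

    db = Connection('mongo01', 27017).airship   # hypothetical host and database

    # In-place update: $inc on a numeric field does not change the document's
    # size, so it is applied where the document already lives.
    db.stats.update({'_id': 'app:123'}, {'$inc': {'pushes_sent': 1}})

    # Growing update: pushing onto an array can overflow the space allocated
    # for the document, forcing a move to the end of the collection datafile
    # and leaving a "hole" behind at the old location.
    db.devices.update({'_id': 'device:abc'}, {'$push': {'tags': 'sports-scores'}})
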
  • 58. A Tale of Storage Engines
  • 59. A Tale of Storage Engines • MongoDB - Optimization Problems
  • 60. A Tale of Storage Engines • MongoDB - Optimization Problems • Compacting a collection locks the entire collection
  • 61. A Tale of Storage Engines • MongoDB - Optimization Problems • Compacting a collection locks the entire collection • Read slave was too busy to be a backup, needed moar RAMs but were already on High-Memory EC2, nowhere else to go
  • 62. A Tale of Storage Engines • MongoDB - Optimization Problems • Compacting a collection locks the entire collection • Read slave was too busy to be a backup, needed moar RAMs but were already on High-Memory EC2, nowhere else to go • Mongo MMaps everything - when your data set is bigger than RAM, you better have fast disks
  • 63. A Tale of Storage Engines • MongoDB - Optimization Problems • Compacting a collection locks the entire collection • Read slave was too busy to be a backup, needed moar RAMs but were already on High-Memory EC2, nowhere else to go • Mongo MMaps everything - when your data set is bigger than RAM, you better have fast disks • Until 1.8, no support for sparse indexes
  • 64. A Tale of Storage Engines
  • 65. A Tale of Storage Engines • MongoDB - Ops Issues
  • 66. A Tale of Storage Engines • MongoDB - Ops Issues • Lots of good information in mongostat
  • 67. A Tale of Storage Engines • MongoDB - Ops Issues • Lots of good information in mongostat • Recovering a crashed system was effectively impossible without disabling indexes first (not the default)
  • 68. A Tale of Storage Engines • MongoDB - Ops Issues • Lots of good information in mongostat • Recovering a crashed system was effectively impossible without disabling indexes first (not the default) • Replica sets never worked for us in testing, lots of inconsistencies in failure scenarios
  • 69. A Tale of Storage Engines • MongoDB - Ops Issues • Lots of good information in mongostat • Recovering a crashed system was effectively impossible without disabling indexes first (not the default) • Replica sets never worked for us in testing, lots of inconsistencies in failure scenarios • Scattered records lead to lots of I/O that hurt on bad disks (EC2)
  • 71. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra
  • 72. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc.
  • 73. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc. • December 2010 - Cassandra backed 85% of our Android stack’s persistence
  • 74. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc. • December 2010 - Cassandra backed 85% of our Android stack’s persistence • Six EC2 XLs with each serving:
  • 75. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc. • December 2010 - Cassandra backed 85% of our Android stack’s persistence • Six EC2 XLs with each serving: • 30GB data
  • 76. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc. • December 2010 - Cassandra backed 85% of our Android stack’s persistence • Six EC2 XLs with each serving: • 30GB data • ~1000 reads/second/node
  • 77. Cassandra at Urban Airship • Summer of 2010 - no faith left in MongoDB, started a migration to Cassandra • Lots of L&P testing, client analysis, etc. • December 2010 - Cassandra backed 85% of our Android stack’s persistence • Six EC2 XLs with each serving: • 30GB data • ~1000 reads/second/node • ~750 writes/second/node
  • 79. Cassandra at Urban Airship • Why Cassandra?
  • 80. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs)
  • 81. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well
  • 82. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well • Retrievals don’t need ordering beyond keys or TSD
  • 83. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well • Retrievals don’t need ordering beyond keys or TSD • Rolling upgrades FTW
  • 84. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well • Retrievals don’t need ordering beyond keys or TSD • Rolling upgrades FTW • Dynamic rebalancing and node addition
  • 85. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well • Retrievals don’t need ordering beyond keys or TSD • Rolling upgrades FTW • Dynamic rebalancing and node addition • Column TTLs huge for us
  • 86. Cassandra at Urban Airship • Why Cassandra? • Well suited for most of our data model (simple DAGs) • Lots of UUIDs and hashes partition well • Retrievals don’t need ordering beyond keys or TSD • Rolling upgrades FTW • Dynamic rebalancing and node addition • Column TTLs huge for us • Awesome community :)
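
A minimal pycassa sketch of the column TTL feature called out above; the keyspace, column family, and key names are hypothetical. The TTL is set per column at write time, so stale data expires without any external purge job.

    import pycassa

    pool = pycassa.ConnectionPool('Transactional', ['cass01:9160'])   # hypothetical
    checkins = pycassa.ColumnFamily(pool, 'DeviceCheckins')

    # The last_checkin column is removed automatically after 30 days.
    checkins.insert('device:abc123',
                    {'last_checkin': '2011-07-01T12:00:00Z'},
                    ttl=30 * 24 * 3600)
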
  • 88. Cassandra at Urban Airship • Why Cassandra cont’d?
  • 89. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability
  • 90. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability • Needed a cross AZ strategy - we had seen EBS issues in the past, didn’t trust fault containment w/n a zone
  • 91. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability • Needed a cross AZ strategy - we had seen EBS issues in the past, didn’t trust fault containment w/n a zone • Didn’t want locality of replication so needed to stripe across AZs
  • 92. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability • Needed a cross AZ strategy - we had seen EBS issues in the past, didn’t trust fault containment w/n a zone • Didn’t want locality of replication so needed to stripe across AZs • Read repair and handoff generally did the right thing when a node would flap (Ubuntu #708920)
  • 93. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability • Needed a cross AZ strategy - we had seen EBS issues in the past, didn’t trust fault containment w/n a zone • Didn’t want locality of replication so needed to stripe across AZs • Read repair and handoff generally did the right thing when a node would flap (Ubuntu #708920) • No SPoF
  • 94. Cassandra at Urban Airship • Why Cassandra cont’d? • Particularly well suited to working around EC2 availability • Needed a cross AZ strategy - we had seen EBS issues in the past, didn’t trust fault containment w/n a zone • Didn’t want locality of replication so needed to stripe across AZs • Read repair and handoff generally did the right thing when a node would flap (Ubuntu #708920) • No SPoF • Ability to alter CLs on a per operation basis
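
A hedged sketch of the per-operation consistency levels mentioned above, using pycassa with hypothetical keyspace and column family names: the same client can read at ONE where latency matters most and write at QUORUM where consistency matters most.

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel   # Thrift-generated enum

    pool = pycassa.ConnectionPool('Transactional', ['cass01:9160'])
    devices = pycassa.ColumnFamily(pool, 'Devices')

    # Latency-sensitive path: any single replica is good enough for a check-in read.
    row = devices.get('device:abc123', read_consistency_level=ConsistencyLevel.ONE)

    # Correctness-sensitive path: a quorum write, so a later quorum read is
    # guaranteed to observe it even if one replica is flapping.
    devices.insert('device:abc123', {'active': 'true'},
                   write_consistency_level=ConsistencyLevel.QUORUM)
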
  • 95. Battle Scars - Development
  • 96. Battle Scars - Development • Know your data model
  • 97. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA
  • 98. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows
  • 99. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems
  • 100. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems • Thrift problems
  • 101. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems • Thrift problems • Count problems
  • 102. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems • Thrift problems • Count problems • Favor JSON over packed binaries if possible
  • 103. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems • Thrift problems • Count problems • Favor JSON over packed binaries if possible • Careful with Thrift in the stack
  • 104. Battle Scars - Development • Know your data model • Creating indexes after the fact is a PITA • Design around wide rows • I/O problems • Thrift problems • Count problems • Favor JSON over packed binaries if possible • Careful with Thrift in the stack • Don’t fear the StorageProxy
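
A sketch of the wide-row plus JSON-values pattern the slides above argue for, assuming pycassa and a column family whose comparator is TimeUUIDType; all names are illustrative. One row per device with time-ordered column names means fetching recent messages is a single-row slice rather than an index scan.

    import json
    import time
    import pycassa
    from pycassa.util import convert_time_to_uuid

    pool = pycassa.ConnectionPool('Transactional', ['cass01:9160'])
    messages = pycassa.ColumnFamily(pool, 'MessagesByDevice')   # comparator: TimeUUIDType

    # Append one delivery record to the device's wide row; the column name is
    # a TimeUUID so columns stay sorted by time, the value is readable JSON.
    col = convert_time_to_uuid(time.time())
    messages.insert('device:abc123',
                    {col: json.dumps({'push_id': 42, 'status': 'delivered'})})

    # Newest 50 records, from a single row, without any secondary index.
    recent = messages.get('device:abc123', column_count=50, column_reversed=True)
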
  • 105. Battle Scars - Development
  • 106. Battle Scars - Development • Assume failure in the client
  • 107. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused
  • 108. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure
  • 109. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway
  • 110. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway • Verify client library assumptions and exception handling
  • 111. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway • Verify client library assumptions and exception handling • Retry now vs. retry later?
  • 112. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway • Verify client library assumptions and exception handling • Retry now vs. retry later? • Compensating action during failures?
  • 113. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway • Verify client library assumptions and exception handling • Retry now vs. retry later? • Compensating action during failures? • Don’t avoid the Cassandra code
  • 114. Battle Scars - Development • Assume failure in the client • Read timeout vs. connection refused • When maintaining your own indexes, try and clean up after failure • Be ready to clean up inconsistencies anyway • Verify client library assumptions and exception handling • Retry now vs. retry later? • Compensating action during failures? • Don’t avoid the Cassandra code • Embed for testing
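
A hedged sketch of the client-side failure handling described above, using the Thrift exception types pycassa surfaces (import paths and names follow pycassa 1.x and may differ by version). Whether to retry now, retry later, or run a compensating action depends on whether the write is idempotent; a timeout only means the outcome is unknown.

    import time
    import pycassa
    from pycassa.cassandra.ttypes import TimedOutException, UnavailableException

    pool = pycassa.ConnectionPool('Transactional', ['cass01:9160'])
    devices = pycassa.ColumnFamily(pool, 'Devices')

    def register_device(device_id, alias, attempts=3):
        # This write is idempotent (same key, same columns), so an immediate
        # retry after a timeout is safe; a non-idempotent operation would need
        # a compensating action or a deferred retry queue instead.
        for attempt in range(attempts):
            try:
                devices.insert(device_id, {'alias': alias})
                return True
            except (TimedOutException, UnavailableException):
                time.sleep(0.1 * (attempt + 1))   # brief backoff, then retry now
        return False   # give up; hand off to a retry-later path
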
  • 116. Battle Scars - Ops • Cassandra in EC2:
  • 117. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled
  • 118. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O
  • 119. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O • Avoid EBS except for snapshot backups or use S3
  • 120. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O • Avoid EBS except for snapshot backups or use S3 • Stripe ephemerals, not EBS volumes
  • 121. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O • Avoid EBS except for snapshot backups or use S3 • Stripe ephemerals, not EBS volumes • Avoid smaller instances altogether
  • 122. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O • Avoid EBS except for snapshot backups or use S3 • Stripe ephemerals, not EBS volumes • Avoid smaller instances altogether • Don’t always assume traversing close proximity AZs is more expensive
  • 123. Battle Scars - Ops • Cassandra in EC2: • Ensure Dynamic Snitch is enabled • Disk I/O • Avoid EBS except for snapshot backups or use S3 • Stripe ephemerals, not EBS volumes • Avoid smaller instances altogether • Don’t always assume traversing close proximity AZs is more expensive • Balance RAM cost vs. the cost of additional hosts and spending time w/ GC logs
  • 125. Battle Scars - Ops • Java Best Practices:
  • 126. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts
  • 127. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase
  • 128. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase • Simple mechanism to take thread or heap dump
  • 129. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase • Simple mechanism to take thread or heap dump • All logging is consistent - GC, application, stdx
  • 130. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase • Simple mechanism to take thread or heap dump • All logging is consistent - GC, application, stdx • Init scripts use the same scripts operators do
  • 131. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase • Simple mechanism to take thread or heap dump • All logging is consistent - GC, application, stdx • Init scripts use the same scripts operators do • Bare metal will rock your world
  • 132. Battle Scars - Ops • Java Best Practices: • All Java services are managed via the same set of scripts • In most cases, operators don’t treat Cassandra different from HBase • Simple mechanism to take thread or heap dump • All logging is consistent - GC, application, stdx • Init scripts use the same scripts operators do • Bare metal will rock your world • +UseLargePages will rock your world too
  • 133. Battle Scars - Ops [charts comparing bare metal vs. EC2 XL: ParNew GC effectiveness (MB collected), mean ParNew GC pause time (ms), and ParNew collection count]
  • 135. Battle Scars - Ops • Java Best Practices cont’d:
  • 136. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails)
  • 137. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like
  • 138. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like • We settled at -XX:CMSInitiatingOccupancyFraction=60
  • 139. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like • We settled at -XX:CMSInitiatingOccupancyFraction=60 • Possibly experiment with tenuring threshold
  • 140. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like • We settled at -XX:CMSInitiatingOccupancyFraction=60 • Possibly experiment with tenuring threshold • When in doubt take a thread dump
  • 141. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like • We settled at -XX:CMSInitiatingOccupancyFraction=60 • Possibly experiment with tenuring threshold • When in doubt take a thread dump • TDA (http://java.net/projects/tda/)
  • 142. Battle Scars - Ops • Java Best Practices cont’d: • Get familiar with GC logs (-XX:+PrintGCDetails) • Understand what degenerate CMS collection looks like • We settled at -XX:CMSInitiatingOccupancyFraction=60 • Possibly experiment with tenuring threshold • When in doubt take a thread dump • TDA (http://java.net/projects/tda/) • Eclipse MAT (http://www.eclipse.org/mat/)
  • 144. Battle Scars - Ops • Understand when to compact
  • 145. Battle Scars - Ops • Understand when to compact • Understand upgrade implications for datafiles
  • 146. Battle Scars - Ops • Understand when to compact • Understand upgrade implications for datafiles • Watch hinted handoff closely
  • 147. Battle Scars - Ops • Understand when to compact • Understand upgrade implications for datafiles • Watch hinted handoff closely • Monitor JMX religiously
  • 149. Looking Forward • Cassandra is a great hammer but not everything is a nail
  • 150. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint)
  • 151. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC
  • 152. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC • Glad to see the ecosystem around the product evolving
  • 153. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC • Glad to see the ecosystem around the product evolving • CQL
  • 154. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC • Glad to see the ecosystem around the product evolving • CQL • Pig
  • 155. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC • Glad to see the ecosystem around the product evolving • CQL • Pig • Brisk
  • 156. Looking Forward • Cassandra is a great hammer but not everything is a nail • Coprocessors would be awesome (hint hint) • Still spend too much time worrying about GC • Glad to see the ecosystem around the product evolving • CQL • Pig • Brisk • Guardedly optimistic about off heap data management
  • 157. Thanks to • jbellis, driftx • Datastax • Whoever wrote TDA • SAP
  • 158. Thanks! • Urban Airship: http://urbanairship.com/ • We’re hiring! http://urbanairship.com/company/jobs/ • Me @eonnen or erik at
