Fears, Misconceptions, and Accepted Anti-patterns of a First-Time Cassandra Adopter
Ben Christenson
Who am I?
● Full Stack Developer / Architect at Kinetic Data
● MSP Cassandra Meetups since August 2014
● I like cool technology, whisky, and Brazilian jiu-jitsu.
Why am I presenting?
● Cassandra is a great strategy even if you aren’t looking for infinite OPS
○ Lots of articles for newbies and experts, not a lot of content on non-extreme use
● Give back to the meetup
○ I enjoy hearing about real implementations
○ Meetup is one of the reasons we chose Cassandra
● A little symbiotic selfishness
○ There may be better patterns that I don’t know about
○ Developing a more technical version of the presentation with sample code, metrics, etc
About Kinetic Data
● About 50 employees
● Main office in St. Paul, secondary office in Sydney, Australia, satellite offices throughout the US
● Develop software to improve service experience
Cassandra Adoption
● What is Kinetic Request?
● Why Cassandra?
What is Kinetic Request?
Workflow automation
through Kinetic Task
What is Kinetic Request?
● Originally developed on the BMC Action Request System (ARS)
● Evolved into a Java webapp
● Planning for ARS decoupling for about 5 years
● January 2015 - Started Request CE development
● May 2015 - Demoed Request CE prototype
● March 2016 - Released Request CE v1.0.0
Why Cassandra?
● Multi-datacenter replication
○ Significantly improved performance for global customers
● Durability
● Scalability
○ Easier to start at current scale
○ Scale out without migrating
○ Scale size and throughput
● Community
Fears and Misconceptions
● Stack Overflow-itis
● Kinetic Request is a deployed solution
● Cassandra is for write-heavy workloads
● Cassandra is for time series data
● Cassandra requires Java and Linux experts
Stack Overflow-itis
Fears
● ALLOW FILTERING
○ Don’t ever use that!
● Secondary indexes
○ Don’t use those.
● Collections
○ Probably shouldn’t use those either…
● Counters
○ Don’t you want to use something that works?
● Tombstones, Tombstones?!, TOMBSTONES!
Reality
● Thank you MSP Cassandra Meetup for being the cure!
○ Everything was included for a reason
○ “You probably don’t want to use Xyz for that, but here is when you would.”
Kinetic Request is a deployed solution
Fear
● Very hard to find anyone using Cassandra for customer-managed solutions
Reality
● Many of our customers already pay for Cassandra support
● Many of our customers understand the benefits
● As a deployed solution, our data usage and schemas don’t change frequently, and potential issues are (hopefully) caught before reaching the customer
● Possible future talk?
Cassandra is for write-heavy workloads
Misconception
● Cassandra is only for write-heavy workloads
Reality
● Cassandra is extremely good at write-heavy workloads
● Cassandra can be implemented to be good at read-heavy workloads
● Even with heavy delete-and-insert updates, reads are still outperforming previous versions of Kinetic Request
Cassandra is for time series data
Misconception
● Cassandra is only for time series data
Reality
● Cassandra is extremely good at time series data
● Cassandra is extremely good at replicating all data
● Just because Cassandra can handle extreme operations per second doesn’t mean it isn’t suited for lower OPS usage (and you can get away with a lot more)
Cassandra requires Java and Linux experts
Misconception
● We were going to need to become Java and Linux experts to use Cassandra
Reality
● We needed to have a computer to use Cassandra
● We needed to be willing to learn more about Java, Linux commands, and Cassandra internals as we went
Accepted Anti-patterns
● Atomicity and Read-Before-Write
● Distributed Joins
● Lookup Tables
● Delete-And-Insert Updates
● Queues
Atomicity and Read before Write
● Read before write is often described as an anti-pattern
○ Leads to either potential inconsistency or Compare and Set (CAS) / Lightweight Transaction (LWT) operations
○ Event sourcing may be an alternative
● There isn’t always an alternative
● Kinetic Request uses LWTs for “Optimistic Locking” (and uniqueness)
● Even at an order of magnitude slower than regular writes, LWTs are more than fast enough at our scale
○ < 10ms with Replication Factor 3
○ Order of magnitude faster than what is necessary for us
Atomicity and Read before Write
Sample Schema
CREATE TABLE IF NOT EXISTS widgets (
name text,
tenant_id timeuuid,
value text,
version_id timeuuid,
PRIMARY KEY ((tenant_id), name)
) WITH CLUSTERING ORDER BY (name ASC);
Uniqueness
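INSERT INTO widgets
(name, tenant_id, value, version_id)
VALUES (:name, :tenant_id, :value, :version_id)
IF NOT EXISTS;
Optimistic Locking
-- :new_version_id is an assumed bind name; writing a fresh version on each
-- update is what makes later writers holding the old version fail the check
UPDATE widgets
SET value = :value, version_id = :new_version_id
WHERE tenant_id = :tenant_id AND name = :name
IF version_id = :version_id;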
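The two conditional statements above behave like compare-and-set operations. The sketch below models that behaviour with a plain in-memory dictionary, purely to illustrate the control flow. `WidgetStore`, `insert_if_not_exists`, and `update_if_version` are hypothetical names, not part of Cassandra or any driver, and the sketch assumes every successful update also writes a fresh `version_id`. A real implementation would presumably send the CQL through a driver and check whether the statement was applied.

```python
import uuid


class WidgetStore:
    """In-memory stand-in for the widgets table (illustration only)."""

    def __init__(self):
        # (tenant_id, name) -> {"value": ..., "version_id": ...}
        self._rows = {}

    def insert_if_not_exists(self, tenant_id, name, value):
        """Mimics INSERT ... IF NOT EXISTS; returns the new version_id or None."""
        key = (tenant_id, name)
        if key in self._rows:
            return None  # not applied: the name is already taken
        version_id = uuid.uuid1()
        self._rows[key] = {"value": value, "version_id": version_id}
        return version_id

    def update_if_version(self, tenant_id, name, value, expected_version):
        """Mimics UPDATE ... IF version_id = ?; returns the new version_id or None."""
        row = self._rows.get((tenant_id, name))
        if row is None or row["version_id"] != expected_version:
            return None  # not applied: someone else changed the row first
        row["value"] = value
        row["version_id"] = uuid.uuid1()  # assumed: each update rotates the version
        return row["version_id"]

    def read(self, tenant_id, name):
        """Return a copy of the stored row."""
        return dict(self._rows[(tenant_id, name)])


store = WidgetStore()
tenant = uuid.uuid1()
v1 = store.insert_if_not_exists(tenant, "gear", "small")
duplicate = store.insert_if_not_exists(tenant, "gear", "large")  # rejected: name exists
v2 = store.update_if_version(tenant, "gear", "medium", v1)       # applied
stale = store.update_if_version(tenant, "gear", "huge", v1)      # rejected: v1 is stale
```

A caller that receives `None` from the update would typically re-read the row, reapply its change, and retry with the new version.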
Distributed Joins
● Cassandra doesn’t support joins, but you can do them in memory
○ Requires multiple sequential reads and/or multi-partition queries
○ Embedded or denormalized content is sometimes an alternative
● We use a distributed join between Submissions and Forms
○ Allows us to rename the form and maintain the link
○ Acceptable because forms are finite enough to keep in memory
● Christopher Batey has a great blog post on this: http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-distributed.html
Submissions Schema
CREATE TABLE IF NOT EXISTS submissions (
form_id timeuuid,
id timeuuid,
tenant_id timeuuid,
...
PRIMARY KEY ((tenant_id), id)
) WITH CLUSTERING ORDER BY (id ASC);
Distributed Joins
Forms Schema
CREATE TABLE IF NOT EXISTS forms (
name text,
tenant_id timeuuid,
...
PRIMARY KEY ((tenant_id), name)
) WITH CLUSTERING ORDER BY (name ASC);
SELECT * FROM submissions WHERE tenant_id = :tenant_id AND id = :id;
SELECT * FROM forms WHERE tenant_id = :tenant_id AND id = :submission_form_id;
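Because the forms for a tenant are few enough to hold in memory, the join itself can happen in application code. The following is a minimal sketch of that idea using plain dictionaries in place of query results. `join_submissions_to_forms` is a hypothetical helper, and the sketch assumes each form row exposes an `id` field (the forms schema above elides its remaining columns).

```python
def join_submissions_to_forms(submissions, forms):
    """Attach each submission's form by form_id using an in-memory map."""
    forms_by_id = {form["id"]: form for form in forms}
    joined = []
    for submission in submissions:
        form = forms_by_id.get(submission["form_id"])  # None if the form is missing
        joined.append({**submission, "form": form})
    return joined


# Hypothetical rows standing in for the results of the two SELECT statements.
forms = [
    {"id": "f1", "name": "Laptop Request"},
    {"id": "f2", "name": "Badge Request"},
]
submissions = [
    {"id": "s1", "form_id": "f2"},
    {"id": "s2", "form_id": "f1"},
]

joined = join_submissions_to_forms(submissions, forms)

# Renaming a form does not break the link, because submissions refer to the id.
forms[0]["name"] = "Hardware Request"
rejoined = join_submissions_to_forms(submissions, forms)
```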
Lookup Tables
● Lookup Tables are another form of Distributed Join
○ Table contains only data necessary for the query and an id used to look up from the source of truth
○ Requires a “multi-get” to retrieve actual records
○ Often considered an anti-pattern for similar reasons as distributed joins
● We use lookup tables for Webhooks and Submissions
○ Duplicating data would lead to storage requirements orders of magnitude higher
Lookup Tables
CREATE TABLE IF NOT EXISTS webhooks (
id timeuuid,
scheduled_at timestamp,
tenant_id timeuuid,
...
PRIMARY KEY ((tenant_id), id)
) WITH CLUSTERING ORDER BY (id ASC);
CREATE TABLE IF NOT EXISTS webhooks_index (
bucket text,
id timeuuid,
index_type text, // Tenant, Webhook, Parent
index_key text,
scheduled_at timestamp,
tenant_id timeuuid,
PRIMARY KEY ((tenant_id, bucket, index_type, index_key), scheduled_at, id)
) WITH CLUSTERING ORDER BY (scheduled_at DESC, id DESC) ...;
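The lookup-then-multi-get flow can be sketched as two steps: read the ids from the index partition, then fetch each source-of-truth record concurrently. The example below is an illustration only. It uses dictionaries in place of the two tables, the `"2016-03"` bucket value and the `url` field are made-up sample data, and `multi_get` and `lookup_webhooks` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def multi_get(ids, fetch_one, max_workers=8):
    """Fetch each id concurrently, preserving the order of `ids`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, ids))


# Stand-in for the webhooks table (source of truth), keyed by id.
WEBHOOKS = {
    "w1": {"id": "w1", "url": "https://example.com/a"},
    "w2": {"id": "w2", "url": "https://example.com/b"},
}

# Stand-in for webhooks_index: one partition per
# (tenant_id, bucket, index_type, index_key), ids already in clustering order.
WEBHOOKS_INDEX = {
    ("t1", "2016-03", "Tenant", "t1"): ["w2", "w1"],
}


def lookup_webhooks(tenant_id, bucket, index_type, index_key):
    """Read ids from the index partition, then multi-get the full records."""
    ids = WEBHOOKS_INDEX.get((tenant_id, bucket, index_type, index_key), [])
    return multi_get(ids, WEBHOOKS.__getitem__)


rows = lookup_webhooks("t1", "2016-03", "Tenant", "t1")
```

The index stores only ids and sort columns, which is what keeps storage small at the cost of the extra round of reads.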
Delete-And-Insert Updates
● Fundamental problem:
○ Cassandra retrieves by primary key
○ Users want to search by values that change
○ Updating a primary key is done as a DELETE and INSERT (which leads to tombstones)
○ Want to minimize environmental complexity
● No simple solution for us
○ Try to minimize number of deletes for a given query path
○ Try to optimize for tombstones
Delete-And-Insert Updates
● The biggest source of our delete-and-insert usage is supporting ad-hoc querying of submissions
● Example Ad-hoc query:
values[Foo] IN ("Bar", "Baz")
AND (
values[Requested By]="ben.christenson"
OR values[Requested For]="ben.christenson"
)
● Our solution is similar to the C* Summit presentation on multi-criteria queries
http://fr.slideshare.net/ippontech/multi-criteria-queries-on-a-cassandra-application
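One way to picture how such a query resolves against lookup tables is as set algebra over submission ids: each `=` criterion yields a set of ids, `IN` and `OR` take unions, and `AND` takes an intersection. The sketch below assumes a hypothetical tuple-based query representation and a `lookup(key, value)` function that returns matching ids. Neither is taken from Kinetic Request itself.

```python
def evaluate(query, lookup):
    """Resolve a nested query to a set of ids using per-criterion lookups."""
    op = query[0]
    if op == "eq":
        _, key, value = query
        return set(lookup(key, value))
    if op == "in":
        _, key, values = query
        return set().union(*(lookup(key, v) for v in values))
    if op == "and":
        return set.intersection(*(evaluate(q, lookup) for q in query[1:]))
    if op == "or":
        return set().union(*(evaluate(q, lookup) for q in query[1:]))
    raise ValueError(f"unsupported operator: {op}")


# Made-up index contents: (key, value) -> submission ids.
INDEX = {
    ("Foo", "Bar"): {"s1", "s2"},
    ("Foo", "Baz"): {"s3"},
    ("Requested By", "ben.christenson"): {"s2"},
    ("Requested For", "ben.christenson"): {"s3", "s4"},
}


def lookup(key, value):
    return INDEX.get((key, value), set())


# The example ad-hoc query from the slide, in the assumed tuple form.
query = (
    "and",
    ("in", "Foo", ["Bar", "Baz"]),
    ("or",
     ("eq", "Requested By", "ben.christenson"),
     ("eq", "Requested For", "ben.christenson")),
)

matches = evaluate(query, lookup)
```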
Delete-And-Insert Updates
Writing
● Read record from Cassandra (including version_id)
● An “Indexer” class generates index sets from original and updated model
● Optimistically update the source of truth record
● Asynchronously create/delete necessary index records
Reading
● Each criterion is a separate async query
● An in-memory evaluator aggregates the lookup table IDs
● The submissions associated with the resulting IDs are each retrieved asynchronously
● The in-memory evaluator “double checks” that the submissions match the query and may re-execute another search to fill in gaps for submissions that have been updated since the initial index queries (very rare)
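The indexing step of the write path described above can be thought of as a set difference between the index entries of the original and the updated record: only entries that disappear need a DELETE, and only entries that appear need an INSERT. This is a minimal sketch under that assumption. `index_entries` and `diff_index` are hypothetical names, and the real “Indexer” class may generate richer entries (timeline, bucket, timestamp).

```python
def index_entries(submission):
    """Return the set of (key, value) index entries for a submission's values."""
    return {(key, value) for key, value in submission["values"].items()}


def diff_index(original, updated):
    """Return (to_insert, to_delete) index entries for an update.

    Either argument may be None to represent a create or a delete.
    """
    before = index_entries(original) if original else set()
    after = index_entries(updated) if updated else set()
    return after - before, before - after


original = {"values": {"Status": "Open", "Requested By": "ben.christenson"}}
updated = {"values": {"Status": "Closed", "Requested By": "ben.christenson"}}

to_insert, to_delete = diff_index(original, updated)
```

Unchanged entries, such as `Requested By` here, produce no writes at all, which keeps the number of tombstones per query path down.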
Delete-And-Insert Updates
CREATE TABLE IF NOT EXISTS submissions_index (
tenant_id timeuuid, timeline text, bucket text, // '' for active or 'YYYY-mm'
key text, value text,
timestamp timestamp, submission_id timeuuid,
PRIMARY KEY ((tenant_id, timeline, bucket, key), value, timestamp, submission_id)
)
WITH CLUSTERING ORDER BY (value DESC, timestamp DESC, submission_id DESC)
AND COMPACTION={
'sstable_size_in_mb': '256',
'tombstone_threshold': '0.05',
'unchecked_tombstone_compaction': 'true',
'tombstone_compaction_interval': '3600',
'class': 'LeveledCompactionStrategy'
};
Delete-And-Insert Updates
● Even with Delete-And-Insert and lookup tables, performance is acceptable
○ Supports queries that were previously impossible
○ Extremely complicated search queries still return in < 150ms
● Does have some caveats
○ Only supports AND, OR, IN, and =
(would like to support !=, starts with, ends with, etc)
○ Whenever an AND is used, at least one of the criteria must return fewer than 1000 matches
○ In order to support pagination, sort orders must be indexed independently
(combination of date and uuid; we index multiple date properties)
Queues
● Queues are one of the most commonly cited anti-patterns
● Problem comes down to tombstones again
○ Can be improved by truncating, knowing where live data begins, or complicated rotations
○ Can be improved by including additional technologies (real message queue)
Queues
● One of the queue-like structures used by Kinetic Request is for Webhooks
○ Can fail to connect and should be automatically retried (put back on queue)
○ Happen often and have a very specific query path so tombstones are worrisome
● In this case, the queue “event” can be processed initially by the server in memory
○ Write directly to the source of truth / historical index and avoid tombstones for normal executions
○ Only if the initial webhook fails is it added to the queue index
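The "try first, queue only on failure" approach above can be sketched as a small dispatch function. The callables passed in (`send`, `record_history`, `enqueue_retry`) are hypothetical placeholders for the HTTP call, the write to the source of truth / historical index, and the write to the queue index. The sketch also assumes a failed connection surfaces as `ConnectionError`.

```python
def dispatch(webhook, send, record_history, enqueue_retry):
    """Try the webhook immediately; fall back to the retry queue on failure."""
    try:
        send(webhook)
    except ConnectionError:
        enqueue_retry(webhook)   # only failures ever touch the queue index
        return False
    record_history(webhook)      # normal path: one write, nothing to delete later
    return True


history, retries = [], []


def fake_send(webhook):
    """Stand-in for the HTTP call; fails for hosts starting with 'down.'."""
    if webhook["url"].startswith("https://down."):
        raise ConnectionError("could not connect")


ok = dispatch({"id": "w1", "url": "https://up.example.com"},
              fake_send, history.append, retries.append)
failed = dispatch({"id": "w2", "url": "https://down.example.com"},
                  fake_send, history.append, retries.append)
```

Since successful webhooks never enter the queue index, they never need to be deleted from it, so the common case generates no tombstones on that query path.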
Queues
● Other styles of queues can’t necessarily be processed by the event server
○ Scheduled for the future
○ Handled by
● For this case, we are experimenting with an in-memory distributed queue
○ Hazelcast or Ignite (which have many other coordination benefits)
○ Avoids hitting tombstones by using Cassandra as a persistence mechanism only queried at startup
Takeaways
● Cassandra has many benefits, even if you are not using it at extreme scales
● The barrier to entry is not as scary as it seems
● Play, play, play, test, test, test
● Find good resources
Questions?

Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Fears, Misconceptions, and Accepted Anti-patterns of a First-Time Cassandra Adopter

  • 1. Fears, Misconceptions, and Accepted Anti-patterns of a first time Cassandra Adopter Ben Christenson
  • 2. Who am I? ● Full Stack Developer / Architect at Kinetic Data ● MSP Cassandra Meetups since August 2014 ● I like cool technology, whisky, and Brazilian jiu-jitsu.
  • 3. Why am I presenting? ● Cassandra is a great strategy even if you aren’t looking for infinite OPS ○ Lots of articles for newbies and experts, not a lot of content on non-extreme use ● Give back to the meetup ○ I enjoy hearing about real implementations ○ Meetup is one of the reasons we chose Cassandra ● A little symbiotic selfishness ○ There may be better patterns that I don’t know about ○ Developing a more technical version of the presentation with sample code, metrics, etc
  • 4. About Kinetic Data ● About 50 employees ● Main office in St. Paul, secondary office in Sydney, Australia, satellite offices throughout the US ● Develop software to improve service experience
  • 5. Cassandra Adoption ● What is Kinetic Request? ● Why Cassandra?
  • 6. What is Kinetic Request?
  • 7. What is Kinetic Request?
  • 8. What is Kinetic Request? Workflow automation through Kinetic Task
  • 9. What is Kinetic Request? ● Originally developed on the BMC Action Request System (ARS) ● Evolved into a Java webapp ● Planning for ARS decoupling for about 5 years ● January 2015 - Started Request CE development ● May 2015 - Demoed Request CE prototype ● March 2016 - Released Request CE v1.0.0
  • 10. Why Cassandra? ● Multi-datacenter replication ○ Significantly improved performance for global customers ● Durability ● Scalability ○ Easier to start at current scale ○ Scale out without migrating ○ Scale size and throughput ● Community
  • 11. Fears and Misconceptions ● StackOverflow-itis ● Kinetic Request is a deployed solution ● Cassandra is for write-heavy workloads ● Cassandra is for time series data ● Cassandra requires Java and Linux experts
  • 12. Stack Overflow-itis Fears ● ALLOW FILTERING ○ Don’t ever use that! ● Secondary indexes ○ Don’t use those. ● Collections ○ Probably shouldn’t use those either… ● Counters ○ Don’t you want to use something that works? ● Tombstones, Tombstones?!, TOMBSTONES! Reality ● Thank you MSP Cassandra Meetup for being the cure! ○ Everything was included for a reason ○ “You probably don’t want to use Xyz for that, but here is when you would.”
  • 13. Kinetic Request is a deployed solution Fear ● Very hard to find anyone using Cassandra for customer-managed solutions Reality ● Many of our customers already pay for Cassandra support ● Many of our customers understand the benefits ● As a deployed solution our data usage and schemas don’t change frequently and potential issues are (hopefully) caught before reaching the customer ● Possible future talk?
  • 14. Cassandra is for write-heavy workloads Misconception ● Cassandra is only for write-heavy workloads Reality ● Cassandra is extremely good at write-heavy workloads ● Cassandra can be implemented to be good at read-heavy workloads ● Even with heavy delete-and-insert updates, reads are still outperforming previous versions of Kinetic Request
  • 15. Cassandra is for time series data Misconception ● Cassandra is only for time series data Reality ● Cassandra is extremely good at time series data ● Cassandra is extremely good at replicating all data ● Just because Cassandra can handle extreme operations per second, doesn’t mean it isn’t suited for lower OPS usage (and you can get away with a lot more)
  • 16. Cassandra requires Java and Linux experts Misconception ● We were going to need to become Java and Linux experts to use Cassandra Reality ● We needed to have a computer to use Cassandra ● We needed to be willing to learn more about Java, Linux commands, and Cassandra internals as we went
  • 17. Accepted Anti-patterns ● Atomicity and Read-Before-Write ● Distributed Joins ● Lookup Tables ● Delete-And-Insert Updates ● Queues
  • 18. Atomicity and Read before Write ● Read before write is often described as an anti-pattern ○ Potential inconsistency or Check and Set (CAS) / Lightweight Transaction (LWT) operations ○ Event sourcing may be an alternative ● There isn’t always an alternative ● Kinetic Request uses LWTs for “Optimistic Locking” (and uniqueness) ● Even at an order of magnitude slower, more than fast enough at our scale ○ < 10ms with Replication Factor 3 ○ Order of magnitude faster than what is necessary for us
  • 19. Atomicity and Read before Write Sample Schema CREATE TABLE IF NOT EXISTS widgets ( name text, tenant_id timeuuid, value text, version_id timeuuid, PRIMARY KEY ((tenant_id), name) ) WITH CLUSTERING ORDER BY (name ASC); Uniqueness INSERT INTO widgets (name, tenant_id, value, version_id) VALUES (:name, :tenant_id, :value, :version_id) IF NOT EXISTS Optimistic Locking UPDATE widgets SET value = :value WHERE tenant_id = :tenant_id AND name = :name IF version_id = :version_id
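The two conditional writes on slide 19 can be modeled in plain Python to show the control flow an application would rely on. This is a minimal in-memory sketch, not driver code: `WidgetStore`, `insert_if_not_exists`, and `update_if_version` are hypothetical names, and it assumes every successful update also writes a fresh `version_id` (the UPDATE on the slide does not show that step).

```python
import uuid


class WidgetStore:
    """In-memory stand-in for the widgets table, keyed by (tenant_id, name)."""

    def __init__(self):
        self._rows = {}

    def insert_if_not_exists(self, tenant_id, name, value):
        # Mirrors INSERT ... IF NOT EXISTS: applied only when the key is unused.
        key = (tenant_id, name)
        if key in self._rows:
            return False, self._rows[key]["version_id"]
        version_id = uuid.uuid1()
        self._rows[key] = {"value": value, "version_id": version_id}
        return True, version_id

    def update_if_version(self, tenant_id, name, value, expected_version):
        # Mirrors UPDATE ... IF version_id = :version_id: applied only when
        # nobody else has changed the row since it was read.
        row = self._rows.get((tenant_id, name))
        if row is None or row["version_id"] != expected_version:
            return False, None
        row["value"] = value
        # Assumption: a new version is written with each successful update.
        row["version_id"] = uuid.uuid1()
        return True, row["version_id"]
```

A caller that gets `False` back from `update_if_version` would typically re-read the row and either retry or report a conflict to the user.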
  • 20. Distributed Joins ● Cassandra doesn’t support joins, but you can do them in memory ○ Requires multiple sequential reads and/or multi-partition queries ○ Embedded or denormalized content is sometimes an alternative ● We use a distributed join between Submissions and Forms ○ Allows us to rename the form and maintain the link ○ Acceptable because forms are finite enough to keep in memory ● Christopher Batey has a great blog post on this: http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-distributed.html
  • 21. Distributed Joins Submissions Schema CREATE TABLE IF NOT EXISTS submissions ( form_id timeuuid, id timeuuid, tenant_id timeuuid, ... PRIMARY KEY ((tenant_id), id) ) WITH CLUSTERING ORDER BY (id ASC); Forms Schema CREATE TABLE IF NOT EXISTS forms ( name text, tenant_id timeuuid, ... PRIMARY KEY ((tenant_id), name) ) WITH CLUSTERING ORDER BY (name ASC); SELECT * FROM submissions WHERE tenant_id = :tenant_id AND id = :id; SELECT * FROM forms WHERE tenant_id = :tenant_id AND id = :submission_form_id;
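The in-memory side of this join might look like the following. It is a sketch under stated assumptions: forms are already cached in a dict keyed by a form id (the `forms_by_id` name and the presence of an id on each form are assumptions, not shown in the schema above), and `join_submissions_with_forms` is a hypothetical helper.

```python
def join_submissions_with_forms(submissions, forms_by_id):
    """Attach the current form name to each submission via its form_id."""
    joined = []
    for submission in submissions:
        # Look the form up by id, so a renamed form is still found.
        form = forms_by_id.get(submission["form_id"])
        form_name = form["name"] if form else None
        joined.append({**submission, "form_name": form_name})
    return joined
```

Because the link is the id rather than the name, renaming a form only changes the cached form record; existing submissions pick up the new name on the next join.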
  • 22. Lookup Tables ● Lookup Tables are another form of Distributed Join ○ Table contains only data necessary for the query and an id used to lookup from the source of truth ○ Requires a “multi-get” to retrieve actual records ○ Often considered an anti-pattern for similar reasons as distributed joins ● We use lookup tables for Webhooks and Submissions ○ Duplicating data would lead to storage requirements orders of magnitude higher
  • 23. Lookup Tables CREATE TABLE IF NOT EXISTS webhooks ( id timeuuid, scheduled_at timestamp, tenant_id timeuuid, ... PRIMARY KEY ((tenant_id), id) ) WITH CLUSTERING ORDER BY (id ASC); CREATE TABLE IF NOT EXISTS webhooks_index ( bucket text, id timeuuid, index_type text, // Tenant, Webhook, Parent index_key text, scheduled_at timestamp, tenant_id timeuuid, PRIMARY KEY ((tenant_id, bucket, index_type, index_key), scheduled_at, id) ) WITH CLUSTERING ORDER BY (scheduled_at DESC, id DESC) ...;
  • 24. Delete-And-Insert Updates ● Fundamental problem: ○ Cassandra retrieves by primary key ○ Users want to search by values that change ○ Updating a primary key is done as a DELETE and INSERT (which leads to tombstones) ○ Want to minimize environmental complexity ● No simple solution for us ○ Try to minimize number of deletes for a given query path ○ Try to optimize for tombstones
  • 25. Delete-And-Insert Updates ● The biggest source of our DELETE-AND-INSERT usage is supporting ad-hoc querying of submissions ● Example ad-hoc query: values[Foo] IN ("Bar", "Baz") AND ( values[Requested By]="ben.christenson" OR values[Requested For]="ben.christenson" ) ● Our solution is similar to the C* Summit presentation on multi-criteria queries http://fr.slideshare.net/ippontech/multi-criteria-queries-on-a-cassandra-application
  • 26. Delete-And-Insert Updates Writing ● Read record from Cassandra (including version_id) ● An “Indexer” class generates index sets from original and updated model ● Optimistically update the source of truth record ● Asynchronously create/delete necessary index records Reading ● Each criterion is a separate async query ● An in-memory evaluator aggregates the lookup table IDs ● The submissions associated with the resulting IDs are each retrieved asynchronously ● The in-memory evaluator “double checks” that the submissions match the query and may re-execute another search to fill in gaps for submissions that have been updated since the initial index queries (very rare)
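The "index sets" step of the write path can be sketched as a set difference between the entries implied by the original and the updated submission values. This is an illustration only: the real Indexer class is not shown in the slides, and `index_entries` and `diff_index_sets` are hypothetical names that ignore the timeline, bucket, and timestamp columns.

```python
def index_entries(values):
    """Flatten a submission's field values into (key, value) index entries."""
    return {(key, value) for key, value in values.items()}


def diff_index_sets(original, updated):
    """Return (to_delete, to_create) index entries for an update."""
    before = index_entries(original)
    after = index_entries(updated)
    # Entries only in the old set are deleted; entries only in the new set
    # are inserted. Unchanged entries are left alone, which keeps the
    # number of tombstones per update as small as possible.
    return before - after, after - before
```

Only the fields that actually changed produce a delete and an insert, which is one way to "minimize number of deletes for a given query path".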
  • 27. Delete-And-Insert Updates CREATE TABLE IF NOT EXISTS submissions_index ( tenant_id timeuuid, timeline text, bucket text, // ‘’ for active or ‘YYYY-mm’ key text, value text, timestamp timestamp, submission_id timeuuid, PRIMARY KEY ((tenant_id, timeline, bucket, key), value, timestamp, submission_id) ) WITH CLUSTERING ORDER BY (value DESC, timestamp DESC, submission_id DESC) AND COMPACTION={ 'sstable_size_in_mb': '256', 'tombstone_threshold': '0.05', 'unchecked_tombstone_compaction': 'true', 'tombstone_compaction_interval': '3600', 'class': 'LeveledCompactionStrategy' };
  • 28. Delete-And-Insert Updates ● Even with Delete-And-Insert and lookup tables, performance is acceptable ○ Supports queries that were previously impossible ○ Extremely complicated search queries still return in < 150ms ● Does have some caveats ○ Only supports AND, OR, IN, and = (would like to support !=, starts with, ends with, etc) ○ Whenever an AND is used, at least one of the criteria must return fewer than 1000 matches ○ In order to support pagination, sort orders must be indexed independently (combination of date and uuid; we index multiple date properties)
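The four supported operators map naturally onto set operations over the submission IDs returned by each index lookup. The following is a minimal sketch of such an evaluator, assuming queries are represented as nested tuples and that `lookup(key, value)` returns the matching IDs; both the representation and the names are hypothetical, and the 1000-match limit, pagination, and the "double check" pass are left out.

```python
def evaluate(query, lookup):
    """Resolve a nested query tuple to a set of matching submission IDs."""
    op = query[0]
    if op == "=":
        _, key, value = query
        return set(lookup(key, value))
    if op == "IN":
        # IN is the union of one lookup per listed value.
        _, key, values = query
        return set().union(*(lookup(key, v) for v in values))
    if op == "AND":
        # AND intersects its operands (at least one operand is assumed).
        results = [evaluate(sub, lookup) for sub in query[1:]]
        return set.intersection(*results)
    if op == "OR":
        return set().union(*(evaluate(sub, lookup) for sub in query[1:]))
    raise ValueError(f"unsupported operator: {op}")
```

In the real system each leaf lookup would be a separate asynchronous Cassandra query; here `lookup` is synchronous to keep the sketch short.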
  • 29. Queues ● Queues are one of the most commonly referred to anti-patterns ● Problem comes down to tombstones again ○ Can be improved by truncating, knowing where live data begins, or complicated rotations ○ Can be improved by including additional technologies (real message queue)
  • 30. Queues ● One of the queue-like structures used by Kinetic Request is for Webhooks ○ Can fail to connect and should be automatically retried (put back on queue) ○ Happen often and have a very specific query path so tombstones are worrisome ● In this case, the queue “event” can be processed initially by the server in memory ○ Write directly to the source of truth / historical index and avoid tombstones for normal executions ○ Only if the initial webhook fails is it added to the queue index
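The "try in memory first, queue only on failure" idea can be sketched as follows. This is an assumption-laden illustration: `dispatch_webhook`, `send`, `history`, and `retry_queue` are hypothetical names, plain lists stand in for the Cassandra tables, and a `ConnectionError` stands in for whatever failure the real HTTP client reports.

```python
def dispatch_webhook(webhook, send, history, retry_queue):
    """Attempt delivery immediately; queue the webhook only if it fails."""
    try:
        send(webhook)
    except ConnectionError:
        # Failure path: only now does the webhook touch the queue index,
        # so successful deliveries never create queue rows to delete later.
        retry_queue.append(webhook)
        return False
    # Success path: write straight to the historical record.
    history.append(webhook)
    return True
```

With this shape, the tombstone-heavy queue table only sees the (hopefully small) fraction of webhooks that need a retry.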
  • 31. Queues ● Other styles of queues can’t necessarily be processed by the event server ○ Scheduled for the future ○ Handled by ● For this case, we are experimenting with an in-memory distributed queue ○ Hazelcast or Ignite (which have many other coordination benefits) ○ Avoids hitting tombstones by using Cassandra as a persistence mechanism only queried at startup
  • 32. Takeaways ● Cassandra has many benefits, even if you are not using it at extreme scales ● The barrier of entry is not as scary as it seems ● Play, play, play, test, test, test ● Find good resources