Apache Cassandra 2.0
http://www.stealth.ly
Joe Stein
About me
 Tweets @allthingshadoop
 All Things Hadoop Blog & Podcast – http://www.allthingshadoop.com
 Watched 0.3 from a distance
 Tried to integrate C* into my stack using 0.4, 0.5 and 0.6
 0.7 was awesome! Expiring columns, live schema updates, hadoop output
 0.8 things got more interesting with counters and I did our first prod deploy, partial
 1.0 it all came together for us with compression, full deploy
 Cassandra in prod supported over 100 million daily unique mobile devices
 Born again into CQL3
 Apache Kafka http://kafka.apache.org/ committer & PMC member
 Founder & Principal Consultant of Big Data Open Source Security LLC – http://www.stealth.ly
 BDOSS is all about the "glue" and helping companies to not only figure out what Big Data
Infrastructure Components to use but also how to change their existing (or build new)
systems to work with them.
 Working with clients on new projects using Cassandra 2.0
Apache Cassandra 2.0
 Based on Big Table & Dynamo
 High Performance
 Reliable / Available
 Massively Scalable
 Easy To Use
 Developer Productivity Focused
What’s new in C* 2.0
 Finalized some legacy items
 CQL Improvements
 Lightweight Transactions
 Triggers
 Compaction Improvements
 Eager Retries
 More, More, More!!!
Super Columns refactored to use
composite keys instead
Virtual Nodes are the default
CQL Improvements
 Support for multiple prepared statements in batch
 Prepared statement for the consistency level, timestamp and ttl
 Add cursor API/auto paging to the native CQL protocol
 Support indexes on composite column components
 ALTER TABLE to DROP a column
 Column alias in SELECT statement
 SASL (Simple Authentication and Security Layer) support for improved authentication
 Conditional DDL - Conditionally tests for the existence of a table, keyspace, or index before
issuing a DROP or CREATE statement using IF EXISTS or IF NOT EXISTS
 Support for an empty list of values in the IN clause of SELECT, UPDATE, and DELETE
commands, useful in Java Driver applications when passing empty arrays as arguments for the
IN clause
 COPY imports and exports CSV (comma-separated values) data to and from Cassandra 1.1.3 and
higher.
COPY table_name ( column, ...)
FROM ( 'file_name' | STDIN ) or TO ( 'file_name' | STDOUT )
WITH option = 'value' AND ...
Lightweight Transactions – Why?
 Cassandra can provide strong consistency with quorum reads & writes
 A write must be written to the commit log and memory table on a quorum of
replica nodes.
 Quorum = (replication_factor / 2) + 1 (rounded down to a whole number)
 This is sometimes not enough when operations need to be handled in
sequence (linearized), or when concurrent operations must achieve the expected
results without race conditions.
 "Strong" consistency is not enough to prevent race conditions. The classic
example is user account creation: we want to ensure usernames are unique,
so we only want to signal account creation success if nobody else has created
the account yet. But naive read-then-write allows clients to race and both
think they have a green light to create.
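The quorum formula above can be sketched in a few lines of Java. This is only an illustration: the class and method names (QuorumMath, quorum, overlaps) are made up for this example and are not Cassandra APIs. The overlap check is the usual rule that a read and a write share at least one replica when R + W > RF.

```java
// Illustrative sketch only; QuorumMath is a hypothetical helper, not part of Cassandra.
final class QuorumMath {
    /** Quorum = (replication_factor / 2) + 1; integer division rounds down. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    /** A read and a write are guaranteed to share a replica when R + W > RF. */
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            int q = quorum(rf);
            System.out.println("RF=" + rf + " quorum=" + q
                    + " quorumReadSeesQuorumWrite=" + overlaps(q, q, rf));
        }
    }
}
```

Note that the overlap only says a quorum read sees the latest quorum write. It does not stop two clients from both reading "no such user" before either one writes, which is the race described above.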
Lightweight Transactions - "when you
really need it," not for all your updates.
 Prepare: the coordinator generates a ballot (timeUUID in our case) and asks replicas to (a) promise
not to accept updates from older ballots and (b) tell us about the most recent update they have already
accepted.
 Read: we perform a read (of committed values) between the prepare and accept phases. The operation
returns here if the CAS condition fails.
 Accept: if a majority of replicas reply, the coordinator asks replicas to accept the value of the
highest proposal ballot it heard about, or a new value if no in-progress proposals were reported.
 Commit (Learn): if a majority of replicas acknowledge the accept request, we can commit the new
value.
 The coordinator sends a commit message to all replicas with the ballot and value.
 Because of Prepare & Accept, this will be the highest-seen commit ballot. The replicas will note that, and send
it with subsequent promise replies. This allows us to discard acceptance records for successfully committed
replicas, without allowing incomplete proposals to commit erroneously later on.
 1 second default timeout for total of above operations configured in cas_contention_timeout_in_ms
 For more details on exactly how this works, look at the code:
https://github.com/apache/cassandra/blob/cassandra-2.0.1/src/java/org/apache/cassandra/service/StorageProxy.java#L202
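The prepare / read / accept / commit flow above can be sketched as a small in-memory simulation. This is a heavily simplified illustration under stated assumptions, not the StorageProxy implementation: ballots are plain long values instead of timeUUIDs, replicas are local objects, replaying an in-progress proposal from another coordinator is left out, and the read takes the committed value from the first promising replica (fine here because every replica commits together). All names (PaxosSketch, Replica, cas, cluster) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Simplified single-partition sketch; names and structure are assumptions, not Cassandra code.
final class PaxosSketch {
    /** One replica's state for a single partition. */
    static final class Replica {
        long promised = -1;          // highest ballot this replica has promised
        String acceptedValue = null; // accepted but not yet committed value
        String committed = null;     // last committed value

        /** Prepare: promise to reject older ballots. */
        boolean prepare(long ballot) {
            if (ballot <= promised) return false;
            promised = ballot;
            return true;
        }

        /** Accept: refuse if a newer ballot has been promised in the meantime. */
        boolean accept(long ballot, String value) {
            if (ballot < promised) return false;
            promised = ballot;
            acceptedValue = value;
            return true;
        }

        /** Commit (learn): store the value and drop the acceptance record. */
        void commit(String value) {
            committed = value;
            acceptedValue = null;
        }
    }

    static List<Replica> cluster(int size) {
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < size; i++) replicas.add(new Replica());
        return replicas;
    }

    /** Returns true if newValue was committed; false on a lost ballot or a failed condition. */
    static boolean cas(List<Replica> replicas, long ballot, String expected, String newValue) {
        int majority = replicas.size() / 2 + 1;

        // Prepare phase: collect promises.
        List<Replica> promisers = new ArrayList<>();
        for (Replica r : replicas) {
            if (r.prepare(ballot)) promisers.add(r);
        }
        if (promisers.size() < majority) return false;

        // Read phase: check the condition against the committed value; return on CAS failure.
        String current = promisers.get(0).committed;
        if (!Objects.equals(current, expected)) return false;

        // Accept phase: ask the promising replicas to accept the new value.
        int accepted = 0;
        for (Replica r : promisers) {
            if (r.accept(ballot, newValue)) accepted++;
        }
        if (accepted < majority) return false;

        // Commit phase: send the commit to all replicas.
        for (Replica r : replicas) r.commit(newValue);
        return true;
    }

    public static void main(String[] args) {
        List<Replica> replicas = cluster(3);
        // Two clients race to create the same username; only the first condition holds.
        System.out.println("client A created: " + cas(replicas, 1L, null, "alice"));
        System.out.println("client B created: " + cas(replicas, 2L, null, "bob"));
    }
}
```

In the example, the second client's read sees the value committed by the first, so its "only if nothing exists yet" condition fails and it gets no green light.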
Lightweight Transactions
How do we use it?
 Isolation is scoped to the “partition key” of a CQL3 table
 INSERT
 IF NOT EXISTS
 i.e. INSERT INTO users (username, email, full_name) VALUES ('hellocas',
'cas@updateme.com', 'Hello World CAS') IF NOT EXISTS;
 UPDATE
 IF column = some_value
 IF map[column] = some_value
 The some_value is what you read or expected the value to be prior to your change
 i.e. UPDATE inventory SET orderHistory[now()] = uuid, total['warehouse13'] = 7 WHERE
itemID = uuid IF total['warehouse13'] = 8;
 Conditional updates are not allowed in batches
 The columns updated do NOT have to be the same as the columns in the IF clause.
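For a feel of the semantics, here is a single-process analogy in Java. It is a sketch only: it uses java.util.concurrent.ConcurrentHashMap, not Cassandra or any Cassandra driver, and the class and method names (CasAnalogy, insertIfNotExists, updateIf) are invented for the example. putIfAbsent plays the role of INSERT ... IF NOT EXISTS, and the three-argument replace plays the role of UPDATE ... IF column = some_value.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogy only; this does not talk to Cassandra.
final class CasAnalogy {
    private final ConcurrentHashMap<String, String> emailByUsername = new ConcurrentHashMap<>();

    /** Like INSERT ... IF NOT EXISTS: applied only when the username is absent. */
    boolean insertIfNotExists(String username, String email) {
        return emailByUsername.putIfAbsent(username, email) == null;
    }

    /** Like UPDATE ... IF email = expectedEmail: applied only when the current value matches. */
    boolean updateIf(String username, String expectedEmail, String newEmail) {
        return emailByUsername.replace(username, expectedEmail, newEmail);
    }

    String get(String username) {
        return emailByUsername.get(username);
    }

    public static void main(String[] args) {
        CasAnalogy users = new CasAnalogy();
        System.out.println(users.insertIfNotExists("hellocas", "cas@updateme.com")); // applied
        System.out.println(users.insertIfNotExists("hellocas", "other@updateme.com")); // not applied
        System.out.println(users.updateIf("hellocas", "cas@updateme.com", "new@updateme.com")); // applied
    }
}
```

The expected value passed to updateIf is the value the client read earlier, which mirrors the some_value in the IF clause.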
Trigger (Experimental)
 Experimental = Expect the API to change and your triggers to have to be
refactored in 2.1. Provide feedback to the community if you use this feature;
it's how we make C* better!!!
 Asynchronous triggers are a basic mechanism for implementing various use cases of
asynchronous execution of application code on the database side, for example to
support indexes and materialized views, online analytics, and push-based data
propagation.
 Basically, a trigger takes a RowMutation that is occurring and allows you to pass back
other RowMutation(s) you also want to occur
 Triggers on counter tables are generally not supported (counter mutations are
not allowed inside logged batches for obvious reasons: they aren’t
idempotent).
Trigger Example
 https://github.com/apache/cassandra/tree/trunk/examples/triggers
 RowKey:ColumnName:Value to Value:ColumnName:RowKey
 10214124124 – username = to joestein – username = 10214124124
 CREATE TRIGGER test1 ON "Keyspace1"."Standard1" USING 'org.apache.cassandra.triggers.InvertedIndex';
 Basically we are returning a row mutation from within a jar so we can do anything we want pretty much
public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) {
    List<RowMutation> mutations = new ArrayList<RowMutation>();
    for (ByteBuffer name : update.getColumnNames()) {
        // The column value becomes the row key of the inverted entry
        RowMutation mutation = new RowMutation(properties.getProperty("keyspace"), update.getColumn(name).value());
        // The original row key is stored as the value under the same column name
        mutation.add(properties.getProperty("columnfamily"), name, key, System.currentTimeMillis());
        mutations.add(mutation);
    }
    return mutations;
}
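The inversion itself (RowKey:ColumnName:Value to Value:ColumnName:RowKey) can be shown without any Cassandra classes. The sketch below is a plain-Java stand-in under that assumption: String keys and nested maps replace ByteBuffer, ColumnFamily and RowMutation, and the names InvertedIndexSketch and invert are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for the trigger's inversion; no Cassandra types are used.
final class InvertedIndexSketch {
    /**
     * For every (columnName, value) written to row `key`, produce an entry
     * keyed by the value that maps the same columnName back to the row key.
     */
    static Map<String, Map<String, String>> invert(String key, Map<String, String> update) {
        Map<String, Map<String, String>> inverted = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : update.entrySet()) {
            inverted.computeIfAbsent(column.getValue(), v -> new LinkedHashMap<>())
                    .put(column.getKey(), key);
        }
        return inverted;
    }

    public static void main(String[] args) {
        Map<String, String> update = Collections.singletonMap("username", "joestein");
        // Row 10214124124 with username = joestein becomes row joestein with username = 10214124124.
        System.out.println(invert("10214124124", update));
    }
}
```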
Improved Compaction
 During compaction, Cassandra combines multiple data files to improve the
performance of partition scans and to reclaim space from deleted data.
 SizeTieredCompactionStrategy: The default compaction strategy. This strategy
gathers SSTables of similar size and compacts them together into a larger SSTable.
It is best suited for column families with insert-mostly workloads that
are not read as frequently. It also requires closer monitoring of disk
utilization because, in the worst case, a column family can temporarily
double in size while a compaction is in progress.
 LeveledCompactionStrategy: Introduced in Cassandra 1.0, this strategy creates
SSTables of a fixed, relatively small size (5 MB by default) that are grouped into
levels. Within each level, SSTables are guaranteed to be non-overlapping. Each
level (L0, L1, L2 and so on) is 10 times as large as the previous. This strategy is
best suited for column families with read-heavy workloads that also have frequent
updates to existing rows. When using this strategy, keep an eye on
read latency for the column family. If a node cannot keep up with
the write workload and pending compactions pile up, read performance
will degrade for a longer period of time.
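To get a sense of how quickly the levels grow, here is a rough sizing sketch. It assumes that L1 holds ten SSTables and that each further level is ten times the previous one; the slide only states the 10x ratio and the 5 MB SSTable size, so the L1 starting point is an assumption, and LeveledSizing and maxBytesForLevel are hypothetical names for this example.

```java
// Rough sizing sketch; the L1 capacity of ten SSTables is an assumption for illustration.
final class LeveledSizing {
    /** Capacity of level `level` (1 or higher) in bytes: sstableBytes * 10^level. */
    static long maxBytesForLevel(int level, long sstableBytes) {
        if (level < 1) {
            throw new IllegalArgumentException("level must be >= 1");
        }
        long bytes = sstableBytes;
        for (int i = 0; i < level; i++) {
            bytes = Math.multiplyExact(bytes, 10L); // fails loudly instead of overflowing
        }
        return bytes;
    }

    public static void main(String[] args) {
        long fiveMb = 5L * 1024 * 1024; // SSTable size quoted on the slide
        for (int level = 1; level <= 4; level++) {
            System.out.println("L" + level + " holds up to "
                    + maxBytesForLevel(level, fiveMb) / (1024 * 1024) + " MB");
        }
    }
}
```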
Leveled Compaction – in theory
 Reads touch only a small number of files, which keeps them fast
 Compaction happens fast enough for the new L0 tier coming in
Leveled Compaction – high write load
 Read latency rises under leveled compaction when compaction of the
levels can’t keep up with the writes coming in
Leveled Compaction
 Hybrid – best of both worlds
Eager Retries
 Speculative execution for reads
 Keeps metrics of read response times to nodes
 Avoid query timeouts by sending redundant requests to other replicas if too
much time elapses on the original request
 ALWAYS
 99th (Default)
 X percentile
 X ms
 NONE
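The percentile options above boil down to two steps: derive a threshold from recorded read latencies, then send a redundant request once the original has been outstanding longer than that. A minimal sketch of those two steps follows, assuming a nearest-rank percentile over a plain array of latencies; Cassandra's own metrics bookkeeping is not shown, and SpeculativeRetrySketch, percentile and shouldRetry are hypothetical names.

```java
import java.util.Arrays;

// Sketch of the threshold logic only; not Cassandra's implementation.
final class SpeculativeRetrySketch {
    /** Nearest-rank percentile (1 to 100) of the recorded latencies, in milliseconds. */
    static long percentile(long[] latenciesMs, int percent) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no latencies recorded");
        }
        if (percent < 1 || percent > 100) {
            throw new IllegalArgumentException("percent must be in 1..100");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        // Ceiling of percent * n / 100, computed in integer arithmetic.
        int rank = (int) (((long) percent * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }

    /** Send a redundant request once the original has waited longer than the threshold. */
    static boolean shouldRetry(long elapsedMs, long thresholdMs) {
        return elapsedMs > thresholdMs;
    }

    public static void main(String[] args) {
        long[] observed = {10, 20, 30, 40, 1000};
        long threshold = percentile(observed, 99);
        System.out.println("threshold=" + threshold + "ms retryAfter50ms=" + shouldRetry(50, threshold));
    }
}
```

With a fixed "X ms" setting the threshold is simply the configured value, ALWAYS corresponds to retrying regardless of elapsed time, and NONE to never retrying.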
Eager Retries – in action
The JIRA for this test
More, More, More!!!
 The Java heap and GC have not been able to keep pace with the heap-to-data ratio, so
structures that C* can clean up itself with manual memory management were moved off-heap.
This work started in 1.2 and was finished in 2.0.
 New commands to disable and re-enable background compactions: nodetool disableautocompaction and
nodetool enableautocompaction
 Auto_bootstrapping of a single-token node with no initial_token
 Each SSTable now records its min/max timestamp, so reads with a timestamp condition skip
files that cannot match and avoid unnecessary SSTable seeks
 Thrift users got a major bump in performance using an LMAX disruptor implementation
 Java 7 is now required
 Level compaction information is moved into the sstables
 Streaming has been rewritten – better control, traceability and performance!
 Removed row level bloom filters for columns
 Row reads during compaction halved … doubling the speed

Contenu connexe

Tendances

Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...DataStax Academy
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsJulien Anguenot
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value StoreSantal Li
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraFolio3 Software
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 
Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)zznate
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraDataStax
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsDave Gardner
 
Learn Cassandra at edureka!
Learn Cassandra at edureka!Learn Cassandra at edureka!
Learn Cassandra at edureka!Edureka!
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandraAaron Ploetz
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...DataStax
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...DataStax
 
Apache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckApache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckDataStax Academy
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraSoftwareMill
 

Tendances (20)

Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value Store
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache Cassandra
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
 
Learn Cassandra at edureka!
Learn Cassandra at edureka!Learn Cassandra at edureka!
Learn Cassandra at edureka!
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
 
Apache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckApache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide Deck
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 

En vedette

Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Johnny Miller
 
Cassandra
Cassandra Cassandra
Cassandra Pooja GV
 
jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011Joe Stein
 
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemCassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemVarad Meru
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsJoe Stein
 
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaJoe Stein
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Joe Stein
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on MesosJoe Stein
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosJoe Stein
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Markus Klems
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache MesosJoe Stein
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonJoe Stein
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogJoe Stein
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaJoe Stein
 

En vedette (20)

Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1
 
Cassandra
Cassandra Cassandra
Cassandra
 
jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011jstein.cassandra.nyc.2011
jstein.cassandra.nyc.2011
 
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemCassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite Columns
 
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache Kafka
 
Data Pipeline with Kafka
Data Pipeline with KafkaData Pipeline with Kafka
Data Pipeline with Kafka
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on Mesos
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache Mesos
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With Python
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
Cassandra 3.0
Cassandra 3.0Cassandra 3.0
Cassandra 3.0
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache Kafka
 

Similaire à Apache Cassandra 2.0

[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)datastaxjp
 
Introduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerIntroduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerNopparat Nopkuat
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Michael Renner
 
Distribute key value_store
Distribute key value_storeDistribute key value_store
Distribute key value_storedrewz lin
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetLucian Oprea
 
Data SLA in the public cloud
Data SLA in the public cloudData SLA in the public cloud
Data SLA in the public cloudLiran Zelkha
 
Cassandra hands on
Cassandra hands onCassandra hands on
Cassandra hands onniallmilton
 
Sql saturday azure storage by Anton Vidishchev
Sql saturday azure storage by Anton VidishchevSql saturday azure storage by Anton Vidishchev
Sql saturday azure storage by Anton VidishchevAlex Tumanoff
 
Windows Azure Storage: Overview, Internals, and Best Practices
Windows Azure Storage: Overview, Internals, and Best PracticesWindows Azure Storage: Overview, Internals, and Best Practices
Windows Azure Storage: Overview, Internals, and Best PracticesAnton Vidishchev
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008paulguerin
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataDataWorks Summit/Hadoop Summit
 
White Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application ArchitectureWhite Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application ArchitectureShahzad
 
Sql 2016 - What's New
Sql 2016 - What's NewSql 2016 - What's New
Sql 2016 - What's Newdpcobb
 
ADO.Net Improvements in .Net 2.0
ADO.Net Improvements in .Net 2.0ADO.Net Improvements in .Net 2.0
ADO.Net Improvements in .Net 2.0David Truxall
 
CQRS introduction
CQRS introductionCQRS introduction
CQRS introductionYura Taras
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"Jihyun Ahn
 

Similaire à Apache Cassandra 2.0 (20)

[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
 
AWS RDS Migration Tool
AWS RDS Migration Tool AWS RDS Migration Tool
AWS RDS Migration Tool
 
No sql
No sqlNo sql
No sql
 
Introduction to LAVA Workload Scheduler
Introduction to LAVA Workload SchedulerIntroduction to LAVA Workload Scheduler
Introduction to LAVA Workload Scheduler
 
Cassandra data modelling best practices
Cassandra data modelling best practicesCassandra data modelling best practices
Cassandra data modelling best practices
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014
 
Distribute key value_store
Distribute key value_storeDistribute key value_store
Distribute key value_store
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
 
Data SLA in the public cloud
Data SLA in the public cloudData SLA in the public cloud
Data SLA in the public cloud
 
Cassandra hands on
Cassandra hands onCassandra hands on
Cassandra hands on
 
Sql saturday azure storage by Anton Vidishchev
Sql saturday azure storage by Anton VidishchevSql saturday azure storage by Anton Vidishchev
Sql saturday azure storage by Anton Vidishchev
 
Windows Azure Storage: Overview, Internals, and Best Practices
Windows Azure Storage: Overview, Internals, and Best PracticesWindows Azure Storage: Overview, Internals, and Best Practices
Windows Azure Storage: Overview, Internals, and Best Practices
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
White Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application ArchitectureWhite Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application Architecture
 
Sql 2016 - What's New
Sql 2016 - What's NewSql 2016 - What's New
Sql 2016 - What's New
 
ADO.Net Improvements in .Net 2.0
ADO.Net Improvements in .Net 2.0ADO.Net Improvements in .Net 2.0
ADO.Net Improvements in .Net 2.0
 
CQRS introduction
CQRS introductionCQRS introduction
CQRS introduction
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 

Plus de Joe Stein

SMACK Stack 1.1
Apache Cassandra 2.0

  • 2. About me
 Tweets @allthingshadoop
 All Things Hadoop Blog & Podcast – http://www.allthingshadoop.com
 Watched 0.3 from a distance
 Tried to integrate C* into my stack using 0.4, 0.5 and 0.6
 0.7 was awesome! Expiring columns, live schema updates, Hadoop output
 0.8: things got more interesting with counters, and I did our first (partial) prod deploy
 1.0: it all came together for us with compression; full deploy
 Cassandra in prod supported over 100 million daily unique mobile devices
 Born again into CQL3
 Apache Kafka (http://kafka.apache.org/) committer & PMC member
 Founder & Principal Consultant of Big Data Open Source Security LLC – http://www.stealth.ly
 BDOSS is all about the "glue": helping companies figure out not only which Big Data infrastructure components to use but also how to change their existing systems (or build new ones) to work with them.
 Working with clients on new projects using Cassandra 2.0
  • 3. Apache Cassandra 2.0
 Based on Big Table & Dynamo
 High Performance
 Reliable / Available
 Massively Scalable
 Easy To Use
 Developer Productivity Focused
  • 4. What’s new in C* 2.0
 Finalized some legacy items
 CQL Improvements
 Lightweight Transactions
 Triggers
 Compaction Improvements
 Eager Retries
 More, More, More!!!
  • 5. Super Columns refactored to use composite keys instead
  • 6. Virtual Nodes are the default
  • 7. CQL Improvements
 Support for multiple prepared statements in batch
 Prepared statement for the consistency level, timestamp and ttl
 Add cursor API/auto paging to the native CQL protocol
 Support indexes on composite column components
 ALTER TABLE to DROP a column
 Column alias in SELECT statement
 SASL (Simple Authentication and Security Layer) support for improved authentication
 Conditional DDL - Conditionally tests for the existence of a table, keyspace, or index before issuing a DROP or CREATE statement using IF EXISTS or IF NOT EXISTS
 Support for an empty list of values in the IN clause of SELECT, UPDATE, and DELETE commands, useful in Java Driver applications when passing empty arrays as arguments for the IN clause
 Imports and exports CSV (comma-separated values) data to and from Cassandra 1.1.3 and higher.
    COPY table_name ( column, ...) FROM ( 'file_name' | STDIN ) or TO ( 'file_name' | STDOUT ) WITH option = 'value' AND ...
  • 8. Lightweight Transactions – Why?
 Cassandra can provide strong consistency with quorum reads & writes
 A write must be written to the commit log and memory table on a quorum of replica nodes.
 Quorum = (replication_factor / 2) + 1 (rounded down to a whole number)
 This is sometimes not enough: when transactions need to be handled in sequence (linearized), or when concurrent operations must achieve the expected results without race conditions.
 "Strong" consistency is not enough to prevent race conditions. The classic example is user account creation: we want to ensure usernames are unique, so we only want to signal account creation success if nobody else has created the account yet. But naive read-then-write allows clients to race and both think they have a green light to create.
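The quorum formula above can be checked with a few lines of Java. This is only an arithmetic sketch: the class and method names are made up for illustration and are not part of Cassandra. The overlap check uses the usual rule that a read and a write are guaranteed to share a replica when R + W > RF.

```java
final class QuorumSketch {
    // Integer division rounds down, matching "(replication_factor / 2) + 1".
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A read set and a write set share at least one replica when R + W > RF.
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        for (int rf : new int[] {1, 3, 5, 6}) {
            int q = quorum(rf);
            System.out.println("RF=" + rf + " quorum=" + q
                    + " quorum reads/writes overlap=" + overlaps(q, q, rf));
        }
    }
}
```

Note that overlap alone does not order two concurrent writers, which is the gap lightweight transactions are meant to close.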
  • 9. Lightweight Transactions - "when you really need it," not for all your updates.
 Prepare: the coordinator generates a ballot (a timeUUID in our case) and asks replicas to (a) promise not to accept updates from older ballots and (b) report the most recent update they have already accepted.
 Read: we perform a read (of committed values) between the prepare and accept phases. The operation returns here if the CAS condition fails.
 Accept: if a majority of replicas reply, the coordinator asks replicas to accept the value of the highest proposal ballot it heard about, or a new value if no in-progress proposals were reported.
 Commit (Learn): if a majority of replicas acknowledge the accept request, we can commit the new value.
 The coordinator sends a commit message to all replicas with the ballot and value.
 Because of Prepare & Accept, this will be the highest-seen commit ballot. The replicas will note that, and send it with subsequent promise replies. This allows us to discard acceptance records for successfully committed replicas, without allowing incomplete proposals to commit erroneously later on.
 The total of the above operations has a 1 second default timeout, configured in cas_contention_timeout_in_ms
 For more details on exactly how this works, look at the code
 https://github.com/apache/cassandra/blob/cassandra-2.0.1/src/java/org/apache/cassandra/service/StorageProxy.java#L202
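The four phases above can be traced in a small in-memory model. This is a heavily simplified sketch and not Cassandra's implementation: ballots are plain longs instead of timeUUIDs, every call runs sequentially in one thread, and finishing another coordinator's in-progress proposal is omitted. All class and method names are hypothetical.

```java
import java.util.List;
import java.util.Objects;

final class PaxosCasSketch {
    /** Paxos state one replica keeps for a single partition (hypothetical model). */
    static final class Replica {
        long promised = -1;   // highest ballot this replica has promised
        String accepted;      // value of an accepted but uncommitted proposal
        String committed;     // last committed value; null means "row absent"

        // Prepare: promise not to accept updates from older ballots.
        boolean prepare(long ballot) {
            if (ballot <= promised) return false;
            promised = ballot;
            return true;
        }

        // Accept: only honoured if no newer ballot has been promised since.
        boolean accept(long ballot, String value) {
            if (ballot < promised) return false;
            promised = ballot;
            accepted = value;
            return true;
        }

        // Commit (Learn): store the value and discard the acceptance record.
        void commit(String value) {
            committed = value;
            accepted = null;
        }
    }

    /** Returns true only if the value was committed; false on CAS failure or a lost ballot. */
    static boolean cas(List<Replica> replicas, long ballot, String expected, String update) {
        int majority = replicas.size() / 2 + 1;

        // Phase 1: Prepare.
        int promises = 0;
        Replica reader = null;
        for (Replica r : replicas) {
            if (r.prepare(ballot)) {
                promises++;
                if (reader == null) reader = r;
            }
        }
        if (promises < majority) return false;

        // Phase 2: Read committed state and evaluate the condition. CAS failure returns here.
        // (Simplification: one replica is read because this model commits to all replicas.)
        if (!Objects.equals(reader.committed, expected)) return false;

        // Phase 3: Accept.
        int accepts = 0;
        for (Replica r : replicas) {
            if (r.accept(ballot, update)) accepts++;
        }
        if (accepts < majority) return false;

        // Phase 4: Commit to all replicas.
        for (Replica r : replicas) r.commit(update);
        return true;
    }
}
```

With three replicas, a first `cas(replicas, 1, null, "joe")` succeeds, a second attempt expecting `null` with a newer ballot fails at the read phase, and any later call that reuses an old ballot is rejected during prepare.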
  • 10. Lightweight Transactions – How do we use it?
 Isolated to the "partition key" of a CQL3 table
 INSERT
   IF NOT EXISTS
   e.g. INSERT INTO users (username, email, full_name) VALUES ('hellocas', 'cas@updateme.com', 'Hello World CAS') IF NOT EXISTS;
 UPDATE
   IF column = some_value
   IF map[column] = some_value
   some_value is the value you read, or expected, before your change
   e.g. UPDATE inventory SET orderHistory[(now)] = uuid, total[warehouse13] = 7 WHERE itemID = uuid IF total[warehouse13] = 8;
 Conditional updates are not allowed in batches
 The columns updated do NOT have to be the same as the columns in the IF clause.
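For readers more used to in-process concurrency, the two conditional forms above behave like the compare-and-set operations on a Java ConcurrentMap. The sketch below is a local analogy only: it does not talk to Cassandra, and the class name and fields are invented for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ConditionalWriteSketch {
    // Hypothetical stand-in for a "users" table keyed by username.
    private final ConcurrentMap<String, String> emailByUsername = new ConcurrentHashMap<>();

    // Analogue of INSERT ... IF NOT EXISTS: only the first writer wins.
    boolean insertIfNotExists(String username, String email) {
        return emailByUsername.putIfAbsent(username, email) == null;
    }

    // Analogue of UPDATE ... IF column = some_value: applied only if the
    // current value still equals what the caller read earlier.
    boolean updateIf(String username, String expectedEmail, String newEmail) {
        return emailByUsername.replace(username, expectedEmail, newEmail);
    }

    String get(String username) {
        return emailByUsername.get(username);
    }
}
```

As with the CQL forms, the caller has to inspect the boolean result: a `false` means the condition did not hold and nothing was written.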
  • 11. Trigger (Experimental)
 Experimental = expect the API to change and your triggers to need refactoring in 2.1. If you use this feature, provide feedback to the community; it's how we make C* better!!!
 Asynchronous triggers are a basic mechanism for implementing various use cases of asynchronous execution of application code on the database side, for example to support indexes and materialized views, online analytics, and push-based data propagation.
 Basically a trigger takes a RowMutation that is occurring and allows you to pass back other RowMutation(s) you also want to occur
 Triggers on counter tables are generally not supported (counter mutations are not allowed inside logged batches for obvious reasons – they aren’t idempotent).
  • 12. Trigger Example
 https://github.com/apache/cassandra/tree/trunk/examples/triggers
 RowKey:ColumnName:Value to Value:ColumnName:RowKey
 10214124124 – username = joestein to joestein – username = 10214124124
 CREATE TRIGGER test1 ON "Keyspace1"."Standard1" EXECUTE ('org.apache.cassandra.triggers.InvertedIndex');
 Basically we are returning row mutations from within a jar, so we can do pretty much anything we want

    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        List<RowMutation> mutations = new ArrayList<RowMutation>();
        // For every column in the update, build the inverted mutation:
        // the column value becomes the row key and the original row key becomes the value.
        for (ByteBuffer name : update.getColumnNames())
        {
            RowMutation mutation = new RowMutation(properties.getProperty("keyspace"), update.getColumn(name).value());
            mutation.add(properties.getProperty("columnfamily"), name, key, System.currentTimeMillis());
            mutations.add(mutation);
        }
        return mutations;
    }
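To see the RowKey:ColumnName:Value to Value:ColumnName:RowKey inversion without a running cluster, here is a dependency-free sketch of the same transformation. It uses plain strings and a hypothetical Mutation holder instead of Cassandra's RowMutation and ColumnFamily classes, so it illustrates the idea rather than the trigger API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class InvertedIndexSketch {
    /** Hypothetical stand-in for a single-column row mutation. */
    static final class Mutation {
        final String rowKey;
        final String columnName;
        final String value;

        Mutation(String rowKey, String columnName, String value) {
            this.rowKey = rowKey;
            this.columnName = columnName;
            this.value = value;
        }
    }

    // For each updated column, emit a mutation whose row key is the column's
    // value and whose value is the original row key.
    static List<Mutation> augment(String rowKey, Map<String, String> updatedColumns) {
        List<Mutation> mutations = new ArrayList<>();
        for (Map.Entry<String, String> column : updatedColumns.entrySet()) {
            mutations.add(new Mutation(column.getValue(), column.getKey(), rowKey));
        }
        return mutations;
    }
}
```

Calling `augment("10214124124", ...)` with a single column `username = joestein` yields one mutation with row key `joestein`, column `username` and value `10214124124`.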
  • 13. Improved Compaction
 During compaction, Cassandra combines multiple data files to improve the performance of partition scans and to reclaim space from deleted data.
 SizeTieredCompactionStrategy: The default compaction strategy. This strategy gathers SSTables of similar size and compacts them together into a larger SSTable. It is best suited for column families with insert-mostly workloads that are not read as frequently. It also requires closer monitoring of disk utilization because (as a worst case scenario) a column family can temporarily double in size while a compaction is in progress.
 LeveledCompactionStrategy: Introduced in Cassandra 1.0, this strategy creates SSTables of a fixed, relatively small size (5 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. This strategy is best suited for column families with read-heavy workloads that also have frequent updates to existing rows. When using this strategy, you want to keep an eye on read latency for the column family. If a node cannot keep up with the write workload and pending compactions are piling up, then read performance will degrade for a longer period of time.
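The "10 times as large as the previous" rule is easy to turn into numbers. The sketch below uses the 5 MB SSTable size and the factor of 10 from the slide; the assumption that L1 holds ten SSTables is mine, made only to have a starting point, and the class name is hypothetical.

```java
final class LeveledSizingSketch {
    static final long SSTABLE_MB = 5;     // fixed SSTable size, from the slide
    static final long FANOUT = 10;        // each level is 10x the previous, from the slide
    static final long L1_SSTABLES = 10;   // ASSUMPTION: L1 holds ten SSTables

    // Maximum number of SSTables in level n, for n >= 1.
    static long maxSSTables(int level) {
        long count = L1_SSTABLES;
        for (int i = 1; i < level; i++) {
            count *= FANOUT;
        }
        return count;
    }

    // Capacity of level n in megabytes.
    static long capacityMb(int level) {
        return maxSSTables(level) * SSTABLE_MB;
    }

    public static void main(String[] args) {
        for (int level = 1; level <= 4; level++) {
            System.out.println("L" + level + ": " + maxSSTables(level)
                    + " SSTables, " + capacityMb(level) + " MB");
        }
    }
}
```

Because SSTables within a level do not overlap, a read needs at most one SSTable per level, which is why the number of levels (not the number of files) bounds the read cost.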
  • 14. Leveled Compaction – in theory
 Reads touch only a small number of files, which keeps them fast
 Compaction happens fast enough to keep up with the new L0 tier coming in
  • 15. Leveled Compaction – high write load
 Read latency increases under leveled compaction when compaction of the incoming tier (L0) can’t keep up with the writes coming in
  • 16. Leveled Compaction
 Hybrid – best of both worlds
  • 17. Eager Retries
 Speculative execution for reads
 Keeps metrics of read response times to nodes
 Avoid query timeouts by sending redundant requests to other replicas if too much time elapses on the original request
 Settings:
   ALWAYS
   99th percentile (default)
   X percentile
   X ms
   NONE
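The percentile-based setting can be illustrated with a small calculation: take recent read latencies, derive the threshold at the chosen percentile, and send a redundant request once the original has been outstanding longer than that. This is a sketch of the decision rule only; the nearest-rank percentile method, the sample data and all names are assumptions, not Cassandra's internals.

```java
import java.util.Arrays;

final class SpeculativeRetrySketch {
    // Nearest-rank percentile over a window of observed latencies (milliseconds).
    static long percentile(long[] latenciesMs, double pct) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Send a redundant request to another replica once the original request
    // has been outstanding for longer than the threshold.
    static boolean shouldRetry(long elapsedMs, long thresholdMs) {
        return elapsedMs > thresholdMs;
    }

    public static void main(String[] args) {
        long[] observed = {10, 12, 11, 13, 250, 14, 12, 11, 10, 15}; // made-up samples
        long p50 = percentile(observed, 50);
        long p99 = percentile(observed, 99);
        System.out.println("p50=" + p50 + "ms p99=" + p99 + "ms");
        System.out.println("retry after 20ms at p50? " + shouldRetry(20, p50));
    }
}
```

A lower percentile retries sooner and trades extra load on the replicas for lower tail latency; ALWAYS and NONE are the two extremes of that trade-off.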
  • 18. Eager Retries – in action The JIRA for this test
  • 19. More, More, More!!!
 The Java heap and GC have not been able to keep pace with the heap-to-data ratio, so structures in C* that can be cleaned up manually (without the garbage collector) have been moved off-heap. This work was started in 1.2 and finished in 2.0.
 New commands to disable and re-enable background compactions: nodetool disableautocompaction and nodetool enableautocompaction
 Auto_bootstrapping of a single-token node with no initial_token
 Timestamp condition eliminates sstable seeks: each sstable holds its min/max timestamp, so unnecessary files are skipped
 Thrift users got a major bump in performance from an LMAX disruptor implementation
 Java 7 is now required
 Level compaction information is moved into the sstables
 Streaming has been rewritten – better control, traceability and performance!
 Removed row level bloom filters for columns
 Row reads during compaction halved … doubling the speed