3. Main points
Structured log storage
Columns ordered by name inside key
Rows ordered by hash of row key *
Column family storage
Fully distributed peer-to-peer
Partitioned by row key
Dynamo consistency
4. Structured log storage
Writes never happen in place
Part of the JVM heap is reserved for memtables
Memtables are sorted
When memtables reach a configured size they are
flushed to disk
− Creates sstable file
− Bloom filter file
− Index file
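The write path above can be sketched in a few lines. This is a toy model, not Cassandra's actual classes: `Memtable` and its threshold are made-up names, and the "sstable" here is just an immutable, sorted tuple.

```python
# Toy sketch of log-structured storage: writes land in an in-memory sorted
# memtable; past a size threshold it is flushed as an immutable, sorted
# "sstable". All names are illustrative, not Cassandra's real code.

class Memtable:
    def __init__(self, flush_threshold=3):
        self.rows = {}                      # row_key -> {column_name: value}
        self.flush_threshold = flush_threshold

    def write(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value

    def should_flush(self):
        return len(self.rows) >= self.flush_threshold

    def flush(self):
        # The sstable is written once and never updated: a sorted,
        # immutable sequence of (row_key, sorted columns) pairs.
        sstable = tuple(
            (key, tuple(sorted(self.rows[key].items())))
            for key in sorted(self.rows)
        )
        self.rows = {}                      # memtable starts over after a flush
        return sstable
```

Note how sorting happens in memory, so the flushed file comes out ordered without any in-place rewriting on disk.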
6. Commit logs
Every write/delete operation goes to commit
log
If a node shuts down with un-flushed
memtables (effectively every shutdown)
Replay the commit logs on restart
7. Columns ordered inside key
Cassandra likes wide rows
− Up to 2 billion
− (but not really: that would be a 32GB row)
set mystuff['ecapriolo']['a']='1'
set mystuff['ecapriolo']['b']='2'
set mystuff['ecapriolo']['c']='3'
...
slice mystuff['ecapriolo'] ['b'] ['g']
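The slice above works because columns are kept sorted by name inside the row; a minimal sketch of that lookup (the `row` data here is made up):

```python
from bisect import bisect_left, bisect_right

def slice_columns(row, start, finish):
    """Return the columns of one wide row whose names fall in
    [start, finish], relying on column names being sorted."""
    names = sorted(row)                 # Cassandra keeps these sorted on disk
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return [(name, row[name]) for name in names[lo:hi]]

# Hypothetical wide row, like mystuff['ecapriolo'] above
row = {'a': '1', 'b': '2', 'c': '3', 'g': '7', 'h': '8'}
```

Because the names are sorted, a slice is two binary searches plus a sequential read, which is why wide rows with ordered columns are cheap to range-scan.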
8. Rows ordered by hash of row key
All columns of row 'a1' live on the same node
But row 'a2' may land on a different node
Reduces hot spots
But there is no total ordering based on row
keys
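Hash placement can be sketched like this. RandomPartitioner does use MD5, but the modulo placement below is a simplification of real token ranges, and the node names are made up:

```python
import hashlib

# Toy sketch of hash partitioning: a row is placed by the hash of its key,
# so all columns of one row stay together while adjacent keys spread out.

def node_for(row_key, nodes):
    token = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
    return nodes[token % len(nodes)]

nodes = ['node1', 'node2', 'node3']   # hypothetical cluster
```

Keys 'a1' and 'a2' are adjacent lexically but their hashes are unrelated, which spreads load but forfeits ordered range scans over row keys.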
9. Peer to Peer
The node list and token ranges are gossiped
Each node is responsible for local storage and
for coordinating requests
When a new node joins, it takes some token
range away from other nodes
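Token-range takeover can be sketched with a sorted ring — tokens and node names below are invented for illustration:

```python
from bisect import bisect_left

# Toy sketch of a token ring: each node owns the keys whose token falls at
# or below its own token, wrapping around. A joining node takes over part
# of an existing node's range just by inserting its token.

def owner(token, ring):
    """ring: sorted list of (token, node). The owner is the node with the
    first ring token >= the key's token, wrapping to the lowest token."""
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]

ring = [(25, 'A'), (50, 'B'), (75, 'C'), (100, 'D')]
# a hypothetical node E joins at token 60 and takes the (50, 60] slice from C
ring_after_join = sorted(ring + [(60, 'E')])
```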
11. Dynamo consistency
Operations have a requested Consistency
Level
− ONE
− QUORUM
CL replicas must ack the operation before the
user receives an ack
If an operation fails it is safe to retry *
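The arithmetic behind QUORUM is just majority math: with replication factor RF, if reads and writes each touch a quorum, the two replica sets must overlap (R + W > RF). A sketch, not Cassandra code:

```python
# Consistency-level arithmetic: quorum is a strict majority of replicas.
# When read CL + write CL exceed RF, every read set intersects every
# write set, so a read sees the latest acknowledged write.

def quorum(rf):
    return rf // 2 + 1

def overlapping(read_cl, write_cl, rf):
    return read_cl + write_cl > rf
```

For RF=3, QUORUM is 2, and QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), while ONE/ONE (1 + 1 = 2) can miss the latest update.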
15. Hadoop and Cassandra
ColumnFamilyInputFormat
− Takes a ColumnFamily as input
− map(ByteBuffer key, SortedMap<ByteBuffer, Column> columns)
ColumnFamilyOutputFormat
− Writes out to a column family
− OutputFormat<ByteBuffer, List<Mutation>>
16. Hadoop optimizations
Tasks run with locality if C* and Hadoop share the same nodes
InputFormat can leverage c* secondary
indexes
OutputFormat can use bulk loader
− C* writes are helluva fast anyway
17. Hive and Cassandra
Hive support is similar to the HBase handler
support
Create a Hive table specifying properties
similar to those in MapReduce
hive> CREATE EXTERNAL TABLE Users(userid string, name string, email string, phone string)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH ...
18. Other support out there
github.com/edwardcapriolo/hive-cassandra-udfs
− Delete UDF
− Composite splitter/builder UDFs
Not very hard to roll your own input format
− OneRowInputFormat
− ListOfRowsInputFormat
19. Pig Cassandra
Nice support for pig/cassandra
Pygmalion library
But I don't use it
− Cause I use hive
− You should as well
− And get my book :)
20. Comparison between c*
and “other noSQL”
I know you're talking about HBase :)
Cassandra does not store multiple versions of
a column
− Last update wins
− Use UUID as part of column name instead
The row keys are not globally ordered *
− Unless you are using ByteOrderPartitioner (no one
should use this)
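The "UUID as part of the column name" trick above can be sketched like this — a hypothetical model where a time-based UUID in the name preserves history under last-update-wins (CPython's `uuid1` timestamps are strictly increasing within a process):

```python
import uuid

# Toy sketch: since the last update to a column wins, keep history by
# embedding a time-based UUID in the column name itself; the wide row's
# column ordering then yields versions in time order. Names are made up.

def write_versioned(row, base_name, value):
    # uuid1 carries a 100ns timestamp; each write gets a distinct name
    row[(base_name, uuid.uuid1())] = value

row = {}
write_versioned(row, 'email', 'old@example.com')
write_versioned(row, 'email', 'new@example.com')

# newest version = the column whose UUID timestamp is largest
latest = max(row.items(), key=lambda kv: kv[0][1].time)[1]
```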
21. Comparison between c*
and “other noSQL”
Each C* replica actively serves reads & writes
Cassandra directly manages its storage
Shards are pre-defined tokens (no auto-split)
Qualifier/column name can NOT be null
23. Know your data
Design for the long tail scenarios
− With design x our largest customer will have
10000000000000 columns in one row
How large will this column family be in 5
months?
What is the request rate?
How random is the read pattern?
24. Understanding write-once files
Deletes are writes that get compacted away
later
Can you optimize by using blind writes?
What percent of your application is
update/insert?
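"Deletes are writes that get compacted away later" can be sketched with a merge: a delete writes a tombstone, and compaction keeps only the newest entry per column, purging columns whose winner is a tombstone. A toy model with invented data:

```python
# Toy sketch of tombstones and compaction over write-once sstables:
# merging keeps the newest (timestamp, value) per column, and a winning
# tombstone removes the column entirely.

TOMBSTONE = object()

def compact(*sstables):
    """Each sstable maps column -> (timestamp, value). Newest wins;
    columns whose newest entry is a tombstone are dropped."""
    merged = {}
    for sstable in sstables:
        for col, (ts, val) in sstable.items():
            if col not in merged or ts > merged[col][0]:
                merged[col] = (ts, val)
    return {c: tv for c, tv in merged.items() if tv[1] is not TOMBSTONE}

old = {'a': (1, 'x'), 'b': (1, 'y')}            # earlier flush
new = {'a': (2, TOMBSTONE), 'c': (2, 'z')}      # later flush: 'a' deleted
```

This is why update-heavy and delete-heavy workloads cost more at compaction time: the old values still exist on disk until a merge resolves them.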
27. Hardware
Fast disk (you almost always want SSD)
RAM
− Caches, bloom filters, young gen
CPU
− The garbage collector, deserialization, and compaction
all need CPU
28. Anti patterns
Using one row key as a queue
Doing N reads to satisfy a request
Read before write
Using collection support in place of wide rows
Encoding