3. Why Apache Cassandra?
❏ Open Source
❏ Written in Java
❏ Scalability & High Availability
❏ Fault Tolerance
❏ Replication across multiple datacenters
❏ Async Masterless Replication
❏ No Single Point of Failure
❏ Based on Amazon Dynamo paper
❏ Created at Facebook, open sourced in 2008, Apache Incubator project since 2009
8. Murmur3Partitioner
❏ Murmur3Partitioner
❏ Default
❏ 3-5x faster than RandomPartitioner
❏ Tokens are MurmurHash hash values of the partition key
❏ Uniform
❏ RandomPartitioner
❏ Uniform
❏ MD5 hash
❏ ByteOrderedPartitioner
❏ Lexically ordered by key bytes
❏ Ordered partition
❏ Not Recommended:
❏ Difficult load balancing, hot spots, uneven load balancing across multiple tables.
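To make the hot-spot problem concrete, here is a minimal sketch comparing hashed and byte-ordered placement. It is not Cassandra's code: the 4-node cluster, the equal token slices, and the function names are assumptions made for illustration, and MD5 is used (in the style of RandomPartitioner) because MurmurHash3 is not in the Python standard library.

```python
import hashlib

NUM_NODES = 4  # hypothetical cluster size, one equal token slice per node

def md5_token(key: bytes) -> int:
    """MD5-based token in [0, 2**127], in the style of RandomPartitioner."""
    signed = int.from_bytes(hashlib.md5(key).digest(), "big", signed=True)
    return abs(signed)

def node_for_hashed(key: bytes) -> int:
    # Split the token range into NUM_NODES equal slices.
    return min((md5_token(key) * NUM_NODES) >> 127, NUM_NODES - 1)

def node_for_ordered(key: bytes) -> int:
    # Byte-ordered placement: the ring position follows the key's first byte.
    return key[0] * NUM_NODES // 256

def distribution(keys, placement):
    """Count how many keys land on each node under a placement function."""
    counts = [0] * NUM_NODES
    for key in keys:
        counts[placement(key)] += 1
    return counts

keys = [f"user{i:04d}".encode() for i in range(1000)]
hashed_counts = distribution(keys, node_for_hashed)
ordered_counts = distribution(keys, node_for_ordered)
print("hashed :", hashed_counts)
print("ordered:", ordered_counts)  # [0, 1000, 0, 0]
```

Every key starts with the same prefix, so ordered placement puts all 1000 rows on one node, while hashed placement spreads them across the ring.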
9. Replication
❏ Concepts:
❏ Virtual Nodes: Assign data ownership to physical machines
❏ Partitioner: Partitions data across the cluster
❏ Replication Strategy: Determines the replicas for each row of data
❏ Snitch: Describes the topology (datacenters, racks) used to place replicas and route requests.
❏ Client writes to any node
❏ Node coordinates with replicas
❏ Replication happens in parallel
❏ Replication Factor = number of nodes that hold a copy of the same data, e.g. 3.
❏ SimpleStrategy VS NetworkTopologyStrategy
❏ Design: Nodes, Racks, Data Centers great for Cloud Computing!
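The replica placement idea behind SimpleStrategy can be sketched as a walk around the token ring. This is an illustrative model only: the ring, its small integer tokens, and the node names are hypothetical, and rack or datacenter awareness (what NetworkTopologyStrategy adds) is not modelled.

```python
from bisect import bisect_left

# Hypothetical 4-node ring: (token, node name), sorted by token.
RING = [(0, "n1"), (25, "n2"), (50, "n3"), (75, "n4")]
TOKENS = [token for token, _ in RING]

def replicas(key_token: int, replication_factor: int) -> list:
    """Return the nodes that hold a key, SimpleStrategy style."""
    # A node owns the range (previous token, its own token], so the first
    # replica is the first node whose token is >= the key's token.
    start = bisect_left(TOKENS, key_token) % len(RING)
    count = min(replication_factor, len(RING))
    # The remaining replicas are the next nodes clockwise on the ring.
    return [RING[(start + i) % len(RING)][1] for i in range(count)]

print(replicas(30, 3))  # ['n3', 'n4', 'n1']
print(replicas(80, 3))  # ['n1', 'n2', 'n3'] (wraps around the ring)
```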
11. Consistency
❏ Tunable Consistency: Reads and Writes:
❏ Consistency VS Availability Trade Offs:
❏ ONE, TWO, THREE
❏ QUORUM (majority = N / 2 + 1, rounded down, N = replication factor) - LOCAL_QUORUM (majority in the local DC)
❏ EACH_QUORUM (majority in each DC)
❏ LOCAL_ONE
❏ ALL
❏ ANY (Just for writes)
❏ Disaster Recovery scenarios:
❏ SERIAL (used with lightweight transactions)
❏ LOCAL_SERIAL (used with lightweight transactions, local DC only)
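The quorum arithmetic can be checked with a few lines. The function names are hypothetical. The second function encodes the commonly cited rule, not stated on the slide, that a read overlaps the latest write when read replicas plus write replicas exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor

RF = 3
print(quorum(RF))                                          # 2
print(reads_see_latest_write(quorum(RF), quorum(RF), RF))  # True  (QUORUM + QUORUM)
print(reads_see_latest_write(1, 1, RF))                    # False (ONE + ONE)
print(reads_see_latest_write(1, RF, RF))                   # True  (read ONE, write ALL)
```

Lower levels such as ONE trade this overlap for availability and latency.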
12. Reads and Index
❏ Partition key Cache
❏ Off Heap
❏ Configurable
❏ Row Cache
❏ Off Heap
❏ Configurable
❏ Secondary Index
❏ Filter data on table by non-primary key
❏ ALLOW FILTERING - Can be problematic: the query may scan far more rows than it returns
❏ Cassandra 3.4 - SASI Secondary Index
❏ Better Performance
❏ Memory-mapped B+ tree
❏ Can't use with collections
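As a conceptual sketch of why an index helps with filtering on a non-primary-key column, compare a full scan with a lookup in a value-to-keys map. This is plain Python, not Cassandra's index implementation: the `users` table, its columns, and the function names are hypothetical, and real Cassandra secondary indexes are kept locally on each node.

```python
from collections import defaultdict

# Hypothetical table: partition key -> row.
users = {
    "u1": {"name": "Ana", "country": "BR"},
    "u2": {"name": "Bob", "country": "US"},
    "u3": {"name": "Caio", "country": "BR"},
}

def scan_filter(table, column, value):
    """Check every row, which is roughly what filtering without an index costs."""
    return sorted(pk for pk, row in table.items() if row[column] == value)

def build_index(table, column):
    """Map each column value to the partition keys of the rows that hold it."""
    index = defaultdict(set)
    for pk, row in table.items():
        index[row[column]].add(pk)
    return index

country_index = build_index(users, "country")
print(scan_filter(users, "country", "BR"))  # ['u1', 'u3']
print(sorted(country_index["BR"]))          # ['u1', 'u3']
```

Both calls return the same keys, but the scan touches every row while the index lookup goes straight to the matching ones.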
13. Storage
❏ Log-Structured Merge Tree (instead of a B-tree)
❏ Avoids read-before-write
❏ Favors low write latency
❏ Cassandra groups inserts/updates in memory and periodically flushes them to disk (sequential append).
❏ Immutable Data
❏ Check before write? Use Lightweight Transactions.
❏ Writes: Commit Log -> Memtable -> Flush -> SSTable on disk
❏ All writes are versioned (timestamped)
❏ No in-place delete: deletes are written as tombstones
❏ Reads: Bloom filter
❏ Off-heap structure, one per SSTable
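The write and read paths above can be sketched as a toy log-structured store. This is a teaching model under stated assumptions, not Cassandra's storage engine: `TinyLSM`, `BloomFilter`, the logical clock, and the two-entry memtable limit are all hypothetical, and compaction and commit log recycling are left out.

```python
import hashlib

TOMBSTONE = object()  # marker written instead of deleting in place

class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""
    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log, written first
        self.memtable = {}     # key -> (timestamp, value), in memory
        self.sstables = []     # (bloom filter, immutable dict), oldest first
        self.memtable_limit = memtable_limit
        self.clock = 0         # logical timestamp, one tick per write

    def write(self, key, value):
        self.clock += 1
        self.commit_log.append((self.clock, key, value))
        self.memtable[key] = (self.clock, value)  # no read before write
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        self.write(key, TOMBSTONE)  # a delete is just another write

    def flush(self):
        bloom = BloomFilter()
        for key in self.memtable:
            bloom.add(key)
        self.sstables.append((bloom, dict(self.memtable)))
        self.memtable = {}

    def read(self, key):
        candidates = []
        if key in self.memtable:
            candidates.append(self.memtable[key])
        for bloom, table in self.sstables:
            # The Bloom filter lets the read skip SSTables that cannot hold the key.
            if bloom.might_contain(key) and key in table:
                candidates.append(table[key])
        if not candidates:
            return None
        _, value = max(candidates, key=lambda c: c[0])  # newest version wins
        return None if value is TOMBSTONE else value

db = TinyLSM()
db.write("a", 1)
db.write("b", 2)     # memtable is full, flushed to the first SSTable
db.write("a", 3)     # newer version of "a", the old one is not touched
db.delete("b")       # tombstone for "b", flushed to the second SSTable
print(db.read("a"))  # 3
print(db.read("b"))  # None
```

The old value of "a" and the original "b" row still sit in the first SSTable. Reads resolve them by timestamp, and in the real system compaction eventually discards them.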