1. Administrator Training with Apache Cassandra
Tuning
Datastax Inc.
http://www.datastax.com/
August 27, 2012
Datastax Inc. Administrator Training with Apache Cassandra 1/ 24
2. Outline
1 Overview
2 Cache
3 Summary
3. Tuning Cassandra
Effective tuning depends on your use case
What operations are performed?
What does your data look like?
Wide Rows
Skinny Rows
5. Tuning the Cache
Cassandra’s Cache is VERY efficient
Takes lessons from dedicated caches
Solves cache coherence issues
Removes tiers from the application
Cache never needs to be completely cold
6. Row Cache
Reads are answered from the row cache whenever possible
A hit never requires a disk seek
7. Key Cache
Used if Row Cache misses
Key → SSTable is not 1:1
Each hit saves 1 seek
Populated from a miss
8. Key Cache (cont.)
Map of (key, list<(sstable, location)>)
Per ColumnFamily
Large numbers of rows accessed frequently
Adjust by changing keys_cached on the ColumnFamily
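The map described on this slide can be pictured with a small Python sketch. This is only an illustration of the idea, not Cassandra's implementation: the class name, method names, and the SSTable names used in the example are all hypothetical.

```python
from collections import defaultdict


class KeyCache:
    """Toy per-ColumnFamily key cache: row key -> [(sstable, location), ...].

    Hypothetical illustration only; not Cassandra's actual data structure.
    """

    def __init__(self):
        self._entries = defaultdict(list)
        self.requests = 0
        self.hits = 0

    def get(self, key):
        """Return the known (sstable, location) pairs for key, or None on a miss."""
        self.requests += 1
        locations = self._entries.get(key)
        if locations:
            self.hits += 1
            return list(locations)
        return None

    def put(self, key, sstable, location):
        """Record a location after a miss; one key may live in several SSTables."""
        if (sstable, location) not in self._entries[key]:
            self._entries[key].append((sstable, location))

    def hit_rate(self):
        """Fraction of lookups that were hits (0.0 before any lookup)."""
        return self.hits / self.requests if self.requests else 0.0


cache = KeyCache()
first_lookup = cache.get("alice")        # miss: nothing cached yet
cache.put("alice", "sstable-1", 4096)    # populated from the miss
cache.put("alice", "sstable-2", 128)     # same key, second SSTable
second_lookup = cache.get("alice")       # hit: both locations known
```

Because the entry holds a list, one cached key can point into several SSTables, which is why the key to SSTable relationship is not 1:1.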
9. Adjusting the Key Cache
cassandra-cli to adjust the key cache:
[default@demo] UPDATE COLUMN FAMILY users WITH keys_cached=
10. Monitor Key Cache
nodetool cfstats
RecentHitRate (jmx)
12. Configuring Row Cache
Put the entire row in memory
This can be harmful
Disabled by default
Disable it for ColumnFamilies with wide rows
Now stored off-heap
13. Candidate Column Families for Row Cache
Accesses a small subset of rows
All or most columns are returned
Used properly, the improvement is substantial
14. Adjusting the Row Cache
cassandra-cli to adjust the row cache:
[default@demo] UPDATE COLUMN FAMILY users WITH rows_cached=
AND row_cache_provider='SerializingCacheProvider';
15. Monitor Row Cache
nodetool cfstats
RecentHitRate (jmx)
16. Data Modeling Considerations
Logical separation of workloads
Read heavy
Write heavy
Narrow Rows, Frequently Accessed
Normal (Gaussian) distributions work well
A few ’Hot’ rows
17. Hardware and OS
Large number of nodes under light load
Ideal for in memory cache
OS Page Cache
Lower memtable size
Lower heap size
Balanced need
heap
memtables
cache
18. Estimating Cache Sizes
nodetool cfstats
Number of keys cached * average key size
Number of rows cached * average row size
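The two products above are simple enough to work through in Python. This is a rough sketch: the counts and average sizes below are made-up placeholders rather than real nodetool cfstats output, and the estimate ignores any per-entry overhead.

```python
def estimate_cache_bytes(entries_cached, average_entry_bytes):
    """Estimated cache footprint: number of entries times average entry size."""
    return entries_cached * average_entry_bytes


def bytes_to_mb(size_in_bytes):
    """Convert bytes to megabytes (1 MB = 1024 * 1024 bytes)."""
    return size_in_bytes / (1024 * 1024)


# Placeholder inputs (assumptions for illustration only).
key_cache_bytes = estimate_cache_bytes(entries_cached=200_000, average_entry_bytes=64)
row_cache_bytes = estimate_cache_bytes(entries_cached=10_000, average_entry_bytes=2_048)

print(f"key cache: ~{bytes_to_mb(key_cache_bytes):.1f} MB")
print(f"row cache: ~{bytes_to_mb(row_cache_bytes):.1f} MB")
```

With these placeholder numbers, 10,000 cached rows take more memory than 200,000 cached keys, which is the kind of comparison the estimate is meant to surface.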
19. Write Performance
Write-back cache
Thresholds are per-node
Per-CF thresholds have been deprecated
commitlog_total_space_in_mb
memtable_total_space_in_mb
memtable_flush_queue_size (for secondary indexes)
20. Java Heap Size
Default 1/4 of the memory
1GB - 8GB
Little to negative benefit > 8GB Heap
(memtable_total_space_in_mb) + 1GB + (key_cache_size_e
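The default rule on this slide can be sketched in Python as a quarter of system memory, limited to the 1GB to 8GB range. Reading "1GB - 8GB" as a floor and a ceiling is an assumption, the function name is hypothetical, and the actual startup script may compute the default differently.

```python
GB = 1024 ** 3


def default_heap_bytes(system_memory_bytes, floor=1 * GB, ceiling=8 * GB):
    """One quarter of system memory, clamped to [floor, ceiling].

    Assumption: the slide's "1GB - 8GB" is treated as lower and upper bounds.
    """
    quarter = system_memory_bytes // 4
    return max(floor, min(quarter, ceiling))


for ram_gb in (2, 16, 64):
    heap_gb = default_heap_bytes(ram_gb * GB) / GB
    print(f"{ram_gb} GB RAM -> {heap_gb:.0f} GB heap")
```

Under this reading, a 64 GB machine still gets an 8 GB heap, and the remaining memory is left to the OS page cache and off-heap structures.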
21. Garbage Collection
Log information when GC > 200ms
Add Nodes
Decrease key-cache size
index_interval
Default 128; try 256 or 512
22. Compaction
multithreaded_compaction
compaction_throughput_mb_per_sec
in_memory_compaction_limit_in_mb
compaction_preheat_key_cache
24. Overview
Lots of Options
Dependent on your use case
Cache
IO