Monal Daxini presented on the declarative benchmarking tool NDBench and its Cassandra plugin. The tool allows users to define performance test profiles that specify the Cassandra schema, queries, load patterns, and other parameters. It executes the queries against Cassandra clusters and collects metrics to analyze performance. The plugin supports all Cassandra data types and allows testing different versions. Netflix uses it to validate data models and certify Cassandra upgrades. Future enhancements include adding more data generators and supporting other data stores.
Declarative Benchmarking of Cassandra and Its Data Models
1. Declarative Benchmarking of Cassandra and Its Data Models
Monal Daxini @monaldax
11/11/2019 ApacheCon, Las Vegas, 2019
https://www.linkedin.com/in/monaldaxini
2. Profile
● Cloud Data Engineering @ Netflix, work on many data stores
● Help engineers build scalable solutions
● Built scalable data platforms using Apache Flink / Kafka / Docker
● Working with distributed systems for 18+ years
@monaldax
3. Netflix Cassandra Footprint
• 100s of applications using Cassandra (several unique data models / configs)
• 10s of thousands of instances
• 100s of global C* clusters
• > 6 PB of data
• Millions of requests / second
4. Structure Of The Talk
• Challenges developing a scalable data model (Cassandra)
• Declarative Cassandra benchmarking tool in action
• Tool’s philosophy, how it works, & how it can apply to other data stores
5. Developing a Scalable Cassandra Data Model
For each application:
1. Design data model & schema
2. Design application queries
3. Identify application load & query distribution
4. Prepare test data
5. Prepare query parameter values to run queries efficiently
6. Code an app to execute queries, and instrument it to capture metrics
7. Generate load against the application to run queries with the desired distribution
8. Analyze results (build dashboard)
9. If results are unsatisfactory, iterate from step 1
6. In addition, we may need to test the application workload on different versions of Cassandra and/or different data models.
7. That’s a lot of steps and duplicate effort, and it’s cumbersome!
We want it to be easy, quick, and ergonomic!
8. Developing a Scalable Cassandra Data Model
With tooling, for each application:
1. Design data model & schema
2. Design application queries
3. Identify the application load & query distribution
4. Prepare test data (generate)
9. Configure the tool and run the test; if results are unsatisfactory, iterate from step 1
Heavy Lifting in a Tool:
5. Prepare query parameter values to run queries efficiently
6. Code an app to execute queries, and instrument it to capture metrics
7. Generate load against the application to run queries with the desired distribution
8. Analyze results (build dashboard)
9. What is NDBench?
● Generic benchmarking tool
● Supports different data stores via plugins (available plugins)
● Dynamically tunable RPS and configuration
● Load patterns: random, time window, zipfian
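The three load patterns named above differ in how the next key is chosen. The following is a minimal Python sketch of that idea only; it is not NDBench's implementation, and the function names, the `exponent` parameter, and the window parameters are assumptions made for illustration.

```python
import itertools
import random
from bisect import bisect_left


def uniform_key(rng, num_keys):
    """Random pattern: every key index is equally likely."""
    return rng.randrange(num_keys)


def make_zipfian(num_keys, exponent=1.0):
    """Zipfian pattern: the key at rank k is drawn with weight 1 / k**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_keys + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def next_key(rng):
        # Inverse-CDF sampling over the precomputed cumulative weights.
        idx = bisect_left(cumulative, rng.random() * total)
        return min(idx, num_keys - 1)

    return next_key


def window_key(rng, num_keys, window_size, elapsed_s, window_duration_s):
    """Time-window pattern: keys come from a window that advances with time."""
    window_index = int(elapsed_s // window_duration_s)
    start = (window_index * window_size) % num_keys
    return (start + rng.randrange(window_size)) % num_keys
```

With a zipfian pattern a few hot keys receive most of the traffic, which tends to stress caches and hot partitions differently than a uniform pattern does.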
10. NDBench In Action
[Diagram labels: NDBench App UI; NDBench nodes (EC2 instances); test Cassandra cluster with schema & test data; "reads / writes"; "Record Metrics"]
11. Cassandra NDBench CQL plugin
• Emulates application query logic; runs against real or generated data
• Specify the traffic % distribution
• Basic data type coalescing for using a query result in another query
• Runs any CQL statement (SELECT, UPDATE, INSERT, DELETE) & supports all CQL types
• Supports any Cassandra version with CQL support
12. What Do We Use It For / Plan To Use It For
• Validate scalability of data model and application query workload
• Compare the performance of a data model on Cassandra versions 2.x & 3.x
• Help certify Cassandra updates / upgrades: test different data models and application workloads
• Generate data for a given schema before running queries
15. Application CQL Queries For API 1 (steps 2, 3)
Query Group 1: 70%
SELECT user_id, profile_id FROM user WHERE user_id = ?;
SELECT foreign_keys FROM user_index WHERE type = 'profile_id' AND value = ?;
16. Application CQL Queries For API 2 (steps 2, 3)
Query Group 2: 30%
SELECT user_id, profile_id, acc_guid FROM user WHERE user_id = ?;
BEGIN BATCH
  INSERT INTO user_index (create_time, foreign_keys, type, value) VALUES (?, [ ?, ? ], 'profile_id', ?);
  INSERT INTO user_index (create_time, foreign_keys, type, value) VALUES (?, [ ? ], 'acc_guid', ?);
APPLY BATCH;
INSERT INTO map_test (id, uid_pid) VALUES ('1', {user_id : ?, profile_id: ?});
INSERT INTO set_test (id, uid_pid) VALUES ('2', {?});
17. NDBench CQL Plugin Overview
[Diagram labels: NDBench App UI; NDBench nodes with CQL plugin (EC2 instances); perf test profile (ndb_perf_queries); test Cassandra cluster with schema & test data; "Run Queries"; "Record Metrics"]
18. NDBench CQL Plugin Perf-Test-Profile Schema (step 9)
[Schema figure; annotations:]
• var_* columns point to different sources for query parameter values; only one is used
• Ordered CQL in group (id)
19. Modified App Query With Parameter Reference - Group 1 (70%)
SELECT user_id, profile_id FROM user WHERE user_id = ?user_id?;
SELECT foreign_keys FROM user_index WHERE type = 'profile_id' AND value = ?profile_id?;
20. Modified App Query With Parameter Reference - Group 2 (30%)
SELECT user_id, profile_id, acc_guid FROM user WHERE user_id = ?user_id?;
BEGIN BATCH
  INSERT INTO user_index (create_time, foreign_keys, type, value) VALUES (?:TS?, ?[user_id, profile_id]?, 'profile_id', ?profile_id?);
  INSERT INTO user_index (create_time, foreign_keys, type, value) VALUES (?:TS?, ?[user_id]?, 'acc_guid', ?acc_guid?);
APPLY BATCH;
INSERT INTO map_test (id, uid_pid) VALUES ('1', ?{user_id : user_id, profile_id: profile_id}?);
INSERT INTO set_test (id, uid_pid) VALUES ('2', ?s{user_id}s?);
Slide annotation: Type Coercion
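One way to read the parameter references in the modified queries: each `?name?` token names a column in a parameter row, `?:TS?` asks for a generated timestamp, and the bracketed forms build a list, map, or set from row columns. The Python sketch below illustrates that reading only. It is not the plugin's parser; the `bind` function, the regular expression, and the use of epoch milliseconds for `?:TS?` are assumptions.

```python
import re
import time

# Matches ?:TS?, ?[a, b]?, ?s{a}s?, ?{key: col}? and ?col? (tried in that order).
PLACEHOLDER = re.compile(r"\?(:TS|\[[^\]]*\]|s\{[^}]*\}s|\{[^}]*\}|\w+)\?")


def bind(cql, row, now_ms=None):
    """Return (cql_with_plain_markers, values) for one parameter row."""
    values = []

    def resolve(match):
        token = match.group(1)
        if token == ":TS":
            # Assumed format: epoch milliseconds.
            values.append(int(time.time() * 1000) if now_ms is None else now_ms)
        elif token.startswith("["):
            values.append([row[name.strip()] for name in token[1:-1].split(",")])
        elif token.startswith("s{"):
            values.append({row[name.strip()] for name in token[2:-2].split(",")})
        elif token.startswith("{"):
            pairs = (item.split(":") for item in token[1:-1].split(","))
            values.append({key.strip(): row[col.strip()] for key, col in pairs})
        else:
            values.append(row[token])
        return "?"  # leave a plain bind marker for a prepared statement

    return PLACEHOLDER.sub(resolve, cql), values
```

For example, binding `SELECT user_id, profile_id FROM user WHERE user_id = ?user_id?;` against a row that contains `user_id` yields the same statement with a plain `?` marker plus a one-element value list, which is the shape a prepared statement expects.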
25. Summary - Ergonomic Perf Test Profile, & Comprehensive Validation
• Total traffic % of query groups must add up to 100
• Supports a different consistency level for each statement
• Columns in a CQL statement are inferred, and available from the parameter source
• Parameter source: table, previous query results, or SELECT statement
• Supports a large number of parameters to perf test CQL queries
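The first rule above (group traffic percentages must add up to 100) is easy to check before a test starts. Below is a minimal Python sketch of such a check. The `ProfileStatement` fields are hypothetical stand-ins for profile columns, since the actual schema is not reproduced in this transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileStatement:
    # Hypothetical fields; the real perf-test-profile columns may differ.
    group_id: int
    order_in_group: int
    traffic_pct: int
    cql: str
    consistency: str = "LOCAL_ONE"


def validate_profile(statements):
    """Return {group_id: traffic_pct}; raise if the profile is inconsistent."""
    pct_by_group = {}
    for stmt in statements:
        previous = pct_by_group.setdefault(stmt.group_id, stmt.traffic_pct)
        if previous != stmt.traffic_pct:
            raise ValueError(f"group {stmt.group_id} has conflicting traffic %")
    total = sum(pct_by_group.values())
    if total != 100:
        raise ValueError(f"traffic % adds up to {total}, expected 100")
    return pct_by_group
```

Failing fast here avoids running a long benchmark whose query mix does not match the intended application workload.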
37. Testing C* Data Model For A Critical Service On 2.x & 3.x
• Tested at up to 1.2 million ops / second (1.2 billion parameter rows)
• 96 i3.8xl nodes, LCS (compaction), LZ4, mostly read heavy
• Found a data model bug that was slowly leading to wide rows
• Found client wrapper bugs: slow memory leak, metrics, prepared statement caching not working
38. We Would Like To Use Plugin To Test Cassandra @ Netflix
Use restores from prod data backups and a defined set of CQL perf test profiles, exercised by the NDBench CQL plugin and triggered by Cassandra builds
40. NDBench CQL Plugin Architecture
[Diagram labels: NDBench App UI; NDBench nodes with CQL plugin (EC2 instances); perf test profile (ndb_perf_queries); test Cassandra cluster with schema & test data; "Run Queries"; "Record Metrics"]
41. High-level Architecture
[Diagram labels: NDBench UI; NDBench nodes, each with a SQLite param store; Cassandra cluster with ndb_perf_queries and schema & test data. Metadata could live on any Cassandra cluster.]
0. NDBench UI: /init/ all nodes (REST)
1. Parse metadata
2. Load from user & store on node in SQLite
3. /start/ all nodes
4. Run queries with param values from SQLite & record metrics
Randomize start
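The node-local param store in this flow can be pictured as a small SQLite table that each query thread samples from. The sketch below uses Python's standard `sqlite3` module to illustrate that idea; the `ParamStore` class, the `params` table, and its columns (which mirror the example queries) are assumptions rather than the plugin's actual layout.

```python
import random
import sqlite3


class ParamStore:
    """Node-local store of query parameter rows, backed by SQLite."""

    def __init__(self, rows, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE params (id INTEGER PRIMARY KEY, user_id TEXT, "
            "profile_id TEXT, acc_guid TEXT)"
        )
        self._conn.executemany(
            "INSERT INTO params (user_id, profile_id, acc_guid) VALUES (?, ?, ?)",
            rows,
        )
        self._conn.commit()
        # Count once at load time; ids are 1..count on a freshly filled table.
        self._count = self._conn.execute("SELECT COUNT(*) FROM params").fetchone()[0]

    def random_row(self, rng):
        """Fetch one parameter row by primary key."""
        row_id = rng.randint(1, self._count)
        cur = self._conn.execute(
            "SELECT user_id, profile_id, acc_guid FROM params WHERE id = ?",
            (row_id,),
        )
        user_id, profile_id, acc_guid = cur.fetchone()
        return {"user_id": user_id, "profile_id": profile_id, "acc_guid": acc_guid}
```

Keeping parameter values on the node means the benchmark does not have to query the cluster under test for parameters while load is running.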
42. High-level Architecture (optimized)
[Diagram labels: NDBench UI; NDBench nodes, each with a SQLite param store; Cassandra cluster with ndb_perf_queries and schema & test data. Metadata could live on any Cassandra cluster.]
0. /init/ a node
1. Parse metadata
2. If user params are not on S3: load them & store on 1 node in SQLite
3. Upload SQLite file
4. NDBench UI: /init/ all nodes (REST)
5. Download SQLite file from each node
6. /start/ all nodes
7. Run queries with param values from SQLite & record metrics
Randomize start
44. Lock-free Randomized Deterministic % Query Distribution On Each Node (1)
Query Group ID 1: 70%, Query Group ID 2: 30%
100-element array: 70 1s for Query Group 1, then 30 2s for Query Group 2
1 1 1 1 1 1 1 2 2 2 2 (abbreviated illustration)
45. Lock-free Randomized Deterministic % Query Distribution On Each Node (2)
Query Group ID 1: 70%, Query Group ID 2: 30%
One-time Fisher-Yates shuffle:
1 2 1 1 2 1 2 1 2 1 1
46. Lock-free Randomized Deterministic % Query Distribution On Each Node (3)
Query Group ID 1: 70%, Query Group ID 2: 30%
1 2 1 1 2 1 2 1 2 1 1
Thread 1 ... Thread n: each thread keeps its own ThreadLocal array index
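The three slides above describe a 100-element array of group IDs, shuffled once, and walked by each thread with its own index. The Python sketch below follows that description; the `build_slots` and `GroupPicker` names are hypothetical, and the actual plugin code may differ in detail.

```python
import random
import threading


def build_slots(pct_by_group, rng):
    """Build the 100-element array of group IDs and shuffle it once."""
    if sum(pct_by_group.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    slots = [gid for gid, pct in pct_by_group.items() for _ in range(pct)]
    # Fisher-Yates: walk from the end, swapping each slot with a randomly
    # chosen slot at or before it.
    for i in range(len(slots) - 1, 0, -1):
        j = rng.randint(0, i)
        slots[i], slots[j] = slots[j], slots[i]
    return tuple(slots)  # read-only after construction, so no lock is needed


class GroupPicker:
    """Hands out query group IDs; each thread advances its own array index."""

    def __init__(self, slots):
        self._slots = slots
        self._local = threading.local()

    def next_group(self):
        index = getattr(self._local, "index", 0)
        self._local.index = (index + 1) % len(self._slots)
        return self._slots[index]
```

Because every thread cycles through the whole array, each run of 100 consecutive picks on a thread contains exactly 70 picks of group 1 and 30 of group 2, while the shuffle keeps the order within the cycle random. The shared array is never written after construction and the index is per thread, so threads do not contend on a lock.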
47. Data Generators And Generating Test Data
• ?:TS? - This is replaced by a timestamp.
• Add more generators (future)
• Generation of non-collection (bigint, text, uuid, etc.) and collection types
• Use generators in INSERT to generate data for new schema
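As a rough illustration of what type-driven generators could look like, the Python sketch below maps a CQL type name to a value generator and builds list and set values from scalar ones. Only `?:TS?` is named in the slides; the registry, the type-string parsing, and the epoch-millisecond timestamp are assumptions for illustration.

```python
import random
import string
import time
import uuid


def make_generators(rng):
    """Map a CQL type name to a zero-argument value generator."""
    return {
        # Epoch milliseconds (assumed format for a generated timestamp).
        "timestamp": lambda: int(time.time() * 1000),
        "bigint": lambda: rng.randrange(-2**63, 2**63),
        "text": lambda: "".join(rng.choices(string.ascii_lowercase, k=12)),
        "uuid": lambda: uuid.UUID(int=rng.getrandbits(128), version=4),
    }


def generate(cql_type, generators, size=3):
    """Generate a value for a scalar type or a list<...> / set<...> of scalars."""
    for prefix, build in (("list<", list), ("set<", set)):
        if cql_type.startswith(prefix) and cql_type.endswith(">"):
            element = generators[cql_type[len(prefix):-1].strip()]
            return build(element() for _ in range(size))
    return generators[cql_type]()
```

Generated values like these could be bound into INSERT statements to populate a new schema before the read-heavy part of a test begins.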
49. Summary
• Declarative benchmarking significantly reduces the overhead of iterating over schema and Cassandra config to achieve scale
• Used to test and benchmark against curated data sets and perf-test-profiles
• Supports all data types; LWT support (beta)
• Randomized deterministic percentage distribution of queries
50. Future Enhancements (Lazily)
• Open source NDBench CQL plugin (WIP)
• Add more generators
• Load sharded query parameter data on each NDBench node
• UDT support in dynamic collections
• Build support for other data stores: leverage same philosophy & reuse code