Scylla is a new open source NoSQL database that is compatible with Apache Cassandra but provides significantly higher performance through a redesign that takes advantage of modern hardware. Scylla is capable of over 1.8 million operations per second per node with predictable low latencies. It uses an architecture with shard-per-core and reactor programming that avoids locks and threads for near-linear scaling. Scylla also has its own efficient unified cache and I/O scheduler that maximize throughput and allow it to outperform Cassandra on benchmarks by an order of magnitude. Scylla is fully compatible with Cassandra and aims to build an open source community around ongoing core database improvements.
2. Capable of 1,800,000 operations per second
PER NODE
With predictable, low latencies
Compatible with Apache Cassandra
Scylla: A new Open Source NoSQL Database
13. WHAT WOULD YOU DO WITH 1 MILLION TPS?
Shrink your cluster by a factor of X10
Handle 10X traffic spikes on Black Friday
Faster repairs, faster scale out.
Get the most out of your data - Run more queries
Administration operations while serving
Stop using caches in front of the database
15. SCYLLA IS QUITE DIFFERENT
Shard-per-core, no locks, no threads, zero-copy
Reactor programing with C++14
Our own efficient, DB-aware cache, not using Linux page cache
Better storage engine
Max out all HW resources - NUMA friendly, multiqueue NICs, etc
Userspace I/O scheduler
Based on Seastar project
16. SCYLLA ARCHITECTURE COMPARISON
Cassandra
TCP/IPScheduler
queuequeuequeuequeuequeue
threads
NIC
Queues
Kernel
Traditional stack Seastar’s sharded stack
Memory
Lock contention
Cache contention
NUMA unfriendly
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
No contention
Linear scaling
NUMA friendly
Kernel
Core
Database
Task Scheduler
queuequeuequeuequeuequeuesmp queue
Userspace
17. Scylla has its own task scheduler
Traditional stack Scylla’s stack
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise is a
pointer to
eventually
computed value
Task is a
pointer to a
lambda function
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread is a
function pointer
Stack is a byte
array from 64k
to megabytes
Context switch cost is
high. Large stacks pollutes
the caches No sharing, millions of
parallel events
20. Scylla has an I/O scheduler
Traditional stack
Scylla stack
Max useful disk
concurrency
I/O scheduled
by priority
here
Source: http://www.scylladb.com/2016/04/14/io-scheduler-1/
23. SCYLLA as an INFRASTRUCTURE
Scale up with the number of cores
Kernel bypass for direct networking and block I/O
Good match for upcoming Non Volatile Memory technology
High availability with gossip and flexible replication
Runs everywhere: Physical, virtual, containers
Can be integrated with microservices with its own httpd