Most databases are based on architectures that predate recent advances in hardware. This results in performance issues, the need to overprovision, and a high total cost of ownership. In this webinar we will discuss the advances in modern server technology and take a deep dive into Scylla’s shard-per-core architecture and our asynchronous engine, the Seastar framework.
Join us to learn how Seastar (and Scylla):
Avoid locks and contention on the CPU level
Bypass kernel bottlenecks
Implement their per-core, shared-nothing autosharding mechanism
Utilize modern storage hardware
Leverage NUMA to get the best RAM performance
Balance your data across CPUs and nodes for best and smoothest performance
Plus we’ll cover the advantages of unlocking vertical scalability.
Under the Hood of a Shard-per-Core Database Architecture
1. Sharding All the Way Down
Under the Hood of a Shard-per-Core Architecture
Avishai Ish-Shalom
3. ~$ whoami
Avishai Ish-Shalom
+ Developer Advocate for ScyllaDB
+ 17 years in the industry, doing dev, ops and other magical roles
4. About ScyllaDB
+ The Real-Time Big Data Database
+ Drop-in replacement for Apache Cassandra and Amazon DynamoDB
+ 10X the performance & low tail latency
+ Open Source, Enterprise and Cloud options
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA, USA; Herzliya, Israel; Warsaw, Poland
5. The universal scalability law (USL)
[Figure: scalability curve plotted against load/resources. Region of diminishing returns: contention / queueing. Region of negative returns: coherency / consensus / synchronization.]
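The two regions on this slide correspond to the two penalty terms in the usual statement of the universal scalability law, C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α models contention/queueing and β models coherency/synchronization. Below is a minimal sketch of that formula. The coefficient values `ALPHA` and `BETA` and the helper names are assumptions chosen only to make both regimes visible; they are not measurements from the talk.

```python
def usl_speedup(n, alpha, beta):
    """Relative capacity at n workers under the universal scalability law."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_workers(alpha, beta, max_n=1024):
    """Worker count in 1..max_n that gives the highest relative capacity."""
    return max(range(1, max_n + 1), key=lambda n: usl_speedup(n, alpha, beta))


# Hypothetical coefficients, for illustration only.
ALPHA = 0.05   # contention / queueing penalty (diminishing returns)
BETA = 0.001   # coherency / synchronization penalty (negative returns)

if __name__ == "__main__":
    for n in (1, 8, 32, 128, 256):
        print(n, round(usl_speedup(n, ALPHA, BETA), 2))
    print("capacity peaks at", peak_workers(ALPHA, BETA), "workers")
```

With any nonzero β the curve eventually turns downward, so adding cores past the peak reduces throughput. Setting β to zero leaves only the diminishing-returns behavior.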
13. What happened?
+ Per thread performance plateaued
+ Cores: 1 ⟶ 256
+ RAM: 2GB ⟶ 2TB
+ Disk space: 10GB ⟶ 10TB
+ Disk seek time: 10-20ms ⟶ 20µs
+ Network throughput: 1Gbps ⟶ 100Gbps
This year: 64/128 cores/threads per CPU, 400Gbps NIC, disk with 10µs latency, 1.5TB/device, DDR5 2TB/DIMM
AWS u-24tb1.metal: 224 cores, 448 threads, 24TB RAM
14. The CPU-RAM-storage gap
+ RAM latency hasn’t changed in 20 years (~13ns)
+ Bandwidth grows, but slowly
+ Memory seek is ~1000 CPU cycles
+ NVMe seek is ~1000 memory seeks
RAM bandwidth has become the bottleneck
17. What’s going on?
+ MySQL maxes out at around 48 cores
+ Context switch ~1-5µs
+ Lots of context switches, wasted CPU and wall time
+ Locks, locks and damn locks
+ Because shared memory
18. How can we use the hardware?
+ No locks
+ No shared memory
+ No coordination/synchronization
+ No context switches
+ No memory copies
19. Poll Question
How much data do you have under management in your own transactional database systems?
+ <1 terabyte
+ 1 to 50 terabytes
+ 50-100 terabytes
+ >100 terabytes
21. A day in the life of a packet
[Diagram: a packet passes through the interrupt handler, the TCP/IP stack, and the application code. Questions annotated along the path: Single core? Another core? Memory copy? Another core? Context switch?]
22. Ouch
+ Context switches
+ Memory locks
+ Socket locks
+ CPU core locality
+ NUMA locality
23. The kernel has many shared structures
+ Counters, stats
+ VFS metadata
+ The pagecache
+ Various caches
And they all need to be synchronized
Locks, locks everywhere
24. Context switch cost
+ High concurrency -> many context switches
+ Kernel design assumes context switch is cheap
+ Context switch cost ~1-5µs
+ Modern I/O op ~20-50µs (and going down)
+ 10% overhead on I/O op???
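The overhead figure on this slide can be sanity-checked with simple arithmetic on the two ranges given. This is a rough sketch, assuming one context switch per I/O operation; `switches_per_io` is a hypothetical parameter, and a blocking call may well incur more than one switch.

```python
def switch_overhead(switch_us, io_us, switches_per_io=1):
    """Context-switch time expressed as a fraction of one I/O operation's duration."""
    return switches_per_io * switch_us / io_us


if __name__ == "__main__":
    best = switch_overhead(1, 50)    # cheapest switch, slowest I/O op
    worst = switch_overhead(5, 20)   # costliest switch, fastest I/O op
    print(f"overhead per I/O op: {best:.0%} to {worst:.0%}")
```

The slide's ~10% estimate sits inside this range, and the ratio grows as I/O latency keeps falling while context switch cost stays roughly constant.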
27. Sharding/partitioning
+ Common concept in distributed databases
+ Break the system into N non-interacting parts
+ Usually done by hash(partition_key) % N
+ Data/load may be unbalanced
+ Fact of life in distributed databases 🤷
+ Logical mapping of data shards to core shards
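The `hash(partition_key) % N` rule from this slide can be sketched in a few lines. This is an illustration of the generic technique only, not ScyllaDB's actual token-to-shard mapping; the function names and the choice of BLAKE2b as a stable hash are assumptions (Python's built-in `hash()` is randomized per process for strings, so it is not used here).

```python
import hashlib
from collections import Counter


def shard_of(partition_key: str, n_shards: int) -> int:
    """Map a partition key to a shard index using a stable 64-bit hash."""
    digest = hashlib.blake2b(partition_key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_shards


def shard_load(keys, n_shards):
    """Count how many requests land on each shard."""
    return Counter(shard_of(k, n_shards) for k in keys)


if __name__ == "__main__":
    uniform = [f"user-{i}" for i in range(10_000)]
    skewed = uniform + ["hot-user"] * 5_000   # one very popular partition
    print("uniform:", sorted(shard_load(uniform, 8).items()))
    print("skewed: ", sorted(shard_load(skewed, 8).items()))
```

Distinct keys spread roughly evenly, but every request for a single hot partition lands on the same shard, which is the imbalance the slide calls a fact of life.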
33. Thou shalt not block
[Diagram: Query, Commitlog, and Compaction each feed a queue into the userspace I/O scheduler, which sits in front of the disk. Labels: "Max useful disk concurrency", "I/O queued in FS/device", "No queues".]
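The idea in this diagram, queueing in userspace and handing the disk no more than its useful concurrency, can be sketched as follows. This is a deliberately simplified model and not Seastar's scheduler: the class name `UserspaceIOScheduler`, the round-robin policy, and the `max_in_flight` parameter are all assumptions made for illustration.

```python
from collections import deque


class UserspaceIOScheduler:
    """Keeps per-class queues in userspace and caps requests in flight at the disk."""

    def __init__(self, max_in_flight, classes):
        self.max_in_flight = max_in_flight          # assumed "max useful disk concurrency"
        self.queues = {name: deque() for name in classes}
        self.in_flight = 0

    def submit(self, io_class, request):
        """Queue a request in userspace; nothing is sent to the disk yet."""
        self.queues[io_class].append(request)

    def dispatch(self):
        """Round-robin over the classes until the concurrency cap is reached."""
        dispatched = []
        progress = True
        while self.in_flight < self.max_in_flight and progress:
            progress = False
            for name, queue in self.queues.items():
                if queue and self.in_flight < self.max_in_flight:
                    dispatched.append((name, queue.popleft()))
                    self.in_flight += 1
                    progress = True
        return dispatched

    def complete(self):
        """Record that one in-flight request finished, freeing a slot."""
        self.in_flight -= 1


if __name__ == "__main__":
    sched = UserspaceIOScheduler(4, ["query", "commitlog", "compaction"])
    for i in range(10):
        sched.submit("compaction", f"c{i}")
    sched.submit("query", "q0")
    print(sched.dispatch())
```

Because the backlog stays in the application's own queues, a late-arriving query is not stuck behind compaction requests that were already handed to the filesystem or device.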
34. Shard-aware I/O scheduler(s)
+ Each shard has independent scheduler
+ Capacity groups per NUMA zone
+ Shards grab capacity leases
Minimal, low cost coordination between shards!
38. Q&A
Stay in touch: @nukemberg
Join us at P99 CONF, October 6-7, 2021: p99conf.io
Sign up for our next Scylla Virtual Workshop, July 23rd: interactive sessions with our NoSQL solution architects (scylladb.com)
39. Thank You!
United States: 2445 Faber St, Suite #200, Palo Alto, CA USA 94303
Israel: Maskit 4, Herzliya, Israel 4673304
www.scylladb.com
@scylladb