This document describes a key-key-value (KKV) store that Discord built on ScyllaDB to store entities identified by composite primary keys. The store mitigates performance problems from tombstones and large partitions by using application tombstones and automatic partition splitting. It provides APIs for creating, getting, deleting, and range-scanning entities under a parent identifier. When a partition grows too large, the store doubles the shard count and copies data to the new shards in a resilient, resumable process. This generic datastore has supported over 20 use cases at Discord with no production incidents attributed to ScyllaDB.
1. Key-Key-Value Store: Generic NoSQL Datastore with Tombstone Reduction and Automatic Partition Splitting
Stephen Ma, Senior Software Engineer
2. Sining (Stephen) Ma
■ Sr. Software Engineer working on the persistence infrastructure team at Discord
■ Building and maintaining services on Discord’s ScyllaDB clusters
3. Agenda
● Context: Why did we want to build a Key-Key-Value (KKV) store?
● Features: What features does the KKV store provide?
● Implementation: What methods does the KKV store provide, and what were the implementation challenges?
5. Why?
● We want to support storing and querying entities identified by composite primary keys, where the composite key consists of a parent ID and an entity ID (a schema sketch follows this list).
a. For example, an emoji in a server is identified by a server ID and an emoji ID.
b. Data is stored as a binary blob.
c. Range scans of entity IDs within one parent ID are supported.
● A very easy and fast way to onboard new use cases.
a. No new table creation or database schema migration when iterating on the data model.
● Hide ScyllaDB-specific pitfalls from developers.
a. Mitigate the potential performance costs associated with tombstones.
b. Avoid storing too many entities in one ScyllaDB partition.
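The talk does not show the table definition, but a minimal sketch of a schema that would support this access pattern is below. All names are illustrative, and the shard number implied by the resharding slides later is omitted here for clarity.

// Hypothetical CQL schema for a KKV-style table, embedded as a Rust constant
// since the service itself is written in Rust. This is a sketch of the access
// pattern, not Discord's actual schema.
const CREATE_KKV_TABLE: &str = r#"
CREATE TABLE IF NOT EXISTS kkv.entities (
    namespace  text,    -- one namespace per use case
    parent_id  bigint,  -- e.g. a server ID
    entity_id  bigint,  -- e.g. an emoji ID
    data       blob,    -- opaque binary payload (protobuf, serialized by clients)
    PRIMARY KEY ((namespace, parent_id), entity_id)
)
"#;

Clustering on entity_id within the (namespace, parent_id) partition is what makes range scans of entity IDs under one parent cheap.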
6. Why ScyllaDB and not Bigtable?
● Bigtable pros:
a. Keys are stored as strings, so a key can hold an arbitrary number of parts with flexible data types.
■ e.g. 822219052402606132#857417264214442014
■ Bigtable supports scanning rows given a key prefix.
b. A GCP service, so administration is simple and operational work is easy.
● Bigtable cons:
a. GCP Cloud Bigtable is one zone per cluster.
b. Cloud Bigtable provides read-after-write consistency only if clients use single-cluster routing, but single-cluster routing does not provide auto-failover.
8. Features
● All entities of one use case are stored under one namespace.
● The stored data payload of an entity is in protobuf binary format.
● There is an upper limit on the data size per entity.
● No support for partially updating a subset of fields in an entity.
● No guarantees of synchronized reads/writes on an entity; guard your code accordingly.
9. Implementation
APIs provided by the KKV store & internal implementation
10. APIs Provided by the KKV Store
● Create a new entity or update an existing one.
● Get an entity by a parent ID and entity ID.
● Delete an entity given a parent ID and entity ID.
● Range-scan multiple entities under a parent ID (an API sketch follows this list).
a. Specify a start entity ID and/or an end entity ID.
b. Limit the number of entities returned.
c. Return unsorted entities,
d. or return sorted entities.
● Binary data serialization and deserialization are done by the clients.
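Since the service is written in Rust, here is a sketch of what this client API surface could look like. The trait, types, and signatures are assumptions for illustration, not Discord's actual interface (requires Rust 1.75+ for async fns in traits).

/// Options for a range scan under one parent ID.
pub struct ScanOptions {
    pub start_entity_id: Option<u64>, // lower bound on entity IDs, if any
    pub end_entity_id: Option<u64>,   // upper bound on entity IDs, if any
    pub limit: Option<usize>,         // cap on the number of entities returned
    pub sorted: bool,                 // sorted vs. unsorted results
}

pub trait KkvStore {
    type Error;

    /// Create a new entity or overwrite an existing one (no partial updates).
    async fn upsert(
        &self,
        parent_id: u64,
        entity_id: u64,
        data: Vec<u8>, // clients serialize the protobuf payload themselves
    ) -> Result<(), Self::Error>;

    /// Get an entity by parent ID and entity ID.
    async fn get(
        &self,
        parent_id: u64,
        entity_id: u64,
    ) -> Result<Option<Vec<u8>>, Self::Error>;

    /// Delete an entity given a parent ID and entity ID.
    async fn delete(&self, parent_id: u64, entity_id: u64) -> Result<(), Self::Error>;

    /// Range-scan entities under a parent ID, returning (entity_id, data) pairs.
    async fn scan(
        &self,
        parent_id: u64,
        opts: ScanOptions,
    ) -> Result<Vec<(u64, Vec<u8>)>, Self::Error>;
}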
11. Mitigating the Performance Impact of Tombstones with Application Tombstones
● Excessive tombstones can increase read latency or cause read failures in ScyllaDB, especially for range scan queries.
● When an entity is deleted, ScyllaDB doesn’t purge it but writes a tombstone.
● If a tombstone is not written to all 3 replicas, deleted data can be resurrected after the tombstones are removed as part of compaction.
● The application-tombstone design allows setting gc_grace_seconds to 0 (one possible mechanism is sketched below).
gc_grace_seconds = 0 !!
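The slides don't show the mechanism, but one way to read the application-tombstone idea is sketched below: a delete is issued as an ordinary write of a deleted marker rather than a CQL DELETE (which would create a database tombstone), and reads filter marked rows out. Because deletes become plain writes, a replica that misses one is fixed by normal write repair rather than tombstone garbage collection, which is what makes gc_grace_seconds = 0 safe. The statements and column names are invented.

// Hypothetical application-tombstone statements. Note: the payload is
// overwritten with an empty blob (0x) rather than set to null, because
// writing null in CQL would itself create a cell tombstone.
const APP_DELETE: &str = "UPDATE kkv.entities \
     SET deleted = true, data = 0x \
     WHERE namespace = ? AND parent_id = ? AND entity_id = ?";

const GET: &str = "SELECT data, deleted FROM kkv.entities \
     WHERE namespace = ? AND parent_id = ? AND entity_id = ?";

/// Reads treat a row carrying the application tombstone as absent.
fn visible(row: Option<(Vec<u8>, bool)>) -> Option<Vec<u8>> {
    match row {
        Some((data, false)) => Some(data), // live entity
        _ => None,                         // missing or app-tombstoned
    }
}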
13. Partition Resharding
If all entities under one parent ID are stored in one shard, we can end up with large partitions. Large partitions can cause hot-spotting and query timeouts.
Solution:
● Each shard has a limit on the maximum number of entities it may hold.
● When more than that number of entities are stored in one shard, send a resharding message to Google Pub/Sub.
● The resharding process doubles the total shard count and then copies all entities from the old shards into the new shards (the shard mapping is sketched below).
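The slides don't show the shard-addressing scheme, so the sketch below is one plausible reading: a hash of the entity ID picks the shard, and each (parent ID, shard) pair becomes its own ScyllaDB partition. The hash choice and the limit constant are invented.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-shard entity limit; the talk doesn't give the number.
const MAX_ENTITIES_PER_SHARD: u64 = 100_000;

/// Map an entity to a shard, so one parent's entities are spread across
/// `shard_count` partitions instead of a single large one.
fn shard_for(entity_id: u64, shard_count: u64) -> u64 {
    let mut h = DefaultHasher::new();
    entity_id.hash(&mut h);
    h.finish() % shard_count
}

/// Decide whether a shard has overflowed and a resharding message should be
/// published (the talk uses Google Pub/Sub for this).
fn needs_resharding(entity_count: u64) -> bool {
    entity_count > MAX_ENTITIES_PER_SHARD
}

/// Doubling keeps shard counts a power of two; after the resharding job
/// copies the data, every entity is re-addressed with the new count.
fn next_shard_count(current: u64) -> u64 {
    current * 2
}

One consequence of hash-based placement would be that a range scan has to fan out to every shard of the parent and merge the results, which fits the sorted/unsorted scan options above.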
14. Resharding Workflow
● Store the resharding state metadata in ScyllaDB between each step (see the state-machine sketch below).
● Updating the partition info table must grab a distributed lock.
● Copying or deleting data in a shard enables client speculative retry, allowing efficient retries on data copy or cleanup failures.
Workflow: Validate → Copy data from old partitions to new partitions → Update partition info → Clean up data in old partitions → Complete
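The five steps map naturally onto a persisted state machine; below is a minimal sketch with assumed names. Persisting the state in ScyllaDB between transitions is what lets a crashed or retried resharding job resume from the last completed step instead of starting over.

/// The resharding steps from this slide as a state machine; names are
/// illustrative. The current state is written to a metadata table in
/// ScyllaDB before each transition.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReshardingState {
    Validate,            // confirm the shard still needs resharding
    CopyData,            // copy entities from old partitions to new ones
    UpdatePartitionInfo, // publish the new shard count (under the distributed lock)
    CleanupOldData,      // delete entities left behind in the old partitions
    Complete,
}

impl ReshardingState {
    /// Advance one step; a failed step is retried from the persisted state.
    fn next(self) -> Self {
        use ReshardingState::*;
        match self {
            Validate => CopyData,
            CopyData => UpdatePartitionInfo,
            UpdatePartitionInfo => CleanupOldData,
            CleanupOldData | Complete => Complete,
        }
    }
}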
15. After Shipping …
● Written in Rust
● Serving 20+ use cases in production, and the number keeps growing
● A new use case takes one day to implement and ship to production
No production incident in the KKV store service has been caused by ScyllaDB
16. Thank You
Stay in Touch
Stephen Ma
siningma
www.linkedin.com/in/siningma/