This webinar introduces Project Alternator, ScyllaDB's open-source DynamoDB-compatible API. The presenters are Dor Laor, CEO of ScyllaDB, Avi Kivity, CTO of ScyllaDB, and Nadav Har'El. The agenda includes discussions of what Alternator is, live demos of running Alternator and migrating from DynamoDB to Alternator, and how the Alternator implementation works and its current compatibility. The goal of Alternator is to allow applications designed for DynamoDB to run on ScyllaDB for better performance and flexibility.
2. Presenters
Dor Laor
Dor Laor is the CEO of ScyllaDB. Previously, Dor was part of the founding team of the KVM
hypervisor at Qumranet, which was acquired by Red Hat. At Red Hat, Dor managed KVM
and Xen development for several years. Dor holds an MSc from the Technion and a PhD in
snowboarding.
Avi Kivity
Avi Kivity, CTO of ScyllaDB, is known mostly for starting the Kernel-based Virtual Machine (KVM)
project, the hypervisor underlying many production clouds. He worked for Qumranet and Red
Hat as KVM maintainer until December 2012. Avi is now CTO of ScyllaDB, a company that seeks to
bring the same kind of innovation to the public cloud space.
Nadav Har’El
Nadav Har’El has had a diverse 20-year career in computer programming and computer science. In
the past he worked on scientific computing, networking software, and information retrieval. In
recent years his focus has been on virtualization and operating systems: among other things, he
has worked on nested virtualization and exit-less I/O in KVM. Today he maintains the OSv kernel
and also works on Seastar and ScyllaDB.
3. About ScyllaDB
+ The Real-Time Big Data Database
+ Drop-in replacement for Cassandra
+ 10X the performance & low tail latency
+ New: Scylla Cloud, DBaaS
+ Open source and enterprise editions
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA; Herzelia, Israel
4. Agenda
What is Alternator
+ Why a DynamoDB-compatible API?
Live demos
+ Get started in 5 minutes in docker
+ CLI access to the DynamoDB API
+ Play a game on Alternator
+ Monitoring Alternator
Alternator implementation
+ How Scylla differs from DynamoDB
+ Current compatibility and limitations
Migrating from DynamoDB to Alternator
6. What is Alternator?
+ Scylla is an efficient NoSQL data store, announced September 2015.
+ It is open source, with enterprise support and cloud (SaaS) options.
+ Compatible with Cassandra and its APIs (CQL, Thrift).
+ Alternator: Adding a DynamoDB API to Scylla.
7. Scylla design principles
+ Efficient implementation for modern hardware
+ Throughput 10x higher than Cassandra
+ Linear scalability to many-core machines
+ Focused on modern fast SSDs
+ Low tail latency
+ Reliability
+ Autonomous database (minimal configuration)
We can apply these advantages to more than just Cassandra compatibility!
8. Why DynamoDB API?
DynamoDB is similar in design and data model to Scylla
More details on the similarities, and differences, later.
Amazon Dynamo (2007 paper)
Google Bigtable (2006 paper)
9. Why DynamoDB API?
DynamoDB is SaaS
+ SaaS is easy to get started with
SaaS trend in industry
+ DynamoDB popularity growing
Vs. Cassandra: [chart comparing Cassandra and DynamoDB]
10. Why DynamoDB API?
Better price/performance https://www.scylladb.com/product/benchmarks/dynamodb-benchmark/
+ Managed Scylla (“Scylla Cloud”) is 5 times cheaper than DynamoDB.
11. Why DynamoDB API?
Vendor lock-in
+ Users want to move their DynamoDB application to
+ a different cloud provider,
+ a private datacenter,
+ or a hybrid of several clouds or datacenters.
+ Scylla can be run on any cloud or datacenter.
13. Getting Alternator running in 5 minutes
+ Running Alternator is simply running Scylla with the “alternator-port” parameter
set, so that it listens for the DynamoDB API on that port.
+ You can get it running on your local machine in 5 minutes using docker:
docker run --name scylla -d -p 8000:8000 \
scylladb/scylla-nightly:alternator --alternator-port=8000
14. Let’s test that it really works
+ We can then run Amazon’s DynamoDB CLI tools against port 8000:
aws --endpoint-url 'http://172.17.0.1:8000' dynamodb create-table \
--table-name mytab \
--attribute-definitions AttributeName=key,AttributeType=S \
--key-schema AttributeName=key,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
+ The first attempt succeeds; a second attempt fails (as expected), because the table already exists.
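The same CLI call can be scripted. The following is a minimal sketch, assuming the aws CLI is installed and on the PATH; the helper name create_table_argv is hypothetical and only assembles the arguments shown on this slide.

```python
import shlex


def create_table_argv(endpoint_url, table_name, key_name="key"):
    """Build the argument list for the `aws dynamodb create-table` call on this slide.

    Hypothetical helper: it only assembles arguments, it does not run anything.
    """
    return [
        "aws", "--endpoint-url", endpoint_url,
        "dynamodb", "create-table",
        "--table-name", table_name,
        "--attribute-definitions", f"AttributeName={key_name},AttributeType=S",
        "--key-schema", f"AttributeName={key_name},KeyType=HASH",
        "--billing-mode", "PAY_PER_REQUEST",
    ]


argv = create_table_argv("http://172.17.0.1:8000", "mytab")
# Print a copy-pasteable, shell-quoted form of the command.
print(shlex.join(argv))
```

To actually execute the command, the list can be passed to subprocess.run(argv, check=True); running it twice should reproduce the "second attempt fails" behavior described on the slide.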
19. Live demo
+ Live workload
+ Cluster of three 30-core nodes in AWS, each in a separate AZ.
+ 1.1 TB of data: 1 billion items, 1.1 KB each.
+ YCSB workload, 50% read 50% write, Zipfian distribution.
21. Alternator implementation
Let’s survey some of the similarities and differences between Scylla and DynamoDB
+ How did we handle the differences?
+ What still needs to be done?
A much more detailed survey can be found in this document.
22. DynamoDB vs Scylla: similarities
+ Fast scalable NoSQL databases with real-time response and huge volume
+ Key/Value and Wide-row stores
+ Eventual and configurable consistency
+ Hashed partition keys - and also sort key for items inside a partition
23. DynamoDB vs Scylla: basic differences
+ DynamoDB is Cloud Native
+ Only available as a service, part of a huge deployment
+ Scylla has different options: OSS, Enterprise, as-a-service
+ More flexible - runs everywhere
+ Many configuration options - number of replicas, different consistency levels
+ Scylla has node operations, CLI, etc.
+ Scylla integrates with many OSS projects: Prometheus, Kafka, Spark (as first-class citizens)
+ Scylla’s units are servers (CPU shards). DynamoDB’s units are IOPS/Tablets
+ DynamoDB uses HTTP(S)/JSON, Scylla uses CQL
24. Data model - similarities
+ Data is divided into tables.
+ Data composed of Items in partitions.
Same as Scylla’s rows in partitions.
+ Item’s key has hash and range parts - like Scylla’s partition and clustering key.
(in DynamoDB API - only one of each)
+ Type of key columns defined in table’s schema (string, number, bytes)
25. Data model - differences
+ But DynamoDB items can have additional attributes - not defined in schema
+ Attributes may be scalars (string, number, etc.), lists, sets, or documents.
+ Similar to a JSON document.
+ Needs to be emulated in Scylla
+ Each table stores one map for top-level attributes
(mapping attribute name to JSON value).
+ Using a map instead of a single JSON value allows concurrent updates to
different top-level attributes of the same item.
+ TODO: support updates to deep attributes.
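The benefit of a map over a single JSON value can be illustrated with a toy model. This is a simplified sketch, not Alternator's storage code: it assumes each map entry carries its own write timestamp and that the newest write wins per entry; the Item class and the sample attribute names are hypothetical.

```python
import json


class Item:
    """Toy item: one map entry per top-level attribute, newest write wins per entry."""

    def __init__(self):
        # attribute name -> (timestamp, JSON-encoded value)
        self.attrs = {}

    def write(self, name, value, timestamp):
        current = self.attrs.get(name)
        if current is None or timestamp > current[0]:
            self.attrs[name] = (timestamp, json.dumps(value))

    def read(self):
        return {name: json.loads(encoded) for name, (_, encoded) in self.attrs.items()}


# Map model: two writers touch different top-level attributes at the same time.
item = Item()
item.write("name", "alice", timestamp=1)
item.write("scores", [1, 2, 3], timestamp=2)          # writer A
item.write("address", {"city": "Haifa"}, timestamp=2)  # writer B
print(item.read())  # both updates are kept

# Single-JSON model: each writer replaces the whole value it read earlier.
blob = {"name": "alice"}
writer_a = dict(blob, scores=[1, 2, 3])
writer_b = dict(blob, address={"city": "Haifa"})
stored = writer_b  # the last whole-item write wins, so writer A's update is lost
print(stored)
```

In the map model neither writer needs to read the item first, which matches the slide's point about concurrent updates to different top-level attributes.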
26. Write mechanism - differences
● DynamoDB natively supports Read-Modify-Write (RMW) updates:
○ Conditional updates (set a = 2 if a == 1)
○ Counters (set a = a + 1)
○ Attribute copy (set a = b)
● Scylla natively supports independent writes to different columns (CRDT):
○ Efficient updates to different columns - not requiring a read.
We are adding support for read-modify-write operations using lightweight transactions (LWT).
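Why read-modify-write needs extra care can be seen in a small simulation. This is a sketch under stated assumptions: the Store class is a hypothetical single-process stand-in, and its put_if is only atomic because the example is single-threaded; in a real cluster that atomicity has to come from serializing the operation.

```python
class Store:
    """Toy in-memory store; put_if stands in for a conditional update."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value

    def put_if(self, key, expected, value):
        # Write only if the current value still equals what the caller read.
        if self.data.get(key) != expected:
            return False
        self.data[key] = value
        return True


store = Store()

# Unconditional counter increments: both clients read 1, both write 2.
store.put("a", 1)
seen_by_x = store.get("a")
seen_by_y = store.get("a")
store.put("a", seen_by_x + 1)
store.put("a", seen_by_y + 1)
lost_update_result = store.get("a")  # 2: one increment was lost

# Conditional increments: the second client's write is rejected, so it retries.
store.put("a", 1)
seen_by_x = store.get("a")
seen_by_y = store.get("a")
store.put_if("a", seen_by_x, seen_by_x + 1)
if not store.put_if("a", seen_by_y, seen_by_y + 1):
    seen_by_y = store.get("a")
    store.put_if("a", seen_by_y, seen_by_y + 1)
conditional_result = store.get("a")  # 3: both increments applied

print(lost_update_result, conditional_result)
```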
27. Write mechanism - why is it different?
Scylla:
Log Structured Merge (LSM)
■ Efficient writes - without prior read
Quorum-based consistency
■ Writes done on several replicas independently
■ Concurrent read-modify-write operations are not serialized.
DynamoDB:
BTree
■ Writes include a free read, but are slow.
Leader model
■ One replica (“leader”) responsible for a write operation, so can serialize
read-modify-write operations.
28. Server
+ Each node in the Scylla cluster also answers DynamoDB API requests on the Alternator port.
+ So no need for separate sizing of an API translation cluster.
+ Same nodes can do both CQL and DynamoDB API
+ As needed, forwards the request to other nodes holding the requested data.
+ Uses internal Scylla function calls and RPC - no translation to CQL.
+ Client needs to send requests to the different Scylla nodes. Can be done via
DNS or HTTP load balancer.
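Besides DNS or an HTTP load balancer, a client could also spread requests itself. The following is a minimal sketch of client-side round-robin, which is an alternative not described on the slide; the class name and the node addresses are hypothetical.

```python
import itertools


class RoundRobinEndpoints:
    """Hand out node endpoints in rotation so requests are spread across the cluster."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)

    def pick(self):
        return next(self._cycle)


# Hypothetical addresses of three Scylla nodes listening on the Alternator port.
nodes = RoundRobinEndpoints([
    "http://10.0.0.1:8000",
    "http://10.0.0.2:8000",
    "http://10.0.0.3:8000",
])
picked = [nodes.pick() for _ in range(4)]
print(picked)
```

Each request would then be sent to the endpoint returned by pick(). A production client would also need to handle nodes that join, leave, or fail, which this sketch ignores.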
29. The DynamoDB API
DynamoDB API is: JSON-formatted requests and responses over HTTP/HTTPS.
Request:
POST / HTTP/1.1
...
X-Amz-Target: DynamoDB_20120810.CreateTable
{ "TableName": "mytab",
  "KeySchema":
    [{"AttributeName": "key", "KeyType": "HASH"}],
  "AttributeDefinitions":
    [{"AttributeName": "key", "AttributeType": "S"}],
  "BillingMode": "PAY_PER_REQUEST"
}
Response:
{ "TableDescription":
  { "AttributeDefinitions":
      [{"AttributeName": "key", "AttributeType": "S"}],
    "TableName": "mytab",
    "KeySchema":
      [{"AttributeName": "key", "KeyType": "HASH"}],
    "TableStatus": "ACTIVE",
    "CreationDateTime": 1569242964,
    "TableId":
      "91347050-de00-11e9-a100-000000000000"
  }
}
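The request on this slide can be assembled with nothing but the Python standard library. This is a minimal sketch, assuming an Alternator node at the address used in the earlier demo and assuming the node accepts unsigned requests (real DynamoDB requires a signed Authorization header, which is omitted here); build_request is a hypothetical helper.

```python
import json
import urllib.request


def build_request(endpoint_url, operation, payload):
    """Build (but do not send) a DynamoDB-style JSON-over-HTTP request."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        # The operation name travels in this header, not in the URL.
        "X-Amz-Target": f"DynamoDB_20120810.{operation}",
        "Content-Type": "application/x-amz-json-1.0",
    }
    return urllib.request.Request(endpoint_url, data=body, headers=headers, method="POST")


create_table = {
    "TableName": "mytab",
    "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
    "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
    "BillingMode": "PAY_PER_REQUEST",
}
req = build_request("http://172.17.0.1:8000/", "CreateTable", create_table)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send it, pass the request to urllib.request.urlopen(req) and parse the response body with json.load; the reply should have the shape of the TableDescription shown above.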
30. Alternator’s compatibility and limitations
+ See detailed current status in alternator.md, and issues in bug tracker:
+ https://github.com/scylladb/scylla/blob/master/docs/alternator/alternator.md
+ https://github.com/scylladb/scylla/issues
+ Several DynamoDB applications already work unmodified.
+ Some of the issues we will address for the GA:
+ Safe concurrent read-modify-write operations
+ A few operations and subcases of operations not yet supported
+ Authentication
+ SaaS on Scylla Cloud
+ DynamoDB streams (CDC)
32. Migrating from DynamoDB to Scylla
+ Install Scylla and load balancer (or wait for Scylla Cloud SaaS availability)
+ Point your application, written to use DynamoDB, at Scylla’s endpoint address
+ This is a preview release. Watch out for unsupported features and unsafe
concurrent RMW operations.
+ Migrate existing data from DynamoDB to Scylla using DynamoDB API
+ E.g. Spark migrator:
https://www.scylladb.com/2019/09/12/migrating-from-dynamodb-to-scylla
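For small tables, copying data through the DynamoDB API can also be hand-rolled. The following is a rough sketch, not the Spark migrator: it only shows how items already read from the source table might be grouped into BatchWriteItem request bodies, assuming DynamoDB's limit of 25 put requests per batch also applies to the target. The function name and sample items are hypothetical.

```python
def batch_write_bodies(table_name, items, batch_size=25):
    """Yield BatchWriteItem request bodies, each carrying at most batch_size items."""
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        yield {
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        }


# Hypothetical items in DynamoDB's typed JSON form, as a Scan would return them.
items = [{"key": {"S": f"k{i}"}, "value": {"N": str(i)}} for i in range(60)]
bodies = list(batch_write_bodies("mytab", items))
sizes = [len(body["RequestItems"]["mytab"]) for body in bodies]
print(sizes)
```

Each body would be posted with the X-Amz-Target header set to DynamoDB_20120810.BatchWriteItem. A real migration also has to page through Scan results and retry any UnprocessedItems returned by the target, which this sketch leaves out.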
33. Summary
+ Scylla is a very efficient, reliable, low latency NoSQL data store,
that began with Cassandra compatibility.
+ The Alternator project adds DynamoDB API compatibility to Scylla.
+ Can run existing applications designed for DynamoDB,
+ On any cloud or data center, not just on AWS.
+ Open source.
+ Currently a preview release, with some limitations, but GA expected soon.