Introducing
Project Alternator
Scylla’s Open-Source
DynamoDB-compatible API
WEBINAR
Presenters
Dor Laor
Dor Laor is the CEO of ScyllaDB. Previously, Dor was part of the founding team of the KVM
hypervisor at Qumranet, which was acquired by Red Hat. At Red Hat, Dor managed KVM
and Xen development for several years. Dor holds an MSc from the Technion and a PhD in
snowboarding.
Avi Kivity
Avi Kivity is known mostly for starting the Kernel-based Virtual Machine (KVM)
project, the hypervisor underlying many production clouds. He worked for Qumranet and Red
Hat as KVM maintainer until December 2012. Avi is now CTO of ScyllaDB, a company that seeks to
bring the same kind of innovation to the public cloud space.
Nadav Har’El
Nadav Har’El has had a diverse 20-year career in computer programming and computer science. In
the past he worked on scientific computing, networking software, and information retrieval. In
recent years his focus has been on virtualization and operating systems: among other things, he
has worked on nested virtualization and exit-less I/O in KVM. Today he maintains the OSv kernel
and also works on Seastar and ScyllaDB.
+ The Real-Time Big Data Database
+ Drop-in replacement for Cassandra
+ 10X the performance & low tail latency
+ New: Scylla Cloud, DBaaS
+ Open source and enterprise editions
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA; Herzelia, Israel
About ScyllaDB
What is Alternator
+ Why a DynamoDB-compatible API?
Live demos
+ Get started in 5 minutes in docker
+ CLI access to the DynamoDB API
+ Play a game on Alternator
+ Monitoring Alternator
Alternator implementation
+ How Scylla differs from DynamoDB
+ Current compatibility and limitations
Migrating from DynamoDB to Alternator
Agenda
What is Alternator?
+ Scylla is an efficient NoSQL data store, announced September 2015.
+ It is open source, with enterprise support and cloud (SaaS) options.
+ Compatible with Cassandra and its APIs (CQL, Thrift).
+ Alternator: Adding a DynamoDB API to Scylla.
Scylla design principles
+ Efficient implementation for modern hardware
+ Throughput 10x higher than Cassandra
+ Linear scalability to many-core machines
+ Focused on modern fast SSDs
+ Low tail latency
+ Reliability
+ Autonomous database (minimal configuration)
We can apply these advantages to more than just Cassandra compatibility!
Why DynamoDB API?
DynamoDB is similar in design and data model to Scylla
More details on the similarities, and differences, later.
Amazon Dynamo (2007 paper)
Google Bigtable (2006 paper)
Why DynamoDB API?
DynamoDB is SaaS
+ SaaS is easy to get started with, and SaaS is an industry trend
+ DynamoDB popularity is growing
[Chart: DynamoDB vs. Cassandra]
Why DynamoDB API?
Better price/performance
https://www.scylladb.com/product/benchmarks/dynamodb-benchmark/
+ Managed Scylla (“Scylla Cloud”) is 5 times cheaper than DynamoDB.
Why DynamoDB API?
Vendor lock-in
+ Users want to move their DynamoDB application to
+ a different cloud provider,
+ a private datacenter,
+ or a hybrid of several clouds or datacenters.
+ Scylla can be run on any cloud or datacenter.
Live 1-node
Alternator Demo
Getting Alternator running in 5 minutes
+ Running Alternator is simply running Scylla with the “alternator-port” parameter
set, which makes it listen for the DynamoDB API on that port.
+ You can get it running on your local machine in 5 minutes using docker:
docker run --name scylla -d -p 8000:8000 \
  scylladb/scylla-nightly:alternator --alternator-port=8000
Let’s test it really works
+ We can then run Amazon’s DynamoDB CLI tools against port 8000:
aws --endpoint-url 'http://172.17.0.1:8000' dynamodb create-table \
  --table-name mytab \
  --attribute-definitions AttributeName=key,AttributeType=S \
  --key-schema AttributeName=key,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
+ The first attempt succeeds; a second attempt fails, as expected, because the table already exists.
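The same CreateTable call can also be issued from Python. The following is a minimal sketch, assuming boto3 is installed and Alternator listens on a local endpoint such as http://localhost:8000. The helper names `create_table_params` and `create_table_once` are hypothetical, and the region and credential values are placeholders chosen only because boto3 insists on having some.

```python
def create_table_params(table_name, key_name="key"):
    """Build the same CreateTable arguments as the CLI command above."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table_once(endpoint_url, table_name):
    """Create the table; return False if the server says it already exists."""
    import boto3  # imported lazily so create_table_params works without boto3

    client = boto3.client(
        "dynamodb",
        endpoint_url=endpoint_url,
        region_name="us-east-1",        # placeholder, an assumption
        aws_access_key_id="none",       # placeholder, an assumption
        aws_secret_access_key="none",   # placeholder, an assumption
    )
    try:
        client.create_table(**create_table_params(table_name))
        return True
    except client.exceptions.ResourceInUseException:
        # DynamoDB reports an existing table with this error; the sketch
        # assumes Alternator does the same.
        return False
```

Calling `create_table_once("http://localhost:8000", "mytab")` twice would be expected to return True and then False, mirroring the two CLI attempts.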
Amazon’s Tic-Tac-Toe demo
https://github.com/awsdocs/amazon-dynamodb-developer-guide/blob/master/doc_source/TicTacToe.Phase1.md
+ An open-source Python application using DynamoDB
+ Written by Amazon, demonstrates many DynamoDB features
+ Uses Amazon’s AWS client library for Python (boto3)
+ python application.py --mode local --port 8000
Implements a multiplayer Tic-Tac-Toe game server.
Many users can connect, invite each other to games, and play against each other,
and keep score.
Alternator
monitoring
Monitoring
+ Observability
Live demo
+ Live workload
+ Cluster of three 30-core nodes in AWS, each in separate AZ.
+ 1.1 TB data - 1 billion items, 1.1KB each.
+ YCSB workload, 50% read 50% write, Zipfian distribution.
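The workload numbers above can be checked with a little arithmetic, and the access pattern can be imitated in a few lines. This is only an illustrative sketch: it takes 1 KB as 1000 bytes, and `zipfian_keys` and `read_write_mix` are hypothetical helpers, not YCSB's own generators.

```python
import random

ITEMS = 1_000_000_000      # 1 billion items
ITEM_BYTES = 1_100         # 1.1 KB per item, assuming 1 KB = 1000 bytes

total_bytes = ITEMS * ITEM_BYTES
total_tb = total_bytes / 10**12   # data size in decimal terabytes


def zipfian_keys(num_keys, count, exponent=1.0, seed=0):
    """Draw `count` key indexes in [0, num_keys) with Zipf-like popularity.

    The key of rank r (1-based) gets weight 1 / r**exponent, so a few
    "hot" keys receive most of the requests.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=count)


def read_write_mix(count, read_fraction=0.5, seed=0):
    """Label each of `count` operations as a read or a write."""
    rng = random.Random(seed)
    return ["read" if rng.random() < read_fraction else "write"
            for _ in range(count)]
```

With these figures, `total_tb` comes out at 1.1, matching the 1.1 TB quoted on the slide.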
Alternator
implementation
Alternator implementation
Let’s survey some of the similarities and differences of Scylla and DynamoDB
+ How did we handle the differences?
+ What still needs to be done?
A much more detailed survey can be found in this document.
DynamoDB vs Scylla: similarities
+ Fast scalable NoSQL databases with real-time response and huge volume
+ Key/Value and Wide-row stores
+ Eventual and configurable consistencies
+ Hashed partition keys - and also sort key for items inside a partition
DynamoDB vs Scylla: basic differences
+ DynamoDB is Cloud Native
+ Only available as a service, part of a huge deployment
+ Scylla has different options: OSS, Enterprise, as-a-service
+ More flexible - runs everywhere
+ Many configuration options - number of replicas, different consistency levels
+ Scylla has node operations, CLI, etc.
+ Scylla integrates with many OSS projects: Prometheus, Kafka, Spark (as first-class citizens)
+ Scylla’s units are servers (CPU shards). DynamoDB’s units are IOPS/Tablets
+ DynamoDB uses HTTP(S)/JSON, Scylla uses CQL
Data model - similarities
+ Data is divided into tables.
+ Data is composed of items in partitions,
the same as Scylla’s rows in partitions.
+ Item’s key has hash and range parts - like Scylla’s partition and clustering key.
(in DynamoDB API - only one of each)
+ Type of key columns defined in table’s schema (string, number, bytes)
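The correspondence between the two key models can be written down as a small mapping. This is a sketch for illustration only; `to_scylla_terms` is a hypothetical helper, not part of Alternator or of any AWS library.

```python
def to_scylla_terms(key_schema):
    """Map a DynamoDB KeySchema list to Scylla's key terminology."""
    result = {"partition_key": None, "clustering_key": None}
    for element in key_schema:
        key_type = element["KeyType"]
        if key_type == "HASH":
            slot = "partition_key"
        elif key_type == "RANGE":
            slot = "clustering_key"
        else:
            raise ValueError(f"unknown KeyType: {key_type}")
        if result[slot] is not None:
            # The DynamoDB API allows only one key of each kind.
            raise ValueError(f"only one {key_type} key is allowed")
        result[slot] = element["AttributeName"]
    if result["partition_key"] is None:
        raise ValueError("a HASH key is required")
    return result
```

For example, a schema with a HASH key `user` and a RANGE key `ts` maps to partition key `user` and clustering key `ts`.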
Data model - differences
+ But DynamoDB items can have additional attributes that are not defined in the schema
+ Attributes may be scalars (string, number, etc.), lists, sets, or nested documents.
+ Similar to a JSON document.
+ This needs to be emulated in Scylla
+ Each table holds one map for the top-level attributes
(mapping attribute name to JSON value).
+ A map, instead of a single JSON value, allows concurrent updates to
different top-level attributes of the same item.
+ TODO: support updates to deep attributes.
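The map-based emulation can be pictured with a toy model. This is a sketch under stated assumptions, not Alternator's actual storage code: the class name `EmulatedItem` is hypothetical, and the per-entry timestamps stand in for whatever conflict resolution the database applies to each map entry.

```python
import json


class EmulatedItem:
    """One item: key columns plus a map of top-level attribute -> JSON text."""

    def __init__(self, key):
        self.key = dict(key)
        self.attrs = {}   # attribute name -> (timestamp, JSON string)

    def write_attr(self, name, value, timestamp):
        """Write one top-level attribute; the newest timestamp wins per entry."""
        current = self.attrs.get(name)
        if current is None or timestamp > current[0]:
            self.attrs[name] = (timestamp, json.dumps(value))

    def to_document(self):
        """Rebuild the JSON-like document a DynamoDB client would see."""
        doc = dict(self.key)
        for name, (_, text) in self.attrs.items():
            doc[name] = json.loads(text)
        return doc
```

Because every top-level attribute is its own map entry, two writers touching `a` and `b` never overwrite each other. Changing something nested inside `b` still means rewriting the whole value of `b`, which is the gap the TODO above refers to.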
Write mechanism - differences
● DynamoDB natively supports Read-Modify-Write (RMW) updates:
○ Conditional updates (set a = 2 if a == 1)
○ Counters (set a = a + 1)
○ Attribute copy (set a = b)
● Scylla natively supports independent writes to different columns
(CRDT):
○ Efficient updates to different columns - not requiring a read.
We are adding support for read-modify-write operations using lightweight transactions (LWT).
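Why independent column writes need no prior read can be shown with a toy merge function. This is an illustrative sketch assuming a last-write-wins rule per column; `merge_cell` and `merge_rows` are hypothetical names, and real conflict resolution involves more than is modeled here.

```python
def merge_cell(a, b):
    """Pick the winning (timestamp, value) cell: the newer timestamp wins.

    Ties are broken on the textual form of the value so that the result
    does not depend on argument order.
    """
    return a if (a[0], repr(a[1])) >= (b[0], repr(b[1])) else b


def merge_rows(row_a, row_b):
    """Merge two views of the same row, column by column."""
    merged = dict(row_a)
    for column, cell in row_b.items():
        merged[column] = merge_cell(merged[column], cell) if column in merged else cell
    return merged
```

The merge gives the same result whichever write arrives first, so a replica can accept a write blindly. A conditional update or a counter has no such property, because its outcome depends on the value that was read.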
Write mechanism - why is it different?
Scylla
Log Structured Merge (LSM)
■ Efficient writes - without a prior read
Quorum-based consistency
■ Writes are done on several replicas independently
■ Concurrent read-modify-write operations are not serialized
DynamoDB
BTree
■ A write includes a free read, but writes are slow
Leader model
■ One replica (“leader”) is responsible for a write operation, so it can
serialize read-modify-write operations
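The practical consequence of serialization can be seen in a toy counter. This is a deliberately simplified single-process sketch: `Replica`, `unserialized_increments` and `leader_serialized_increments` are hypothetical names, and the interleaving is fixed by hand rather than produced by real concurrency.

```python
class Replica:
    """Holds a single counter value."""

    def __init__(self):
        self.value = 0


def unserialized_increments(replica, clients):
    """Every client reads first, then every client writes read_value + 1."""
    reads = [replica.value for _ in range(clients)]  # all reads precede all writes
    for seen in reads:
        replica.value = seen + 1                     # each write overwrites the last
    return replica.value


def leader_serialized_increments(replica, clients):
    """A single leader applies each read-modify-write one at a time."""
    for _ in range(clients):
        replica.value = replica.value + 1
    return replica.value
```

With three clients the unserialized version ends at 1, because two increments are lost, while the serialized version ends at 3.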
Server
+ Each node in the Scylla cluster also answers DynamoDB API requests on the Alternator port.
+ So no need for separate sizing of an API translation cluster.
+ Same nodes can do both CQL and DynamoDB API
+ As needed, forwards the request to other nodes holding the requested data.
+ Uses internal Scylla function calls and RPC - no translation to CQL.
+ Client needs to send requests to the different Scylla nodes. Can be done via
DNS or HTTP load balancer.
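Besides DNS or an HTTP load balancer, the spreading of requests could in principle be done in the client itself. The sketch below is one hypothetical way to do that, using plain round-robin. The class name, node addresses and default port are assumptions for illustration only.

```python
import itertools


class RoundRobinEndpoints:
    """Hand out node URLs in turn so requests spread across the cluster."""

    def __init__(self, nodes, port=8000, scheme="http"):
        if not nodes:
            raise ValueError("at least one node is required")
        # cycle() repeats the URLs forever in the given order.
        self._urls = itertools.cycle(
            [f"{scheme}://{node}:{port}" for node in nodes]
        )

    def next_url(self):
        """Return the endpoint URL to use for the next request."""
        return next(self._urls)
```

Each call to `next_url()` yields the next node's URL, which can then be passed as the endpoint when creating a DynamoDB client. Unlike a load balancer, this sketch does no health checking.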
The DynamoDB API
The DynamoDB API is JSON-formatted requests and responses over HTTP/HTTPS.
Request:
POST / HTTP/1.1
...
X-Amz-Target: DynamoDB_20120810.CreateTable
{ "TableName": "mytab",
  "KeySchema":
    [{"AttributeName": "key", "KeyType": "HASH"}],
  "AttributeDefinitions":
    [{"AttributeName": "key", "AttributeType": "S"}],
  "BillingMode": "PAY_PER_REQUEST"
}
Response:
{ "TableDescription":
  { "AttributeDefinitions":
      [{"AttributeName": "key", "AttributeType": "S"}],
    "TableName": "mytab",
    "KeySchema":
      [{"AttributeName": "key", "KeyType": "HASH"}],
    "TableStatus": "ACTIVE",
    "CreationDateTime": 1569242964,
    "TableId":
      "91347050-de00-11e9-a100-000000000000"
  }
}
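The request shown above can be assembled with nothing but the Python standard library. This is a minimal sketch: `build_dynamodb_request` is a hypothetical helper, the localhost URL is an assumption, and the `Content-Type` value is the one DynamoDB clients normally send rather than something shown on the slide. Request signing, which real DynamoDB requires, is left out.

```python
import json
import urllib.request


def build_dynamodb_request(endpoint_url, operation, payload):
    """Build (but do not send) a DynamoDB-style HTTP POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            # The operation name travels in a header, not in the URL.
            "X-Amz-Target": f"DynamoDB_20120810.{operation}",
            "Content-Type": "application/x-amz-json-1.0",
        },
    )


request = build_dynamodb_request(
    "http://localhost:8000/",   # assumed local Alternator endpoint
    "CreateTable",
    {
        "TableName": "mytab",
        "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
        "BillingMode": "PAY_PER_REQUEST",
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; the response body is then the JSON document shown above.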
Alternator’s compatibility and limitations
+ See detailed current status in alternator.md, and issues in bug tracker:
+ https://github.com/scylladb/scylla/blob/master/docs/alternator/alternator.md
+ https://github.com/scylladb/scylla/issues
+ Several DynamoDB applications already work unmodified.
+ Some of the issues we will address for the GA:
+ Safe concurrent read-modify-write operations
+ A few operations and subcases of operations not yet supported
+ Authentication
+ SaaS on Scylla Cloud
+ DynamoDB streams (CDC)
Migrating from
DynamoDB to Scylla
Migrating from DynamoDB to Scylla
+ Install Scylla and load balancer (or wait for Scylla Cloud SaaS availability)
+ Point your application, written to use DynamoDB, at Scylla’s endpoint address
+ This is a preview release. Watch out for unsupported features and unsafe
concurrent RMW operations.
+ Migrate existing data from DynamoDB to Scylla using DynamoDB API
+ E.g. Spark migrator:
https://www.scylladb.com/2019/09/12/migrating-from-dynamodb-to-scylla
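For small tables, the idea of copying data through the DynamoDB API can be sketched directly, without Spark. This is a simplified illustration and not the Spark migrator: `copy_table` is a hypothetical function, it assumes the target table already exists, that both clients expose the standard `scan` and `batch_write_item` calls (as boto3 DynamoDB clients do), and that the target supports both operations. It does no throttling or backoff.

```python
def copy_table(source, target, table_name, batch_size=25):
    """Copy every item of `table_name` from `source` to `target`.

    `source` and `target` are DynamoDB clients, e.g. boto3 clients, with
    the target created using an endpoint_url that points at Scylla.
    Returns the number of items copied.
    """
    copied = 0
    scan_args = {"TableName": table_name}
    while True:
        page = source.scan(**scan_args)
        items = page.get("Items", [])
        # BatchWriteItem accepts at most 25 put requests per call.
        for start in range(0, len(items), batch_size):
            chunk = items[start:start + batch_size]
            pending = [{"PutRequest": {"Item": item}} for item in chunk]
            while pending:
                response = target.batch_write_item(
                    RequestItems={table_name: pending}
                )
                # Retry whatever the server reports as unprocessed.
                pending = response.get("UnprocessedItems", {}).get(table_name, [])
            copied += len(chunk)
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return copied
        scan_args["ExclusiveStartKey"] = last_key   # continue from the next page
```

Writes that reach the source table while the copy is running are not captured by this sketch.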
Summary
+ Scylla is a very efficient, reliable, low latency NoSQL data store,
that began with Cassandra compatibility.
+ Project Alternator adds DynamoDB API compatibility to Scylla.
+ Can run existing applications designed for DynamoDB,
+ On any cloud or data center, not just on AWS.
+ Open source.
+ Currently a preview release, with some limitations, but GA expected soon.
Q&A
Stay in touch
Dor Laor: dor@scylladb.com, @DorLaor
Avi Kivity: avi@scylladb.com, @AviKivity
Nadav Har’El: nyh@scylladb.com, @nyharel
United States
1900 Embarcadero Road
Palo Alto, CA 94303
Israel
11 Galgalei Haplada
Herzelia, Israel
www.scylladb.com
@scylladb
Thank you

Gen AI in Business - Global Trends Report 2024.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Alternator webinar, September 2019

  • 2. Presenters: Dor Laor is the CEO of ScyllaDB. Previously, Dor was part of the founding team of the KVM hypervisor under Qumranet, which was acquired by Red Hat. At Red Hat, Dor managed KVM and Xen development for several years. Dor holds an MSc from the Technion and a PhD in snowboarding. Avi Kivity, CTO of ScyllaDB, is known mostly for starting the Kernel-based Virtual Machine (KVM) project, the hypervisor underlying many production clouds. He worked for Qumranet and Red Hat as KVM maintainer until December 2012. Avi is now CTO of ScyllaDB, a company that seeks to bring the same kind of innovation to the public cloud space. Nadav Har’El has had a diverse 20-year career in computer programming and computer science. In the past he worked on scientific computing, networking software, and information retrieval, but in recent years his focus has been on virtualization and operating systems. Among other things, he has worked on nested virtualization and exit-less I/O in KVM; today he maintains the OSv kernel and also works on Seastar and ScyllaDB.
  • 3. About ScyllaDB + The Real-Time Big Data Database + Drop-in replacement for Cassandra + 10X the performance & low tail latency + New: Scylla Cloud, DBaaS + Open source and enterprise editions + Founded by the creators of KVM hypervisor + HQs: Palo Alto, CA; Herzelia, Israel
  • 4. Agenda: What is Alternator + Why a DynamoDB-compatible API? Live demos + Get started in 5 minutes in Docker + CLI access to the DynamoDB API + Play a game on Alternator + Monitoring Alternator Alternator implementation + How Scylla differs from DynamoDB + Current compatibility and limitations Migrating from DynamoDB to Alternator
  • 6. What is Alternator? + Scylla is an efficient NoSQL data store, announced September 2015. + It is open source, with enterprise support and cloud (SaaS) options. + Compatible with Cassandra and its APIs (CQL, Thrift). + Alternator: Adding a DynamoDB API to Scylla.
  • 7. Scylla design principles + Efficient implementation for modern hardware + Throughput 10x higher than Cassandra + Linear scalability to many-core machines + Focused on modern fast SSDs + Low tail latency + Reliability + Autonomous database (minimal configuration) We can apply these advantages to more than just Cassandra compatibility!
  • 8. Why DynamoDB API? DynamoDB is similar in design and data model to Scylla. More details on the similarities and differences later. (Diagram: Amazon Dynamo, 2007 paper; Google Bigtable, 2006 paper.)
  • 9. Why DynamoDB API? DynamoDB is SaaS + SaaS is easy to get started with, and SaaS is an industry trend + DynamoDB's popularity is growing relative to Cassandra's (chart: Cassandra vs. DynamoDB)
  • 10. Why DynamoDB API? Better price/performance https://www.scylladb.com/product/benchmarks/dynamodb-benchmark/ + Managed Scylla (“Scylla Cloud”) is 5 times cheaper than DynamoDB.
  • 11. Why DynamoDB API? Vendor lock-in + Users want to move their DynamoDB application to + a different cloud provider, + a private datacenter, + or a hybrid of several clouds or datacenters. + Scylla can be run on any cloud or datacenter.
  • 13. Getting Alternator running in 5 minutes + Running Alternator is simply running Scylla with the parameter “alternator-port” set, so that it listens for the DynamoDB API on that port. + You can get it running on your local machine in 5 minutes using Docker: docker run --name scylla -d -p 8000:8000 scylladb/scylla-nightly:alternator --alternator-port=8000
  • 14. Let’s test that it really works + We can then run Amazon’s DynamoDB CLI tools against port 8000: aws --endpoint-url 'http://172.17.0.1:8000' dynamodb create-table --table-name mytab --attribute-definitions AttributeName=key,AttributeType=S --key-schema AttributeName=key,KeyType=HASH --billing-mode PAY_PER_REQUEST + The first attempt works; the second fails (as expected), since the table already exists.
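The same check can be scripted from Python. This is a minimal sketch, assuming boto3 is installed and Alternator listens at the endpoint used on the slide. The helper names, the region, and the credentials are placeholders invented here, and the sketch assumes Alternator reports the duplicate table the way DynamoDB does (`ResourceInUseException`).

```python
def create_table_kwargs(table_name):
    """Arguments equivalent to the `aws dynamodb create-table` call on the slide."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table_twice(endpoint_url="http://172.17.0.1:8000", table_name="mytab"):
    """Create the table, then try again and report the expected failure."""
    import boto3  # imported here so create_table_kwargs works without boto3

    client = boto3.client(
        "dynamodb",
        endpoint_url=endpoint_url,
        region_name="us-east-1",       # placeholder, boto3 requires some region
        aws_access_key_id="none",      # placeholder credentials (assumption:
        aws_secret_access_key="none",  # the preview does not verify them)
    )
    client.create_table(**create_table_kwargs(table_name))
    try:
        client.create_table(**create_table_kwargs(table_name))
    except client.exceptions.ResourceInUseException:
        return "second attempt failed: table already exists"
    return "second attempt unexpectedly succeeded"
```

Calling `create_table_twice()` against a fresh container should mirror the CLI demo: one successful creation followed by one rejected duplicate.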
  • 15. Amazon’s Tic-Tac-Toe demo https://github.com/awsdocs/amazon-dynamodb-developer-guide/blob/master/doc_source/TicTacToe.Phase1.md + An open-source Python application using DynamoDB + Written by Amazon, demonstrates many DynamoDB features + Uses Amazon’s AWS client library for Python (boto3) + Run with: python application.py --mode local --port 8000 + Implements a multiplayer Tic-Tac-Toe game server: many users can connect, invite each other to games, play against each other, and keep score.
  • 19. Live demo + Live workload + Cluster of three 30-core nodes in AWS, each in separate AZ. + 1.1 TB data - 1 billion items, 1.1KB each. + YCSB workload, 50% read 50% write, Zipfian distribution.
  • 21. Alternator implementation Let’s survey some of the similarities and differences of Scylla and DynamoDB + How did we handle the differences? + What still needs to be done? A much more detailed survey can be found in this document.
  • 22. DynamoDB vs Scylla: similarities + Fast scalable NoSQL databases with real-time response and huge volume + Key/Value and Wide-row stores + Eventual and configurable consistencies + Hashed partition keys - and also sort key for items inside a partition
  • 23. DynamoDB vs Scylla: basic differences + DynamoDB is cloud native + Only available as a service, part of a huge deployment + Scylla has different options: OSS, Enterprise, as-a-service + More flexible - runs everywhere + Many configuration options - number of replicas, different consistency levels + Scylla has node operations, a CLI, etc. + Scylla integrates with many OSS projects, such as Prometheus, Kafka, and Spark (as first-class citizens) + Scylla’s units are servers (CPU shards); DynamoDB’s units are IOPS/tablets + DynamoDB uses HTTP(S)/JSON, Scylla uses CQL
  • 24. Data model - similarities + Data is divided into tables. + Data is composed of items in partitions, the same as Scylla’s rows in partitions. + An item’s key has hash and range parts, like Scylla’s partition key and clustering key (the DynamoDB API allows only one of each). + The types of key columns are defined in the table’s schema (string, number, bytes).
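The shared data model can be pictured with a toy in-memory table. This is an illustrative sketch only, not code from Scylla or DynamoDB: the class name, the shard count, and the use of MD5 as a stand-in hash function are all assumptions made for the example.

```python
import bisect
import hashlib


class ToyTable:
    """Items are grouped into partitions by hash key and ordered by range key."""

    def __init__(self, num_shards=4):
        self.num_shards = num_shards
        # shard index -> {hash_key: list of (range_key, item), sorted by range_key}
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_of(self, hash_key):
        # Stand-in hash function; real systems use their own partitioners.
        digest = hashlib.md5(hash_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards

    def put(self, hash_key, range_key, item):
        partition = self.shards[self._shard_of(hash_key)].setdefault(hash_key, [])
        keys = [rk for rk, _ in partition]
        i = bisect.bisect_left(keys, range_key)
        if i < len(partition) and partition[i][0] == range_key:
            partition[i] = (range_key, item)  # same full key: overwrite
        else:
            partition.insert(i, (range_key, item))

    def query(self, hash_key):
        """Return all items of one partition, in range-key order."""
        partition = self.shards[self._shard_of(hash_key)].get(hash_key, [])
        return [item for _, item in partition]
```

The hash key alone decides where a partition lives, while the range key only orders items inside that partition, which is why a query for one hash key touches a single partition.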
  • 25. Data model - differences + But DynamoDB items can have additional attributes that are not defined in the schema. + Attributes may be scalars (string, number, etc.), lists, sets, or documents. + Similar to a JSON document. + This needs to be emulated in Scylla: + Each table holds one map for top-level attributes (mapping attribute name to JSON value). + A map, instead of a single JSON value, allows concurrent updates to different top-level attributes of the same item. + TODO: support updates to deep attributes.
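The benefit of a map over a single JSON value can be sketched as follows. This is a simplified model, not Alternator's storage code: the class, the integer timestamps, and the newest-timestamp-wins rule per map entry are assumptions chosen to illustrate the idea.

```python
import json


class EmulatedItem:
    """One row: a key plus a map of attribute name -> (timestamp, JSON text)."""

    def __init__(self, key):
        self.key = key
        self.attrs = {}

    def write(self, name, value, timestamp):
        """Blind write of one top-level attribute; the newest timestamp wins."""
        current = self.attrs.get(name)
        if current is None or timestamp >= current[0]:
            self.attrs[name] = (timestamp, json.dumps(value))

    def merge(self, other):
        """Combine another replica's copy of the item, entry by entry."""
        for name, (timestamp, text) in other.attrs.items():
            self.write(name, json.loads(text), timestamp)

    def as_document(self):
        return {name: json.loads(text) for name, (_, text) in self.attrs.items()}
```

Two writers that touch different top-level attributes never overwrite each other, because each attribute is its own map entry. A write to a nested field still replaces the whole top-level value in this model, which is the gap the "deep attributes" TODO refers to.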
  • 26. Write mechanism - differences ● DynamoDB natively supports read-modify-write (RMW) updates: ○ Conditional updates (set a = 2 if a == 1) ○ Counters (set a = a + 1) ○ Attribute copy (set a = b) ● Scylla natively supports independent writes to different columns (CRDT): ○ Efficient updates to different columns - not requiring a read. ● We are adding support for read-modify-write operations using lightweight transactions (LWT).
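For reference, the three read-modify-write examples on this slide could be written as DynamoDB `UpdateItem` request bodies roughly as below. This is a sketch: the helper function, the table name `mytab`, and the key value `k1` are made up for illustration, and whether a given Alternator release accepts each expression is not something the slides state.

```python
import json


def update_item_request(update_expression, names, values=None, condition=None,
                        table_name="mytab", key_value="k1"):
    """Build the JSON body of an UpdateItem call (hypothetical helper)."""
    body = {
        "TableName": table_name,
        "Key": {"key": {"S": key_value}},
        "UpdateExpression": update_expression,
        "ExpressionAttributeNames": names,
    }
    if values:
        body["ExpressionAttributeValues"] = values
    if condition:
        body["ConditionExpression"] = condition
    return body


# Conditional update: set a = 2 if a == 1
conditional = update_item_request(
    "SET #a = :new", {"#a": "a"},
    values={":new": {"N": "2"}, ":old": {"N": "1"}},
    condition="#a = :old",
)

# Counter: set a = a + 1
counter = update_item_request(
    "SET #a = #a + :one", {"#a": "a"}, values={":one": {"N": "1"}}
)

# Attribute copy: set a = b
attribute_copy = update_item_request("SET #a = #b", {"#a": "a", "#b": "b"})

if __name__ == "__main__":
    print(json.dumps(conditional, indent=2))
```

All three need the current value of the item before the write can be applied, which is what distinguishes them from a blind write to a column.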
  • 27. Write mechanism - why is it different? Scylla: ■ Log Structured Merge (LSM): efficient writes, without a prior read. ■ Quorum-based consistency: writes are done on several replicas independently, so concurrent read-modify-write operations are not serialized. DynamoDB: ■ BTree: writes include a free read, but are slow. ■ Leader model: one replica (“leader”) is responsible for a write operation, so it can serialize read-modify-write operations.
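Why serialization matters can be shown with a deliberately tiny simulation of a counter increment. It is a sketch of one possible interleaving, not a model of either database's internals; both function names are invented here.

```python
def unserialized_increments(initial, clients):
    """Every client reads before any client writes, as can happen when
    concurrent read-modify-write operations are not serialized."""
    value = initial
    reads = [value for _ in range(clients)]  # all reads see the old value
    for seen in reads:
        value = seen + 1                     # each write overwrites the last
    return value


def leader_serialized_increments(initial, clients):
    """A single leader applies each read-modify-write as one step, in turn."""
    value = initial
    for _ in range(clients):
        seen = value      # read ...
        value = seen + 1  # ... and write, with nothing in between
    return value
```

With three clients starting from 0, the unserialized interleaving ends at 1 (two increments are lost), while the serialized version ends at 3.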
  • 28. Server + Each node in a Scylla cluster also answers DynamoDB API requests on the Alternator port. + So there is no need for separate sizing of an API translation cluster. + The same nodes can serve both CQL and the DynamoDB API. + As needed, a node forwards the request to other nodes holding the requested data. + Uses internal Scylla function calls and RPC - no translation to CQL. + The client needs to send requests to the different Scylla nodes. This can be done via DNS or an HTTP load balancer.
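The slide names DNS or an HTTP load balancer for spreading requests across nodes. As a rough illustration of the same idea done in the client instead, here is a round-robin endpoint picker. It is a sketch under assumptions: the class is hypothetical, the node addresses in the usage note are made up, and port 8000 is simply the port used in the Docker example.

```python
import itertools


class RoundRobinEndpoints:
    """Hand out node URLs in rotation so requests are spread over the cluster."""

    def __init__(self, hosts, port=8000):
        if not hosts:
            raise ValueError("at least one host is required")
        self._urls = itertools.cycle([f"http://{host}:{port}" for host in hosts])

    def next_url(self):
        return next(self._urls)
```

For example, `RoundRobinEndpoints(["10.0.0.1", "10.0.0.2", "10.0.0.3"])` yields the three node URLs in order and then wraps around. Because any node can forward a request to the node holding the data, the client does not need to know where an item lives.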
  • 29. The DynamoDB API + The DynamoDB API is JSON-formatted requests and responses over HTTP/HTTPS. + Request: POST / HTTP/1.1 ... X-Amz-Target: DynamoDB_20120810.CreateTable { "TableName": "mytab", "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}], "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}], "BillingMode": "PAY_PER_REQUEST" } + Response: { "TableDescription": { "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}], "TableName": "mytab", "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}], "TableStatus": "ACTIVE", "CreationDateTime": 1569242964, "TableId": "91347050-de00-11e9-a100-000000000000" } }
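To make the wire format concrete, the request on this slide can be assembled with nothing but the Python standard library. This is a sketch: the function names are invented, the request is unsigned (real DynamoDB requires AWS Signature Version 4; the sketch assumes the Alternator preview does not enforce it, since authentication is listed among the GA items), and the endpoint is the one from the Docker demo.

```python
import json
import urllib.request

CREATE_TABLE = {
    "TableName": "mytab",
    "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
    "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
    "BillingMode": "PAY_PER_REQUEST",
}


def build_request(endpoint_url, operation, payload):
    """Build an unsigned DynamoDB-style POST; the operation goes in X-Amz-Target."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/x-amz-json-1.0",
            "X-Amz-Target": f"DynamoDB_20120810.{operation}",
        },
    )


def send(request):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_request("http://172.17.0.1:8000/", "CreateTable", CREATE_TABLE)
    print(send(request))
```

Every operation uses the same `POST /`; only the `X-Amz-Target` header and the JSON body change.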
  • 30. Alternator’s compatibility and limitations + See detailed current status in alternator.md, and issues in bug tracker: + https://github.com/scylladb/scylla/blob/master/docs/alternator/alternator.md + https://github.com/scylladb/scylla/issues + Several DynamoDB applications already work unmodified. + Some of the issues we will address for the GA: + Safe concurrent read-modify-write operations + A few operations and subcases of operations not yet supported + Authentication + SaaS on Scylla Cloud + DynamoDB streams (CDC)
  • 32. Migrating from DynamoDB to Scylla + Install Scylla and a load balancer (or wait for Scylla Cloud SaaS availability) + Point your application, written to use DynamoDB, at Scylla’s endpoint address + This is a preview release. Watch out for unsupported features and unsafe concurrent RMW operations. + Migrate existing data from DynamoDB to Scylla using the DynamoDB API + E.g. Spark migrator: https://www.scylladb.com/2019/09/12/migrating-from-dynamodb-to-scylla
  • 33. Summary + Scylla is a very efficient, reliable, low-latency NoSQL data store that began with Cassandra compatibility. + The Alternator project adds DynamoDB API compatibility to Scylla. + Can run existing applications designed for DynamoDB, + On any cloud or data center, not just on AWS. + Open source. + Currently a preview release, with some limitations, but GA expected soon.
  • 35. Q&A. Stay in touch: Dor Laor (dor@scylladb.com, @DorLaor); Avi Kivity (avi@scylladb.com, @AviKivity); Nadav Har’El (nyh@scylladb.com, @nyharel)
  • 36. Thank you. United States: 1900 Embarcadero Road, Palo Alto, CA 94303. Israel: 11 Galgalei Haplada, Herzelia, Israel. www.scylladb.com, @scylladb