3. Compose – Database as a Service
• We already host MongoDB, PostgreSQL, Redis,
Elasticsearch, RethinkDB, etcd and RabbitMQ
• Today, we’re announcing ScyllaDB on Compose
§ On Compose Enterprise today
§ On Compose generally very soon
5. Compose – Phase 1
• Founded as MongoHQ in 2009
§ The first hosted MongoDB-as-a-service company
§ Y Combinator Class of 2011
• Early adopters of containers
§ Containers helped build the infrastructure to offer DBaaS
6. Early problems…
• Static deployments, lost a lot of the magic of containers
• Pre-provisioned deployments
• Ran the full system inside the container: cron, sshd,
everything
• Everything ran on the public internet
• This only worked for Mongo
10. Compose – Phase 2
• Elastic deployments
§ More dynamic containers
§ Deploy & destroy containers on demand
• Bonus:
§ Orchestration becomes the authority for business/DB logic
11. But…
• Everything was still facing the public internet
• Orchestration layer became bloated
• This infrastructure limited us
• Still running the entire OS inside the container
• And, of course…
14. What needed to change…
• At Compose, each database needs:
§ To be on a private network
§ Not accessible to the internet
§ Deploy & destroy containers on demand
§ Logic-less orchestration
§ Lightweight dynamic containers
15. The Bad News
• From the outside, each database looks different.
§ Different query languages
§ Different set of drivers
§ Configured differently
§ Needs a different environment to run in
• Finding common ground is a daunting problem.
16. The Good News
• From the outside, all databases need the same things:
§ Scaling
§ Clustering and Failover
§ Backup and restore data
§ Quiescing nicely
§ Private networks
§ Operational health checks
17. Working for the future
• The next platform we built would need all those things
• So we refined the concepts
• And we took inspiration from the ideas in the Twelve-Factor App - https://12factor.net/
• So we created…
21. Configuration
• Store configurable parameters in the container’s
environment
§ This encourages reuse of container images
• DBs need to be run with one or more configuration files
§ Configuration files can be built quickly & repeatedly during the
pre-start `Configure` process by putting environmental data into
templates
• For Scylla we write the scylla.yaml file and configure the
listen addresses and cluster names.
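A minimal sketch of that pre-start `Configure` step, assuming Python and its standard `string.Template`. The environment variable names (`CLUSTER_NAME`, `LISTEN_ADDRESS`, `RPC_ADDRESS`) and the function names are hypothetical, and only three scylla.yaml keys are shown; this is not Compose's actual tooling.

```python
import os
from string import Template

# Hypothetical template: a few scylla.yaml keys, chosen for illustration only.
SCYLLA_YAML_TEMPLATE = Template(
    "cluster_name: '$CLUSTER_NAME'\n"
    "listen_address: $LISTEN_ADDRESS\n"
    "rpc_address: $RPC_ADDRESS\n"
)


def render_config(env):
    """Fill the template from environment-style data.

    Raises KeyError if a required variable is missing, so a container
    with incomplete configuration fails before the database starts.
    """
    return SCYLLA_YAML_TEMPLATE.substitute(env)


def configure(path, env=None):
    """Pre-start step: render the config, write it to path, return the text."""
    text = render_config(os.environ if env is None else env)
    with open(path, "w") as handle:
        handle.write(text)
    return text
```

Because the image carries only the template, the same image can be reused for any deployment; everything that differs arrives through the environment.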
22. Deployments and Processes
• A usefully redundant database contains more than one
moving piece.
§ Decompose the database into a collection of useful and distinct
processes
§ Run these processes as possibly stateful services in their own
environment/container
• Scylla is 3 data nodes and 3 proxy nodes
23. Disposability
• Database containers should be entirely ephemeral
§ Easily created and destroyed.
• Destroying a container should not destroy the data
§ The database instance and the data have different lifecycles
§ Whatever happens to the database instance, the data has to
live on
24. Affinity and Storage
• There needs to be affinity between a database and the
storage
§ Database nodes can have one or more attached volumes that are
persistent on container restart
§ These volumes have a different lifecycle than the container
§ Data volumes should be accessible from the host
25. Network and Portal Access
• Databases should live on their own private network
• Access should only happen through specialized “portal”
containers
§ Do not unnecessarily expose things on public networks
§ Only expose specific, hardened entry points on portal containers
via port binding
§ Portal containers should terminate SSL
• For Scylla, each data node is matched with a portal
controlling outside access to it
26. Fixed Network Identity - 1
• The naming and addressing of the entire system
should be fixed before creation
• Container addresses should be static across container
restarts
• This includes all elements that make up the network's
configuration
27. Fixed Network Identity - 2
• Discovering the network after the fact is problematic.
• Scylla, because of client driver auto discovery, needs the
same number of portals as there are database nodes.
• Each portal needs to know the address of its partner
database node.
• Each database node also needs to know the address of the
portal.
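One way to picture fixing the identity before creation is to compute the whole address plan up front, so each portal and its partner data node already know about each other when they start. A rough sketch in Python, assuming a hypothetical naming scheme (`<deployment>-data-N`, `<deployment>-portal-N`) and a private subnet chosen for illustration:

```python
import ipaddress


def plan_network(deployment, subnet, data_nodes=3):
    """Assign names and static addresses before any container exists.

    Returns one entry per data node, each paired with exactly one portal.
    The plan is deterministic: the same inputs always give the same plan,
    so addresses survive container restarts.
    """
    hosts = iter(ipaddress.ip_network(subnet).hosts())
    plan = []
    for index in range(data_nodes):
        data_ip = str(next(hosts))
        portal_ip = str(next(hosts))
        plan.append({
            # The data node is told which portal fronts it...
            "data": {"name": f"{deployment}-data-{index}",
                     "address": data_ip, "portal": portal_ip},
            # ...and the portal is told which data node it forwards to.
            "portal": {"name": f"{deployment}-portal-{index}",
                       "address": portal_ip, "backend": data_ip},
        })
    return plan
```

For example, `plan_network("demo", "10.0.0.0/28")` yields three data/portal pairs, matching the one-portal-per-node layout that Scylla's driver auto discovery calls for.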
28. Scaling
• First scale up the container
§ In a hosting environment, there should be plenty of leeway to
expand a container’s resources.
§ Databases prefer to stay up so just add resources
• Then scale deployments horizontally
§ Scylla lets us do this easily
§ PostgreSQL, Redis and MySQL are difficult to scale horizontally
§ Scaling horizontally and moving storage can be costly
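The ordering above (up first, then out) can be expressed as a small decision function. This is only an illustrative sketch in Python: the memory-based inputs, the action names and the final fallback are assumptions, not a description of Compose's scheduler.

```python
def next_scaling_step(current_mb, needed_mb, host_free_mb, scales_horizontally):
    """Pick the next action: grow the container first, add nodes only after."""
    if needed_mb <= current_mb:
        return "none"
    # Scaling up keeps the database running, so try it first.
    if needed_mb - current_mb <= host_free_mb:
        return "scale_up"
    # Only databases that cluster easily (Scylla, for example) go wider.
    if scales_horizontally:
        return "scale_out"
    # Hypothetical last resort: moving storage is the costly path.
    return "move_to_larger_host"
```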
29. Logs & Metrics
• Database-specific metrics and log collection should be
done within the deployment from an extra container
• Collect logs and metrics from all nodes
• Each node should provide a stream of logs to stdout
§ Easier to collect on container hosts
§ Standardized practice across all nodes
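As a small illustration of the stdout convention, here is a sketch using Python's standard `logging` module. The node name and the log line format are hypothetical; the point is only that every node writes one stream that the container host can pick up.

```python
import logging
import sys


def make_stdout_logger(node_name, stream=None):
    """Return a logger that writes one formatted line per event to stdout.

    The stream parameter exists so the output can be redirected in tests.
    """
    logger = logging.getLogger(node_name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    # Replace any earlier handlers so repeated calls do not duplicate lines.
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

A collector container in the same deployment can then treat every node identically, whatever database is running inside it.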
30. Database Orchestration
• Encapsulate administration functions within the container
§ Push the logic out of the orchestration and into the containers
• Manage the deployment through the use of recipes
§ Sequence of ordered operations
§ Recipes don’t know about the internal state of the database
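A recipe in this sense can be sketched as plain data plus a runner that knows nothing about the database. The following Python is an assumption-laden illustration: the step format, the operation names and the `execute` callback (standing in for "ask the container to run one of its own admin functions") are all hypothetical.

```python
def run_recipe(recipe, containers, execute):
    """Run each step in order against its target container.

    The runner never inspects database state. It only hands an operation
    name to the container and stops at the first step reporting failure.
    """
    completed = []
    for step in recipe:
        ok = execute(containers[step["target"]], step["operation"])
        if not ok:
            return {"ok": False, "failed": step, "completed": completed}
        completed.append(step)
    return {"ok": True, "failed": None, "completed": completed}
```

A recipe is then just an ordered list such as `[{"target": "data-0", "operation": "quiesce"}, {"target": "data-0", "operation": "backup"}]`; what "quiesce" means for a given database lives inside that database's container image.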
31. Codebase
• Use one image
§ with a deterministic build process
§ can be used to create many running instances
• Different images should be provided for different database
versions
§ For example, Scylla 1.0 vs 1.2 vs 1.3 would be three separate
images
32. Tools and Versions
• DB administration tools
§ Should be versioned
§ Kept in lock-step with the version of the database
- Avoids exposure to changes in how admin tools function.
§ Thrift support in Scylla was released in 1.3
- The tools to administer that were added only to that version
• No restarts on upgrades to administration tooling
§ Consider overlaying your tooling on top of running databases
34. In practice at Compose today
• We apply these factors throughout our platform
• It works for a wide range of database technologies
• It gives us reliable, repeatable, resilient systems
• And now Scylla is coming to that platform