Riak: Seattle Scalability Meetup, August 2011
1. The Best Open Source Database
You Will Ever Have The
Pleasure Of Running In Production
Seattle Scalability Meetup
August 24, 2011
2. Who Am I?
• Mark Phillips
• Community Manager
• Basho Technologies
• @pharkmillups
3. What is Riak?
• a database
• a key/value store
• distributed
• fault-tolerant
• scalable
• Dynamo-inspired
• used by startups
• used by FORTUNE 100 companies
• written (primarily) in Erlang
• pronounced “REE-awk”
• not the right fit for every project and app
4. Riak’s Design Goals
• Simple, Elegant Scalability
• Ease of operations
• Resiliency in the face of failure
• Plumbing
8. Building Clusters is Dead Simple
$ riak start
# to get your first node running
$ riak start
# to get your second node running
$ riak-admin join riak@192.168.1.10
# send a join request to an existing Riak node
# lather, rinse, repeat until you’re web scale
9. APIs
• HTTP - Richly Featured RESTful API
• Protocol Buffers - Courtesy of Google
10. Current HTTP API
(made possible by webmachine)
Store
POST /riak/bucket # Riak-defined key
PUT /riak/bucket/key # User-defined key
Fetch
GET /riak/bucket/key
Delete
DELETE /riak/bucket/key
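The mapping above can be sketched as a small helper; this is an illustrative function of our own (`riak_request` is not part of any Riak client library), showing how each operation pairs an HTTP method with a path:

```python
def riak_request(op, bucket, key=None):
    """Return (method, path) for a basic Riak HTTP API call.

    Illustrative sketch only -- real clients would also set
    headers (Content-Type, X-Riak-Vclock) and a request body.
    """
    if op == "store" and key is None:
        return ("POST", f"/riak/{bucket}")        # Riak-defined key
    if op == "store":
        return ("PUT", f"/riak/{bucket}/{key}")   # user-defined key
    if op == "fetch":
        return ("GET", f"/riak/{bucket}/{key}")
    if op == "delete":
        return ("DELETE", f"/riak/{bucket}/{key}")
    raise ValueError(f"unknown operation: {op}")

print(riak_request("store", "talks", "seattle"))
# → ('PUT', '/riak/talks/seattle')
```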
11. How Riak Organizes Data
Bucket/Key/Value
• Bucket - top-level namespace in Riak. Used for
basic data organization; also used to set
properties for keys in that bucket (n_val, commit
hooks, choice of backend, etc.)
• Key - binary identifier for a value
• Value - any kind of data you can imagine
12. What we borrowed from Dynamo:
• Gossip protocol - ring membership, partition assignment
• Consistent Hashing - division of work
• Vector Clocks - versioning, conflict resolution
• Read Repair - anti-entropy
• Hinted Handoff - failure masking, data migration
The paper was not a spec for a system but one
approach; we learned from it but deviated where
necessary
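Consistent hashing, the "division of work" item above, can be sketched in a few lines. This is a toy model in the spirit of Riak's ring (a SHA-1 keyspace of 2^160 divided into equal partitions); the names `key_to_partition` and `preflist` are ours, not Riak's:

```python
import hashlib

RING_SIZE = 2 ** 160     # SHA-1 keyspace, as in Riak's ring
NUM_PARTITIONS = 64      # assumed partition count for the sketch

def key_to_partition(bucket, key):
    """Hash bucket/key onto the ring, return its partition index."""
    h = int(hashlib.sha1(f"{bucket}/{key}".encode()).hexdigest(), 16)
    return h // (RING_SIZE // NUM_PARTITIONS)

def preflist(bucket, key, n_val=3):
    """The N successive partitions (vnodes) holding the replicas."""
    p = key_to_partition(bucket, key)
    return [(p + i) % NUM_PARTITIONS for i in range(n_val)]
```

Because every node agrees on the hash function and the ring layout, any node can compute the preference list for any key without coordination.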
14. N, R, W Values
• N = number of replicas to store (on distinct
physical nodes)
• R = number of replica responses needed for
a successful read
• W = number of replica responses needed for
a successful write
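The quorum semantics above can be modeled in a couple of lines; this is our own toy sketch (not Riak code) showing why a request can still succeed when some of the N replicas are unreachable:

```python
def quorum_met(responses, required):
    """True once `required` of the N replicas have acknowledged."""
    return sum(responses) >= required

replicas_up = [True, True, False]            # one of N=3 replicas is down
print(quorum_met(replicas_up, required=2))   # W=2 write still succeeds: True
print(quorum_met(replicas_up, required=3))   # W=3 write fails: False
```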
17. Writing Things to Disk
(because that’s what databases do)
Riak allows for pluggable local storage
• Bitcask (recommended, ships as default)
• LevelDB (recently released by Google)
• Innostore
• Several other specialty backends
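Bitcask's core idea, an append-only log on disk plus an in-memory "keydir" pointing at the latest entry for each key, can be sketched as follows. This is a deliberately simplified model (real Bitcask stores file/offset pairs and persists to disk):

```python
class TinyBitcask:
    """Toy append-only store in the spirit of Bitcask."""

    def __init__(self):
        self.log = []      # append-only "disk" log of (key, value)
        self.keydir = {}   # in-memory pointer to latest entry per key

    def put(self, key, value):
        self.log.append((key, value))          # never overwrite in place
        self.keydir[key] = len(self.log) - 1   # update pointer on each write

    def get(self, key):
        return self.log[self.keydir[key]][1]   # one lookup via keydir

    def merge(self):
        """Compact the log, keeping only the latest value per key."""
        self.log = [(k, self.log[i][1]) for k, i in self.keydir.items()]
        self.keydir = {k: i for i, (k, _) in enumerate(self.log)}
```

Writes are sequential appends (fast), reads cost one pointer lookup, and `merge` reclaims the space old values leave behind.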
18. Querying Riak
• Primary key based lookups
• MapReduce
• Links/Link Walking
• Pre- and Post- Commit Hooks
• Full Text Search
• Secondary Indexes
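A MapReduce job is submitted to Riak's HTTP `/mapred` endpoint as a JSON document. Below is a sketch of such a body; the JavaScript map function is illustrative, while `riak_kv_mapreduce:reduce_sort` is one of Riak's built-in Erlang reduce functions:

```python
import json

# Sketch of a MapReduce job body POSTed to /mapred:
# map every object in the "talks" bucket to its key, then sort.
job = {
    "inputs": "talks",
    "query": [
        {"map": {"language": "javascript",
                 "source": "function(v) { return [v.key]; }"}},
        {"reduce": {"language": "erlang",
                    "module": "riak_kv_mapreduce",
                    "function": "reduce_sort"}},
    ],
}
body = json.dumps(job)
```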
19. Riak’s Design Goals
(revisited)
• Simple, Elegant Scalability
• Ease of operations
• Resiliency in the face of failure
• Plumbing
• Ease of Development, Developer
Usability
20. Our Current Usability Focus
• Robust, supported client libs
• More complex querying capabilities
• Documentation
• Sample Apps, code, etc.
• Cluster administration tools
• Logging
22. Community and Open Source
• Very committed to our open source community
• Companies like Yammer, Comcast, DISQUS, Trifork
and Formspring contributing actively
• The Riak community is a great place to work and
play. Come join us!
25. About Basho
Offices in:
San Francisco, CA;
Cambridge, MA;
Reston, VA
Employees:
All over the world
Total Employees:
~35
26. About Basho
How do we pay the bills?
Enterprise Licenses of Riak and SLA’d Support
Professional Services
Consulting
Other Prominent Open Source Software:
Webmachine
Rebar
Bitcask
Lager
27. Get Involved
• wiki.basho.com
• github.com/basho/*
• basho.com
• twitter.com/basho
• Riak Mailing List
• downloads.basho.com/riak/CURRENT
Editor’s notes
* Lots of buzzwords
* There are
Talk about how this is born from what we built back at Basho
* Sales app
* Never wanted it to go down
* Apple, Akamai pedigree
* Fungible assets
* Just wanted to touch on this a bit more
 - Every node is the same
- After you download and build Riak, here’s all that stands between you and a cluster of N nodes.
- We also have an extended suite of command-line tools that make cluster management very simple.
* These are both very well documented on the wiki.
* You can easily write your own client code to talk to Riak using these as guides.
* Webmachine is a REST toolkit, if you like.
* Buckets are used for basic data organization (not in Dynamo). Roughly analogous to “tables”, but only insofar as they can be used for organization. Buckets are also where you set certain properties.
JSON is a data type we see a lot. BSON showed up recently, too. Python byte code.
* Dynamo differences
 - buckets
Ring, keyspace, vnode, partition, node.
* The preference list is the list of the N vnodes on which the data should be stored.
* The ease of scaling out here should be pretty apparent. We aren’t asking you to pick a shard key, etc.
* You select an n_val that suits your business needs (store more durable copies of the data that matters more), and then add nodes to suit.
* Each color is a physical node.
* Nodes run an equal number of partitions (optimistically).
* Each partition then runs an Erlang process called a vnode.
* Vnodes are responsible for the handling of requests.
* Ring space = an integer keyspace of size 2^160.
* These let the developer and stakeholders tune consistency and availability into their apps at the bucket level (and we are working on making this more granular).
Tunable consistency based on your business needs.
Log data can be written with a W of 1 for speed.
Prescription information can be read with an R value of 3 for higher consistency.
Enable “multi-backend” in your config file; backends are configurable at the bucket level.
* Bitcask - current default; we wrote this. Append-only on disk; keeps a pointer to the most recent copy of each object in memory and updates it on each write; performs merges to clean up old values. Super fast: < 5ms latencies on most systems at the 99.9th percentile. Memory is the only concern (~100 bytes per object, depending on key size).
* LevelDB - just released by Google. Permissive licensing; items are ordered on disk; far less drastic memory requirements per key (less RAM, slightly longer latencies). Opens the door to more time-series/range-based queries in Riak.
* Innostore - recommended when RAM requirements might be too high; embedded InnoDB. We will start recommending this less as LevelDB support becomes more stable.
* In-memory for testing. We are also deprecating them.
* Links are metadata that establish one-way relationships between objects in Riak; once links are attached, you can then perform link-walking queries to find relationships. Bucket - talks, key - Seattle, riaktag=“talk” (in the header).
* MapReduce - chain any number of Map and Reduce phases. Map produces 0 or more results based on your function. Reduce combines the results of the Map phase and returns them to the client.
* Pre-commit - JSON validation. Post-commit - send to another db/service.
* Full Text Search - a full-text search engine built around Riak Core and tightly integrated with Riak KV. Use a pre-commit hook at the bucket level to index.
* We are pushing this super hard, as the next slide will show.
* Client libs - Erlang, Java, JS, Python, node.js, Ruby (Ripple), Haskell, Smalltalk, Go, Scala, PHP, Perl.
* Secondary indexing, deeper Search integration, robust, better-defined MapReduce with Pipe.
* Working on a successor to Rekon that will be an admin tool (use the DISQUS anecdote about the Cassandra Dashboard).
* Comcast - requirement is basically an internal Amazon S3: a straight key/value store with an HTTP interface. This is to build a product called HOSS (highly available object storage system). This infrastructure is used across Comcast to store DVR data.
* A European country is using Riak.
* MIG-CAN - SMS gateway for British
* Wikia - multiple data centers for session storage.
* Rusty’s 2i deck is available; code is in master.
* Lager blog post.
* Pipe is in master; extensive README on GitHub.
* Will be doing prereleases (packages, as opposed to building from source) - get on the mailing list to be part of this.