Talk at DevNation Portland, July 10th, 2010.
Keynote available for download at:
http://dl.dropbox.com/u/458036/presentations/DevNation%20Portland.zip
Sources and additional information in the presenter notes.
Riak: A friendly key/value store for the web.
1. riak
A friendly key/value store for the web.
DevNation Portland 2010
A primer by Bruce Williams
2. My name is
Bruce Williams.
I’m addicted to the bleeding edge.
3. 2001 - Present Day
wayyy before it was
a viable job choice.
4. But I use other
languages, too.
especially from other paradigms.
5. Choose the Right Weapon
Let’s assume Java is one of the baseball bats.
Photo by oddsteph - http://flic.kr/p/6vWPBU
6. Based in the D.C. area.
(but I’m not.)
7. You may find the following
conspicuously missing in
this talk:
Sorry!
8. I will not be presenting a paper on Dynamo, the CAP theorem, vector clocks, Merkle trees, etc.
These are explained elsewhere by my algorithmic betters.
9. I will not be dwelling on performance or redundancy.
Expect some vague statements like “very fast” and “very robust.”
10. I will not try to convince you that “NoSQL” is the messiah.
It’s an alternative that makes sense in some situations.
11. I will not be conducting a large-scale comparison of competing technologies.
But I’d love to hear about what you use, and why.
27. A Quick Local Cluster
Start three “nodes”:
$ ./riak1/bin/riak start
$ ./riak2/bin/riak start
$ ./riak3/bin/riak start
Join them into a cluster:
$ ./riak2/bin/riak-admin join riak1@127.0.0.1
$ ./riak3/bin/riak-admin join riak1@127.0.0.1
29. Object
Content Type
Body
+ Links
The thing you’re storing.
30. Key
pic1
The identifier for the object.
can be automatically generated or user-defined
31. Bucket
images
pic1 pic2 pic3
“pic1” is unique within “images”
The type or category of object.
32. Addressability
images
pic1
<images/pic1>
Refer to objects by bucket and key.
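The object, key, bucket, and addressability slides can be pictured with a toy in-memory model. This is only a sketch in plain Ruby, not how Riak stores anything: `ToyStore` and `StoredObject` are hypothetical names invented for illustration, and the random hex key stands in for whatever key generation a real server does.

```ruby
require 'securerandom'

# Hypothetical stand-in for a stored object: content type, body, and links.
StoredObject = Struct.new(:content_type, :body, :links)

# Toy in-memory store: bucket name => { key => object }.
class ToyStore
  def initialize
    @buckets = Hash.new { |hash, bucket| hash[bucket] = {} }
  end

  # Stores an object; generates a key when the caller does not supply one.
  def put(bucket, object, key = nil)
    key ||= SecureRandom.hex(8)
    @buckets[bucket][key] = object
    key
  end

  # Looks an object up by its full address: bucket plus key.
  def get(bucket, key)
    @buckets[bucket][key]
  end
end

store = ToyStore.new
# The same key can exist in two buckets; it only has to be unique within one.
store.put('images', StoredObject.new('image/jpeg', 'jpeg bytes', []), 'pic1')
store.put('people', StoredObject.new('text/plain', 'a different pic1', []), 'pic1')
# No key given, so one is generated.
generated = store.put('images', StoredObject.new('image/png', 'png bytes', []))
```

Here `store.get('images', 'pic1')` and `store.get('people', 'pic1')` return different objects, which is the point of addressing by bucket and key together.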
33. Example
require 'riak'
client = Riak::Client.new
# Create images/pic1, set its content type and body, then save it.
client.bucket('images').new('pic1').tap do |pic1|
  pic1.content_type = 'image/jpeg'
  # Binary read keeps the JPEG bytes untouched.
  pic1.data = File.binread('/path/to/jpg')
  pic1.store
end
$ gem install riak-client
34. Example
client.bucket('people').new('bruce').tap do |bruce|
  bruce.data = {
    name: 'Bruce Williams',
    email: 'bruce@codefluency.com'
  }
  bruce.store
end
puts client['people']['bruce'].data['name']
“application/json” is the
default for riak-client
35. Links
images people
pic1 bruce
stored here
can also be “tagged”
Connect objects
36. Example
client['people']['bruce'].tap do |bruce|
  bruce.links << client['images']['pic1'].to_link('avatar')
  bruce.store
end
client['people']['bruce'].walk(:tag => 'avatar')
39. The Ring
A 160-bit integer space
40. The Ring
broken into equal-sized partitions.
41. The Ring
It looks kinda like this (it’s just more functional)
Photo by marchdoe - http://flickr.com/photos/marchdoe/457741149
42. The Ring
Each partition is managed
by a vnode (virtual node),
43. The Ring
Each vnode runs on
a [physical] node.
44. The Ring
[Diagram: the ring's partitions divided among nodes 1, 2, 3, and 4]
Each node owns an equal share of
vnodes (& partitions)
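The ring slides can be sketched in a few lines of plain Ruby. This is a simplified model, not Riak's implementation: hashing the string "bucket/key" with SHA-1, the partition count of 64, and the round-robin assignment of partitions to nodes are all assumptions chosen to keep the example small.

```ruby
require 'digest/sha1'

RING_SIZE = 2**160        # SHA-1 produces 160-bit values
PARTITION_COUNT = 64      # assumed partition count for this sketch
PARTITION_WIDTH = RING_SIZE / PARTITION_COUNT

# Maps a bucket/key pair to a position on the ring (simplified hashing).
def ring_position(bucket, key)
  Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16)
end

# Index of the equal-sized partition that contains a ring position.
def partition_for(bucket, key)
  ring_position(bucket, key) / PARTITION_WIDTH
end

# Hands partitions out round-robin so each node owns an equal share.
def assign_partitions(nodes)
  (0...PARTITION_COUNT).group_by { |partition| nodes[partition % nodes.size] }
end

nodes = %w[node1 node2 node3 node4]
ownership = assign_partitions(nodes)          # node name => partition indexes
partition = partition_for('images', 'pic1')   # which partition holds the key
owner = nodes[partition % nodes.size]         # which node runs that vnode
```

With four nodes and 64 partitions, each node ends up with 16 partitions, and the same bucket and key always land on the same partition.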
45. Replication
n_val = 3
3 is the default
Objects are written to multiple
partitions.
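One way to picture replication is to take the partition a key hashes into and walk forward around the ring until n_val partitions have been chosen. The sketch below assumes exactly that (consecutive partitions, wrapping at the end); the slide itself only says that objects are written to multiple partitions, and the ring of 8 partitions is an arbitrary size for illustration.

```ruby
PARTITIONS = 8   # small ring for illustration
N_VAL = 3        # replicas per object, the default from the slide

# The n_val consecutive partitions, wrapping around the ring, that hold
# copies of an object whose key hashes into first_partition.
def replica_partitions(first_partition, n_val = N_VAL, partitions = PARTITIONS)
  (0...n_val).map { |offset| (first_partition + offset) % partitions }
end

middle = replica_partitions(2)   # stays inside the ring
wrapped = replica_partitions(7)  # wraps past the last partition
```

A key landing in the last partition wraps around to the first ones, which is why the space is drawn as a ring.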
46. Availability
Uses Hinted Handoff to deal with node failures.
When node “2” fails, the others pick up the slack.
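The idea behind hinted handoff can be sketched with toy nodes: a write meant for a failed node goes to a stand-in, tagged with a hint naming the intended owner, and is handed back once the owner returns. `ToyNode`, `write_with_handoff`, and `hand_off` are hypothetical names for this illustration; the real mechanism involves considerably more machinery.

```ruby
# Toy node: holds its own data plus writes it is keeping for a failed peer.
class ToyNode
  attr_reader :name, :data, :hints
  attr_accessor :up

  def initialize(name)
    @name = name
    @up = true
    @data = {}
    @hints = Hash.new { |hash, owner| hash[owner] = {} }
  end
end

# Writes to each intended node; a fallback stands in for any that are down.
def write_with_handoff(intended, fallbacks, key, value)
  spares = fallbacks.select(&:up)
  intended.each do |node|
    if node.up
      node.data[key] = value
    else
      stand_in = spares.shift
      raise 'no fallback node available' if stand_in.nil?
      stand_in.hints[node.name][key] = value
    end
  end
end

# Once a failed node returns, fallbacks hand its writes back.
def hand_off(fallbacks, recovered)
  fallbacks.each do |node|
    held = node.hints.delete(recovered.name)
    recovered.data.merge!(held) if held
  end
end

n1, n2, n3, n4 = %w[1 2 3 4].map { |name| ToyNode.new(name) }
n2.up = false                                   # node "2" fails
write_with_handoff([n1, n2, n3], [n4], 'pic1', 'jpeg bytes')
held_by_fallback = n4.hints['2'].dup            # node 4 keeps the write for 2
n2.up = true                                    # node "2" comes back
hand_off([n4], n2)
```

While node 2 is down the write still reaches three nodes, and after the handoff node 2 holds the value as if it had never been away.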
47. Persistence
dets ets fs
gb_trees innostore
bitcask multi +
Supports pluggable backends
49. GET
r: how many replicas need to agree (default: 2)
50. PUT
r: how many replicas need to agree when retrieving an
existing object before the write (default: 2)
w: how many replicas to write to before returning a
successful response (default: 2)
dw: how many replicas to commit to durable storage
before returning a successful response (default: 0)
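The r, w, and dw settings are thresholds on how many replicas must respond. The sketch below treats "agree" as "returned an identical value", which is a simplification made for this example, and `read_value` and `write_ok?` are hypothetical helpers rather than part of any client library. The defaults mirror the ones on the slides.

```ruby
# A read succeeds when at least r replicas return the same value.
# nil entries stand for replicas that did not answer.
def read_value(replica_values, r: 2)
  value, votes = replica_values.compact.tally.max_by { |_value, count| count }
  votes && votes >= r ? value : nil
end

# A write succeeds when w replicas acknowledged it and dw replicas
# also committed it to durable storage.
def write_ok?(acks:, durable_acks:, w: 2, dw: 0)
  acks >= w && durable_acks >= dw
end

agreed = read_value(['v1', 'v1', nil])             # two replicas agree
split = read_value(%w[v1 v2 v3])                   # nobody agrees
strict = read_value(%w[v1 v1 v2], r: 3)            # r raised above the votes
default_write = write_ok?(acks: 2, durable_acks: 0)
durable_write = write_ok?(acks: 3, durable_acks: 1, dw: 2)
```

Raising r, w, or dw trades latency and availability for stronger guarantees on each request.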
52. Map
obj -> your function -> [result, ...]
Map functions take one piece of data
as input, and produce zero or more
results as output.
53. Data-locality is important in Riak.
Map phases are run where the data is
stored.
You can have multiple map phases.
The input to a map definition is a
series of [bucket, key] names.
unlike CouchDB
54. Link
obj -> link walk, using a pattern -> [linked_obj, ...]
A special kind of map phase; links
matching a pattern are “walked” to
find objects to be output.
55. Reduce
[obj, ...] -> your function -> [result]
Reduce functions combine the output
of many “map” step evaluations into
one result.
56. The reduce phase occurs on the
“coordinating node.”
Reduces may be run multiple times
as more input comes in (e.g., re-reduce).
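The shapes of map and reduce, and why re-reduce has to be safe, can be shown with two plain Ruby lambdas. This is a local sketch with made-up sample data, not code that runs inside Riak: the map takes one object and returns zero or more results, and the reduce returns a list so that its output can be fed back in alongside later input.

```ruby
# Map: one object in, zero or more results out.
map_ages = lambda do |person|
  person.key?(:age) ? [person[:age]] : []
end

# Reduce: a list in, a one-element list out. Because the output has the
# same shape as the input, it can be fed back in (re-reduce).
reduce_sum = lambda do |values|
  [values.sum]
end

# Hypothetical sample records; one has no age, so its map output is empty.
people = [
  { name: 'a', age: 30 },
  { name: 'b' },
  { name: 'c', age: 12 },
  { name: 'd', age: 5 }
]
mapped = people.flat_map(&map_ages)

all_at_once = reduce_sum.call(mapped)
# Reduce the first two results early, then re-reduce with the remainder.
partial = reduce_sum.call(mapped.first(2))
re_reduced = reduce_sum.call(partial + mapped.drop(2))
```

Both paths give the same answer, which is the property a reduce function needs if it may be run again as more input arrives.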
57. Example
bruce = client['people']['bruce']
melissa = client['people']['melissa']
# let's assume these have ages
addy = client['addresses'].new('123fake')
addy.data = {
  street: '123 Fake St',
  city: 'Portland', state: 'OR', zip: '97214'
}
addy.links << bruce.to_link('resident')
addy.links << melissa.to_link('resident')
addy.store
58. Example
Riak::MapReduce.new(client).add(addy).
  link(tag: 'resident').
  map("function (v) { return [Riak.mapValuesJson(v)[0]['age'] || 0] }").
  reduce(function: 'Riak.reduceSum', keep: true).
  run
We should get an array with one value
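To see why the result is an array with one value, the three phases can be imitated locally in plain Ruby. Everything here is a stand-in: `OBJECTS` is a hypothetical in-memory data set with invented people and ages, and `link_phase`, `map_phase`, and `reduce_sum` only mimic the shape of each phase, not how Riak executes them.

```ruby
require 'json'

# Toy data set: [bucket, key] => { body: JSON string, links: [...] }.
OBJECTS = {
  %w[people alice] => { body: '{"name":"Alice","age":34}', links: [] },
  %w[people bob] => { body: '{"name":"Bob"}', links: [] },
  %w[addresses 123fake] => {
    body: '{"street":"123 Fake St"}',
    links: [
      { tag: 'resident', target: %w[people alice] },
      { tag: 'resident', target: %w[people bob] },
      { tag: 'owner', target: %w[people alice] }
    ]
  }
}.freeze

# Link phase: follow links with a matching tag to find the next inputs.
def link_phase(inputs, tag)
  inputs.flat_map do |address|
    matching = OBJECTS.fetch(address)[:links].select { |link| link[:tag] == tag }
    matching.map { |link| link[:target] }
  end
end

# Map phase: one result per object, the age or 0 when it is missing.
def map_phase(inputs)
  inputs.flat_map do |address|
    [JSON.parse(OBJECTS.fetch(address)[:body])['age'] || 0]
  end
end

# Reduce phase: collapse all mapped values into a one-element array.
def reduce_sum(values)
  [values.sum]
end

residents = link_phase([%w[addresses 123fake]], 'resident')
ages = map_phase(residents)
result = reduce_sum(ages)
```

The link phase turns one address into its residents, the map phase turns each resident into an age, and the sum reduce leaves a single number wrapped in an array.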
60. No range queries.
Sorry, Cassandra fans
Things like time series data require creative approaches,
like bucket and key naming, etc.
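One such naming approach is to encode the time window in the key, so that a known time range maps to a known list of keys that can each be fetched directly. The sketch below is an assumption about how that might look, using per-minute keys; `reading_key`, `keys_between`, and the `sensor1` name are hypothetical.

```ruby
# Builds a key that encodes the minute a reading belongs to, so a known
# time window maps to a known list of keys (no range query needed).
def reading_key(sensor, time)
  "#{sensor}-#{time.getutc.strftime('%Y%m%dT%H%M')}"
end

# Enumerates every per-minute key between two times, inclusive.
def keys_between(sensor, from, to)
  keys = []
  minute = Time.at(from.to_i - from.to_i % 60)   # round down to the minute
  while minute <= to
    keys << reading_key(sensor, minute)
    minute += 60
  end
  keys
end

start = Time.utc(2010, 7, 10, 9, 58, 30)
keys = keys_between('sensor1', start, start + 180)
```

The granularity of the key (minute, hour, day) becomes a design decision: finer keys mean more lookups per window, coarser keys mean larger objects.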
61. Don’t list keys.
Ever, if you can avoid it.
Processing an entire bucket is more expensive than you might think,
because it lists keys.
62. Watch your encoding.
MapReduce JavaScript phases need your data to be in valid Unicode.
Otherwise you’ll get a “bad encoding” error.
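A cheap guard is to check strings before storing them. The sketch below uses Ruby's own `valid_encoding?` and `scrub` to produce valid UTF-8; `to_valid_utf8` is a hypothetical helper, and replacing bad bytes with "?" is just one possible policy (transcoding from the real source encoding would preserve more).

```ruby
# Returns a copy of text that is valid UTF-8, replacing bad bytes with "?".
def to_valid_utf8(text)
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  utf8.valid_encoding? ? utf8 : utf8.scrub('?')
end

good = 'caf' + "\u00E9"   # e-acute as a proper Unicode character
bad = "caf\xE9".b         # Latin-1 byte for e-acute, not valid UTF-8

cleaned_good = to_valid_utf8(good)
cleaned_bad = to_valid_utf8(bad)
```

Valid input passes through unchanged, while the stray Latin-1 byte is replaced so that the stored value is at least well-formed.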