riak
            A friendly key/value store for the web.



DevNation Portland 2010


                                  A primer by Bruce Williams
  My name is
Bruce Williams.
and I’m addicted to the bleeding edge.
2001 - Present Day


 wayyy before it was
 a viable job choice.
But I use other
languages, too.
especially from other paradigms.
Photo by oddsteph - http://flic.kr/p/6vWPBU




Let’s assume Java is one of the baseball bats.


                                              Choose the Right Weapon
Based in the D.C. area.


            (but I’m not.)
You may find the following
 conspicuously missing in
        this talk:
                Sorry!
     I will not be
presenting a paper on
  Dynamo, the CAP
   theorem, vector
clocks, merkle trees,
          etc. These are explained
          elsewhere by my
          algorithmic betters.
I will not be dwelling
 on performance or
     redundancy.
            Expect some vague
            statements like “very
            fast” and “very robust.”
 I will not try to
convince you that
  “NoSQL” is the
    messiah.
        It’s an alternative that
        makes sense in some
        situations.
I will not be conducting
       a large-scale
     comparison of
competing technologies.
             but I’d love to hear
             about what you use, and
             why.
What is Riak?
NoSQL
 and of the Dynamo
    persuasion.
Open Source
      & a commercial
       “EnterpriseDS”
      version with some
     proprietary pieces
Key/Value Store
      With some metadata.
Schema-less
   Great  for sparse data,
      but requires more
           discipline.
Datatype Agnostic
       Content-Type is King.
Language Agnostic
               REST & PBC

    Erlang, Javascript, Java,
       PHP, Python, Ruby, ...
Distributed
  It’s [mostly] Erlang, what
       did you expect?
Masterless
   All nodes are equal
Scalable
   or “easy to scale.”
Eventually
Consistent
     and CAP tunable.
Uses Map/Reduce
      and “Link.”
Getting
Up & Running
              http://riak.basho.com




              hg & git
A Quick Local Cluster




         $ ./riak1/bin/riak start
         $ ./riak2/bin/riak start
         $ ./riak3/bin/riak start



                               Start three “nodes”

  $ ./riak2/bin/riak-admin join riak1@127.0.0.1
  $ ./riak3/bin/riak-admin join riak1@127.0.0.1




                            Join them into a cluster
Your Data
       Object




                 Content Type
                 Body
                 + Links


The thing you’re storing.
           Key
           can be user-defined or automatically generated

            pic1




The identifier for the object.
        Bucket
        “pic1” is unique within “images”

        pic1      pic2      pic3

           images


The type or category of object.
    Addressability
    <images/pic1>

             images
              pic1




Refer to objects by bucket and key.
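
To make the addressing idea concrete, here is a toy in-memory sketch in plain Ruby. It is not Riak and does not use riak-client; the `ToyStore` class and the `avatars` bucket are made up for illustration. It only shows that the pair of bucket and key is the address, so the same key can live in two buckets without colliding.

```ruby
# A toy store addressed by [bucket, key]. Hypothetical, for illustration only.
class ToyStore
  def initialize
    @objects = {}
  end

  # Store a value under the full address: bucket plus key.
  def put(bucket, key, value)
    @objects[[bucket, key]] = value
  end

  # Look a value up by the same two-part address.
  def get(bucket, key)
    @objects[[bucket, key]]
  end
end

store = ToyStore.new
store.put('images',  'pic1', 'jpeg bytes')
store.put('avatars', 'pic1', 'png bytes') # same key, different bucket
puts store.get('images', 'pic1')
```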
        Example




require 'riak'

client = Riak::Client.new
client.bucket('images').new('pic1').tap do |pic1|
  pic1.content_type = 'image/jpeg'
  pic1.data = File.read('/path/to/jpg')
  pic1.store
end




        $ gem install riak-client
         Example




client.bucket('people').new('bruce').tap do |bruce|
  bruce.data = {
    name: 'Bruce Williams',
    email: 'bruce@codefluency.com'
  }
  bruce.store
end

puts client['people']['bruce'].data['name']




        “application/json” is the
        default for riak-client
            Links

images                    people

 pic1                     bruce

         stored here; can also be “tagged”

         Connect objects
                Example




 client['people']['bruce'].tap do |bruce|
   bruce.links << client['images']['pic1'].to_link('avatar')
   bruce.store
 end




client['people']['bruce'].walk(:tag => 'avatar')
Hooks

pre-commit
reject or transform an object to be committed

post-commit
notify external services, build your own indexes
Where does it go?
    The Ring




A 160-bit integer space
          The Ring




broken into equal sized partitions.
  The Ring
  It looks kinda like this
  (it’s just more functional)
   Photo by marchdoe - http://flickr.com/photos/marchdoe/457741149
     The Ring




Each partition is managed
by a vnode (virtual node),
  The Ring




Each vnode runs on
 a [physical] node.
        The Ring




             1   2

             3   4



Each node owns an equal share of
     vnodes (& partitions)
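
A rough sketch of the ring in plain Ruby, for anyone who wants to poke at the idea. It assumes a ring of 64 partitions, four made-up node names, a plain "bucket/key" string fed to SHA-1, and round-robin ownership. Riak's real hashing and ownership assignment differ in the details, so treat this as an illustration only.

```ruby
require 'digest/sha1'

RING_SIZE      = 2**160                       # SHA-1 yields 160-bit values
NUM_PARTITIONS = 64                           # assumed partition count
NODES          = %w[node1 node2 node3 node4]  # hypothetical node names
PARTITION_SIZE = RING_SIZE / NUM_PARTITIONS   # every partition is the same size

# Hash a bucket/key pair to a position on the ring.
# Assumption: we simply hash the string "bucket/key".
def ring_position(bucket, key)
  Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16)
end

# Index of the partition (and so the vnode) that covers a ring position.
def partition_for(position)
  position / PARTITION_SIZE
end

# Deal partitions out round-robin so each node owns an equal share.
def owner_of(partition)
  NODES[partition % NODES.size]
end

partition = partition_for(ring_position('images', 'pic1'))
puts "images/pic1 -> partition #{partition} on #{owner_of(partition)}"
```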
     Replication

          n_val = 3
          3 is the default




Objects are written to multiple
          partitions.
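
One way to picture this, sketched in plain Ruby: find the partition a key hashes to, then take the next partitions around the ring until you have n_val of them. The partition count of 64, the "bucket/key" hashing, and the "next partitions clockwise" rule are assumptions made for this sketch, in the Dynamo style; they are not lifted from Riak's source.

```ruby
require 'digest/sha1'

NUM_PARTITIONS = 64   # assumed partition count
N_VAL          = 3    # the default from the slide

# Partition that a bucket/key pair hashes to (assumed "bucket/key" hashing).
def primary_partition(bucket, key)
  Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16) / (2**160 / NUM_PARTITIONS)
end

# The n_val consecutive partitions that hold copies, wrapping around the ring.
def replica_partitions(bucket, key, n_val = N_VAL)
  first = primary_partition(bucket, key)
  (0...n_val).map { |offset| (first + offset) % NUM_PARTITIONS }
end

p replica_partitions('people', 'bruce')
```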
  Availability

  When node “2” fails, the others pick up the slack.

  Uses Hinted Handoff to deal with node failures.
    Persistence




  dets                ets                 fs

           gb_trees           innostore



 bitcask              multi               +




Supports pluggable backends
CAP Tuning
                    GET




r
how many replicas need to agree (default: 2)
                     PUT




r
how many replicas need to agree when retrieving an
existing object before the write (default: 2)

w
how many replicas to write to before returning a
successful response (default: 2).

dw
how many replicas to commit to durable storage
before returning a successful response (default: 0)
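
The knobs above boil down to counting replies. A simplified sketch in plain Ruby, with hypothetical method names: a read succeeds when at least r replicas return the same value, and a write succeeds when at least w replicas accepted it and at least dw of them committed it durably. Real Riak compares object versions with vector clocks rather than raw values, so this only illustrates the counting.

```ruby
# replies: one entry per replica; nil means that replica did not answer.
# Returns the value that at least r replicas agree on, or nil if none does.
def quorum_read(replies, r: 2)
  counts = Hash.new(0)
  replies.compact.each { |value| counts[value] += 1 }
  value, votes = counts.max_by { |_value, count| count }
  votes && votes >= r ? value : nil
end

# acks: replicas that accepted the write.
# durable_acks: replicas that also committed it to durable storage.
def write_ok?(acks:, durable_acks:, w: 2, dw: 0)
  acks >= w && durable_acks >= dw
end

p quorum_read(['v1', 'v1', nil])          # two of three replicas agree
p write_ok?(acks: 2, durable_acks: 0)     # meets the default w and dw
```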
(Map|Link)*Reduce
              Map




      obj                    [result, ...]


             your function


Map functions take one piece of data
as input, and produce zero or more
         results as output.
Data-locality is important in Riak.
Map phases are run where the data is
               stored.

 You can have multiple map phases.

  The input to a map definition is a
   series of [bucket, key] names.

                        unlike CouchDB
              Link




      obj                        [linked_obj, ...]



            link walk, using a
                 pattern

A special kind of map phase; links
matching a pattern are “walked” to
    find objects to be output.
                 Reduce




    [obj, ...]                   [result]



                 your function

Reduce functions combine the output
of many "map" step evaluations, into
             one result
The reduce phase occurs on the
      “coordinating node.”

Reduces may be run multiple times
  as more input comes in (e.g.,
            re-reduce)
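
The shape of a map phase followed by a reduce can be mimicked locally in plain Ruby. This is a simulation, not Riak: the `STORE` hash, the names in it, and the ages are all invented, and everything runs in one process. The point is that map returns a list per object, and that reduce returns a list too, so it can be fed its own output when more input arrives.

```ruby
# Hypothetical stand-in for stored objects, keyed by [bucket, key].
STORE = {
  %w[people alice] => { 'age' => 31 },
  %w[people bob]   => { 'age' => 27 },
  %w[people carol] => { 'age' => 45 },
  %w[people dave]  => {}                 # no age recorded
}.freeze

# Map: one object in, zero or more results out.
def map_age(obj)
  obj.key?('age') ? [obj['age']] : []
end

# Reduce: many values in, a one-element list out, so it can be re-applied.
def reduce_sum(values)
  [values.sum]
end

# inputs is a series of [bucket, key] names, as in the slide.
def run_job(inputs)
  mapped = inputs.flat_map { |bucket_key| map_age(STORE.fetch(bucket_key)) }
  # Re-reduce: reduce an early batch, then fold later results into it.
  partial = reduce_sum(mapped.first(2))
  reduce_sum(partial + mapped.drop(2))
end

p run_job(STORE.keys) # prints [103]
```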
         Example




bruce = client['people']['bruce']
melissa = client['people']['melissa']




          let’s assume these have ages

 addy = client['addresses'].new('123fake')
 addy.data = {
   street: '123 Fake St',
   city: 'Portland', state: 'OR', zip: '97214'
 }
 addy.links << bruce.to_link('resident')
 addy.links << melissa.to_link('resident')
 addy.store
                Example




Riak::MapReduce.new(client).add(addy).
  link(tag: 'resident').
  map("function (v) { return [Riak.mapValuesJson(v)[0]['age'] || 0] }").
  reduce(function: 'Riak.reduceSum', keep: true).
  run




           We should get an array with one value
Hurdles
 No range queries.

        Sorry, Cassandra fans


  Things like time
 series data require
creative approaches.
      like bucket and key naming, etc
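
One possible "creative approach," sketched in plain Ruby: name keys after the hour they cover, then compute the keys for a time window yourself and fetch them directly instead of asking for a range. The hourly granularity and the key format are hypothetical choices for this sketch, not a Riak convention.

```ruby
# Hypothetical scheme: one key per hour, e.g. "2010-06-01T22".
def series_key(time)
  time.getutc.strftime('%Y-%m-%dT%H')
end

# List every hourly key that covers the window from..to (inclusive).
def keys_between(from, to)
  start = from.getutc
  t = Time.utc(start.year, start.month, start.day, start.hour)
  keys = []
  while t <= to
    keys << series_key(t)
    t += 3600 # step one hour
  end
  keys
end

from = Time.utc(2010, 6, 1, 22, 15)
to   = Time.utc(2010, 6, 2, 1, 5)
p keys_between(from, to)
```

Each key in the list can then be looked up by bucket and key, which is what the store is good at.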
   Don’t list keys.

         ever, if you can avoid it.



  Processing an entire
bucket is more expensive
 than you might think.

          because it lists keys
Watch your encoding.

MapReduce Javascript
phases need your data
to be in valid Unicode.
        you’ll get a “bad encoding” error
Easy Questions?
                        @wbruce
              Thanks!

The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Riak: A friendly key/value store for the web.

  • 1. riak: A friendly key/value store for the web. A primer by Bruce Williams. (DevNation Portland 2010)
  • 2. My name is Bruce Williams, and I’m addicted to the bleeding edge.
  • 3. 2001 - Present Day: wayyy before it was a viable job choice.
  • 4. But I use other languages, too, especially from other paradigms.
  • 5. Choose the Right Weapon. Let’s assume Java is one of the baseball bats. Photo by oddsteph - http://flic.kr/p/6vWPBU
  • 6. Based in the D.C. area. (but I’m not.)
  • 7. You may find the following conspicuously missing in this talk. Sorry!
  • 8. I will not be presenting a paper on Dynamo, the CAP theorem, vector clocks, Merkle trees, etc. These are explained elsewhere by my algorithmic betters.
  • 9. I will not be dwelling on performance or redundancy. Expect some vague statements like “very fast” and “very robust.”
  • 10. I will not try to convince you that “NoSQL” is the messiah. It’s an alternative that makes sense in some situations.
  • 11. I will not be conducting a large-scale comparison of competing technologies, but I’d love to hear about what you use, and why.
  • 13. NoSQL, and of the Dynamo persuasion.
  • 14. Open Source, plus a commercial “EnterpriseDS” version with some proprietary pieces.
  • 15. Key/Value Store: with some metadata.
  • 16. Schema-less: great for sparse data, but requires more discipline.
  • 17. Datatype Agnostic: Content-Type is King.
  • 18. Language Agnostic: REST & PBC. Erlang, Javascript, Java, PHP, Python, Ruby, ...
  • 19. Distributed: it’s [mostly] Erlang, what did you expect?
  • 20. Masterless: all nodes are equal.
  • 21. Scalable, or “easy to scale.”
  • 22. Eventually Consistent, and CAP tunable.
  • 23. Uses Map/Reduce and “Link.”
  • 25. http://riak.basho.com
  • 26. hg & git
  • 27. A Quick Local Cluster
      Start three “nodes”:
        $ ./riak1/bin/riak start
        $ ./riak2/bin/riak start
        $ ./riak3/bin/riak start
      Join them into a cluster:
        $ ./riak2/bin/riak-admin join riak1@127.0.0.1
        $ ./riak3/bin/riak-admin join riak1@127.0.0.1
  • 29. Object: the thing you’re storing. Content Type, Body + Links.
  • 30. Key: the identifier for the object (e.g. pic1). Can be automatically generated or user-defined.
  • 31. Bucket: the type or category of object. “pic1” is unique within “images”. [Diagram: bucket images containing pic1, pic2, pic3]
  • 32. Addressability: refer to objects by bucket and key, e.g. <images/pic1>.
  • 33. Example ($ gem install riak-client)
        require 'riak'
        client = Riak::Client.new
        client.bucket('images').new('pic1').tap do |pic1|
          pic1.content_type = 'image/jpeg'
          pic1.data = File.read('/path/to/jpg')
          pic1.store
        end
  • 34. Example (“application/json” is the default for riak-client)
        client.bucket('people').new('bruce').tap do |bruce|
          bruce.data = { name: 'Bruce Williams', email: 'bruce@codefluency.com' }
          bruce.store
        end
        puts client['people']['bruce'].data['name']
  • 35. Links: connect objects. Can also be “tagged”. [Diagram: images/pic1 and people/bruce, with a “stored here” label]
  • 36. Example
        client['people']['bruce'].tap do |bruce|
          bruce.links << client['images']['pic1'].to_link('avatar')
          bruce.store
        end
        client['people']['bruce'].walk(:tag => 'avatar')
  • 37. Hooks
      pre-commit: reject or transform an object to be committed.
      post-commit: notify external services, build your own indexes.
  • 39. The Ring: a 160-bit integer space
  • 40. The Ring: broken into equal sized partitions.
  • 41. The Ring: it looks kinda like this (it’s just more functional). Photo by marchdoe - http://flickr.com/photos/marchdoe/457741149
  • 42. The Ring: each partition is managed by a vnode (virtual node),
  • 43. The Ring: each vnode runs on a [physical] node.
  • 44. The Ring: each node owns an equal share of vnodes (& partitions). [Diagram: ring with nodes 1, 2, 3, 4]
  • 45. Replication: objects are written to multiple partitions. n_val = 3 (3 is the default).
  • 46. Availability: uses Hinted Handoff to deal with node failures. When node “2” fails, the others pick up the slack. [Diagram: ring with nodes 1, 2, 3, 4]
  • 47. Persistence: supports pluggable backends: dets, ets, fs, gb_trees, innostore, bitcask, multi, and more.
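
The ring slides can be pictured with a small pure-Ruby sketch. It is not Riak's implementation: the way the bucket and key are combined before hashing, the partition count of 64, the node names, and the round-robin ownership are all assumptions made for illustration. It only shows the idea of hashing into a 160-bit space, cutting that space into equal partitions, and writing to n_val consecutive partitions.

```ruby
require 'digest'

RING_SIZE      = 2**160                      # the 160-bit integer space
NUM_PARTITIONS = 64                          # assumed partition count
PARTITION_SIZE = RING_SIZE / NUM_PARTITIONS  # equal sized partitions
N_VAL          = 3                           # replicas per object
NODES          = %w[node1 node2 node3 node4] # hypothetical physical nodes

# Hash a bucket/key pair to a point on the ring. SHA-1 yields 160 bits.
# Joining with "/" is an illustrative choice, not Riak's encoding.
def ring_position(bucket, key)
  Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16)
end

# Index of the partition that contains the hashed position.
def partition_for(bucket, key)
  ring_position(bucket, key) / PARTITION_SIZE
end

# Assumed round-robin ownership, so every node gets an equal share.
def owner_of(partition)
  NODES[partition % NODES.size]
end

# The n_val consecutive partitions (wrapping around) that hold replicas.
def preference_list(bucket, key, n_val = N_VAL)
  first = partition_for(bucket, key)
  (0...n_val).map { |i| (first + i) % NUM_PARTITIONS }
end

preference_list('images', 'pic1').each do |partition|
  puts "partition #{partition} -> #{owner_of(partition)}"
end
```

With four nodes and round-robin ownership, three consecutive partitions always land on three different nodes, which is the property replication relies on.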
  • 49. GET
      r: how many replicas need to agree (default: 2)
  • 50. PUT
      r: how many replicas need to agree when retrieving an existing object before the write (default: 2)
      w: how many replicas to write to before returning a successful response (default: 2)
      dw: how many replicas to commit to durable storage before returning a successful response (default: 0)
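
To make the r and w knobs concrete, here is a toy in-memory model. Replica, quorum_put and quorum_get are hypothetical names, not part of riak-client, and the model ignores vector clocks, hinted handoff and dw entirely. It only shows how a request succeeds or fails depending on how many replicas answer.

```ruby
# A replica is just a name, an in-memory hash, and an up/down flag.
Replica = Struct.new(:name, :store, :up)

# Write to every reachable replica; succeed only if at least w acknowledged.
def quorum_put(replicas, key, value, w: 2)
  acks = 0
  replicas.each do |replica|
    next unless replica.up
    replica.store[key] = value
    acks += 1
  end
  acks >= w
end

# Read from every reachable replica; return the value only if at least
# r replicas agree on it, otherwise nil.
def quorum_get(replicas, key, r: 2)
  values = replicas.select(&:up).map { |replica| replica.store[key] }
  counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  value, count = counts.max_by { |_, c| c }
  count.to_i >= r ? value : nil
end

replicas = Array.new(3) { |i| Replica.new("node#{i + 1}", {}, true) }
quorum_put(replicas, 'pic1', 'jpeg-bytes', w: 2)

replicas[0].up = false                  # one node down: r = 2 still works
puts quorum_get(replicas, 'pic1', r: 2).inspect

replicas[1].up = false                  # two nodes down: r = 2 cannot be met
puts quorum_get(replicas, 'pic1', r: 2).inspect
```

Lowering r or w trades consistency for availability, which is what “CAP tunable” refers to earlier in the deck.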
  • 52. Map: obj → your function → [result, ...]. Map functions take one piece of data as input, and produce zero or more results as output.
  • 53. Data-locality is important in Riak. Map phases are run where the data is stored. You can have multiple map phases. The input to a map definition is a series of [bucket, key] names (unlike CouchDB).
  • 54. Link: obj → link walk, using a pattern → [linked_obj, ...]. A special kind of map phase; links matching a pattern are “walked” to find objects to be output.
  • 55. Reduce: [obj, ...] → your function → [result]. Reduce functions combine the output of many “map” step evaluations into one result.
  • 56. The reduce phase occurs on the “coordinating node.” Reduces may be run multiple times as more input comes in (e.g., re-reduce).
  • 57. Example
        bruce = client['people']['bruce']      # let's assume these have ages
        melissa = client['people']['melissa']
        addy = client['addresses'].new('123fake')
        addy.data = { street: '123 Fake St', city: 'Portland', state: 'OR', zip: '97214' }
        addy.links << bruce.to_link('resident')
        addy.links << melissa.to_link('resident')
        addy.store
  • 58. Example
        Riak::MapReduce.new(client).add(addy).
          link(tag: 'resident').
          map("function (v) { return [Riak.mapValuesJson(v)[0]['age'] || 0] }").
          reduce(function: 'Riak.reduceSum', keep: true).
          run
      We should get an array with one value.
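
The link → map → reduce flow of that query can be traced without a cluster. The sketch below is a plain-Ruby imitation, not riak-client and not how Riak distributes the phases. STORE, link_phase, map_phase and reduce_phase are hypothetical helpers, and the ages are made up for the example.

```ruby
# In-memory stand-in for stored objects: [bucket, key] => data and links.
STORE = {
  %w[people bruce]      => { data: { 'age' => 32 }, links: [] },
  %w[people melissa]    => { data: { 'age' => 30 }, links: [] },
  %w[addresses 123fake] => {
    data:  { 'city' => 'Portland' },
    links: [
      { bucket: 'people', key: 'bruce',   tag: 'resident' },
      { bucket: 'people', key: 'melissa', tag: 'resident' }
    ]
  }
}.freeze

# Link phase: walk links with a matching tag, emit [bucket, key] pairs.
def link_phase(inputs, tag:)
  inputs.flat_map do |bucket_key|
    STORE.fetch(bucket_key)[:links]
         .select { |link| link[:tag] == tag }
         .map { |link| [link[:bucket], link[:key]] }
  end
end

# Map phase: one object in, zero or more results out.
def map_phase(inputs)
  inputs.flat_map { |bucket_key| yield STORE.fetch(bucket_key)[:data] }
end

# Reduce phase: many map results in, one result out (kept in an array).
def reduce_phase(values)
  [yield(values)]
end

residents = link_phase([%w[addresses 123fake]], tag: 'resident')
ages      = map_phase(residents) { |data| [data['age'] || 0] }
total     = reduce_phase(ages) { |values| values.sum }
p total # => [62]
```

The result is an array with one value, the summed ages of the residents linked from the address.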
  • 60. No range queries. (Sorry, Cassandra fans.) Things like time series data require creative approaches, like bucket and key naming, etc.
  • 61. Don’t list keys. Ever, if you can avoid it. Processing an entire bucket is more expensive than you might think, because it lists keys.
  • 62. Watch your encoding. MapReduce Javascript phases need your data to be in valid Unicode; otherwise you’ll get a “bad encoding” error.
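
One possible reading of “creative approaches, like bucket and key naming” for time series data: put the time bucket into the key, so a time range becomes a list of keys you can compute and fetch directly instead of a range query or a key listing. The key scheme and the helper names below are hypothetical, a sketch rather than a recommended layout.

```ruby
# Hypothetical key scheme: "<series>:<YYYY-MM-DDTHH>", one object per hour.
def hourly_key(series, time)
  "#{series}:#{time.getutc.strftime('%Y-%m-%dT%H')}"
end

# Every hourly key that overlaps the interval [from, to].
def keys_between(series, from, to)
  start = from.getutc
  t = Time.utc(start.year, start.month, start.day, start.hour) # floor to hour
  keys = []
  while t <= to
    keys << hourly_key(series, t)
    t += 3600
  end
  keys
end

from = Time.utc(2010, 6, 18, 22, 15)
to   = Time.utc(2010, 6, 19, 1, 5)
puts keys_between('cpu', from, to)
# cpu:2010-06-18T22
# cpu:2010-06-18T23
# cpu:2010-06-19T00
# cpu:2010-06-19T01
```

Each computed key can then be fetched by bucket and key, which avoids both range queries and listing keys.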
  • 64. Thanks! @wbruce