The Best Open Source Database
     You Will Ever Have The
Pleasure Of Running In Production



    Seattle Scalability Meetup
        August 24, 2011
Who Am I?

•   Mark Phillips
•   Community Manager
•   Basho Technologies
•   @pharkmillups
What is Riak?
• a database
• a key/value store
• distributed
• fault-tolerant
• scalable
• Dynamo-inspired
• used by startups
• used by FORTUNE 100 companies
• written (primarily) in Erlang
• pronounced “REE-awk”
• not the right fit for every project and app
Riak’s Design Goals


• Simple, Elegant Scalability
• Ease of operations
• Resiliency in the face of failure
• Plumbing
Distributed, Scalable, Fault Tolerant

       No central coordinator;
      Easy to set up and operate
Distributed, Scalable, Fault Tolerant

        Horizontally Scalable;
Add commodity hardware to get more
 [throughput | processing | storage].
Distributed, Scalable, Fault Tolerant

         Always Available
      No Single Point of Failure
            Self-healing
Building Clusters is Dead Simple

$ riak start
# to get your first node running

$ riak start
# to get your second node running

$ riak-admin join riak@192.168.1.10
# send a join request to an existing Riak node

$ lather, rinse, repeat until you’re web scale
APIs



• HTTP - Richly Featured RESTful API
• Protocol Buffers - Courtesy of Google
Current HTTP API
      (made possible by webmachine)
Store
POST /riak/bucket         # Riak-defined key
PUT /riak/bucket/key      # User-defined key


Fetch
GET /riak/bucket/key


Delete
DELETE /riak/bucket/key
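
The routes above map onto plain HTTP requests. The following is a minimal Python sketch, not an official client: the helper names are hypothetical, and `BASE_URL` assumes a node listening locally on port 8098, so adjust it to your own node. The helpers only build the requests; sending one needs a running Riak node.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8098"  # assumption: address of a local Riak node


def object_url(bucket, key=None, base_url=BASE_URL):
    """Build /riak/bucket or /riak/bucket/key, escaping each path segment."""
    url = f"{base_url}/riak/{urllib.parse.quote(bucket, safe='')}"
    if key is not None:
        url += f"/{urllib.parse.quote(key, safe='')}"
    return url


def store_request(bucket, key, value, base_url=BASE_URL):
    """PUT for a user-defined key, POST when Riak should pick the key."""
    body = json.dumps(value).encode("utf-8")
    method = "POST" if key is None else "PUT"
    return urllib.request.Request(
        object_url(bucket, key, base_url),
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def fetch_request(bucket, key, base_url=BASE_URL):
    """GET /riak/bucket/key"""
    return urllib.request.Request(object_url(bucket, key, base_url), method="GET")


def delete_request(bucket, key, base_url=BASE_URL):
    """DELETE /riak/bucket/key"""
    return urllib.request.Request(object_url(bucket, key, base_url), method="DELETE")


req = store_request("talks", "seattle", {"speaker": "Mark Phillips"})
print(req.get_method(), req.full_url)
# To send it against a live node:
#     with urllib.request.urlopen(req) as resp:
#         print(resp.status)
```

Storing JSON is only one choice here; the value can be any bytes as long as the `Content-Type` header describes it.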
How Riak Organizes Data
             Bucket/Key/Value
• Bucket - top-level namespace in Riak. Used for
basic data organization; also used to set
properties for keys in that bucket (n_val, commit
hooks, choice of backend, etc.)
• Key - binary blob that identifies a value
• Value - any type of data you can imagine
What we borrowed from Dynamo:

• Gossip protocol - ring membership, partition assignment
• Consistent Hashing - division of work
• Vector Clocks - versioning, conflict resolution
• Read Repair - anti-entropy
• Hinted Handoff - failure masking, data migration

     The Dynamo paper was not a spec for a system but
  one approach; we learned from it but deviated where
                     necessary
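
Of the borrowed ideas, vector clocks are the easiest to show in a few lines. The sketch below is a simplified illustration in Python, not Riak's implementation: a clock is a plain dict from an actor name to a counter, and the actor names are made up. Two versions conflict when neither clock descends from the other.

```python
def increment(clock, actor):
    """Return a copy of the clock with this actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a, b):
    """True if neither clock descends from the other, i.e. a conflict."""
    return not descends(a, b) and not descends(b, a)


def merge(a, b):
    """Pairwise maximum: the clock for a value that resolves both versions."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in a.keys() | b.keys()}


v1 = increment({}, "client_x")   # first write
v2 = increment(v1, "client_y")   # update based on v1
v3 = increment(v1, "client_z")   # another update based on v1, unaware of v2

print(descends(v2, v1))    # v2 supersedes v1
print(concurrent(v2, v3))  # v2 and v3 conflict and need resolution
print(merge(v2, v3))
```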
Consistent Hashing
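
A rough Python sketch of the idea, under stated assumptions: keys are hashed onto a ring of 2^160 positions (SHA-1 is used because it yields 160-bit values), the ring is cut into equal partitions, and the preference list is the next N partitions walking around the ring. The partition count, the round-robin claim, and all function names are illustrative choices, not Riak's actual algorithm.

```python
import hashlib

RING_SIZE = 2 ** 160     # size of the hash space
NUM_PARTITIONS = 64      # assumption: illustrative partition count


def key_position(bucket, key):
    """Hash a bucket/key pair to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_index(position, num_partitions=NUM_PARTITIONS):
    """Map a ring position to the partition that covers it."""
    return position * num_partitions // RING_SIZE


def claim(nodes, num_partitions=NUM_PARTITIONS):
    """Hand out partitions round-robin so each node runs an equal share."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def preference_list(bucket, key, owners, n_val=3):
    """The n_val consecutive partitions (and their owners) that store the key."""
    start = partition_index(key_position(bucket, key), len(owners))
    indexes = [(start + i) % len(owners) for i in range(n_val)]
    return [(index, owners[index]) for index in indexes]


owners = claim(["node_a", "node_b", "node_c", "node_d"])
for index, node in preference_list("talks", "seattle", owners):
    print(index, node)
```

Adding a node only changes who owns some partitions; the key-to-partition mapping stays fixed, which is why no shard key has to be chosen up front.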
N, R, W Values

• N = number of replicas to store (on distinct
physical nodes)
• R = number of replica responses needed for
a successful read
• W = number of replica responses needed for
a successful write
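
How the three values interact can be sketched with a toy model in Python. This is an illustration only: the `Replica` class and helper names are hypothetical, replicas are plain objects that are either up or down, and failure-masking mechanisms such as hinted handoff are left out.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    up: bool = True
    value: object = None


def put(replicas, value, w):
    """Write to every reachable replica; succeed if at least W acknowledge."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.value = value
            acks += 1
    return acks >= w


def get(replicas, r):
    """Read from reachable replicas; succeed if at least R respond."""
    responses = [replica.value for replica in replicas if replica.up]
    if len(responses) < r:
        return False, None
    return True, responses[:r]


def quorums_overlap(n, r, w):
    """True when every read quorum must share a replica with every write quorum."""
    return r + w > n


# N = 3 replicas, one of them currently unreachable.
replicas = [Replica(), Replica(), Replica(up=False)]
print(put(replicas, "v1", w=2))    # two acknowledgements are enough
print(get(replicas, r=3))          # three responses cannot be collected
print(quorums_overlap(3, 2, 2))    # R=2, W=2 overlap for N=3
```

Lower R or W trades consistency for availability and speed; higher values do the reverse.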
Writing Things to Disk
       (because that’s what databases do)
Riak allows for pluggable local storage

• Bitcask (recommended, ships as default)
• LevelDB (recently released by Google)
• Innostore
• Several other specialty backends
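
The speaker notes describe Bitcask as append-only on disk, with an in-memory pointer to the most recent copy of each object and periodic merges that clean up old values. A toy Python sketch of that shape follows; the class is hypothetical, a `bytearray` stands in for the data file, and nothing here reflects Bitcask's real file format.

```python
class AppendOnlyStore:
    def __init__(self):
        self.log = bytearray()   # stands in for the append-only data file
        self.keydir = {}         # key -> (offset, length) of the newest value

    def put(self, key, value):
        """Append the value and point the key at the new copy."""
        offset = len(self.log)
        self.log += value
        self.keydir[key] = (offset, len(value))

    def get(self, key):
        """One in-memory lookup, then one read from the log."""
        offset, length = self.keydir[key]
        return bytes(self.log[offset:offset + length])

    def merge(self):
        """Rewrite the log with only the live values, dropping stale copies."""
        new_log = bytearray()
        new_keydir = {}
        for key, (offset, length) in self.keydir.items():
            new_keydir[key] = (len(new_log), length)
            new_log += self.log[offset:offset + length]
        self.log, self.keydir = new_log, new_keydir


store = AppendOnlyStore()
store.put("seattle", b"draft")
store.put("seattle", b"final")   # the old copy stays in the log until a merge
print(store.get("seattle"), len(store.log))
store.merge()
print(store.get("seattle"), len(store.log))
```

The sketch also shows why memory is the main cost of this design: every key needs an entry in the in-memory directory.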
Querying Riak
• Primary key based lookups
• MapReduce
• Links/Link Walking
• Pre- and Post- Commit Hooks
• Full Text Search
• Secondary Indexes
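
The MapReduce model in this list can be sketched locally in Python: phases are chained, a map function returns zero or more results per object, and a reduce function combines the results of the previous phase. This only illustrates the data flow; the function names and the sample data are made up, and it does not use Riak's query interface.

```python
def map_phase(objects, map_fn):
    """Apply map_fn to each object; each call may yield zero or more results."""
    results = []
    for obj in objects:
        results.extend(map_fn(obj))
    return results


def reduce_phase(values, reduce_fn):
    """Combine all values from the previous phase into a new list."""
    return reduce_fn(values)


def run(objects, phases):
    """Chain any number of ("map", fn) and ("reduce", fn) phases."""
    data = list(objects)
    for kind, fn in phases:
        if kind == "map":
            data = map_phase(data, fn)
        elif kind == "reduce":
            data = reduce_phase(data, fn)
        else:
            raise ValueError(f"unknown phase kind: {kind}")
    return data


# Made-up sample objects, as they might be stored in a "talks" bucket.
talks = [
    {"city": "Seattle", "attendees": 40},
    {"city": "Stockholm", "attendees": 25},
    {"city": "Seattle"},  # no attendee count, so the map emits nothing
]


def attendees(talk):
    return [talk["attendees"]] if "attendees" in talk else []


def total(values):
    return [sum(values)]


result = run(talks, [("map", attendees), ("reduce", total)])
print(result)
```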
Riak’s Design Goals
                  (revisited)
• Simple, Elegant Scalability
• Ease of operations
• Resiliency in the face of failure
• Plumbing
• Ease of Development, Developer
Usability
Our Current Usability Focus

• Robust, supported client libs
• More complex querying capabilities
• Documentation
• Sample Apps, code, etc.
• Cluster administration tools
• Logging
Selected Use Cases

• Session Storage (Wikia, DISQUS, MochiMedia)
• Timeline (Yammer, DISQUS, Formspring)
• Generic Object storage (Comcast)
• Scalable Full-Text Search (Best Buy, Clipboard.com)
• User Profile/Data Storage (Danish Government)
• Message Storage (Voxer, MIG-CAN)
• Gaming (MochiMedia)
Community and Open Source


• Very committed to our open source community
• Companies like Yammer, Comcast, DISQUS, Trifork
and Formspring contributing actively
• The Riak community is a great place to work and
play. Come join us!
Production Deployments



>250 clusters out there right now
Riak 1.0
             (dropping late Sept.)

• Secondary Indexing
• Lager - new logging framework
• riak_pipe - massive overhaul of M/R framework
• Riak Search Integration
• Lots o’ Bug Fixes/Stability Improvements
About Basho
   Offices in:
 San Francisco, CA;
  Cambridge, MA;
    Reston, VA

  Employees:
 All over the world

Total Employees:
       ~35
About Basho

        How do we pay the bills?
    Enterprise Licenses of Riak and SLA’d Support
                 Professional Services
                      Consulting

Other Prominent Open Source Software:
                    Webmachine
                      Rebar
                      Bitcask
                      Lager
Get Involved

•   wiki.basho.com
•   github.com/basho/*
•   basho.com
•   twitter.com/basho
•   Riak Mailing List
•   downloads.basho.com/riak/CURRENT

Editor's Notes (numbered by slide)

  3. * Lots of buzzwords
     * There are

  4. Talk about how this is born from what we built back at Basho
     * Sales app
     * Never wanted it to go down
     * Apple, Akamai pedigree
     * Fungible assets

  5. * Just wanted to touch on this a bit more
       - Every node is the same

  8. - After you download and build Riak, here’s all that stands between you and a cluster of N nodes.
     - We also have an extended suite of command line tools that make cluster management very simple.

  9. * These are both very well documented on the wiki
     * You can easily write your own client code to talk to Riak using these as guides

  10. * Webmachine: a REST toolkit, if you like

  11. * Buckets are used for basic data organization (not in Dynamo). Roughly analogous to “tables”, but only insofar as they can be used for organization. Buckets are also where you set certain properties.
      * JSON is a data type we see a lot. BSON showed up recently, too. Python byte code.

  12. * Dynamo differences
        - buckets

  13. ring, keyspace, vnode, partition, node.
      * The pref list is the list of the N vnodes on which the data should be stored.
      * The ease of scaling out here should be pretty apparent. We aren’t asking you to pick a shard key, etc.
      * You select an n_val that suits your business needs (durably store more copies of the data that matters more), and then add nodes to suit.
      * Each color is a physical node
      * Nodes run an equal # of partitions (optimistically)
      * Each partition then runs an Erlang process called a vnode
      * Vnodes are responsible for the handling of requests
      * Ring space = integer space of 2^160

  14. * These let developers and stakeholders tune consistency and availability in their apps at the bucket level (and we are working on making this more granular)

  16. Tunable consistency based on your business needs.
      Log data can be written with a W of 1 for speed.
      Prescription information can be read with an R val of 3 for higher consistency.

  17. Enable “multi-backend” in your config file; backends are configurable at the bucket level.
      * Bitcask - current default. We wrote this. Append-only on disk. Keeps a pointer to the most recent copy of each object in memory and updates it on each write. Performs merges to clean up old values. Super fast: < 5 ms latencies on most systems at 99.9%. Memory is the only concern (~100 bytes per object, depending on key size).
      * LevelDB - just released by Google. Permissive licensing, items are ordered on disk, far less drastic memory requirements per key (less RAM, slightly longer latencies). Opens the door for more time series/range based queries in Riak.
      * Innostore - recommended when RAM requirements might be too high. Embedded InnoDB. We will start recommending this less as LevelDB support becomes more stable.
      * In-memory - for testing. We are also deprecating them.

  18. * Links are metadata that establish one-way relationships between objects in Riak; once links are attached, you can then perform link-walking queries to find relationships. Example: bucket - talks, key - Seattle, riaktag=”talk” (in header).
      * MapReduce - chain any number of Map and Reduce phases. Map produces 0 or more results based on your function. Reduce combines the results of the Map phase and returns them to the client.
      * Pre-commit - JSON validation. Post-commit - send to another db/service.
      * Full Text Search - full-text search engine built around Riak Core and tightly integrated with Riak KV. Use a pre-commit hook at the bucket level to index.

  19. * We are pushing this super hard, as the next slide will show.

  20. * Client libs - Erlang, Java, JS, Python, node.js, Ruby (ripple), Haskell, Smalltalk, Go, Scala, PHP, Perl
      * Secondary indexing, deeper search integration, robust, better-defined MapReduce with pipe
      * Working on a successor to Rekon that will be an admin tool (use DISQUS anecdote about Cassandra Dashboard)

  21. * Comcast - requirement is basically an internal Amazon S3: a straight key/value store with an HTTP interface. This is to build a product called HOSS (high availability object storage system). This infrastructure is used across Comcast to store DVR data.
      * A European country is using Riak
      * MIG-CAN - SMS Gateway for British
      * Wikia - multiple data centers for session storage

  24. * Rusty’s 2I deck is available; code is in master
      * Lager blog post
      * pipe is in master; extensive readme on GitHub
      * Will be doing prereleases (packages, as opposed to building from source) - get on the mailing list to be part of this