Methods of benchmarking NoSQL database systems

                              Ilya Bakulin
                webmaster@kibab.com, kibab@FreeBSD.org
                               SMS Traffic


                              LVEE 2011




Ilya Bakulin (SMS Traffic)     NoSQL benchmarking          July 2, 2011   1 / 16
1   Introduction


2   YCSB benchmarking framework


3   YCSB practical usage


4   Results


5   Where to find further information




Why benchmarking NoSQL is necessary




    No guides / FAQs about performance are generally available, or are
    outdated
    NoSQL systems are actively developed
    Nobody wants to end up with a crashed DB in production right before
    a 2-week vacation




Why benchmarking NoSQL is complex




    RDBMSs use SQL to provide access to the data stored in them, while
    NoSQL systems don't
    Each NoSQL system uses a different protocol (Thrift, Memcached-style,
    its own protocol)
    Existing benchmarks require SQL to work with the database under
    test.




What is YCSB?




    YCSB stands for Yahoo Cloud Serving Benchmark
    Developed by Yahoo! Research group
    Open Source project, hosted on GitHub (178 watchers, 42 forks)




Architecture

    Java application
    Shipped with ready-to-use adapters for several popular open-source
    database systems

[Figure: YCSB client architecture. Benchmark tool: a Java application (many systems have Java APIs; other systems via HTTP/REST, JNI or some other solution). A workload parameter file (R/W mix, record size, data set, ...) and command-line parameters (DB to use, target throughput, number of threads, ...) configure the YCSB client, where a workload executor drives client threads that call the DB client against the cloud DB and feed a stats module. Extensible: define new workloads, plug in new clients.]

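The structure in the architecture figure (client threads, a DB client, a stats collector, a target throughput) can be illustrated with a minimal Python sketch. This is not YCSB code: `InMemoryDB`, `Stats`, `client_thread` and `run_benchmark` are hypothetical names, the "database" is a dictionary, and the pacing logic is only one simple way to approximate a target throughput.

```python
import threading
import time


class InMemoryDB:
    """Hypothetical stand-in for a DB client: a dict guarded by a lock."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert(self, key, value):
        with self._lock:
            self._rows[key] = value


class Stats:
    """Collects per-operation latencies reported by all client threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.latencies_ms = []

    def record(self, latency_ms):
        with self._lock:
            self.latencies_ms.append(latency_ms)


def client_thread(db, stats, thread_id, ops, ops_per_sec):
    """Run `ops` inserts, sleeping between them to hold the per-thread rate."""
    interval = 1.0 / ops_per_sec
    next_deadline = time.perf_counter()
    for i in range(ops):
        start = time.perf_counter()
        db.insert(f"user{thread_id}-{i}", {"field0": "x" * 10})
        stats.record((time.perf_counter() - start) * 1000.0)
        # Pace the thread: wait until the next slot of the schedule.
        next_deadline += interval
        delay = next_deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)


def run_benchmark(threads=4, ops_per_thread=50, target_throughput=2000):
    """Split the target throughput across threads and run them to completion."""
    db, stats = InMemoryDB(), Stats()
    per_thread_rate = target_throughput / threads
    workers = [
        threading.Thread(
            target=client_thread,
            args=(db, stats, tid, ops_per_thread, per_thread_rate),
        )
        for tid in range(threads)
    ]
    started = time.perf_counter()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    elapsed = time.perf_counter() - started
    return len(stats.latencies_ms), elapsed, stats
```

Calling `run_benchmark(threads=2, ops_per_thread=20, target_throughput=400)` performs 40 operations and takes roughly a tenth of a second, because each thread is held to 200 operations per second.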
More on DB interface

    Simple operations: INSERT, UPDATE, READ, DELETE, SCAN
    Does not use SQL
    ... but SQL support is available through a contributed JDBC driver
    ... Even sharding configurations are possible

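To make the shape of such an interface concrete, here is a minimal Python sketch of a storage adapter with five simple operations (insert, read, update, delete, scan). YCSB itself is written in Java; the class names `DB` and `DictDB`, the method signatures and the return conventions below are assumptions chosen for illustration, not the framework's actual API.

```python
from abc import ABC, abstractmethod


class DB(ABC):
    """Hypothetical adapter interface: one method per simple operation."""

    @abstractmethod
    def insert(self, table, key, values): ...

    @abstractmethod
    def read(self, table, key, fields=None): ...

    @abstractmethod
    def update(self, table, key, values): ...

    @abstractmethod
    def delete(self, table, key): ...

    @abstractmethod
    def scan(self, table, start_key, record_count, fields=None): ...


class DictDB(DB):
    """Toy adapter backed by nested dicts; a real one would call a DB client."""

    def __init__(self):
        self._tables = {}

    def insert(self, table, key, values):
        self._tables.setdefault(table, {})[key] = dict(values)
        return True

    def read(self, table, key, fields=None):
        row = self._tables.get(table, {}).get(key)
        if row is None:
            return None
        if fields is None:
            return dict(row)
        return {name: row[name] for name in fields if name in row}

    def update(self, table, key, values):
        row = self._tables.get(table, {}).get(key)
        if row is None:
            return False
        row.update(values)
        return True

    def delete(self, table, key):
        return self._tables.get(table, {}).pop(key, None) is not None

    def scan(self, table, start_key, record_count, fields=None):
        # Return up to record_count rows in key order, starting at start_key.
        rows = self._tables.get(table, {})
        keys = sorted(k for k in rows if k >= start_key)[:record_count]
        return [self.read(table, k, fields) for k in keys]
```

Because the benchmark only talks to this narrow interface, supporting another store means writing one more adapter class rather than changing the workload code.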
More on workload

    Specifies what DB operations are used by the application
    Also defines request distribution
    It is possible to specify record size
    It's possible to specify the number of records and to specify
    meaningful operations

[Figure: request distributions (Uniform, Zipfian, Latest). Each panel plots popularity against insertion order.]

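The three request distributions named in the figure can be sketched in a few lines of Python. This is an illustrative approximation, not YCSB's generator code: `ZipfianChooser` and `choose_key` are hypothetical names, keys are plain integers in insertion order, the skew parameter `theta=0.99` is an assumed value, and the Zipfian case here simply favours the oldest keys instead of scattering popular keys across the key space.

```python
import bisect
import itertools
import random


class ZipfianChooser:
    """Picks ranks 0..n-1 with probability proportional to 1 / (rank + 1) ** theta."""

    def __init__(self, n, theta=0.99, rng=None):
        self._n = n
        self._rng = rng or random.Random()
        weights = [1.0 / (rank + 1) ** theta for rank in range(n)]
        self._cumulative = list(itertools.accumulate(weights))

    def next_rank(self):
        # Inverse-CDF sampling over the cumulative weights.
        u = self._rng.random() * self._cumulative[-1]
        return min(bisect.bisect_right(self._cumulative, u), self._n - 1)


def choose_key(distribution, record_count, rng, zipf):
    """Return a key index in [0, record_count); higher index = inserted later."""
    if distribution == "uniform":
        return rng.randrange(record_count)
    if distribution == "zipfian":
        return zipf.next_rank()
    if distribution == "latest":
        # Same skew, but rank 0 maps to the most recently inserted record.
        return record_count - 1 - zipf.next_rank()
    raise ValueError(f"unknown distribution: {distribution}")
```

With 100 records, the "latest" setting requests key 99 far more often than any other, which is the behaviour wanted when the newest records are the hottest.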
SMS Traffic: workload construction




SMS service provider, several gateways, big clients (such as banks)
    15% inserts, 65% updates, 15% reads
    Request distribution: latest SMS messages are the "hottest" ones
    Evaluated Cassandra and sharded MySQL as DB storage for the next
    generation of the SMS sending platform




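As a rough illustration of how such an operation mix might be written down and sampled, consider the Python sketch below. The names `SMS_MIX`, `render_workload` and `next_operation` are hypothetical. The `...proportion` and `requestdistribution` keys follow the naming used by YCSB's core workload as far as I know, but treat the exact property names as an assumption and check them against the YCSB version in use.

```python
import random

# Operation mix from the slide. The three figures add up to 0.95; the
# remaining 5% is not specified in the source and is left unassigned here.
SMS_MIX = {"insert": 0.15, "update": 0.65, "read": 0.15}


def render_workload(mix, distribution="latest"):
    """Render the mix as key=value lines in the style of a workload parameter file."""
    lines = [f"{op}proportion={share}" for op, share in mix.items()]
    lines.append(f"requestdistribution={distribution}")
    return "\n".join(lines)


def next_operation(mix, rng):
    """Pick the next operation; random.choices normalises the weights itself."""
    ops = list(mix)
    return rng.choices(ops, weights=[mix[op] for op in ops], k=1)[0]
```

`render_workload(SMS_MIX)` yields four lines, one per proportion plus the request distribution, and repeated calls to `next_operation` return "update" roughly two times out of three.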
Testing process




    3 DBMS instances on one server (Core 2 Quad Q9400, 4 GB
    RAM, SATA-II HDD, FreeBSD 8.2-amd64)
    Cassandra 0.7.4 (1 GB Java heap / instance)
    MySQL 5.1 + InnoDB engine (1 GB InnoDB buffer pool size /
    instance)
    Client: separate machine, 1 Gb/s connection
Should avoid swapping and disk I/O saturation




Some results: Workload "A": 50% read / 50% write

[Figure: Workload A (update heavy, 50/50 read/update), read latency. Average read latency (ms) vs. throughput (ops/sec) for Cassandra, HBase, Sherpa, and MySQL.]
[Figure: Workload A, update latency. Update latency (ms) vs. throughput (ops/sec) for Cassandra, HBase, Sherpa, and MySQL.]

    Comment: Cassandra is optimized for writes, and achieves higher
    throughput and lower ...

Some results: Workload "B": 95% read / 5% write

[Figure: Workload B (read heavy, 95/5 read/update), read latency. Average read latency (ms) vs. throughput (operations/sec) for Cassandra, HBase, Sherpa, and MySQL.]
[Figure: Workload B, update latency. Average update latency (ms) vs. throughput (operations/sec) for Cassandra, HBase, Sherpa, and MySQL.]

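Each point on latency-versus-throughput plots like these pairs an achieved throughput with an average latency per operation type. A minimal Python sketch of that reduction follows; `summarize_run` and its input format (a list of operation/latency pairs plus the elapsed time) are assumptions made for illustration, not the output format of YCSB.

```python
from statistics import mean


def summarize_run(samples, elapsed_sec):
    """Reduce one run to a plot point.

    samples: list of (operation, latency_ms) tuples collected during the run.
    elapsed_sec: wall-clock duration of the run in seconds.
    """
    by_operation = {}
    for operation, latency_ms in samples:
        by_operation.setdefault(operation, []).append(latency_ms)
    return {
        "throughput_ops_sec": len(samples) / elapsed_sec,
        "avg_latency_ms": {op: mean(vals) for op, vals in by_operation.items()},
    }
```

Repeating the run at several target throughputs and plotting the resulting points gives one curve per database and operation type.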
Links




    Yahoo Cloud Serving Benchmark:
    https://github.com/brianfrankcooper/YCSB
    google://
    webmaster@kibab.com, kibab@FreeBSD.org




Presentation on how to chat with PDF using ChatGPT code interpreter
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Methods of NoSQL database systems benchmarking

  • 1. Methods of benchmarking NoSQL database systems Ilya Bakulin webmaster@kibab.com, kibab@FreeBSD.org SMS Traffic LVEE 2011 Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 1 / 16
  • 2. Outline: 1. Introduction; 2. YCSB benchmarking framework; 3. YCSB practical usage; 4. Results; 5. Where to find further information
  • 3. Why benchmarking NoSQL is necessary: performance guides and FAQs are generally unavailable or outdated; NoSQL systems are under active development; nobody wants to end up with a crashed DB in production right before a two-week vacation.
  • 4. Why benchmarking NoSQL is complex: RDBMSs expose their data through SQL, while NoSQL systems don't; each NoSQL system uses a different protocol (Thrift, Memcached-style, or its own); existing benchmarks require SQL to work with the database under test.
  • 5. What is YCSB? YCSB stands for Yahoo! Cloud Serving Benchmark. It was developed by the Yahoo! Research group and is an open-source project hosted on GitHub (178 watchers, 42 forks).
  • 6. Architecture: YCSB is a Java application shipped with ready-to-use adapters for several popular open-source database systems. Many systems have Java APIs; other systems can be reached via HTTP/REST, JNI or some other solution. [Figure: the YCSB client (workload executor, client threads, DB client, stats) runs against a cloud DB. Command-line parameters: DB to use, target throughput, number of threads, … Workload parameter file: R/W mix, record size, data set, … Extensible: define new workloads, plug in new clients.]
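The command-line parameters named on the architecture slide (DB to use, target throughput, number of threads, workload parameter file) correspond to the `-db`, `-target`, `-threads` and `-P` options of the YCSB client. The following is a minimal sketch of assembling such an invocation. The helper name `ycsb_command`, the classpath `build/ycsb.jar`, the adapter class `com.yahoo.ycsb.db.CassandraClient7` and the numeric values are assumptions for illustration and may differ between YCSB releases.

```python
def ycsb_command(db_class, workload_file, threads, target_ops,
                 load_phase=False, classpath="build/ycsb.jar"):
    """Build the argument list for one YCSB client run.

    The classpath default is an assumption; adjust it to the local build.
    """
    args = ["java", "-cp", classpath, "com.yahoo.ycsb.Client"]
    # -load populates the database; -t runs the transaction phase.
    args.append("-load" if load_phase else "-t")
    args += [
        "-db", db_class,               # DB adapter class to use
        "-P", workload_file,           # workload parameter file
        "-threads", str(threads),      # number of client threads
        "-target", str(target_ops),    # target throughput, ops/sec
    ]
    return args


# Hypothetical example values, not taken from the talk.
cmd = ycsb_command("com.yahoo.ycsb.db.CassandraClient7",
                   "workloads/workloada", threads=10, target_ops=5000)
print(" ".join(cmd))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting issues.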
  • 7. More on DB interface: simple operations (INSERT, UPDATE, REPLACE, DELETE, SCAN); does not use SQL, but SQL support is available through a contributed JDBC driver; even sharding configurations are possible. [Figure: the same YCSB architecture diagram as on the Architecture slide.]
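To illustrate what "plug in new clients" can look like, here is a minimal sketch of a storage adapter exposing a small set of simple operations. It is a hypothetical Python analogue, not YCSB code: the operation names follow the read/insert/update/delete/scan methods of YCSB's Java `DB` base class, while the class `InMemoryDB`, its return values and the table name `usertable` in the example are assumptions chosen for illustration.

```python
class InMemoryDB:
    """Hypothetical adapter: a dict-backed store with five simple operations."""

    def __init__(self):
        self.tables = {}  # table name -> {key -> {field -> value}}

    def insert(self, table, key, values):
        """Store a new record (overwrites an existing key)."""
        self.tables.setdefault(table, {})[key] = dict(values)
        return True

    def read(self, table, key, fields=None):
        """Return a copy of the record, or None if the key is absent."""
        record = self.tables.get(table, {}).get(key)
        if record is None:
            return None
        if fields is None:
            return dict(record)
        return {f: record[f] for f in fields if f in record}

    def update(self, table, key, values):
        """Merge new field values into an existing record."""
        record = self.tables.get(table, {}).get(key)
        if record is None:
            return False
        record.update(values)
        return True

    def delete(self, table, key):
        """Remove a record; report whether it existed."""
        return self.tables.get(table, {}).pop(key, None) is not None

    def scan(self, table, start_key, count, fields=None):
        """Return up to `count` records in key order, starting at start_key."""
        keys = sorted(k for k in self.tables.get(table, {}) if k >= start_key)
        return [self.read(table, k, fields) for k in keys[:count]]


db = InMemoryDB()
db.insert("usertable", "user1", {"field0": "a"})
db.insert("usertable", "user2", {"field0": "b"})
db.update("usertable", "user1", {"field0": "c"})
print(db.scan("usertable", "user1", 2))
```

Because the benchmark only talks to this narrow interface, a store with any native protocol can be measured by writing one adapter class.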
  • 10. More on workload: the workload specifies which DB operations the application uses and also defines the request distribution. It is possible to specify the record size and the number of records, as well as the number of operations. [Figure: request distributions: Uniform, Zipfian, Latest.]
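The three request distributions named on this slide can be sketched as key choosers over records numbered in insertion order. This is an illustrative approximation under stated assumptions, not YCSB's generator code: the function name `make_key_chooser`, the skew value 0.99 and the record counts are chosen here for the example. "Latest" is modelled as a Zipfian distribution over recency, so the most recently inserted record is the most popular.

```python
import random
from collections import Counter


def make_key_chooser(distribution, record_count, seed=0, skew=0.99):
    """Return a zero-argument function yielding record indexes 0..record_count-1."""
    rng = random.Random(seed)
    if distribution == "uniform":
        # Every record is equally likely.
        return lambda: rng.randrange(record_count)

    # Zipfian weights: popularity falls off as 1 / rank**skew.
    weights = [1.0 / (rank ** skew) for rank in range(1, record_count + 1)]
    ranks = range(record_count)

    def pick_rank():
        return rng.choices(ranks, weights=weights)[0]

    if distribution == "zipfian":
        # Rank 0 (here: the first inserted record) is the most popular.
        return pick_rank
    if distribution == "latest":
        # Mirror the ranks so the newest record is the most popular.
        return lambda: record_count - 1 - pick_rank()
    raise ValueError("unknown distribution: " + distribution)


for name in ("uniform", "zipfian", "latest"):
    chooser = make_key_chooser(name, record_count=100, seed=1)
    counts = Counter(chooser() for _ in range(5000))
    print(name, counts.most_common(3))
```

Printing the three most requested keys per distribution makes the difference visible: uniform has no clear favourite, while zipfian and latest concentrate requests at opposite ends of the insertion order.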
  • 11. /* TODO: Remove this crap */
  • 12. SMS Traffic: workload construction. SMS Traffic is an SMS service provider with several gateways and big clients (such as banks). Operation mix: 15% inserts, 65% updates, 15% reads. Request distribution: the latest SMS messages are the "hottest" ones. Cassandra and sharded MySQL were evaluated as DB storage for the next generation of the SMS sending platform.
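The operation mix and request distribution from this slide could be written down as a workload parameter file roughly as follows. This is a sketch under assumptions: the property names (`recordcount`, `operationcount`, `requestdistribution`, `insertproportion`, `updateproportion`, `readproportion`) are those of YCSB's CoreWorkload, but the record and operation counts are hypothetical placeholders, and the helper names are invented for the example. The percentages are used exactly as stated on the slide; they add up to 95%, and `choose_operation` below simply treats them as relative weights.

```python
import random

# Operation mix as stated on the slide (sums to 0.95, used as given).
SMS_MIX = {"insert": 0.15, "update": 0.65, "read": 0.15}


def choose_operation(mix, rng):
    """Pick the next operation, treating the shares as relative weights."""
    ops = list(mix)
    return rng.choices(ops, weights=[mix[op] for op in ops])[0]


def workload_properties(mix, distribution, record_count, operation_count):
    """Render the mix as key=value lines of a workload parameter file."""
    props = {
        "recordcount": record_count,        # hypothetical value
        "operationcount": operation_count,  # hypothetical value
        "requestdistribution": distribution,
    }
    for op, share in mix.items():
        props[op + "proportion"] = share
    return "\n".join(f"{key}={value}" for key, value in props.items())


text = workload_properties(SMS_MIX, "latest",
                           record_count=1000000, operation_count=1000000)
print(text)

rng = random.Random(42)
print([choose_operation(SMS_MIX, rng) for _ in range(5)])
```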
  • 13. Testing process: 3 DBMS instances on one server (Core 2 Quad Q9400, 4 GB RAM, SATA-II HDD, FreeBSD 8.2-amd64); Cassandra 0.7.4 (1 GB Java heap per instance); MySQL 5.1 with the InnoDB engine (1 GB InnoDB buffer pool per instance); client on a separate machine with a 1 Gb/s connection. The setup should avoid swapping and disk I/O saturation.
  • 14. Some results: Workload "A" (50% read / 50% write). [Figure: Workload A, read latency: average read latency (ms) vs. throughput (ops/sec) for Cassandra, HBase, Sherpa and MySQL.] Comment: Cassandra is optimized for writes, and achieves h…
  • 15. Some results: Workload "A" (50% read / 50% write, update heavy). [Figure: Workload A, update latency: update latency (ms) vs. throughput (ops/sec) for Cassandra, HBase, Sherpa and MySQL.] Comment (cut off in the source): Cassandra is optimized for writes, and achieves higher throughput and lower …
  • 16. Some results: Workload "B" (95% read / 5% write). [Figure: Workload B, read latency: average read latency (ms) vs. throughput (operations/sec) for Cassandra, HBase, Sherpa and MySQL.]
  • 17. Some results: Workload "B" (95% read / 5% write, read heavy). [Figure: Workload B, update latency: average update latency (ms) vs. throughput (operations/sec) for Cassandra, HBase, Sherpa and MySQL.]
  • 18. Links: Yahoo! Cloud Serving Benchmark: https://github.com/brianfrankcooper/YCSB; google://; contact: webmaster@kibab.com, kibab@FreeBSD.org