Cassandra Performance Evaluation with Compression
                           Schubert Zhang, May.2010
                           schubert.zhang@gmail.com

The current implementation of Cassandra's storage layer and indexing mechanism
only allows compression at the row level.


   Column Family Row Serialization Structure:
1. The old structure:
     bloom filter        Len (int) | HashCount (int) | BitSet
     index size          int
     index of block 0    FirstColumnName (Len(short)+name) | LastColumnName (Len(short)+name) | Offset (long, 0 for first block) | Block Width (long)
     index of block 1    (same fields as block 0)
     deletion meta       localDeletionTime (int) | markedForDeleteAt (long)
     column count        int
     column block 0      (uncompressed) Column0 | Column1 | Column2 | Column3
     column block 1      (uncompressed)
     each column         deleteMark (bool) | timestamp (long) | value (byte[])

2. The new structure (to support compression)
The new structure accommodates both the old (uncompressed) and the new (compressed)
formats.

     format              (int): -1 (old format), 0 (new, LZO compressed), 1 (new, GZ compressed), 2 (new, uncompressed)
     bloom filter        Len (int) | HashCount (int) | BitSet
     deletion meta       localDeletionTime (int) | markedForDeleteAt (long)
     column count        int
     column block 0      (compressed or not) Column0 | Column1 | Column2 | Column3
     column block 1      (compressed or not)
     each column         deleteMark (bool) | timestamp (long) | value (byte[])
     index size          int
     index of block 0    FirstColumnName (Len(short)+name) | LastColumnName (Len(short)+name) | Offset (long, 0 for first block) | Block Width (long) | Size on Disk (int)
     index of block 1    (same fields as block 0)
     index size’

If the first int (format) is -1, the rest of the row follows “The old
structure”, except that each “index of block” entry uses the new layout.
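
As an illustration of how a reader might dispatch on the format field, here is a minimal Python sketch. It is not Cassandra code: the names read_format and is_compressed are hypothetical, and it assumes the field is a big-endian signed 4-byte int, as Java's DataOutput.writeInt would write it.

```python
import io
import struct

# Format codes as listed in the new row structure.
FORMATS = {
    -1: "old format",
    0: "new, LZO compressed",
    1: "new, GZ compressed",
    2: "new, uncompressed",
}


def read_format(stream):
    """Read the leading 4-byte format field and return (code, description)."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise ValueError("truncated row: missing format field")
    # Assumption: big-endian signed int (Java DataOutput.writeInt byte order).
    (code,) = struct.unpack(">i", raw)
    if code not in FORMATS:
        raise ValueError(f"unknown format code {code}")
    return code, FORMATS[code]


def is_compressed(code):
    """Only codes 0 (LZO) and 1 (GZ) mark compressed column blocks."""
    return code in (0, 1)


if __name__ == "__main__":
    row = io.BytesIO(struct.pack(">i", 1) + b"rest of the row")
    code, name = read_format(row)
    print(code, name, is_compressed(code))
```

Putting the format first lets a reader pick the decoding path before it touches the bloom filter or any column block.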




     Benchmark:
1. A single node (one disk, 4GB RAM with 3GB for the JVM heap, 4 cores).
2. Dataset:
      ~200 bytes per column (Thrift compact encoding; the original CSV string is
~250 bytes)
      100,000 keys
      500,000,000 columns in total
      ~5,000 columns per key on average
3. Key Cache and Row Cache are both disabled.
4. The write and read clients each use 4 threads; the read test executes 10,000 read operations in total.
5. Every read operation reads only the first 100 columns of the specified key.
6. Read performance is measured after a major compaction, i.e. with only one SSTable.
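
The dataset figures can be cross-checked with a little arithmetic. The Python sketch below uses only the numbers listed above and treats ~200 bytes per column as exact, so the results are rough estimates. They leave out index, bloom filter and other per-row overhead.

```python
KEYS = 100_000
TOTAL_COLUMNS = 500_000_000
BYTES_PER_COLUMN = 200  # approximate encoded size from the dataset description

# Average number of columns stored under each key.
columns_per_key = TOTAL_COLUMNS // KEYS

# Estimated payload of an average row and of the whole dataset, in bytes.
row_bytes = columns_per_key * BYTES_PER_COLUMN
raw_bytes = TOTAL_COLUMNS * BYTES_PER_COLUMN

print(columns_per_key)  # 5000
print(row_bytes)        # 1000000
print(raw_bytes / 1e9)  # 100.0
```

The estimate of about 100 GB of column payload, with rows of about 1 MB, is in line with the uncompressed disk space and maximum row size reported in the matrix.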


     Compression Performance Matrix:
Field     Criteria              Uncompressed (Default)   Compressed (GZ)   Compressed (LZO)
Size      Disk Space            104.545GB                45.067GB          54.656GB
          Compression Ratio     1/1                      1/2.3             1/1.9
Compact   Major Time (h:mm)     3:16                     5:30              3:08
          Row Max Size (B)      1186948                  512475            624396
Write     Throughput (ops/s)    12635                    11806             11034
          Avg Latency (ms)      0.320                    0.334             0.347
          Min Latency (ms)      0.079                    0.083             0.089
          Max Latency (ms)      19331                    5128              10227
          Local Latency (ms)    0.032                    0.033             0.037
Read      Throughput (ops/s)    25                       28                25
          Avg Latency (ms)      159                      144               159
          Min Latency (ms)      1                        2                 1
          Max Latency (ms)      1038                     1526              619
          Local Latency (ms)    159                      144               159
Note:
1. The bottleneck for writes is CPU and memory.
      a)   In theory, a more powerful CPU and more RAM should give better
           performance.
      b)   Storing the commitlog on a dedicated disk may also give better results.
2. The bottleneck for reads is disk utilization (100%).
      a)   Too many seeks.
      b)   Every read needs 2 seeks to reach the row, so a read operation spends at
           least 20ms on disk seeks. The maximum throughput is therefore 50 ops/s.
      c)   If the row is compressed, one additional seek within the row is needed.
3. The compression ratio improves as the average row size grows.
      a)   Since our dataset is very random, the ratio is only about 1/2.
4. Compaction is CPU-bound, since compaction is single-threaded. Gzip compression
     is slower.
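
The seek budget in note 2 can be written out as a short calculation. The Python sketch below assumes 10 ms per seek, which is implied by "2 seeks, at least 20ms" but not stated as a measured value. It also assumes the single disk serves one seek at a time. The function name is hypothetical.

```python
def max_reads_per_second(seeks_per_read, seek_ms):
    """Upper bound on reads/s for one disk that serves seeks one at a time."""
    return 1000.0 / (seeks_per_read * seek_ms)


SEEK_MS = 10.0  # assumption: implied by "2 seeks ... at least 20ms"

# Two seeks to reach an uncompressed row.
uncompressed_bound = max_reads_per_second(2, SEEK_MS)
# One additional seek inside the row when it is compressed.
compressed_bound = max_reads_per_second(3, SEEK_MS)

print(uncompressed_bound)          # 50.0
print(round(compressed_bound, 1))  # 33.3
```

Both bounds count seek time only. The measured 25 to 28 ops/s is lower because each read also pays for data transfer and, for compressed rows, decompression.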

Configuration:
Parameter                       Value
KeysCached                      0
DiskAccessMode                  standard
SlicedBufferSizeInKB            64
FlushDataBufferSizeInMB         32
FlushIndexBufferSizeInMB        8
ColumnIndexSizeInKB             64
MemtableThroughputInMB          128
ConcurrentReads                 16
ConcurrentWrites                64
CommitLogSync                   periodic
CommitLogSyncPeriodInMS         10000


     Encoding + Compression:




1. The original text CSV column: ~250 bytes
2. With Thrift compact encoding: ~200 bytes
3. Encoding + compression, combined reduction ratio: ~1/3
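
The combined ratio can be reproduced from numbers given elsewhere in this report. The Python sketch below multiplies the encoding gain (250 to 200 bytes) by the on-disk compression gain, using the disk sizes from the compression matrix. The variable and function names are illustrative only.

```python
CSV_BYTES = 250       # original text column
ENCODED_BYTES = 200   # after Thrift compact encoding

# Disk space in GB from the compression performance matrix.
DISK_GB = {"uncompressed": 104.545, "gz": 45.067, "lzo": 54.656}


def combined_reduction(codec):
    """Return x such that the stored size is about 1/x of the original CSV."""
    encoding_gain = CSV_BYTES / ENCODED_BYTES
    compression_gain = DISK_GB["uncompressed"] / DISK_GB[codec]
    return encoding_gain * compression_gain


print(round(combined_reduction("gz"), 1))   # 2.9
print(round(combined_reduction("lzo"), 1))  # 2.4
```

Under these assumptions GZ comes out at about 1/2.9 and LZO at about 1/2.4, so the quoted ~1/3 fits the GZ case best.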


     Read Throughput/Latency by slice size (count of columns):
      Tested on LZO compressed data; 10,000 read operations executed in total.
Criteria \ Slice Size      50          500         5000
Throughput (ops/s)         25          21          15
Avg Latency (ms)           158.865     186.571     256.837
Min Latency (ms)           1.278       5.041       60.934
Max Latency (ms)           288.307     395.427     1223.202




[Figure: Read Throughput. Throughput (ops/s) vs. slice size (count of columns): 25 at 50, 21 at 500, 15 at 5000.]

[Figure: Read Latency. Avg, Min and Max latency (ms) vs. slice size (count of columns): 50, 500, 5000.]


     Read Throughput/Latency with KeyCache, mmap, etc:
      Tested on LZO compressed data, using the benchmark above; 10,000 read
operations executed in total.
Criteria \ Feature         KeyCache=100%          KeyCache=0                    KeyCache=0
                           DiskAccess=standard    DiskAccess=mmap_index_only    DiskAccess=mmap
Throughput (ops/s)         40                     40                            84
Avg Latency (ms)           100.522                101.762                       47.342
Min Latency (ms)           1.566                  1.453                         1.270
Max Latency (ms)           278.975                267.120                       239.816

[Figure: Read throughput (ops/s) by configuration: KeyCache_standard 40, mmap_index_only 40, mmap 84.]

However, over a long evaluation, performance with mmap is unstable. The following
evaluation executed 1,000,000 read operations. The instability may be caused by GC.




[Figure: Read Throughput (mmap). Throughput (ops/s, axis 0 to 120) over time (1-minute intervals, 1 to 281).]




                                                               5

Contenu connexe

Tendances

Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Big Data Spain
 
Process' Virtual Address Space in GNU/Linux
Process' Virtual Address Space in GNU/LinuxProcess' Virtual Address Space in GNU/Linux
Process' Virtual Address Space in GNU/LinuxVarun Mahajan
 
Multiprocessing with python
Multiprocessing with pythonMultiprocessing with python
Multiprocessing with pythonPatrick Vergain
 
Basic of Multithreading in JAva
Basic of Multithreading in JAvaBasic of Multithreading in JAva
Basic of Multithreading in JAvasuraj pandey
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...npinto
 
Concurrency in Python
Concurrency in PythonConcurrency in Python
Concurrency in PythonGavin Roy
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCIgor Sfiligoi
 
Java and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJava and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJAX London
 

Tendances (12)

Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
Memory efficient applications. FRANCESC ALTED at Big Data Spain 2012
 
PyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc AltedPyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc Alted
 
Process' Virtual Address Space in GNU/Linux
Process' Virtual Address Space in GNU/LinuxProcess' Virtual Address Space in GNU/Linux
Process' Virtual Address Space in GNU/Linux
 
Multi threading
Multi threadingMulti threading
Multi threading
 
Multiprocessing with python
Multiprocessing with pythonMultiprocessing with python
Multiprocessing with python
 
Basic of Multithreading in JAva
Basic of Multithreading in JAvaBasic of Multithreading in JAva
Basic of Multithreading in JAva
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
 
Concurrency in Python
Concurrency in PythonConcurrency in Python
Concurrency in Python
 
Storm 0.8.2
Storm 0.8.2Storm 0.8.2
Storm 0.8.2
 
Java threading
Java threadingJava threading
Java threading
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
 
Java and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJava and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk Pepperdine
 

En vedette

Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Schubert Zhang
 
Google Megastore
Google MegastoreGoogle Megastore
Google Megastorebergwolf
 
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...DataStax Academy
 
Megastore - ID2220 Presentation
Megastore - ID2220 PresentationMegastore - ID2220 Presentation
Megastore - ID2220 PresentationArinto Murdopo
 
TestNet thema avond 11-12-2013 - De T-Shaped Tester
TestNet thema avond 11-12-2013 - De T-Shaped TesterTestNet thema avond 11-12-2013 - De T-Shaped Tester
TestNet thema avond 11-12-2013 - De T-Shaped TesterRemi-Armand Collaris
 
Mongo db administration guide
Mongo db administration guideMongo db administration guide
Mongo db administration guideDeysi Gmarra
 
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...Álvaro Agea Herradón
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in DockerDataStax
 
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoAchim Friedland
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016DataStax
 
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...MongoDB
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...DataStax
 
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...DataStax
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseDataStax Academy
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101MongoDB
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...DataStax
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...DataStax
 
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...DataStax
 

En vedette (20)

Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Learning from google megastore (Part-1)
Learning from google megastore (Part-1)
 
Google Megastore
Google MegastoreGoogle Megastore
Google Megastore
 
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
 
Megastore - ID2220 Presentation
Megastore - ID2220 PresentationMegastore - ID2220 Presentation
Megastore - ID2220 Presentation
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
 
TestNet thema avond 11-12-2013 - De T-Shaped Tester
TestNet thema avond 11-12-2013 - De T-Shaped TesterTestNet thema avond 11-12-2013 - De T-Shaped Tester
TestNet thema avond 11-12-2013 - De T-Shaped Tester
 
Mongo db administration guide
Mongo db administration guideMongo db administration guide
Mongo db administration guide
 
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...
StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit ...
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in Docker
 
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
 
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...
Webinar: Compliance and Data Protection in the Big Data Age: MongoDB Security...
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
 
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101Ops Jumpstart: MongoDB Administration 101
Ops Jumpstart: MongoDB Administration 101
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
 
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
Stratio's Cassandra Lucene index: Geospatial Use Cases (Andrés de la Peña & J...
 

Similaire à Cassandra Compression and Performance Evaluation

Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1Hsien-Hsin Sean Lee, Ph.D.
 
A compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreA compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreTadeu Zagallo
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures小新 制造
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton insertsChris Adkin
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesRakuten Group, Inc.
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaCloudera, Inc.
 
Performance evaluation of Linux Discard Support
Performance evaluation of Linux Discard SupportPerformance evaluation of Linux Discard Support
Performance evaluation of Linux Discard SupportLukáš Czerner
 
Ca บทที่สี่
Ca บทที่สี่Ca บทที่สี่
Ca บทที่สี่atit604
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareDaniel Blezek
 
Sql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramSql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramChris Adkin
 
Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits ScyllaDB
 
Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Martin Hassner
 
Advance computer architecture
Advance computer architectureAdvance computer architecture
Advance computer architecturesuma1991
 
9.1-CSE3421-multicolumn-cache.pdf
9.1-CSE3421-multicolumn-cache.pdf9.1-CSE3421-multicolumn-cache.pdf
9.1-CSE3421-multicolumn-cache.pdfrishav957243
 
Apache Cassandra Opinion and Fact
Apache Cassandra Opinion and FactApache Cassandra Opinion and Fact
Apache Cassandra Opinion and Factmediumdata
 

Similaire à Cassandra Compression and Performance Evaluation (20)

Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
 
A compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreA compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCore
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select Dictionaries
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
 
Performance evaluation of Linux Discard Support
Performance evaluation of Linux Discard SupportPerformance evaluation of Linux Discard Support
Performance evaluation of Linux Discard Support
 
Atc..
Atc..Atc..
Atc..
 
Ca บทที่สี่
Ca บทที่สี่Ca บทที่สี่
Ca บทที่สี่
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics Hardware
 
Lecture 25
Lecture 25Lecture 25
Lecture 25
 
SFScon19 - Davide Montesin - Why you should consider using btrfs
SFScon19 - Davide Montesin - Why you should consider using btrfsSFScon19 - Davide Montesin - Why you should consider using btrfs
SFScon19 - Davide Montesin - Why you should consider using btrfs
 
Architecture Assignment Help
Architecture Assignment HelpArchitecture Assignment Help
Architecture Assignment Help
 
Sql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ramSql server engine cpu cache as the new ram
Sql server engine cpu cache as the new ram
 
Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits
 
Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898
 
Advance computer architecture
Advance computer architectureAdvance computer architecture
Advance computer architecture
 
9.1-CSE3421-multicolumn-cache.pdf
9.1-CSE3421-multicolumn-cache.pdf9.1-CSE3421-multicolumn-cache.pdf
9.1-CSE3421-multicolumn-cache.pdf
 
Apache Cassandra Opinion and Fact
Apache Cassandra Opinion and FactApache Cassandra Opinion and Fact
Apache Cassandra Opinion and Fact
 
10
1010
10
 

Cassandra Compression and Performance Evaluation

  • 1. Cassandra Performance Evaluation with Compression
    Schubert Zhang, May 2010
    schubert.zhang@gmail.com

    The current implementation of Cassandra’s storage layer and indexing mechanism allows compression only at the row level.

    Column Family Row Serialization Structure:

    1. The old structure:

       bloom filter                      Len (int), HashCount (int), BitSet
       index size                        int
       index of block 0                  FirstColumName (Len(short)+name), LastColumName (Len(short)+name),
                                         Offset (long, 0 for first block), Block Width (long)
       index of block 1                  (same layout as index of block 0)
       deletion meta                     localDeletionTime (int), markedForDeleteAt (long)
       column count                      int
       column block 0 (uncompressed)     Column0, Column1, Column2, Column3
       column block 1 (uncompressed)     deleteMark (bool), timestamp (long), value (byte[])

    2. The new structure (to support compression):

    The new structure is appropriate for both the old (uncompressed) and the new (compressed) format.

       format                            (int): -1 (old format), 0 (new, LZO compressed), 1 (new, GZ compressed), 2 (new, uncompressed)
       bloom filter                      Len (int), HashCount (int), BitSet
       deletion meta                     localDeletionTime (int), markedForDeleteAt (long)
       column count                      int
       column block 0 (compressed or not)   Column0, Column1, Column2, Column3
       column block 1 (compressed or not)   deleteMark (bool), timestamp (long), value (byte[])
       index size                        int
       index of block 0                  FirstColumName (Len(short)+name), LastColumName (Len(short)+name),
                                         Offset (long, 0 for first block), Block Width (long), Size on Disk (int)
       index of block 1                  (same layout as index of block 0)
       index size’

    If the first int (format) is -1, the rest of the row follows the old structure, except that each "index of block" entry uses the new layout.
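The new-style "index of block" entry described above can be decoded with a few fixed-width reads. The following is a minimal sketch, not the actual Cassandra code: it assumes big-endian integers (the Java DataOutput convention) and that Len(short) is the name length in bytes. The function names read_name, read_block_index and write_block_index are hypothetical helpers introduced only for illustration.

```python
import struct

# Format codes as listed on the slide; the strings are descriptive labels only.
FORMATS = {
    -1: "old format",
    0: "new, LZO compressed",
    1: "new, GZ compressed",
    2: "new, uncompressed",
}


def read_name(buf, pos):
    """Read one Len(short)+name field; return (name_bytes, next_pos)."""
    (length,) = struct.unpack_from(">h", buf, pos)  # assumption: big-endian short
    pos += 2
    return buf[pos:pos + length], pos + length


def read_block_index(buf, pos):
    """Decode one new-style 'index of block' entry starting at pos."""
    first, pos = read_name(buf, pos)
    last, pos = read_name(buf, pos)
    # Offset (long), Block Width (long), Size on Disk (int)
    offset, width, size_on_disk = struct.unpack_from(">qqi", buf, pos)
    pos += struct.calcsize(">qqi")
    entry = {
        "first": first,
        "last": last,
        "offset": offset,
        "width": width,
        "size_on_disk": size_on_disk,
    }
    return entry, pos


def write_block_index(first, last, offset, width, size_on_disk):
    """Encode an entry in the same layout (used here only to build test data)."""
    return (
        struct.pack(">h", len(first)) + first
        + struct.pack(">h", len(last)) + last
        + struct.pack(">qqi", offset, width, size_on_disk)
    )


# Round-trip a made-up entry: a 64 KB block that occupies 30,000 bytes on disk.
raw = write_block_index(b"col0000", b"col0319", 0, 65536, 30000)
entry, end = read_block_index(raw, 0)
print(entry, end == len(raw))
```

Keeping both Block Width and Size on Disk in the entry lets a reader locate a block by its on-disk size while still knowing how large it is once decompressed, which is presumably why the new layout adds the extra int.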
  • 2. Benchmark:

    1. A single node (one disk, 4 GB RAM with 3 GB for the JVM heap, 4 cores).
    2. Dataset:
       ~200 bytes per column (Thrift compact encoding; the original CSV string is ~250 bytes)
       100,000 keys
       500,000,000 columns in total
       ~5,000 columns per key on average
    3. Key cache and row cache are both disabled.
    4. The write or read client uses 4 threads and executes 10,000 read operations in total.
    5. Every read operation reads only the first 100 columns of the specified key.
    6. Read performance is measured after major compaction, i.e. with only one SSTable.

    Compression Performance Matrix:

    Field     Criteria              Uncompressed (Default)   Compressed (GZ)   Compressed (LZO)
    Size      Disk Space            104.545 GB               45.067 GB         54.656 GB
              Compression Ratio     1/1                      1/2.3             1/1.9
    Compact   Major Time (h:mm)     3:16                     5:30              3:08
              Row Max Size (B)      1186948                  512475            624396
    Write     Throughput (ops/s)    12635                    11806             11034
              Avg Latency (ms)      0.320                    0.334             0.347
              Min Latency (ms)      0.079                    0.083             0.089
              Max Latency (ms)      19331                    5128              10227
              Local Latency (ms)    0.032                    0.033             0.037
    Read      Throughput (ops/s)    25                       28                25
              Avg Latency (ms)      159                      144               159
              Min Latency (ms)      1                        2                 1
              Max Latency (ms)      1038                     1526              619
              Local Latency (ms)    159                      144               159

    Note:

    1. The write bottleneck is CPU and memory.
       a) In theory, a more powerful CPU and more RAM should give better performance.
       b) Storing the commitlog on a dedicated disk may also improve the result.
    2. The read bottleneck is disk utilization (100%).
       a) Too many seeks.
       b) Every read needs 2 seeks to reach the row, so a read operation spends at least 20 ms on disk seeks. The maximum throughput is therefore 50 ops/s.
       c) If the row is compressed, one additional seek within the row is needed.
    3. The compression ratio improves as the average row size grows.
       a) Because our dataset is very random, the ratio is only about 1/2.
    4. Compaction is CPU-bound because it is single-threaded. Gzip compression is slower.
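The seek arithmetic in note 2 can be written out as a small calculation. This is a rough sketch under stated assumptions: the 10 ms per seek is not given explicitly but is implied by "2 seeks, at least 20 ms", and the model treats the single disk as serving one read at a time. The function name max_read_throughput is hypothetical.

```python
def max_read_throughput(seeks_per_read, seek_ms):
    """Upper bound on reads per second when each read pays for its seeks in turn."""
    return 1000.0 / (seeks_per_read * seek_ms)


SEEK_MS = 10.0  # assumption: implied by "2 seeks ... at least 20 ms"

# Uncompressed row: 2 seeks to reach the row.
uncompressed_bound = max_read_throughput(2, SEEK_MS)

# Compressed row: one additional seek inside the row (note 2c).
compressed_bound = max_read_throughput(3, SEEK_MS)

print(f"uncompressed bound: {uncompressed_bound:.1f} ops/s")
print(f"compressed bound:   {compressed_bound:.1f} ops/s")
```

Under these assumptions the bound is 50 ops/s without compression and roughly 33 ops/s with the extra in-row seek; the measured read throughput of 25 to 28 ops/s sits below both bounds.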
  • 3. Configuration:

    Parameter                    Value
    KeysCached                   0
    DiskAccessMode               standard
    SlicedBufferSizeInKB         64
    FlushDataBufferSizeInMB      32
    FlushIndexBufferSizeInMB     8
    ColumnIndexSizeInKB          64
    MemtableThroughputInMB       128
    ConcurrentReads              16
    ConcurrentWrites             64
    CommitLogSync                periodic
    CommitLogSyncPeriodInMS      10000

    Encoding + Compression:

    1. The original text CSV column: ~250 bytes
    2. With Thrift compact encoding: ~200 bytes
    3. Encoding + compression, combined reduction ratio: ~1/3

    Read Throughput/Latency by slice size (count of columns):

    Tested on LZO-compressed data; 10,000 read operations were executed in total.

    Criteria              Slice Size 50   Slice Size 500   Slice Size 5000
    Throughput (ops/s)    25              21               15
    Avg Latency (ms)      158.865         186.571          256.837
    Min Latency (ms)      1.278           5.041            60.934
    Max Latency (ms)      288.307         395.427          1223.202
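The combined "encoding + compression" ratio can be reproduced from figures already reported: the per-column sizes above and the disk-space numbers from the compression performance matrix. The sketch below is only a worked calculation; it assumes the two reductions multiply, which the slides do not state explicitly, and the variable names are made up for illustration.

```python
CSV_BYTES = 250      # original text CSV column, approximate
THRIFT_BYTES = 200   # after Thrift compact encoding, approximate

# Disk space in GB after major compaction, from the performance matrix.
DISK_GB = {"uncompressed": 104.545, "gz": 45.067, "lzo": 54.656}

encoding_factor = THRIFT_BYTES / CSV_BYTES  # 0.8

combined = {}
for codec in ("gz", "lzo"):
    compression_factor = DISK_GB[codec] / DISK_GB["uncompressed"]
    # Assumption: encoding and compression savings compose multiplicatively.
    combined[codec] = encoding_factor * compression_factor
    print(f"{codec}: compression 1/{1 / compression_factor:.1f}, "
          f"combined 1/{1 / combined[codec]:.1f}")
```

With GZ the combined factor comes out close to the ~1/3 quoted above; with LZO it is somewhat weaker, at roughly 1/2.4.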
  • 4. [Chart: Read Throughput (ops/s) by Slice Size (Count of Columns): 25 at 50, 21 at 500, 15 at 5000]

    [Chart: Read Latency (ms) by Slice Size (Count of Columns), showing Avg, Min and Max Latency]

    Read Throughput/Latency with KeyCache, mmap, etc.:

    Tested on LZO-compressed data using the benchmark above; 10,000 read operations were executed in total.

    Criteria              KeyCache=100%          KeyCache=0                   KeyCache=0
                          DiskAccess=standard    DiskAccess=mmap_index_only   DiskAccess=mmap
    Throughput (ops/s)    40                     40                           84
    Avg Latency (ms)      100.522                101.762                      47.342
    Min Latency (ms)      1.566                  1.453                        1.270
    Max Latency (ms)      278.975                267.120                      239.816

    [Chart: Throughput (ops/s) for KeyCache_standard (40), mmap_index_only (40) and mmap (84)]

    However, over a long evaluation, performance with mmap is unstable. The following evaluation executed 1,000,000 read operations. The instability may be caused by GC.
  • 5. [Chart: Read Throughput (mmap), Throughput (ops/s, axis 0 to 120) over Time (1-minute intervals, 1 to 281)]
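The reported read throughput and average latency figures can be cross-checked against each other. The sketch below assumes a closed-loop client: each of the 4 benchmark threads issues its next read as soon as the previous one returns, and the reported average latency is what the client observes. Neither point is stated in the slides, so this is a plausibility check, not a description of the actual test harness. The name closed_loop_throughput is hypothetical.

```python
THREADS = 4  # client threads, from the benchmark description

# (label, reported avg latency in ms, reported throughput in ops/s)
READ_RESULTS = [
    ("uncompressed", 159.0, 25),
    ("GZ", 144.0, 28),
    ("LZO, slice 50", 158.865, 25),
    ("LZO, slice 500", 186.571, 21),
    ("LZO, slice 5000", 256.837, 15),
    ("KeyCache=100%, standard", 100.522, 40),
    ("mmap_index_only", 101.762, 40),
    ("mmap", 47.342, 84),
]


def closed_loop_throughput(threads, avg_latency_ms):
    """Throughput when each thread always has exactly one request in flight."""
    return threads * 1000.0 / avg_latency_ms


for label, latency_ms, reported in READ_RESULTS:
    predicted = closed_loop_throughput(THREADS, latency_ms)
    print(f"{label:26s} predicted {predicted:6.1f} ops/s, reported {reported}")
```

Every predicted value lands within 1 ops/s of the reported throughput, which suggests the read figures are internally consistent with a 4-thread closed-loop client.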