SlideShare a Scribd company logo
1 of 11
Using NoSQL Databases
for Interactive Applications
By Alexey Diomin and Kirill Grigorchuk
2
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Contents
Introduction 3
Cassandra, MongoDB, and Couchbase 3
Key Considerations for Interactive Applications 3
Performance Benchmarking 5
Results 7
Analysis 10
Conclusion 10
About the authors 11
Additional Links 11
3
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Introduction
Interactive web applications need high-performance and scalability, calling for a different kind of
database. If your website is not fast enough, users may quickly abandon it and look for alternatives. For
example, in paid online social games, players are extremely demanding and will drop out, even if there is
a slight delay. To deliver the best user experience, you must pick the right database.
Traditional RDBMS are the wrong tool for the job because they do not provide the necessary scalability
and performance for working with large amounts of data and application requests. In contrast, NoSQL
databases have become a viable alternative to RDBMS, particularly for applications that need to change
rapidly. They provide high throughput, low latency, and horizontal scaling. But with so many different
options around, choosing the right NoSQL database for your specific application needs can be tricky.
Recently we took the time to review and benchmark several NoSQL databases. This whitepaper provides
an overview of three popular NoSQL solutions: Cassandra, MongoDB, and Couchbase. In addition, it
presents a vendor-independent performance comparison of these products and can be used as a guide
when choosing a NoSQL database for an interactive application.
Cassandra, MongoDB, and Couchbase
Since we had to pick some NoSQL databases to start with, we looked around for commonly used open-
source NoSQL solutions. Cassandra, Couchbase, and MongoDB seemed to be the most mature open
source products in their class. If you are already familiar with these NoSQL databases, you might want to
skip the rest of this section and go directly to the performance evaluation.
Cassandra is a distributed columnar key-value database with eventual consistency. It is optimized for
write operations and has no central master—data can be written or read to and from any of the nodes in
the cluster. Cassandra provides seamless horizontal scaling and has no single point of failure—if a node
in the cluster fails, another node steps up to replace it. At the moment, Cassandra is an Apache 2.0
licensed project supported by the Apache Community.
MongoDB is a schema-free, document-oriented, NoSQL database. In MongoDB, data is stored in the
BSON format—BSON document is essentially a JSON document represented in a binary format, which
allows for easier and faster integration of data in certain types of applications. This database also
provides horizontal scalability and has no single point of failure. However, a MongoDB cluster is different
from a Cassandra or Couchbase Server cluster—it includes an arbiter, a master, and multiple slaves. As
of 2009, MongoDB is an open source project with an AGPL license supported by 10Gen.
Couchbase is a NoSQL document database. Documents in Couchbase Server are stored as JSON. With
built-in caching, Couchbase provides low-latency read and write operations with linearly scalable
throughput. The architecture has no single point of failure. It is easy to scale-out the cluster and support
live cluster topology changes. This means, there is no application downtime when you are upgrading your
database, software, or hardware using rolling upgrades. Couchbase, Inc. develops and provides
commercial support for the Couchbase Apache 2.0 licensed project.
Key Considerations for Interactive Applications
our database is the workhorse for your Web application. When choosing a database, the following factors
are important to keep in mind:
1 Scalability: It’s  hard  to   predict  when  your  application  needs  to  scale,  but  when  your   Web site
traffic suddenly spikes and your database does not have enough capacity, you need to scale your
database quickly, on demand, and without any application changes. Similarly, when your system
is idle, you should have a possibility to decrease the amount of resources used. Scaling your
4
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
database must be a simple operation—you should not need to deal with complicated procedures
or make any changes to your application.
In this paper, we will only speak about horizontal scalability, which involves dividing a system into
small structural components hosted on different physical machines (or groups of machines)
and/or increasing the number of servers that perform the same function in parallel.
a Cassandra meets the requirements of an ideal horizontally scalable system. Nodes can
be added seamlessly as you need more capacity. The cluster automatically utilizes the
new resources. A node can be decommissioned in automatic or semi-automatic mode.
b Couchbase scales horizontally. All nodes are identical and easy to setup. Nodes can be
added or removed from the cluster with a single button click and no changes to the
application. Auto-sharding evenly distributes data across all nodes in the cluster without
any hotspots. Cross datacenter replication makes it possible to scale a cluster across
datacenters for better data locality and faster data access.
c MongoDB—this database has a number of functions related to scalability. These include:
automatic sharding (auto-partitioning of data across servers), reads and writes distributed
over shards, and eventually-consistent reads that can be distributed over replicated
servers. When the system is idle, cluster size can only be decreased manually. The
administrator  uses  the  management  console  to  change  the  system’s configuration. After
that, the server process of MongoDB can be safely stopped on the vacant machines.
2 Performance: Interactive applications require very low read and write latencies. The database
must deliver consistently low latencies for read and write operations independent of load or the
size of data being accessed. In general, the read and write latency of NoSQL databases is very
low because data is shared across all the nodes in a cluster while the application’s working set is
in memory.
Interactive applications need to support millions of users and have different workloads—read,
write, or mixed. In the next section, we share some performance test results on different NoSQL
databases measuring latency versus varying levels of throughput.
3 Availability: Interactive Web applications need a database that is highly available. If your
application is down, you simply are not making any money. To ensure high availability, your
solution should be able to do online upgrades to the latest version, easily remove a node for
maintenance without affecting the availability of the cluster, handle online operations, such as
backups, and provide disaster recovery, if an entire datacenter goes down.
Below are examples of how availability is achieved in different NoSQL databases:
a Cassandra: Every node in a Cassandra  cluster,  or  “ring”,  is  given  a  range  of  data  for  
which it is responsible. When Cassandra receives a write operation designated to be
stored in a node that has failed, it will automatically route the write request to a node that
is alive. The node that receives the write request saves the write operation with a hint.
The hint is a message that contains information about the failed node that should have
handled the write request. The node that holds the hint monitors the node ring for the
recovery of the failed node that missed the write request. If the failed node comes back
online, the node that holds the hint will handoff the hint message to the recovered node,
so that the write requests can be persisted in their proper location. When a new node is
added to the cluster, the workload is distributed to this new node as well.
b Couchbase: Couchbase Server maintains multiple copies (up to 3 replicas) of each
document in a cluster. Each server is identical and serves active and replica documents.
Data is uniformly distributed across all the nodes and the clients are aware of the
topology. If a node in the cluster fails, Couchbase Server detects the failure and
promotes replica documents on other live nodes to active. The client cluster map is
updated to reflect the new topology, so the application continues to work without
5
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
downtime. When capacity is added, data is rebalanced automatically, also without any
downtime.
c MongoDB: Data in MongoDB is spread across several shards. Typically, each shard
(replica set) consists of multiple mongo-daemon instances, including an arbiter node, a
master node, and multiple slaves. If a slave node fails, the master node automatically
redistributes the workload to the rest of the slave nodes. In case the master node
crashes, the arbiter node elects a new master. If the arbiter node fails and there are no
instances left in the shard, the shard is dead. In MongoDB, a replica set can span across
multiple datacenters but writes can only go to one primary instance in one data-center
(master-slave replication).
4 Ease of development: Relational databases require a rigid schema to model an application. If
your application changes, your database schema needs to change as well. In this regard, NoSQL
databases have the following advantages:
a Flexible schema: You do not have to modify the existing structural elements when new
fields are added to a document. New documents can co-exist with existing documents
without any additional changes.
b Simple query language: Because data in a NoSQL document is stored in a de-
normalized state, you can get and update a document with the help of put and get
operations.
Performance Benchmarking
Our test infrastructure consisted of 4 extra-large instances on Amazon EC2 for the NoSQL databases and
1 instance for the client. Each instance had 4 virtual CPU cores with 2 Amazon compute units per core,
15GB of RAM, and 4 EBS 50GB volumes with RAID 0 striping. We used 64-bit Amazon Linux as the OS.
Networking was all 10GigE.
The client used the Yahoo! Cloud Serving Benchmark (YCSB), which was modified to suit our needs—we
added a warm-up phase and adjusted working-set load generation that simulates different users
accessing different data objects with meaningful data amounts and runtime. As shown in Figure 1, the
YCSB client consists of two main parts—the workload generator and workload scenarios.
The benchmark had 30 parallel client threads to drive the test, generating a mixed read-write workload
with 5% of creates, 33% of updates, 2% of deletes, and 60% of reads. For all the tests, we used 1.5 KB
documents (15 fields and 100 bytes each)—a typical document size across several NoSQL database
use-cases. The total number of documents in the cluster was 30 million—15 million of active and 15
million of replica documents for each database.
6
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Figure 1: YCSB client—NoSQL database server architecture
We ran each test five times for every NoSQL database and compared the average data access latency
against different throughput levels. The NoSQL databases were setup using the following configuration:
Cassandra 1.1.2
Cassandra JVM settings:
1 MAX_HEAP_SIZE, which is the total amount of memory dedicated to the Java heap—6GB
2 HEAP_NEWSIZE, which is the total amount of memory for a new generation of objects—400MB
Cassandra settings:
1 RandomPartitioner that uses MD5 Hashing to evenly distribute rows across the cluster
2 Memtable of 4GB in size
7
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Couchbase 2.0 - Beta build 1723
1 1 replica setting
2 12 GB used as per node RAM quota using the Couchbase bucket type
MongoDB
1 4 shards, each with 1 replica; each shard is a set of 2 nodes—primary and secondary
2 Journaling disabled
3 Each node was running 2 mongo daemon processes and 4 mongo router processes.
Results
Figure 2 shows the average latency at varying throughput levels for read, insert, and update operations
measured from the client to the server and back against varying levels of throughput for each NoSQL
database. The lower the latency values, the better.
8
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Figure 2: Average latency vs. throughput
We also calculated the 95th percentile time taken for a request to execute a read, insert and update
operations measured from the client to the server and back against varying levels of throughput for each
NoSQL database.
9
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Figure 3: 95th Percentile latency vs. throughput
10
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
Typically, you want to see flat latency curves irrespective of the throughput to ensure a consistent user
experience. Couchbase had faster read and write times than MongoDB and Cassandra.
Analysis
While not an exhaustive list, these are the most relevant pros and cons identified after reviewing these
databases:
MongoDB demonstrated the lowest throughput among all the databases compared in our test. We saw
high latencies for write operations at average throughput because the coarser locking in MongoDB limits
the write throughput of the server. Read requests were faster than in Cassandra but slower than in
Couchbase.
Increasing the size of the cluster in MongoDB was rather complicated. Many MongoDB operations need
to be done manually through the command line and it is mandatory that you have a highly skilled system
administrator. The advantages include support for in-built MapReduce and CAS transactions.
Cassandra showed better results than MongoDB because it uses an eventually consistent architecture
where in order to confirm a record you only need a reply from one node. In addition, unlike MongoDB,
Cassandra is rather flexible when the cluster needs to be resized. Unfortunately, its extreme
flexibility designed to sustain performance in highly distributed environments resulted in
additional limitations. The database supports no transactions and cannot block separate
records.
Couchbase showed the lowest latencies and highest throughput among all the databases compared.
The in-built object managed cache is responsible for the low latency. With fine grain locking at the
document level, Couchbase Server was capable of providing high throughput for both reads and writes.
The admin console in Couchbase has flexible settings for changing cluster size. Each document in the
cluster has an active copy and multiple replicas. Access requests for a particular document are processed
by the server holding the active document, which makes it possible to add extended transaction
processing systems, locking, and CAS. This also eliminates the problem with eventual consistency, when
read replicas have obsolete values. As a bonus for database administrators, Couchbase also comes with
advanced tools for monitoring the status of the whole cluster and its separate nodes.
Conclusion
Choosing the right NoSQL database for your application is a very complicated process because every
NoSQL solution is optimized for a particular type of load. This is why you should properly evaluate all
available options before picking a suitable data store for your application.
11
Using NoSQL Databases for Interactive Applications
©  Altoros  Systems
About the authors
Kirill Grigorchuk is the head of R&D department at Altoros Systems Inc. Mr. Grigorchuk has 15+ years
of experience in IT and profound skills in R&D process engineering, product and project management,
Web development, and big data. At the moment, he leads and coordinates research into a wide range of
cutting edge technologies, including distributed computing and NoSQL solutions.
Alexey Diomin is a senior Java developer at Altoros Systems Inc. with vast experience in distributed
computing, NoSQL databases, and Linux. Having excellent skills in building, administering, and
supporting large-scale distributed computing systems, Mr. Diomin did an extensive research into the field
of big data.
Additional Links
Cassandra website — http://cassandra.apache.org
Couchbase server website — http://www.couchbase.com
Mongodb website — http://www.mongodb.org
YCSB Github — https://github.com/Altoros/YCSB

More Related Content

What's hot

Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databasesguestdfd1ec
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overviewPritamKathar
 
Cassandra vs. MongoDB
Cassandra vs. MongoDBCassandra vs. MongoDB
Cassandra vs. MongoDBScaleGrid.io
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoopChiou-Nan Chen
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introductionPooyan Mehrparvar
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMIJCI JOURNAL
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseDataStax
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsOleg Magazov
 
Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Jay Patel
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0Asis Mohanty
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbaseDipti Borkar
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentationsharonyb
 
cassandra
cassandracassandra
cassandraAkash R
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandraNguyen Quang
 

What's hot (20)

Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Cassandra vs. MongoDB
Cassandra vs. MongoDBCassandra vs. MongoDB
Cassandra vs. MongoDB
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoop
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and Basics
 
Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0
 
Cassandra Architecture FTW
Cassandra Architecture FTWCassandra Architecture FTW
Cassandra Architecture FTW
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbase
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentation
 
cassandra
cassandracassandra
cassandra
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 

Viewers also liked

Couchbase overview033113long
Couchbase overview033113longCouchbase overview033113long
Couchbase overview033113longJeff Harris
 
Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Jeff Harris
 
Partes de la computadora hernandez
Partes de la computadora hernandezPartes de la computadora hernandez
Partes de la computadora hernandezYanira Cifuentes
 
Partes de la computadora hernandez
Partes de la computadora hernandezPartes de la computadora hernandez
Partes de la computadora hernandezYanira Cifuentes
 
Sistema puesta a tierra Saia UFT
Sistema puesta a tierra Saia UFTSistema puesta a tierra Saia UFT
Sistema puesta a tierra Saia UFTmarielisatovar
 
Couchbase overview033113long
Couchbase overview033113longCouchbase overview033113long
Couchbase overview033113longJeff Harris
 
Build Application With MongoDB
Build Application With MongoDBBuild Application With MongoDB
Build Application With MongoDBEdureka!
 

Viewers also liked (8)

Couchbase overview033113long
Couchbase overview033113longCouchbase overview033113long
Couchbase overview033113long
 
Couchbase Overview Nov 2013
Couchbase Overview Nov 2013Couchbase Overview Nov 2013
Couchbase Overview Nov 2013
 
Partes de la computadora hernandez
Partes de la computadora hernandezPartes de la computadora hernandez
Partes de la computadora hernandez
 
Partes de la computadora hernandez
Partes de la computadora hernandezPartes de la computadora hernandez
Partes de la computadora hernandez
 
Sistema puesta a tierra Saia UFT
Sistema puesta a tierra Saia UFTSistema puesta a tierra Saia UFT
Sistema puesta a tierra Saia UFT
 
Couchbase overview033113long
Couchbase overview033113longCouchbase overview033113long
Couchbase overview033113long
 
French English Relations
French English RelationsFrench English Relations
French English Relations
 
Build Application With MongoDB
Build Application With MongoDBBuild Application With MongoDB
Build Application With MongoDB
 

Similar to Altoros using no sql databases for interactive_applications

Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQLbalwinders
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
NOSQL in big data is the not only structure langua.pdf
NOSQL in big data is the not only structure langua.pdfNOSQL in big data is the not only structure langua.pdf
NOSQL in big data is the not only structure langua.pdfajajkhan16
 
No sql databases explained
No sql databases explainedNo sql databases explained
No sql databases explainedSalil Mehendale
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesEditor Jacotech
 
Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Ahmed Rashwan
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentationSalma Gouia
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAijfcstjournal
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAijfcstjournal
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep diveAhmed Shaaban
 
Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...ijdms
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGijiert bestjournal
 
Data Storage and Management project Report
Data Storage and Management project ReportData Storage and Management project Report
Data Storage and Management project ReportTushar Dalvi
 

Similar to Altoros using no sql databases for interactive_applications (20)

Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
NOSQL in big data is the not only structure langua.pdf
NOSQL in big data is the not only structure langua.pdfNOSQL in big data is the not only structure langua.pdf
NOSQL in big data is the not only structure langua.pdf
 
Dsm project-h base-cassandra
Dsm project-h base-cassandraDsm project-h base-cassandra
Dsm project-h base-cassandra
 
No sql database
No sql databaseNo sql database
No sql database
 
No sql databases explained
No sql databases explainedNo sql databases explained
No sql databases explained
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
 
MongoDB
MongoDBMongoDB
MongoDB
 
Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?
 
No sql
No sqlNo sql
No sql
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentation
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
 
Selecting best NoSQL
Selecting best NoSQL Selecting best NoSQL
Selecting best NoSQL
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep dive
 
NoSql Databases
NoSql DatabasesNoSql Databases
NoSql Databases
 
Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
Data Storage and Management project Report
Data Storage and Management project ReportData Storage and Management project Report
Data Storage and Management project Report
 

Recently uploaded

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Altoros using no sql databases for interactive_applications

  • 1. Using NoSQL Databases for Interactive Applications By Alexey Diomin and Kirill Grigorchuk
  • 2. 2 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Contents Introduction 3 Cassandra, MongoDB, and Couchbase 3 Key Considerations for Interactive Applications 3 Performance Benchmarking 5 Results 7 Analysis 10 Conclusion 10 About the authors 11 Additional Links 11
  • 3. 3 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Introduction Interactive web applications need high-performance and scalability, calling for a different kind of database. If your website is not fast enough, users may quickly abandon it and look for alternatives. For example, in paid online social games, players are extremely demanding and will drop out, even if there is a slight delay. To deliver the best user experience, you must pick the right database. Traditional RDBMS are the wrong tool for the job because they do not provide the necessary scalability and performance for working with large amounts of data and application requests. In contrast, NoSQL databases have become a viable alternative to RDBMS, particularly for applications that need to change rapidly. They provide high throughput, low latency, and horizontal scaling. But with so many different options around, choosing the right NoSQL database for your specific application needs can be tricky. Recently we took the time to review and benchmark several NoSQL databases. This whitepaper provides an overview of three popular NoSQL solutions: Cassandra, MongoDB, and Couchbase. In addition, it presents a vendor-independent performance comparison of these products and can be used as a guide when choosing a NoSQL database for an interactive application. Cassandra, MongoDB, and Couchbase Since we had to pick some NoSQL databases to start with, we looked around for commonly used open- source NoSQL solutions. Cassandra, Couchbase, and MongoDB seemed to be the most mature open source products in their class. If you are already familiar with these NoSQL databases, you might want to skip the rest of this section and go directly to the performance evaluation. Cassandra is a distributed columnar key-value database with eventual consistency. It is optimized for write operations and has no central master—data can be written or read to and from any of the nodes in the cluster. Cassandra provides seamless horizontal scaling and has no single point of failure—if a node in the cluster fails, another node steps up to replace it. At the moment, Cassandra is an Apache 2.0 licensed project supported by the Apache Community. MongoDB is a schema-free, document-oriented, NoSQL database. In MongoDB, data is stored in the BSON format—BSON document is essentially a JSON document represented in a binary format, which allows for easier and faster integration of data in certain types of applications. This database also provides horizontal scalability and has no single point of failure. However, a MongoDB cluster is different from a Cassandra or Couchbase Server cluster—it includes an arbiter, a master, and multiple slaves. As of 2009, MongoDB is an open source project with an AGPL license supported by 10Gen. Couchbase is a NoSQL document database. Documents in Couchbase Server are stored as JSON. With built-in caching, Couchbase provides low-latency read and write operations with linearly scalable throughput. The architecture has no single point of failure. It is easy to scale-out the cluster and support live cluster topology changes. This means, there is no application downtime when you are upgrading your database, software, or hardware using rolling upgrades. Couchbase, Inc. develops and provides commercial support for the Couchbase Apache 2.0 licensed project. Key Considerations for Interactive Applications our database is the workhorse for your Web application. When choosing a database, the following factors are important to keep in mind: 1 Scalability: It’s  hard  to   predict  when  your  application  needs  to  scale,  but  when  your   Web site traffic suddenly spikes and your database does not have enough capacity, you need to scale your database quickly, on demand, and without any application changes. Similarly, when your system is idle, you should have a possibility to decrease the amount of resources used. Scaling your
  • 4. 4 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems database must be a simple operation—you should not need to deal with complicated procedures or make any changes to your application. In this paper, we will only speak about horizontal scalability, which involves dividing a system into small structural components hosted on different physical machines (or groups of machines) and/or increasing the number of servers that perform the same function in parallel. a Cassandra meets the requirements of an ideal horizontally scalable system. Nodes can be added seamlessly as you need more capacity. The cluster automatically utilizes the new resources. A node can be decommissioned in automatic or semi-automatic mode. b Couchbase scales horizontally. All nodes are identical and easy to setup. Nodes can be added or removed from the cluster with a single button click and no changes to the application. Auto-sharding evenly distributes data across all nodes in the cluster without any hotspots. Cross datacenter replication makes it possible to scale a cluster across datacenters for better data locality and faster data access. c MongoDB—this database has a number of functions related to scalability. These include: automatic sharding (auto-partitioning of data across servers), reads and writes distributed over shards, and eventually-consistent reads that can be distributed over replicated servers. When the system is idle, cluster size can only be decreased manually. The administrator  uses  the  management  console  to  change  the  system’s configuration. After that, the server process of MongoDB can be safely stopped on the vacant machines. 2 Performance: Interactive applications require very low read and write latencies. The database must deliver consistently low latencies for read and write operations independent of load or the size of data being accessed. In general, the read and write latency of NoSQL databases is very low because data is shared across all the nodes in a cluster while the application’s working set is in memory. Interactive applications need to support millions of users and have different workloads—read, write, or mixed. In the next section, we share some performance test results on different NoSQL databases measuring latency versus varying levels of throughput. 3 Availability: Interactive Web applications need a database that is highly available. If your application is down, you simply are not making any money. To ensure high availability, your solution should be able to do online upgrades to the latest version, easily remove a node for maintenance without affecting the availability of the cluster, handle online operations, such as backups, and provide disaster recovery, if an entire datacenter goes down. Below are examples of how availability is achieved in different NoSQL databases: a Cassandra: Every node in a Cassandra  cluster,  or  “ring”,  is  given  a  range  of  data  for   which it is responsible. When Cassandra receives a write operation designated to be stored in a node that has failed, it will automatically route the write request to a node that is alive. The node that receives the write request saves the write operation with a hint. The hint is a message that contains information about the failed node that should have handled the write request. The node that holds the hint monitors the node ring for the recovery of the failed node that missed the write request. If the failed node comes back online, the node that holds the hint will handoff the hint message to the recovered node, so that the write requests can be persisted in their proper location. When a new node is added to the cluster, the workload is distributed to this new node as well. b Couchbase: Couchbase Server maintains multiple copies (up to 3 replicas) of each document in a cluster. Each server is identical and serves active and replica documents. Data is uniformly distributed across all the nodes and the clients are aware of the topology. If a node in the cluster fails, Couchbase Server detects the failure and promotes replica documents on other live nodes to active. The client cluster map is updated to reflect the new topology, so the application continues to work without
  • 5. 5 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems downtime. When capacity is added, data is rebalanced automatically, also without any downtime. c MongoDB: Data in MongoDB is spread across several shards. Typically, each shard (replica set) consists of multiple mongo-daemon instances, including an arbiter node, a master node, and multiple slaves. If a slave node fails, the master node automatically redistributes the workload to the rest of the slave nodes. In case the master node crashes, the arbiter node elects a new master. If the arbiter node fails and there are no instances left in the shard, the shard is dead. In MongoDB, a replica set can span across multiple datacenters but writes can only go to one primary instance in one data-center (master-slave replication). 4 Ease of development: Relational databases require a rigid schema to model an application. If your application changes, your database schema needs to change as well. In this regard, NoSQL databases have the following advantages: a Flexible schema: You do not have to modify the existing structural elements when new fields are added to a document. New documents can co-exist with existing documents without any additional changes. b Simple query language: Because data in a NoSQL document is stored in a de- normalized state, you can get and update a document with the help of put and get operations. Performance Benchmarking Our test infrastructure consisted of 4 extra-large instances on Amazon EC2 for the NoSQL databases and 1 instance for the client. Each instance had 4 virtual CPU cores with 2 Amazon compute units per core, 15GB of RAM, and 4 EBS 50GB volumes with RAID 0 striping. We used 64-bit Amazon Linux as the OS. Networking was all 10GigE. The client used the Yahoo! Cloud Serving Benchmark (YCSB), which was modified to suit our needs—we added a warm-up phase and adjusted working-set load generation that simulates different users accessing different data objects with meaningful data amounts and runtime. As shown in Figure 1, the YCSB client consists of two main parts—the workload generator and workload scenarios. The benchmark had 30 parallel client threads to drive the test, generating a mixed read-write workload with 5% of creates, 33% of updates, 2% of deletes, and 60% of reads. For all the tests, we used 1.5 KB documents (15 fields and 100 bytes each)—a typical document size across several NoSQL database use-cases. The total number of documents in the cluster was 30 million—15 million of active and 15 million of replica documents for each database.
  • 6. 6 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Figure 1: YCSB client—NoSQL database server architecture We ran each test five times for every NoSQL database and compared the average data access latency against different throughput levels. The NoSQL databases were setup using the following configuration: Cassandra 1.1.2 Cassandra JVM settings: 1 MAX_HEAP_SIZE, which is the total amount of memory dedicated to the Java heap—6GB 2 HEAP_NEWSIZE, which is the total amount of memory for a new generation of objects—400MB Cassandra settings: 1 RandomPartitioner that uses MD5 Hashing to evenly distribute rows across the cluster 2 Memtable of 4GB in size
  • 7. 7 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Couchbase 2.0 - Beta build 1723 1 1 replica setting 2 12 GB used as per node RAM quota using the Couchbase bucket type MongoDB 1 4 shards, each with 1 replica; each shard is a set of 2 nodes—primary and secondary 2 Journaling disabled 3 Each node was running 2 mongo daemon processes and 4 mongo router processes. Results Figure 2 shows the average latency at varying throughput levels for read, insert, and update operations measured from the client to the server and back against varying levels of throughput for each NoSQL database. The lower the latency values, the better.
  • 8. 8 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Figure 2: Average latency vs. throughput We also calculated the 95th percentile time taken for a request to execute a read, insert and update operations measured from the client to the server and back against varying levels of throughput for each NoSQL database.
  • 9. 9 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Figure 3: 95th Percentile latency vs. throughput
  • 10. 10 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems Typically, you want to see flat latency curves irrespective of the throughput to ensure a consistent user experience. Couchbase had faster read and write times than MongoDB and Cassandra. Analysis While not an exhaustive list, these are the most relevant pros and cons identified after reviewing these databases: MongoDB demonstrated the lowest throughput among all the databases compared in our test. We saw high latencies for write operations at average throughput because the coarser locking in MongoDB limits the write throughput of the server. Read requests were faster than in Cassandra but slower than in Couchbase. Increasing the size of the cluster in MongoDB was rather complicated. Many MongoDB operations need to be done manually through the command line and it is mandatory that you have a highly skilled system administrator. The advantages include support for in-built MapReduce and CAS transactions. Cassandra showed better results than MongoDB because it uses an eventually consistent architecture where in order to confirm a record you only need a reply from one node. In addition, unlike MongoDB, Cassandra is rather flexible when the cluster needs to be resized. Unfortunately, its extreme flexibility designed to sustain performance in highly distributed environments resulted in additional limitations. The database supports no transactions and cannot block separate records. Couchbase showed the lowest latencies and highest throughput among all the databases compared. The in-built object managed cache is responsible for the low latency. With fine grain locking at the document level, Couchbase Server was capable of providing high throughput for both reads and writes. The admin console in Couchbase has flexible settings for changing cluster size. Each document in the cluster has an active copy and multiple replicas. Access requests for a particular document are processed by the server holding the active document, which makes it possible to add extended transaction processing systems, locking, and CAS. This also eliminates the problem with eventual consistency, when read replicas have obsolete values. As a bonus for database administrators, Couchbase also comes with advanced tools for monitoring the status of the whole cluster and its separate nodes. Conclusion Choosing the right NoSQL database for your application is a very complicated process because every NoSQL solution is optimized for a particular type of load. This is why you should properly evaluate all available options before picking a suitable data store for your application.
  • 11. 11 Using NoSQL Databases for Interactive Applications ©  Altoros  Systems About the authors Kirill Grigorchuk is the head of R&D department at Altoros Systems Inc. Mr. Grigorchuk has 15+ years of experience in IT and profound skills in R&D process engineering, product and project management, Web development, and big data. At the moment, he leads and coordinates research into a wide range of cutting edge technologies, including distributed computing and NoSQL solutions. Alexey Diomin is a senior Java developer at Altoros Systems Inc. with vast experience in distributed computing, NoSQL databases, and Linux. Having excellent skills in building, administering, and supporting large-scale distributed computing systems, Mr. Diomin did an extensive research into the field of big data. Additional Links Cassandra website — http://cassandra.apache.org Couchbase server website — http://www.couchbase.com Mongodb website — http://www.mongodb.org YCSB Github — https://github.com/Altoros/YCSB