NoSQL

NOSQL, NO?
Introductory presentation

RELATIONAL

 SQL  ACID

 Relational algebra  Optimal for ad-hoc queries

 Tables, Columns, Rows  Sharding can be difficult

 Metadata separate from data

 Normalized data

 Optimized storage

POPULAR RDBMS

 MySQL  Informix

 SQL Server  Progress

 Oracle  Pervasive

 Postgres  Sybase

 DB2  Access

 Interbase, Firebird …

SQL

 Unified language to create and query both data and metadata

 Similar to English

 Verbose(!)

 Can get complex for non-trivial queries

 Does not expose execution plan – you say what you want it to
return, not how

SQL EXAMPLES
 If you can say what you mean, you can query the existing data
 Results are near-instant when querying based on primary key
select * from valute where id=1 and sid=42

 Results are fast when querying based on non-unique index
select valuta from valute where ((id=1 and sid=42)) and (valute.firma_id=123 and
valute.firma__sid=1)

 Very readable for trivial queries
select r.customer,sum(rs.iznos) sveukupno from racuni r
join racuni_stavke rs on r.id=rs.racun_id
where r.id=5
group by r.customer

SQL EXAMPLES

 Not so readable for non-trivial queries
select "MP" tip_prometa, mprac.broj broj_racuna, mprac_stavke.kolicina kolicina, (mprac.tecaj*mprac_stavke.kolicina*mprac_stavke.rabat_iznos)
rabat_iznos, (round(mprac_stavke.cijena - mprac_stavke.rabat_iznos - mprac_stavke.rabat2_iznos - mprac_stavke.rabat3_iznos - mprac_stavke.porez1 -
mprac_stavke.porez2 - mprac_stavke.porez_potrosnja,6)*mprac_stavke.kolicina) iznos, (mprac_stavke.kolicina* ifnull((select
sum(pn_cijena*kolicina)/sum(kolicina) from mprac_skl left join skl_stavke on mprac_skl.skl_id=skl_stavke.skl_id and
mprac_skl.skl__sid=skl_stavke.skl__sid where mprac_skl.mprac_id=mprac.id and mprac_skl.mprac__sid=mprac.sid and
skl_stavke.artikl_id=mprac_stavke.artikl_id and skl_stavke.artikl__sid=mprac_stavke.artikl__sid ),0) ) iznos_nabavno, ifnull( (select
sum(mprac_stavke.kolicina*ambalaze.naknada_kom) from artikli_ambalaze left join ambalaze on ambalaze.id=artikli_ambalaze.ambalaza_id and
ambalaze.sid=artikli_ambalaze.ambalaza__sid where artikli_ambalaze.artikl_id=artikli.id and artikli_ambalaze.artikl__sid=artikli.sid and
ambalaze.kalkulacija="N" ),0) naknada, radnici_komercijalisti.ime racun_komercijalist_ime, (select naziv from skladista where skladista.tip_skladista="M"
and pj_id=mprac.pj_id limit 1) skladiste_naziv , pj.naziv pj_naziv, mprac.datum,
cast(concat("(",if(DayOfWeek(mprac.datum)=1,7,DayOfWeek(mprac.datum)-1),") ", if(DayOfWeek(mprac.datum)=1,"1 Nedjelja",
if(DayOfWeek(mprac.datum)=2,"2 Ponedjeljak", if(DayOfWeek(mprac.datum)=3,"3 Utorak", if(DayOfWeek(mprac.datum)=4,"4 Srijeda",
if(DayOfWeek(mprac.datum)=5,"5 Četvrtak", if(DayOfWeek(mprac.datum)=6,"6 Petak", if(DayOfWeek(mprac.datum)=7,"7 Subota","")))))))) as char(15))
dan_u_tjednu, cast(month(mprac.datum) as unsigned) mjesec, cast(week(mprac.datum) as unsigned) tjedan, cast(quarter(mprac.datum) as unsigned) kvartal,
cast(year(mprac.datum) as unsigned) godina, cast(if(tipovi_komitenata.tip="F",trim(concat(partneri.ime," ",partneri.prezime)),partneri.naziv) as char(200))
kupac_naziv, partneri_mjesta.postanski_broj kupac_mjesto, partneri_mjesta.mjesto kupac_mjesto_naziv, partneri_grupe_mjesta.naziv …

RDBMS SCALING

 Vertical scaling
• Better CPU, more CPUs
• More RAM
• More disks
• SAN

 Partitioning

 Sharding

PARTITIONING

 With many rows and heavy usage, partitioning is a must

 What to partition
• Tables
• Indexes
• Views

 Typical cases
• Monthly data
• Alphabetical keys
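
The two typical cases above can be illustrated with a small Python sketch that picks a partition name from a row's date or key. This is only an application-level illustration: real RDBMS products declare partitions in their own DDL, and the naming scheme, the A-M / N-Z split, and both function names are assumptions made for the example.

```python
from datetime import date


def monthly_partition(table: str, day: date) -> str:
    """Pick a per-month partition name, e.g. orders_2011_03 (hypothetical naming)."""
    return f"{table}_{day:%Y_%m}"


def alphabetical_partition(table: str, key: str) -> str:
    """Pick a partition by the first letter of the key (assumed A-M / N-Z split)."""
    first = key[:1].upper()
    if "A" <= first <= "M":
        return f"{table}_a_m"
    if "N" <= first <= "Z":
        return f"{table}_n_z"
    # Digits, empty keys, and non-ASCII letters go to a catch-all partition.
    return f"{table}_other"


print(monthly_partition("orders", date(2011, 3, 15)))  # prints orders_2011_03
print(alphabetical_partition("customers", "Novak"))    # prints customers_n_z
```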

RDBMS SHARDING

 Sharding means using several databases where each represents part
of data (500 clients on one server, another 500 on another)

 Requires changing application code
connect(calculate_server_from(sharding_key))

 Impossible to join data from different databases, so choose your
sharding key wisely

 Very difficult to repartition your databases based on a new key
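
A minimal Python sketch of the routing idea behind `connect(calculate_server_from(sharding_key))`, assuming simple hash-modulo placement. The host names and the `moved_keys` helper are hypothetical and are here only to show why changing the shard layout is painful.

```python
import hashlib

# Hypothetical shard list; the host names are placeholders, not from the slides.
SHARD_SERVERS = ["db0.example.com", "db1.example.com"]


def calculate_server_from(sharding_key: str, servers=SHARD_SERVERS) -> str:
    """Map a sharding key to one server using a stable hash."""
    # hashlib returns the same digest in every process, unlike the built-in
    # hash(), which is randomized per run for strings.
    digest = hashlib.md5(sharding_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def moved_keys(keys, old_servers, new_servers):
    """Return the keys whose server changes when the server list changes."""
    return [
        key
        for key in keys
        if calculate_server_from(key, old_servers)
        != calculate_server_from(key, new_servers)
    ]


keys = [f"client-{i}" for i in range(1000)]
moved = moved_keys(keys, SHARD_SERVERS, SHARD_SERVERS + ["db2.example.com"])
print(f"{len(moved)} of {len(keys)} keys change server after adding a shard")
```

With plain modulo placement, going from two to three servers relocates roughly two thirds of the keys for a uniform hash, which is one reason repartitioning is so costly.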

RDBMS METADATA

 Metadata: data describing other data

 RDBMS structures are explicitly defined, and each data type is
optimized for storage

 Lots of constraints

 Can get slow with a lot of data

NOSQL

 “Not SQL”, “Not only SQL”

 Core NoSQL databases were invented mostly because RDBMS made
life very hard for huge, heavy-traffic web databases

 NoSQL databases are the ones significantly different from
relational databases

NOSQL TYPES

 Wide Column Store / Column Families
 Document Store
 Key Value / Tuple Store
 Graph Databases
 Object Databases
 XML Databases
 Multivalue Databases

4 MAIN DATA MODELS

 Key-Value Stores

 BigTable Clones (aka "ColumnFamily")

 Document Databases

 Graph Databases
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html

KEY/VALUE STORES

 Lineage: Amazon's Dynamo paper and Distributed HashTables.

 Data model: A global collection of key-value pairs.

 Example: Voldemort, Dynomite, Tokyo Cabinet

BIGTABLE CLONES

 Lineage: Google's BigTable paper.

 Data model: Column family, i.e. a tabular model where each row at
least in theory can have an individual configuration of columns.

 Example: HBase, Hypertable, Cassandra

DOCUMENT DATABASES

 Lineage: Inspired by Lotus Notes.

 Data model: Collections of documents, which contain key-value
collections (called "documents").

 Example: CouchDB, MongoDB, Riak

GRAPH DATABASES

 Lineage: Draws from Euler and graph theory.

 Data model: Nodes & relationships, both of which can hold key-value
pairs

 Example: AllegroGraph, InfoGrid, Neo4j

POPULAR NOSQL

 Hadoop / HBase  MemcacheDB

 Cassandra  Voldemort

 Amazon SimpleDB  Hypertable

 MongoDB  Cloudata

 CouchDB  IBM Lotus/Domino

 Redis

NOSQL CHARACTERISTICS

 Almost infinite horizontal scaling
 Very fast
 Performance doesn’t deteriorate with growth (much)
 No fixed table schemas
 No join operations
 Ad-hoc queries difficult or impossible
 Structured storage
 Almost everything happens in RAM

REAL-WORLD USE
 Cassandra
• Facebook (original developer, used it till late 2010)
• Twitter
• Digg
• Reddit
• Rackspace
• Cisco
 BigTable
• Google (HBase is an open-source implementation modeled on it)
 MongoDB
• Foursquare
• Craigslist
• Bit.ly
• SourceForge
• GitHub

WHY NOSQL?

 Handles huge databases (I know, I said it before)

 Redundancy, data is pretty safe on commodity hardware

 Super flexible queries using map/reduce

 Rapid development (no fixed schema, yeah!)

 Very fast for common use cases

PERFORMANCE

 RDBMS uses buffer to ensure ACID properties

 NoSQL does not guarantee ACID and is therefore much faster

 We don’t need ACID everywhere!

 I used MySQL and switched to MongoDB for my analytics app
• Data processing (every minute) is 4x faster with MongoDB, despite
being a lot more detailed (due to much simpler development)

SCALING

 Simple web application with not much traffic
• Application server, database server all on one machine

SCALING

 More traffic comes in
• Application server
• Database server

SCALING

 Even more traffic comes in
• Load balancer
• Application server x2
• Database server

SCALING

 Even more traffic comes in
• Load balancer x N (easy)
• Application server x N (easy)
• Database server x N (hard for SQL databases)

SQL SLOWDOWN

 Not linear!
 http://www.slideshare.net/rightscale/scaling-sql-and-nosql-databases-in-the-cloud

NOSQL SCALING

 Need more storage?
• Add more servers!

 Need higher performance?

 Need better reliability?

SCALING SUMMARY

 You can scale SQL databases (Oracle, MySQL, SQL Server…)
• This will cost you dearly
• If you don’t have a lot of money, you will reach limits quickly

 You can scale NoSQL databases
• Very easy horizontal scaling
• Lots of open-source solutions
• Scaling is one of the basic incentives for design, so it is well handled
• Scaling drives the design trade-offs, which is why you end up
having to use map/reduce

RAM

 Why map/reduce? I just need some simple queries. Tomorrow I
will need some other queries….

 SQL databases are optimized for very efficient disk access, but for
significant scaling need RAM caching (MySQL+memcached)

 NoSQL databases are designed to keep the whole working set in RAM

WORKING SET

 In real-world use, the working set is much smaller than the complete database
• For analytics 99% of queries will be regarding last 30 days

 As you need RAM only for working set, you can use commodity
servers, VPS, and just add more as your app becomes more popular

WORKING SET WOES

 Foursquare has millions of users, and its working set is the same size as the database
 They used a single 66GB Amazon EC2 High-Memory Quadruple Extra Large
Instance (with cheese) for millions of users
 When their RAM usage was 65GB, they decided to shard
 Too late, they started to have disk swaps
 Disk is much slower than RAM - 100x slowdown
 Server could not keep up due to swapping
 11-hour outage (ouch!)

MAP/REDUCE

 Google’s framework for processing highly distributable
problems across huge datasets using a large number of
computers

 Let’s define a large number of computers
• Cluster if all of them have same hardware
• Grid unless Cluster (if !Cluster for old-style programmers)

MAP/REDUCE

 Process split into two phases
• Map
  • Take the input, partition it, and delegate the parts to other machines
  • Other machines can repeat the process, leading to a tree structure
  • Each machine returns its results to the machine that gave it the task
• Reduce
  • Collect the results from the machines you gave tasks to
  • Combine the results and return them to the requester
• Slower than sequential data processing, but massively parallel
• Can sort a petabyte of data in a few hours
• Stages: Input, Map, Shuffle, Reduce, Output

MAP/REDUCE EXAMPLE

 You need to write two functions

 Count different words in a set of documents
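
A minimal single-process sketch of the word-count example in Python. The two functions you would write are `map_words` and `reduce_counts`. The `shuffle` step stands in for what the framework does between the phases, and all names here are illustrative rather than part of any particular framework's API.

```python
from collections import defaultdict


def map_words(document: str):
    """Map phase: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group the emitted values by key (done by the framework)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word: str, counts):
    """Reduce phase: combine all values emitted for one word."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_words(doc))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())


print(word_count(["to be or not to be", "to do"]))
# prints {'to': 3, 'be': 2, 'or': 1, 'not': 1, 'do': 1}
```

In a real cluster the map calls run on many machines in parallel and the shuffle moves data over the network; the logic of the two user-written functions stays the same.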

MONGODB

 Document store

 Basic support for dynamic (ad hoc) queries

 Query by example (nice!)

MONGODB

 Conditional Operators
• <, <=, >, >=
• $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $and, $size, $type

 Regular expressions
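
To make query by example concrete, here is a small pure-Python matcher over plain dictionaries. It is not MongoDB's implementation and does not talk to a server; it only mimics a subset of the operators (`$gt`, `$gte`, `$lt`, `$lte`, `$ne`, `$in`, `$nin`, `$exists`) plus regular expressions. The `matches` and `find` helpers and the sample data are assumptions made for the illustration, and every dictionary condition is treated as an operator expression, which is a simplification.

```python
import operator
import re

COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def _check(op, value, arg, present):
    """Evaluate one operator against one field value."""
    if op == "$exists":
        return present == bool(arg)
    if op in COMPARISONS:
        # A missing field never satisfies an ordering comparison.
        return present and COMPARISONS[op](value, arg)
    if op == "$ne":
        return value != arg
    if op == "$in":
        return value in arg
    if op == "$nin":
        return value not in arg
    raise ValueError(f"unsupported operator: {op}")


def matches(document: dict, query: dict) -> bool:
    """Return True when the document satisfies every condition in the query."""
    for field, condition in query.items():
        present = field in document
        value = document.get(field)
        if isinstance(condition, dict):
            if not all(_check(op, value, arg, present) for op, arg in condition.items()):
                return False
        elif isinstance(condition, re.Pattern):
            if not (isinstance(value, str) and condition.search(value)):
                return False
        elif not present or value != condition:
            return False
    return True


def find(collection, query):
    """Return all documents in the collection that match the example query."""
    return [doc for doc in collection if matches(doc, query)]


people = [
    {"name": "Ana", "age": 31, "city": "Zagreb"},
    {"name": "Ivan", "age": 17, "city": "Split"},
    {"name": "Marko", "age": 45},
]
print(find(people, {"city": "Split"}))                # the Ivan document
print(find(people, {"age": {"$gt": 18, "$lt": 40}}))  # the Ana document
```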

MONGODB
 Data is stored as BSON (binary JSON)
• Makes it very well suited for languages with native JSON support
 Map/Reduce written in JavaScript
• Slow! There is one single thread of execution in JavaScript
 Master/slave replication (auto failover with replica sets)
 Sharding built-in
 Uses memory mapped files for data storage
 Performance over features
 On 32-bit systems, limited to ~2.5 GB
 An empty database takes up 192 MB
 GridFS to store big data + metadata (not actually an FS)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

CASSANDRA
 Written in: Java
 Protocol: Custom, binary (Thrift)
 Tunable trade-offs for distribution and replication (N, R, W)
 Querying by column, range of keys
 BigTable-like features: columns, column families
 Writes are much faster than reads (!)
• Constant write time regardless of database size
 Map/reduce possible with Apache Hadoop
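
The slide does not define N, R and W. Assuming the usual Dynamo-style meaning (N replicas per key, W replicas that must acknowledge a write, R replicas consulted on a read), the trade-off can be sketched in Python as a single overlap check. The function name is hypothetical.

```python
def read_sees_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set overlaps every write set, i.e. R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


# Fast but possibly stale: one replica for reads, one for writes.
print(read_sees_latest_write(n=3, r=1, w=1))  # prints False
# Quorum reads and writes: any two replicas overlap with any other two.
print(read_sees_latest_write(n=3, r=2, w=2))  # prints True
# Cheap reads, expensive writes: every replica must acknowledge a write.
print(read_sees_latest_write(n=3, r=1, w=3))  # prints True
```

Lower R and W give faster operations at the cost of possibly reading stale data; higher values trade latency and availability for consistency.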

HBASE
 Written in: Java
 Main point: Billions of rows X millions of columns
 Modeled after BigTable
 Map/reduce with Hadoop
 Query predicate push down via server side scan and get filters
 Optimizations for real time queries
 A high performance Thrift gateway
 HTTP supports XML, Protobuf, and binary
 Cascading, hive, and pig source and sink modules
 No single point of failure
 While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is a column-oriented key/value store and
allows for low-latency reads and writes.
 Random access performance is like MySQL

REDIS
 Written in: C
 Main point: Blazing fast
 Disk-backed in-memory database
 Master-slave replication
 Simple values or hash tables by keys
 Has sets (also union/diff/inter)
 Has lists (also a queue; blocking pop)
 Has hashes (objects of multiple fields)
 Sorted sets (high score table, good for range queries)
 Has transactions (!)
 Values can be set to expire (as in a cache)
 Pub/Sub lets one implement messaging (!)


COUCHDB
 Written in: Erlang
 Main point: DB consistency, ease of use
 Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication. (!)
 MVCC - write operations do not block reads
 Previous versions of documents are available
 Crash-only (reliable) design
 Needs compacting from time to time
 Views: embedded map/reduce
 Formatting views: lists & shows
 Server-side document validation possible
 Authentication possible
 Real-time updates via _changes (!)
 Attachment handling
 CouchApps (standalone JS apps)

HADOOP

 Apache project

 A framework that allows for the distributed processing of large
data sets across clusters of computers

 Designed to scale up from single servers to thousands of machines

 Designed to detect and handle failures at the application layer,
instead of relying on hardware for it

HADOOP
 Created by Doug Cutting, who named it after his son's toy elephant
 Hadoop subprojects and related Apache projects
• Cassandra
• HBase
• Pig
 Hive was a Hadoop subproject, but is now a top-level Apache project
 Used by many large & famous organizations
• http://wiki.apache.org/hadoop/PoweredBy
 Scales to hundreds or thousands of computers, each with several processor cores
 Designed to efficiently distribute large amounts of work across a set of machines
 Hundreds of gigabytes of data constitute the low end of Hadoop-scale
 Built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes

HADOOP

 See http://www.slideshare.net/hadoop/practical-problem-solving-with-apache-hadoop-pig

 Uses Java, but allows streaming so other languages can easily send
and accept data items to/from Hadoop

HADOOP

 Uses distributed file system (HDFS)
• Designed to hold very large amounts of data (terabytes or even
petabytes)
• Files are stored in a redundant fashion across multiple machines to
ensure their durability to failure and high availability to very parallel
applications
• Data organized into directories and files
• Files are divided into blocks (64 MB by default) and distributed across
nodes
 Design of HDFS is based on the design of the Google File System
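
A short Python sketch of the block arithmetic, using the 64 MB default from the slide. The replication factor is passed in as a parameter because the slide does not state one; the value 3 used in the example is an assumption, as are the function names.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB default block size, as stated on the slide


def block_count(file_size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed for a file; the last block may be partly filled."""
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    # Integer ceiling division avoids floating-point rounding on huge files.
    return (file_size_bytes + block_size - 1) // block_size


def raw_storage_bytes(file_size_bytes: int, replication: int) -> int:
    """Total bytes stored across the cluster when every block is replicated."""
    return file_size_bytes * replication


one_gib = 1024 ** 3
print(block_count(one_gib))                       # prints 16
print(raw_storage_bytes(one_gib, replication=3))  # prints 3221225472
```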

HIVE

 A petabyte-scale data warehouse system for Hadoop

 Easy data summarization, ad-hoc queries

 Query the data using a SQL-like language called HiveQL

 Hive compiler generates map-reduce jobs for most queries

PIG

 Platform for analyzing large data sets

 High-level language for expressing data analysis programs

 Compiler produces sequences of Map-Reduce programs

 Textual language called Pig Latin
• Ease of programming
• System optimizes task execution automatically
• Users can create their own functions

PIG LATIN

 Pig Latin – high level Map/Reduce programming

 Equivalent to SQL for RDBMS systems.

 Pig Latin can be extended using Java User Defined Functions

 “Word Count” script in Pig Latin

SUMMARY

 NoSQL is a great problem solver if you need it

 Choose your NoSQL platform carefully as each is designed for
specific purpose

 Get used to Map/Reduce

 It’s not a sin to use NoSQL alongside a (yes)SQL database

 I am really happy to work with MongoDB instead of MySQL
