2. REQUISITE SLIDE – WHO AM I?
Brian Enochson
- Home is the Jersey Shore
- SW Engineer who has worked as designer / developer on NoSQL (Mongo, Cassandra)
- Consultant – HBO, ACS, CIBER
- Specialize in SW Development, architecture and training
Brian Enochson
brian.enochson@gmail.com
Available for training, consulting, architecture & development.
NOSQL INTRO & CASSANDRA
3. REQUISITE SLIDE # 2 – WHAT ARE WE TALKING ABOUT?
• NoSQL Introduction
  • What brought us here
  • Types of NoSQL Products
  • What about Hadoop?
  • What about Real-Time?
  • Quick look at MongoDB
• Cassandra Intro & Architecture
  • Why Cassandra
  • Architecture
  • Internals
  • Development
• Data Modeling Concepts
  • Old vs. New Way
  • Basics
  • Composite Types
  • Collections
  • Time Series Data
  • Counters
4. HISTORY OF THE DATABASE
• 1960s – Hierarchical and network databases (IMS and CODASYL)
• 1970s – Beginnings of the theory behind the relational model (Codd)
• 1980s – Rise of the relational model. SQL. E/R model (Chen)
• 1990s – Access/Excel and MySQL. Object databases (ODBMS) began to appear
• 2000s – Two forces: large enterprise and open source. Google and Amazon. CAP Theorem (more on that to come…)
• 2010s – Emergence of NoSQL as an industry player and viable alternative
5. WHY WERE ALTERNATIVES NEEDED
• Developers today are faced with Internet scale
  • Hundreds of thousands of users
  • Low cost of storage
  • Increased processing power
  • The need (and ability) to capture millions of events. Caching solves it to an extent but brings other complexities
  • Real-time
  • Need to scale out, not up (add any number of low-cost machines rather than one more powerful machine)
• Cost
  • Let’s not forget that enterprise DBs can become expensive at Internet scale
  • Open source DBs may solve license cost, but don’t ignore operational costs
6. A LOT OF DATA
Some facts from http://www.storagenewsletter.com/rubriques/marketreportsresearch/ibm-cmo-study/
Approximately 90 percent of all the real-time information being created today is unstructured data
Every day we create 2.5 quintillion (2.5 × 10^18) bytes of data (that is 18 zeroes!!)
90 percent of the world's data today has been created in the last two years alone
7. RELATIONAL VS. NOSQL
• Relational
  • Divide data into tables, relate them through foreign keys, enforce DB constraints, normalize the data. The interface is SQL.
• NoSQL
  • Store in a schemaless format; redundancy is encouraged; application access (your queries) determines the storage format. The interface varies and is optimized for the implementation; no forced DB constraints. The tradeoff is that you often get eventual consistency.
8. TRADEOFFS?
Luckily, due to the large number of compromises made when
attempting to scale their existing relational databases,
these tradeoffs were not so foreign or
distasteful as they might have been.
Greg Burd - https://www.usenix.org/legacy/publications/login/201110/openpdfs/Burd.pdf
9. 3 V’S – DESCRIBING THE BIG DATA PROBLEM
The driving force behind the need for new technology is often described as the “3 V Model”.
• High Volume – amount of data
• High Variety – range of data types and sources
• High Velocity – speed of data in and out
OK, maybe 4 V’s
• Veracity – is the data trustworthy and applicable to the problem being analyzed?
10. NOSQL IS NOT BIG DATA
NoSQL != Big Data
NoSQL products were created to help solve the big data problem.
Big data is a much larger problem than just storage. Analysis tools like Hadoop, messaging systems like Kafka, real-time processing engines like Storm and machine learning libraries like Mahout all help solve the big data problem.
11. NOSQL TYPES
Wide Column – Column Family
• Cassandra, HBase, Amazon SimpleDB
Key Value
• Riak, Redis, DynamoDB, Voldemort, MemcacheDB
Document DB
• MongoDB, CouchDB
Graph
• Neo4j, OrientDB
Search (also alternatives, normally used together with one of the above)
• Lucene, Solr, ElasticSearch
Many, many more! (http://nosql-database.org/)
12. CHOOSING THE RIGHT ONE…
Choosing the right NoSQL type and eventual product depends on…
Type of Data
• One key and a lot of data?
• High volume of data?
• Storing media, blobs?
• Document oriented?
• Tracking relationships?
• Combination?
• Multi-datacenter?
Type of Access
Volumes of Data (there is big data and there is BIG DATA)
Need for Support/Services/Training
13. VISUAL GUIDE – USING THE CAP THEOREM
http://blog.nahurst.com/visual-guide-to-nosql-systems
14. QUICK LOOK AT MONGO
Just so we can compare to Cassandra
• Document oriented
• Storage format is JSON (actually BSON)
• Replication built in
• Master / slave architecture
• Strong querying support
17. CASSANDRA HISTORY
• Developed at Facebook, based on Google Bigtable and Amazon Dynamo **
• Open sourced in mid 2008
• Apache project since March 2009
• Commercial support through DataStax (originally known as Riptano, founded 2010)
• Used at Netflix, eBay and many more. The largest reported installation holds 300 TB on 400 machines
• Current version is 2.0.1
18. WHY EVEN CONSIDER C*
• Large data sets
• Require high availability
• Multi data center
• Require large scaling
• Write-heavy applications
• Can design for queries
• Understand tunable consistency and implications (more to come)
• Willing to make the effort upfront for the reward
20. ACID
YOU HAVE PROBABLY ALL HEARD OF ACID
• Atomic – all or nothing
• Consistency – what is written is valid
• Isolation – concurrent transactions do not interfere; the result is as if they ran one at a time
• Durability – once committed to the DB, it stays
This is the world we have lived in for a long time…
21. CAP THEOREM (BREWER’S)
Many may have heard this one
CAP stands for Consistency, Availability and Partition Tolerance
• Consistency – every node sees the same data, so a read returns the most recent write. (Not quite the C in ACID, which is about writing only valid data.)
• Availability – the service is available; every request gets a response.
• Partition Tolerance – no failure other than complete network failure causes the system not to respond
(Remember the visual guide to selecting a NoSQL database.)
So… what does this mean?
** http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
22. YOU CAN ONLY HAVE 2 OF THEM
Or, better said in C* terms, you can have Availability and Partition Tolerance, AND Eventual Consistency.
Eventual consistency means that, once updates stop, all accesses will eventually return the last updated value.
23. BASE
But maybe you have not heard this one…
Somewhat contrived, but it gives the CAP Theorem an acronym to use against ACID… Also created by Eric Brewer.
Basically Available – the system guarantees availability, as much as possible.
Soft State – state may change even without input. Required because of eventual consistency.
Eventually Consistent – it will become consistent over time.
** Also, as engineers we cannot believe in anything that isn’t an acronym!
24. C* - WHAT PROBLEM IS BEING SOLVED?
• Database for modern application requirements.
  • Web Scale – massive amounts of data
  • Scale Out – commodity hardware
  • Flexible Schema (we will see how this concept is evolving)
  • Online admin (add to cluster, load balancing). Simpler operations
  • CAP Theorem aware
• Built based on
  • Amazon Dynamo – took partitioning and replication from here **
  • Google Bigtable – took the log-structured column family from here ***
** http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
*** http://research.google.com/archive/bigtable.html
25. C* BASICS
• No single point of failure – highly available.
  • Peer to peer – no master
• Data center aware – distributed architecture
• Linear scaling – just add hardware
• Eventual consistency, with a tunable tradeoff between latency and consistency
• Architecture is optimized for writes.
• A row can have 2 billion columns!
• Data modeling for reads. Design starts with looking at your queries.
• With CQL it became more SQL-like, but no joins, no subqueries and limited ordering (but very useful)
• Column names can be part of the data, e.g. time series
Don’t be afraid of denormalized and redundant data for read performance. In fact, embrace it! Remember, writes are fast.
26. NOTE ABOUT EVENTUAL CONSISTENCY
** Important Term **
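Quorum: Q = floor(N / 2) + 1 (integer division, so N = 3 gives Q = 2).
We get consistency in a BASE world by satisfying W + R > N, because then every read set overlaps every write set in at least one replica.
3 obvious ways:
1. W = 1, R = N
2. W = N, R = 1
3. W = Q, R = Q
(N is the replication factor, R = read replica count, W = write replica count)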
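The quorum rule above can be checked with a few lines of Python. This is only a sketch of the arithmetic. The function names `quorum` and `is_strongly_consistent` are made up for this example and are not part of any Cassandra API.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas: floor(n / 2) + 1."""
    return n // 2 + 1


def is_strongly_consistent(w: int, r: int, n: int) -> bool:
    """True when every read set must overlap every write set (W + R > N)."""
    return w + r > n


# Try the three "obvious" combinations for N = 3, plus one that fails.
n = 3
q = quorum(n)
for w, r in [(1, n), (n, 1), (q, q), (1, 1)]:
    print(f"N={n} W={w} R={r} overlap guaranteed: {is_strongly_consistent(w, r, n)}")
```

The last combination (W = 1, R = 1) is the fast but only eventually consistent setting.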
27. THE C* DATA MODEL
The C* data model is made of these:
Column – a name, a value and a timestamp. Applications can use the name as the data and leave the value empty. (Like a column in an RDBMS.)
Row – a collection of columns identified by a unique key. The key is called a partition key. (Like a row in an RDBMS.)
Column Family – container for an ordered collection of rows. Each row is an ordered collection of columns. Each column has a name and maybe a value. (Like a table in an RDBMS.) This is also known as a table now in C* terms.
Keyspace – administrative container for CFs. It is a namespace. Also has a replication strategy – more later. (Like a DB or schema in an RDBMS.)
Super Column Family – say what?
28. SUPER COLUMN FAMILY
Not recommended, but they exist. Rarely discussed.
It is a key that maps to one or more nested keys, each of which contains a collection of columns.
Can think of it as a hash table of hash tables that contain columns.
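To make the "hash table of hash tables" picture concrete, here is a rough Python sketch that models a column family and a super column family with plain dictionaries. It only illustrates the nesting. The helper names (`put`, `get_row`) and the sample data are invented for this example, and real C* storage is far more involved.

```python
import time
from typing import Any, Dict, List, Optional, Tuple

Column = Tuple[Any, float]        # (value, write timestamp)
Row = Dict[str, Column]           # column name -> column
ColumnFamily = Dict[str, Row]     # partition (row) key -> row


def put(cf: ColumnFamily, row_key: str, name: str, value: Any,
        ts: Optional[float] = None) -> None:
    """Write one column; every column carries a timestamp."""
    cf.setdefault(row_key, {})[name] = (value, time.time() if ts is None else ts)


def get_row(cf: ColumnFamily, row_key: str) -> List[Tuple[str, Any]]:
    """Return (name, value) pairs sorted by column name, mimicking ordered columns."""
    return [(name, col[0]) for name, col in sorted(cf.get(row_key, {}).items())]


# A column family: row key -> columns (hypothetical sample data).
users: ColumnFamily = {}
put(users, "brian", "email", "brian@example.com")
put(users, "brian", "city", "Jersey Shore")
print(get_row(users, "brian"))

# A super column family adds one more level: row key -> super column -> columns.
super_users: Dict[str, ColumnFamily] = {"brian": {}}
put(super_users["brian"], "address", "city", "Jersey Shore")
put(super_users["brian"], "contact", "email", "brian@example.com")
print(sorted(super_users["brian"].keys()))
```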
30. OR CAN ALSO BE VIEWED AS…
http://www.slideshare.net/gdusbabek/data-modeling-withcassandra-column-families
31. TOKENS
Tokens – partitioner-dependent position on the ring.
Each node has a single unique token assigned.
Each node claims the range of tokens from the token of the previous node on the ring (exclusive) up to its own token (inclusive).
Use this formula
Initial_Token = Zero_Indexed_Node_Number * ((2^127) / Number_Of_Nodes)
In cassandra.yaml
initial_token: 42535295865117307932921825928971026432
** http://blog.milford.io/cassandra-token-calculator/
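The formula is easy to script. The sketch below applies it as written, for the 0 to 2^127 token range used by RandomPartitioner (Murmur3Partitioner uses a different range, -2^63 to 2^63 - 1, so these numbers do not carry over to it). The function name `initial_tokens` is just a label for this example.

```python
from typing import List


def initial_tokens(node_count: int, ring_size: int = 2 ** 127) -> List[int]:
    """Evenly spaced tokens: node_number * (ring_size / node_count)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = ring_size // node_count
    return [node_number * step for node_number in range(node_count)]


# Tokens for a 4 node cluster; the second one is the value shown above.
for node_number, token in enumerate(initial_tokens(4)):
    print(f"node {node_number}: initial_token: {token}")
```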
32. C* PARTITIONER
RandomPartitioner – the token is the MD5 hash of the key (a 128-bit number), which gives you even distribution in the cluster. Default for versions <= 1.1
OrderPreservingPartitioner – tokens are UTF-8 strings. Uneven distribution.
Murmur3Partitioner – functionally the same as RandomPartitioner, but 3 – 5 times faster. Uses the Murmur3 algorithm. Default for versions >= 1.2
Set in cassandra.yaml
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
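For a feel of what "MD5 hash of key is token" means, here is a small Python approximation. It assumes the token is the absolute value of the MD5 digest read as a signed big-endian integer, which keeps it within 0 to 2^127. Treat that detail as an assumption of this sketch, not as a statement about the exact C* source.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Approximate a RandomPartitioner token for a row key (assumption: abs of signed MD5)."""
    digest = hashlib.md5(key).digest()  # 16 bytes = 128 bits
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# Similar keys land far apart on the ring, which is what spreads load evenly.
for key in (b"user:1", b"user:2", b"user:3"):
    print(key.decode(), random_partitioner_token(key))
```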
33. REPLICATION
• Replication is how many copies of each piece of data should be stored. In C* terms it is the Replication Factor or “RF”.
• In C*, RF is set at the keyspace level:
CREATE KEYSPACE drg_compare WITH replication =
{'class':'SimpleStrategy', 'replication_factor':3};
• How the data is replicated is called the Replication Strategy
  • SimpleStrategy – places replicas on nodes “next” to each other on the ring. Assumes a single DC
  • NetworkTopologyStrategy – for configuring replication per data center. Rack and DC aware.
update keyspace UserProfile with strategy_options=[{DC1:3, DC2:3}];
34. SNITCH
• A snitch maps IPs to racks and data centers.
• Several kinds, configured in cassandra.yaml. Must be the same across the cluster.
SimpleSnitch – does not recognize data center or rack information. Use it for single-data center deployments (or single-zone in public clouds).
PropertyFileSnitch – uses a user-defined description of the network details located in the property file cassandra-topology.properties. Use this snitch when your node IPs are not uniform or if you have complex replication grouping requirements.
RackInferringSnitch – infers (assumes) the topology of the network from the octets of the node's IP address: the second octet identifies the data center and the third the rack.
EC2* – Ec2Snitch, Ec2MultiRegionSnitch
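The RackInferringSnitch idea fits in a few lines. The sketch below assumes the usual convention that the second octet of the IP names the data center and the third octet names the rack. The function name `infer_topology` and the sample addresses are hypothetical.

```python
from typing import Dict


def infer_topology(ip: str) -> Dict[str, str]:
    """Guess data center and rack from an IPv4 address (2nd octet = DC, 3rd = rack)."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return {"datacenter": octets[1], "rack": octets[2]}


# Two nodes in the same DC (20) but different racks, one node in another DC (21).
for ip in ("10.20.30.40", "10.20.31.41", "10.21.30.42"):
    print(ip, infer_topology(ip))
```

This also shows why the snitch only works when IP addressing is planned around the physical layout; otherwise PropertyFileSnitch is the safer choice.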
35. RING TOPOLOGY
When thinking of Cassandra, it is best to think of nodes as part of a ring topology, even for multiple DCs.
36. SimpleStrategy
Using the token generation values from before, on a 4 node cluster, write a value with token 32535295865117307932921825928971026432
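A simplified Python sketch of this placement: find the first node whose token is greater than or equal to the data's token (wrapping around the ring), then take the next nodes clockwise until RF replicas are chosen. The function name `simple_strategy_replicas` is invented for this example, and it ignores racks and data centers, as SimpleStrategy does.

```python
from bisect import bisect_left
from typing import List


def simple_strategy_replicas(node_tokens: List[int], key_token: int, rf: int) -> List[int]:
    """Return the tokens of the nodes that hold replicas for key_token."""
    ring = sorted(node_tokens)
    # First node whose token >= key_token; past the last token we wrap to the first node.
    start = bisect_left(ring, key_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(rf, len(ring)))]


# 4 node ring built with the initial token formula (step = 2^127 / 4 = 2^125).
ring_tokens = [n * 2 ** 125 for n in range(4)]
key_token = 32535295865117307932921825928971026432
for token in simple_strategy_replicas(ring_tokens, key_token, rf=3):
    print("replica on node with token", token)
```

With RF = 3 the value lands on the node owning token 2^125 and on the two nodes that follow it on the ring.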
39. NetworkTopologyStrategy
Using LOCAL_QUORUM allows the write to DC #2 to be asynchronous. The write is marked as a success when 2 of the 3 local replicas acknowledge it (http://www.datastax.com/dev/blog/deploying-cassandra-across-multipledata-centers)
40. COORDINATOR & CL
• When writing, a Coordinator Node will be selected. It is selected at write (or read) time. Not a SPOF!
• Using the Gossip Protocol, nodes share information with each other: who is up, who is down, who is taking which token ranges, etc. Every second, each node shares with 1 to 3 nodes.
• Consistency Level (CL) – says how many nodes must agree before an operation is a success. Set per read or write operation.
  • ONE – coordinator will wait for one node to ack the write (also TWO, THREE). ONE is the default if none is provided.
  • QUORUM – we saw that before: N / 2 + 1. Also LOCAL_QUORUM, EACH_QUORUM
  • ANY – waits for some replica. If all are down, still succeeds (a hint is stored). Only for writes. Doesn’t guarantee it can be read.
  • ALL – blocks waiting for all replicas
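As a rough aid for reasoning about the levels above, this Python sketch computes how many replica acknowledgements the coordinator waits for, given the replication factor per data center. The function name `required_acks` and the DC names are placeholders for this example. It is not how the driver or server exposes this.

```python
from typing import Dict


def required_acks(cl: str, rf_by_dc: Dict[str, int], local_dc: str) -> int:
    """Number of acks needed for a consistency level (simplified model)."""
    total_rf = sum(rf_by_dc.values())
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if cl == "ANY":
        return 1  # a stored hint is enough, so no replica has to be up
    if cl in fixed:
        return fixed[cl]
    if cl == "QUORUM":
        return total_rf // 2 + 1
    if cl == "LOCAL_QUORUM":
        return rf_by_dc[local_dc] // 2 + 1
    if cl == "EACH_QUORUM":
        # Total across DCs; each DC must individually reach its own quorum.
        return sum(rf // 2 + 1 for rf in rf_by_dc.values())
    if cl == "ALL":
        return total_rf
    raise ValueError(f"unknown consistency level: {cl}")


# Two data centers with RF 3 each, as in the NetworkTopologyStrategy example.
rf = {"DC1": 3, "DC2": 3}
for level in ("ANY", "ONE", "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL"):
    print(level, required_acks(level, rf, local_dc="DC1"))
```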
41. ENSURING CONSISTENCY
3 important concepts:
Read Repair – at read time, inconsistencies between replicas are noticed and the stale replicas are updated. Direct and background. Direct is determined by CL.
Anti-Entropy Node Repair – for data that is not read frequently, or to update data on a node that has been down for a while, run the nodetool repair process (also called anti-entropy repair). It builds Merkle trees, compares nodes and does the repair.
Hinted Handoff – writes are always sent to all replicas for the specified row regardless of the consistency level specified by the client. If a node happens to be down at the time of write, its corresponding replicas will save hints about the missed writes, and then hand off the affected rows once the node comes back online. The notification that the node is back happens via gossip. Hints are stored for a limited window (default 1 hour).
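Read repair is easiest to picture as "newest timestamp wins". The toy Python sketch below reads one key from every replica, picks the version with the latest timestamp and writes it back to any replica that is stale or missing the key. It is a deliberate simplification: real C* only contacts as many replicas as the CL requires for the direct part, and the name `read_repair` and the sample data are invented here.

```python
from typing import Any, Dict, Optional, Tuple

Versioned = Tuple[Any, int]                  # (value, write timestamp)
Replicas = Dict[str, Dict[str, Versioned]]   # node name -> key -> versioned value


def read_repair(replicas: Replicas, key: str) -> Optional[Any]:
    """Return the newest value for key and bring every replica up to date."""
    versions = [data[key] for data in replicas.values() if key in data]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[1])
    for data in replicas.values():
        if data.get(key) != newest:
            data[key] = newest  # repair the stale or missing copy
    return newest[0]


# node_b missed the second write, node_c missed both.
cluster: Replicas = {
    "node_a": {"user:1": ("new@example.com", 2)},
    "node_b": {"user:1": ("old@example.com", 1)},
    "node_c": {},
}
print(read_repair(cluster, "user:1"))
print(cluster)
```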
42. SUMMARY
C* provides a highly available, distributed, DC-aware DB with tunable consistency out of the box.
A lot of tools at your disposal.
Work closely with ops or devops.
Test, test and test again.
Don’t be afraid to use the C* community.
Brian Enochson
brian.enochson@gmail.com
Available for training, consulting, architecture & development.
Thank you!