Apache Cassandra is a massively scalable open source NoSQL database, and the amount of data we create and copy annually is doubling in size every two years; it is expected to reach 44 zettabytes, or 44 trillion gigabytes. We can assume that sooner or later a DBA will be handling a Cassandra database in their shop. This beginner/intermediate-level session takes you through my journey as an Oracle DBA during my first 100 days administering a Cassandra cluster, with several demos and all the roadblocks and successes I had along this path.
3. About Pythian
• 18 years of data infrastructure management consulting
• 200+ top brands
• 6000+ databases under management
• Over 400 DBAs in 35 countries
• Top 5% of DBA workforce, 9 Oracle ACEs, 2 Microsoft MVPs, 1 Cassandra MVP
• Oracle, Microsoft, MySQL, DataStax partners, Netezza, Hadoop and MongoDB, plus UNIX sysadmin and Oracle apps
4. Where does René come from
– Oracle DBA
  • Started with Version 9.2 in 2004
– Speaker at Oracle Open World, Developers Day and Collaborate
– APress Q1 2016: “Practical Data Refresh”
– Movie Fanatic & Music Lover
– Bringing the best from México (Mexihtli) to the rest of the world and in the process photographing it :)
– rene-ace.com
– @rene_ace
5. Where does Carlos come from
• Cassandra Consultant
• First contact was 0.8
• Cassandra MVP & DataStax Certified Architect
• Lisbon Cassandra Meetup
• Passion for distributed systems
• Loves a good challenge
• Waterpolo is my sport
• @cjrolo
7. 6th Happiest Job of 2015!
http://www.forbes.com/sites/susanadams/2014/03/20/the-happiest-and-unhappiest-jobs-in-2014/
• Work-life balance
• Relationship with boss and co-workers
• Daily tasks
• Job resources
• Field will grow by 15% between 2012 and 2022
• DBA can be the key driver of success
8. Happiest Job of 2034?
Oxford University: THE FUTURE OF EMPLOYMENT: HOW SUSCEPTIBLE ARE JOBS TO COMPUTERISATION?
• 47 percent of American jobs are at high risk of being taken by computers within the next two decades.
– 1st Wave
  • Computers will start replacing people in especially vulnerable fields like transportation/logistics, production labor, and administrative support.
– 2nd Wave
  • Dependent upon the development of good artificial intelligence. This could next put jobs in management, science and engineering, and the arts at risk.
9. What is Cassandra ?
• NoSQL database, developed in Java
• Fully distributed DB
  • Meaning that there is no master DB, unlike Oracle or MySQL.
• Linearly scalable
• Based on 2 core technologies, Google’s Bigtable and Amazon’s Dynamo
• 2 versions of Cassandra
  • Community Edition: distributed under the Apache™ License
  • Enterprise Edition: distributed by DataStax
10. What is Cassandra ? CAP Theorem
• In a distributed system you can only have two out of the following three guarantees across a write/read pair:
  • Consistency: A read is guaranteed to return the most recent write for a given client.
  • Availability: A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
  • Partition Tolerance: The system will continue to function when network partitions occur.
11. What is Cassandra ?
• Cassandra is a BASE (Basically Available, Soft state, Eventually consistent) type system
• Not an ACID (Atomicity, Consistency, Isolation, Durability) type system
12. It Can be as easy as …
• Start your machine and install the following:
  • ntp (Packages are normally ntp, ntpdate and ntp-doc)
  • wget (Unless you have your packages copied over via other means)
  • vim (Or your favorite text editor)
• Yum Package Management
• Root or sudo access to the install machine
• Latest version of Oracle Java SE Runtime Environment (JRE) 8 (recommended) or OpenJDK 7.
• Python 2.6+ (needed if installing OpsCenter)
13. It Can be as easy as …
• Install Cassandra.
~$ sudo yum install dsc21-2.1.5-1 cassandra21-2.1.5-1
• Install optional utilities.
~$ sudo yum install cassandra21-tools-2.1.5-1
• Stop the Cassandra service and clear the system data.
~$ sudo service cassandra stop
~$ sudo rm -rf /var/lib/cassandra/data/system/*
• In the cassandra-rackdc.properties file
# indicate the rack and dc for this node
dc=Pythian
rack=RAC1
• Start Cassandra service
~$ sudo service cassandra start
14. Where is everything in Cassandra?
Directories | Description
/var/lib/cassandra | Data directories
/var/log/cassandra | Log directory
/var/run/cassandra | Runtime files
/usr/share/cassandra | Environment settings
/usr/share/cassandra/lib | JAR files
/usr/bin | Optional utilities, such as sstablelevelreset, sstablerepairedset, and sstablesplit
/usr/bin, /usr/sbin | Binary files
/etc/cassandra | Configuration files
/etc/init.d | Service startup script
/etc/security/limits.d | Cassandra user limits
/etc/default |
/usr/share/doc/cassandra/examples | Sample cassandra.yaml files for stress testing
15. I come from this world…
12c Version Architecture…
16. I come from this world…
Oracle…
[Diagram: Oracle storage structures. Logical: database, schema, tablespace, segment, extent, Oracle data block. Physical: data files, control files, online redo log, OS block.]
17. I come from this world…
• RAC - For Node Point of Failure
[Diagram: RAC cluster of Node1, Node2 and Node3, each running ASM and a database instance on shared ASM disks, connected by public, storage, ASM and CSS networks]
• Global Data Services – Service Failover / Load Balancing
18. I come from this world…
• Dataguard - For Failover
[Diagram: Primary, Far Sync Instance and Standby, with SYNC and ASYNC transport; zero data loss failover]
20. One Ring to Rule them All
• The total amount of data managed by the cluster is represented as a ring
• Each node is assigned a part of the database to hold based on each table’s primary key.
• To guarantee both availability and durability, multiple nodes will be assigned to the same data.
• There is no master node; all nodes can perform all operations
[Diagram: a ring of four nodes (1 to 4); each node holds three key ranges: A-F,T-Z,M-S / G-L,A-F,T-Z / M-S,G-L,A-F / T-Z,M-S,G-L]
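The ring layout in the diagram can be reproduced with a few lines of Python. This is a minimal sketch, not Cassandra's actual placement code: it assumes four nodes that own the ranges A-F, G-L, M-S and T-Z in ring order, and a replication factor of 3 where each range is also copied to the next two nodes clockwise. The node numbering and the function name `ranges_per_node` are illustrative assumptions.

```python
# Sketch: which key ranges each node holds when every range is stored on its
# owner plus the next (rf - 1) nodes clockwise on the ring.
# Node numbers and range labels are illustrative assumptions.

def ranges_per_node(primary_ranges, rf):
    """primary_ranges: list of (node, range_label) pairs in ring order."""
    n = len(primary_ranges)
    held = {node: [] for node, _ in primary_ranges}
    for i, (_, label) in enumerate(primary_ranges):
        # Position i is the owner; the following positions hold the replicas.
        for step in range(rf):
            replica_node = primary_ranges[(i + step) % n][0]
            held[replica_node].append(label)
    return held

ring = [(1, "A-F"), (2, "G-L"), (3, "M-S"), (4, "T-Z")]
layout = ranges_per_node(ring, rf=3)
for node in sorted(layout):
    print(node, sorted(layout[node]))
```

With RF=3 on four nodes, every node ends up holding three of the four ranges, so any single node can be lost without losing data.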
21. Gossip
• Peer-to-peer communication protocol in which nodes periodically exchange state information
• Runs every second and exchanges state messages with up to three other nodes in the cluster
• Failure detection
  • It determines locally from gossip state and history if another node in the system is down or has come back up.
22. Consistent Hashing
• A hash consists of one or more arithmetic operations on a piece of data
• Common way of load balancing across several nodes
• Hash function must have an upper and a lower bound so objects can be mapped onto a circle
• Common hash algorithms
– Simple checksums
– Message Digest (MD5)
– Secure Hash Algorithm (SHA-1/2)
– MurmurHash
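The idea of mapping objects onto a bounded circle can be sketched in Python as follows. This is an illustration under stated assumptions, not Cassandra's implementation: it uses MD5 from the standard library (MurmurHash is not in the standard library), a 32-bit ring, and hypothetical node names. The class name `HashRing` and the helper `token_for` are made up for the example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # upper bound of the hash space; the lower bound is 0

def token_for(key):
    """Map a string onto the circle [0, RING_SIZE) using MD5."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

class HashRing:
    def __init__(self, nodes):
        # Each node is placed on the circle at the hash of its name.
        self._entries = sorted((token_for(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def node_for(self, key):
        """Return the first node at or after the key's token, wrapping around."""
        idx = bisect.bisect_left(self._tokens, token_for(key))
        return self._entries[idx % len(self._entries)][1]

small = HashRing(["node1", "node2", "node3"])
bigger = HashRing(["node1", "node2", "node3", "node4"])
keys = ["user:%d" % i for i in range(1000)]
moved = [k for k in keys if small.node_for(k) != bigger.node_for(k)]
# Only the keys that now land on the new node change owner.
print(len(moved), "of", len(keys), "keys moved")
```

The design point is that adding a node only moves the keys that fall between the new node and its predecessor on the circle; every other key keeps its owner.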
23. Partitioners
• Determines how data is distributed across the nodes in the cluster
• Function for deriving a token representing a row from its partition key
Cassandra offers:
– Murmur3Partitioner
– RandomPartitioner
– ByteOrderedPartitioner
24. Virtual Nodes
• Solution for avoiding calculating node tokens and thinking about the cluster size beforehand
• Each node has multiple virtual nodes
• Each virtual node owns a much smaller subset of data
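A rough way to picture virtual nodes is to give each physical node several positions on the ring instead of one. The sketch below is a simplification under assumptions: tokens come from MD5 of a made-up label such as `node1-vnode-0`, whereas Cassandra assigns vnode tokens itself (the count per node is set with `num_tokens` in cassandra.yaml). The helper names `build_ring` and `owner` are hypothetical.

```python
import hashlib
from collections import Counter

def token_for(value):
    """Hash a string to an integer token (MD5-based, for illustration only)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")

def build_ring(nodes, num_tokens):
    """Give every physical node `num_tokens` positions (vnodes) on the ring."""
    ring = []
    for node in nodes:
        for i in range(num_tokens):
            ring.append((token_for("%s-vnode-%d" % (node, i)), node))
    return sorted(ring)

def owner(ring, key):
    """First vnode whose token is >= the key's token, wrapping to the start."""
    t = token_for(key)
    for token, node in ring:
        if token >= t:
            return node
    return ring[0][1]

nodes = ["node1", "node2", "node3", "node4"]
keys = ["row-%d" % i for i in range(2000)]
for num_tokens in (1, 64):
    ring = build_ring(nodes, num_tokens)
    load = Counter(owner(ring, k) for k in keys)
    print(num_tokens, dict(load))
```

Running it with one token per node and then with 64 shows how many rows each node owns in both cases; with more vnodes each slice of the ring is smaller, which tends to even out the load without anyone computing tokens by hand.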
25. Coordinators
• Acts as a proxy between the client application and the nodes that own the data being requested
• Any client request can be sent to any node.
26. Snitch
• Is responsible for keeping all of the nodes up to date on what node has what data, what nodes are currently down, what nodes are bootstrapping, etc.
• It interprets the topology
The most popular are:
– Gossiping property file snitch
– EC2 Snitch
– EC2 Multi-region snitch
– Dynamic Snitch
29. A CASSANDRA TABLE OR COLUMN FAMILY
[Diagram: Cassandra node components. Coordinator, snitch, commitlog writer, memtable writer, memtable flush (SSTable writer), reader, memtables, bloom filters, CommitLog, SSTables]
30. A CASSANDRA TABLE OR COLUMN FAMILY
• Consists of one or more SSTables and 0 or more memtables
• SSTable stands for Sorted String Table.
  • I.e. all of the columns in the SSTable are sorted in order by key.
• Each SSTable consists of the data table, bloom filter, index and some other minor files.
• SSTables are immutable. Once written they are never altered, only read and eventually deleted.
SSTables on disk (/var/lib/cassandra):
videogames-events-data-jb-1.db
videogames-events-filters-jb-1.db
videogames-events-index-jb-1.db
videogames-events-data-jb-2.db
videogames-events-filters-jb-2.db
videogames-events-index-jb-2.db
videogames-events-data-jb-3.db
videogames-events-filters-jb-3.db
videogames-events-index-jb-3.db
videogames-events-data-jb-4.db
videogames-events-filters-jb-4.db
videogames-events-index-jb-4.db
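The relationship between the commit log, memtables and SSTables can be modelled with a toy Python class. This is a sketch under heavy simplification: it ignores bloom filters, indexes and timestamps, flushes after a fixed number of keys, and clears the whole commit log on flush. The class name `TableStore` and the `memtable_limit` parameter are invented for the example.

```python
class TableStore:
    """Toy model of one table on one node: commit log, memtable, SSTables."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []      # append-only record of every write
        self.memtable = {}        # in-memory and mutable
        self.sstables = []        # list of immutable, key-sorted tuples
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Turn the memtable into a new sorted, immutable SSTable."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs replay

    def read(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None

store = TableStore(memtable_limit=2)
store.write("b", 1)
store.write("a", 2)      # second write triggers a flush
store.write("a", 3)      # newer value stays in the memtable
print(store.sstables)    # [(('a', 2), ('b', 1))]
print(store.read("a"))   # 3
```

Note that the update to `a` does not touch the existing SSTable; the old value stays on disk until a later compaction discards it.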
31. REPLICATION FACTOR (RF) AND CONSISTENCY
• Replication Factor is the number of copies of columns stored in the ring
• Replication factor should not exceed the number of nodes in the cluster
– RF=1 is one copy; this means that the data for each column is stored only once in the ring.
– RF=3 (default) means every column stored in the database is stored three times.
– Quorum: The read and write must be acked/returned from a quorum of nodes.
32. REPLICATION FACTOR (RF) AND CONSISTENCY
• Consistency
– When a write or read is performed, the application can choose to wait for n copies of the data to be written or read; this is referred to as a consistency of n.
– There is a special consistency value called quorum, which means a response from (RF/2)+1 nodes is required, with RF/2 rounded down.
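The quorum arithmetic is easy to check in Python. The `quorum` function follows the formula on the slide; the `overlaps` helper adds the widely used rule of thumb that reads see the latest write when read replicas plus write replicas exceed RF, which is not stated on the slide and is included here as an assumption. Both function names are illustrative.

```python
def quorum(rf):
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1

def overlaps(rf, write_replicas, read_replicas):
    """True when any read set must share at least one replica with any write set."""
    return write_replicas + read_replicas > rf

for rf in (1, 3, 5):
    q = quorum(rf)
    print("RF=%d quorum=%d read/write overlap=%s" % (rf, q, overlaps(rf, q, q)))
print(overlaps(3, 1, 1))  # consistency of 1 for both reads and writes, RF=3
```

With RF=3 a quorum is 2, so quorum writes plus quorum reads always overlap on at least one replica, while reading and writing with a consistency of 1 does not guarantee that.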
33. HOW TO MAKE SURE WE DON’T LOSE DATA
• Three anti-entropy mechanisms in Cassandra
1) Hinted handoff
2) Read repair
3) Repair A.K.A. Anti-Entropy
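Of the three mechanisms, read repair is the simplest to illustrate. The sketch below assumes each replica is a plain dictionary mapping a key to a `(timestamp, value)` pair with non-negative timestamps, and that the newest timestamp wins; it is a conceptual model only, and `read_with_repair` is a hypothetical name, not a Cassandra API.

```python
def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (timestamp, value).

    Returns the newest value and pushes it to every replica that is
    missing the key or holds an older timestamp.
    """
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda tv: tv[0])
    for r in replicas:
        if r.get(key, (-1, None))[0] < newest[0]:
            r[key] = newest          # repair the stale or missing copy
    return newest[1]

replica_a = {"k": (2, "new")}
replica_b = {"k": (1, "old")}
replica_c = {}                        # missed the write entirely
print(read_with_repair([replica_a, replica_b, replica_c], "k"))  # new
print(replica_b, replica_c)
```

After the read, all three replicas hold the same newest version, which is the effect read repair aims for on the rows that actually get read.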
35. COMPACTIONS
• SSTables are immutable.
• Deletes and updates are just new writes
• SSTables are merged together by partition key. Old obsolete data is discarded.
• Lots of SSTables become a few.
• Compaction can require a lot of disk space. DO NOT LET your disks get more than 50% full.
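The merge step can be sketched in Python as a last-write-wins merge. This is a simplified model with assumed structures: each SSTable is a list of `(key, timestamp, value)` rows, a delete is a row whose value is a tombstone marker, and tombstones are dropped immediately, whereas Cassandra keeps them until `gc_grace_seconds` has passed. The names `compact` and `TOMBSTONE` are illustrative.

```python
TOMBSTONE = object()   # marker written by a delete

def compact(sstables):
    """Merge SSTables into one, keeping the newest version of each key.

    Each SSTable is a list of (key, timestamp, value) rows. Keys whose
    newest version is a tombstone are dropped from the result.
    """
    newest = {}
    for sstable in sstables:
        for key, ts, value in sstable:
            if key not in newest or ts > newest[key][0]:
                newest[key] = (ts, value)
    return [(k, ts, v) for k, (ts, v) in sorted(newest.items())
            if v is not TOMBSTONE]

old = [("a", 1, "v1"), ("b", 1, "v1"), ("c", 1, "v1")]
new = [("a", 2, "v2"), ("b", 3, TOMBSTONE)]
print(compact([old, new]))   # [('a', 2, 'v2'), ('c', 1, 'v1')]
```

The input SSTables still exist while the merged one is being written, which is one reason compaction needs free disk space.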
36. CQL - Cassandra Query Language
CQL is not SQL
• Default and primary interface into the Cassandra Database (since 2.0)
• Cassandra does not support joins or subqueries
• Only way to create users and user based permissions
• Syntax is very similar to SQL:
cqlsh> CREATE KEYSPACE sandbox WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1 };
cqlsh> USE sandbox;
cqlsh:sandbox> CREATE TABLE data (id uuid, data text, PRIMARY KEY (id));
cqlsh:sandbox> INSERT INTO data (id, data) VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, 'testing');
cqlsh:sandbox> SELECT * FROM data;
38.
Feature/Function | DSE/Cassandra | Oracle RDBMS
Core architecture | “Masterless”; peer-to-peer with all nodes being the same | Traditional standalone
High availability | Continuous availability with built in redundancy and hardware rack awareness in both single and multiple data centers | Oracle Dataguard (for failover) and Oracle RAC (Node SPOF), GoldenGate
Data model | Google Bigtable | Relational/tabular
Data consistency model | Tunable consistency (CAP theorem consistency) per operation | Traditional ACID
Storage model | Targeted directories with separation | Tablespaces
Logical database container | Keyspace | Database
Backup/recovery | Online, point-in-time restore | Online, point-in-time restore
Enterprise management/monitoring | DataStax OpsCenter | Oracle Enterprise Manager
39. LESSONS LEARNED
• Understand the Data Model Differences
• Hardware Setup does Matter
• Grep the logs for errors and warnings
• Make sure each node is created properly
• Know your tools
  • nodetool utility
  • Cassandra bulk loader (sstableloader)
  • jconsole/Java VisualVM
  • Cassandra-Stress
  • OpsCenter
41. FIT-ACER
• F – Focus (SLOW DOWN! Are you ready?)
• I – Identify server/DB name, time, authorization
• T – Type the command (do not hit enter yet)
• A – Assess the command (SPEND TIME HERE!)
• C – Check the server / database name again
• E – Execute the command
• R – Review and document the results
43. Thank you – Q&A
To contact us
sales@pythian.com
1-877-PYTHIAN
To follow us
http://www.pythian.com/blog
http://www.facebook.com/pages/The-Pythian-Group/163902527671
@pythian
http://www.linkedin.com/company/pythian