Introduction to Cassandra and CQL for Java developers
Julien Anguenot (@anguenot)
Houston Java User Group
July 30th, 2014
Agenda
C* overview
C* key features
C* key concepts
Getting started with C*
CQL
DataStax CQL Java driver
C* overview
© 2014 iland internet solutions
What is C*?
• Open source distributed storage system
• Essentially a partitioned row store
• A cross between Google’s BigTable (data model) and Amazon’s Dynamo (architecture)
• Runs on commodity hardware
• Optimized for non-relational models
• Cassandra Query Language (CQL)
• Written in Java
• Apache License v2.0
• An open source community
History
• Developed by Facebook for its inbox search
• Open sourced in 2008
• Apache Incubator project in 2009, top-level Apache project in 2010
• 1.0 released in 2011
• 2.0 released in 2013
• 2.1 to be released this year (2014)
C* is today
• One of the most popular “NoSQL” databases
• Used by many large organizations (Netflix, Instagram, Twitter, eBay, etc.)
• Contributors include Facebook, IBM, Twitter, Rackspace, etc.
• Cassandra 2.0+ and CQL 3.1
• Drivers and client libraries available for various languages: Python, Java, C++, C#, etc.
When to consider C*?
• Performance: writes are great, reads are good on very large datasets (hundreds of TB)
• Application running across multiple data centers in different geographic locations
• Application requiring HA with no SPOF (hundreds of nodes)
• Elastic scalability is critical
• Application running on commodity servers on premises or on VMs at your favorite IaaS
• Looking for simplicity over other solutions such as Hadoop / HBase
Cassandra vs HBase vs MongoDB
Let’s just get this out of the way
MongoDB to be considered if / when?
• (much) smaller datasets
• your application does not need to run across multiple data centers
• it is OK for your application to have a SPOF
• you do not need to scale out your application elastically
• write performance decreasing with the amount of data is not a big deal
HBase to be considered if / when?
• You do analytics: HBase running on Hadoop is a good option
• Your application has a very low transaction rate
• Your application does not need to run in multiple data centers
• You are not scared of moving parts
• Increasing the complexity of your overall application architecture is fine
C* key features
Scalability
• reads and writes scale linearly with the number of nodes: application throughput is proportional to the number of nodes
• hundreds of nodes supported
• no downtime when adding nodes
• no application-level interruption
• native multi-datacenter replication support
High Availability
• fault tolerant with tunable consistency (more on this later)
• data replicated to multiple nodes
• continuous availability: no SPOF (vs master / slave)
Performance
• low latency
• writes are great
• reads are good
• can handle hundreds of TB
Transaction Support
• commit log: provides the atomicity, isolation and durability parts of ACID compliance
• consistency is tunable (more on this later)
Simplicity
• all nodes in the cluster are the same
• configuration is simple
• operation is simple
Cassandra Query Language (CQL)
• SQL-like query language
• data is stored in tables containing rows of columns
• CQL v3 replaces the Thrift API and CQL v2
C* key concepts
Tunable consistency
• RDBMS: consistency and availability => transactions
• NoSQL: partition tolerance over consistency?
• Cassandra tunable consistency: trade off performance against accuracy on a per-query basis
• Write requests: all nodes, a quorum of nodes or any available node
• Read requests: all nodes (“strong consistency”), a quorum of nodes or any node
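The arithmetic behind these levels can be sketched in a few lines of plain Java. This is an illustration only, not driver code: the class and method names are made up for this example, and it assumes the usual rules that a quorum is a majority of the replicas (RF / 2 + 1, integer division) and that a read is guaranteed to see the latest acknowledged write when read replicas + write replicas > RF.

```java
// Illustrative helper; names are hypothetical and not part of Cassandra or the driver.
final class ConsistencyMath {
    private ConsistencyMath() {}

    /** Number of replicas that form a quorum (a majority) for a replication factor. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replicationFactor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    /**
     * True when the replicas contacted by a read always overlap with the replicas
     * that acknowledged the latest write, i.e. R + W > RF.
     */
    static boolean isStronglyConsistent(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With RF = 3, reading and writing at quorum (2 + 2 > 3) is strongly consistent, whereas reading and writing on a single replica (1 + 1) is not.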
Data model
• Flexible data storage: structured, semi-structured, unstructured
• Changes to data structures are dynamic
• strict minimum: essentially a distributed hash map
• low-level: requires the application to have extensive knowledge of the dataset
• Does not support a fully relational model: this is the application's responsibility
• No foreign keys, no JOIN
Partitioned row store
• keyspace (KS) is the primary container of data (like an RDBMS database)
• KS contains column families (CF) (like relational tables)
• CF contains rows and rows contain columns
• CF requires a primary key: the partition key (PK) is the first part of the primary key
• PK determines on which nodes the data is stored
• SELECT must include the PK
• remaining columns of the primary key are clustering columns (think ordering)
• INSERT / UPDATE / DELETE operations on rows with the same PK for a CF are atomic and isolated
• partitioning: C* transparently distributes data across multiple nodes (nodes can be added and removed)
• Secondary indexes possible
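As a mental model, the points above can be sketched as a map of partitions, each holding its rows sorted by clustering column. This is a toy in-memory illustration with hypothetical names, not how Cassandra stores data: in particular, "hash modulo node count" is an assumption standing in for the real partitioner and token ring.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of a single table (column family); all names are hypothetical.
final class PartitionedRowStoreSketch {
    private final int nodeCount;
    // partition key -> rows of that partition, kept sorted by clustering column value
    private final Map<String, NavigableMap<Integer, String>> partitions = new HashMap<>();

    PartitionedRowStoreSketch(int nodeCount) {
        if (nodeCount < 1) {
            throw new IllegalArgumentException("nodeCount must be >= 1");
        }
        this.nodeCount = nodeCount;
    }

    // Assumption: a simple hash stands in for the partitioner that picks the node.
    int nodeFor(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), nodeCount);
    }

    // INSERT and UPDATE behave the same way: create or replace the row.
    void upsert(String partitionKey, int clusteringValue, String value) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>()).put(clusteringValue, value);
    }

    // A SELECT restricted to one partition returns its rows in clustering order.
    List<String> selectPartition(String partitionKey) {
        NavigableMap<Integer, String> rows = partitions.get(partitionKey);
        if (rows == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(rows.values());
    }
}
```

Rows sharing a partition key live together and come back ordered, which is why single-partition queries are the cheap ones.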
Getting started
Where to get started?
• http://cassandra.apache.org/
  Apache Foundation project Web site
• http://planetcassandra.org/
  Community Web site
• http://www.datastax.com/
  Company providing Cassandra support and solutions to enterprises
  Lots of great documentation
Requirements
• Java >= 1.7 (Oracle JVM preferred)
• Python 2.7 (cqlsh only)
Downloading
• stable releases available from the Apache Foundation Web site
• binary distributions
• Debian / Ubuntu packages
• DataStax provides RPMs
• you can build C* from source (testing patches etc.)
Getting started with tarball distribution
# closer.cgi serves a mirror-selection page: download the tarball from one of the mirrors it lists
$ wget "http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.0.9/apache-cassandra-2.0.9-bin.tar.gz"

$ sudo mkdir -p /var/log/cassandra
$ sudo chown -R `whoami` /var/log/cassandra
$ sudo mkdir -p /var/lib/cassandra
$ sudo chown -R `whoami` /var/lib/cassandra
$ tar -xzf apache-cassandra-2.0.9-bin.tar.gz
$ cd apache-cassandra-2.0.9

$ bin/cassandra -f
Getting started with Debian / Ubuntu (1/2)
$ sudo vim /etc/apt/sources.list.d/java.list
deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main
deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer
$ sudo apt-get install oracle-java7-set-default
Getting started with Debian / Ubuntu (2/2)
$ sudo vim /etc/apt/sources.list.d/cassandra.list
deb http://www.apache.org/dist/cassandra/debian 20x main
deb-src http://www.apache.org/dist/cassandra/debian 20x main
$ sudo apt-get update
$ sudo apt-get install cassandra
Running the CQL shell
$ bin/cqlsh    # or simply cqlsh when installed from a package
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.9 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>
Cassandra Query Language (CQL)
Using CQL
• cqlsh
• DataStax driver
• simpler than the Thrift API
• hides C* internal implementation details
• native transport port: 9042
CQL basics
• usual statements
• CREATE / DROP / ALTER
• SELECT
• INSERT and UPDATE behave the same way (create or replace, i.e. upsert)
Keyspace (KS)
• “like” an RDBMS database but…
• replication strategy
• SimpleStrategy: simple single-DC cluster
• NetworkTopologyStrategy: multi-DC cluster
• replication factor (RF): total number of replicas across the cluster
• A replication factor of 1 means that there is only one copy of each row in the DC
• A replication factor of 2 means two copies of each row, where each copy is on a different node in every DC
• if RF > number of nodes: writes are rejected and reads depend on the consistency level
Creating KS: single node in a single DC
cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

1 node == 1 copy
Creating KS: 4 nodes cluster in a single DC (1/2)
cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

3 copies of data across 4 nodes
Creating KS: 4 nodes cluster in a single DC (2/2)
• first replica placed on a node determined by the partitioner
• additional replicas placed on the next nodes clockwise in the ring
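The clockwise placement can be sketched in plain Java as follows. This is a simplified illustration with hypothetical names: it treats the ring as an ordered list of node names and takes the index of the node chosen by the partitioner as an input, ignoring virtual nodes and racks.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified SimpleStrategy-style placement; names are hypothetical.
final class RingPlacementSketch {
    private RingPlacementSketch() {}

    /**
     * Returns the nodes holding a row: the node picked by the partitioner,
     * followed by the next nodes clockwise until replicationFactor is reached.
     */
    static List<String> replicas(List<String> ring, int firstReplicaIndex, int replicationFactor) {
        if (ring.isEmpty()) {
            throw new IllegalArgumentException("ring must contain at least one node");
        }
        if (firstReplicaIndex < 0 || firstReplicaIndex >= ring.size()) {
            throw new IllegalArgumentException("firstReplicaIndex out of range");
        }
        // A node holds at most one copy, so cap the count at the ring size.
        int count = Math.min(replicationFactor, ring.size());
        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(ring.get((firstReplicaIndex + i) % ring.size()));
        }
        return result;
    }
}
```

For the 4-node, RF 3 keyspace of the previous slide, a row whose first replica lands on the third node is also stored on the fourth and, wrapping around the ring, on the first.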
Multi-DC (NetworkTopologyStrategy)
• cluster deployed across multiple data centers
• specify how many replicas in each data center
• what to consider:
• local reads with low network latency
• failure scenarios
• disk space
• examples:
1. 2 replicas in each DC: 1 node can be down per DC and local reads are still possible at a consistency level of ONE (1).
2. 3 replicas in each DC: 1 node can be down per DC at a strong consistency level of LOCAL_QUORUM (2), or 2 nodes per DC at a consistency level of ONE, depending on the query consistency level.
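The two examples follow from simple per-DC arithmetic, sketched below in plain Java. The class and method names are hypothetical, and the sketch assumes that LOCAL_QUORUM needs a majority of the replicas in the local DC while ONE needs a single live replica.

```java
// Per-data-center failure tolerance; names are hypothetical, not a driver API.
final class LocalQuorumSketch {
    private LocalQuorumSketch() {}

    /** Replicas needed in the local DC for LOCAL_QUORUM. */
    static int localQuorum(int replicasInDc) {
        if (replicasInDc < 1) {
            throw new IllegalArgumentException("replicasInDc must be >= 1");
        }
        return replicasInDc / 2 + 1;
    }

    /** Replica nodes that can be down in a DC while LOCAL_QUORUM still succeeds. */
    static int toleratedFailuresAtLocalQuorum(int replicasInDc) {
        return replicasInDc - localQuorum(replicasInDc);
    }

    /** Replica nodes that can be down in a DC while ONE still succeeds locally. */
    static int toleratedFailuresAtOne(int replicasInDc) {
        if (replicasInDc < 1) {
            throw new IllegalArgumentException("replicasInDc must be >= 1");
        }
        return replicasInDc - 1;
    }
}
```

With 2 replicas per DC, LOCAL_QUORUM tolerates no failure at all, which is why 3 replicas per DC is the smallest setting that keeps strong local consistency with one node down.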
Creating KS: 2 DC of 3 nodes and RF 3
cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east' : 3, 'us-west' : 3 };

3 copies of data across 3 nodes in each DC (6 in total)
nodetool status <KS>
$ bin/nodetool status HJUG
Datacenter: us-east
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.241.206.82  989.91 GB  256     100.0%            1aeb620e-f22d-485b-b755-323f8e20388a  206
UN  10.241.206.80  989.14 GB  256     100.0%            aefbe1fc-3436-48ac-a07f-ac664c2b823f  206
UN  10.241.206.81  989.7 GB   256     100.0%            acd7b4db-7a3f-4dac-96ef-9389a2f807ba  206
Datacenter: us-west
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.243.206.80  989.7 GB   256     100.0%            3d8ea269-3e59-400c-9f77-727da2bcf8a6  206
UN  10.243.206.81  988.49 GB  256     100.0%            5832b870-fcfc-4046-a2d5-eff65fa53f4c  206
UN  10.243.206.82  987.92 GB  256     100.0%            b8d0792a-b5fb-433f-a9f6-ce1110a3420b  206
ALTER KEYSPACE <KS>
cqlsh> ALTER KEYSPACE HJUG WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east' : 3, 'us-west' : 2 };

You then need to run a repair (nodetool repair) so that the data matches the new replication settings
DROP KEYSPACE <KS>
cqlsh> drop keyspace HJUG;
cqlsh> drop keyspace if exists HJUG;

Immediate and irreversible removal
Using KS
cqlsh> use HJUG;
cqlsh> describe keyspace HJUG;
To go further
• partitioner
• snitch
• rack
• seeds
• nodetool
• read the configuration file
Creating table with a single primary key
cqlsh:HJUG> CREATE TABLE users (
    username varchar,
    password varchar,
    […],
    PRIMARY KEY (username));
Creating table with a compound primary key
cqlsh:HJUG> CREATE TABLE users (
    username varchar,
    location_id int,
    […],
    PRIMARY KEY (username, location_id));

partition key: username
clustering column (ordering): location_id
Creating table with a composite primary key
cqlsh:HJUG> CREATE TABLE users (
    username varchar,
    location_id int,
    […],
    PRIMARY KEY ((username, location_id)));

each (username, location_id) pair is stored in a separate partition of its own
ALTER TABLE <T>
cqlsh:HJUG> ALTER TABLE users ADD last_login varchar;
cqlsh:HJUG> ALTER TABLE users ALTER last_login TYPE timestamp;
cqlsh:HJUG> ALTER TABLE users DROP last_login;

cqlsh:HJUG> ALTER TABLE users WITH COMPRESSION = {'sstable_compression': ''};
DESCRIBE TABLE <T>
cqlsh> use HJUG;
cqlsh:HJUG> DESCRIBE TABLE users;

CREATE TABLE users (
    username varchar,
    location_id int,
    […],
    PRIMARY KEY (username, location_id)
) WITH
    […]
    compaction={'class': 'SizeTieredCompactionStrategy'} AND
    compression={'sstable_compression': 'LZ4Compressor'};
INSERT
cqlsh> INSERT INTO HJUG.users (username, location_id) VALUES ('janguenot', 'Houston');

cqlsh> use HJUG;
cqlsh:HJUG> INSERT INTO users (username, location_id) VALUES ('janguenot', 'Houston');

Note: these examples pass location_id as text; if location_id is declared as int, as in the CREATE TABLE slides, use an integer literal instead.
UPDATE
cqlsh:HJUG> UPDATE USERS set X='Y' where username='janguenot' and location_id = 'Houston';
SELECT
cqlsh:HJUG> SELECT * FROM USERS;
cqlsh:HJUG> SELECT * FROM USERS where username = 'janguenot' ORDER BY location_id ASC;
cqlsh:HJUG> SELECT * FROM USERS where username = 'janguenot';

Remember: ORDER BY can ONLY be used on clustering columns (part of the primary key), and only when the partition key is restricted in the WHERE clause
CQL predicates
• on partition keys: =, IN
• on clustering columns: <, <=, =, >=, >, IN
Performance considerations
• queries against a single partition are fast
• pk = <whatever>
• queries spanning multiple partitions are slow
• new disk seek for each partition
• queries spanning multiple clustering column values within a partition are fast
GROUP BY?
• use the partition key and clustering columns for grouping
• no GROUP BY statement
DELETE
cqlsh:HJUG> DELETE FROM USERS where username = 'janguenot' and location_id = 'Houston';

Deleted values are marked with tombstones and permanently removed by a later compaction (once gc_grace_seconds has elapsed)
TRUNCATE TABLE <T>
cqlsh:HJUG> truncate table users;
DROP TABLE <T>
cqlsh:HJUG> drop table users;
cqlsh:HJUG> drop table if exists users;
CQL Types
CQL Collections
cqlsh:HJUG> CREATE TABLE users (
    username varchar,
    password varchar,
    emails set<text>,
    PRIMARY KEY (username));
• Set, List and Map are supported
• one-to-many relationships
• they get serialized: keep them small or use an extra table
• lists, which are ordered, do not perform well: use a set if possible, or consider additional tables for large collections
Secondary Indexes
• Query against a column outside the primary key
• CREATE INDEX <index_name> ON <T>(<column>);
• SELECT * FROM T where column='x';
• Performance is good, not great, but keeps getting better
Final remarks about CQL
• no sequences: you manage UUIDs at the application level (time UUID types might be used for time series though)
• remember that the partition key alone is not necessarily the primary key: beware of UPDATE
• When in doubt, write: C* is good at it. Create tables and store data (one-to-one, one-to-many)
• Your application will drive your data model
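Since there are no sequences, identifiers are typically generated on the application side. A minimal sketch using only the JDK is shown below; the class name is hypothetical, and it produces random (version 4) UUIDs. `java.util.UUID` has no generator for time-based (version 1) UUIDs, so the time series case mentioned above is not covered here.

```java
import java.util.UUID;

// Application-side identifier generation; the class name is hypothetical.
final class UserIdGenerator {
    private UserIdGenerator() {}

    /** Returns a new random (version 4) UUID, suitable for a CQL uuid column. */
    static UUID newId() {
        return UUID.randomUUID();
    }
}
```

Each call yields a value that can be generated on any application node without coordination, which is the property a sequence cannot offer in a distributed store.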
To go further
• TTL
• Counters
• Static columns
• Lightweight transactions (IF, IF NOT EXISTS)
DataStax native CQL Java driver
Main features
• Provides CQL3 access to C* using Java
• Uses the C* CQL native protocol
• Tunable policies (including consistency)
• Load balancing / reconnection / failover / routing of requests
• prepared statements and batches
• Sync and async queries supported
• query tracing supported (for debugging purposes)
• Drivers available for Python, C++ and C# as well (similar API)
Driver modules
• driver-core: the core layer
• driver-examples: example applications using the other modules, meant for demonstration purposes only
Maven dependency
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.0.3</version>
</dependency>
Optional dependencies for compression
<dependency>
    <groupId>net.jpountz.lz4</groupId>
    <artifactId>lz4</artifactId>
    <version>1.2.0</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.0.5</version>
    <scope>runtime</scope>
</dependency>
Driver documentation
• Docs: http://www.datastax.com/documentation/developer/java-driver/2.0/index.html
• API: http://www.datastax.com/drivers/java/2.0
• Jira: https://datastax-oss.atlassian.net/browse/JAVA
• Mailing list: https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user
Open Source
• Apache v2 license
• https://github.com/datastax/java-driver
Examples
Step 1: connection to the cluster
Cluster.Builder clusterBuilder = Cluster.builder();

// Connect to one (1) node
clusterBuilder.addContactPoint("10.10.10.2");

// Connect to several nodes
clusterBuilder.addContactPoints("10.10.10.2", "10.10.10.3");

// Build the cluster
Cluster cluster = clusterBuilder.build();

// … do work with the cluster …

// Shut down the cluster (close() replaces shutdown() as of driver 2.0)
cluster.close();
Step 2: connection to a keyspace
// Create a session against the keyspace you want to interact with
Session session = cluster.connect("HJUG");

// Close the session (close() replaces shutdown() as of driver 2.0)
session.close();
Example 1: search queries and result set
// TODO catch exceptions
// Execute a query using the session
ResultSet result = session.execute("SELECT * FROM users;");

// Use one option or the other: a ResultSet can only be consumed once.
// Option 1: iterate over the results
Iterator<Row> iter = result.iterator();
while (iter.hasNext()) {
    Row row = iter.next();
    log.info(String.format("Found user w/ username=%s", row.getString("username")));
}

// Option 2: get all rows and iterate
List<Row> rows = result.all();
for (Row row : rows) {
    log.info(String.format("Found user w/ username=%s", row.getString("username")));
}
Example 2: inserting data
// TODO catch exceptions

// INSERT a new user (TODO: escape parameters when used this way)
// String values must be wrapped in single quotes in CQL
session.execute(String.format("INSERT INTO users (username, location_id) VALUES ('%s', '%s');",
    "Jim", "Houston"));
Example 3: prepared statements
// TODO catch exceptions
// Create a prepared statement that can be reused throughout the application.
// You only need to create it once.
// Note: filtering on location_id alone assumes the table allows it
// (e.g. a secondary index on location_id).
PreparedStatement usersByLocationStatement = session.prepare(
    "SELECT * FROM users WHERE location_id = ?;");

// Create bound statement
BoundStatement boundStatement = new BoundStatement(usersByLocationStatement);

// You can override the default consistency level defined at the cluster level
// on a per-query basis
boundStatement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

// Bind parameters
boundStatement.bind("Houston");

// Execute bound statement and get results
ResultSet resultSet = session.execute(boundStatement);
Example 4: Batch Statement
// TODO catch exceptions

// Create a batch statement
// Type LOGGED ensures atomicity
BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED);

// Batches only accept data modification statements (INSERT, UPDATE, DELETE),
// so prepare an INSERT here (insertUserStatement is a name chosen for this example)
PreparedStatement insertUserStatement = session.prepare(
    "INSERT INTO users (username, location_id) VALUES (?, ?);");

// Create bound statement and bind query parameters
BoundStatement boundStatement = new BoundStatement(insertUserStatement);
boundStatement.bind("Jim", "Houston");

// Add the bound statement to the batch
batchStatement.add(boundStatement);

// ... you can add several bound statements to the batch ...

// Execute batch
session.execute(batchStatement);
Example 5: Synchronous vs Asynchronous
// TODO catch exceptions

// INSERT a new user synchronously (TODO: escape parameters when used this way)
session.execute(String.format("INSERT INTO users (username, location_id) VALUES ('%s', '%s');",
    "Jim", "Houston"));

// INSERT a new user asynchronously (TODO: escape parameters when used this way)
session.executeAsync(String.format("INSERT INTO users (username, location_id) VALUES ('%s', '%s');",
    "Jim", "Houston"));
Example 6: batching result sets
// We will get <limit> items at offset <x>
// offset = x;
// limit = y;

// Create bound statement and bind query parameters
BoundStatement boundStatement = new BoundStatement(usersByLocationStatement);
boundStatement.setFetchSize(limit);
boundStatement.bind("Houston");

// Execute bound statement and get results
ResultSet resultSet = session.execute(boundStatement);

for (int i = 0; i < (offset / limit); i++) {
    // Fetch the number of pages needed
    resultSet.fetchMoreResults();
}

Iterator<Row> iter = resultSet.iterator();
for (int i = 0; i < offset; i++) {
    // Throw away results from earlier pages
    if (iter.hasNext()) {
        iter.next();
    }
}

final List<Row> rows = new ArrayList<>();
for (int i = 0; i < limit; i++) {
    // Keep results from desired page
    if (iter.hasNext()) {
        rows.add(iter.next());
    }
}
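The skip-then-collect part of this example does not depend on the driver and can be isolated in a small generic helper. The sketch below uses a hypothetical class name and works on any `Iterator`, so it can be tried without a cluster; with the driver, the iterator of the result set would presumably be passed in.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Generic offset/limit slicing over an iterator; the class name is hypothetical.
final class PageSlicer {
    private PageSlicer() {}

    /** Skips the first offset items, then returns up to limit items. */
    static <T> List<T> slice(Iterator<T> rows, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be >= 0");
        }
        // Throw away results from earlier pages.
        for (int i = 0; i < offset && rows.hasNext(); i++) {
            rows.next();
        }
        // Keep results from the desired page.
        List<T> page = new ArrayList<>();
        while (page.size() < limit && rows.hasNext()) {
            page.add(rows.next());
        }
        return page;
    }
}
```

Note that skipped rows are still read from the cluster, so the cost of a page grows with its offset.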
DataStax CQL driver rule #1
Use one Cluster instance per (physical) cluster (per application lifetime)
Cluster
• handles queries, connections and their policies
• share the cluster instance at the application level
• must be tuned according to the C* nodes / cluster configuration (timeouts, retries etc.)
• Consistency
Example of a more complex Cluster setup
// Initialize the cluster builder as in step 1.
// You can customize policies before build()
clusterBuilder
    .withQueryOptions(
        new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
    .withCompression(Compression.LZ4)
    .withSocketOptions(
        // Setting a value of 0 disables read timeouts: we let Cassandra time out
        // before the driver does.
        new SocketOptions().setConnectTimeoutMillis(1500)
            .setReadTimeoutMillis(0))
    .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("us-east"));
DataStax CQL driver rule #2
Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries
Session
• API centered around query execution
• manages per-node connection pools
• avoid a large number of sessions: they have a major impact on server resources (C* side)
• share the session instance at the application level
• one session per keyspace at most
• if there is a large number of keyspaces: use a pre-defined number of sessions
DataStax CQL driver rule #3
If you execute a statement more than once, consider using a PreparedStatement
Prepared statements
• prepare once, bind and execute multiple times
• parsed and prepared on the Cassandra nodes
• cache prepared statements at the application level
• only the prepared statement ID and the bound parameters are sent to the nodes
• performance gains are significant
• avoid binding null values to prepared statements where possible (a bound null writes a tombstone)
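One way to cache prepared statements at the application level is a small thread-safe map keyed by the CQL string. The sketch below is an assumption about how this could be done, not the driver's own mechanism: `StatementCache` is a hypothetical name, and it is generic so that it compiles without the driver on the classpath. With the driver, the preparer function would presumably be `session::prepare` and `P` would be `PreparedStatement`.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Application-level cache of prepared statements; the class name is hypothetical.
final class StatementCache<P> {
    private final ConcurrentMap<String, P> cache = new ConcurrentHashMap<>();
    private final Function<String, P> preparer;

    StatementCache(Function<String, P> preparer) {
        this.preparer = Objects.requireNonNull(preparer, "preparer");
    }

    /** Returns the cached statement for this CQL string, preparing it on first use. */
    P get(String cql) {
        return cache.computeIfAbsent(cql, preparer);
    }

    /** Number of distinct statements prepared so far. */
    int size() {
        return cache.size();
    }
}
```

Because `computeIfAbsent` runs the preparer at most once per key, each distinct query is prepared a single time even when several threads ask for it concurrently.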
DataStax CQL driver rule #4
You can reduce the number of network round trips and also get atomic operations by using batches
Batch operations
• single request
• combines multiple data modification statements into a single logical operation
• atomic operation: all statements pass or fail
• can combine batches and prepared statements
• keep the batch size below the value specified in the configuration file: batch_size_warn_threshold_in_kb (5 KB by default)
Thanks!
Slides available @ http://www.slideshare.net/anguenot/cassandra-cql-javahjug20140730
@anguenot / ja@iland.com

iland: http://www.iland.com
We are hiring in Houston!
https://www.linkedin.com/company/iland-internet-solutions/careers
!
Introduction to Cassandra and CQL for Java developers

Contenu connexe

Tendances

Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
AddThis: Scaling Cassandra up and down into containers with ZFS
AddThis: Scaling Cassandra up and down into containers with ZFSAddThis: Scaling Cassandra up and down into containers with ZFS
AddThis: Scaling Cassandra up and down into containers with ZFSDataStax Academy
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsDataStax Academy
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
 
Hindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraHindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraMichael Kjellman
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0J.B. Langston
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterScyllaDB
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraDataStax
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationRamkumar Nottath
 
PagerDuty: One Year of Cassandra Failures
PagerDuty: One Year of Cassandra FailuresPagerDuty: One Year of Cassandra Failures
PagerDuty: One Year of Cassandra FailuresDataStax Academy
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...DataStax
 
Cassandra
CassandraCassandra
Cassandraexsuns
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyBenjamin Black
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarDataStax Academy
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015Ivan Glushkov
 

Tendances (20)

Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
AddThis: Scaling Cassandra up and down into containers with ZFS
AddThis: Scaling Cassandra up and down into containers with ZFSAddThis: Scaling Cassandra up and down into containers with ZFS
AddThis: Scaling Cassandra up and down into containers with ZFS
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Spark
 
Hindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraHindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to Cassandra
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migration
 
PagerDuty: One Year of Cassandra Failures
PagerDuty: One Year of Cassandra FailuresPagerDuty: One Year of Cassandra Failures
PagerDuty: One Year of Cassandra Failures
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
What We Learned About Cassandra While Building go90 (Christopher Webster & Th...
 
Cassandra
CassandraCassandra
Cassandra
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and Consistency
 
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag JambhekarC* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
C* Summit 2013: Cassandra at eBay Scale by Feng Qu and Anurag Jambhekar
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015
 

Introduction to Cassandra and CQL for Java developers

  • 1. Introduction to Cassandra and CQL for Java developers. Julien Anguenot (@anguenot), Houston Java User Group, July 30th, 2014
  • 2. Agenda
      • C* overview
      • C* key features
      • C* key concepts
      • Getting started with C*
      • CQL
      • DataStax CQL Java driver
  • 4. What is C*? (© 2014 iland internet solutions)
      • Open source distributed storage system
      • Essentially a partitioned row store
      • A cross between Google’s BigTable (data model) and Amazon’s Dynamo (architecture)
      • Runs on commodity hardware
      • Optimized for non-relational models
      • Cassandra Query Language (CQL)
      • Written in Java
      • Apache Licence v2.0
      • An open source community
  • 5. History
      • Developed by Facebook for its inbox search
      • Open sourced in 2008
      • Apache Foundation top project in 2009
      • 1.0 released in 2011
      • 2.0 released in 2013
      • 2.1 to be released this year
  • 6. C* is today
      • One of the most popular “NoSQL” databases
      • Used by many (and large) organizations (Netflix, Instagram, Twitter, eBay, etc.)
      • Contributors include Facebook, IBM, Twitter, Rackspace, etc.
      • Cassandra 2.0+ and CQL 3.1
      • Drivers and client libs available for various languages: Python, Java, C++, C#, etc.
  • 7. When to consider C*?
      • Performance: writes are great and reads are good on very large datasets (hundreds of TB)
      • Application running across multiple data centers in different geographic locations
      • Application requiring HA with no SPOF (hundreds of nodes)
      • Elastic scalability is critical
      • Application running on commodity servers on premises or on VMs at your favorite IaaS
      • Looking for simplicity over other solutions such as Hadoop / HBase
  • 8. Cassandra vs HBase vs MongoDB: let’s just get this out of the way
  • 9. MongoDB to be considered if / when?
      • (much) smaller datasets
      • your application does not need to run across multiple data centers
      • it is ok for your application to have a SPOF
      • you do not need to scale out your application elastically
      • write performance decreasing with the amount of data is not a big deal
  • 10. HBase to be considered if / when?
      • You do analytics: HBase running on Hadoop is a good option
      • Your application has a very low transaction rate
      • Your application does not need to run in multiple data centers
      • You are not scared of moving parts
      • Increasing the complexity of your application's overall architecture is fine
  • 12. Scalability
      • reads and writes scale linearly with the number of nodes: application throughput grows in proportion to the number of nodes
      • hundreds of nodes supported
      • no downtime when adding nodes
      • no application-level interruption
      • native multi-datacenter replication support
  • 13. High Availability
      • fault tolerant with tunable consistency (more on this later)
      • data replicated to multiple nodes
      • continuous availability: no SPOF (vs master / slave)
  • 14. Performance
      • low latency
      • write is great
      • read is good
      • can handle hundreds of TB
  • 15. Transaction Support
      • commit log: provides the atomicity, isolation and durability parts of ACID
      • consistency is tunable (more on this later)
  • 16. Simplicity
      • all nodes in the cluster are the same
      • configuration is simple
      • operation is simple
  • 17. Cassandra Query Language (CQL)
      • SQL-like query language
      • data is stored in tables containing rows of columns
      • v3 replaces the Thrift API and CQL v2
  • 19. Tunable consistency
      • RDBMS: consistency and availability => transactions
      • NoSQL: partition tolerance over consistency?
      • Cassandra tunable consistency: trade-off between performance and accuracy on a per-query basis
      • Write requests: all nodes, a quorum of nodes or any available node
      • Read requests: all nodes (“strong consistency”), a quorum of nodes or any node
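
The per-query trade-off above can be made concrete with a little arithmetic. The sketch below is a toy model, not driver code: the `Level` enum and both method names are hypothetical, and it only covers three levels (ONE, QUORUM, ALL). It relies on the usual rule of thumb that a read is guaranteed to see the latest write when the number of replicas acknowledging the write plus the number consulted by the read exceeds the replication factor.

```java
/** Toy model of tunable consistency (hypothetical names, not the driver API). */
final class ConsistencySketch {

    /** Illustrative subset of consistency levels. */
    enum Level { ONE, QUORUM, ALL }

    /** Number of replicas that must answer a request at the given level. */
    static int required(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // strict majority of the replicas
            default:
                return replicationFactor;         // ALL
        }
    }

    /** True when every read replica set must intersect every write replica set. */
    static boolean overlaps(Level write, Level read, int replicationFactor) {
        return required(write, replicationFactor) + required(read, replicationFactor)
                > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        System.out.println("QUORUM needs " + required(Level.QUORUM, rf) + " of " + rf + " replicas");
        System.out.println("QUORUM write + QUORUM read overlap: "
                + overlaps(Level.QUORUM, Level.QUORUM, rf));
        System.out.println("ONE write + ONE read overlap: "
                + overlaps(Level.ONE, Level.ONE, rf));
    }
}
```

With a replication factor of 3, writing and reading at QUORUM overlap, while writing and reading at ONE trade that guarantee for lower latency.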
  • 20. Data model
      • Flexible data storage: structured, semi-structured, unstructured
      • Changes to data structures are dynamic
      • strict minimum: essentially a distributed hash map
      • low-level: requires the application to have extensive knowledge about the dataset
      • Does not support a fully relational model: this is the application's responsibility
      • No foreign keys, no JOIN
  • 21. Partitioned row store
      • keyspace (KS) is the primary container of data (like an RDBMS database)
      • KS contains column families (CF) (like relational tables)
      • CF contains rows and rows contain columns
      • CF requires a primary key: the partition key (PK) is the first part of the primary key
      • PK determines on which nodes the data is stored
      • SELECT must include PK
      • the remaining columns of the primary key are clustering columns (think ordering)
      • INSERT / UPDATE / DELETE operations on rows with the same PK for a CF are atomic and isolated
      • partitioning: C* transparently distributes data across multiple nodes (nodes can be added and removed)
      • Secondary indexes possible
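
One way to picture the partition key / clustering column split is as a hash map of sorted maps. The sketch below is an in-memory analogy only, assuming Java 8 or later; the class and method names are hypothetical, and the `username` / `location_id` columns are borrowed from the table examples later in the deck. The outer map plays the role of the partition key lookup, and the inner sorted map keeps rows ordered by clustering column.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory analogy of a partitioned row store (hypothetical, single process). */
final class PartitionedRowStoreSketch {

    // partition key -> (clustering column -> row); rows stay sorted inside a partition
    private final Map<String, NavigableMap<Integer, Map<String, String>>> partitions =
            new HashMap<>();

    /** Writes one column of the row identified by (username, locationId). */
    void put(String username, int locationId, String column, String value) {
        partitions.computeIfAbsent(username, k -> new TreeMap<>())
                  .computeIfAbsent(locationId, k -> new HashMap<>())
                  .put(column, value);
    }

    /** Rows of one partition with clustering column in [from, to], in clustering order. */
    NavigableMap<Integer, Map<String, String>> slice(String username, int from, int to) {
        NavigableMap<Integer, Map<String, String>> rows = partitions.get(username);
        if (rows == null) {
            return new TreeMap<>();
        }
        return rows.subMap(from, true, to, true);
    }

    public static void main(String[] args) {
        PartitionedRowStoreSketch store = new PartitionedRowStoreSketch();
        store.put("janguenot", 3, "city", "c");
        store.put("janguenot", 1, "city", "a");
        store.put("janguenot", 2, "city", "b");
        // Prints the clustering keys 1 and 2 in sorted order.
        System.out.println(store.slice("janguenot", 1, 2).keySet());
    }
}
```

The analogy also suggests why a query needs the partition key: without it there is no single inner map to look into.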
  • 23. Where to get started?
      • http://cassandra.apache.org/ : Apache Foundation project Web site
      • http://planetcassandra.org/ : community Web site
      • http://www.datastax.com/ : company providing Cassandra support and solutions to enterprises; lots of great documentation
  • 24. Requirements
      • Java >= 1.7 (prefer Oracle JVM)
      • Python 2.7 (cqlsh only)
  • 25. Downloading
      • stable releases available from the Apache Foundation Web site
      • binary distributions
      • Debian / Ubuntu packages
      • DataStax provides RPMs
      • you can build C* from source (testing patches etc.)
  • 26. Getting started with tarball distribution
      $ wget http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.0.9/apache-cassandra-2.0.9-bin.tar.gz
      $ sudo mkdir -p /var/log/cassandra
      $ sudo chown -R `whoami` /var/log/cassandra
      $ sudo mkdir -p /var/lib/cassandra
      $ sudo chown -R `whoami` /var/lib/cassandra
      $ tar -xzf apache-cassandra-2.0.9-bin.tar.gz
      $ cd apache-cassandra-2.0.9
      $ bin/cassandra -f
  • 27. Getting started with Debian / Ubuntu (1/2)
      $ sudo vim /etc/apt/sources.list.d/java.list
          deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main
          deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main
      $ sudo apt-get update
      $ sudo apt-get install oracle-java7-installer
      $ sudo apt-get install oracle-java7-set-default
  • 28. Getting started with Debian / Ubuntu (2/2)
      $ sudo vim /etc/apt/sources.list.d/cassandra.list
          deb http://www.apache.org/dist/cassandra/debian 20x main
          deb-src http://www.apache.org/dist/cassandra/debian 20x main
      $ sudo apt-get update
      $ sudo apt-get install cassandra
  • 29. Running the CQL shell
      $ (bin/)cqlsh
      Connected to Test Cluster at localhost:9160.
      [cqlsh 4.1.1 | Cassandra 2.0.9 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
      Use HELP for help.
      cqlsh>
  • 31. Using CQL
      • cqlsh
      • DataStax driver
      • simpler than the Thrift API
      • hides C* internal implementation details
      • native transport port: 9042
  • 32. CQL basics
      • usual statements
      • CREATE / DROP / ALTER
      • SELECT
      • INSERT and UPDATE are the same (create or replace)
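
The "create or replace" behaviour of INSERT and UPDATE is sometimes called an upsert. A minimal sketch of the idea, assuming Java 8 or later and using hypothetical class and method names: both operations go through the same write path keyed by the primary key, so an update of a missing row creates it and an insert of an existing row overwrites it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of upsert semantics (hypothetical names, not the driver API). */
final class UpsertSketch {

    // primary key -> (column name -> value)
    private final Map<String, Map<String, String>> rowsByPrimaryKey = new HashMap<>();

    /** Shared write path: creates the row if absent, then sets the column. */
    private void write(String primaryKey, String column, String value) {
        rowsByPrimaryKey.computeIfAbsent(primaryKey, k -> new HashMap<>()).put(column, value);
    }

    void insert(String primaryKey, String column, String value) {
        write(primaryKey, column, value);
    }

    void update(String primaryKey, String column, String value) {
        write(primaryKey, column, value);
    }

    /** Returns the column value, or null when the row or column does not exist. */
    String get(String primaryKey, String column) {
        Map<String, String> row = rowsByPrimaryKey.get(primaryKey);
        return row == null ? null : row.get(column);
    }

    int rowCount() {
        return rowsByPrimaryKey.size();
    }

    public static void main(String[] args) {
        UpsertSketch table = new UpsertSketch();
        table.update("janguenot", "password", "first");  // no such row yet: it gets created
        table.insert("janguenot", "password", "second"); // row exists: value is replaced
        System.out.println(table.rowCount() + " row, password=" + table.get("janguenot", "password"));
    }
}
```

The practical consequence is that an UPDATE with a mistyped key silently creates a new row instead of failing.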
  • 33. Keyspace (KS)
      • “like” an RDBMS database but…
      • replication strategy
          • SimpleStrategy: simple single-DC cluster
          • NetworkTopologyStrategy: multi-DC cluster
      • replication factor: total number of replicas across the cluster
          • A replication factor of 1 means that there is only one copy of each row in the DC
          • A replication factor of 2 means two copies of each row, where each copy is on a different node in every DC
      • if RF > # nodes: writes are rejected and reads will depend on the consistency level
  • 34. Creating KS: single node in a single DC
      cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
      1 node == 1 copy
  • 35. Creating KS: 4-node cluster in a single DC (1/2)
      cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
      3 copies of data across 4 nodes
  • 36. Creating KS: 4-node cluster in a single DC (2/2)
      • first replica placed on a node determined by the partitioner
      • additional replicas placed on the next nodes clockwise in the ring
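
The clockwise placement described above can be sketched with a sorted map of tokens. This is a simplified model, assuming Java 8 or later, one token per node (no virtual nodes) and a toy hash in place of a real partitioner; the class, the method names and the 0 to 99 token range are all hypothetical. The first replica is the node owning the first token at or after the key's token, and the remaining replicas are the following nodes on the ring, wrapping around at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy token ring with SimpleStrategy-like placement (hypothetical, one token per node). */
final class RingSketch {

    private final TreeMap<Integer, String> ring = new TreeMap<>(); // token -> node name

    void addNode(String node, int token) {
        ring.put(token, node);
    }

    /** Toy partitioner: maps a key to a token in [0, 100). Not what C* really uses. */
    static int token(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), 100);
    }

    /** Replicas for a partition key, first replica first. */
    List<String> replicas(String partitionKey, int replicationFactor) {
        return replicasForToken(token(partitionKey), replicationFactor);
    }

    /** Walks the ring clockwise from the given token, collecting distinct nodes. */
    List<String> replicasForToken(int token, int replicationFactor) {
        List<String> result = new ArrayList<>();
        if (ring.isEmpty()) {
            return result;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        Integer current = ring.ceilingKey(token);
        if (current == null) {
            current = ring.firstKey(); // wrap around past the highest token
        }
        while (result.size() < wanted) {
            result.add(ring.get(current));
            current = ring.higherKey(current);
            if (current == null) {
                current = ring.firstKey();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        ring.addNode("n1", 0);
        ring.addNode("n2", 25);
        ring.addNode("n3", 50);
        ring.addNode("n4", 75);
        // 3 copies across 4 nodes: token 30 lands on n3, then n4, then wraps to n1.
        System.out.println(ring.replicasForToken(30, 3));
    }
}
```

In this model a replication factor larger than the node count is simply capped at the number of nodes, which is a simplification chosen for the sketch.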
  • 37. Multi-DC (NetworkTopologyStrategy)
      • cluster deployed across multiple data centers
      • specify how many replicas in each data center
      • what to consider:
          • local reads with low network latency
          • failure
          • disk space
      • example:
          1. 2 replicas in each DC: 1 node can be down per DC and local reads are still possible at a consistency level of ONE (1 replica)
          2. 3 replicas in each DC: 1 node can be down per DC and local reads are still possible at the strong consistency level LOCAL_QUORUM (2 replicas), depending on the query consistency level
  • 38. Creating KS: 2 DCs of 3 nodes and RF 3
      cqlsh> CREATE KEYSPACE HJUG WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east' : 3, 'us-west' : 3 };
      3 copies of data across 3 nodes in each DC (6 in total)
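
When keyspaces are created from application code, the replication map is often assembled from configuration. The helper below is a hypothetical sketch that only builds the CQL text shown above from a map of data center names to replica counts; it assumes Java 8 or later and trusted input (it does no quoting or validation), and it does not talk to a cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Builds a CREATE KEYSPACE statement as text (hypothetical helper, trusted input only). */
final class KeyspaceCql {

    static String createNetworkTopologyKeyspace(String keyspace,
                                                Map<String, Integer> replicasPerDc) {
        StringJoiner options = new StringJoiner(", ", "{ ", " }");
        options.add("'class' : 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : replicasPerDc.entrySet()) {
            // CQL string literals use straight single quotes.
            options.add("'" + dc.getKey() + "' : " + dc.getValue());
        }
        return "CREATE KEYSPACE " + keyspace + " WITH REPLICATION = " + options + ";";
    }

    public static void main(String[] args) {
        Map<String, Integer> replicas = new LinkedHashMap<>(); // keeps insertion order
        replicas.put("us-east", 3);
        replicas.put("us-west", 3);
        System.out.println(createNetworkTopologyKeyspace("HJUG", replicas));
    }
}
```

The data center names have to match the names the cluster's snitch reports, which is why they are passed in rather than hard-coded.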
  • 39. nodetool status <KS>
      $ bin/nodetool status HJUG
      Datacenter: us-east
      ===============
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
      UN  10.241.206.82  989.91 GB  256     100.0%            1aeb620e-f22d-485b-b755-323f8e20388a  206
      UN  10.241.206.80  989.14 GB  256     100.0%            aefbe1fc-3436-48ac-a07f-ac664c2b823f  206
      UN  10.241.206.81  989.7 GB   256     100.0%            acd7b4db-7a3f-4dac-96ef-9389a2f807ba  206
      Datacenter: us-west
      ===============
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
      UN  10.243.206.80  989.7 GB   256     100.0%            3d8ea269-3e59-400c-9f77-727da2bcf8a6  206
      UN  10.243.206.81  988.49 GB  256     100.0%            5832b870-fcfc-4046-a2d5-eff65fa53f4c  206
      UN  10.243.206.82  987.92 GB  256     100.0%            b8d0792a-b5fb-433f-a9f6-ce1110a3420b  206
  • 40. ALTER KEYSPACE <KS>
      cqlsh> ALTER KEYSPACE HJUG WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east' : 3, 'us-west' : 2 };
      You then need to run a repair
  • 41. DROP KEYSPACE <KS>
      cqlsh> drop keyspace HJUG;
      cqlsh> drop keyspace if exists HJUG;
      Immediate and irreversible removal
  • 42. Using KS
      cqlsh> use HJUG;
      cqlsh> describe keyspace HJUG;
  • 43. To go further
      • partitioner
      • snitch
      • rack
      • seeds
      • nodetool
      • read the configuration file
  • 44. Creating table with a single primary key
      cqlsh:HJUG> CREATE TABLE users (
          username varchar,
          password varchar,
          […],
          PRIMARY KEY (username));
  • 45. Creating table with a compound primary key
      cqlsh:HJUG> CREATE TABLE users (
          username varchar,
          location_id int,
          […],
          PRIMARY KEY (username, location_id));
      partition key: username
      clustering column (ordering): location_id
  • 46. Creating table with a composite partition key
      cqlsh:HJUG> CREATE TABLE users (
          username varchar,
          location_id int,
          […],
          PRIMARY KEY ((username, location_id)));
      each row is stored in a partition of its own
  • 47. ALTER TABLE <T>
      cqlsh:HJUG> ALTER TABLE users ADD last_login varchar;
      cqlsh:HJUG> ALTER TABLE users ALTER last_login TYPE timestamp;
      cqlsh:HJUG> ALTER TABLE users DROP last_login;
      cqlsh:HJUG> ALTER TABLE users WITH COMPRESSION = {'sstable_compression': ''};
  • 48. DESCRIBE TABLE <T>
      cqlsh> use HJUG;
      cqlsh:HJUG> DESCRIBE TABLE users;
      CREATE TABLE users (
          username varchar,
          location_id int,
          […],
          PRIMARY KEY (username, location_id)
      ) WITH
          […]
          compaction={'class': 'SizeTieredCompactionStrategy'} AND
          compression={'sstable_compression': 'LZ4Compressor'};
  • 49. INSERT
      cqlsh> INSERT INTO HJUG.users (username, location_id) VALUES ('janguenot', 'Houston');
      cqlsh> use HJUG;
      cqlsh:HJUG> INSERT INTO users (username, location_id) VALUES ('janguenot', 'Houston');
  • 50. UPDATE
      cqlsh:HJUG> UPDATE users SET X = 'Y' WHERE username = 'janguenot' AND location_id = 'Houston';
  • 51. SELECT
      cqlsh:HJUG> SELECT * FROM users;
      cqlsh:HJUG> SELECT * FROM users WHERE username = 'janguenot' ORDER BY location_id ASC;
      cqlsh:HJUG> SELECT * FROM users WHERE username = 'janguenot';
      Remember: ORDER BY can ONLY be used with columns that are part of the primary key (the clustering columns), and the query must restrict the partition key
  • 52. CQL predicates
      • on partition keys: =, IN
      • on clustering columns: <, <=, =, >=, >, IN
  • 53. Performance considerations
      • queries against a single partition are fast
          • pk = <whatever>
      • queries spanning multiple partitions are slow
          • new disk seek for each partition
      • queries spanning multiple clustering columns are fast
  • 54. GROUP BY?
      • use the partition key and clustering columns for grouping
      • no GROUP BY statement
  • 55. DELETE
      cqlsh:HJUG> DELETE FROM users WHERE username = 'janguenot' AND location_id = 'Houston';
      Deleted values are permanently removed only after the next compaction
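
The note about compaction reflects how deletes work: a delete first records a marker (commonly called a tombstone) that hides the value, and the space is reclaimed later. The sketch below is a toy, single-map model with hypothetical names, assuming Java 8 or later; it ignores the grace period a real cluster waits before purging markers and everything related to on-disk files.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of delete markers purged by compaction (hypothetical, in memory only). */
final class TombstoneSketch {

    private static final Object TOMBSTONE = new Object(); // marker standing in for a delete

    private final Map<String, Object> cells = new HashMap<>();

    void write(String key, String value) {
        cells.put(key, value);
    }

    /** A delete is itself a write: it stores a marker instead of removing the entry. */
    void delete(String key) {
        cells.put(key, TOMBSTONE);
    }

    /** Reads treat a marker exactly like a missing value. */
    String read(String key) {
        Object cell = cells.get(key);
        return (cell == null || cell == TOMBSTONE) ? null : (String) cell;
    }

    /** Number of entries physically stored, markers included. */
    int storedEntries() {
        return cells.size();
    }

    /** Compaction drops the markers, which is when the space is actually reclaimed. */
    void compact() {
        cells.values().removeIf(cell -> cell == TOMBSTONE);
    }

    public static void main(String[] args) {
        TombstoneSketch table = new TombstoneSketch();
        table.write("janguenot", "Houston");
        table.delete("janguenot");
        System.out.println("read=" + table.read("janguenot") + " stored=" + table.storedEntries());
        table.compact();
        System.out.println("stored after compaction=" + table.storedEntries());
    }
}
```

Between the delete and the compaction the value is invisible to readers but still occupies storage, which is the behaviour the slide warns about.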
  • 56. TRUNCATE TABLE <T>
      cqlsh:HJUG> truncate table users;
  • 57. DROP TABLE <T>
      cqlsh:HJUG> drop table users;
      cqlsh:HJUG> drop table if exists users;
  • 58. CQL Types
  • 59. CQL Collections
      cqlsh:HJUG> CREATE TABLE users (
          username varchar,
          password varchar,
          emails set<text>,
          PRIMARY KEY (username));
      • Set, List and Map are supported
      • one-to-many relationships
      • collections get serialized: keep them small or use an extra table
      • lists, which are ordered, do not perform well: use a set if possible, or consider additional tables for large collections
  • 60. Secondary Indexes
      • Query against a column outside the primary key
      • CREATE INDEX <index_name> ON <T>(<column>);
      • SELECT * FROM T WHERE column = 'x';
      • Performance is good, not great, but steadily improving
  • 61. Final remarks about CQL
      • no sequences: you manage UUIDs at the app level (time UUID types might be used for time series though)
      • remember that the partition key is not the primary key: beware of UPDATE
      • When in doubt, write: C* is good at it. Create tables and store the data (one-to-one, one-to-many)
      • Your application will drive your data model!
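
Since there are no sequences, identifiers are typically generated by the application before the row is written. A minimal sketch using only the JDK; the class and method names are hypothetical. `java.util.UUID.randomUUID()` produces random (version 4) UUIDs; time-based UUIDs for time series need a separate generator, which the JDK does not provide.

```java
import java.util.UUID;

/** Application-side identifier generation (hypothetical helper, JDK only). */
final class AppSideIds {

    /** Returns a new random (version 4) UUID to use as a row key. */
    static UUID newRowId() {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID id = newRowId();
        // The value differs on every run; only the version is fixed.
        System.out.println(id + " (version " + id.version() + ")");
    }
}
```

Generating the key on the client means no coordination with the cluster is needed before an insert.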
  • 62. To go further
      • TTL
      • Counters
      • Static columns
      • Lightweight transactions (IF, IF NOT EXISTS)
  • 63. DataStax native CQL Java driver
  • 64. Main features
      • provides CQL3 access to C* using Java
      • uses the C* CQL native protocol
      • tunable policies (including consistency)
      • load balancing / reconnection / failover / routing of requests
      • prepared statements and batches
      • sync and async queries supported
      • query tracing supported (for debugging purposes)
      • drivers available for Python, C++ and C# as well (similar API)
  • 65. Driver modules
      • driver-core: the core layer
      • driver-examples: example applications using the other modules, meant for demonstration purposes only
  • 66. Maven dependency
      <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>2.0.3</version>
      </dependency>
  • 67. Optional dependencies for compression
      <dependency>
        <groupId>net.jpountz.lz4</groupId>
        <artifactId>lz4</artifactId>
        <version>1.2.0</version>
        <scope>runtime</scope>
      </dependency>
      <dependency>
        <groupId>org.xerial.snappy</groupId>
        <artifactId>snappy-java</artifactId>
        <version>1.0.5</version>
        <scope>runtime</scope>
      </dependency>
  • 68. Driver documentation
      • Docs: http://www.datastax.com/documentation/developer/java-driver/2.0/index.html
      • API: http://www.datastax.com/drivers/java/2.0
      • Jira: https://datastax-oss.atlassian.net/browse/JAVA
      • Mailing list: https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user
  • 69. Open Source
      • Apache v2 licence
      • https://github.com/datastax/java-driver
  • 71. Step 1: connection to the cluster
      Cluster.Builder clusterBuilder = Cluster.builder();
      // Connect to one (1) node
      clusterBuilder.addContactPoint("10.10.10.2");
      // Or connect to several nodes
      clusterBuilder.addContactPoints("10.10.10.2", "10.10.10.3");
      // Build the cluster
      Cluster cluster = clusterBuilder.build();
      // … do work with the cluster …
      // Shut down the cluster (in driver 2.0, close() replaces the 1.x shutdown())
      cluster.close();
  • 72. Step 2: connection to a keyspace
      // Create a session against the keyspace you want to interact with
      Session session = cluster.connect("HJUG");
      // Close the session (in driver 2.0, close() replaces the 1.x shutdown())
      session.close();
  • 73. Example 1: search queries and result set
      // TODO catch exceptions
      // Execute a query using the session
      ResultSet result = session.execute("SELECT * from USER;");
      // Option 1: iterate over the results
      Iterator<Row> iter = result.iterator();
      while (iter.hasNext()) {
          Row row = iter.next();
          log.info(String.format("Found user w/ username=%s", row.getString("username")));
      }
      // Option 2 (alternative to option 1, a ResultSet can only be consumed once):
      // get all rows and iterate
      List<Row> rows = result.all();
      for (Row row : rows) {
          log.info(String.format("Found user w/ username=%s", row.getString("username")));
      }
  • 74. Example 2: inserting data
      // TODO catch exceptions
      // INSERT a new user (TODO: escape parameters when used this way)
      // CQL text values must be wrapped in single quotes
      session.execute(String.format("INSERT INTO USER (username, location_id) VALUES ('%s', '%s');",
          "Jim", "Houston"));
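The "escape parameters" TODO above could be sketched as follows. `CqlLiterals`, `quote` and `insertUser` are hypothetical helpers, not part of the driver; they rely on the CQL rule that a single quote inside a string literal is escaped by doubling it. Prepared statements with bound parameters remain the safer route.

```java
final class CqlLiterals {
    // Hypothetical helper: wraps a value in single quotes and doubles embedded single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds the INSERT used in the example, with both values quoted and escaped.
    static String insertUser(String username, String locationId) {
        return String.format("INSERT INTO USER (username, location_id) VALUES (%s, %s);",
                quote(username), quote(locationId));
    }
}
```

The resulting string can then be passed to `session.execute(...)` as in the example.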
  • 75. Example 3: prepared statements
      // TODO catch exceptions
      // Create a prepared statement that can be reused throughout the application.
      // You only need to create it once
      PreparedStatement usersByLocationStatement = session.prepare(String.format(
          "SELECT * FROM %s WHERE %s = ?;", "USER", "location_id"));
      // Create bound statement and bind query parameters
      BoundStatement boundStatement = new BoundStatement(usersByLocationStatement);
      // You can override the default consistency level defined at the cluster level
      // on a per-query basis
      boundStatement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
      // Bind parameters
      boundStatement.bind("Houston");
      // Execute bound statement and get results
      ResultSet resultSet = session.execute(boundStatement);
  • 76. Example 4: Batch Statement
      // TODO catch exceptions
      // Create a batch statement
      // Type logged ensures atomicity
      BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED);
      // Batches only accept data modification statements (INSERT, UPDATE, DELETE).
      // insertUserStatement is assumed to be prepared elsewhere, e.g. from
      // "INSERT INTO USER (username, location_id) VALUES (?, ?);"
      BoundStatement boundStatement = new BoundStatement(insertUserStatement);
      boundStatement.bind("Jim", "Houston");
      // Add the bound statement to the batch
      batchStatement.add(boundStatement);
      // ... you can add several bound statements to the batch ...
      // Execute batch
      session.execute(batchStatement);
  • 77. Example 5: Synchronous vs Asynchronous
      // TODO catch exceptions
      // INSERT synchronously a new user (TODO: escape parameters when used this way)
      session.execute(String.format("INSERT INTO USER (username, location_id) VALUES ('%s', '%s');",
          "Jim", "Houston"));
      // INSERT asynchronously a new user (TODO: escape parameters when used this way)
      session.executeAsync(String.format("INSERT INTO USER (username, location_id) VALUES ('%s', '%s');",
          "Jim", "Houston"));
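The difference between the two calls can be illustrated without the driver. The sketch below is a stand-in only: it uses the JDK's `CompletableFuture` where the driver returns its own `ResultSetFuture`, and `AsyncSketch`, `runQuery` and `runQueryAsync` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncSketch {
    // Stand-in for a blocking call such as session.execute(...): returns when the work is done.
    static String runQuery(String cql) {
        return "applied: " + cql;
    }

    // Stand-in for session.executeAsync(...): returns a future right away,
    // the work completes on another thread.
    static CompletableFuture<String> runQueryAsync(String cql) {
        return CompletableFuture.supplyAsync(() -> runQuery(cql));
    }
}
```

The caller of the asynchronous variant is free to do other work and only blocks, if at all, when it asks the future for its result.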
  • 78. Example 6: batching result sets
      // We will get <limit> items at offset <x>
      // offset = x;
      // limit = y;
      // Create bound statement and bind query parameters
      BoundStatement boundStatement = new BoundStatement(usersByLocationStatement);
      boundStatement.setFetchSize(limit);
      boundStatement.bind("Houston");
      // Execute bound statement and get results
      ResultSet resultSet = session.execute(boundStatement);
      for (int i = 0; i < (offset / limit); i++) {
          // Fetch the number of pages needed
          resultSet.fetchMoreResults();
      }
      Iterator<Row> iter = resultSet.iterator();
      for (int i = 0; i < offset; i++) {
          // Throw away results from earlier pages
          if (iter.hasNext()) {
              iter.next();
          }
      }
      final List<Row> rows = new ArrayList<>();
      for (int i = 0; i < limit; i++) {
          // Keep results from desired page
          if (iter.hasNext()) {
              rows.add(iter.next());
          }
      }
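The skip-then-collect part of this example does not depend on the driver and can be isolated in a small generic helper. This is a sketch with hypothetical names (`Pages`, `slice`); it works on any `Iterator`, so a result set iterator could be passed in.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class Pages {
    // Skips `offset` rows, then collects at most `limit` rows from the iterator.
    static <T> List<T> slice(Iterator<T> rows, int offset, int limit) {
        for (int i = 0; i < offset && rows.hasNext(); i++) {
            rows.next(); // discard rows that belong to earlier pages
        }
        List<T> page = new ArrayList<>();
        while (page.size() < limit && rows.hasNext()) {
            page.add(rows.next());
        }
        return page;
    }
}
```

Skipped rows are still read from the cluster, so the cost of this approach grows with the offset.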
  • 79. DataStax CQL driver rule #1 Use one Cluster instance per (physical) cluster (per application lifetime)
  • 80. Cluster
      • handles queries, connections and their policies
      • share the Cluster instance at the application level
      • must be tuned according to the C* nodes / cluster configuration (timeouts, retries, etc.)
      • Consistency
  • 81. Example of a more complex Cluster setup
      // Initialize the cluster builder as in step 1.
      // You can customize policies before build()
      clusterBuilder
          .withQueryOptions(
              new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
          .withCompression(Compression.LZ4)
          .withSocketOptions(
              // Setting a value of 0 disables read timeouts: we let Cassandra time out
              // before the driver does.
              new SocketOptions().setConnectTimeoutMillis(1500)
                                 .setReadTimeoutMillis(0))
          .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("us-east"));
  • 82. DataStax CQL driver rule #2 Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries
  • 83. Session
      • API centered around query execution
      • manages per-node connection pools
      • avoid a large number of sessions: they have a major impact on server resources (C* side)
      • share the Session instance at the application level
      • one session per keyspace at most
      • with a large number of keyspaces: use a pre-defined number of sessions
  • 84. DataStax CQL driver rule #3 If you execute a statement more than once, consider using a PreparedStatement
  • 85. Prepared statements
      • prepare once, bind and execute multiple times
      • parsed and prepared on the Cassandra nodes
      • cache prepared statements at the application level
      • only the bound parameters and the statement id are sent to the nodes
      • performance gains are significant
      • design prepared statements so that bound parameters are rarely null
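Caching prepared statements at the application level can be sketched with a concurrent map keyed by the query string. `StatementCache` is a hypothetical class, kept generic over the statement type so it stays independent of the driver; the function passed in is assumed to do the actual preparation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class StatementCache<S> {
    private final ConcurrentMap<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> preparer;

    StatementCache(Function<String, S> preparer) {
        this.preparer = preparer;
    }

    // Prepares the statement on first use and returns the cached instance afterwards.
    S get(String cql) {
        return cache.computeIfAbsent(cql, preparer);
    }
}
```

With the driver, the cache would be created once next to the shared session, for instance as `new StatementCache<PreparedStatement>(session::prepare)`.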
  • 86. DataStax CQL driver rule #4 You can reduce the number of network roundtrips and also have atomic operations by using Batches
  • 87. Batch operations
      • single request
      • combines multiple data modification statements into a single logical operation
      • atomic operation: all statements pass or fail
      • batches and prepared statements can be combined
      • keep the batch size below the threshold set in the configuration file: batch_size_warn_threshold_in_kb (5 KB by default)
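One way to stay under the size threshold is to split a long list of statements into several smaller batches. The sketch below is an approximation under stated assumptions: `BatchChunker` is a hypothetical helper, and it uses the UTF-8 length of the statement text as a rough proxy for the size the server measures.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BatchChunker {
    // Splits statements into groups whose combined UTF-8 size stays within maxBytes.
    // A single statement larger than maxBytes still gets a group of its own.
    static List<List<String>> chunk(List<String> statements, int maxBytes) {
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String statement : statements) {
            int size = statement.getBytes(StandardCharsets.UTF_8).length;
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                chunks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(statement);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```

Each group would then go into its own batch. Atomicity only holds within a batch, so splitting gives up the all-or-nothing guarantee across groups.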
  • 88. Thanks!
      Slides available @ http://www.slideshare.net/anguenot/cassandra-cql-javahjug20140730
      @anguenot / ja@iland.com
      iland: http://www.iland.com
      We are hiring in Houston! https://www.linkedin.com/company/iland-internet-solutions/careers