CASSANDRA SUMMIT 2015
Mario Lazaro
September 24th 2015
#CassandraSummit2015
MULTI-REGION CASSANDRA IN AWS
WHOAMI
Mario Cerdan Lazaro
Big Data Engineer
Born and raised in Spain
Joined GumGum 18 months ago
About a year and a half of experience with Cassandra
#5 Ad Platform in the U.S.
10B impressions / month
2,000 brand-safe premium publisher partners
1B+ global unique visitors
Daily inventory impressions processed: 213M
Monthly image impressions processed: 2.6B
123 employees in seven offices
AGENDA
Old cluster
International Expansion
Challenges
Testing
Modus Operandi
Tips
Questions & Answers
OLD C* CLUSTER - MARCH 2015
25 Classic nodes cluster
1 Region / 1 Rack hosted in AWS EC2 US East
Version 2.0.8
Datastax CQL driver
GumGum's metadata including visitors, images, pages, and ad performance
Usage: realtime data access and analytics (MR jobs)
OLD C* CLUSTER - REALTIME USE CASE
Billions of rows
Heavy read workload: 60/40
TTLs everywhere - Tombstones
Heavy and critical use of counters
RTB - Read latency constraints (total execution time ~50 ms)
OLD C* CLUSTER - ANALYTICS USE CASE
Daily ETL jobs to extract / join data from C*
Hadoop MR jobs
AdHoc queries with Presto
INTERNATIONAL EXPANSION
FIRST STEPS
Start C* test datacenters in US East & EU West and test how C* multi-region works in AWS
Run capacity/performance tests. We expect 3x more traffic in 2015 Q4
FIRST THOUGHTS
Use AWS Virtual Private Cloud (VPC)
Cassandra & VPC present some connectivity challenges
Replicate entire data with same number of replicas
TOO GOOD TO BE TRUE ...
CHALLENGES
Problems between Cassandra in EC2 Classic / VPC and Datastax Java driver
Ec2MultiRegionSnitch uses public IPs. EC2 instances do not have an interface with a public IP address - Cannot connect between instances in the same region using public IPs
/**
 * Implementation of {@link AddressTranslater} used by the driver that
 * translate external IPs to internal IPs.
 * @author Mario <mario@gumgum.com>
 */
public class Ec2ClassicTranslater implements AddressTranslater {
    private static final Logger LOGGER = LoggerFactory.getLogger(Ec2Cla
    private ClusterService clusterService;
    private Cluster cluster;
    private List<Instance> publicDnss;

    @PostConstruct
    public void build() {
        publicDnss = clusterService.getInstances(cluster);
    }

    /**
     * Translates a Cassandra {@code rpc_address} to another address if
     * <p>
     *
     * @param address the address of a node as returned by Cassandra.
     * @return {@code address} translated IP address of the source.
     */
    public InetSocketAddress translate(InetSocketAddress address) {
        for (final Instance server : publicDnss) {
            if (server.getPublicIpAddress().equals(address.getHostStrin
                LOGGER.info("IP address: {} translated to {}", address.
                return new InetSocketAddress(server.getPrivateIpAddress
            }
        }
        return null;
    }

    public void setClusterService(ClusterService clusterService) {
        this.clusterService = clusterService;
    }

    public void setCluster(Cluster cluster) {
        this.cluster = cluster;
    }
}
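The translation idea above can be sketched without the driver on the classpath. The class below is a hypothetical, stand-alone illustration: it does not implement the DataStax AddressTranslater interface, and the class name, the register method, and the sample IPs are assumptions made for the example. It maps a node's public IP to its private IP and keeps the port. As a choice of this sketch only, an address with no known mapping is returned unchanged.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch of public-to-private IP translation. */
class PublicToPrivateTranslatorSketch {
    // Public IP (as text) -> private IP (as text), filled in by the caller.
    private final Map<String, String> privateByPublic = new HashMap<>();

    /** Records that the node reachable at publicIp is also reachable at privateIp. */
    void register(String publicIp, String privateIp) {
        privateByPublic.put(publicIp, privateIp);
    }

    /** Returns the private address for a known public one, keeping the port. */
    InetSocketAddress translate(InetSocketAddress address) {
        String privateIp = privateByPublic.get(address.getHostString());
        if (privateIp == null) {
            // Unknown node: this sketch falls back to the address as given.
            return address;
        }
        return new InetSocketAddress(privateIp, address.getPort());
    }
}
```

In a real client the mapping would come from a source such as the EC2 API, as clusterService.getInstances does in the slide's code.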
Region to Region connectivity will use public IPs - Trust those IPs or use software/hardware VPN
Your application needs to connect to C* using private IPs - Custom EC2 translator
Datastax Java Driver Load Balancing
Multiple choices
DCAware + TokenAware + ?
CHALLENGES
Zone Aware Connection:
"Clients in one AZ attempt to always communicate with C* nodes in the same AZ. We call this zone-aware connections. This feature is built into Astyanax, Netflix's C* Java client library."
Webapps in 3 different AZs: 1A, 1B, and 1C
C* datacenter spanning 3 AZs with 3 replicas
CHALLENGES
[Diagram: AZs 1A, 1B, and 1C]
We added it! - Rack/AZ awareness to TokenAware Policy
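A minimal sketch of what rack/AZ awareness adds on top of token awareness, assuming the replicas for a key are already known: replicas in the client's own AZ are tried first, the rest follow in their original order. The class, the Replica type, and the AZ labels are hypothetical. This is not the DataStax driver's policy API or GumGum's actual patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: prefer replicas in the client's own rack/AZ. */
class RackAwareOrderingSketch {
    /** A replica for some partition key, tagged with the rack (AZ) it runs in. */
    static final class Replica {
        final String address;
        final String rack;

        Replica(String address, String rack) {
            this.address = address;
            this.rack = rack;
        }
    }

    /** Returns the replicas with same-rack nodes first, keeping relative order. */
    static List<Replica> order(List<Replica> replicas, String localRack) {
        List<Replica> sameRack = new ArrayList<>();
        List<Replica> otherRacks = new ArrayList<>();
        for (Replica replica : replicas) {
            if (replica.rack.equals(localRack)) {
                sameRack.add(replica);
            } else {
                otherRacks.add(replica);
            }
        }
        sameRack.addAll(otherRacks);
        return sameRack;
    }
}
```

With 3 replicas spread over 3 AZs, a webapp in 1B would get the 1B replica first and fall back to the 1A and 1C replicas only if needed.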
CHALLENGES
Third Datacenter: Analytics
Do not impact realtime data access
Spark on top of Cassandra
Spark-Cassandra Datastax connector
Replicate specific keyspaces
Fewer nodes with larger disk space
Settings are different
Ex: Bloom filter false-positive chance
CHALLENGES
Third Datacenter: Analytics
Cassandra Only DC: Realtime
Cassandra + Spark DC: Analytics
CHALLENGES
Upgrade from 2.0.8 to 2.1.5
Counters implementation is buggy in pre-2.1 versions
"My code never has bugs. It just develops random unexpected features"
CHALLENGES
"To choose, or not to choose VNodes. That is the question." (M. Lazaro, 1990 - 2500)
Previous DC using Classic Nodes
Works with MR jobs
Complexity for adding/removing nodes
Manual management of token ranges
New DCs will use VNodes
Apache Spark + Spark Cassandra Datastax connector
Easy to add/remove nodes as traffic increases
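To illustrate what managing token ranges by hand involves on classic (single-token) nodes, here is a minimal sketch that computes evenly spaced initial tokens. It assumes the Murmur3Partitioner token range of -2^63 to 2^63 - 1, which the slides do not state for the old cluster. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: evenly spaced initial tokens for single-token nodes. */
class InitialTokenSketch {
    /** Token i is minToken + i * (2^64 / nodeCount), assuming a Murmur3-style ring. */
    static List<BigInteger> evenlySpacedTokens(int nodeCount) {
        BigInteger ringSize = BigInteger.ONE.shiftLeft(64);          // 2^64 tokens in total
        BigInteger minToken = BigInteger.ONE.shiftLeft(63).negate(); // -2^63
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            BigInteger offset = ringSize
                    .multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(nodeCount));
            tokens.add(minToken.add(offset));
        }
        return tokens;
    }
}
```

Adding or removing a node changes the even spacing, so existing tokens have to be recomputed and nodes moved. VNodes avoid that bookkeeping.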
TESTING
Testing requires creating and modifying many C* nodes
Creating and configuring a C* cluster is a time-consuming, repetitive task
Create a fully automated process for creating/modifying/destroying Cassandra clusters with Ansible
# Ansible settings for provisioning the EC2 instance
---
ec2_instance_type: r3.2xlarge
ec2_count:
  - 0 # How many in us-east-1a ?
  - 7 # How many in us-east-1b ?
ec2_vpc_subnet:
  - undefined
  - subnet-c51241b2
  - undefined
  - subnet-80f085d9
  - subnet-f9138cd2
ec2_sg:
  - va-ops
  - va-cassandra-realtime-private
TESTING - PERFORMANCE
Performance tests using new Cassandra 2.1 Stress Tool:
Recreate GumGum metadata / schemas
Recreate workload and make it 3 times bigger
Try to find limits / Saturate clients
# Keyspace Name
keyspace: stresscql
#keyspace_definition: |
#  CREATE KEYSPACE stresscql WITH replication = {'class': #'
### Column Distribution Specifications ###
columnspec:
  - name: visitor_id
    size: gaussian(32..32) #domain names are relatively
    population: uniform(1..999M) #10M possible domains to pic
  - name: bidder_code
    cluster: fixed(5)
  - name: bluekai_category_id
  - name: bidder_custom
    size: fixed(32)
  - name: bidder_id
    size: fixed(32)
  - name: bluekai_id
    size: fixed(32)
  - name: dt_pd
  - name: rt_exp_dt
  - name: rt_opt_out
### Batch Ratio Distribution Specifications ###
insert:
  partitions: fixed(1) # Our partition key is the v
  select: fixed(1)/5 # We have 5 bidder_code per dom
  batchtype: UNLOGGED # Unlogged batches
#
# A list of queries you wish to run against the schema
#
TESTING - PERFORMANCE
Main worry:
Latency and replication overseas
Use LOCAL_X consistency levels (e.g. LOCAL_ONE, LOCAL_QUORUM) in your client
For each write, the coordinator sends the mutation to only one C* node in each remote DC, which forwards it to the other replicas there
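The arithmetic behind preferring LOCAL_X levels can be sketched as follows. A quorum is floor(replicas / 2) + 1. QUORUM counts replicas across all DCs, so some acknowledgements must cross regions, while LOCAL_QUORUM counts only the local DC. The class name and the DC names used as map keys are hypothetical. The replication factors 3:3:1 are the ones the deck ends up with.

```java
import java.util.Map;

/** Hypothetical sketch: acknowledgements needed for QUORUM vs LOCAL_QUORUM. */
class QuorumMathSketch {
    /** Quorum size for a given number of replicas: floor(n / 2) + 1. */
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    /** QUORUM counts replicas in every datacenter of the keyspace. */
    static int globalQuorum(Map<String, Integer> rfByDc) {
        int total = 0;
        for (int rf : rfByDc.values()) {
            total += rf;
        }
        return quorum(total);
    }

    /** LOCAL_QUORUM counts only replicas in the coordinator's datacenter. */
    static int localQuorum(Map<String, Integer> rfByDc, String localDc) {
        return quorum(rfByDc.get(localDc));
    }
}
```

With RF 3:3:1, QUORUM needs 4 of 7 replicas, which one 3-replica DC cannot supply alone, so every such request waits on a remote DC. LOCAL_QUORUM needs only 2 of the 3 local replicas.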
TESTING - INSTANCE TYPE
Test all kinds of instance types. We decided to go with r3.2xlarge machines for our cluster:
60 GB RAM
8 Cores
160 GB ephemeral SSD storage for commit logs and saved caches
RAID 0 over 4 SSD EBS volumes for data
Performance / cost and the GumGum use case make r3.2xlarge the best option
Disclosure: the I2 instance family is the best if you can afford it
TESTING - UPGRADE
Upgrade C* Datacenter from 2.0.8 to 2.1.5
Both versions can cohabit in the same DC
New settings and features tried
DateTieredCompactionStrategy: Compaction for Time Series Data
Incremental repairs
Counters new architecture
MODUS OPERANDI
Sum up
From: One cluster / One DC in US East
To: One cluster / Two DCs in US East and one DC in EU West
MODUS OPERANDI
First step:
Upgrade old cluster snitch from Ec2Snitch to Ec2MultiRegionSnitch
Upgrade clients to handle it (aka translators)
Make sure your clients do not lose connection to upgraded C* nodes (DataStax JIRA: JAVA-809)
MODUS OPERANDI
Second step:
Upgrade old datacenter from 2.0.8 to 2.1.5
nodetool upgradesstables (multiple nodes at a time)
Not possible to rebuild (nodetool rebuild) a 2.1.X C* node from a 2.0.X C* datacenter:
WARN [Thread-12683] 2015-06-17 10:17:22,845 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=XXX
MODUS OPERANDI
Third step:
Start EU West and new US East DCs within the same cluster
Replication factor in new DCs: 0
Use dc_suffix to differentiate the new Virginia DC from the old one
Clients do not talk to new DCs. Only C* knows they exist
Set replication factor to 3 in all DCs except analytics, which gets 1
New DCs start receiving new data
nodetool rebuild <old-datacenter>
New DCs stream the old data
MODUS OPERANDIMODUS OPERANDI
RF 3
Clients
RF 3:0:0:0RF 3:3:3:1
US East Realtime
EU West Realtime
US East Analytics
Rebuild
Rebuild
Rebuild
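The replication-factor changes in this step are ordinary ALTER KEYSPACE statements using NetworkTopologyStrategy. A minimal sketch of building one from a DC-to-RF map is shown below. The class and method names are hypothetical, and callers would pass their own keyspace and datacenter names (the deck only shows the new US DC names, us-east-va-realtime and us-east-va-analytics).

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: build the CQL that changes per-DC replication factors. */
class ReplicationCqlSketch {
    /** Renders ALTER KEYSPACE ... WITH replication = {...} for the given DC -> RF map. */
    static String alterKeyspace(String keyspace, Map<String, Integer> rfByDc) {
        StringJoiner options = new StringJoiner(", ");
        options.add("'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> entry : rfByDc.entrySet()) {
            options.add("'" + entry.getKey() + "': " + entry.getValue());
        }
        return "ALTER KEYSPACE " + keyspace + " WITH replication = {" + options + "};";
    }
}
```

Raising the replication factor makes the new DCs receive new writes only. Existing data arrives when nodetool rebuild is run on each new node against the old datacenter.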
From 39d8f76d9cae11b4db405f5a002e2a4f6f764b1d Mon Sep 17 00:00:00 2001
From: mario <mario@gumgum.com>
Date: Wed, 17 Jun 2015 14:21:32 -0700
Subject: [PATCH] AT-3576 Start using new Cassandra realtime cluster
---
src/main/java/com/gumgum/cassandra/Client.java | 30 ++++------------------
.../com/gumgum/cassandra/Ec2ClassicTranslater.java | 30 ++++++++++++++--------
src/main/java/com/gumgum/cluster/Cluster.java | 3 ++-
.../resources/applicationContext-cassandra.xml | 13 ++++------
src/main/resources/dev.properties | 2 +-
src/main/resources/eu-west-1.prod.properties | 3 +++
src/main/resources/prod.properties | 3 +--
src/main/resources/us-east-1.prod.properties | 3 +++
.../CassandraAdPerformanceDaoImplTest.java | 2 --
.../asset/cassandra/CassandraImageDaoImplTest.java | 2 --
.../CassandraExactDuplicatesDaoTest.java | 2 --
.../com/gumgum/page/CassandraPageDoaImplTest.java | 2 --
.../cassandra/CassandraVisitorDaoImplTest.java | 2 --
13 files changed, 39 insertions(+), 58 deletions(-)
MODUS OPERANDI
Start using new Cassandra DCs
RF 3:3:3:1
MODUS OPERANDI
[Diagram: Clients, US East Realtime, EU West Realtime, US East Analytics. RF 0:3:3:1, then the old DC is decommissioned: RF 3:3:1]
TIPS
TIPS - AUTOMATED MAINTENANCE
Maintenance in a multi-region C* cluster:
Ansible + Cassandra maintenance keyspace + email report = zero human intervention!
CREATE TABLE maintenance.history (
    dc text,
    op text,
    ts timestamp,
    ip text,
    PRIMARY KEY ((dc, op), ts)
) WITH CLUSTERING ORDER BY (ts ASC) AND
    bloom_filter_fp_chance=0.010000 AND
    caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND
    comment='' AND
    dclocal_read_repair_chance=0.100000 AND
    gc_grace_seconds=864000 AND
    read_repair_chance=0.000000 AND
    compaction={'class': 'SizeTieredCompactionStrategy'} AND
    compression={'sstable_compression': 'LZ4Compressor'};
CREATE INDEX history_kscf_idx ON maintenance.history (kscf);
3-133-65:/opt/scripts/production/groovy$ groovy CassandraMaintenanceCheck.groovy -dc us-east-va-realtime -op compaction -e mari
TIPS - SPARK
Run more Spark workers than there are C* nodes in the analytics DC
Each worker uses:
1/4 of the cores of each instance
1/3 of the total available RAM of each instance
Cassandra-Spark connector
spanBy
.joinWithCassandraTable(:x, :y)
spark.cassandra.output.batch.size.bytes
spark.cassandra.output.concurrent.writes
import org.apache.spark.SparkConf

// cassandraNodes: comma-separated C* contact points, defined elsewhere
val conf = new SparkConf()
  .set("spark.cassandra.connection.host", cassandraNodes)
  .set("spark.cassandra.connection.local_dc", "us-east-va-analytics")
  .set("spark.cassandra.connection.factory", "com.gumgum.spark.bluekai.DirectLinkConnectionFactory")
  .set("spark.driver.memory", "4g")
  .setAppName("Cassandra presidential candidates app")
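The worker sizing rule above (1/4 of the cores, 1/3 of the RAM per instance) is simple arithmetic. The sketch below applies it to an instance with 8 cores and 60 GB of RAM, the r3.2xlarge figures quoted earlier in the deck. Whether the analytics DC runs on that instance type is an assumption, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of the per-worker sizing rule from the slide. */
class SparkWorkerSizingSketch {
    /** A worker gets a quarter of the instance's cores, but never fewer than one. */
    static int workerCores(int instanceCores) {
        return Math.max(1, instanceCores / 4);
    }

    /** A worker gets a third of the instance's RAM, rounded down to whole GB. */
    static int workerMemoryGb(int instanceRamGb) {
        return instanceRamGb / 3;
    }
}
```

For 8 cores and 60 GB this gives 2 cores and 20 GB per worker, leaving the rest of the machine to Cassandra and the operating system.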
TIPS - SPARK
Create a "translator" if using Ec2MultiRegionSnitch
spark.cassandra.connection.factory
SINCE C* IN EU WEST ...
US West Datacenter!
EU West DC, US East DC, Analytics DC, US West DC
Q&A
GumGum is hiring!
http://gumgum.com/careers

GumGum: Multi-Region Cassandra in AWS

  • 1. CASSANDRA SUMMIT 2015: MULTI-REGION CASSANDRA IN AWS
       Mario Lazaro, September 24th 2015, #CassandraSummit2015
  • 2. WHOAMI
       - Mario Cerdan Lazaro, Big Data Engineer
       - Born and raised in Spain
       - Joined GumGum 18 months ago
       - About a year and a half of experience with Cassandra
  • 3. #5 Ad Platform in the U.S.; 10B impressions / month; 2,000 brand-safe premium publisher partners; 1B+ global unique visitors; daily inventory impressions processed: 213M; monthly image impressions processed: 2.6B; 123 employees in seven offices
  • 5. OLD C* CLUSTER - MARCH 2015
       - 25 Classic nodes cluster
       - 1 Region / 1 Rack hosted in AWS EC2 US East
       - Version 2.0.8
       - Datastax CQL driver
       - GumGum's metadata including visitors, images, pages, and ad performance
       - Usage: realtime data access and analytics (MR jobs)
  • 6. OLD C* CLUSTER - REALTIME USE CASE
       - Billions of rows
       - Heavy read workload (60/40)
       - TTLs everywhere - Tombstones
       - Heavy and critical use of counters
       - RTB - Read Latency constraints (total execution time ~50 ms)
  • 7. OLD C* CLUSTER - ANALYTICS USE CASE
       - Daily ETL jobs to extract / join data from C*
       - Hadoop MR jobs
       - AdHoc queries with Presto
  • 9. FIRST STEPS
       - Start C* test datacenters in US East & EU West and test how C* multi-region works in AWS
       - Run capacity/performance tests. We expect 3x more traffic in 2015 Q4
       FIRST THOUGHTS
       - Use AWS Virtual Private Cloud (VPC)
       - Cassandra & VPC present some connectivity challenges
       - Replicate entire data with same number of replicas
  • 10. TOO GOOD TO BE TRUE ...
  • 11. CHALLENGES
       Problems between Cassandra in EC2 Classic / VPC and Datastax Java driver:
       - EC2MultiRegionSnitch uses public IPs. EC2 instances do not have an interface with public IP address - Cannot connect between instances in the same region using Public IPs
       - Region to Region connectivity will use public IPs - Trust those IPs or use software/hardware VPN
       - Your application needs to connect to C* using private IPs - Custom EC2 translator

       /**
        * Implementation of {@link AddressTranslater} used by the driver that
        * translate external IPs to internal IPs.
        * @author Mario <mario@gumgum.com>
        */
       public class Ec2ClassicTranslater implements AddressTranslater {
           private static final Logger LOGGER = LoggerFactory.getLogger(Ec2Cla
           private ClusterService clusterService;
           private Cluster cluster;
           private List<Instance> publicDnss;

           @PostConstruct
           public void build() {
               publicDnss = clusterService.getInstances(cluster);
           }

           /**
            * Translates a Cassandra {@code rpc_address} to another address if
            * <p>
            *
            * @param address the address of a node as returned by Cassandra.
            * @return {@code address} translated IP address of the source.
            */
           public InetSocketAddress translate(InetSocketAddress address) {
               for (final Instance server : publicDnss) {
                   if (server.getPublicIpAddress().equals(address.getHostStrin
                       LOGGER.info("IP address: {} translated to {}", address.
                       return new InetSocketAddress(server.getPrivateIpAddress
                   }
               }
               return null;
           }

           public void setClusterService(ClusterService clusterService) {
               this.clusterService = clusterService;
           }

           public void setCluster(Cluster cluster) {
               this.cluster = cluster;
           }
       }
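The translator idea on this slide can be sketched without the driver or the AWS SDK. The class below is a hypothetical stand-in, not the DataStax interface and not GumGum's code: it assumes the public-to-private IP mapping has already been loaded into a map (the slide gets it from a `ClusterService`), and, unlike the slide's version, it returns the input address instead of `null` when no mapping exists.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: maps a node's public IP to its private IP, keeping the port. */
final class PublicToPrivateTranslator {
    private final Map<String, String> privateByPublic;

    PublicToPrivateTranslator(Map<String, String> privateByPublic) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.privateByPublic = new HashMap<>(privateByPublic);
    }

    /** Returns the private address for a known public IP, or the input unchanged. */
    InetSocketAddress translate(InetSocketAddress address) {
        String privateIp = privateByPublic.get(address.getHostString());
        if (privateIp == null) {
            // Unknown node (for example, one in another region): keep the public IP.
            return address;
        }
        return new InetSocketAddress(privateIp, address.getPort());
    }
}
```

In a real client this logic would sit behind the driver's address translation hook, as the slide's `Ec2ClassicTranslater` does; the IPs in any mapping you pass in are your own.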
  • 12. CHALLENGES
       Datastax Java Driver Load Balancing
       - Multiple choices
       - DCAware + TokenAware + ?
       “Clients in one AZ attempt to always communicate with C* nodes in the same AZ. We call this zone-aware connections. This feature is built into Astyanax, Netflix’s C* Java client library.”
  • 13. CHALLENGES
       Zone Aware Connection:
       - Webapps in 3 different AZs: 1A, 1B, and 1C
       - C* datacenter spanning 3 AZs with 3 replicas
       (Diagram: AZs 1A, 1B, and 1C)
  • 14. CHALLENGES
       We added it! - Rack/AZ awareness to TokenAware Policy
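The slides do not show how rack/AZ awareness was added to the TokenAware policy. As a minimal sketch of the idea only, assuming each replica is known together with its rack (AZ), a client can try replicas in its own AZ first and fall back to the rest. `Replica` and `order` are hypothetical names, not driver API.

```java
import java.util.ArrayList;
import java.util.List;

final class ZoneAwareOrdering {
    /** Hypothetical replica descriptor: node address plus the rack (AZ) it runs in. */
    static final class Replica {
        final String address;
        final String rack;

        Replica(String address, String rack) {
            this.address = address;
            this.rack = rack;
        }
    }

    /** Same-AZ replicas first, then the remaining replicas, preserving input order. */
    static List<Replica> order(List<Replica> replicas, String clientRack) {
        List<Replica> sameRack = new ArrayList<>();
        List<Replica> otherRacks = new ArrayList<>();
        for (Replica r : replicas) {
            (r.rack.equals(clientRack) ? sameRack : otherRacks).add(r);
        }
        sameRack.addAll(otherRacks);
        return sameRack;
    }
}
```

With 3 replicas spread over AZs 1A, 1B, and 1C, a webapp in 1B would contact the 1B replica first and only use the other two if it is unavailable.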
  • 15. CHALLENGES
       Third Datacenter: Analytics
       - Do not impact realtime data access
       - Spark on top of Cassandra
       - Spark-Cassandra Datastax connector
       - Replicate specific keyspaces
       - Fewer nodes with larger disk space
       - Settings are different. Ex: Bloom filter chance
  • 16. CHALLENGES
       Third Datacenter: Analytics
       - Realtime: Cassandra-only DC
       - Analytics: Cassandra + Spark DC
  • 17. CHALLENGES
       - Upgrade from 2.0.8 to 2.1.5
       - Counters implementation is buggy in pre-2.1 versions
       “My code never has bugs. It just develops random unexpected features”
  • 18. CHALLENGES
       “To choose, or not to choose VNodes. That is the question.” (M. Lazaro, 1990 - 2500)
       Previous DC using Classic Nodes
       - Works with MR jobs
       - Complexity for adding/removing nodes
       - Token ranges are managed manually
       New DCs will use VNodes
       - Apache Spark + Spark Cassandra Datastax connector
       - Easy to add/remove new nodes as traffic increases
  • 20. TESTING
       - Testing requires creating and modifying many C* nodes
       - Creating and configuring a C* cluster is a time-consuming / repetitive task
       - Create fully automated process for creating/modifying/destroying Cassandra clusters with Ansible

       # Ansible settings for provisioning the EC2 instance
       ---
       ec2_instance_type: r3.2xlarge

       ec2_count:
         - 0 # How many in us-east-1a ?
         - 7 # How many in us-east-1b ?

       ec2_vpc_subnet:
         - undefined
         - subnet-c51241b2
         - undefined
         - subnet-80f085d9
         - subnet-f9138cd2

       ec2_sg:
         - va-ops
         - va-cassandra-realtime-private
  • 21. TESTING - PERFORMANCE
       Performance tests using new Cassandra 2.1 Stress Tool:
       - Recreate GumGum metadata / schemas
       - Recreate workload and make it 3 times bigger
       - Try to find limits / Saturate clients

       # Keyspace Name
       keyspace: stresscql
       #keyspace_definition: |
       #  CREATE KEYSPACE stresscql WITH replication = {'class':
       #'
       ### Column Distribution Specifications ###
       columnspec:
         - name: visitor_id
           size: gaussian(32..32)        #domain names are relatively
           population: uniform(1..999M)  #10M possible domains to pic
         - name: bidder_code
           cluster: fixed(5)
         - name: bluekai_category_id
         - name: bidder_custom
           size: fixed(32)
         - name: bidder_id
           size: fixed(32)
         - name: bluekai_id
           size: fixed(32)
         - name: dt_pd
         - name: rt_exp_dt
         - name: rt_opt_out
       ### Batch Ratio Distribution Specifications ###
       insert:
         partitions: fixed(1)   # Our partition key is the v
         select: fixed(1)/5     # We have 5 bidder_code per dom
         batchtype: UNLOGGED    # Unlogged batches
       #
       # A list of queries you wish to run against the schema
       #
  • 22. TESTING - PERFORMANCE
       Main worry: Latency and replication overseas
       - Use LOCAL_X consistency levels in your client
       - For each write, only one C* node (the coordinator) contacts only one C* node in each remote DC to send replicas/mutations
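A small worked example shows why LOCAL_X levels matter for overseas latency. Cassandra computes a quorum as floor(RF / 2) + 1. The sketch below applies that formula, assuming the RF 3:3:1 layout shown later in the deck; the class and method names are illustrative.

```java
final class LocalQuorum {
    /** Number of replicas that must respond for a quorum: floor(rf / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }
}
```

With RF 3 in the local DC, LOCAL_QUORUM needs `quorum(3)` = 2 replicas, all in the client's own region. A plain QUORUM over RF 3:3:1 counts all 7 replicas and needs `quorum(7)` = 4, which one region holding 3 replicas cannot satisfy alone, so every such request would wait on a cross-region round trip.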
  • 23. TESTING - PERFORMANCE
       Main worries: Latency
  • 24. TESTING - INSTANCE TYPE
       - Test all kinds of instance types. We decided to go with r3.2xlarge machines for our cluster:
         - 60 GB RAM
         - 8 Cores
         - 160GB Ephemeral SSD Storage for commit logs and saved caches
         - RAID 0 over 4 SSD EBS Volumes for data
       - Performance / Cost and the GumGum use case make r3.2xlarge the best option
       - Disclosure: I2 instance family is the best if you can afford it
  • 25. TESTING - UPGRADE
       - Upgrade C* Datacenter from 2.0.8 to 2.1.5
       - Both versions can cohabit in the same DC
       - New settings and features tried:
         - DateTieredCompactionStrategy: Compaction for Time Series Data
         - Incremental repairs
         - Counters new architecture
  • 27. MODUS OPERANDI
       Sum up
       - From: One cluster / One DC in US East
       - To: One cluster / Two DCs in US East and one DC in EU West
  • 28. MODUS OPERANDI
       First step:
       - Upgrade old cluster snitch from EC2Snitch to EC2MultiRegionSnitch
       - Upgrade clients to handle it (aka translators)
       - Make sure your clients do not lose connection to upgraded C* nodes (DataStax JIRA: JAVA-809)
  • 29. MODUS OPERANDI
       Second step:
       - Upgrade old datacenter from 2.0.8 to 2.1.5
       - nodetool upgradesstables (multiple nodes at a time)
       - Not possible to rebuild a 2.1.X C* node from a 2.0.X C* datacenter. rebuild:
         WARN [Thread-12683] 2015-06-17 10:17:22,845 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
         org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=XXX
  • 30. MODUS OPERANDI
       Third step:
       - Start EU West and new US East DCs within the same cluster
         - Replication factor in new DCs: 0
         - Use dc_suffix to differentiate new Virginia DC from old one
         - Clients do not talk to new DCs. Only C* knows they exist
       - Set replication factor to 3 on all new DCs except analytics (1)
         - New DCs start receiving new data
       - Run nodetool rebuild <old-datacenter> on the new nodes
         - Brings in the old data
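The replication change in this step is a keyspace-level `ALTER KEYSPACE` with `NetworkTopologyStrategy`. As a hedged sketch, the helper below only builds that CQL string from a datacenter-to-RF map. The helper name, the keyspace name, and the DC names used with it are placeholders; the deck does not list the real ones, apart from `us-east-va-realtime` and `us-east-va-analytics` appearing in later slides.

```java
import java.util.Map;
import java.util.StringJoiner;

final class ReplicationCql {
    /** Builds an ALTER KEYSPACE statement; iteration order of the map is kept. */
    static String alterKeyspace(String keyspace, Map<String, Integer> rfByDc) {
        StringJoiner options = new StringJoiner(", ");
        options.add("'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> e : rfByDc.entrySet()) {
            options.add("'" + e.getKey() + "': " + e.getValue());
        }
        return "ALTER KEYSPACE " + keyspace + " WITH replication = {" + options + "};";
    }
}
```

Passing a `LinkedHashMap` with the old DC at 3, the two realtime DCs at 3, and the analytics DC at 1 yields the RF 3:3:3:1 state from the deck. Changing the replication settings does not move existing data, which is why the step continues with `nodetool rebuild <old-datacenter>` on the new nodes.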
  • 31. MODUS OPERANDI
       (Diagram: Clients talk to the old DC, RF 3. Replication goes from RF 3:0:0:0 to RF 3:3:3:1. US East Realtime, EU West Realtime, and US East Analytics each run a rebuild.)
  • 32. MODUS OPERANDI
       Start using new Cassandra DCs

       From 39d8f76d9cae11b4db405f5a002e2a4f6f764b1d Mon Sep 17 00:00:00 2001
       From: mario <mario@gumgum.com>
       Date: Wed, 17 Jun 2015 14:21:32 -0700
       Subject: [PATCH] AT-3576 Start using new Cassandra realtime cluster
       ---
        src/main/java/com/gumgum/cassandra/Client.java | 30 ++++------------------
        .../com/gumgum/cassandra/Ec2ClassicTranslater.java | 30 ++++++++++++++--------
        src/main/java/com/gumgum/cluster/Cluster.java | 3 ++-
        .../resources/applicationContext-cassandra.xml | 13 ++++------
        src/main/resources/dev.properties | 2 +-
        src/main/resources/eu-west-1.prod.properties | 3 +++
        src/main/resources/prod.properties | 3 +--
        src/main/resources/us-east-1.prod.properties | 3 +++
        .../CassandraAdPerformanceDaoImplTest.java | 2 --
        .../asset/cassandra/CassandraImageDaoImplTest.java | 2 --
        .../CassandraExactDuplicatesDaoTest.java | 2 --
        .../com/gumgum/page/CassandraPageDoaImplTest.java | 2 --
        .../cassandra/CassandraVisitorDaoImplTest.java | 2 --
        13 files changed, 39 insertions(+), 58 deletions(-)
  • 33. MODUS OPERANDI
       (Diagram: Clients; US East Realtime, EU West Realtime, US East Analytics. Replication goes from RF 3:3:3:1 to RF 0:3:3:1.)
  • 34. MODUS OPERANDI
       (Diagram: Clients; US East Realtime, EU West Realtime, US East Analytics. Replication goes from RF 0:3:3:1 to RF 3:3:1. Decommission.)
  • 36. TIPS - AUTOMATED MAINTENANCE
       Maintenance in a multi-region C* cluster:
       Ansible + Cassandra maintenance keyspace + email report = zero human intervention!

       CREATE TABLE maintenance.history (
         dc text,
         op text,
         ts timestamp,
         ip text,
         PRIMARY KEY ((dc, op), ts)
       ) WITH CLUSTERING ORDER BY (ts ASC) AND
         bloom_filter_fp_chance=0.010000 AND
         caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND
         comment='' AND
         dclocal_read_repair_chance=0.100000 AND
         gc_grace_seconds=864000 AND
         read_repair_chance=0.000000 AND
         compaction={'class': 'SizeTieredCompactionStrategy'} AND
         compression={'sstable_compression': 'LZ4Compressor'};

       CREATE INDEX history_kscf_idx ON maintenance.history (kscf);

       3-133-65:/opt/scripts/production/groovy$ groovy CassandraMaintenanceCheck.groovy -dc us-east-va-realtime -op compaction -e mari
  • 37. TIPS - SPARK
       - Number of workers above number of total C* nodes in analytics
       - Each worker uses:
         - 1/4 number of cores of each instance
         - 1/3 total available RAM of each instance
       - Cassandra-Spark connector
         - spanBy
         - .joinWithCassandraTable(:x, :y)
         - spark.cassandra.output.batch.size.bytes
         - spark.cassandra.output.concurrent.writes
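The worker sizing rule above is simple arithmetic. As an illustration, assuming an instance with the 8 cores and 60 GB RAM quoted earlier for r3.2xlarge (the deck does not say which instance type the analytics DC runs on), each Spark worker would get 2 cores and 20 GB. The class and method names are illustrative.

```java
final class SparkWorkerSizing {
    /** 1/4 of the instance's cores, but never less than one core. */
    static int workerCores(int instanceCores) {
        return Math.max(1, instanceCores / 4);
    }

    /** 1/3 of the instance's RAM in GB, rounded down. */
    static int workerMemoryGb(int instanceRamGb) {
        return instanceRamGb / 3;
    }
}
```

The remaining cores and memory stay available to the Cassandra process and the OS page cache on the same node.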
  • 38. TIPS - SPARK
       Create "translator" if using EC2MultiRegionSnitch: spark.cassandra.connection.factory

       val conf = new SparkConf()
         .set("spark.cassandra.connection.host", cassandraNodes)
         .set("spark.cassandra.connection.local_dc", "us-east-va-analytics")
         .set("spark.cassandra.connection.factory", "com.gumgum.spark.bluekai.DirectLinkConnectionFactory")
         .set("spark.driver.memory","4g")
         .setAppName("Cassandra presidential candidates app")
  • 39. SINCE C* IN EU WEST ...
       US West Datacenter!
       (Diagram: EU West DC, US East DC, Analytics DC, US West DC)