SlideShare une entreprise Scribd logo
1  sur  32
Tales From The Front:
Dan Flavin
DataStax
2015-12-08
An Architecture For Scalable Multi-Data
Center Applications for Things That Count
Using Cassandra
This Session Says What?
The Example Use Case is an Inventory Access Pattern Using
Cassandra
Same Patterns Apply To Other Distributed, Highly Current
Applications
What We're Going To Cover in 30 Minutes
• Quick review of Cassandra functionality that applies to this use case
• Common Data Center and application architectures for highly available
inventory applications, and why the were designed that way
• Cassandra implementations vis-a-vis infrastructure capabilities
– The impedance mismatch: compromises made to fit into IT infrastructures
designed and implemented with an old mindset*.
2
* just because an infrastructre made sense for the software architectures a decade ago doesn't mean it's still a good idea
Things That Count
How Did We Get HERE?
More Users, More Problems
A Common Use - Inventory
Scaling and Availability
• We all want applications and
services that are scalable and highly
available
• Scaling our app tier is usually pretty
painless, especially with cloud
infrastructure
– App tier tends to be stateless
4
while accessing a single stateful
database infrastructure
Ways We Scale our Relational Databases
SELECT
array_agg(players),
player_teams
FROM (
SELECT DISTINCT
t1.t1player AS players,
t1.player_teams
FROM (
SELECT
p.playerid AS t1id,
concat(p.playerid, ':', p.playername, ' ') AS t1player,
array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams
FROM player p
LEFT JOIN plays pl
ON p.playerid = pl.playerid
GROUP BY p.playerid, p.playername
) t1
INNER JOIN (
SELECT
p.playerid AS t2id,
array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams
FROM player p
LEFT JOIN plays pl
ON p.playerid = pl.playerid
GROUP BY p.playerid, p.playername
) t2
ON t1.player_teams = t2.player_teams
AND t1.t1id <> t2.t2id
) innerQuery
Scaling Up
SELECT * FROM denormalized_view
Denormalization
5
Ways we Scale our Relational Databases
Client
Users Data
Primary
Replica 1
Replica 2
Failover
Process
Monitor
Failover
Write RequestsRead Requests
Replication Lag
Router
A-F G-M N-T U-Z Replication Lag
Sharding and Replication (and probably Denormalization*)
6
Monitor the
Failover
Monitor
* a.k.a. Demoralization – you lose your technology moral ground
when you mix sharding, normalization / denormalization / failure
survivability / flexibility and humanity
Distributed Systems ≠Sharded or Bolt-on Replication
Systems
7
Scalability
 Distributed Systems, Distributed Data, all distributed amongst
"nodes" (a.k.a. computers)
 No node is critical path or chokepoint
Availability
 Requests will always be responded to
Partition Tolerance
 Component failure(s) do not bring whole system
down
Consistency
 All nodes see the data at the same time
Nodes are units
of performance and
availability (a.k.a.
incremental scalability)
The CAP Theorem describes
distributed systems choices
Consistency
Availability
Partition Tolerance
Cassandra values AP where
the balance between
consistency and latency is
tunable.
What is Cassandra?
• A Linearly Scaling and Fault Tolerant Distributed Database
• Fully Distributed
– Data spread over many nodes
– All nodes participate in a cluster
– All nodes are equal
– No SPOF (shared nothing)
– Run on commodity hardware
8
What is Cassandra?
Linearly Scaling
– Have More Data? Add more nodes.
– Need More Throughput? Add more nodes.
Fault Tolerant
– Nodes Down != Database Down
– Datacenter Down != Database Down
9
What is Cassandra?
• Cassandra Architecture Applicability
– Multiple Data Center "Active-Active replication"
– One Data Center goes offline, the others can seamlessly take over
– Offline DC back online becomes immediately available for workload,
Cassandra automatically rationalizes data across DC's
10
• Fully Replicated
• Clients write local DC
• Data syncs to other DC
• Replication Factor per DC
• "Success" controlled by client app
(i.e. local only, local and remote
individually, local and remote
combined...)
Data Center 1 Data Center 2
Client
Replication Factor
- How many nodes get the same data
Node 1
A
CD
Write Data Identified by a Primary Key A
• Each node will get a range of keys identified by a
hash of a table's Primary Key
• On insert key A is hashed to determine nodes data is
written to
Replication Factor=3
11
How? Replication Factor (server-side) Schema
Configuration
 Implements setting how many copies of the data
on different nodes should exist for CAP
• also good for Cassandra performance of some operations
(different topic)
Key To CAP System Implementation:
 Data is Spread Across Many Nodes
Client
Node 2
B
AD
Node 3
C
AB
Node 4
D
BC
CQL: insert A into table... hash()
Write to primary node 1, and
simultaneously to nodes 2 & 3 (RF = 3)
Consistency Level (CL) and Speed
• How many replicas we need to hear from can affect how quickly
we can read and write data in Cassandra
Client
B
AD
C
AB
A
CD
D
BC
5 µs ack
300 µs ack
12 µs ack
12 µs ack
Read A
(CL=QUORUM)
12
• Same concept applies across
Data Centers
Consistency Level and Data Consistency
• Replicas may disagree
– Column values are timestamped
– In Cassandra, Last Write Wins (LWW)
• Often use Lightweight Transactions
– Allows for serialization / test-and-set
Client
B
AD
C
AB
A
CD
D
BC
A=2
Newer
A=1
Older
A=2
Read A
(CL=QUORUM)
A good primer on consistency levels w/o LWT's: Christos from Netflix: “Eventual Consistency != Hopeful Consistency”
https://www.youtube.com/watch?v=lwIA8tsDXXE
13
Consistency Level and Availability
• Consistency Level choice affects availability
• For example, 1 Data Center with application read
Consistency Level (CL) at QUORUM can tolerate one
replica being down and still be available (in RF=3)
Client
B
AD
C
AB
A
CD
D
BC
A=2
A=2
A=2
Read A
(CL=QUORUM)
14
• Same concept applies across Data
Centers
Consistency Levels (CL's)
• Applies to both Reads and Writes (i.e. is set on each query)
• Controlled at the Client API Level (% controlled by RF). Examples include:
– ONE – one replica from any DC
– LOCAL_ONE – one replica from local DC
– QUORUM – 51% of replicas from any DC
– LOCAL_QUORUM – 51% of replicas from local DC
– EACH_QUORUM – 51% of replicas from multiple DC's
– ALL – all replicas
– TWO...
• The client code has, and can control a hierarchy of automatic failure levels
– i.e. one Data Center down with QUORUM, use LOCAL_QUORUM
• Cassandra will keep things consistent when the failed Data Center is back online
• Will there potentially be a speed cost? Of course there might be, but your database never went down now did it?
– CL fallback patterns are usually set once when connection is made in the application using Cassandra
15
Other Constructs
• BATCH – All statements succeed or fail
• Light Weight Transactions (LWT) – Test and Set
Consistency Level Choices
- Reacting when machines, data centers, networks, etc. go down (if you notice)
Primary Key and Replication Factor
16
"Write Identified by Primary Key A" means what?
 What "Primay Key A" looks like in CQL
Replication Factor
(per DC)
Review of Replication Factor (RF) and Consistency Level (CL)
What matters for this discussion
• The KEY is the thing that is an enabler for distributed systems & CAP
implementation
– Along with Replication Factor determines what nodes (nodes can live
in different Data Centers) data will be written to and read from.
– Consistency Level controlled in application for the scenario that makes
sense. A simple app may never notice node failures.
• Unless... ... something in the infrastructure breaks
– Then the application gets to decide what to do
» Ha! Like you even have those options with other
technologies
» Is another topic completely out of scope here
17
Back to the Topic: Inventory Supply and Demand
• Cassandra Architecture Applicability
– Linear scalability / de-scalability to handle heavy workload periods
– Predictable performance
– No Single Point of Failure
• Node(s)
• Data Center(s)
– Flexible Deployment Options
• On-Premise
• Cloud
• Hybrid Cloud / On-Premise
– Looks like one big old database to the client with visibility into it's
working parts when needed
18
Inventory Supply and Demand
- An Example Order Management System Implementation Overview
Logical
19Images from: http://www.ibm.com/developerworks/library/co-promising-inventory/
Physical
Natural spot for Cassandra is
[ Inventory = (Reservation <-> Order) – Supply ]
Database
App layer can
scale
Why?
Traditional RDBMS –
Not So Much
CL's, RF's, and LWT's. Oh My!
• Consistency Levels
– QUORUM (LOCAL_QUORUM)
– ONE (LOCAL_ONE)
• Reduce or Remove
– BATCH statements
– Lightweight Transactions (LWT)
Other CL's, LWT's, and BATCH are not bad...
– They put more work on the Cassandra cluster(s)
– Potentially add more scenarios to deal with in the client application
Why put more work on the clusters if you can avoid with a
circumspect design?
20
Inventory = Supply – Demand
21
Demand Mgmt Process /API's [ id'd by (sku, loc_id) ]
Available
Inventory
Supply Mgmt Process /API's [ id'd by (sku, loc_id) ]
Application Layer
Traditional Multi Data Center Infrastructure
22+ https://docs.oracle.com/middleware/1221/wls/WLCAG/weblogic_ca_archs.htm#WLCAG138 * http://www.ibm.com/developerworks/bpm/bpmjournal/1308_zhang/1308_zhang.html
What do we notice here?
• Resilient (often
stateless) Web /
App layer
• Primary / Failover
DB (high
transaction load
can force this
because active-
active is too heavy)
• All active database
access is managed
in one Data Center
+ *
Inventory txn's id'd by (sku, loc_id) can flow from applications in either Data Center
- multiple transactions for same item come through either DC, but db access only in 1 DC
Client Client
Stateless Application Processing Layer (Demand and Supply logic and queues, web
layer, etc.)
Does Tradition Kill The Future?
Pattern #1- Almost No Change To Infrastructure
Primary active -secondary failover DB's isolated in
Data Centers
Inventory database interaction (sku, loc_id) can
come from the application layer in either Data
Center with only .
• 100% of data in both Data Centers
– Cassandra's active-active data synchronization between
clusters
– more sophisticated than replication, but you can think of
it that way if it makes things easier
23
DC1 DC2
Does Traditional Active / Failover DC Architecture Kill The Future?
Obviously inefficient usage! But works fine
• Each operation will use DC1 as the primary Data Center
• Lightweight Transactions typically used
– Same item id'd by (sku, loc_id) can be simultaneously updated by two different
application paths
– Not necessarily evil, just more work for the database to do
• Consistency Level
– For the scared neophytes
• QUORUM, EACH_QUORUM, or SERIAL client gets success ack from nodes in both DC's (RF =
3 x 2 DC's = 4 nodes respond)
– Consistency Level for Cassandra's people
• LOCAL_QUORUM or LOCAL_SERIAL client gets success ack from nodes in DC1 (RF = 3 x 1 DC
= 2 nodes respond). Cassandra will simultaneously update DC2, but client app does not wait
• Application Data Center FAILOVER for Cassandra is Automatic
24
DC1 DC2
Cassandra's active-active is way, way past primary-secondary RDBMS failover architecture
of isolated in Data Centers. Stuck with old Data Center failover scenario (but no RDBMS)
... and yes. Cassandra efficiently and elegantly deals with WAN Data Center latencies
• Pattern #1- Almost no change w/ a primary – secondary C* DB
Reality Check Before Next Section (uhh... nodes and DC's talk, right?)
- How many Angels can blink while dancing on the head of a pin?
• It takes longer to think about it than it does to do it.
– Cassandra operations are happening in units of micro- and milli- seconds [
(10-6) and (10-3) seconds ] / network typically in micro- to low milli-seconds
• Reference Points
– The average human eye blink takes 350,000 microseconds (just over
1/3 of one second).
– The average human finger click takes 150,000 microseconds (just over
1/7 of one second).
– A camera flash illuminates for 1000 microseconds
• Netflix AWS Inter-Region Test*
– "1 million records in one region of a multi-region cluster. We then
initiated a read, 500ms later, in the other region, of the records that
were just written in the initial region, while keeping a production level
of load on the cluster"
• Consider Where Latency Bottlenecks Happen (slowest ship...)+
25
Comparison of Typical Latency Effects
* http://www.slideshare.net/planetcassandra/c-summit-2013-netflix-open-source-tools-and-benchmarks-for-cassandra
+ https://www.cisco.com/application/pdf/en/us/guest/netsol/ns407/c654/ccmigration_09186a008091d542.pdf
DC1 DC2
26
DC1 DC2
Does Traditional Active / Failover DC Architecture Kill The Future?
Works fine, but not super-efficient
• Either Data Center can service requests (even from the same
client)
• Lightweight Transactions typically used
– Same item id'd by (sku, loc_id) can be simultaneously updated by two
different application paths
– Potential conflict across 2 Data Centers
– Not the ultimate evil
• More work for the database to do across Data Centers
• DC failure scenarios have more work to fully recover
• Cassandra will ensure 100% of the data is in both Data Centers
• Typical Consistency Level
– QUORUM, SERIAL, EACH_QUORUM client gets success ack from nodes in
both DC's (RF = 3 x 2 DC's = 4 nodes respond)
• Application Data Center FAILOVER for Cassandra is Automatic
• Pattern #2- Client requests access two active Cassandra Data Centers
Time to drop the notion of a "Failover Data Database"
27
• Pattern #3- Consistent Routing – i.e. Location Aware
Clients From
East Region
Removes Cross-Region LWT Contention
• East transactions go to DC2, West to DC1
– loc_id in key (sku, loc_id) are routed based on geography of the
request.
• loc_id's 1 & 2 are in the East, 5 & 6 West
– Potential LWT contention only per Data Center (sku, loc_id)
– Does not have to be geographic, just consistent routing to a
data center by a piece of data that is in primary key (loc_id in
this case).
• Cassandra will ensure data is in both Data Centers
• Typical Consistency Level
– LOCAL_QUORUM client gets success ack from nodes in one DC's
(RF = 3 x 1 DC's = 2 nodes respond)
• Application Data Center FAILOVER for Cassandra is
Automatic
DC1 DC2
Clients From
West Region
txn's PK (sku, loc_id, avail_time) => [ ("3K2", 1, t0), ("551", 2, t4), ...
..., ("3K2", 2, t2), ("551", 1, t2), ... ]
[("3K2",1,t0),
("551",2,t4),...,
("3K2",2,t2),
("551",1,t2),...]
[("aaa",5,t0),
("bbb",6,t4),...,
("aaa",6,t2),
("bbb",5,t2),...]
Data Goes Where Data Lives In An Ordered Way
28
DC1 DC2
• Pattern #4 – Consistent Routing - Location Aware and Queuing Based on Primary Key
Clients From
West Region
Clients From
East Region
Removes Need for Lightweight Transactions
• East transactions go to DC2, West to DC1
– loc_id in key (sku, loc_id) are routed based on geography of the request.
• loc_id's 1 & 2 are in the East, 5 & 6 West
– Queuing will provide a time ordered queue per (sku,
loc_id) primary key
– No need for Lightweight Transactions
• Each item activity is identified by (sku, loc_id) in it's own queue in
the order submitted
• Cassandra will ensure data is in both Data Centers
• Typical Consistency Level
– LOCAL_QUORUM client only needs success ack from nodes one DC's (RF
= 3 x 1 DC's = 2 nodes respond)
• It's OK if node(s) or DC's go down, application / Data
Center FAILOVER for Cassandra is Automatic
– Just can't think this way with an RDBMS (see screaming man)
txn's PK (sku, loc_id, avail_time) => [ ("3K2", 1, t0), ("551", 2, t4), ...
..., ("3K2", 2, t2), ("551", 1, t2), ... ]
("3K2",1,t0),("3K2",1,t2)
("551",2,t2),("551",2,t4)
("xxx",x,tx),("xxx",x,tx)
("aaa",5,t0),("aaa",5,t2)
("bbb",6,t2),("bbb",6,t4)
("ccc",7,tx),("ccc",7,tx)
Blog Outlining An Implementation
• http://www.datastax.com/dev/blog/scalable-inventory-example
• Includes a logging table for immediate and historical queries and analysis
– Good use case for DataStax Enterprise Spark and Solr integration
29
Last Thoughts
• None of the architecture patterns discussed
are bad, just some more efficient than others
• Cassandra gives you design choices on how to
work with your Data Center architectures, not
the other way around
• Cassandra can always scale!
• Did not discuss different patterns as logic for
things such as safety stock is applied
30
Last Question – Why Not Active-Active RDBMS?
• Limitations of technology
often force Pattern #1, the
active / failover RDBMS in
separate Data Centers
• Operationally this means
compromising:
– Speed
– Always-on
– Scalability
• Which can mean Screaming
Man
– is not good for your sanity or
career (imho)
References and Interesting Related Stuff
32
http://www.datastax.com/dev/blog/scalable-inventory-example
http://www.nextplatform.com/2015/09/23/making-cassandra-do-azure-but-not-windows/
http://highscalability.com/blog/2015/1/26/paper-immutability-changes-everything-by-pat-helland.html
http://www.slideshare.net/planetcassandra/azure-datastax-enterprise-powers-office-365-per-user-store
Life beyond Distributed Transactions: an Apostate’s Opinion: http://www-
db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf
http://blogs.msdn.com/b/pathelland/archive/2007/07/23/normalization-is-for-sissies.aspx
http://www.oracle.com/technetwork/database/availability/fmw11gsoamultidc-aa-1998491.pdf
https://msdn.microsoft.com/en-us/library/azure/dn251004.aspx
http://www.datastax.com/dev/blog/scalable-inventory-example
https://docs.oracle.com/middleware/1221/wls/WLCAG/weblogic_ca_archs.htm#WLCAG138
http://www.ibm.com/developerworks/bpm/bpmjournal/1308_zhang/1308_zhang.html
Thank You!

Contenu connexe

Tendances

Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandraNguyen Quang
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overviewSean Murphy
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Benoit Perroud
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsDave Gardner
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsAcunu
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsJulien Anguenot
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0Joe Stein
 
Understanding AntiEntropy in Cassandra
Understanding AntiEntropy in CassandraUnderstanding AntiEntropy in Cassandra
Understanding AntiEntropy in CassandraJason Brown
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandraAaron Ploetz
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecturenickmbailey
 
Cassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and LimitationsCassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and LimitationsPanagiotis Papadopoulos
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraDataStax
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hoodAndriy Rymar
 

Tendances (20)

Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Spark
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0
 
Understanding AntiEntropy in Cassandra
Understanding AntiEntropy in CassandraUnderstanding AntiEntropy in Cassandra
Understanding AntiEntropy in Cassandra
 
Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
Cassandra Architecture FTW
Cassandra Architecture FTWCassandra Architecture FTW
Cassandra Architecture FTW
 
Cassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and LimitationsCassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and Limitations
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hood
 

En vedette

Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center StrategiesSteven Francia
 
Successful Software Development with Apache Cassandra
Successful Software Development with Apache CassandraSuccessful Software Development with Apache Cassandra
Successful Software Development with Apache CassandraDataStax Academy
 
Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)DataStax Academy
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph DatabasesDataStax Academy
 
Analytics with Spark and Cassandra
Analytics with Spark and CassandraAnalytics with Spark and Cassandra
Analytics with Spark and CassandraDataStax Academy
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkDataStax Academy
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayDataStax Academy
 
Coursera's Adoption of Cassandra
Coursera's Adoption of CassandraCoursera's Adoption of Cassandra
Coursera's Adoption of CassandraDataStax Academy
 
Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)DataStax Academy
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net DriverDataStax Academy
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureDataStax Academy
 
Lessons Learned with Cassandra and Spark at the US Patent and Trademark Office
Lessons Learned with Cassandra and Spark at the US Patent and Trademark OfficeLessons Learned with Cassandra and Spark at the US Patent and Trademark Office
Lessons Learned with Cassandra and Spark at the US Patent and Trademark OfficeDataStax Academy
 
Using Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraUsing Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraDataStax Academy
 
Traveler's Guide to Cassandra
Traveler's Guide to CassandraTraveler's Guide to Cassandra
Traveler's Guide to CassandraDataStax Academy
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First ClusterDataStax Academy
 

En vedette (20)

Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center Strategies
 
Successful Software Development with Apache Cassandra
Successful Software Development with Apache CassandraSuccessful Software Development with Apache Cassandra
Successful Software Development with Apache Cassandra
 
Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 
Analytics with Spark and Cassandra
Analytics with Spark and CassandraAnalytics with Spark and Cassandra
Analytics with Spark and Cassandra
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with Spark
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Coursera's Adoption of Cassandra
Coursera's Adoption of CassandraCoursera's Adoption of Cassandra
Coursera's Adoption of Cassandra
 
Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)Production Ready Cassandra (Beginner)
Production Ready Cassandra (Beginner)
 
New features in 3.0
New features in 3.0New features in 3.0
New features in 3.0
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net Driver
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and Furure
 
Playlists at Spotify
Playlists at SpotifyPlaylists at Spotify
Playlists at Spotify
 
Lessons Learned with Cassandra and Spark at the US Patent and Trademark Office
Lessons Learned with Cassandra and Spark at the US Patent and Trademark OfficeLessons Learned with Cassandra and Spark at the US Patent and Trademark Office
Lessons Learned with Cassandra and Spark at the US Patent and Trademark Office
 
Using Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraUsing Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with Cassandra
 
Traveler's Guide to Cassandra
Traveler's Guide to CassandraTraveler's Guide to Cassandra
Traveler's Guide to Cassandra
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 

Similaire à Tales From The Front: An Architecture For Multi-Data Center Scalable Applications For Things That Count Using Cassandra

Scalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityScalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityPapitha Velumani
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonSpark Summit
 
Scalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityScalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityPapitha Velumani
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAndre Essing
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentationSergey Enin
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauSam Palani
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptxbetalab
 
SCALE 16x on-prem container orchestrator deployment
SCALE 16x on-prem container orchestrator deploymentSCALE 16x on-prem container orchestrator deployment
SCALE 16x on-prem container orchestrator deploymentSteve Wong
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to CassandraUmair Mansoob
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseJoe Alex
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Design (Cloud systems) for Failures
Design (Cloud systems) for FailuresDesign (Cloud systems) for Failures
Design (Cloud systems) for FailuresRodolfo Kohn
 

Similaire à Tales From The Front: An Architecture For Multi-Data Center Scalable Applications For Things That Count Using Cassandra (20)

No sql
No sqlNo sql
No sql
 
Master.pptx
Master.pptxMaster.pptx
Master.pptx
 
Scalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityScalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availability
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
Scalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availabilityScalable analytics for iaas cloud availability
Scalable analytics for iaas cloud availability
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Introduction
IntroductionIntroduction
Introduction
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep Dive
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptx
 
SCALE 16x on-prem container orchestrator deployment
SCALE 16x on-prem container orchestrator deploymentSCALE 16x on-prem container orchestrator deployment
SCALE 16x on-prem container orchestrator deployment
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
 
No sql
No sqlNo sql
No sql
 
Cluster Computing
Cluster ComputingCluster Computing
Cluster Computing
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed Database
 
NoSQL
NoSQLNoSQL
NoSQL
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Design (Cloud systems) for Failures
Design (Cloud systems) for FailuresDesign (Cloud systems) for Failures
Design (Cloud systems) for Failures
 

Plus de DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftDataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseDataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraDataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsDataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackDataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache CassandraDataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready CassandraDataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with DseDataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraDataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraDataStax Academy
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and DriversDataStax Academy
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talkDataStax Academy
 

Plus de DataStax Academy (19)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Tales From The Front: An Architecture For Multi-Data Center Scalable Applications For Things That Count Using Cassandra

  • 1. Tales From The Front: Dan Flavin DataStax 2015-12-08 An Architecture For Scalable Multi-Data Center Applications for Things That Count Using Cassandra
  • 2. This Session Says What? The Example Use Case is an Inventory Access Pattern Using Cassandra Same Patterns Apply To Other Distributed, Highly Current Applications What We're Going To Cover in 30 Minutes • Quick review of Cassandra functionality that applies to this use case • Common Data Center and application architectures for highly available inventory applications, and why the were designed that way • Cassandra implementations vis-a-vis infrastructure capabilities – The impedance mismatch: compromises made to fit into IT infrastructures designed and implemented with an old mindset*. 2 * just because an infrastructre made sense for the software architectures a decade ago doesn't mean it's still a good idea
  • 3. Things That Count How Did We Get HERE? More Users, More Problems A Common Use - Inventory
  • 4. Scaling and Availability • We all want applications and services that are scalable and highly available • Scaling our app tier is usually pretty painless, especially with cloud infrastructure – App tier tends to be stateless 4 while accessing a single stateful database infrastructure
  • 5. Ways We Scale our Relational Databases SELECT array_agg(players), player_teams FROM ( SELECT DISTINCT t1.t1player AS players, t1.player_teams FROM ( SELECT p.playerid AS t1id, concat(p.playerid, ':', p.playername, ' ') AS t1player, array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams FROM player p LEFT JOIN plays pl ON p.playerid = pl.playerid GROUP BY p.playerid, p.playername ) t1 INNER JOIN ( SELECT p.playerid AS t2id, array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams FROM player p LEFT JOIN plays pl ON p.playerid = pl.playerid GROUP BY p.playerid, p.playername ) t2 ON t1.player_teams = t2.player_teams AND t1.t1id <> t2.t2id ) innerQuery Scaling Up SELECT * FROM denormalized_view Denormalization 5
  • 6. Ways we Scale our Relational Databases Client Users Data Primary Replica 1 Replica 2 Failover Process Monitor Failover Write RequestsRead Requests Replication Lag Router A-F G-M N-T U-Z Replication Lag Sharding and Replication (and probably Denormalization*) 6 Monitor the Failover Monitor * a.k.a. Demoralization – you lose your technology moral ground when you mix sharding, normalization / denormalization / failure survivability / flexibility and humanity
  • 7. Distributed Systems ≠Sharded or Bolt-on Replication Systems 7 Scalability  Distributed Systems, Distributed Data, all distributed amongst "nodes" (a.k.a. computers)  No node is critical path or chokepoint Availability  Requests will always be responded to Partition Tolerance  Component failure(s) do not bring whole system down Consistency  All nodes see the data at the same time Nodes are units of performance and availability (a.k.a. incremental scalability) The CAP Theorem describes distributed systems choices Consistency Availability Partition Tolerance Cassandra values AP where the balance between consistency and latency is tunable.
  • 8. What is Cassandra? • A Linearly Scaling and Fault Tolerant Distributed Database • Fully Distributed – Data spread over many nodes – All nodes participate in a cluster – All nodes are equal – No SPOF (shared nothing) – Run on commodity hardware 8
  • 9. What is Cassandra? Linearly Scaling – Have More Data? Add more nodes. – Need More Throughput? Add more nodes. Fault Tolerant – Nodes Down != Database Down – Datacenter Down != Database Down 9
  • 10. What is Cassandra? • Cassandra Architecture Applicability – Multiple Data Center "Active-Active replication" – One Data Center goes offline, the others can seamlessly take over – Offline DC back online becomes immediately available for workload, Cassandra automatically rationalizes data across DC's 10 • Fully Replicated • Clients write local DC • Data syncs to other DC • Replication Factor per DC • "Success" controlled by client app (i.e. local only, local and remote individually, local and remote combined...) Data Center 1 Data Center 2 Client
  • 11. Replication Factor - How many nodes get the same data Node 1 A CD Write Data Identified by a Primary Key A • Each node will get a range of keys identified by a hash of a table's Primary Key • On insert key A is hashed to determine nodes data is written to Replication Factor=3 11 How? Replication Factor (server-side) Schema Configuration  Implements setting how many copies of the data on different nodes should exist for CAP • also good for Cassandra performance of some operations (different topic) Key To CAP System Implementation:  Data is Spread Across Many Nodes Client Node 2 B AD Node 3 C AB Node 4 D BC CQL: insert A into table... hash() Write to primary node 1, and simultaneously to nodes 2 & 3 (RF = 3)
  • 12. Consistency Level (CL) and Speed • How many replicas we need to hear from can affect how quickly we can read and write data in Cassandra Client B AD C AB A CD D BC 5 µs ack 300 µs ack 12 µs ack 12 µs ack Read A (CL=QUORUM) 12 • Same concept applies across Data Centers
  • 13. Consistency Level and Data Consistency • Replicas may disagree – Column values are timestamped – In Cassandra, Last Write Wins (LWW) • Often use Lightweight Transactions – Allows for serialization / test-and-set Client B AD C AB A CD D BC A=2 Newer A=1 Older A=2 Read A (CL=QUORUM) A good primer on consistency levels w/o LWT's: Christos from Netflix: “Eventual Consistency != Hopeful Consistency” https://www.youtube.com/watch?v=lwIA8tsDXXE 13
  • 14. Consistency Level and Availability • Consistency Level choice affects availability • For example, 1 Data Center with application read Consistency Level (CL) at QUORUM can tolerate one replica being down and still be available (in RF=3) Client B AD C AB A CD D BC A=2 A=2 A=2 Read A (CL=QUORUM) 14 • Same concept applies across Data Centers
  • 15. Consistency Levels (CL's) • Applies to both Reads and Writes (i.e. is set on each query) • Controlled at the Client API Level (% controlled by RF). Examples include: – ONE – one replica from any DC – LOCAL_ONE – one replica from local DC – QUORUM – 51% of replicas from any DC – LOCAL_QUORUM – 51% of replicas from local DC – EACH_QUORUM – 51% of replicas from multiple DC's – ALL – all replicas – TWO... • The client code has, and can control a hierarchy of automatic failure levels – i.e. one Data Center down with QUORUM, use LOCAL_QUORUM • Cassandra will keep things consistent when the failed Data Center is back online • Will there potentially be a speed cost? Of course there might be, but your database never went down now did it? – CL fallback patterns are usually set once when connection is made in the application using Cassandra 15 Other Constructs • BATCH – All statements succeed or fail • Light Weight Transactions (LWT) – Test and Set Consistency Level Choices - Reacting when machines, data centers, networks, etc. go down (if you notice)
  • 16. Primary Key and Replication Factor 16 "Write Identified by Primary Key A" means what?  What "Primay Key A" looks like in CQL Replication Factor (per DC)
  • 17. Review of Replication Factor (RF) and Consistency Level (CL) What matters for this discussion • The KEY is the thing that is an enabler for distributed systems & CAP implementation – Along with Replication Factor determines what nodes (nodes can live in different Data Centers) data will be written to and read from. – Consistency Level controlled in application for the scenario that makes sense. A simple app may never notice node failures. • Unless... ... something in the infrastructure breaks – Then the application gets to decide what to do » Ha! Like you even have those options with other technologies » Is another topic completely out of scope here 17
  • 18. Back to the Topic: Inventory Supply and Demand • Cassandra Architecture Applicability – Linear scalability / de-scalability to handle heavy workload periods – Predictable performance – No Single Point of Failure • Node(s) • Data Center(s) – Flexible Deployment Options • On-Premise • Cloud • Hybrid Cloud / On-Premise – Looks like one big old database to the client with visibility into it's working parts when needed 18
  • 19. Inventory Supply and Demand - An Example Order Management System Implementation Overview Logical 19Images from: http://www.ibm.com/developerworks/library/co-promising-inventory/ Physical Natural spot for Cassandra is [ Inventory = (Reservation <-> Order) – Supply ] Database App layer can scale Why? Traditional RDBMS – Not So Much
  • 20. CL's, RF's, and LWT's. Oh My! • Consistency Levels – QUORUM (LOCAL_QUORUM) – ONE (LOCAL_ONE) • Reduce or Remove – BATCH statements – Lightweight Transactions (LWT) Other CL's, LWT's, and BATCH are not bad... – They put more work on the Cassandra cluster(s) – Potentially add more scenarios to deal with in the client application Why put more work on the clusters if you can avoid with a circumspect design? 20
  • 21. Inventory = Supply – Demand 21 Demand Mgmt Process /API's [ id'd by (sku, loc_id) ] Available Inventory Supply Mgmt Process /API's [ id'd by (sku, loc_id) ] Application Layer
  • 22. Traditional Multi Data Center Infrastructure 22+ https://docs.oracle.com/middleware/1221/wls/WLCAG/weblogic_ca_archs.htm#WLCAG138 * http://www.ibm.com/developerworks/bpm/bpmjournal/1308_zhang/1308_zhang.html What do we notice here? • Resilient (often stateless) Web / App layer • Primary / Failover DB (high transaction load can force this because active- active is too heavy) • All active database access is managed in one Data Center + * Inventory txn's id'd by (sku, loc_id) can flow from applications in either Data Center - multiple transactions for same item come through either DC, but db access only in 1 DC Client Client Stateless Application Processing Layer (Demand and Supply logic and queues, web layer, etc.)
  • 23. Does Tradition Kill The Future? Pattern #1- Almost No Change To Infrastructure Primary active -secondary failover DB's isolated in Data Centers Inventory database interaction (sku, loc_id) can come from the application layer in either Data Center with only . • 100% of data in both Data Centers – Cassandra's active-active data synchronization between clusters – more sophisticated than replication, but you can think of it that way if it makes things easier 23 DC1 DC2
  • 24. Does Traditional Active / Failover DC Architecture Kill The Future? Obviously inefficient usage! But works fine • Each operation will use DC1 as the primary Data Center • Lightweight Transactions typically used – Same item id'd by (sku, loc_id) can be simultaneously updated by two different application paths – Not necessarily evil, just more work for the database to do • Consistency Level – For the scared neophytes • QUORUM, EACH_QUORUM, or SERIAL client gets success ack from nodes in both DC's (RF = 3 x 2 DC's = 4 nodes respond) – Consistency Level for Cassandra's people • LOCAL_QUORUM or LOCAL_SERIAL client gets success ack from nodes in DC1 (RF = 3 x 1 DC = 2 nodes respond). Cassandra will simultaneously update DC2, but client app does not wait • Application Data Center FAILOVER for Cassandra is Automatic 24 DC1 DC2 Cassandra's active-active is way, way past primary-secondary RDBMS failover architecture of isolated in Data Centers. Stuck with old Data Center failover scenario (but no RDBMS) ... and yes. Cassandra efficiently and elegantly deals with WAN Data Center latencies • Pattern #1- Almost no change w/ a primary – secondary C* DB
  • 25. Reality Check Before Next Section (uhh... nodes and DC's talk, right?) - How many Angels can blink while dancing on the head of a pin? • It takes longer to think about it than it does to do it. – Cassandra operations are happening in units of micro- and milli- seconds [ (10-6) and (10-3) seconds ] / network typically in micro- to low milli-seconds • Reference Points – The average human eye blink takes 350,000 microseconds (just over 1/3 of one second). – The average human finger click takes 150,000 microseconds (just over 1/7 of one second). – A camera flash illuminates for 1000 microseconds • Netflix AWS Inter-Region Test* – "1 million records in one region of a multi-region cluster. We then initiated a read, 500ms later, in the other region, of the records that were just written in the initial region, while keeping a production level of load on the cluster" • Consider Where Latency Bottlenecks Happen (slowest ship...)+ 25 Comparison of Typical Latency Effects * http://www.slideshare.net/planetcassandra/c-summit-2013-netflix-open-source-tools-and-benchmarks-for-cassandra + https://www.cisco.com/application/pdf/en/us/guest/netsol/ns407/c654/ccmigration_09186a008091d542.pdf DC1 DC2
  • 26. 26 DC1 DC2 Does Traditional Active / Failover DC Architecture Kill The Future? Works fine, but not super-efficient • Either Data Center can service requests (even from the same client) • Lightweight Transactions typically used – Same item id'd by (sku, loc_id) can be simultaneously updated by two different application paths – Potential conflict across 2 Data Centers – Not the ultimate evil • More work for the database to do across Data Centers • DC failure scenarios have more work to fully recover • Cassandra will ensure 100% of the data is in both Data Centers • Typical Consistency Level – QUORUM, SERIAL, EACH_QUORUM client gets success ack from nodes in both DC's (RF = 3 x 2 DC's = 4 nodes respond) • Application Data Center FAILOVER for Cassandra is Automatic • Pattern #2- Client requests access two active Cassandra Data Centers
  • 27. Time to drop the notion of a "Failover Data Database" 27 • Pattern #3- Consistent Routing – i.e. Location Aware Clients From East Region Removes Cross-Region LWT Contention • East transactions go to DC2, West to DC1 – loc_id in key (sku, loc_id) are routed based on geography of the request. • loc_id's 1 & 2 are in the East, 5 & 6 West – Potential LWT contention only per Data Center (sku, loc_id) – Does not have to be geographic, just consistent routing to a data center by a piece of data that is in primary key (loc_id in this case). • Cassandra will ensure data is in both Data Centers • Typical Consistency Level – LOCAL_QUORUM client gets success ack from nodes in one DC's (RF = 3 x 1 DC's = 2 nodes respond) • Application Data Center FAILOVER for Cassandra is Automatic DC1 DC2 Clients From West Region txn's PK (sku, loc_id, avail_time) => [ ("3K2", 1, t0), ("551", 2, t4), ... ..., ("3K2", 2, t2), ("551", 1, t2), ... ] [("3K2",1,t0), ("551",2,t4),..., ("3K2",2,t2), ("551",1,t2),...] [("aaa",5,t0), ("bbb",6,t4),..., ("aaa",6,t2), ("bbb",5,t2),...]
  • 28. Data Goes Where Data Lives In An Ordered Way 28 DC1 DC2 • Pattern #4 – Consistent Routing - Location Aware and Queuing Based on Primary Key Clients From West Region Clients From East Region Removes Need for Lightweight Transactions • East transactions go to DC2, West to DC1 – loc_id in key (sku, loc_id) are routed based on geography of the request. • loc_id's 1 & 2 are in the East, 5 & 6 West – Queuing will provide a time ordered queue per (sku, loc_id) primary key – No need for Lightweight Transactions • Each item activity is identified by (sku, loc_id) in it's own queue in the order submitted • Cassandra will ensure data is in both Data Centers • Typical Consistency Level – LOCAL_QUORUM client only needs success ack from nodes one DC's (RF = 3 x 1 DC's = 2 nodes respond) • It's OK if node(s) or DC's go down, application / Data Center FAILOVER for Cassandra is Automatic – Just can't think this way with an RDBMS (see screaming man) txn's PK (sku, loc_id, avail_time) => [ ("3K2", 1, t0), ("551", 2, t4), ... ..., ("3K2", 2, t2), ("551", 1, t2), ... ] ("3K2",1,t0),("3K2",1,t2) ("551",2,t2),("551",2,t4) ("xxx",x,tx),("xxx",x,tx) ("aaa",5,t0),("aaa",5,t2) ("bbb",6,t2),("bbb",6,t4) ("ccc",7,tx),("ccc",7,tx)
  • 29. Blog Outlining An Implementation • http://www.datastax.com/dev/blog/scalable-inventory-example • Includes a logging table for immediate and historical queries and analysis – Good use case for DataStax Enterprise Spark and Solr integration 29
  • 30. Last Thoughts • None of the architecture patterns discussed are bad, just some more efficient than others • Cassandra gives you design choices on how to work with your Data Center architectures, not the other way around • Cassandra can always scale! • Did not discuss different patterns as logic for things such as safety stock is applied 30
  • 31. Last Question – Why Not Active-Active RDBMS? • Limitations of technology often force Pattern #1, the active / failover RDBMS in separate Data Centers • Operationally this means compromising: – Speed – Always-on – Scalability • Which can mean Screaming Man – is not good for your sanity or career (imho)
  • 32. References and Interesting Related Stuff 32 http://www.datastax.com/dev/blog/scalable-inventory-example http://www.nextplatform.com/2015/09/23/making-cassandra-do-azure-but-not-windows/ http://highscalability.com/blog/2015/1/26/paper-immutability-changes-everything-by-pat-helland.html http://www.slideshare.net/planetcassandra/azure-datastax-enterprise-powers-office-365-per-user-store Life beyond Distributed Transactions: an Apostate’s Opinion: http://www- db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf http://blogs.msdn.com/b/pathelland/archive/2007/07/23/normalization-is-for-sissies.aspx http://www.oracle.com/technetwork/database/availability/fmw11gsoamultidc-aa-1998491.pdf https://msdn.microsoft.com/en-us/library/azure/dn251004.aspx http://www.datastax.com/dev/blog/scalable-inventory-example https://docs.oracle.com/middleware/1221/wls/WLCAG/weblogic_ca_archs.htm#WLCAG138 http://www.ibm.com/developerworks/bpm/bpmjournal/1308_zhang/1308_zhang.html Thank You!

Notes de l'éditeur

  1. Scaling Up == Expensive Denormalization == Duplicates of our data, no more 3NF and constraints for integrity
  2. C* doesn’t impose a single consistency model across your entire application/dataset Part of the “tunable consistency” model
  3. Demand and Supply interfaces / processes update available inventory that is then used by application layer to satisfy requests. example: Browse item -> check available inventory. item in online cart -> update demand. Item checked out -> update supply. In either case, the supply and demand processes determine when to update available inventory. Similar process for POS checkout Demand and Supply rules can apply, i.e. safety stock. alternative shipping locations depending on demand type, etc. Typically implemented with queues, REST API's, etc. Something that decouples the storage structure from the processing logic
  4. Point is two transactions for the same item can come through either tier
  5. Cassandra is often implemented in legacy network and application architectures. when would LWT not be used? When concurrency is not all that high combined with safety-stock logic
  6. Cassandra is often implemented in legacy network and application architectures. when would LWT not be used? When concurrency is not all that high combined with safety-stock logic
  7. May require changes to Global Traffic Manger(s) and application Application may need to implement or change how queuing is executed This pattern of thought is not an option with RDBMS's