SlideShare une entreprise Scribd logo
1  sur  41
Télécharger pour lire hors ligne
Introduction to Cassandra
Asis Mohanty
Source: DataStax, Cassandra
01ARCHITECTURE
Cassandra Architecture
02 C* CLUSTER
Cassandra Cluster
03CQL COLUMN
FAMILY
CQL Column Family
WRITE PATH04
Cassandra Write Path
READ PATH05
Cassandra Read Path
DATA
MODEL
06
C* Data Model
Whats is Cassandra?
• Open-source
database
management system
(DBMS)
• Several key features
of Cassandra
differentiate it from
other similar
systems
Whats is Cassandra?
Architected Scale in Mind
What is C* Cluster?
Peer To Peer
Data Centers
Tunable Consistency
Continuous Availability
Consistent Hashing
Consistent hashing allows distributing data across a cluster which minimizes reorganization
when nodes are added or removed. Consistent hashing partitions data based on the partition
key
name age car gender
jim 36 camaro M
carol 37 bmw F
johnny 12 M
suzy 10 F
For Example
Partition key Murmur3 hash value
jim -2245462676723223822
carol 7723358927203680754
johnny -6723372854036780875
suzy 1168604627387940318
Cassandra assigns a hash value to each partition key
What is a CQL table and how is it related to a column family?
Row is the smallest unit that stores related data in Cassandra
• Rows – individual rows constitute a column family
• Row key – uniquely identifies a row in a column family
• Row – stores pairs of column keys and column values
• Column key – uniquely identifies a column value in a row
• Column value – stores one value or a collection of values
What are row, row key, column key and column value?
What are partition, partition key, row, column and cell?
How does Cassandra writes so fast?
Cassandra is a log-structured storage engine
• Data is sequentially appended, not placed in pre-set locations
What are the key components of the write path?
Each node implements four key components to handle its writes
 Memtables – in-memory tables corresponding to CQL tables, with
indexes
 CommitLog – append-only log, replayed to restore downed node's
Memtables
 SSTables – Memtable snapshots periodically flushed to disk, clearing
heap
 Compaction – periodic process to merge and streamline SSTables
When any node receive any write request
 The record appends to the CommitLog, and
 The record appends to the Memtable for this record's target CQL table
 Periodically, Memtables flush to SSTables, clearing JVM heap and
CommitLog
 Periodically, Compaction runs to merge and streamline SSTables
How does the write path flow on a node?
What are Memtables and how are they flushed to disk?
What is a SSTable and what are its characteristics?
What is a SSTable and what are its characteristics?
What is compaction?
How does the read path flow on each node?
How does the read path flow on each node?
How does the read path flow on each node?
How does the read path flow on each node?
What is a data modeling framework?
ASample Data Model
What is a conceptual data model?
Partitioning
• Nodes are logically structured in Ring Topology.
• Hashed value of key associated with data partition is used to assign it
to a node in the ring.
• Hashing rounds off after certain value to support ring structure.
• Lightly loaded nodes moves position to alleviate highly loaded
nodes.
33
Appendix
Consistency – All the
servers in the system will
have the same data so
anyone using the system will
get the same copy
regardless of which server
answers their request.
Availability – The system
will always respond to a
request (even if it's not the
latest data or consistent
across the system or just a
message saying the system
isn't working)
Partition Tolerance – The
system continues to operate
as a whole even if individual
servers fail or can't be
reached..
CAP Theorem
CassandraArchitecture Overview
○ Cassandra was designed with the understanding that system/ hardware failures
can and do occur
○ Peer-to-peer, distributed system
○ All nodes are the same
○ Data partitioned among all nodes in the cluster
○ Custom data replication to ensure fault tolerance
○ Read/Write-anywhere design
○ Google BigTable - data model
○ Column Families
○ Memtables
○ SSTables
○ Amazon Dynamo - distributed systems technologies
○ Consistent hashing
○ Partitioning
○ Replication
○ One-hop routing
Transparent Elasticity
Nodes can be added and removed from Cassandra online, with no
downtime being experienced.
1
2
3
4
5
6
1
7
10
4
2
3
5
6
8
9
11
12
Transparent Scalability
Addition of Cassandra nodes increases performance linearly and
ability to manage TB’s-PB’s of data.
1
2
3
4
5
6
1
7
10
4
2
3
5
6
8
9
11
12
Performance
throughput = N
Performance
throughput = N x 2
HighAvailability
Cassandra, with its peer-to-peer architecture has no single point of
failure.
Multi-Geography/ZoneAware
Cassandra allows a single logical database to span 1-N datacenters
that are geographically dispersed. Also supports a hybrid on-
premise/Cloud implementation.
Data Redundancy
Cassandra allows for customizable data redundancy so that data is
completely protected. Also supports rack awareness (data can be
replicated between different racks to guard against machine/rack
failures).
uses ‘Zookeeper’ to
choose a leader
which tells nodes
the range they are
replicas for
Security in Cassandra
• Internal Authentication
 Manages login IDs and passwords inside the database.
• Object Permission Management
 Controls who has access to what and who can do what in the
database
 Uses familiar GRANT/REVOKE from relational systems.
• Client to Node Encryption
 Protects data in flight to and from a database

Contenu connexe

Tendances

NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraFolio3 Software
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overviewElifTech
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsOleg Magazov
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Benoit Perroud
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed DatabaseEric Evans
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraSoftwareMill
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandraAaron Ploetz
 
Introduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache CassandraIntroduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache CassandraChetan Baheti
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...DataStax Academy
 

Tendances (20)

NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and Basics
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
Cassandra ppt 2
Cassandra ppt 2Cassandra ppt 2
Cassandra ppt 2
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Introduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache CassandraIntroduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache Cassandra
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
Apache Cassandra
Apache CassandraApache Cassandra
Apache Cassandra
 

En vedette

Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...
Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...
Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...Acunu
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Viswanath J
 
Management on Cloud 2011
Management on Cloud 2011Management on Cloud 2011
Management on Cloud 2011steccami
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 

En vedette (6)

Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...
Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...
Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam...
 
Cassandra ppt 1
Cassandra ppt 1Cassandra ppt 1
Cassandra ppt 1
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Getting started with Cassandra 2.1
Getting started with Cassandra 2.1
 
Management on Cloud 2011
Management on Cloud 2011Management on Cloud 2011
Management on Cloud 2011
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 

Similaire à Cassandra basics 2.0

Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra nehabsairam
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Md. Shohel Rana
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandraPL dream
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentationSergey Enin
 
cassandra
cassandracassandra
cassandraAkash R
 
Cassandra advanced part-ll
Cassandra advanced part-llCassandra advanced part-ll
Cassandra advanced part-llachudhivi
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataChen Robert
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to CassandraUmair Mansoob
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMIJCI JOURNAL
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 
Cassandra advanced-I
Cassandra advanced-ICassandra advanced-I
Cassandra advanced-Iachudhivi
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelAndrey Lomakin
 

Similaire à Cassandra basics 2.0 (20)

Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
 
cassandra
cassandracassandra
cassandra
 
Cassandra advanced part-ll
Cassandra advanced part-llCassandra advanced part-ll
Cassandra advanced part-ll
 
Cassndra (4).pptx
Cassndra (4).pptxCassndra (4).pptx
Cassndra (4).pptx
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Cassandra Learning
Cassandra LearningCassandra Learning
Cassandra Learning
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Cassandra no sql ecosystem
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Cassandra advanced-I
Cassandra advanced-ICassandra advanced-I
Cassandra advanced-I
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 

Plus de Asis Mohanty

Cloud Data Warehouses
Cloud Data WarehousesCloud Data Warehouses
Cloud Data WarehousesAsis Mohanty
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsAsis Mohanty
 
Hadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseHadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseAsis Mohanty
 
Netezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataNetezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataAsis Mohanty
 
ETL tool evaluation criteria
ETL tool evaluation criteriaETL tool evaluation criteria
ETL tool evaluation criteriaAsis Mohanty
 
Cognos vs Hyperion vs SSAS Comparison
Cognos vs Hyperion vs SSAS ComparisonCognos vs Hyperion vs SSAS Comparison
Cognos vs Hyperion vs SSAS ComparisonAsis Mohanty
 
Reporting/Dashboard Evaluations
Reporting/Dashboard EvaluationsReporting/Dashboard Evaluations
Reporting/Dashboard EvaluationsAsis Mohanty
 
Oracle to Netezza Migration Casestudy
Oracle to Netezza Migration CasestudyOracle to Netezza Migration Casestudy
Oracle to Netezza Migration CasestudyAsis Mohanty
 
BI Error Processing Framework
BI Error Processing FrameworkBI Error Processing Framework
BI Error Processing FrameworkAsis Mohanty
 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradataAsis Mohanty
 
Change data capture the journey to real time bi
Change data capture the journey to real time biChange data capture the journey to real time bi
Change data capture the journey to real time biAsis Mohanty
 

Plus de Asis Mohanty (14)

Cloud Data Warehouses
Cloud Data WarehousesCloud Data Warehouses
Cloud Data Warehouses
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 
Apache TAJO
Apache TAJOApache TAJO
Apache TAJO
 
What is hadoop
What is hadoopWhat is hadoop
What is hadoop
 
Hadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseHadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouse
 
Netezza vs Teradata vs Exadata
Netezza vs Teradata vs ExadataNetezza vs Teradata vs Exadata
Netezza vs Teradata vs Exadata
 
ETL tool evaluation criteria
ETL tool evaluation criteriaETL tool evaluation criteria
ETL tool evaluation criteria
 
COGNOS Vs OBIEE
COGNOS Vs OBIEECOGNOS Vs OBIEE
COGNOS Vs OBIEE
 
Cognos vs Hyperion vs SSAS Comparison
Cognos vs Hyperion vs SSAS ComparisonCognos vs Hyperion vs SSAS Comparison
Cognos vs Hyperion vs SSAS Comparison
 
Reporting/Dashboard Evaluations
Reporting/Dashboard EvaluationsReporting/Dashboard Evaluations
Reporting/Dashboard Evaluations
 
Oracle to Netezza Migration Casestudy
Oracle to Netezza Migration CasestudyOracle to Netezza Migration Casestudy
Oracle to Netezza Migration Casestudy
 
BI Error Processing Framework
BI Error Processing FrameworkBI Error Processing Framework
BI Error Processing Framework
 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradata
 
Change data capture the journey to real time bi
Change data capture the journey to real time biChange data capture the journey to real time bi
Change data capture the journey to real time bi
 

Dernier

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 

Dernier (20)

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Cassandra basics 2.0

  • 1. Introduction to Cassandra Asis Mohanty Source: DataStax, Cassandra
  • 2. 01ARCHITECTURE Cassandra Architecture 02 C* CLUSTER Cassandra Cluster 03CQL COLUMN FAMILY CQL Column Family WRITE PATH04 Cassandra Write Path READ PATH05 Cassandra Read Path DATA MODEL 06 C* Data Model Whats is Cassandra?
  • 3. • Open-source database management system (DBMS) • Several key features of Cassandra differentiate it from other similar systems Whats is Cassandra?
  • 4.
  • 6. What is C* Cluster?
  • 11. Consistent Hashing Consistent hashing allows distributing data across a cluster which minimizes reorganization when nodes are added or removed. Consistent hashing partitions data based on the partition key name age car gender jim 36 camaro M carol 37 bmw F johnny 12 M suzy 10 F For Example Partition key Murmur3 hash value jim -2245462676723223822 carol 7723358927203680754 johnny -6723372854036780875 suzy 1168604627387940318 Cassandra assigns a hash value to each partition key
  • 12. What is a CQL table and how is it related to a column family?
  • 13. Row is the smallest unit that stores related data in Cassandra • Rows – individual rows constitute a column family • Row key – uniquely identifies a row in a column family • Row – stores pairs of column keys and column values • Column key – uniquely identifies a column value in a row • Column value – stores one value or a collection of values What are row, row key, column key and column value?
  • 14. What are partition, partition key, row, column and cell?
  • 15.
  • 16. How does Cassandra writes so fast? Cassandra is a log-structured storage engine • Data is sequentially appended, not placed in pre-set locations
  • 17. What are the key components of the write path? Each node implements four key components to handle its writes  Memtables – in-memory tables corresponding to CQL tables, with indexes  CommitLog – append-only log, replayed to restore downed node's Memtables  SSTables – Memtable snapshots periodically flushed to disk, clearing heap  Compaction – periodic process to merge and streamline SSTables When any node receive any write request  The record appends to the CommitLog, and  The record appends to the Memtable for this record's target CQL table  Periodically, Memtables flush to SSTables, clearing JVM heap and CommitLog  Periodically, Compaction runs to merge and streamline SSTables
  • 18. How does the write path flow on a node?
  • 19. What are Memtables and how are they flushed to disk?
  • 20. What is a SSTable and what are its characteristics?
  • 21. What is a SSTable and what are its characteristics?
  • 23.
  • 24. How does the read path flow on each node?
  • 25. How does the read path flow on each node?
  • 26. How does the read path flow on each node?
  • 27. How does the read path flow on each node?
  • 28.
  • 29. What is a data modeling framework?
  • 31. What is a conceptual data model?
  • 32. Partitioning • Nodes are logically structured in Ring Topology. • Hashed value of key associated with data partition is used to assign it to a node in the ring. • Hashing rounds off after certain value to support ring structure. • Lightly loaded nodes moves position to alleviate highly loaded nodes.
  • 34. Consistency – All the servers in the system will have the same data so anyone using the system will get the same copy regardless of which server answers their request. Availability – The system will always respond to a request (even if it's not the latest data or consistent across the system or just a message saying the system isn't working) Partition Tolerance – The system continues to operate as a whole even if individual servers fail or can't be reached.. CAP Theorem
  • 35. CassandraArchitecture Overview ○ Cassandra was designed with the understanding that system/ hardware failures can and do occur ○ Peer-to-peer, distributed system ○ All nodes are the same ○ Data partitioned among all nodes in the cluster ○ Custom data replication to ensure fault tolerance ○ Read/Write-anywhere design ○ Google BigTable - data model ○ Column Families ○ Memtables ○ SSTables ○ Amazon Dynamo - distributed systems technologies ○ Consistent hashing ○ Partitioning ○ Replication ○ One-hop routing
  • 36. Transparent Elasticity Nodes can be added and removed from Cassandra online, with no downtime being experienced. 1 2 3 4 5 6 1 7 10 4 2 3 5 6 8 9 11 12
  • 37. Transparent Scalability Addition of Cassandra nodes increases performance linearly and ability to manage TB’s-PB’s of data. 1 2 3 4 5 6 1 7 10 4 2 3 5 6 8 9 11 12 Performance throughput = N Performance throughput = N x 2
  • 38. HighAvailability Cassandra, with its peer-to-peer architecture has no single point of failure.
  • 39. Multi-Geography/ZoneAware Cassandra allows a single logical database to span 1-N datacenters that are geographically dispersed. Also supports a hybrid on- premise/Cloud implementation.
  • 40. Data Redundancy Cassandra allows for customizable data redundancy so that data is completely protected. Also supports rack awareness (data can be replicated between different racks to guard against machine/rack failures). uses ‘Zookeeper’ to choose a leader which tells nodes the range they are replicas for
  • 41. Security in Cassandra • Internal Authentication  Manages login IDs and passwords inside the database. • Object Permission Management  Controls who has access to what and who can do what in the database  Uses familiar GRANT/REVOKE from relational systems. • Client to Node Encryption  Protects data in flight to and from a database