Cassandra Basics: Indexing
Benjamin Black
An introduction to indexing with supercolumns and range queries in Cassandra.
1.
Cassandra Basics: Indexing
Benjamin Black, b@b3k.us
2.
Relational stores are SCHEMA ORIENTED
3.
Start from your SCHEMA & WORK FORWARDS
4.
Column stores are QUERY ORIENTED
5.
Start from your QUERIES & WORK BACKWARDS
6.
AT SCALE
7.
AT SCALE
Denormalization is THE NORM
8.
AT SCALE
9.
AT SCALE
Everything depends on THE INDICES
10.
Cassandra is an INDEX CONSTRUCTION KIT
11.
Column Family
12.
Two-level Map
key: {
  column: value,
  column: value,
  ...
}
13.
Super Column Family
14.
Three-level Map
key: {
  supercolumn: { column: value, column: value },
  supercolumn: { ... }
}
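As a rough illustration, the two map shapes can be modelled with nested Python dicts. This is only an in-memory sketch, not Cassandra client code; the variable names are made up here, and the sample values are borrowed from the user database that appears later in the deck.

```python
# In-memory sketch of the two shapes; not a Cassandra client.

# Column family: row key -> {column name -> value}
column_family = {
    "b": {"name": "Ben", "state": "WA"},
}

# Super column family: row key -> {supercolumn name -> {column name -> value}}
super_column_family = {
    "WA": {
        "Seattle": {"Ben": "b"},
    },
}

# Two lookups reach a value in a column family, three in a super column family.
print(column_family["b"]["name"])                   # Ben
print(super_column_family["WA"]["Seattle"]["Ben"])  # b
```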
15.
Column sorting defined by CompareWith / CompareSubcolumnsWith
16.
TimeUUIDType
UTF8Type
ASCIIType
LongType
LexicalUUIDType
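To see why the comparator choice matters, here is a small sketch of the same column names ordered as text and as integers. It only approximates what a string comparator and LongType do; the sample names are invented for the illustration and nothing here touches Cassandra.

```python
# Hypothetical column names, stored as strings for the comparison.
names = ["10", "9", "2"]

# Roughly what a string comparator such as UTF8Type does: compare as text.
as_text = sorted(names)

# Roughly what LongType does: compare as integers.
as_long = sorted(names, key=int)

print(as_text)  # ['10', '2', '9']
print(as_long)  # ['2', '9', '10']
```

Because columns come back in comparator order, the comparator effectively decides which slices of a row are cheap to read.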
17.
Row placement determined by Partitioner
18.
RandomPartitioner: place based on MD5 of key
OrderPreservingPartitioner: place based on actual key
19.
Rows are sorted by key on each node, regardless of partitioner
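The difference between the two partitioners can be sketched by ordering a handful of keys by the key itself and by an MD5-derived number. The `md5_token` helper is a hypothetical stand-in; the real token computation in Cassandra differs in detail.

```python
import hashlib

keys = ["b", "jason", "zack", "jen1982", "albert"]


def md5_token(key: str) -> int:
    """Hypothetical token: an integer derived from MD5(key)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


# Order-preserving placement: ring order follows the keys themselves.
by_key = sorted(keys)

# Random placement: ring order follows the hash, which scatters adjacent keys.
by_token = sorted(keys, key=md5_token)

print(by_key)    # ['albert', 'b', 'jason', 'jen1982', 'zack']
print(by_token)  # same keys, in hash order
```

Only the order-preserving arrangement keeps keys with a shared prefix next to each other around the ring, which is what range queries over keys rely on.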
20.
One example in TWO ACTS
21.
Prelude: A USER DATABASE
22.
<ColumnFamily Name="Users" CompareWith="UTF8Type" />
23.
"b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"}
"jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"}
"zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"}
"jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"}
"albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"}
24.
SELECT name FROM Users WHERE state='WA'
25.
SELECT name FROM Users WHERE state='WA'
How is the WHERE clause formed?
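With only the Users column family, nothing maps a state to the rows that hold it, so the query can only be answered by inspecting every row. The sketch below shows that brute-force scan over an in-memory copy of the sample data; `names_in_state` is a hypothetical helper, not a Cassandra call.

```python
users = {
    "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"},
}


def names_in_state(users, state):
    """Answer the WHERE clause by brute force: look at every row."""
    return sorted(row["name"] for row in users.values() if row["state"] == state)


print(names_in_state(users, "WA"))  # ['Ben', 'Jason', 'Zack']
```

The cost grows with the total number of users rather than with the number of matches, which is the motivation for building an index keyed by location.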
26.
Act One: Supercolumn Indexing
27.
<ColumnFamily Name="LocationUserIndexSCF"
              CompareWith="UTF8Type"
              CompareSubcolumnsWith="UTF8Type"
              ColumnType="Super" />
28.
[state]: {
  [city1]: {[name1]:[user1], [name2]:[user2], ... },
  [city2]: {[name3]:[user3], [name4]:[user4], ... },
  ...
  [cityX]: {[name5]:[user5], [name6]:[user6], ... }
}
29.
"CA": {
  "San Francisco": {"Jennifer": "jen1982"}
}
"MA": {
  "Boston": {"Albert": "albert"}
}
"WA": {
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
30.
Row Key: the state ("CA", "MA", "WA")
31.
Super Column: the city ("San Francisco", "Boston", "Bellingham", "Seattle")
32.
Column: the user's name ("Jennifer", "Albert", "Jason", "Ben", "Zack")
33.
Value: the Users row key ("jen1982", "albert", "jason", "b", "zack")
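One way such an index might be populated is by walking the Users rows and writing state, then city, then name to user key. The sketch below does this in memory; `build_location_index_scf` is a hypothetical helper, and a real application would issue the equivalent inserts against LocationUserIndexSCF whenever a user is written.

```python
from collections import defaultdict

users = {
    "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"},
}


def build_location_index_scf(users):
    """Build state -> city -> {name: user key}, mirroring the index layout."""
    index = defaultdict(lambda: defaultdict(dict))
    for user_key, row in users.items():
        index[row["state"]][row["city"]][row["name"]] = user_key
    # Convert to plain dicts so that lookups of missing keys raise KeyError.
    return {state: dict(cities) for state, cities in index.items()}


index = build_location_index_scf(users)
print(index["WA"]["Seattle"])
```

Note that using the name as the column name means two users with the same name in the same city would overwrite each other in this layout; a real index would need a column name that is unique per user.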
34.
Show me EVERYONE IN WASHINGTON
35.
get(:LocationUserIndexSCF, 'WA')
36.
{
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
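Once the row for a state has been fetched, turning it into a flat list of users, or narrowing it to one city, is plain client-side work. A minimal sketch, with the result above copied into a Python dict and invented variable names:

```python
# Result of the single-row lookup, copied as a plain dict.
wa = {
    "Bellingham": {"Jason": "jason"},
    "Seattle": {"Ben": "b", "Zack": "zack"},
}

# Flatten city -> {name: user key} into (name, user key) pairs, in sorted order.
everyone = [
    (name, user_key)
    for city in sorted(wa)
    for name, user_key in sorted(wa[city].items())
]
print(everyone)  # [('Jason', 'jason'), ('Ben', 'b'), ('Zack', 'zack')]

# Narrowing to one city is a single extra lookup on the supercolumn name.
print(wa["Seattle"])  # {'Ben': 'b', 'Zack': 'zack'}
```

The user keys in the values can then be used to read the full rows from Users.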
37.
Act Two: Composite Key Indexing
38.
Order Preserving Partitioner + Range Queries
39.
<ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
40.
[state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... }
[state1]/[city2]: {[name3]:[user3], [name4]:[user4], ... }
[state2]/[city1]: {[name5]:[user5], [name6]:[user6], ... }
...
[stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }
41.
"CA/San Francisco": {"Jennifer": "jen1982"}
"MA/Boston": {"Albert": "albert"}
"WA/Bellingham": {"Jason": "jason"}
"WA/Seattle": {"Ben": "b", "Zack": "zack"}
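The composite row keys can be derived from the Users rows by joining state and city with a separator. The sketch below builds the same layout in memory; `build_location_index_cf` and its `sep` parameter are hypothetical names, and the separator is assumed never to appear inside a state code.

```python
users = {
    "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"},
}


def build_location_index_cf(users, sep="/"):
    """Build "state/city" -> {name: user key}, mirroring the index layout."""
    index = {}
    for user_key, row in users.items():
        row_key = row["state"] + sep + row["city"]
        index.setdefault(row_key, {})[row["name"]] = user_key
    return index


index = build_location_index_cf(users)
print(sorted(index))  # ['CA/San Francisco', 'MA/Boston', 'WA/Bellingham', 'WA/Seattle']
```

Putting the state first in the key is what makes all rows for one state adjacent when keys are kept in sorted order.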
42.
Show me EVERYONE IN WASHINGTON
43.
get_range(:LocationUserIndexCF, {:start => 'WA', :finish => 'WB'})
44.
{
  "WA/Bellingham": {"Jason": "jason"},
  "WA/Seattle": {"Ben": "b", "Zack": "zack"}
}
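The range query works because every key that begins with "WA/" sorts after "WA" and before "WB". A minimal in-memory sketch of that lookup over sorted keys follows; the local `get_range` function is a stand-in written for this example, not the client call, and treating both bounds as inclusive is an assumption of the sketch.

```python
from bisect import bisect_left, bisect_right

index = {
    "CA/San Francisco": {"Jennifer": "jen1982"},
    "MA/Boston": {"Albert": "albert"},
    "WA/Bellingham": {"Jason": "jason"},
    "WA/Seattle": {"Ben": "b", "Zack": "zack"},
}


def get_range(index, start, finish):
    """Return rows whose key k satisfies start <= k <= finish, in key order."""
    keys = sorted(index)  # stands in for the order-preserving key layout
    lo = bisect_left(keys, start)
    hi = bisect_right(keys, finish)
    return {k: index[k] for k in keys[lo:hi]}


print(list(get_range(index, "WA", "WB")))  # ['WA/Bellingham', 'WA/Seattle']
```

The same trick narrows to a single city by using the full composite key as both bounds, for example "WA/Seattle" to "WA/Seattle".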
45.
Finale: BUILD SOMETHING AWESOME
46.
(This part is up to you)
47.
Appendix: EXAMPLE KEYSPACE
48.
<Keyspace Name="UserDb">
  <ColumnFamily Name="Users" CompareWith="UTF8Type" />
  <ColumnFamily Name="LocationUserIndexSCF"
                CompareWith="UTF8Type"
                CompareSubcolumnsWith="UTF8Type"
                ColumnType="Super" />
  <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>