SlideShare a Scribd company logo
1 of 48
Cassandra Basics
              Indexing

     Benjamin Black, b@b3k.us
Relational stores are
SCHEMA ORIENTED
Start from your SCHEMA &
WORK FORWARDS
Column stores are
QUERY ORIENTED
Start from your QUERIES &
WORK BACKWARDS
AT SCALE
AT SCALE
           Denormalization is
              THE NORM
AT SCALE
AT SCALE
           Everything depends on
               THE INDICES
Cassandra is an
INDEX CONSTRUCTION KIT
Column Family
Two-level Map

key: {
  column: value,
  column: value,
  ...
 }
Super Column Family
Three-level Map
key: {
   supercolumn: {
       column:value,
      column: value
   },
   supercolumn: {
     ...
   }
 }
column sorting defined by
         CompareWith/
CompareSubcolumnsWith
TimeUUIDType
  UTF8Type
                ASCIIType
LongType

     LexicalUUIDType
row placement determined by
             Partitioner
RandomPartitioner
Place based on MD5 of key




        OrderPreservingPartitioner
               Place based on actual key
Rows are sorted by key on each node
Regardless of partitioner
One example in
TWO ACTS
Prelude
A USER DATABASE
<ColumnFamily Name=”Users”
       CompareWith=”UTF8Type” />
“b”:    {“name”:”Ben”, “street”:”1234 Oak St.”,
        “city”:”Seattle”, “state”:”WA”}
“jason”: {”name”:”Jason”, “street”:”456 First Ave.”,
        “city”:”Bellingham”, “state”:”WA”}
“zack”:     {”name”: “Zack”, “street”: “4321 Pine St.”,
          “city”: “Seattle”, “state”: “WA”}
“jen1982”: {”name”:”Jennifer”, “street”:”1120 Foo Lane”,
         “city”:”San Francisco”, “state”:”CA”}
“albert”: {”name”:”Albert”, “street”:”2364 South St.”,
         “city”:”Boston”, “state”:”MA”}
SELECT name FROM Users
WHERE state=”WA”
SELECT name FROM Users
               WHERE state=”WA”

How is WHERE clause
formed?
Act One
Supercolumn Indexing
<ColumnFamily Name=”LocationUserIndexSCF”
       CompareWith=”UTF8Type”
       CompareSubcolumnsWith=”UTF8Type”
       ColumnType=”Super” />
[state]: {
  [city1]: {[name1]:[user1], [name2]:[user2], ... },
  [city2]: {[name3]:[user3], [name4]:[user4], ... },
  ...
  [cityX]: {[name5]:[user5], [name6]:[user6], ... }
}
“CA”: {

 “San Francisco”: {”Jennifer”: “jen1982”}
}
“MA”: {

 “Boston”: {”Albert”: “albert”}
}
“WA”: {

 “Bellingham”: {”Jason”: “jason”},

 “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Row Key


“CA”: {

 “San Francisco”: {”Jennifer”: “jen1982”}
}
“MA”: {

 “Boston”: {”Albert”: “albert”}
}
“WA”: {

 “Bellingham”: {”Jason”: “jason”},

 “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Row Key
                 Super Column

“CA”: {

 “San Francisco”: {”Jennifer”: “jen1982”}
}
“MA”: {

 “Boston”: {”Albert”: “albert”}
}
“WA”: {

 “Bellingham”: {”Jason”: “jason”},

 “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Row Key
                                     Colum
                 Super Column
                                     n
“CA”: {

 “San Francisco”: {”Jennifer”: “jen1982”}
}
“MA”: {

 “Boston”: {”Albert”: “albert”}
}
“WA”: {

 “Bellingham”: {”Jason”: “jason”},

 “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Row Key
                                     Colum
                 Super Column                Value
                                     n
“CA”: {

 “San Francisco”: {”Jennifer”: “jen1982”}
}
“MA”: {

 “Boston”: {”Albert”: “albert”}
}
“WA”: {

 “Bellingham”: {”Jason”: “jason”},

 “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Show me
EVERYONE IN WASHINGTON
get(:LocationUserIndexSCF, ‘WA’)
{

   “Bellingham”: {”Jason”: “jason”},

   “Seattle”: {”Ben”: “b”, ”Zack”: “zack”}
}
Act Two
Composite Key Indexing
Order Preserving Partitioner
                          +
        Range Queries
<ColumnFamily Name=”LocationUserIndexCF”
       CompareWith=”UTF8Type” />
[state1]/[city1]:   {[name1]:[user1], [name2]:[user2], ... }
[state1]/[city2]:   {[name3]:[user3], [name4]:[user4], ... }
[state2]/[city1]:   {[name5]:[user5], [name6]:[user6], ... }
...
[stateX]/[cityY]:   {[name7]:[user7], [name8]:[user8], ... }
“CA/San Francisco”: {”Jennifer”: “jen1982”}
“MA/Boston”: {”Albert”: “albert”}
“WA/Bellingham”: {”Jason”: “jason”}
“WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”}
Show me
EVERYONE IN WASHINGTON
get_range(:LocationUserIndexCF, {:start: 'WA',
                          :finish:'WB'})
{
    ”WA/Bellingham”: {”Jason”: “jason”},
    “WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”}
}
Finale
BUILD SOMETHING AWESOME
(This part is up to you)
Appendix
EXAMPLE KEYSPACE
<Keyspace Name="UserDb">
  <ColumnFamily Name="Users"
          CompareWith="UTF8Type" />

  <ColumnFamily Name="LocationUserIndexSCF"

   
         CompareWith="UTF8Type"
  
       
     CompareSubcolumnsWith="UTF8Type"

   
         ColumnType="Super" />

   
  <ColumnFamily Name="LocationUserIndexCF"

   
         CompareWith="UTF8Type" />

   
  <ReplicaPlacementStrategy>
      org.apache.cassandra.locator.RackUnawareStrategy
  </ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

More Related Content

Viewers also liked

Understanding BYOE and How Today's User Experience Drives Value for UC
Understanding BYOE and How Today's User Experience Drives Value for UCUnderstanding BYOE and How Today's User Experience Drives Value for UC
Understanding BYOE and How Today's User Experience Drives Value for UCShoreTel
 
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...aliproductninja
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Cassandra and Spark
Cassandra and Spark Cassandra and Spark
Cassandra and Spark datastaxjp
 
data science toolkit 101: set up Python, Spark, & Jupyter
data science toolkit 101: set up Python, Spark, & Jupyterdata science toolkit 101: set up Python, Spark, & Jupyter
data science toolkit 101: set up Python, Spark, & JupyterRaj Singh
 
Introduction to Apache Spark
Introduction to Apache Spark Introduction to Apache Spark
Introduction to Apache Spark Juan Pedro Moreno
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - DenverJon Haddad
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLRyu Kobayashi
 
Intro to py spark (and cassandra)
Intro to py spark (and cassandra)Intro to py spark (and cassandra)
Intro to py spark (and cassandra)Jon Haddad
 
The Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in PythonThe Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in PythonMiklos Christine
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best FriendsJon Haddad
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Jon Haddad
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to CassandraJon Haddad
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed DatabaseEric Evans
 
PySpark Cassandra - Amsterdam Spark Meetup
PySpark Cassandra - Amsterdam Spark MeetupPySpark Cassandra - Amsterdam Spark Meetup
PySpark Cassandra - Amsterdam Spark MeetupFrens Jan Rumph
 

Viewers also liked (20)

Cassandra
CassandraCassandra
Cassandra
 
Graphite cluster setup blueprint
Graphite cluster setup blueprintGraphite cluster setup blueprint
Graphite cluster setup blueprint
 
Understanding BYOE and How Today's User Experience Drives Value for UC
Understanding BYOE and How Today's User Experience Drives Value for UCUnderstanding BYOE and How Today's User Experience Drives Value for UC
Understanding BYOE and How Today's User Experience Drives Value for UC
 
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...
The Big 3 - 3 Keys to the Customer Kingdom - Business process, Big data, and ...
 
What is a DMP
What is a DMPWhat is a DMP
What is a DMP
 
Highly Available Graphite
Highly Available GraphiteHighly Available Graphite
Highly Available Graphite
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Cassandra and Spark
Cassandra and Spark Cassandra and Spark
Cassandra and Spark
 
data science toolkit 101: set up Python, Spark, & Jupyter
data science toolkit 101: set up Python, Spark, & Jupyterdata science toolkit 101: set up Python, Spark, & Jupyter
data science toolkit 101: set up Python, Spark, & Jupyter
 
Introduction to Apache Spark
Introduction to Apache Spark Introduction to Apache Spark
Introduction to Apache Spark
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - Denver
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQL
 
Intro to py spark (and cassandra)
Intro to py spark (and cassandra)Intro to py spark (and cassandra)
Intro to py spark (and cassandra)
 
The Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in PythonThe Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in Python
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best Friends
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
PySpark Cassandra - Amsterdam Spark Meetup
PySpark Cassandra - Amsterdam Spark MeetupPySpark Cassandra - Amsterdam Spark Meetup
PySpark Cassandra - Amsterdam Spark Meetup
 

Similar to Cassandra Basics: Indexing

Building Your First Java Application with MongoDB
Building Your First Java Application with MongoDBBuilding Your First Java Application with MongoDB
Building Your First Java Application with MongoDBMongoDB
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL UsersAll Things Open
 
MongoDB - Features and Operations
MongoDB - Features and OperationsMongoDB - Features and Operations
MongoDB - Features and Operationsramyaranjith
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL UsersGreat Wide Open
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolatorMichael Limansky
 
Native json in the Cache' ObjectScript 2016.*
Native json in the Cache' ObjectScript 2016.*Native json in the Cache' ObjectScript 2016.*
Native json in the Cache' ObjectScript 2016.*Timur Safin
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation FrameworkMongoDB
 
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 

Similar to Cassandra Basics: Indexing (10)

Building Your First Java Application with MongoDB
Building Your First Java Application with MongoDBBuilding Your First Java Application with MongoDB
Building Your First Java Application with MongoDB
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL Users
 
MongoDB - Features and Operations
MongoDB - Features and OperationsMongoDB - Features and Operations
MongoDB - Features and Operations
 
Json at work overview and ecosystem-v2.0
Json at work   overview and ecosystem-v2.0Json at work   overview and ecosystem-v2.0
Json at work overview and ecosystem-v2.0
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL Users
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolator
 
Native json in the Cache' ObjectScript 2016.*
Native json in the Cache' ObjectScript 2016.*Native json in the Cache' ObjectScript 2016.*
Native json in the Cache' ObjectScript 2016.*
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...
MongoDB .local Bengaluru 2019: Aggregation Pipeline Power++: How MongoDB 4.2 ...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 

Recently uploaded

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Recently uploaded (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Cassandra Basics: Indexing

  • 1. Cassandra Basics Indexing Benjamin Black, b@b3k.us
  • 3. Start from your SCHEMA & WORK FORWARDS
  • 5. Start from your QUERIES & WORK BACKWARDS
  • 7. AT SCALE Denormalization is THE NORM
  • 9. AT SCALE Everything depends on THE INDICES
  • 10. Cassandra is an INDEX CONSTRUCTION KIT
  • 12. Two-level Map key: { column: value, column: value, ... }
  • 14. Three-level Map key: { supercolumn: { column:value, column: value }, supercolumn: { ... } }
  • 15. column sorting defined by CompareWith/ CompareSubcolumnsWith
  • 16. TimeUUIDType UTF8Type ASCIIType LongType LexicalUUIDType
  • 17. row placement determined by Partitioner
  • 18. RandomPartitioner Place based on MD5 of key OrderPreservingPartitioner Place based on actual key
  • 19. Rows are sorted by key on each node Regardless of partitioner
  • 22. <ColumnFamily Name=”Users” CompareWith=”UTF8Type” />
  • 23. “b”: {“name”:”Ben”, “street”:”1234 Oak St.”, “city”:”Seattle”, “state”:”WA”} “jason”: {”name”:”Jason”, “street”:”456 First Ave.”, “city”:”Bellingham”, “state”:”WA”} “zack”: {”name”: “Zack”, “street”: “4321 Pine St.”, “city”: “Seattle”, “state”: “WA”} “jen1982”: {”name”:”Jennifer”, “street”:”1120 Foo Lane”, “city”:”San Francisco”, “state”:”CA”} “albert”: {”name”:”Albert”, “street”:”2364 South St.”, “city”:”Boston”, “state”:”MA”}
  • 24. SELECT name FROM Users WHERE state=”WA”
  • 25. SELECT name FROM Users WHERE state=”WA” How is WHERE clause formed?
  • 27. <ColumnFamily Name=”LocationUserIndexSCF” CompareWith=”UTF8Type” CompareSubcolumnsWith=”UTF8Type” ColumnType=”Super” />
  • 28. [state]: { [city1]: {[name1]:[user1], [name2]:[user2], ... }, [city2]: {[name3]:[user3], [name4]:[user4], ... }, ... [cityX]: {[name5]:[user5], [name6]:[user6], ... } }
  • 29. “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 30. Row Key “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 31. Row Key Super Column “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 32. Row Key Colum Super Column n “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 33. Row Key Colum Super Column Value n “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 34. Show me EVERYONE IN WASHINGTON
  • 36. { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 38. Order Preserving Partitioner + Range Queries
  • 39. <ColumnFamily Name=”LocationUserIndexCF” CompareWith=”UTF8Type” />
  • 40. [state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... } [state1]/[city2]: {[name3]:[user3], [name4]:[user4], ... } [state2]/[city1]: {[name5]:[user5], [name6]:[user6], ... } ... [stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }
  • 41. “CA/San Francisco”: {”Jennifer”: “jen1982”} “MA/Boston”: {”Albert”: “albert”} “WA/Bellingham”: {”Jason”: “jason”} “WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”}
  • 42. Show me EVERYONE IN WASHINGTON
  • 44. { ”WA/Bellingham”: {”Jason”: “jason”}, “WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”} }
  • 46. (This part is up to you)
  • 48. <Keyspace Name="UserDb"> <ColumnFamily Name="Users" CompareWith="UTF8Type" /> <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" /> <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" /> <ReplicaPlacementStrategy> org.apache.cassandra.locator.RackUnawareStrategy </ReplicaPlacementStrategy> <ReplicationFactor>1</ReplicationFactor> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch> </Keyspace>

Editor's Notes