My experience with Cassandra concepts
I recently read about Cassandra concepts and internals to understand how it works and why
it is suited to handling large volumes of data. This is a very interesting and complex
subject, and I have merely scratched the surface so far.
Cassandra is an open source, scalable, and highly available "NoSQL" distributed database
management system from Apache. It is classified under the Column-Family NoSQL
category. It was initially developed at Facebook, which released it as open source, and it
later became an Apache project. Its core design draws on Amazon’s Dynamo (for the
distribution model) and Google’s Bigtable (for the data model).
Its support for dynamic columns and distributed counters addresses a major problem: most
statistics can be aggregated as the data is written, rather than with map/reduce at a
later stage.
Another nice property of Cassandra is that it can keep much of its data in cache (if
given enough RAM).
Cassandra Data Model
The Cassandra data model consists of a keyspace (analogous to a database), column
families (analogous to tables in the relational model), keys and columns. Here’s what the
basic Cassandra table (also known as a column family) structure looks like:

Figure 1-1: Structure of a super column family in Cassandra

Don’t think of a relational table
Instead, think of a nested, sorted map data structure.
The following relational model analogy is often used to introduce Cassandra to newcomers:

Figure 1-2: Relational vs. Cassandra Model
This analogy helps with the transition from the relational to the non-relational world. But
don’t use it while designing Cassandra column families. Instead, think of the Cassandra
column family as a map of a map: an outer map keyed by a row key, and an inner map
keyed by a column key. Both maps are sorted.
SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>
Why?
A nested sorted map is a more accurate analogy than a relational table, and will help you
make the right decisions about your Cassandra data model.
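
The nested map can be sketched directly with Java's `TreeMap`. This is only an in-memory illustration of the analogy, not how Cassandra stores data; the class name, the `put` helper, and the sample row and column keys are all made up for the example.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: an in-memory model of one column family,
// not Cassandra's storage engine. All names here are hypothetical.
class NestedSortedMapSketch {

    // Outer map: row key -> row. Inner map: column key -> column value.
    static SortedMap<String, SortedMap<String, String>> buildColumnFamily() {
        SortedMap<String, SortedMap<String, String>> columnFamily = new TreeMap<>();
        put(columnFamily, "user:bob", "email", "bob@example.com");
        put(columnFamily, "user:alice", "name", "Alice");
        put(columnFamily, "user:alice", "email", "alice@example.com");
        return columnFamily;
    }

    // Insert one column, creating the row on first write.
    static void put(SortedMap<String, SortedMap<String, String>> columnFamily,
                    String rowKey, String columnKey, String value) {
        columnFamily.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnKey, value);
    }

    public static void main(String[] args) {
        SortedMap<String, SortedMap<String, String>> cf = buildColumnFamily();
        // Lookup by row key, then by column key.
        System.out.println(cf.get("user:alice").get("email"));
        // Both levels iterate in key order, regardless of insertion order.
        System.out.println(cf);
    }
}
```

Note that rows do not need to share the same set of columns: each row is just its own inner map.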

Figure 1-3: Cassandra Data Model
How?
A map gives efficient key lookup, and the sorted nature gives efficient scans. In
Cassandra, we can use row keys and column keys to do efficient lookups and range
scans.
The number of column keys is unbounded. In other words, you can have wide rows.
A column key can itself hold the data. In other words, you can have a valueless column.
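
These three points can be illustrated on a single inner map. The sketch below is an assumption-laden toy: the sensor-reading and follower rows, the timestamp format, and the method names are invented for the example, and `TreeMap.subMap` stands in for a column range scan.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: one row modelled as a sorted map of columns.
// Row contents and method names are hypothetical.
class WideRowSketch {

    // A wide row: one column per reading, column key = timestamp.
    // Nothing limits how many columns a row can accumulate.
    static SortedMap<String, String> buildReadingsRow() {
        SortedMap<String, String> row = new TreeMap<>();
        row.put("2013-01-01T10:00", "21.5");
        row.put("2013-01-01T09:00", "20.9");
        row.put("2013-01-01T11:00", "22.1");
        row.put("2013-01-01T12:00", "22.4");
        return row;
    }

    // Range scan over column keys: 'from' inclusive, 'to' exclusive.
    static SortedMap<String, String> scan(SortedMap<String, String> row,
                                          String from, String to) {
        return row.subMap(from, to);
    }

    // Valueless columns: the column key (a follower's id) is the data,
    // so the column value is left empty.
    static SortedMap<String, String> buildFollowersRow() {
        SortedMap<String, String> row = new TreeMap<>();
        row.put("carol", "");
        row.put("bob", "");
        return row;
    }

    public static void main(String[] args) {
        System.out.println(scan(buildReadingsRow(), "2013-01-01T10:00", "2013-01-01T12:00"));
        System.out.println(buildFollowersRow().keySet());
    }
}
```

Because the column keys are sorted, the scan only touches the contiguous slice between the two bounds instead of the whole row.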

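Range scans on row keys, however, are only possible when rows are placed in the cluster by
an order-preserving partitioner, which is rarely used. With the usual hash-based
partitioning, think of the outer map as unsorted:
Map<RowKey, SortedMap<ColumnKey, ColumnValue>>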
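
The signature above has a plain `Map` on the outside. A minimal sketch of what that implies, assuming rows are placed by a hash-based partitioner so that row keys have no useful order: rows are fetched by exact key, while columns inside a row stay sorted. The class name and sample data are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: unsorted rows (HashMap) holding sorted columns (TreeMap).
class UnsortedRowsSketch {

    static Map<String, SortedMap<String, String>> buildColumnFamily() {
        Map<String, SortedMap<String, String>> cf = new HashMap<>();
        cf.computeIfAbsent("user:bob", k -> new TreeMap<>()).put("name", "Bob");
        cf.computeIfAbsent("user:alice", k -> new TreeMap<>()).put("name", "Alice");
        cf.computeIfAbsent("user:alice", k -> new TreeMap<>()).put("email", "alice@example.com");
        return cf;
    }

    public static void main(String[] args) {
        Map<String, SortedMap<String, String>> cf = buildColumnFamily();
        // Exact-key row lookup works; there is no ordered slice of rows to ask for.
        SortedMap<String, String> alice = cf.get("user:alice");
        // Columns within the row are still in key order.
        System.out.println(alice);
    }
}
```

In practice this means a row key should be something you can compute and look up exactly, with any ordering you need pushed into the column keys.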
Conclusion
It’s important to think carefully about your data and your technology choices, and
sometimes it can be difficult to do that in a data vacuum. Cassandra, Hive, and Hadoop are
widely regarded as good tools for many large-scale data challenges.
Your mileage may vary, but feel free to ask us questions in the comments!
