SlideShare une entreprise Scribd logo
1  sur  17
Télécharger pour lire hors ligne
The Whys of NoSQL
1 Jargon Galore
2 Schema
3 Modeling and Internals
4 Deployment
5 Conclusion
2© 2015. All Rights Reserved.
©2015 DataStax Confidential. Do not distribute without consent.
 3
SQL Jargon
©2015 DataStax Confidential. Do not distribute without consent.
 4
NoSQL Noise?
Schema
©2015 DataStax Confidential. Do not distribute without consent.
Rigid
Schema
Schema
Free
Schema
on read
Schema
Easy to change
In flexible Writes are schema free,
reads are freaking slow
Reads/Writes are schema aware
Schema changes are O(1) operations
BLOBs
Too Slow
Optimized for Agility of change when needed, not theoretical extremes
©2015 DataStax Confidential. Do not distribute without consent.
 6
Normalization, Joins, Referential Integrity
Database normalization is the process of
organizing the columns (attributes) and
tables (relations) of a relational database
to minimize data redundancy.
Referential integrity is a property of data
which, when satisfied, requires every
value of one column of a table to exist
as a value of another column in a
different table.A JOIN is a means for combining
fields from two tables (or more) by
using values common to each.
Source - https://en.wikipedia.org/
©2015 DataStax Confidential. Do not distribute without consent.
 7
Not all Data Access is equal
1:168K random vs.
sequential
1:10 random vs.
sequential
Source - https://queue.acm.org/detail.cfm?id=1563874
©2015 DataStax Confidential. Do not distribute without consent.
 8
Disk Density
Source http://silvertonconsulting.com/blog/2010/04/22/save-the-planet-buy-fatter-disks-and-flash/#sthash.sh2nwqtX.dpbs
©2015 DataStax Confidential. Do not distribute without consent.
 9
$0.01
$0.10
$1.00
$10.00
$100.00
$1,000.00
$10,000.00
$100,000.00
$1,000,000.00
201420132010200520001995199019851980
HDD Price / GB
Minimize Data
Redundancy?
Disk Price / GB
OS Cache
C* Read and Write paths
©2015 DataStax Confidential. Do not distribute without consent.
Memtable 1 Memtable 2 Memtable N
SSTable 1 SSTable 2 SSTable N
Commit Log
Persistent
Storage
Off Heap
In Process Memory
Reads (memtable + N SSTables where N >= 1)
Mandatory Flush
Writes
Max # of SSTables = N
(based on compaction)
Creation of new memtable during flush operation (cleanup
tombstones, cleanup token ranges, etc.)
Time (memtable_flush_in_ms controls the frequency)
Accounting
SSTable
Compacted
RANDOM ACCESS
SEQUENTIAL ACCESS
Execution Engine
©2015 DataStax Confidential. Do not distribute without consent.
Key takeaways
©2015 DataStax Confidential. Do not distribute without consent.
Optimal utilization of physical resources (random
access, sequential IO and CPU)
No Read before Write (well mostly!)
Plan for Compaction (like commercial paper, you need
a regular pay back)
De-Normalize for optimal application response (use
2NF instead of 3NF)
Deployment Semantics
©2014 DataStax Confidential. Do not distribute without consent.
R/W R
Single BoxDR
GR
ScaleUpby.
Sharding
Replication
GR + DR
San
Francisco
New York
Stockholm
DC1 DC2
Linear Scaling
©2015 DataStax Confidential. Do not distribute without consent.
http://www.datastax.com/apache-cassandra-leads-nosql-benchmark
End Point Report Excerpt: Balanced Read/Write YCSB Test
So what's the catch?
©2015 DataStax Confidential. Do not distribute without consent.
©2015 DataStax Confidential. Do not distribute without consent.
 16
Conclusion
Best in class performance, backed by physics

Enables pragmatic business agility,

Delivering delightful customer experience,

Always on, Linear Scale architecture delivering optimal ROI
Thank you

Contenu connexe

Tendances

Free Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s ApproachFree Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s Approach
DataWorks Summit
 

Tendances (20)

Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei VaranovichLambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
 
Building the Next-gen Digital Meter Platform for Fluvius
Building the Next-gen Digital Meter Platform for FluviusBuilding the Next-gen Digital Meter Platform for Fluvius
Building the Next-gen Digital Meter Platform for Fluvius
 
Data Ingestion Engine
Data Ingestion EngineData Ingestion Engine
Data Ingestion Engine
 
Scaling Your Skillset with Your Data with Jarrett Garcia (Nielsen)
Scaling Your Skillset with Your Data with Jarrett Garcia (Nielsen)Scaling Your Skillset with Your Data with Jarrett Garcia (Nielsen)
Scaling Your Skillset with Your Data with Jarrett Garcia (Nielsen)
 
Cassandra-as-a-Service
Cassandra-as-a-ServiceCassandra-as-a-Service
Cassandra-as-a-Service
 
Logging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations TrifectaLogging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations Trifecta
 
Migrating to Cassandra
Migrating to CassandraMigrating to Cassandra
Migrating to Cassandra
 
Dive Into Data Lakes
Dive Into Data LakesDive Into Data Lakes
Dive Into Data Lakes
 
10 Reasons Why Your SAP Applications Belong on NetApp
10 Reasons Why Your SAP Applications Belong on NetApp10 Reasons Why Your SAP Applications Belong on NetApp
10 Reasons Why Your SAP Applications Belong on NetApp
 
How KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring SolutionHow KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring Solution
 
DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
 
Managing and Deploying High Performance Computing Clusters using Windows HPC ...
Managing and Deploying High Performance Computing Clusters using Windows HPC ...Managing and Deploying High Performance Computing Clusters using Windows HPC ...
Managing and Deploying High Performance Computing Clusters using Windows HPC ...
 
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Database Camp 2016 @ United Nations, NYC - Amir Orad, CEO, Sisense
Database Camp 2016 @ United Nations, NYC - Amir Orad, CEO, SisenseDatabase Camp 2016 @ United Nations, NYC - Amir Orad, CEO, Sisense
Database Camp 2016 @ United Nations, NYC - Amir Orad, CEO, Sisense
 
Machine Learning Deep Dive
Machine Learning Deep DiveMachine Learning Deep Dive
Machine Learning Deep Dive
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
Free Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s ApproachFree Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s Approach
 

En vedette

TupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and SparkTupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
DataStax Academy
 

En vedette (13)

Petabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise StackPetabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise Stack
 
IBM Spark Technology Center: Real-time Advanced Analytics and Machine Learnin...
IBM Spark Technology Center: Real-time Advanced Analytics and Machine Learnin...IBM Spark Technology Center: Real-time Advanced Analytics and Machine Learnin...
IBM Spark Technology Center: Real-time Advanced Analytics and Machine Learnin...
 
DataStax: Setting Your Database Management on Autopilot with OpsCenter
DataStax: Setting Your Database Management on Autopilot with OpsCenterDataStax: Setting Your Database Management on Autopilot with OpsCenter
DataStax: Setting Your Database Management on Autopilot with OpsCenter
 
DataStax: Ramping up Cassandra QA
DataStax: Ramping up Cassandra QADataStax: Ramping up Cassandra QA
DataStax: Ramping up Cassandra QA
 
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra MigrationInfosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
 
Reltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with CassandraReltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with Cassandra
 
Glassbeam: Ad-hoc Analytics on Internet of Complex Things with Apache Cassand...
Glassbeam: Ad-hoc Analytics on Internet of Complex Things with Apache Cassand...Glassbeam: Ad-hoc Analytics on Internet of Complex Things with Apache Cassand...
Glassbeam: Ad-hoc Analytics on Internet of Complex Things with Apache Cassand...
 
DataStax: What's New in Apache TinkerPop - the Graph Computing Framework
DataStax: What's New in Apache TinkerPop - the Graph Computing FrameworkDataStax: What's New in Apache TinkerPop - the Graph Computing Framework
DataStax: What's New in Apache TinkerPop - the Graph Computing Framework
 
Target: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data ModelingTarget: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data Modeling
 
DataStax: Datastax Enterprise - The Multi-Model Platform
DataStax: Datastax Enterprise - The Multi-Model PlatformDataStax: Datastax Enterprise - The Multi-Model Platform
DataStax: Datastax Enterprise - The Multi-Model Platform
 
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraCisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
 
Timeli: Believing Cassandra: Our Big-Data Journey To Enlightenment under the ...
Timeli: Believing Cassandra: Our Big-Data Journey To Enlightenment under the ...Timeli: Believing Cassandra: Our Big-Data Journey To Enlightenment under the ...
Timeli: Believing Cassandra: Our Big-Data Journey To Enlightenment under the ...
 
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and SparkTupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
 

Similaire à DataStax: The Whys of NoSQL

Building scalable application with sql server
Building scalable application with sql serverBuilding scalable application with sql server
Building scalable application with sql server
Chris Adkin
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
Stephen Rose
 
Mohan Testing
Mohan TestingMohan Testing
Mohan Testing
smittal81
 

Similaire à DataStax: The Whys of NoSQL (20)

Building scalable application with sql server
Building scalable application with sql serverBuilding scalable application with sql server
Building scalable application with sql server
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
 
Rac Optimisation For Siebel Crm, Doag 2008
Rac Optimisation For Siebel Crm, Doag 2008Rac Optimisation For Siebel Crm, Doag 2008
Rac Optimisation For Siebel Crm, Doag 2008
 
Delta: Building Merge on Read
Delta: Building Merge on ReadDelta: Building Merge on Read
Delta: Building Merge on Read
 
Speeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachSpeeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT Approach
 
Large Scale SQL Considerations for SharePoint Deployments
Large Scale SQL Considerations for SharePoint DeploymentsLarge Scale SQL Considerations for SharePoint Deployments
Large Scale SQL Considerations for SharePoint Deployments
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
 
Modeling data for scalable, ad hoc analytics
Modeling data for scalable, ad hoc analyticsModeling data for scalable, ad hoc analytics
Modeling data for scalable, ad hoc analytics
 
Novedades SQL Server 2014
Novedades SQL Server 2014Novedades SQL Server 2014
Novedades SQL Server 2014
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728
 
Mohan Testing
Mohan TestingMohan Testing
Mohan Testing
 
All the ways to (not) break your database in production
All the ways to (not) break your database in productionAll the ways to (not) break your database in production
All the ways to (not) break your database in production
 
Using SAS GRID v 9 with Isilon F810
Using SAS GRID v 9 with Isilon F810Using SAS GRID v 9 with Isilon F810
Using SAS GRID v 9 with Isilon F810
 
Intro to Azure SQL database
Intro to Azure SQL databaseIntro to Azure SQL database
Intro to Azure SQL database
 
Hekaton introduction for .Net developers
Hekaton introduction for .Net developersHekaton introduction for .Net developers
Hekaton introduction for .Net developers
 
NoSQL with Microsoft Azure
NoSQL with Microsoft AzureNoSQL with Microsoft Azure
NoSQL with Microsoft Azure
 
GIDS 2016 Understanding and Building No SQLs
GIDS 2016 Understanding and Building No SQLsGIDS 2016 Understanding and Building No SQLs
GIDS 2016 Understanding and Building No SQLs
 
Designing data intensive applications
Designing data intensive applicationsDesigning data intensive applications
Designing data intensive applications
 

Plus de DataStax Academy

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

Plus de DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

DataStax: The Whys of NoSQL

  • 1. The Whys of NoSQL
  • 2. 1 Jargon Galore 2 Schema 3 Modeling and Internals 4 Deployment 5 Conclusion 2© 2015. All Rights Reserved.
  • 3. ©2015 DataStax Confidential. Do not distribute without consent. 3 SQL Jargon
  • 4. ©2015 DataStax Confidential. Do not distribute without consent. 4 NoSQL Noise?
  • 5. Schema ©2015 DataStax Confidential. Do not distribute without consent. Rigid Schema Schema Free Schema on read Schema Easy to change In flexible Writes are schema free, reads are freaking slow Reads/Writes are schema aware Schema changes are O(1) operations BLOBs Too Slow Optimized for Agility of change when needed, not theoretical extremes
  • 6. ©2015 DataStax Confidential. Do not distribute without consent. 6 Normalization, Joins, Referential Integrity Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. Referential integrity is a property of data which, when satisfied, requires every value of one column of a table to exist as a value of another column in a different table.A JOIN is a means for combining fields from two tables (or more) by using values common to each. Source - https://en.wikipedia.org/
  • 7. ©2015 DataStax Confidential. Do not distribute without consent. 7 Not all Data Access is equal 1:168K random vs. sequential 1:10 random vs. sequential Source - https://queue.acm.org/detail.cfm?id=1563874
  • 8. ©2015 DataStax Confidential. Do not distribute without consent. 8 Disk Density Source http://silvertonconsulting.com/blog/2010/04/22/save-the-planet-buy-fatter-disks-and-flash/#sthash.sh2nwqtX.dpbs
  • 9. ©2015 DataStax Confidential. Do not distribute without consent. 9 $0.01 $0.10 $1.00 $10.00 $100.00 $1,000.00 $10,000.00 $100,000.00 $1,000,000.00 201420132010200520001995199019851980 HDD Price / GB Minimize Data Redundancy? Disk Price / GB
  • 10. OS Cache C* Read and Write paths ©2015 DataStax Confidential. Do not distribute without consent. Memtable 1 Memtable 2 Memtable N SSTable 1 SSTable 2 SSTable N Commit Log Persistent Storage Off Heap In Process Memory Reads (memtable + N SSTables where N >= 1) Mandatory Flush Writes Max # of SSTables = N (based on compaction) Creation of new memtable during flush operation (cleanup tombstones, cleanup token ranges, etc.) Time (memtable_flush_in_ms controls the frequency) Accounting SSTable Compacted RANDOM ACCESS SEQUENTIAL ACCESS
  • 11. Execution Engine ©2015 DataStax Confidential. Do not distribute without consent.
  • 12. Key takeaways ©2015 DataStax Confidential. Do not distribute without consent. Optimal utilization of physical resources (random access, sequential IO and CPU) No Read before Write (well mostly!) Plan for Compaction (like commercial paper, you need a regular pay back) De-Normalize for optimal application response (use 2NF instead of 3NF)
  • 13. Deployment Semantics ©2014 DataStax Confidential. Do not distribute without consent. R/W R Single BoxDR GR ScaleUpby. Sharding Replication GR + DR San Francisco New York Stockholm DC1 DC2
  • 14. Linear Scaling ©2015 DataStax Confidential. Do not distribute without consent. http://www.datastax.com/apache-cassandra-leads-nosql-benchmark End Point Report Excerpt: Balanced Read/Write YCSB Test
  • 15. So what's the catch? ©2015 DataStax Confidential. Do not distribute without consent.
  • 16. ©2015 DataStax Confidential. Do not distribute without consent. 16 Conclusion Best in class performance, backed by physics Enables pragmatic business agility, Delivering delightful customer experience, Always on, Linear Scale architecture delivering optimal ROI