NewSQL - Deliverance from BASE and back to SQL and ACID 
There are a number of NewSQL products now on the market, such as VoltDB and Postgres-XL. These promise 
NoSQL performance and scalability but with ACID and relational concepts implemented with ANSI SQL. 
This session will cover why NoSQL came about, why it's had its day and why NewSQL will become the 
backbone of the Enterprise for OLTP and Analytics. 
Tony Rogerson, SQL Server MVP 
tonyrogerson@torver.net 
@tonyrogerson 
http://dataidol.com/tonyrogerson
Who am I? 
Freelance SQL Server professional and Data Specialist 
Fellow BCS, MSc in BI, PGCert in Data Science 
28 years of development and database experience, 22 of which SQL Server – starting out in 1986 
with VSAM, System W, Application System, DB2 and Oracle crossing over to Client/Server and 
SQL Server since 4.21a in 1993 
Awarded SQL Server MVP yearly since 97 
Founded UK SQL Server User Group back in ’99, founder member of DDD, SQL Bits, SQL Relay, 
SQL Santa 
Interested in commodity based distributed processing of Data (naturally!)
Agenda 
NoSQL 
◦ Why the need? 
◦ What products are available? 
Transactions 
◦ BASE 
◦ ACID 
SQL 
◦ What is today’s SQL capable of? 
◦ SQL Server performance – NoSQL required? 
NewSQL 
◦ SQL -> NoSQL -> NewSQL (distributed form of where we started) 
◦ Distributed Data and ACID 
Discussion
Not Only SQL (NoSQL) 
WHY THE NEED?
Why the Need? 
The year is 2001 and 
◦ It’s that Big Data thing…. 
◦ Mainstream Relational Databases (that use SQL) are scale up 
◦ More grunt required – buy a bigger box 
◦ SAN based storage is ridiculously expensive and complicated, heavy TCO 
Y2K + 1 
◦ Developers twiddling their thumbs ;) 
Web adoption accelerates 
◦ Google, Yahoo, Amazon and the like are born 
◦ MySQL does not scale – too inflexible 
◦ Up front costs of kit for projects/business that may fail – need elasticity 
http://www.tomshardware.co.uk/15-years-of-hard-drive-history-uk,review-1908-7.html
Products Available 
Varied – type of NoSQL database 
◦ Graph 
◦ Key-Value 
◦ Column store/Column Family 
◦ Document Store 
◦ Object 
◦ Relational but without SQL 
You name it and there is a product to do it
Performance Today [commodity] 
[Chart: 64KiB, 100% read; 100% sequential vs 100% random]
ACID 
Atomicity 
◦ The bounds of the transaction – everything within those bounds is a single unit of work 
◦ All or nothing 
Consistency 
◦ Data must reside in the correct Domain of values 
◦ Deferrable to the end of the unit of work 
Isolation 
◦ Changes are Isolated from other users 
◦ Other connections cannot update what you have updated/updating 
◦ Multi-Version Concurrency Control (MVCC) – snapshots 
◦ Locking 
Durability 
◦ In system failure your changes are still maintained – nothing is lost
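The "all or nothing" and "correct domain of values" points can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the demo used in the session: the accounts table, the CHECK constraint and the transfer helper are all assumed names invented for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single unit of work (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, then debit, so a failed debit leaves work to undo.
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint (consistency) rejected the debit; the credit was rolled back too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds: both updates are kept
failed = transfer(conn, 2, 1, 500)  # fails: neither update is kept
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer the balances are the same as before it started, which is the atomicity guarantee in miniature.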
BASE (Basically Available, Soft-state, Eventually Consistent) 
BASE is a transactional model of sorts (applied at the global level, rather than to individual transactions) 
Specific to Distributed database model 
Basically Available – all or some of the system is available 
Node 1 Node 2 Node 3
BASE (Basically Available, Soft-state, Eventually Consistent) 
Soft-state 
Eventually Consistent 
System may change over time [as replicas become up-to-date (consistent)] 
Node 1 Node 2 Node 3 
Insert value ‘A’
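A toy simulation of the three-node picture may help. It is a sketch under simplifying assumptions (no failures, no conflicting writes), and the Replica class and write function are hypothetical names, not any product's API.

```python
from collections import deque

class Replica:
    """One data node holding a copy of the data plus a queue of unapplied changes."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.inbox = deque()  # replicated writes received but not yet applied

    def apply_pending(self):
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value

def write(nodes, target, key, value):
    """Apply a write on one node and queue it asynchronously for the others."""
    nodes[target].data[key] = value
    for i, node in enumerate(nodes):
        if i != target:
            node.inbox.append((key, value))

nodes = [Replica("Node 1"), Replica("Node 2"), Replica("Node 3")]
write(nodes, 0, "k", "A")                   # insert value 'A' on Node 1

before = [n.data.get("k") for n in nodes]   # soft state: replicas disagree
for n in nodes:
    n.apply_pending()                       # replication catches up
after = [n.data.get("k") for n in nodes]    # eventually consistent: replicas agree
```

Between the write and the catch-up, a read from Node 2 or Node 3 does not see 'A'; that window is the price paid for availability.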
Eventual Consistency in SQL Server 
Asynchronous Availability Groups/Database Mirroring 
Replication 
Eventual / Causal Consistency 
◦ Eventual no good for order specific [and important] transactions 
◦ Like Merge replication 
◦ Causal: deliver messages in correct order [e.g. service broker] 
◦ Like Transactional Replication
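The "deliver messages in correct order" idea can be sketched with sequence numbers. This is a simplification (per-sender ordering only) and is not how Service Broker or Transactional Replication are implemented; OrderedInbox is a hypothetical class.

```python
class OrderedInbox:
    """Buffers out-of-order messages and releases them strictly in sequence order."""
    def __init__(self):
        self.next_seq = 1   # the sequence number we are waiting for
        self.pending = {}   # seq -> payload, for messages that arrived early

    def receive(self, seq, payload):
        self.pending[seq] = payload
        delivered = []
        # Release every message that is now contiguous with what was already delivered.
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

inbox = OrderedInbox()
log = []
# Messages arrive over the network in the wrong order.
for seq, payload in [(2, "debit 50"), (1, "open account"), (3, "close account")]:
    log.extend(inbox.receive(seq, payload))
```

The receiver applies the operations in the order they were issued, even though message 2 arrived first.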
ACID - Distributed 
2PC is clunky and doesn’t scale across many nodes 
Paxos – consensus protocol – scales better 
Remove the need for distributed ACID altogether 
2PC Transaction 
[Diagram: a Coordinator sends the INSERT to three Subordinates – all or nothing]
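The coordinator/subordinate exchange can be sketched in a few lines. This is a minimal in-memory illustration of two-phase commit, assuming no crashes, timeouts or recovery log; the Participant class and two_phase_commit function are hypothetical.

```python
class Participant:
    """A subordinate that votes in phase 1 and applies the change in phase 2."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.staged = None  # work prepared but not yet committed
        self.rows = []      # committed rows

    def prepare(self, row):
        if not self.can_commit:
            return False    # vote "no"
        self.staged = row
        return True         # vote "yes"

    def commit(self):
        self.rows.append(self.staged)
        self.staged = None

    def rollback(self):
        self.staged = None

def two_phase_commit(participants, row):
    # Phase 1: ask every participant to prepare and collect the votes.
    prepared = []
    for p in participants:
        if p.prepare(row):
            prepared.append(p)
        else:
            # Any "no" vote aborts the transaction everywhere.
            for q in prepared:
                q.rollback()
            return False
    # Phase 2: everyone voted "yes", so tell everyone to commit.
    for p in participants:
        p.commit()
    return True

good = [Participant("S1"), Participant("S2"), Participant("S3")]
committed = two_phase_commit(good, ("Tony",))

bad = [Participant("S1"), Participant("S2", can_commit=False), Participant("S3")]
aborted = two_phase_commit(bad, ("Tony",))
```

Every insert costs two round trips to every participant, which is one reason the approach gets clunky as the node count grows.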
Mixing BASE and ACID 
ACID applied on the local data node 
BASE remote
Relational 
Sets 
Tables with Rows x Columns 
Relational Theory dictates the row/column intersection is an Atomic value i.e. contains only a 
single value from the domain modelled for that column 
Chris Date: 
◦ Atomicity cannot really be defined as absolute in Normal Form 
◦ a column can contain “relational values” i.e. another table 
Normal Form – the process used to define the schema around the data being modelled
OldSQL roots 
Built for disk storage 
Built for single machine, scale-up 
Mature SQL language (decades of research) over the Relational Model 
SQL extensions to deal with unstructured data (freetext)
OldSQL today 
ACI [no Durability] 
In-Memory 
Modified design to work with Flash 
Still scale-up
SQL Server 
Delayed / No-Durability in SQL Server 2014 
In-Memory extensions 
Entity Attribute Value design combined with ColumnStore 
Sparse Columns / Column sets 
DEMOS
NewSQL 
OldSQL -> NoSQL -> NewSQL
Describe NewSQL 
NewSQL = OldSQL + Transparent_Data_Distribution + ACID 
Also – add in the bells and whistles for new tech 
◦ Flash 
◦ RAM 
◦ Processor cache improvements 
◦ Better parallelisation across local processor cores 
Basically -> Scale out with ACID
Latency in a Distributed environment 
[Diagram: servers and a SQL Server (columns FirstName, Surname, DOB) connected through a 1Gbit ethernet switch; data travel marked Slowest / Slower / Fastest] 
Query returns 20,000 rows, 558KiBytes of data
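As a rough worked figure for the numbers on this slide, the raw time on the wire can be estimated as below. It is a lower bound only: it assumes the full 1 Gbit/s is available and ignores protocol overhead, round trips and serialisation. The function name is hypothetical.

```python
def wire_time_ms(payload_bytes, link_bits_per_second):
    """Best-case transfer time in milliseconds for a payload over a link."""
    return payload_bytes * 8 / link_bits_per_second * 1000

payload = 558 * 1024     # 558 KiB result set, in bytes
gigabit = 1_000_000_000  # 1 Gbit/s ethernet

ms = wire_time_ms(payload, gigabit)  # roughly 4.6 ms per result set
```

A few milliseconds per query looks small, but it is paid on every call and is orders of magnitude slower than reading the same rows from local memory, which is the case for data locality.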
Reduce Latency – Data Locality 
[Diagram: servers connected through a 1Gbit ethernet switch; three of the servers run SQL Server locally]
Distributed SQL with ACID 
[Diagram: Server1 and Server2, each running SQL Server, connected through a 1Gbit ethernet switch] 
• 2 Phase Commit using DTC 
• High Latency 
• All or nothing 
BEGIN DISTRIBUTED TRAN 
INSERT Server3.pres_NEWSQL.dbo.people( ….. ) 
INSERT Server2.pres_NEWSQL.dbo.people( ….. ) 
INSERT Server1.pres_NEWSQL.dbo.people( ….. ) 
COMMIT TRAN 
Querying a Distributed Environment 
Financial Trading – Global position of the book 
TOP 10 customers 
Not easy (at speed) in an OLTP setting 
Network Switch 
N1 N2 N3 N4
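One common way to answer a "TOP 10 customers" question across nodes is scatter-gather: each node computes its local top N and a coordinator merges the candidates. The sketch below assumes every customer's rows live on exactly one node (sharded by customer); without that assumption the per-customer totals would need to be combined before ranking. All names and figures are made up for illustration.

```python
import heapq
from collections import Counter

def local_top_n(rows, n):
    """Runs on each node: total per customer, then keep the n largest."""
    totals = Counter()
    for customer, amount in rows:
        totals[customer] += amount
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])

def global_top_n(nodes, n):
    """Runs on the coordinator: merge the per-node candidates."""
    candidates = []
    for rows in nodes:
        candidates.extend(local_top_n(rows, n))
    return heapq.nlargest(n, candidates, key=lambda kv: kv[1])

nodes = [
    [("alice", 120), ("bob", 40), ("alice", 30)],  # N1
    [("carol", 200), ("dave", 10)],                # N2
    [("erin", 90), ("frank", 95)],                 # N3
    [("grace", 140)],                              # N4
]
top3 = global_top_n(nodes, 3)
```

Only n rows per node cross the network rather than the whole table, which is what keeps the query feasible at speed.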
Couple {Data, Processing} with {Machine-n}
Partitioning 
Chop big table up into “horizontal partitions” 
Partition key required (Hash, Modulo, Key range) 
Each partition is self-contained, binding rows by the partitioning key 
Access all data through a logical view over all partitions (local database) 
Table by table basis
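The three partition-key schemes can be sketched as small functions that map a key to a partition number. These are illustrative only; the function names, the CRC32 hash and the boundary values are assumptions, not what any particular product uses.

```python
import bisect
import zlib

def modulo_partition(key, partitions):
    """Modulo: works for integer keys such as an identity column."""
    return key % partitions

def hash_partition(key, partitions):
    """Hash: turn any string key into a number first, then take the modulo."""
    return zlib.crc32(key.encode("utf-8")) % partitions

def range_partition(key, upper_bounds):
    """Key range: upper_bounds is sorted; partition i holds keys <= upper_bounds[i].

    Keys above the last bound fall into one extra overflow partition.
    """
    return bisect.bisect_left(upper_bounds, key)

p_mod = modulo_partition(1234, 4)
p_hash = hash_partition("tonyrogerson@torver.net", 4)
p_range = range_partition(150, [100, 200, 300])
```

Hash and modulo spread rows evenly but scatter neighbouring keys; key range keeps neighbours together, which suits range scans but can create hot partitions.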
Shared Nothing 
Partitioning+ 
Each Shard is self-contained and has all the procs, meta-data and of course your partition of data 
Shard Key common to multiple tables, for example CustomerID, Email Address. 
Greater autonomy across the distributed database 
Seeing the entire database as a logical unit is more difficult – joining is a nightmare 
Node 1 
Node 2 
Node 3
Data Distribution using Hashing 
Distributed Database Cluster has fixed number of data nodes 
Your data is spread across the database cluster 
◦ 10 node cluster; each data item may reside on 3 nodes 
◦ Which 3 nodes? 
Data key is Hashed to a number – hashing algorithm is deterministic 
data-node = f( data-key ) 
◦ print ( checksum( 'All hale to the ale' ) * 1.) % 10 
◦ print ( checksum( 'And a glass of wine for the ladies' ) * 1.) % 10
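The same idea can be sketched outside T-SQL. The example below hashes a key to a first node and then, as an assumption made purely for illustration, places the other two copies on the next nodes in sequence; the slide does not say how the three nodes are chosen, and SHA-256 stands in for whatever hash a real product uses.

```python
import hashlib

def replica_nodes(data_key, node_count=10, replicas=3):
    """Deterministically map a key to `replicas` distinct nodes out of `node_count`."""
    digest = hashlib.sha256(data_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    first = int.from_bytes(digest[:8], "big") % node_count
    # Hypothetical placement rule: the copies go on successive nodes, wrapping round.
    return [(first + i) % node_count for i in range(replicas)]

nodes = replica_nodes("All hale to the ale")
again = replica_nodes("All hale to the ale")
```

Because the function is deterministic, any client can work out where a key lives without asking a central directory. Note that the result is always non-negative here, whereas a signed checksum taken modulo 10 can come out negative.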
Sharding Sync 
[Diagram labels: Apps, Logical database, Pick a node, Node 1, Node 2, Node 3, Full copy of data, Subset of data, Replication]
Postgres-XC 
Applications (issue SQL to coordinators) 
Coordinators (plans, 2PC transactions, knows about data distribution) 
Data Nodes 
GTM – Global Transaction Manager 
http://de.slideshare.net/PavanDeolasee/postgresxc-28475161
Combine Sharding + Replication 
Shard your big tables based on a hash (or something) around your business key e.g. Customer, 
EmailAddress etc. 
Replicate static tables.
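A small in-memory sketch shows why the combination works: the big table is sharded on the business key, the static lookup table is copied to every node, and the join is then answered on a single node. The table names, the three-node cluster and the CRC32 hash are assumptions made for the example.

```python
import zlib

NODE_COUNT = 3
# Each node holds its shard of customers plus a full copy of the static countries table.
nodes = [{"customers": {}, "countries": {}} for _ in range(NODE_COUNT)]

def shard_for(email):
    """Hash the business key to pick the owning node."""
    return zlib.crc32(email.encode("utf-8")) % NODE_COUNT

def insert_customer(email, country_code):
    """Sharded table: the row goes to exactly one node."""
    nodes[shard_for(email)]["customers"][email] = country_code

def replicate_country(code, name):
    """Static table: the row is copied to every node."""
    for node in nodes:
        node["countries"][code] = name

def customer_with_country(email):
    """The join is resolved entirely on the node that owns the customer."""
    node = nodes[shard_for(email)]
    code = node["customers"][email]
    return email, node["countries"][code]

replicate_country("GB", "United Kingdom")
insert_customer("a@example.com", "GB")
result = customer_with_country("a@example.com")
```

No cross-node traffic is needed for the lookup, at the cost of writing every change to the static table on all nodes.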
Discussion 
Tonyrogerson@torver.net 
@tonyrogerson 
http://dataidol.com/tonyrogerson

 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 

NewSQL - Deliverance from BASE and back to SQL and ACID

  • 1. NewSQL - Deliverance from BASE and back to SQL and ACID There are a number of NewSQL products now on the market such as VoltDB and Postgres-XL. These promise NoSQL performance and scalability but with ACID and relational concepts implemented with ANSI SQL. This session will cover why NoSQL came about, why it's had its day and why NewSQL will become the backbone of the Enterprise for OLTP and Analytics. Tony Rogerson, SQL Server MVP tonyrogerson@torver.net @tonyrogerson http://dataidol.com/tonyrogerson
  • 2. Who am I? Freelance SQL Server professional and Data Specialist Fellow BCS, MSc in BI, PGCert in Data Science 28 years of development and database experience, 22 of which SQL Server – starting out in 1986 with VSAM, System W, Application System, DB2 and Oracle crossing over to Client/Server and SQL Server since 4.21a in 1993 Awarded SQL Server MVP yearly since 97 Founded UK SQL Server User Group back in ’99, founder member of DDD, SQL Bits, SQL Relay, SQL Santa Interested in commodity based distributed processing of Data (naturally!)
  • 3. Agenda NoSQL ◦ Why the need? ◦ What products are available? Transactions ◦ BASE ◦ ACID SQL ◦ What is today’s SQL capable of? ◦ SQL Server performance – NoSQL required? NewSQL ◦ SQL -> NoSQL -> NewSQL (distributed form of where we started) ◦ Distributed Data and ACID Discussion
  • 4. Not Only SQL (NoSQL) WHY THE NEED?
  • 5. Why the Need? The year is 2001 and ◦ It’s that Big Data thing…. ◦ Mainstream Relational Databases (that use SQL) are scale up ◦ More grunt required – buy a bigger box ◦ SAN based storage is ridiculously expensive and complicated, heavy TCO Y2K + 1 ◦ Developers twiddling their thumbs ;) Web adoption accelerates ◦ Google, Yahoo, Amazon and the like are born ◦ MySQL does not scale – too inflexible ◦ Up front costs of kit for projects/business that may fail – need elasticity http://www.tomshardware.co.uk/15-years-of-hard-drive- history-uk,review-1908-7.html
  • 6. Products Available Varied – type of NoSQL database ◦ Graph ◦ Key-Value ◦ Column store/Column Family ◦ Document Store ◦ Object ◦ Relational but without SQL You name it and there is a product to do it
  • 7. Performance Today [commodity] 64KiB 100% Read 100% sequential 100% random
  • 8. ACID Atomicity ◦ The bounds of the transaction – everything within those bounds is a single unit of work ◦ All or nothing Consistency ◦ Data must reside in the correct Domain of values ◦ Deferrable to the end of the unit of work Isolation ◦ Changes are Isolated from other users ◦ Other connections cannot update what you have updated or are updating ◦ Multi-Version Concurrency Control (MVCC) – snapshots ◦ Locking Durability ◦ In system failure your changes are still maintained – nothing is lost
  • 9. BASE (Basically Available, Soft-state, Eventually Consistent) BASE is a transactional model of sorts (it applies at the global level, rather than to individual transactions) Specific to the distributed database model Basically Available – all or some of the system is available Node 1 Node 2 Node 3
  • 10. BASE (Basically Available, Soft-state, Eventually Consistent) Soft-state Eventually Consistent System may change over time [as replicas become up-to-date (consistent)] Node 1 Node 2 Node 3 Insert value ‘A’
  • 11. Eventual Consistency in SQL Server Asynchronous Availability Groups/Database Mirroring Replication Eventual / Causal Consistency ◦ Eventual no good for order specific [and important] transactions ◦ Like Merge replication ◦ Causal: deliver messages in correct order [e.g. service broker] ◦ Like Transactional Replication
  • 12. ACID - Distributed 2PC (two-phase commit) is clunky and doesn’t scale across many nodes Paxos – Consensus theory – scales better Remove the need for distributed ACID altogether 2PC Transaction Coordinator Subordinate INSERT Subordinate Subordinate All or nothing
  • 13. Mixing BASE and ACID ACID applied local data node BASE remote
  • 14. Relational Sets Tables with Rows x Columns Relational Theory dictates the row/column intersection is an Atomic value i.e. contains only a single value from the domain modelled for that column Chris Date: ◦ Atomicity cannot really be defined as absolute in Normal Form ◦ a column can contain “relational values” i.e. another table Normal Form – the process used to define the schema around the data being modelled
  • 15. OldSQL roots Built for disk storage Built for single machine, scale-up Mature SQL language (decades of research) over the Relational Model SQL extensions to deal with unstructured data (freetext)
  • 16. OldSQL today ACI [no Durability] In-Memory Modified design to work with Flash Still scale-up
  • 17. SQL Server Delayed / No-Durability in SQL Server 2014 In-Memory extensions Entity Attribute Value design combined with ColumnStore Sparse Columns / Column sets DEMOS
  • 18. NewSQL OLDSQL -> SQL -> NEWSQL
  • 19. Describe NewSQL NewSQL = OldSQL + Transparent_Data_Distribution + ACID Also – add in the bells and whistles for new tech ◦ Flash ◦ RAM ◦ Processor cache improvements ◦ Better parallelisation across local processor cores Basically -> Scale out with ACID
  • 20. Latency in a Distributed environment Server 1Gbit ethernet Server Switch Server Server Server Server SQL Server FirstName Surname DOB Query returns 20,000 rows 558KiBytes of data Slowest Slower Fastest (Data Travel)
  • 21. Reduce Latency – Data Locality Server SQL Server 1Gbit Server ethernet Switch Server Server Server Server Server SQL Server Server SQL Server
  • 22. Distributed SQL with ACID Server1 SQL Server 1Gbit ethernet Switch Server2 SQL Server • 2 Phase Commit using DTC • High Latency • All or nothing BEGIN DISTRIBUTED TRAN INSERT Server3.pres_NEWSQL.dbo.people( ….. ) INSERT Server2.pres_NEWSQL.dbo.people( ….. ) INSERT Server1.pres_NEWSQL.dbo.people( ….. ) COMMIT TRAN Server2 SQL Server
  • 23. Querying a Distributed Environment Financial Trading – Global position of the book TOP 10 customers Not easy (at speed) in an OLTP setting Network Switch N1 N2 N3 N4
  • 24. Couple {Data, Processing} with {Machine-n}
  • 25. Partitioning Chop big table up into “horizontal partitions” Partition key required (Hash, Modulo, Key range) Each partition is self-contained binding rows by the partitioning key Access all data through logical view over all partitions (local database) Table by table basis
  • 26. Shared Nothing Partitioning+ Each Shard is self-contained and has all the procs, meta-data and of course your partition of data Shard Key common to multiple tables, for example CustomerID, Email Address. Greater autonomy across the distributed database Seeing the entire database as a logical unit is more difficult – joining is a nightmare Node 1 Node 2 Node 3
  • 27. Data Distribution using Hashing Distributed Database Cluster has fixed number of data nodes Your data is spread across the database cluster ◦ 10 node cluster; each data item may reside on 3 nodes ◦ Which 3 nodes? Data key is Hashed to a number – hashing algorithm is deterministic data-node = f( data-key ) ◦ print ( checksum( 'All hale to the ale' ) * 1.) % 10 ◦ print ( checksum( 'And a glass of wine for the ladies' ) * 1.) % 10
  • 28. Sharding Sync LOGICAL DATABASE Pick a node Node 1 Node 2 Node 3 Full copy of data Subset of data Replication Apps
  • 29. Postgres-XC Applications (issue SQL to coordinators) Coordinators (plans, 2pc trans, knows about data distribution) Data Nodes GTM Global Transaction Manager http://de.slideshare.net/PavanDeolasee/postgresxc-28475161
  • 30. Combine Sharding + Replication Shard your big tables based on a hash (or something) around your business key e.g. Customer, EmailAddress etc. Replicate static tables.
  • 31. Discussion Tonyrogerson@torver.net @tonyrogerson http://dataidol.com/tonyrogerson
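
Slide 27's rule, data-node = f( data-key ), can be sketched in Python. This is a minimal illustration under stated assumptions, not how any particular product places data: the function name replica_nodes is hypothetical, SHA-256 stands in for the T-SQL checksum() call on the slide, and putting the extra copies on the next nodes round the cluster is an assumed placement rule.

```python
import hashlib

NODE_COUNT = 10      # fixed number of data nodes, as on slide 27
REPLICA_COUNT = 3    # each data item may reside on 3 nodes


def replica_nodes(data_key: str, node_count: int = NODE_COUNT,
                  replica_count: int = REPLICA_COUNT) -> list[int]:
    """Return the node numbers holding data_key (hypothetical placement rule)."""
    digest = hashlib.sha256(data_key.encode("utf-8")).digest()
    # Deterministic: the same key always hashes to the same first node.
    first = int.from_bytes(digest[:8], "big") % node_count
    # Assumption: the remaining copies go on the following nodes, wrapping round.
    return [(first + i) % node_count for i in range(replica_count)]


if __name__ == "__main__":
    for key in ("All hale to the ale", "And a glass of wine for the ladies"):
        print(key, "->", replica_nodes(key))
```

Because the hash is deterministic, any client or coordinator can work out where a row lives without a lookup table, which is the point of the slide. The catch is the fixed node count: change it and most keys map to different nodes.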

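The "all or nothing" behaviour of two-phase commit on slides 12 and 22 can be sketched as below. It is an in-memory toy under stated assumptions, not DTC or any real protocol implementation: the Subordinate class and two_phase_insert function are hypothetical names, and there is no network, logging or failure recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    """One participant node (hypothetical, in-memory only)."""
    name: str
    can_commit: bool = True
    staged: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def prepare(self, row) -> bool:
        # Phase 1: stage the work and vote yes or no.
        if self.can_commit:
            self.staged.append(row)
        return self.can_commit

    def commit(self) -> None:
        # Phase 2, every vote was yes: make the staged work permanent.
        self.committed.extend(self.staged)
        self.staged.clear()

    def rollback(self) -> None:
        # Phase 2, at least one vote was no: discard the staged work.
        self.staged.clear()


def two_phase_insert(subordinates, row) -> bool:
    """Coordinator: the row ends up on every subordinate or on none."""
    votes = [s.prepare(row) for s in subordinates]  # first round trip per node
    if all(votes):
        for s in subordinates:
            s.commit()                              # second round trip per node
        return True
    for s in subordinates:
        s.rollback()
    return False


if __name__ == "__main__":
    nodes = [Subordinate("Server1"), Subordinate("Server2"), Subordinate("Server3")]
    print(two_phase_insert(nodes, ("Tony", "Rogerson")))
```

Even in this toy the coordinator talks to every subordinate twice and waits for the slowest one each time, which is where the "High Latency" bullet on slide 22 comes from.
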
Editor's notes

  1. GTM keeps simple state info (not a database itself) GXIDs (Global Transaction IDs) – across cluster MVCC One active GTM per cluster, though standbys are available
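
The note above can be read as "one small service hands out cluster-wide transaction IDs so that every node makes the same MVCC visibility decision". A minimal sketch of that idea follows. It is not Postgres-XC code: the class and function names are hypothetical, and rolled-back transactions, standbys and networking are all ignored.

```python
class GlobalTransactionManager:
    """Toy GTM: keeps only simple state, no table data."""

    def __init__(self) -> None:
        self._next_gxid = 1            # next Global Transaction ID to issue
        self._active: set[int] = set() # GXIDs still in flight

    def begin(self) -> int:
        gxid = self._next_gxid
        self._next_gxid += 1
        self._active.add(gxid)
        return gxid

    def end(self, gxid: int) -> None:
        self._active.discard(gxid)

    def snapshot(self) -> tuple[int, frozenset[int]]:
        # Everything a reader needs: the next unissued GXID and the in-flight set.
        return self._next_gxid, frozenset(self._active)


def is_visible(row_gxid: int, snapshot: tuple[int, frozenset[int]]) -> bool:
    """A row version is visible if its writer had finished when the snapshot was taken."""
    next_gxid, active = snapshot
    return row_gxid < next_gxid and row_gxid not in active


if __name__ == "__main__":
    gtm = GlobalTransactionManager()
    t1, t2 = gtm.begin(), gtm.begin()
    gtm.end(t1)
    snap = gtm.snapshot()
    print(is_visible(t1, snap), is_visible(t2, snap))
```

Because every data node evaluates the same snapshot, a reader sees a consistent cut of the cluster without the nodes having to lock each other.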