The document discusses the evolution of online transaction processing (OLTP) databases and introduces NewSQL as a solution. It argues that traditional OLTP databases (OldSQL) are too slow and do not scale for modern high-volume applications (New OLTP), while NoSQL databases provide performance but lack consistency guarantees and a SQL interface. NewSQL databases preserve SQL and ACID while getting the performance and scalability that New OLTP needs from a new architecture. VoltDB is presented as an example NewSQL database, reported as 45x faster than a popular OldSQL DBMS on TPC-C and projected to scale to 1.6 million complex transactions per second.
NewSQL vs NoSQL for New OLTP
1. the NewSQL database you’ll never outgrow
OldSQL vs. NoSQL vs. NewSQL
on New OLTP
Michael Stonebraker, CTO
VoltDB, Inc.
2. Old OLTP
Remember how we used to buy airplane tickets in the
1980s
+ By telephone
+ Through an intermediary (professional terminal operator)
Commerce at the speed of the intermediary
In 1985, 1,000 transactions per second was considered
an incredible stretch goal!!!!
+ HPTS (1985)
VoltDB 2
3. How has OLTP Changed in 25 Years?
The internet
+ Client is no longer a professional terminal operator
+ Instead Aunt Martha is using the web herself
+ Sends volume through the roof
4. How has OLTP Changed in 25 Years?
PDAs and sensors
+ Your cell phone is a transaction originator
+ Everything is being geo-positioned by sensors (marathon
runners, your car, ….)
+ Sends volume through the roof
5. How has OLTP Changed in 25 Years?
The definitions
+ “Online” no longer exclusively means a human operator
— The oncoming data tsunami is often device and system-generated
+ “Transaction” now transcends the traditional business
transaction
— High-throughput ACID write operations are a new requirement
+ “HA” and “durability” are now core database requirements
6. Example NewOLTP Use Cases
Data source, with its high-frequency and lower-frequency operations:
Capital markets
+ High-frequency: write/index all trades, store tick data
+ Lower-frequency: show consolidated risk across traders
Call initiation request
+ High-frequency: real-time authorization
+ Lower-frequency: fraud detection/analysis
Inbound HTTP requests
+ High-frequency: visitor logging, analysis, alerting
+ Lower-frequency: traffic pattern analytics
Online game
+ High-frequency: rank scores (defined intervals, player “bests”)
+ Lower-frequency: leaderboard lookups
Real-time ad trading systems
+ High-frequency: match form factor, placement criteria, bid/ask
+ Lower-frequency: report ad performance from exhaust stream
Mobile device location sensor
+ High-frequency: location updates, QoS, transactions
+ Lower-frequency: analytics on transactions
7. New OLTP Challenges
New OLTP and You
You need to ingest the firehose in real
time
You need to process, validate, enrich
and respond in real-time
You often need real-time analytics
8. Solution Choices
OldSQL
+ Legacy RDBMS vendors
NoSQL
+ Give up SQL and ACID for performance
NewSQL
+ Preserve SQL and ACID
+ Get performance from a new architecture
9. OldSQL
Traditional SQL vendors (the “elephants”)
+ Code lines dating from the 1980s
+ “bloatware”
+ Not very good at anything
— Can be beaten by at least an order of magnitude in every vertical
market I know of
+ Mediocre performance on New OLTP
— At low velocity it doesn’t matter
— Otherwise you get to tear your hair out
10. Reality Check
TPC-C CPU cycles
On the Shore DBMS prototype
Elephants should be similar
11. The Elephants
Are slow because they spend all of their time on
overhead!!!
+ Not on useful work
Would have to re-architect their legacy code to do
better
12. To Go a Lot Faster You Have to……
Focus on overhead
+ Better B-trees affect only 4% of the path length
Get rid of ALL major sources of overhead
+ Main memory deployment – gets rid of buffer pool
— Leaving other 75% of overhead intact
— i.e. win is 25%
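The arithmetic behind these bullets can be sketched in a few lines. This is an illustration only: it assumes the slide's 4% and 25% figures are fractions of total path length, and the function name `speedup` is hypothetical.

```python
def speedup(fraction_removed: float) -> float:
    """Speedup from eliminating a given fraction of the total path length."""
    if not 0 <= fraction_removed < 1:
        raise ValueError("fraction_removed must be in [0, 1)")
    return 1.0 / (1.0 - fraction_removed)


# Figures from the slide, read as fractions of total path length (assumption).
btree_only = speedup(0.04)        # B-tree work made free, nothing else changed
buffer_pool_only = speedup(0.25)  # main memory: buffer pool overhead removed

print(f"perfect B-trees only: {btree_only:.2f}x")
print(f"no buffer pool only:  {buffer_pool_only:.2f}x")
```

Under that reading, even a zero-cost B-tree buys about 1.04x, and removing the buffer pool alone buys about 1.33x (a 25% shorter path), which is why the slide insists on removing all major sources of overhead together.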
14. Give Up SQL?
Compiler translates SQL at compile time into a
sequence of low level operations
Similar to what the NoSQL products make you
program in your application
30 years of RDBMS experience
+ Hard to beat the compiler
+ High level languages are good (data independence, less code, …)
+ Stored procedures are good!
— One round trip from app to DBMS rather than one round trip
per record
— Move the code to the data, not the other way around
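The round-trip argument for stored procedures can be made concrete with a toy model. This is a minimal sketch, not any real DBMS client API: `FakeServer`, `call_procedure`, and the two `add_bonus_*` functions are hypothetical names, and the "server" is just an in-process object that counts how often the client crosses the boundary.

```python
class FakeServer:
    """In-process stand-in for a DBMS that counts client round trips."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.rows[key]

    def put(self, key, value):
        self.round_trips += 1
        self.rows[key] = value

    def call_procedure(self, proc, *args):
        # The whole procedure body runs server-side: one round trip in total.
        self.round_trips += 1
        return proc(self.rows, *args)


def add_bonus_client_side(server, keys, bonus):
    # Logic lives in the application: here a read and a write per record.
    for key in keys:
        server.put(key, server.get(key) + bonus)


def add_bonus_procedure(rows, keys, bonus):
    # Same logic written as a stored procedure that runs next to the data.
    for key in keys:
        rows[key] += bonus
    return len(keys)


keys = ["a", "b", "c"]

client = FakeServer({k: 100 for k in keys})
add_bonus_client_side(client, keys, 10)

stored = FakeServer({k: 100 for k in keys})
stored.call_procedure(add_bonus_procedure, keys, 10)

print("client-side round trips:", client.round_trips)
print("stored-procedure round trips:", stored.round_trips)
```

Both runs leave the same data behind; the client-side version pays per-record round trips, while the procedure version pays one regardless of how many records it touches.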
15. Give Up ACID?
If you have multi-record transactions
+ Need ACID or have to program it in user-level code
+ Rolling your own is messy, brittle, error-prone
Compensating transactions = pain
+ They’re people intensive and expensive
+ They cause customer/user dissatisfaction
+ Non-commutative single-record
transactions (e.g. buy an item) simply fail
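Why commutativity matters can be shown with a small sketch. It assumes a hypothetical setup in which two replicas receive the same operations in different orders, as can happen when no transaction ordering is enforced; all names here are illustrative.

```python
def apply_all(state, ops):
    """Apply a sequence of operations to a copy of the state."""
    state = dict(state)
    for op in ops:
        op(state)
    return state


def add_view(state):
    # Commutative: increments give the same result in any order.
    state["views"] += 1


def buy(buyer):
    def op(state):
        # Non-commutative: succeeds only while stock remains.
        if state["stock"] > 0:
            state["stock"] -= 1
            state["owner"] = buyer
    return op


start = {"views": 0, "stock": 1, "owner": None}
alice, bob = buy("alice"), buy("bob")

# Same four operations, delivered in different orders to two replicas.
replica_1 = apply_all(start, [add_view, alice, add_view, bob])
replica_2 = apply_all(start, [bob, add_view, add_view, alice])

print(replica_1)
print(replica_2)
```

The view counter agrees on both replicas, but each replica believes a different buyer got the last item. Repairing that disagreement after the fact is the compensating-transaction work the slide describes.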
16. NoSQL Summary
Appropriate for non-transactional systems
Appropriate for single record transactions that are
commutative
Not a good fit for New OLTP
Use the right tool for the job
Interesting …
Two recently-proposed NoSQL
language standards – CQL and
UnQL – are amazingly similar to
(you guessed it!) SQL
17. NewSQL
SQL
ACID
Performance and scalability through modern
innovative software architecture
18. NewSQL
Needs something other than traditional record level
locking (1st big source of overhead)
+ timestamp order
+ MVCC
+ Your good idea goes here
19. NewSQL
Needs a solution to buffer pool overhead (2nd big
source of overhead)
+ Main memory (at least for data that is not cold)
+ Some other way to reduce buffer pool cost
20. NewSQL
Needs a solution to latching for shared data
structures (3rd big source of overhead)
+ Some innovative use of B-trees
+ Single-threading
+ Your good idea goes here
21. NewSQL
Needs a solution to write-ahead logging (4th big
source of overhead)
+ Obvious answer is built-in replication and failover
+ New OLTP views this as a requirement anyway
Some details
+ On-line failover?
+ On-line failback?
+ LAN network partitioning?
+ WAN network partitioning?
22. Node Speed Really Matters
Application requirement: 1MM operations/sec at scale
Operations/sec/node       10,000     100,000
Nodes (without HA)        100        10
HA replication factor     2          2
Nodes (including HA)      300        30
Greater scale through reduced network traffic
Reduced probability of hardware failures
Lower monetary and environmental costs
A high-scaling RDBMS must combine transparent
distribution, built-in durability and blazing node speed
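The node counts in the comparison on this slide follow from simple arithmetic. The sketch below assumes that a replication factor of 2 means two additional copies of every partition, which is the reading that reproduces the slide's 300 and 30; the function and parameter names are hypothetical.

```python
import math


def nodes_required(target_ops_per_sec, ops_per_node, extra_copies=0):
    """Nodes needed to hit a throughput target, with optional HA replicas."""
    base = math.ceil(target_ops_per_sec / ops_per_node)
    # Assumption: each extra copy needs as many nodes as the base deployment.
    return base * (extra_copies + 1)


target = 1_000_000  # 1MM operations/sec, from the slide

for per_node in (10_000, 100_000):
    without_ha = nodes_required(target, per_node)
    with_ha = nodes_required(target, per_node, extra_copies=2)
    print(f"{per_node:>7,} ops/sec/node: {without_ha} nodes, {with_ha} with HA")
```

A 10x faster node shrinks the cluster by 10x at every replication level, which is where the slide's claims about network traffic, failure probability, and cost come from.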
23. A NewSQL Example – VoltDB
Main-memory storage
Single threaded, run Xacts to completion
+ No locking
+ No latching
Built-in HA and durability
+ No log
24. Yabut: What About Multicore?
For a K-core CPU, divide memory into K (non-overlapping)
buckets
i.e. convert multi-core to K single cores
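The idea of turning K cores into K single-threaded engines can be sketched as follows. This is a structural illustration only, not VoltDB's implementation: `Partition`, `Store`, and `deposit` are hypothetical names, routing by `hash(key)` is an assumption, only single-partition transactions are handled, and Python threads will not show the real performance benefit.

```python
import queue
import threading


class Partition(threading.Thread):
    """Owns one bucket of data; runs transactions one at a time, to completion."""

    def __init__(self):
        super().__init__(daemon=True)
        self.data = {}              # touched only by this thread: no data locks
        self.inbox = queue.Queue()  # the only structure shared with callers

    def run(self):
        while True:
            txn, done = self.inbox.get()
            if txn is None:         # shutdown signal
                break
            done.put(txn(self.data))


class Store:
    def __init__(self, k):
        self.partitions = [Partition() for _ in range(k)]
        for p in self.partitions:
            p.start()

    def execute(self, key, txn):
        """Route a single-partition transaction by key and wait for its result."""
        part = self.partitions[hash(key) % len(self.partitions)]
        done = queue.Queue(maxsize=1)
        part.inbox.put((txn, done))
        return done.get()

    def close(self):
        for p in self.partitions:
            p.inbox.put((None, None))
        for p in self.partitions:
            p.join()


def deposit(account, amount):
    def txn(data):
        data[account] = data.get(account, 0) + amount
        return data[account]
    return txn


store = Store(k=4)
for _ in range(3):
    balance = store.execute("acct-1", deposit("acct-1", 50))
store.close()
print(balance)
```

Because each bucket is read and written by exactly one thread, the transaction body needs no record locks or latches; the request queue is the only point of coordination. Error handling is omitted for brevity.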
25. Where all the time goes… revisited
Before VoltDB
26. How Scalable is VoltDB?
Vanilla Hardware
“VoltDB is very scalable; it should scale to 120 partitions, 39 servers,
and 1.6 million complex transactions per second at over 300 CPU cores.”
Baron Schwartz, Chief Performance Architect, Percona
Pumped-up Hardware
SGI Announces Support and Record Benchmarks for VoltDB Database:
High Performance SGI® Rackable™ Cluster Delivers Over Three Million
Transactions per Second
27. Current VoltDB Status
Runs a subset of SQL (which is getting larger)
On VoltDB clusters (in memory on commodity gear)
No WAN support yet
+ Discussions abound on exactly what to do here
45X faster than a popular OldSQL DBMS on TPC-C
5-7X faster than Cassandra on the VoltDB K-V layer
Clearly note this is an open source system!
28. Summary
Old OLTP New OLTP
OldSQL for New OLTP
+ Too slow
+ Does not scale
NoSQL for New OLTP
+ Lacks consistency guarantees
+ Low-level interface
NewSQL for New OLTP
+ Fast, scalable and consistent
+ Supports SQL