Stonebraker Live!
Navigating the Database Universe
VoltDB presents
BRUCE READING
President and CEO
Agenda
• Traditional RDBMS wisdom is all wrong
– Presented by Dr. Michael Stonebraker, Co-founder
• Making sense of the database universe
– Presented by Bruce Reading, President and CEO
• Hello VoltDB 3.0
– Presented by Ryan Betts, Field CTO
TRADITIONAL RDBMS WISDOM IS ALL WRONG
Dr. Michael Stonebraker
Traditional RDBMS Wisdom
• Data is stored in disk-block format (heavily encoded)
• With a main memory buffer pool of blocks
• Query plans
– Optimize CPU, I/O
– Fundamental operation is to read a row
• Indexing via B-trees
– Clustered or unclustered
Traditional RDBMS Wisdom
• Dynamic row-level locking
• Aries-style write-ahead log
• Replication (asynchronous or synchronous)
– Update the primary first
– Then move the log to other sites
– And roll forward at the secondary(s)
Traditional RDBMS Wisdom
• Describes MySQL, DB2, Postgres, SQL Server, Oracle…
• Focus of most college-level DBMS courses
– Including M.I.T.
• Focus of most DBMS textbooks
Traditional RDBMS Wisdom
• Is completely wrong
• (More charitably) is obsolete
The DBMS Marketplace
• About 1/3 “data warehouses”
– Lots of big reads
– Bulk-loaded from OLTP systems
• About 1/3 “OLTP”
– Lots of small updates
– And a few reads
• About 1/3 “everything else”
– Hadoop, NoSQL, graph DBMS, Array DBMS…
The DBMS Marketplace
• Data warehouses
– Market already moving strongly in the direction of column stores
– Which have nothing to do with the traditional wisdom
– Because column stores are 50–100X faster than row stores
The Participants
• Native column store vendors
– HP/Vertica, SAP/HANA, Redshift (Amazon/ParAccel), SAP/Sybase IQ
• Native row store vendors
– Microsoft, Oracle, DB2, Netezza
• In transition
– Teradata, Aster Data, Greenplum
• If you are running a row store, then be prepared to switch!
The DBMS Marketplace
• OLTP
– NewSQL systems are wildly faster than the traditional wisdom
• Everything else
– Not an RDBMS market
OLTP Databases – 3 Big Decisions
• Main memory vs. disk orientation
• Replication strategy
• Concurrency control strategy
Reality Check on OLTP Databases
• TP database size grows at the rate transactions increase
• 1 Tbyte of main memory buyable for around $30K (or less)
– (say) 64 Gbytes per server in 16 servers
• 10+ Tbytes possible
• If your data doesn’t fit in main memory now, then wait a
couple of years and it will…
Reality Check – Main Memory Performance
• TPC-C CPU cycles
• On the Shore DBMS prototype
• “Elephants” should be similar
To Go Fast
• Must focus on overhead
– B-trees affect only a small fraction of the path length
• Must get rid of all four pie slices
– Anything less gives you a marginal win
– TimesTen as an example
Buffer Pool Overhead
• Get rid of the buffer pool
• i.e., run a main-memory DBMS
– Like VoltDB
Single Threading
• Hosed unless you do this
– Unless you get rid of queuing (somehow)
– Or eliminate shared data structures (somehow)
• VoltDB statically divides shared memory among the cores
– And cores are single threaded
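A minimal sketch of the idea on this slide, assuming a toy key/value workload: data is statically split into partitions, and each partition runs its transactions one at a time, so no locks or latches are needed inside a partition. The names `Partition`, `partition_for`, and `deposit` are hypothetical and are not VoltDB APIs.

```python
import zlib
from collections import deque


class Partition:
    """One slice of the data, owned by exactly one single-threaded executor."""

    def __init__(self):
        self.rows = {}        # private to this partition, so no locking is needed
        self.queue = deque()  # transactions waiting to run on this partition

    def run_all(self):
        # Run each queued transaction to completion, one at a time, in arrival order.
        while self.queue:
            txn = self.queue.popleft()
            txn(self.rows)


def partition_for(key, partitions):
    # Static routing: a given key always maps to the same partition.
    return partitions[zlib.crc32(key.encode("utf-8")) % len(partitions)]


def deposit(key, amount):
    # Hypothetical single-partition transaction: add `amount` to `key`.
    def txn(rows):
        rows[key] = rows.get(key, 0) + amount
    return txn


partitions = [Partition() for _ in range(4)]
for key, amount in [("alice", 10), ("bob", 5), ("alice", 7)]:
    partition_for(key, partitions).queue.append(deposit(key, amount))

for p in partitions:
    p.run_all()

balances = {k: v for p in partitions for k, v in p.rows.items()}
```

In a real system each partition would be pinned to its own core; here the partitions are simply drained in a loop to keep the sketch short.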
Concurrency Control
• MVCC popular (NuoDB, Hekaton)
• Timestamp order popular (VoltDB)
• I don’t know anybody who is doing normal dynamic locking
– It’s too slow!!!!
Reality Check – High Availability (HA)
• Requirement in today’s OLTP systems
• Nobody will accept downtime
• Must be solved through replication
How to Implement HA
• I am only interested in ACID outcomes!!!!
• Eventual consistency actually means “creates garbage”
– Consider 2 customers at 2 sites, each buying the last “widget”
• Even Jeff Dean (Google) has come around to this point
of view
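The "last widget" case can be illustrated with a toy model. This is only a sketch under simple assumptions: two sites each hold a local copy of the stock count and accept sales without coordinating. `Site` and `run_serially` are hypothetical names used for illustration.

```python
class Site:
    """A replica that accepts sales against its own local copy of the stock."""

    def __init__(self, stock):
        self.stock = stock
        self.sold = 0

    def buy(self):
        # The site checks only its local copy before accepting the sale.
        if self.stock > 0:
            self.stock -= 1
            self.sold += 1
            return True
        return False


# Both sites believe one widget is left and sell it before they reconcile.
site_a, site_b = Site(stock=1), Site(stock=1)
ok_a = site_a.buy()
ok_b = site_b.buy()
total_sold = site_a.sold + site_b.sold  # two sales against a single widget


def run_serially(stock, requests):
    # With one agreed order over all requests, the second buyer is refused.
    results = []
    for _ in requests:
        if stock > 0:
            stock -= 1
            results.append(True)
        else:
            results.append(False)
    return results


serial = run_serially(1, ["customer 1", "customer 2"])
```

Without coordination both purchases succeed and the widget is oversold; with a single serial order only the first purchase succeeds.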
How to Implement HA
• Active-Passive
– Effectively requires you to write a log
– One of the four pie slices
• Active-Active (VoltDB solution)
– Send only the transaction, not the effect of the transaction
– Allows read-queries to be sent to any replica
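A rough sketch of "send only the transaction, not the effect": every replica receives the same procedure name and arguments in the same order and executes the procedure itself. Because the procedure is deterministic, the replicas end in the same state without shipping a log of changes. `Replica`, `transfer`, and `PROCEDURES` are hypothetical names, not VoltDB interfaces.

```python
def transfer(accounts, src, dst, amount):
    # Deterministic procedure: the same inputs and state give the same result.
    if accounts[src] >= amount:
        accounts[src] -= amount
        accounts[dst] += amount


PROCEDURES = {"transfer": transfer}


class Replica:
    def __init__(self):
        self.accounts = {"a": 100, "b": 0}

    def apply(self, txn):
        # The replica is told which procedure to run, not which rows changed.
        name, args = txn
        PROCEDURES[name](self.accounts, *args)


replicas = [Replica(), Replica()]
txns = [
    ("transfer", ("a", "b", 30)),
    ("transfer", ("a", "b", 90)),  # refused on every replica: insufficient funds
    ("transfer", ("b", "a", 10)),
]
for txn in txns:
    for replica in replicas:
        replica.apply(txn)
```

Since both replicas hold identical state afterwards, a read could be served by either one.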
Reality Check – Power Failures
• What to do if you don’t have UPS…
• Cannot lose data on a power failure!!!!
• Two options
– Bring back the log (and the pie slice)
– Command log plus asynchronous checkpoints
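The second option can be sketched as follows, assuming a toy in-memory store: each command is appended to a log, a checkpoint copies the state from time to time, and recovery loads the last checkpoint and replays only the commands logged after it. The `Store` class is hypothetical, and the checkpoint is taken synchronously here for brevity, whereas the slide describes asynchronous checkpoints.

```python
import copy


class Store:
    def __init__(self):
        self.state = {}
        self.command_log = []    # would live on durable storage in a real system
        self.snapshot = ({}, 0)  # (copy of state, number of commands it covers)

    def execute(self, key, delta):
        # Log the command itself (not the changed rows), then apply it.
        self.command_log.append((key, delta))
        self.state[key] = self.state.get(key, 0) + delta

    def checkpoint(self):
        self.snapshot = (copy.deepcopy(self.state), len(self.command_log))

    def recover(self):
        # Start from the last checkpoint and re-run the commands logged after it.
        state, covered = self.snapshot
        state = copy.deepcopy(state)
        for key, delta in self.command_log[covered:]:
            state[key] = state.get(key, 0) + delta
        return state


store = Store()
store.execute("x", 1)
store.execute("y", 2)
store.checkpoint()
store.execute("x", 5)

store.state = {}              # simulate losing main memory in a power failure
recovered = store.recover()
```

Replay works only because the commands are deterministic: re-running them in the same order reproduces the lost state.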
Some Data From Nirmesh Malviya
• Implemented Aries in VoltDB
• Compared against the VoltDB command logging
• Command logging about 3X faster in total throughput
The Nail in the Coffin
• Timestamp order compatible with active-active
– As are any deterministic schemes
• Locking and MVCC are not
– Need a two-phase commit between the replicas
– Slow, slow, slow
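To illustrate why a deterministic order suits active-active replication, here is a small sketch with hypothetical operations. Two replicas receive the same transactions in different arrival orders; executing by timestamp gives both the same result with no commit protocol between them, while executing by arrival order does not.

```python
def apply_op(balance, op, amount):
    # Two operations that do not commute, so execution order matters.
    if op == "add":
        return balance + amount
    if op == "double":
        return balance * 2
    raise ValueError(op)


def run_in_arrival_order(arrivals, balance=100):
    for _, op, amount in arrivals:
        balance = apply_op(balance, op, amount)
    return balance


def run_in_timestamp_order(arrivals, balance=100):
    # Every replica sorts by the transaction timestamp before executing.
    for _, op, amount in sorted(arrivals, key=lambda t: t[0]):
        balance = apply_op(balance, op, amount)
    return balance


# (timestamp, operation, amount): same transactions, different arrival order.
replica_1 = [(1, "add", 10), (2, "double", 0)]
replica_2 = [(2, "double", 0), (1, "add", 10)]

by_arrival = (run_in_arrival_order(replica_1), run_in_arrival_order(replica_2))
by_timestamp = (run_in_timestamp_order(replica_1), run_in_timestamp_order(replica_2))
```

The sketch assumes unique timestamps and that a replica has seen every transaction up to a given timestamp before it executes.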
Net-Net on OLTP
• Main memory DBMS
• Deterministic concurrency control
• HA via active-active
• Has nothing to do with the traditional wisdom
• Even if your data is too big for main memory
– The traditional wisdom is still wrong
– Stay tuned for a paper on this topic
Summary
• What we teach our DBMS students is all wrong
• Implementations from the “elephants” are all obsolete
– One-size-does-not-fit-all
– Several million lines of code per vendor are obsolete
• I expect a lot of turmoil in the market off into the future
MAKING SENSE OF THE DATABASE UNIVERSE
Bruce Reading
The fact is…
There’s only more and more to come.
And it’s not slowing down…
Record amounts of data are being created every day…
And if that data is most valuable at the moment it’s created, how do you put it to use NOW?
How do you automate decisioning against it NOW?
NOW
Imagine…
Nice story. So what?
Large, busy bank
Rogue trader
“Mistyped number”
-$ Small sum lost
“Mistyped number” and “Mistyped number”
-$ Small sum lost
-$ Small sum lost
Oblivious
[Slide graphic: dozens more small “-$” losses piling up]
-$2BN
Large sum lost
Third largest loss in
banking history
UBS couldn't flag it among all
the data... until it was too late.
This is our world now.
Same old, same old
won’t cut it.
What’s a developer to do?
Data Value Chain
Interactive (milliseconds)
• Place trade
• Serve ad
• Enrich stream
• Examine packet
• Approve trans.
Real-time Analytics (hundredths of seconds)
• Calculate risk
• Leaderboard
• Aggregate
• Count
Record Lookup (second(s))
• Retrieve click stream
• Show orders
Historical Analytics (minutes)
• Backtest algo
• BI
• Daily reports
Exploratory Analytics (hours)
• Algo discovery
• Log analysis
• Fraud pattern match
Horizontal axis: Age of Data
[Chart curves: Value of Individual Data Item, Aggregate Data Value; axes: Data Value vs. Age of Data]
The Database Universe
[Chart. Axes: Application Complexity (Simple to Complex) and Data Value (Value of Individual Data Item to Aggregate Data Value); further scales: Slow to Fast, Small to Large. Categories: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics, grouped as Transactional and Analytic. Regions: Traditional RDBMS, Data Warehouse, Hadoop etc., NoSQL, NewSQL. Label: Velocity.]
VoltDB
The fastest, most scalable database on the market today
Ingest massive quantities of data and perform automated decisioning in real time
3 MILLION transactions
per second
Dramatically lowering your cost per
transaction
VoltDB enables
NOW.
A huge impact on the bottom line
NOW
PREVENT
ACHIEVE
Anything is possible…
Electrical smart grids
Micro-personalization
Real-time display targeting
Dynamic airline ticket purchasing
State-of-the-art social networking
Session management
Network monitoring
We enable
NOW.
www.VoltDB.com
HELLO 3.0!
Ryan Betts
Introducing VoltDB 3.0
VoltDB 3.0
VoltDB: a modern OLTP database built for a high velocity world.
– Horizontal scalability
– Hundreds of thousands of transactions per second
– Relational SQL
Latency and Throughput, 50-50 Read/Write Workload
[Chart: latency (ms) vs. TPS, comparing VoltDB 3.0 with v2.8.4.1]
VoltDB 3.0 vs. v2.8.4.1
Key/Value 50/50 read/write workload
3 Node, K=1 Cluster
Read/Write Workload Latency/Throughput
[Chart: average latency (ms) vs. TPS for three mixes: 10% read/90% write, 50% read/50% write, 90% read/10% write]
VoltDB 3.0
Key/Value various read/write workloads
3 Node, K=1 Cluster
Faster: Ad Hoc SQL Performance
• Conversational SQL
• Thousands to 10,000+ ad hoc SQL transactions/second
• Single or multiple (batch) SQL statement transaction
Easier Development: New SQL Support
• SQL LIKE and NOT LIKE
• UNION
• Column Functions
• Counting function (leaderboard ranking queries)
• Ability to define index using column functions
Easier Development: JSON Support
• JSON values stored in a varchar column
• Field() column function
• Indexing on JSON elements
CREATE INDEX session_site_moderator
ON user_session_table (field(json_data, 'site'),
field(json_data, 'moderator'), username);
• New JSON sample in kit
Easier Development: Online Operations
• Ability to re-join a failed node to cluster with no impact to
existing operations
• Online schema update
• No service window
Easier Development: Streamlined Development
• Elimination of project.xml
• VoltDB-specific configuration now defined in DDL
• Defaulting of deployment.xml
• New Volt Compiler CLI:
voltdb compile
Expanded Reach: Cloud-Friendly
• Reduce impact of variable node performance and latency
• Elimination of strict NTP configuration
• Scales to large # of nodes
Integration: High-Performance Export
• Parallelized export
• New connectors: JDBC, Netezza, Vertica
Integration: Client Library Updates
• New PHP Client
• Node.js client v1.0
• Go Client
• Coming soon: updated Erlang client
http://golang.org
Other Notable New Features
• Explain command
• CSV loader utility
• CSV snapshots
• New Administration CLI: voltadmin
– voltadmin save
– voltadmin restore
– voltadmin pause
– voltadmin resume
– voltadmin shutdown
More Samples Available for Download
http://voltdb.com/community/volt-labs.php
Volt University
• Portfolio of instructional content, classes, tools, and other
resources to help developers build applications quickly
• Curriculum and supporting material range from beginner to
advanced
• Three types of instruction:
– Volt University Online
– Volt University Classroom
– Volt Vanguard Certification
Summary: VoltDB v3.0
• Run faster: transactions at high velocity scale.
• Create faster: write and scale your ACID application.
• Learn faster: Volt Labs & Volt University
DOWNLOAD 3.0
at
www.voltdb.com
Imagine the
Possibilities
More Information?
E-mail
info@voltdb.com
Visit our forums
http://community.voltdb.com/forum
Read the VoltDB “Getting Started Guide”
http://community.voltdb.com/docs/GettingStarted/index
Follow
@VoltDB on Twitter
QUESTIONS?
THANK YOU

VoltDB - Stonebraker Live! - New York City 2013

  • 1. Stonebraker Live! Navigating the Database Universe VoltDB presents
  • 3. • Traditional RDBMS is all wrong – Presented by Dr. Michael Stonebraker, Co-founder • Making sense of the database universe – Presented by Bruce Reading, President and CEO • Hello VoltDB 3.0 – Presented by Ryan Betts, Field CTO Agenda
  • 4. TRADITIONAL RDBMS WISDOM IS ALL WRONG Dr. Michael Stonebraker
  • 5. Traditional RDBMS Wisdom • Data is in disk block formatting (heavily encoded) • With a main memory buffer pool of blocks • Query plans – Optimize CPU, I/O – Fundamental operation is read a row • Indexing via B-trees – Clustered or unclustered
  • 6. Traditional RDBMS Wisdom • Dynamic row-level locking • Aries-style write-ahead log • Replication (asynchronous or synchronous) – Update the primary first – Then move the log to other sites – And roll forward at the secondary (s)
  • 7. Traditional RDBMS Wisdom • Describes MySQL, DB2, Postgres, SQLServer, Oracle… • Focus of most college-level DBMS courses – Including M.I.T. • Focus of most DBMS textbooks
  • 8. Traditional RDBMS Wisdom • Is completely wrong • (More charitably) is obsolete
  • 9. The DBMS Marketplace • About 1/3 “data warehouses” – Lots of big reads – Bulk-loaded from OLTP systems • About 1/3 “OLTP” – Lots of small updates – And a few reads • About 1/3 “everything else” – Hadoop, NoSQL, graph DBMS, Array DBMS…
  • 10. The DBMS Marketplace • Data warehouses – Market already moving strongly in the direction of column stores – Which have nothing to do with the traditional wisdom – Because column stores are 50 – 100 X row stores
  • 11. The Participants • Native column store vendors – HP/Vertica, SAP/Hana, Red Shift (Amazon/Paraccl), SAP/Sybase/IQ • Native row store vendors – Microsoft, Oracle, DB2, Netezza • In transition – Teradata, Asterdata, Greenplum • If you are running a row store, then be prepared to switch!
  • 12. The DBMS Marketplace • OLTP – NewSQL systems are wildly faster than the traditional wisdom • Everything else – Not an RDBMS market
  • 13. OLTP Databases – 3 Big Decisions • Main memory vs. disk orientation • Replication strategy • Concurrency control strategy
  • 14. Reality Check on OLTP Databases • TP database size grows at the rate transactions increase • 1 Tbyte of main memory buyable for around $30K (or less) – (say) 64 Gbytes per server in 16 servers • 10+ Tbytes possible • If your data doesn’t fit in main memory now, then wait a couple of years and it will…
  • 15. Reality Check – Main Memory Performance • TPC-C CPU cycles • On the Shore DBMS prototype • “Elephants” should be similar
  • 16. To Go Fast • Must focus on overhead – B-trees affects a small fraction of the path length • Must get rid of all four pie slices – Anything less gives you a marginal win – TimesTen as an example 16
  • 17. Buffer Pool Overhead • Get rid of the buffer pool • i.e., run a main-memory DBMS – Like VoltDB
  • 18. Single Threading • Hosed unless you do this – Unless you get rid of queuing (somehow) – Or eliminate shared data structures (somehow) • VoltDB statically divides shared memory among the cores – And cores are single threaded
  • 19. Concurrency Control • MVCC popular (NuoDB, Hekaton) • Time stamp order popular (VoltDB) • I don’t know anybody who is doing normal dynamic locking – It’s too slow!!!!
  • 20. Reality Check – High Availability (HA) • Requirement in today’s OLTP systems • Nobody will take down time • Must be solved through replication
  • 21. How to Implement HA • I am only interested in ACID outcomes!!!! • Eventual consistency actually means “creates garbage” – Consider 2 customers at 2 sites, each buying the last “widget” • Even Jeff Dean (Google) has come around to this point of view
  • 22. How to Implement HA • Active-Passive – Effectively requires you to write a log – One of the four pie slices • Active-Active (VoltDB solution) – Send only the transaction, not the effect of the transaction – Allows read-queries to be sent to any replica
  • 23. Reality Check – Power Failures • What to do if you don’t have UPS… • Cannot lose data on a power failure!!!! • Two options – Bring back the log (and the pie slice) – Command log plus asynchronous checkpoints
  • 24. Some Data From Nirmesh Malvaiya • Implemented Aries in VoltDB • Compared against the VoltDB command logging • Command logging about 3X faster in total throughput
  • 25. The Nail in the Coffin • Time stamp order compatible with active-active – As are any deterministic schemes • Locking and MVCC are not – Need a 2 phase commit between the replicas – Slow, slow, slow
  • 26. Net-Net on OLTP • Main memory DBMS • Deterministic concurrency control • HA via active-active • Has nothing to do with the traditional wisdom • Even if your data is too big for main memory – The traditional wisdom is still wrong – Stay tuned for a paper on this topic
  • 27. Summary • What we teach our DBMS students is all wrong • Implementations from the “elephants” are all obsolete – One-size-does-not-fit-all – Several million lines of code per vendor are obsolete • I expect a lot of turmoil in the market off into the future
  • 28. MAKING SENSE OF THE DATABASE UNIVERSE Bruce Reading
  • 29. The fact is… There’s only more and more to come. And it’s not slowing down… Record amounts of data are being created everyday…
  • 30. And if that data is most valuable at the moment it’s created, how do you put it to use NOW? How do you automate decisioning against it NOW?
  • 31. NOW
  • 33. Nice story. So what?
  • 34. Large, busy bank Rogue trader 5 “Mistyped number” -$ Small sum lost 9 “Mistyped number” & “Mistyped number -$ Small sum lost -$ Small sum lost Oblivious -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$-$ -$ -$ -$ -$ -$ -$-$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$-$-$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$ -$
  • 35. -$2BN Large sum lost Third largest loss in banking history
  • 36. UBS couldn't flag it among all the data... until it was too late.
  • 37. This is our world now.
  • 38. Same old, same old won’t cut it.
  • 40. Data Value Chain (by age of data) • Interactive (milliseconds): place trade, serve ad, enrich stream, examine packet, approve trans. • Real-time Analytics (hundredths of seconds): calculate risk, leaderboard, aggregate, count • Record Lookup (second(s)): retrieve click stream, show orders • Historical Analytics (minutes): backtest algo, BI, daily reports • Exploratory Analytics (hours): algo discovery, log analysis, fraud pattern match
  • 41. Data Value Chain (continued) • Same five categories as the previous slide • Axes: Data Value against Age of Data • Curves: Value of Individual Data Item, Aggregate Data Value
  • 42. The Database Universe • [Figure] Axes: Application Complexity, Data Value (Value of Individual Data Item, Aggregate Data Value) • Labels: Simple/Complex, Slow/Fast, Small/Large, Transactional/Analytic • Workloads: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics • Plotted: Traditional RDBMS
  • 43. The Database Universe • [Figure] Axes: Application Complexity, Data Value (Value of Individual Data Item, Aggregate Data Value), Velocity • Labels: Simple/Complex, Slow/Fast, Small/Large, Transactional/Analytic • Workloads: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics • Plotted: Traditional RDBMS, NewSQL, NoSQL, Data Warehouse, Hadoop, etc.
  • 44. VoltDB • The fastest, most scalable database on the market today • Ingest massive quantities of data and perform automated decisioning in real time • 3 MILLION transactions per second • Dramatically lowering your cost per transaction • A huge impact on the bottom line • VoltDB enables NOW.
  • 55. Introducing VoltDB 3.0 • VoltDB: a modern OLTP database built for a high velocity world. – Horizontal scalability – Hundreds of thousands of transactions per second – Relational SQL
  • 56. Latency and Throughput, 50-50 Read/Write Workload • [Chart: latency (ms) against TPS, VoltDB 3.0 vs. v2.8.4.1] • Key/Value 50/50 read/write workload • 3 Node, K=1 Cluster
  • 57. Read/Write Workload Latency/Throughput • [Chart: average latency (ms) against TPS for 10% read/90% write, 50% read/50% write, and 90% read/10% write] • VoltDB 3.0 Key/Value, various read/write workloads • 3 Node, K=1 Cluster
  • 58. Faster: Ad Hoc SQL Performance • Conversational SQL • Thousands to 10,000+ ad hoc SQL transactions/second • Single or multiple (batch) SQL statement transaction
  • 59. Easier Development: New SQL Support • SQL LIKE and NOT LIKE • UNION • Column Functions • Counting function (leaderboard ranking queries) • Ability to define index using column functions
  • 60. Easier Development: JSON Support • JSON values stored in a varchar column • Field() column function • Indexing on JSON elements: CREATE INDEX session_site_moderator ON user_session_table (field(json_data, 'site'), field(json_data, 'moderator'), username); • New JSON sample in kit
  • 61. Easier Development: Online Operations • Ability to re-join a failed node to cluster with no impact to existing operations • Online schema update • No service window
  • 62. Easier Development: Streamlined Development • Elimination of project.xml • VoltDB-specific configuration now defined in DDL • Defaulting of deployment.xml • New Volt Compiler CLI: voltdb compile
  • 63. Expanded Reach: Cloud-Friendly • Reduce impact of variable node performance and latency • Elimination of strict NTP configuration • Scales to large # of nodes
  • 64. Integration: High-Performance Export • Parallelized export • New connectors: JDBC, Netezza, Vertica
  • 65. Integration: Client Library Updates • New PHP Client • Node.js client v1.0 • Go client (http://golang.org) • Coming soon: updated Erlang client
  • 66. Other Notable New Features • Explain command • CSV loader utility • CSV snapshots • New Administration CLI: voltadmin – voltadmin save – voltadmin restore – voltadmin pause – voltadmin resume – voltadmin shutdown
  • 67. More Samples Available for Download • http://voltdb.com/community/volt-labs.php
  • 68. Volt University • Portfolio of instructional content, classes, tools, and other resources to help developers build applications quickly • Curriculum and supporting material range from beginner to advanced • Three types of instruction: – Volt University Online – Volt University Classroom – Volt Vanguard Certification
  • 69. Summary: VoltDB v3.0 • Run faster: transactions at high velocity scale. • Create faster: write and scale your ACID application. • Learn faster: Volt Labs & VoltDB University
  • 71. More Information? • E-mail info@voltdb.com • Visit our forums: http://community.voltdb.com/forum • Read the VoltDB “Getting Started Guide”: http://community.voltdb.com/docs/GettingStarted/index • Follow @VoltDB on Twitter

Editor's Notes

  1. VoltDB was uniquely designed to let companies act on data automatically, the instant it's created, at its point of maximum value. No other database is constructed like this. We are the fastest, most scalable database on the planet. Hands down. Financial trading applications... e-commerce applications... billing applications in dynamic markets like telecommunications... sensor-driven applications in sectors like energy and healthcare. With VoltDB at the core, applications can decision against the freshest, most in-the-moment data, and they can do it at a fraction of the cost that they would with any other database on the market. Three million transactions per second. An unprecedented drop in cost per transaction. All impacting a company's bottom line. VoltDB enables NOW.
  2. Done on the volt10's: Dell R510 server, 2 x Intel(R) Xeon(R) (quad core) CPU X5670 @ 2.93GHz, 64GB RAM.