VoltDB presented their latest database product, VoltDB 3.0. The keynote consisted of three presentations:
1) Dr. Michael Stonebraker argued that traditional RDBMS wisdom is outdated and no longer applicable, as databases now focus on main memory, real-time analytics, and active-active replication.
2) Bruce Reading discussed how the database universe has expanded, and how VoltDB enables real-time automated decision making on massive amounts of data.
3) Ryan Betts demonstrated performance and usability improvements in VoltDB 3.0, including faster SQL and JSON support, online schema changes, and expanded cloud capabilities.
3. Agenda
• Traditional RDBMS is all wrong
– Presented by Dr. Michael Stonebraker, Co-founder
• Making sense of the database universe
– Presented by Bruce Reading, President and CEO
• Hello VoltDB 3.0
– Presented by Ryan Betts, Field CTO
5. Traditional RDBMS Wisdom
• Data is in disk block formatting (heavily encoded)
• With a main memory buffer pool of blocks
• Query plans
– Optimize CPU, I/O
– Fundamental operation is read a row
• Indexing via B-trees
– Clustered or unclustered
6. Traditional RDBMS Wisdom
• Dynamic row-level locking
• Aries-style write-ahead log
• Replication (asynchronous or synchronous)
– Update the primary first
– Then move the log to other sites
– And roll forward at the secondary(s)
7. Traditional RDBMS Wisdom
• Describes MySQL, DB2, Postgres, SQLServer, Oracle…
• Focus of most college-level DBMS courses
– Including M.I.T.
• Focus of most DBMS textbooks
9. The DBMS Marketplace
• About 1/3 “data warehouses”
– Lots of big reads
– Bulk-loaded from OLTP systems
• About 1/3 “OLTP”
– Lots of small updates
– And a few reads
• About 1/3 “everything else”
– Hadoop, NoSQL, graph DBMS, Array DBMS…
10. The DBMS Marketplace
• Data warehouses
– Market already moving strongly in the direction of column stores
– Which have nothing to do with the traditional wisdom
– Because column stores are 50–100X faster than row stores
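The layout difference behind this claim can be sketched in a few lines. This is an illustration only: the table, the column names, and the "values touched" counter are hypothetical, and it does not reproduce the 50–100X figure. It only shows why a warehouse-style aggregate reads less data from a columnar layout.

```python
# Hypothetical three-column table, stored row-wise.
rows = [
    {"order_id": 1, "region": "east", "amount": 10.0},
    {"order_id": 2, "region": "west", "amount": 25.5},
    {"order_id": 3, "region": "east", "amount": 4.5},
]

# The same data stored column-wise: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(table):
    """Sum 'amount' by scanning whole rows; every field of every row is read."""
    total, values_touched = 0.0, 0
    for row in table:
        values_touched += len(row)  # the full row comes off storage
        total += row["amount"]
    return total, values_touched


def total_amount_column_store(cols):
    """Sum 'amount' by scanning only the one column the query needs."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_touched = total_amount_row_store(rows)
col_total, col_touched = total_amount_column_store(columns)
print(row_total, row_touched, col_total, col_touched)
```

With wide warehouse tables (hundreds of columns) and queries that need only a few of them, the gap in data read grows with the table width.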
11. The Participants
• Native column store vendors
– HP/Vertica, SAP/Hana, Redshift (Amazon/ParAccel), SAP/Sybase/IQ
• Native row store vendors
– Microsoft, Oracle, DB2, Netezza
• In transition
– Teradata, Asterdata, Greenplum
• If you are running a row store, then be prepared to switch!
12. The DBMS Marketplace
• OLTP
– NewSQL systems are wildly faster than the traditional wisdom
• Everything else
– Not an RDBMS market
13. OLTP Databases – 3 Big Decisions
• Main memory vs. disk orientation
• Replication strategy
• Concurrency control strategy
14. Reality Check on OLTP Databases
• TP database size grows at the rate transactions increase
• 1 Tbyte of main memory buyable for around $30K (or less)
– (say) 64 Gbytes per server in 16 servers
• 10+ Tbytes possible
• If your data doesn’t fit in main memory now, then wait a couple of years and it will…
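The arithmetic on this slide can be checked directly. The figures below are the ones quoted on the slide; the per-gigabyte cost is derived from them and is an upper bound, since the slide says "or less".

```python
# Figures taken from the slide: 16 servers, 64 GB each, about $30K for the memory.
servers = 16
gb_per_server = 64
approx_cost_usd = 30_000

total_gb = servers * gb_per_server        # aggregate main memory across the cluster
total_tb = total_gb / 1024                # binary terabytes
cost_per_gb = approx_cost_usd / total_gb  # derived; not stated on the slide

print(f"{total_gb} GB = {total_tb:.0f} TB, about ${cost_per_gb:.2f} per GB")
```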
15. Reality Check – Main Memory Performance
• TPC-C CPU cycles
• On the Shore DBMS prototype
• “Elephants” should be similar
16. To Go Fast
• Must focus on overhead
– B-trees affect a small fraction of the path length
• Must get rid of all four pie slices
– Anything less gives you a marginal win
– TimesTen as an example
17. Buffer Pool Overhead
• Get rid of the buffer pool
• i.e., run a main-memory DBMS
– Like VoltDB
18. Single Threading
• Hosed unless you do this
– Unless you get rid of queuing (somehow)
– Or eliminate shared data structures (somehow)
• VoltDB statically divides shared memory among the cores
– And cores are single threaded
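A minimal sketch of the idea, not VoltDB's implementation: each partition owns a private slice of the data and exactly one worker thread, so the data itself needs no locks or latches. The `Partition` class, the modulo routing, and the counter workload are all hypothetical; the inbox queue is only the hand-off point and stands in for whatever dispatch mechanism a real engine uses.

```python
import queue
import threading


class Partition:
    """Private data plus one worker thread; the data is never shared."""

    def __init__(self):
        self.data = {}
        self.inbox = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        # Transactions run one at a time, to completion, in arrival order.
        while True:
            item = self.inbox.get()
            if item is None:  # shutdown marker
                break
            key, delta = item
            self.data[key] = self.data.get(key, 0) + delta

    def close(self):
        self.inbox.put(None)
        self.worker.join()


def route(partitions, account_id):
    """Static partitioning: every key always maps to the same partition."""
    return partitions[account_id % len(partitions)]


partitions = [Partition() for _ in range(4)]
for account_id in range(100):
    for _ in range(10):
        route(partitions, account_id).inbox.put((account_id, 1))
for p in partitions:
    p.close()

totals = {}
for p in partitions:
    totals.update(p.data)
print(len(totals), set(totals.values()))
```

Because a key is only ever touched by its own partition's thread, the read-modify-write in `_run` needs no synchronization.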
19. Concurrency Control
• MVCC popular (NuoDB, Hekaton)
• Time stamp order popular (VoltDB)
• I don’t know anybody who is doing normal dynamic locking
– It’s too slow!!!!
20. Reality Check – High Availability (HA)
• Requirement in today’s OLTP systems
• Nobody will accept downtime
• Must be solved through replication
21. How to Implement HA
• I am only interested in ACID outcomes!!!!
• Eventual consistency actually means “creates garbage”
– Consider 2 customers at 2 sites, each buying the last “widget”
• Even Jeff Dean (Google) has come around to this point of view
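The "last widget" example can be made concrete with a small simulation. This is a sketch under simple assumptions: two sites that accept writes locally and reconcile later by applying both decrements, compared with a single serialized order of the same two purchases. The function names and the reconciliation rule are hypothetical.

```python
def sell_with_local_acceptance():
    """Each site checks its own copy of the stock, then the sites reconcile."""
    site_a = {"widget": 1}
    site_b = {"widget": 1}
    sold = 0
    for site in (site_a, site_b):
        if site["widget"] > 0:  # both sites still see one widget
            site["widget"] -= 1
            sold += 1
    merged_stock = 1 - sold     # reconcile by applying every accepted sale
    return sold, merged_stock


def sell_serialized():
    """The same two purchases applied one after the other to one copy."""
    stock = {"widget": 1}
    sold = 0
    for _customer in ("first", "second"):
        if stock["widget"] > 0:
            stock["widget"] -= 1
            sold += 1
    return sold, stock["widget"]


print(sell_with_local_acceptance())  # two sales accepted, stock goes negative
print(sell_serialized())             # one sale accepted, stock ends at zero
```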
22. How to Implement HA
• Active-Passive
– Effectively requires you to write a log
– One of the four pie slices
• Active-Active (VoltDB solution)
– Send only the transaction, not the effect of the transaction
– Allows read-queries to be sent to any replica
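"Send only the transaction" can be sketched as follows, assuming transactions are deterministic procedures and every replica receives them in the same order. The `transfer` procedure, the `Replica` class, and the account data are hypothetical and only illustrate the idea; this is not VoltDB's API.

```python
def transfer(state, src, dst, amount):
    """A deterministic procedure: same inputs and state give the same outcome."""
    if state.get(src, 0) < amount:
        return False  # abort: insufficient funds
    state[src] -= amount
    state[dst] = state.get(dst, 0) + amount
    return True


PROCEDURES = {"transfer": transfer}


class Replica:
    def __init__(self, initial):
        self.state = dict(initial)

    def execute(self, name, args):
        # Only the procedure name and arguments cross the wire, not changed rows.
        return PROCEDURES[name](self.state, *args)


initial = {"a": 100, "b": 0}
replicas = [Replica(initial), Replica(initial)]

ordered_txns = [
    ("transfer", ("a", "b", 60)),
    ("transfer", ("a", "b", 60)),  # aborts identically on every replica
    ("transfer", ("b", "a", 10)),
]
for name, args in ordered_txns:
    outcomes = {replica.execute(name, args) for replica in replicas}
    assert len(outcomes) == 1      # all replicas reached the same decision

print(replicas[0].state, replicas[1].state)
```

Since every replica ends in the same state, a read-only query can be answered by any of them.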
23. Reality Check – Power Failures
• What to do if you don’t have UPS…
• Cannot lose data on a power failure!!!!
• Two options
– Bring back the log (and the pie slice)
– Command log plus asynchronous checkpoints
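The second option can be sketched as follows. This is a simplified, hypothetical model: the log is an in-memory list standing in for durable storage, and the checkpoint is taken synchronously here, whereas the slide describes asynchronous checkpoints. Recovery loads the latest checkpoint and re-executes only the commands logged after it.

```python
import copy


class CommandLoggedStore:
    def __init__(self):
        self.state = {}
        self.command_log = []      # stands in for a durable, append-only log
        self.checkpoint = ({}, 0)  # (snapshot, number of commands it covers)

    @staticmethod
    def _apply(state, key, delta):
        state[key] = state.get(key, 0) + delta

    def execute(self, key, delta):
        # Log the command itself (name and arguments), not the rows it changed.
        self.command_log.append((key, delta))
        self._apply(self.state, key, delta)

    def take_checkpoint(self):
        self.checkpoint = (copy.deepcopy(self.state), len(self.command_log))

    def recover(self):
        """Rebuild state after a crash: load the snapshot, replay the log tail."""
        snapshot, covered = self.checkpoint
        state = copy.deepcopy(snapshot)
        for key, delta in self.command_log[covered:]:
            self._apply(state, key, delta)
        return state


store = CommandLoggedStore()
store.execute("x", 5)
store.execute("y", 2)
store.take_checkpoint()
store.execute("x", -1)  # logged after the checkpoint, so it is replayed
recovered = store.recover()
print(recovered)
```

Replay is only correct if commands are deterministic, which is the same property active-active replication relies on.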
24. Some Data From Nirmesh Malviya
• Implemented Aries in VoltDB
• Compared against the VoltDB command logging
• Command logging about 3X faster in total throughput
25. The Nail in the Coffin
• Time stamp order compatible with active-active
– As are any deterministic schemes
• Locking and MVCC are not
– Need a 2 phase commit between the replicas
– Slow, slow, slow
26. Net-Net on OLTP
• Main memory DBMS
• Deterministic concurrency control
• HA via active-active
• Has nothing to do with the traditional wisdom
• Even if your data is too big for main memory
– The traditional wisdom is still wrong
– Stay tuned for a paper on this topic
27. Summary
• What we teach our DBMS students is all wrong
• Implementations from the “elephants” are all obsolete
– One-size-does-not-fit-all
– Several million lines of code per vendor are obsolete
• I expect a lot of turmoil in the market off into the future
40. Data Value Chain
Workload categories, ordered by age of data (youngest first):
• Interactive (milliseconds)
– Place trade, serve ad, enrich stream, examine packet, approve trans.
• Real-time Analytics (hundredths of seconds)
– Calculate risk, leaderboard, aggregate, count
• Record Lookup (second(s))
– Retrieve click stream, show orders
• Historical Analytics (minutes)
– Backtest algo, BI, daily reports
• Exploratory Analytics (hours)
– Algo discovery, log analysis, fraud pattern match
Horizontal axis: Age of Data
41. Data Value Chain
Repeats the categories, latencies, and examples of slide 40, with these labels added to the chart:
• Value of Individual Data Item
• Aggregate Data Value
Vertical axis: Data Value
Horizontal axis: Age of Data
44. The fastest, most scalable database on the market today
VoltDB
• Ingest massive quantities of data and perform automated decisioning in real time
• 3 MILLION transactions per second
• Dramatically lowering your cost per transaction
• A huge impact on the bottom line
VoltDB enables NOW.
55. Introducing VoltDB 3.0
VoltDB 3.0
VoltDB: a modern OLTP database built for a high velocity world.
– Horizontal scalability
– Hundreds of thousands of transactions per second
– Relational SQL
58. Faster: Ad Hoc SQL Performance
• Conversational SQL
• Thousands to 10,000+ ad hoc SQL transactions/second
• Single or multiple (batch) SQL statement transaction
59. Easier Development: New SQL Support
• SQL LIKE and NOT LIKE
• UNION
• Column Functions
• Counting function (leaderboard ranking queries)
• Ability to define index using column functions
60. Easier Development: JSON Support
• JSON values stored in a varchar column
• Field() column function
• Indexing on JSON elements
CREATE INDEX session_site_moderator
ON user_session_table (field(json_data, 'site'),
field(json_data, 'moderator'), username);
• New JSON sample in kit
61. Easier Development: Online Operations
• Ability to re-join a failed node to cluster with no impact to existing operations
• Online schema update
• No service window
62. Easier Development: Streamlined Development
• Elimination of project.xml
• VoltDB-specific configuration now defined in DDL
• Defaulting of deployment.xml
• New Volt Compiler CLI: voltdb compile
63. Expanded Reach: Cloud-Friendly
• Reduce impact of variable node performance and latency
• Elimination of strict NTP configuration
• Scales to large # of nodes
66. Other Notable New Features
• Explain command
• CSV loader utility
• CSV snapshots
• New Administration CLI: voltadmin
– voltadmin save
– voltadmin restore
– voltadmin pause
– voltadmin resume
– voltadmin shutdown
67. More Samples Available for Download
http://voltdb.com/community/volt-labs.php
68. Volt University
• Portfolio of instructional content, classes, tools, and other resources to help developers build applications quickly
• Curriculum and supporting material range from beginner to advanced
• Three types of instruction:
– Volt University Online
– Volt University Classroom
– Volt Vanguard Certification
69. Summary: VoltDB v3.0
• Run faster: transactions at high velocity scale.
• Create faster: write and scale your ACID application.
• Learn faster: Volt Labs & VoltDB University
71. More Information?
• E-mail: info@voltdb.com
• Visit our forums: http://community.voltdb.com/forum
• Read the VoltDB “Getting Started Guide”: http://community.voltdb.com/docs/GettingStarted/index
• Follow @VoltDB on Twitter
VoltDB was uniquely designed to let companies act on data automatically, the instant it's created, at its point of maximum value. No other database is constructed like this. We are the fastest, most scalable database on the planet. Hands down. Financial trading applications, e-commerce applications, billing applications in dynamic markets like telecommunications, sensor-driven applications in sectors like energy and healthcare: with VoltDB at the core, these applications can make decisions against the freshest, most in-the-moment data, and they can do it at a fraction of the cost of any other database on the market. Three million transactions per second. An unprecedented drop in cost per transaction. All impacting a company's bottom line. VoltDB enables NOW.
Done on the volt10's: Dell R510 server, 2 x Intel(R) Xeon(R) (quad core) CPU X5670 @ 2.93GHz, 64GB RAM