SlideShare a Scribd company logo
1 of 28
Dr. Michael Stonebraker and Scott Jarr


Navigating the Database Universe
About Our Presenters
Mike Stonebraker                          Scott Jarr
Co-founder & CTO, VoltDB                  Co-founder & Chief Strategy
                                          Officer, VoltDB

A pioneer of database research and        More than 20 years of experience
technology for more than a quarter of a   building, launching and growing
century, and the main architect of the    technology companies from inception to
Ingres relational DBMS and the object-    market leadership in the
relational DBMS PostgreSQL                search, mobile, security, storage and
                                          virtualization markets
Agenda

• The (proper) design of DBMSs
  – Presented by Dr. Michael Stonebraker

• The database universe

• Where the future value comes from
We Believe…

• “Big Data” is a rare, transformative market
• Velocity is becoming the cornerstone
• Specialized databases (working together) are
  the answer
• Products must provide tangible customer
  value... Fast
Dr. Michael Stonebraker

THE (PROPER) DESIGN
        OF THE DBMS
Lessons from 40 Years of Database Design
1.   Get the user interaction right
     – Bet on a small number of easy-to-



2.
       understand constructs
     – Plus standards

     Get the implementation right
                                               “   Those who don’t learn
                                                   from history are
     – Bet on a small number of easy-to-
       understand constructs
                                                   destined to repeat it.
                                                             -Winston Churchill   ”
3.   One size does not fit all
     – At least not if you want fast, big or
       complex
#1: Get the User Interaction Right

       Historical Lesson: RDBMS vs. CODASYL vs. OODB

Winner: RDBMS           Loser: CODASYL                             Loser: OODBs
• Simple data model     •   Complicated data model             •   Complex data model
                            (records; participate in “sets”;       (hierarchical
  (tables)                  set has one owner                      records, pointers, sets, ar
• Simple access             and, perhaps, many
                                                                   rays, etc.)
                            members, etc.)
  language (SQL)                                               •   Complex access
                        •   Messy access language (sea
• ACID (transactions)       of “cursors”; some -- but not          language
                            all -- move on every                   (navigation, through this
• Standards (SQL)           command, navigation                    sea)
                            programming)
                                                               •   No standards
Interaction Take Away − Simple is Good

• ACID was easy for people to understand

• SQL provided a standard, high-level language and
  made people productive (transportable skills)
#2: Get the Implementation Right
• Leverage a few simple ideas: Early relational implementations




                                                                          Historical Winners
    – System R storage system dropped links
    – Views (protection, schema modification, performance)
    – Cost-based optimizer
• Leverage a few simple ideas: Postgres
    – User-defined data types and functions (adopted by most everybody)
    – Rules/triggers
    – No-overwrite storage
• Leverage a few simple ideas: Vertica
   – Store data by column
    – Compressed up the ging gong
    – Parallel load without compromising ACID
#3: One Size Does NOT Fit All
• OSFA is an old technology with hundreds
  of bags hanging off it
• It breaks 100% of the time when under
                                             “   …specialized systems
                                                 can each be a factor of
  load                                           50 faster than the
• Load = size or speed or complexity             single ‘one size fits all’
• Load is increasing at a startling rate         system…A factor of 50
                                                 is nothing to sneeze at.
• Purpose-built will exceed by 10x to 100x
• History has not been completely written
  yet…but let’s look at VoltDB as an
                                                       -My Top 10 Assertions About
                                                           Data Warehouses, 2010
                                                                                     ”
  example
Example: VoltDB
• Get the interface right
   – SQL
   – ACID

• Implementation: Leverage a few simple ideas
   – Main memory
   – Stored procedures
   – Deterministic scheduling

• Specialization
   – OLTP focus allowed for above implementation choices
Proving the Theory
                                    Useful Work
• Challenge: OLTP                       4%

  performance
                                                  Recovery 24%
                          Latching 24%
  – TPC-C CPU cycles
                                                   Buffer Pool 24%
  – On the Shore DBMS       Locking 24%
    prototype

  – Elephants should be
    similar
Implementation Construct #1: Main Memory
• Main memory format for data
    – Disk format gets you buffer pool overhead
• What happens if data doesn’t fit?
    – Return to disk-buffer pool architecture (slow)
    – Anti-caching
        • Main memory format for data
        • When memory fills up, then bundle together elderly tuples and write them out
        • Run a transaction in “sleuth mode”; find the required records and move to main
          memory (and pin)
        • Run Xact normally
Implementation Construct #2: Stored Procedures

• Round trip to the DBMS is expensive
   – Do it once per transaction
   – Not once per command
   – Or even once per cursor move
• Ad-hoc queries supported
   – Turn them into dynamic stored procedures
Implementation Construct #3:
Deterministic and Non-deterministic Scheduling
• Non-deterministic (can’t tell order until commit time)
   – MVCC
   – Dynamic locking
• Deterministic
   – Time stamp order
Result of Design Principles: VoltDB Example

• Good interface decisions – made developers more productive
   – SQL & ACID

• Leveraging a few simple implementation ideas – made
  VoltDB wicked fast
   – Main memory
   – Stored procedures
   – Deterministic scheduling
Proving the Theory

• Answer: OLTP performance
  – 3 million transactions per second
                                        “   …we are heading
                                            toward a world with at
                                            least 5 (and probably
  – 7x Cassandra
                                            more) specialized
  – 15 million SQL statements per           engines and the death
    second
                                            of the ‘one size fits all’
  – 100,000+ transactions per               legacy systems.
    commodity server
                                                                   ”
                                                  -The End of an Architectural
                                                  Era (It’s Time for a Complete
                                                                 Rewrite), 2007
Scott Jarr

THE DATABASE UNIVERSE
Technology Meets the Market
Believe
   –   “Big Data” is a rare, transformative market
   –   Velocity is becoming the cornerstone
   –   Specialized databases (working together) are the answer
   –   Products must provide tangible customer value… Fast

Observations
   – Noisy, crowded and new – kinda like Christmas shopping at the mall
   – Everyone wants to understand where the pieces fit
   – Analysts build maps on technology NOT use cases

What we need is…
Data Value Chain




                                                 Age of Data

     Interactive         Real-time Analytics         Record Lookup          Historical Analytics       Exploratory Analytics

     Milliseconds        Hundredths of seconds         Second(s)                  Minutes                      Hours

•   Place trade      •     Calculate risk        •     Retrieve click   •      Backtest algo       •     Algo discovery
•   Serve ad         •     Leaderboard                 stream           •      BI                  •     Log analysis
•   Enrich stream    •     Aggregate             •     Show orders      •      Daily reports       •     Fraud pattern match
•   Examine packet   •     Count
•   Approve trans.
Data Value Chain
            Value of Individual                                                                 Aggregate
                Data Item                                                                       Data Value




                                                                                                                                Data Value
                                                  Age of Data

     Interactive          Real-time Analytics         Record Lookup          Historical Analytics       Exploratory Analytics

     Milliseconds         Hundredths of seconds         Second(s)                  Minutes                      Hours

•   Place trade       •     Calculate risk        •     Retrieve click   •      Backtest algo       •     Algo discovery
•   Serve ad          •     Leaderboard                 stream           •      BI                  •     Log analysis
•   Enrich stream     •     Aggregate             •     Show orders      •      Daily reports       •     Fraud pattern match
•   Examine packet    •     Count
•   Approve trans.
The Database Universe
 Fast
 Complex
 Large
                               Value of Individual Data Item                               Aggregate Data Value
      Application Complexity




                                                                                                                          Data Value
                                                         Traditional RDBMS
Simple Slow
Small
                               Transactional                                                                Analytic
                                                                                                            Exploratory
                                Interactive    Real-time Analytics   Record Lookup   Historical Analytics
                                                                                                             Analytics
The Database Universe
 Fast
 Complex
 Large
                               Value of Individual Data Item                                Aggregate Data Value
      Application Complexity




                                                                                                                             Data Value
                                              Velocity                                                  Hadoop, etc.
                                                                         NoSQL
                                                                                            Data
                                     NewSQL                                               Warehouse
                                                          Traditional RDBMS
Simple Slow
Small
                               Transactional                                                                   Analytic
                                                                                                               Exploratory
                                Interactive     Real-time Analytics   Record Lookup   Historical Analytics
                                                                                                                Analytics
logins trades authorizations clicks
      sensors orders impressions
                                      Closed-loop Big Data

 Interactive & Real-time Analytics



  Historical Reports & Analytics



      Exploratory Analytics
logins trades authorizations clicks
                  sensors orders impressions
                                                  Closed-loop Big Data
                                                  • Make the most
             Interactive & Real-time Analytics      informed decision
                                                    every time there is an
                                                    interaction

                                                  • Real-time decisions
              Historical Reports & Analytics        are informed by
Knowledge                                           operational analytics
                                                    and past knowledge

                  Exploratory Analytics
The Velocity Use Case
What’s it look like?
    –   High throughput, relentless data feeds
    –   Fast decisions on high-value data
    –   Real-time, operational analytics present immediate visibility

What’s the big deal?
    –   Batch converts to real time = efficiency
    –   Decisions made at time of event = better decisions

    –   Ability to micro segment/target/personalize/etc. = conversion, satisfaction, more data is coming at
        you, use it to improve your business
Next Up

QUESTIONS AND ANSWERS
www.voltdb.com


THANK YOU

More Related Content

What's hot

Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
NYC Meetup November 15, 2012
NYC Meetup November 15, 2012NYC Meetup November 15, 2012
NYC Meetup November 15, 2012NuoDB
 
MySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Brasil
 
Accelerating NoSQL
Accelerating NoSQLAccelerating NoSQL
Accelerating NoSQLsunnygleason
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemAdam Marcus
 
Building high traffic http front-ends. theo schlossnagle. зал 1
Building high traffic http front-ends. theo schlossnagle. зал 1Building high traffic http front-ends. theo schlossnagle. зал 1
Building high traffic http front-ends. theo schlossnagle. зал 1rit2011
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandraShun Nakamura
 
NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011David Funaro
 
Conference tutorial: MySQL Cluster as NoSQL
Conference tutorial: MySQL Cluster as NoSQLConference tutorial: MySQL Cluster as NoSQL
Conference tutorial: MySQL Cluster as NoSQLSeveralnines
 
MySQL High-Availability and Scale-Out architectures
MySQL High-Availability and Scale-Out architecturesMySQL High-Availability and Scale-Out architectures
MySQL High-Availability and Scale-Out architecturesFromDual GmbH
 
Oracle strategy for_information_management
Oracle strategy for_information_managementOracle strategy for_information_management
Oracle strategy for_information_managementInSync Conference
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...✔ Eric David Benari, PMP
 
NoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityNoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityLars Marius Garshol
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingSergey Bushik
 
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageWebinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageGlusterFS
 
Gluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGlusterFS
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandrajbellis
 

What's hot (20)

Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
NYC Meetup November 15, 2012
NYC Meetup November 15, 2012NYC Meetup November 15, 2012
NYC Meetup November 15, 2012
 
MySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Cluster no PayPal
MySQL Cluster no PayPal
 
Accelerating NoSQL
Accelerating NoSQLAccelerating NoSQL
Accelerating NoSQL
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL Ecosystem
 
NewSQL
NewSQLNewSQL
NewSQL
 
Building high traffic http front-ends. theo schlossnagle. зал 1
Building high traffic http front-ends. theo schlossnagle. зал 1Building high traffic http front-ends. theo schlossnagle. зал 1
Building high traffic http front-ends. theo schlossnagle. зал 1
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra
 
NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011
 
Conference tutorial: MySQL Cluster as NoSQL
Conference tutorial: MySQL Cluster as NoSQLConference tutorial: MySQL Cluster as NoSQL
Conference tutorial: MySQL Cluster as NoSQL
 
MySQL High-Availability and Scale-Out architectures
MySQL High-Availability and Scale-Out architecturesMySQL High-Availability and Scale-Out architectures
MySQL High-Availability and Scale-Out architectures
 
Oracle strategy for_information_management
Oracle strategy for_information_managementOracle strategy for_information_management
Oracle strategy for_information_management
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
 
NoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityNoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativity
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
 
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageWebinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
 
Gluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFS
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandra
 

Similar to "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB

OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...
OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...
OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...NETWAYS
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
Webinar: The Future of SQL
Webinar: The Future of SQLWebinar: The Future of SQL
Webinar: The Future of SQLCrate.io
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterMat Keep
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBen Stopford
 
Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Saltmarch Media
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure javaRoman Elizarov
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandraBrian Enochson
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydbDaniel Austin
 
Storage Systems For Scalable systems
Storage Systems For Scalable systemsStorage Systems For Scalable systems
Storage Systems For Scalable systemselliando dias
 
What Does Big Data Mean and Who Will Win
What Does Big Data Mean and Who Will WinWhat Does Big Data Mean and Who Will Win
What Does Big Data Mean and Who Will WinBigDataCloud
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)Ben Stopford
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureArthur Gimpel
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remanijaxconf
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling SoftwareAbdelmonaim Remani
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Ben Stopford
 
Solving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps styleSolving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps styleMayaData
 

Similar to "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB (20)

OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...
OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...
OSDC 2018 | The operational brain: how new Paradigms like Machine Learning ar...
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
Webinar: The Future of SQL
Webinar: The Future of SQLWebinar: The Future of SQL
Webinar: The Future of SQL
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
 
Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure java
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydb
 
Storage Systems For Scalable systems
Storage Systems For Scalable systemsStorage Systems For Scalable systems
Storage Systems For Scalable systems
 
What Does Big Data Mean and Who Will Win
What Does Big Data Mean and Who Will WinWhat Does Big Data Mean and Who Will Win
What Does Big Data Mean and Who Will Win
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data Architecture
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012
 
Solving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps styleSolving k8s persistent workloads using k8s DevOps style
Solving k8s persistent workloads using k8s DevOps style
 

Recently uploaded

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Recently uploaded (20)

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB

  • 1. Dr. Michael Stonebraker and Scott Jarr Navigating the Database Universe
  • 2. About Our Presenters Mike Stonebraker Scott Jarr Co-founder & CTO, VoltDB Co-founder & Chief Strategy Officer, VoltDB A pioneer of database research and More than 20 years of experience technology for more than a quarter of a building, launching and growing century, and the main architect of the technology companies from inception to Ingres relational DBMS and the object- market leadership in the relational DBMS PostgreSQL search, mobile, security, storage and virtualization markets
  • 3. Agenda • The (proper) design of DBMSs – Presented by Dr. Michael Stonebraker • The database universe • Where the future value comes from
  • 4. We Believe… • “Big Data” is a rare, transformative market • Velocity is becoming the cornerstone • Specialized databases (working together) are the answer • Products must provide tangible customer value... Fast
  • 5. Dr. Michael Stonebraker THE (PROPER) DESIGN OF THE DBMS
  • 6. Lessons from 40 Years of Database Design 1. Get the user interaction right – Bet on a small number of easy-to- 2. understand constructs – Plus standards Get the implementation right “ Those who don’t learn from history are – Bet on a small number of easy-to- understand constructs destined to repeat it. -Winston Churchill ” 3. One size does not fit all – At least not if you want fast, big or complex
  • 7. #1: Get the User Interaction Right Historical Lesson: RDBMS vs. CODASYL vs. OODB Winner: RDBMS Loser: CODASYL Loser: OODBs • Simple data model • Complicated data model • Complex data model (records; participate in “sets”; (hierarchical (tables) set has one owner records, pointers, sets, ar • Simple access and, perhaps, many rays, etc.) members, etc.) language (SQL) • Complex access • Messy access language (sea • ACID (transactions) of “cursors”; some -- but not language all -- move on every (navigation, through this • Standards (SQL) command, navigation sea) programming) • No standards
  • 8. Interaction Take Away − Simple is Good • ACID was easy for people to understand • SQL provided a standard, high-level language and made people productive (transportable skills)
  • 9. #2: Get the Implementation Right • Leverage a few simple ideas: Early relational implementations Historical Winners – System R storage system dropped links – Views (protection, schema modification, performance) – Cost-based optimizer • Leverage a few simple ideas: Postgres – User-defined data types and functions (adopted by most everybody) – Rules/triggers – No-overwrite storage • Leverage a few simple ideas: Vertica – Store data by column – Compressed up the ging gong – Parallel load without compromising ACID
  • 10. #3: One Size Does NOT Fit All • OSFA is an old technology with hundreds of bags hanging off it • It breaks 100% of the time when under “ …specialized systems can each be a factor of load 50 faster than the • Load = size or speed or complexity single ‘one size fits all’ • Load is increasing at a startling rate system…A factor of 50 is nothing to sneeze at. • Purpose-built will exceed by 10x to 100x • History has not been completely written yet…but let’s look at VoltDB as an -My Top 10 Assertions About Data Warehouses, 2010 ” example
  • 11. Example: VoltDB • Get the interface right – SQL – ACID • Implementation: Leverage a few simple ideas – Main memory – Stored procedures – Deterministic scheduling • Specialization – OLTP focus allowed for above implementation choices
  • 12. Proving the Theory Useful Work • Challenge: OLTP 4% performance Recovery 24% Latching 24% – TPC-C CPU cycles Buffer Pool 24% – On the Shore DBMS Locking 24% prototype – Elephants should be similar
  • 13. Implementation Construct #1: Main Memory • Main memory format for data – Disk format gets you buffer pool overhead • What happens if data doesn’t fit? – Return to disk-buffer pool architecture (slow) – Anti-caching • Main memory format for data • When memory fills up, then bundle together elderly tuples and write them out • Run a transaction in “sleuth mode”; find the required records and move to main memory (and pin) • Run Xact normally
  • 14. Implementation Construct #2: Stored Procedures • Round trip to the DBMS is expensive – Do it once per transaction – Not once per command – Or even once per cursor move • Ad-hoc queries supported – Turn them into dynamic stored procedures
  • 15. Implementation Construct #3: Deterministic and Non-deterministic Scheduling • Non-deterministic (can’t tell order until commit time) – MVCC – Dynamic locking • Deterministic – Time stamp order
  • 16. Result of Design Principles: VoltDB Example • Good interface decisions – made developers more productive – SQL & ACID • Leveraging a few simple implementation ideas – made VoltDB wicked fast – Main memory – Stored procedures – Deterministic scheduling
  • 17. Proving the Theory • Answer: OLTP performance – 3 million transactions per second “ …we are heading toward a world with at least 5 (and probably – 7x Cassandra more) specialized – 15 million SQL statements per engines and the death second of the ‘one size fits all’ – 100,000+ transactions per legacy systems. commodity server ” -The End of an Architectural Era (It’s Time for a Complete Rewrite), 2007
  • 19. Technology Meets the Market Believe – “Big Data” is a rare, transformative market – Velocity is becoming the cornerstone – Specialized databases (working together) are the answer – Products must provide tangible customer value… Fast Observations – Noisy, crowded and new – kinda like Christmas shopping at the mall – Everyone wants to understand where the pieces fit – Analysts build maps on technology NOT use cases What we need is…
  • 20. Data Value Chain Age of Data Interactive Real-time Analytics Record Lookup Historical Analytics Exploratory Analytics Milliseconds Hundredths of seconds Second(s) Minutes Hours • Place trade • Calculate risk • Retrieve click • Backtest algo • Algo discovery • Serve ad • Leaderboard stream • BI • Log analysis • Enrich stream • Aggregate • Show orders • Daily reports • Fraud pattern match • Examine packet • Count • Approve trans.
  • 21. Data Value Chain Value of Individual Aggregate Data Item Data Value Data Value Age of Data Interactive Real-time Analytics Record Lookup Historical Analytics Exploratory Analytics Milliseconds Hundredths of seconds Second(s) Minutes Hours • Place trade • Calculate risk • Retrieve click • Backtest algo • Algo discovery • Serve ad • Leaderboard stream • BI • Log analysis • Enrich stream • Aggregate • Show orders • Daily reports • Fraud pattern match • Examine packet • Count • Approve trans.
  • 22. The Database Universe Fast Complex Large Value of Individual Data Item Aggregate Data Value Application Complexity Data Value Traditional RDBMS Simple Slow Small Transactional Analytic Exploratory Interactive Real-time Analytics Record Lookup Historical Analytics Analytics
  • 23. The Database Universe Fast Complex Large Value of Individual Data Item Aggregate Data Value Application Complexity Data Value Velocity Hadoop, etc. NoSQL Data NewSQL Warehouse Traditional RDBMS Simple Slow Small Transactional Analytic Exploratory Interactive Real-time Analytics Record Lookup Historical Analytics Analytics
  • 24. logins trades authorizations clicks sensors orders impressions Closed-loop Big Data Interactive & Real-time Analytics Historical Reports & Analytics Exploratory Analytics
  • 25. logins trades authorizations clicks sensors orders impressions Closed-loop Big Data • Make the most Interactive & Real-time Analytics informed decision every time there is an interaction • Real-time decisions Historical Reports & Analytics are informed by Knowledge operational analytics and past knowledge Exploratory Analytics
  • 26. The Velocity Use Case What’s it look like? – High throughput, relentless data feeds – Fast decisions on high-value data – Real-time, operational analytics present immediate visibility What’s the big deal? – Batch converts to real time = efficiency – Decisions made at time of event = better decisions – Ability to micro segment/target/personalize/etc. = conversion, satisfaction, more data is coming at you, use it to improve your business