Building a SQL Database that Works

Josh Berkus
PostgreSQL Experts, Inc.
OpenSourceBridge 2009
How Not To Do It
four popular methods
1. One Big Spreadsheet
2. EAV & E-Blob
EAV
ID     Property    Setting
407    Eyes        Brown
407    Height      73in
407    Married?    TRUE
408    Married?    FALSE
408    Smoker      FALSE
408    Age         37
409    Height      66in

E-Blob
ID     Properties
407    <eyes="brown"><height="73"><married="1"><smoker="1">
408    <hair="brown"><age="49"><married="0"><smoker="0">
409    <age="37"><height="66"><hat="old"><teeth="gold">
3. Incremental Development
4. Leave It to the ORM
E.F. Codd
Database Engineer, IBM 1970
IBM Databases Run Amok
1. losing data
2. duplicate data
3. wrong data
4. crappy performance
5. downtime for database redesign whenever anyone made an application change
The Relational Model
All the Relational Model
   You Need to Know
in less than 10 minutes
Set (Bag) Theory
Relations
[Diagram: a relation (table, view, rowset)]
Tuples
[Diagram: a relation (table, view, rowset) containing five tuples (rows)]
Attributes
[Diagram: a relation (table, view, rowset) of five tuples (rows), each tuple made up of three attributes]
Domains (types)
[Diagram: the relation (table, view, rowset) and its tuples (rows), with each attribute labeled by its domain: INT, TEXT, or DATE]
Keys
[Diagram: the relation (table, view, rowset) and its tuples (rows), with one attribute in each tuple marked as the key]
Constraints
[Diagram: the relation (table, view, rowset) and its tuples (rows), with a constraint attached: Attribute > 5]
Foreign Key Constraint
Derived Relation (query)
?
Problem 1:
Losing Data
The Atomic Age
Simple User Table
●   name (text)
●   email (text)
●   login (text)
●   password (text)
●   status (char)
    (user, inactive, admin)
[Diagram labels: Users, Admins]
Non-Atomic Attributes
●   name (text)
●   email (text)
●   login (text)
●   password (text)
●   status (char)
    (user, inactive, admin)
[Diagram labels: Users, Admins]
What's Atomic?
The simplest form of a datum, which is not
   divisible without loss of information.

                    name
                 Josh Berkus
SELECT SUBSTR(name,STRPOS(name, ' ')) ...

                    status
                      ai

    … WHERE status = 'a' OR status = 'u' ... ???
What's Atomic?
The simplest form of a datum, which is not
   divisible without loss of information.

        first_name     last_name
            Josh         Berkus



          active         access
          TRUE             a
Table Atomized!
●   first_name (text)
●   last_name (text)
●   email (text)
●   login (text)
●   password (text)
●   active (boolean)
●   access (char)
[Diagram labels: Users, Admins]
Atomic, Shmomic. Who Cares?
●   Atomic Values:
       –   retain data
       –   make joins easier
       –   make constraints easier
●   Non-atomic Values:
       –   make data loss more likely
       –   increase CPU usage
       –   make you more likely to forget something
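To make the atomic vs. non-atomic point concrete, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, so SQLite's substr()/instr() take the place of SUBSTR/STRPOS, and the table names users_flat and users_atomic are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_flat (name TEXT)")
conn.execute("CREATE TABLE users_atomic (first_name TEXT, last_name TEXT)")

# The same person, entered twice in two different formats.
conn.executemany("INSERT INTO users_flat VALUES (?)",
                 [("Josh Berkus",), ("Berkus, Josh",)])
conn.executemany("INSERT INTO users_atomic VALUES (?, ?)",
                 [("Josh", "Berkus"), ("Josh", "Berkus")])

# Non-atomic: the last name has to be parsed out of the string,
# and the parse silently goes wrong for "Berkus, Josh".
flat = [row[0] for row in conn.execute(
    "SELECT substr(name, instr(name, ' ') + 1) FROM users_flat ORDER BY rowid")]

# Atomic: the last name is simply a column.
atomic = [row[0] for row in conn.execute(
    "SELECT last_name FROM users_atomic ORDER BY rowid")]

print(flat)    # ['Berkus', 'Josh']
print(atomic)  # ['Berkus', 'Berkus']
```

The parsing query costs extra CPU on every row and still gets the second name wrong, which is the "data loss" and "forget something" risk from the list above.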
Transactions
Splitting a Bulletin Board Thread
             the hard way
INSERT INTO threads VALUES ( .... );
if $dbh('success') then
  -- move the later posts to the new thread, one row at a time
  for $these_posts.date > $cutdate loop
    UPDATE posts SET thread = $newthread
    WHERE id = $these_posts.id;
  -- if anything failed, undo all of it by hand
  if not $dbh('success') then
    for $these_posts.id > $last_id loop
      UPDATE posts SET thread = $oldthread
      WHERE id = $these_posts.id;
      DELETE FROM threads
      WHERE id = $newthread;
Splitting a Bulletin Board Thread
         the transactional way
BEGIN;
  INSERT INTO threads VALUES ( .... );
  $newthread = currval( .... );
  UPDATE posts SET thread = $newthread
     WHERE thread = $oldthread
     AND date > $cutdate;
END;
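The transactional version can be sketched as runnable code. This is only an illustration under stated assumptions: SQLite via Python's sqlite3 stands in for PostgreSQL, cursor.lastrowid stands in for the sequence lookup, and the function name split_thread, the column posted, and the sample rows are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE posts (
        id     INTEGER PRIMARY KEY,
        thread INTEGER NOT NULL,
        posted TEXT NOT NULL
    );
    INSERT INTO threads (id, title) VALUES (1, 'Dinner?');
    INSERT INTO posts (thread, posted)
    VALUES (1, '09:28'), (1, '09:37'), (1, '09:44');
""")

def split_thread(conn, old_thread, cutdate, title):
    """Move every post after cutdate into a new thread, all or nothing."""
    conn.execute("BEGIN")
    try:
        cur = conn.execute("INSERT INTO threads (title) VALUES (?)", (title,))
        new_thread = cur.lastrowid
        # One set-based UPDATE instead of a loop over individual posts.
        conn.execute(
            "UPDATE posts SET thread = ? WHERE thread = ? AND posted > ?",
            (new_thread, old_thread, cutdate))
        conn.execute("COMMIT")
        return new_thread
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # nothing is left half-done
        raise

new_thread = split_thread(conn, 1, "09:30", "Dessert?")
moved = conn.execute(
    "SELECT count(*) FROM posts WHERE thread = ?", (new_thread,)).fetchone()[0]

# A failing split (NULL title violates NOT NULL) leaves the data untouched.
try:
    split_thread(conn, 1, "00:00", None)
    failed = False
except sqlite3.IntegrityError:
    failed = True
thread_count = conn.execute("SELECT count(*) FROM threads").fetchone()[0]
```

There is no hand-written undo path here: the ROLLBACK replaces the whole second half of the "hard way" version.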
name             user_name
Josh Berkus      admin
Joshua Berkus    Josh Berkus
Berkus, Josh     Berkus

user_name
Josh Berkus
Josh
Joshua Berkus

Problem 2:
Duplicate Data
Where are my Keys?
Candidate (Natural) Keys
●   first_name (text)
●   last_name (text)
●   email (text)
●   login (text)        Key
●   password (text)
●   active (boolean)
●   access (char)
[Diagram labels: Users, Admins]
A Good Key
●   Has to be unique because the application
    requires it to be.
●   Expresses a unique predicate which describes
    the tuple (row):
        –   user with login “jberkus”
        –   post from “jberkus” on “2009-05-02 13:41:22” in
             thread “Making your own wine”
●   If you can't find a good key, your table design is
    missing data.
Surrogate Key
●   user_id (serial)        Surrogate key
●   first_name (text)
●   last_name (text)
●   email (text)
●   login (text)            Key
●   password (text)
●   active (boolean)
●   access (char)
[Diagram labels: Users, Admins]
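One way to read this slide: a surrogate key is added next to the natural key, not instead of it. The sketch below assumes SQLite through Python's sqlite3 as a stand-in for PostgreSQL, with INTEGER PRIMARY KEY playing the role of serial; the trimmed-down column list is only for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,   -- surrogate key, generated by the database
        login     TEXT NOT NULL UNIQUE,  -- natural key, still enforced
        last_name TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO users (login, last_name) VALUES ('jberkus', 'Berkus')")

# Without the UNIQUE constraint this second row would be accepted under a
# new user_id, and the table would hold the same user twice.
try:
    conn.execute("INSERT INTO users (login, last_name) VALUES ('jberkus', 'Berkus')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

user_count = conn.execute("SELECT count(*) FROM users").fetchone()[0]
```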
We All Just Want to Be Normal
Abby Normal
login      level    last_name
jberkus    u        Berkus
selena     a        Deckelman

login      title      posted    level
jberkus    Dinner?    09:28     u
selena     Dinner?    09:37     u
jberkus    Dinner?    09:44     a
How can I be “Normal”?
1. Each piece of data only appears in one relation
        –    except as a “foreign key” attribute
2. No “repeated” attributes

login      level    last_name
jberkus    u        Berkus
selena     a        Deckelman

login      title      posted
jberkus    Dinner?    09:28
selena     Dinner?    09:37
jberkus    Dinner?    09:44
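With level stored only in the users relation, a post's level comes from a join and can never disagree with the user's. A minimal sketch, assuming SQLite via Python's sqlite3 in place of PostgreSQL and the sample rows from the slide:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (login TEXT PRIMARY KEY, level TEXT, last_name TEXT);
    CREATE TABLE posts (login TEXT REFERENCES users (login),
                        title TEXT, posted TEXT);
    INSERT INTO users VALUES ('jberkus', 'u', 'Berkus'),
                             ('selena',  'a', 'Deckelman');
    INSERT INTO posts VALUES ('jberkus', 'Dinner?', '09:28'),
                             ('selena',  'Dinner?', '09:37'),
                             ('jberkus', 'Dinner?', '09:44');
""")

QUERY = """
    SELECT p.login, p.posted, u.level
    FROM posts p JOIN users u ON u.login = p.login
    ORDER BY p.posted
"""
before = conn.execute(QUERY).fetchall()

# Promoting jberkus is a single-row change; every post reflects it at once.
conn.execute("UPDATE users SET level = 'a' WHERE login = 'jberkus'")
after = conn.execute(QUERY).fetchall()
```

In the "Abby Normal" layout the same promotion would have required updating every post row, and missing one produces exactly the contradictory levels shown there.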
Problem 3:
Wrong Data
Constraints
Users Run Amok

●   first_name (text)
●   last_name (text)
●   email (text)
●   login (text)
●   password (text)
●   active (boolean)
●   access (char)
Ensure Your Data is Consistent
     no matter where it came from

first_name   last_name   email                login     password     active   level
Josh         Berkus      josh@pgexperts.com   jberkus   jehosaphat   TRUE     a
NULL         Kelley      kelley@ucb           k         NULL         TRUE     u
Mark         Twain       www.pm.org           samuel    halleys      NULL     I
S            F           gavin@sf.gov         gavin     twitter      FALSE    x
Users Under Constraint

●   first_name (text)                   length() > 1
●   last_name (text)                    length() > 1
●   email (text)                        ILIKE '%@%.%'
●   login (text)                        length() > 5
●   password (text)                     length() > 5
●   active (boolean)                    NOT NULL
●   access (char)                       IN ( 'a','u' )

                       note: email and other validators would, of course, be more complex
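The constraints on this slide can be written as CHECK clauses and tried against the four sample rows from the previous slide. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for PostgreSQL, SQLite's LIKE (case-insensitive for ASCII) replaces ILIKE, and NOT NULL on every column is an addition of this example, since a CHECK alone lets NULL through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        first_name TEXT    NOT NULL CHECK (length(first_name) > 1),
        last_name  TEXT    NOT NULL CHECK (length(last_name) > 1),
        email      TEXT    NOT NULL CHECK (email LIKE '%@%.%'),
        login      TEXT    NOT NULL CHECK (length(login) > 5),
        password   TEXT    NOT NULL CHECK (length(password) > 5),
        active     BOOLEAN NOT NULL,
        access     TEXT    NOT NULL CHECK (access IN ('a', 'u'))
    )
""")

rows = [
    ("Josh", "Berkus", "josh@pgexperts.com", "jberkus", "jehosaphat", True,  "a"),
    (None,   "Kelley", "kelley@ucb",         "k",       None,         True,  "u"),
    ("Mark", "Twain",  "www.pm.org",         "samuel",  "halleys",    None,  "I"),
    ("S",    "F",      "gavin@sf.gov",       "gavin",   "twitter",    False, "x"),
]

accepted, rejected = [], []
for row in rows:
    try:
        conn.execute("INSERT INTO users VALUES (?, ?, ?, ?, ?, ?, ?)", row)
        accepted.append(row[3])
    except sqlite3.IntegrityError:
        # The database refuses the row no matter which application sent it.
        rejected.append(row[3])
```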
Foreign Keys


[Diagram: Users, Admins, and Posts, linked through the attribute login]
Posts Table
●   title (text) NOT NULL
    REFERENCES threads ( title )
    ON DELETE CASCADE ON UPDATE CASCADE
●   posted (timestamp) NOT NULL
●   user (text) NOT NULL
    REFERENCES users ( login )
    ON DELETE CASCADE ON UPDATE CASCADE
●   content (text) NOT NULL
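A hedged sketch of what the cascading foreign key does in practice. Assumptions: SQLite via Python's sqlite3 stands in for PostgreSQL (SQLite only enforces foreign keys after PRAGMA foreign_keys = ON), the column is called author here because user is a reserved word in PostgreSQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; off by default
conn.executescript("""
    CREATE TABLE users (login TEXT PRIMARY KEY);
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        author  TEXT NOT NULL
                REFERENCES users (login)
                ON DELETE CASCADE ON UPDATE CASCADE,
        content TEXT NOT NULL
    );
    INSERT INTO users VALUES ('jberkus'), ('jerkyboy'), ('selena');
    INSERT INTO posts (author, content) VALUES
        ('jberkus',  'What''s up?'),
        ('jerkyboy', 'spam'),
        ('jerkyboy', 'more spam'),
        ('selena',   'Why?');
""")

# ON DELETE CASCADE: removing the user removes all of that user's posts.
conn.execute("DELETE FROM users WHERE login = 'jerkyboy'")
after_delete = [r[0] for r in conn.execute("SELECT author FROM posts ORDER BY id")]

# ON UPDATE CASCADE: renaming a login is propagated to the posts.
conn.execute("UPDATE users SET login = 'josh' WHERE login = 'jberkus'")
after_rename = [r[0] for r in conn.execute("SELECT author FROM posts ORDER BY id")]
```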
Beautiful Cascades
users.login
    Josh Berkus
    jberkus
    jerkyboy
    selena

posts.content
    Josh Berkus
    What's up?
    I'm going crazy!
    www.pornking.com
    Why?
    www.whitehouse.com
    OSB! It's too much!
    www.whiteslavery.com
    www.lolcats.com
    I told you so ...
Problem 4:
Crappy Performance
Things We've Already Done
●   Atomicization
       –   less CPU on parsing, calculations
●   Normalization
       –   less data duplication
       –   smaller tables
●   Transactions
       –   more batches, less iteration
       –   less locking
Denormalized Derived Relations
 materialized views for the win


[Diagram: Users, Admins, and Posts, with a derived relation user_postcount]
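One common way to get a denormalized derived relation is a summary table kept current by triggers. The sketch below is an assumption about how user_postcount might be maintained, not the talk's own implementation; it uses SQLite via Python's sqlite3 in place of PostgreSQL, and the trigger names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (login TEXT PRIMARY KEY);
    CREATE TABLE posts (id INTEGER PRIMARY KEY,
                        login TEXT NOT NULL,
                        content TEXT NOT NULL);

    -- The derived relation: one precomputed row per user.
    CREATE TABLE user_postcount (login TEXT PRIMARY KEY,
                                 postcount INTEGER NOT NULL);

    CREATE TRIGGER users_after_insert AFTER INSERT ON users BEGIN
        INSERT INTO user_postcount VALUES (NEW.login, 0);
    END;
    CREATE TRIGGER posts_after_insert AFTER INSERT ON posts BEGIN
        UPDATE user_postcount SET postcount = postcount + 1
        WHERE login = NEW.login;
    END;
    CREATE TRIGGER posts_after_delete AFTER DELETE ON posts BEGIN
        UPDATE user_postcount SET postcount = postcount - 1
        WHERE login = OLD.login;
    END;

    INSERT INTO users VALUES ('jberkus'), ('selena');
    INSERT INTO posts (login, content) VALUES
        ('jberkus', 'What''s up?'), ('selena', 'Why?'), ('jberkus', 'I told you so ...');
""")

counts = dict(conn.execute("SELECT login, postcount FROM user_postcount"))

conn.execute("DELETE FROM posts WHERE content = 'I told you so ...'")
counts_after_delete = dict(conn.execute("SELECT login, postcount FROM user_postcount"))
```

Reads of the count become a single-row lookup, at the price of a little extra work on every write to posts. The base tables stay normalized; only the derived relation is denormalized.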
Problem 5:
Database Changes Cause
  Application Downtime
     and vice-versa
Stuff We've Already Done
              to make our data “agile”
●   Atomicization
    –   data isn't in specific interface version formats
●   Normalization
    –   where to extend data is more obvious
    –   create a new table if you have to
●   Transactions
         –   prevent partial failures from changed schema
Views
Extending the Users Table
●   first_name (text)
●   last_name (text)
●   email (text)
●   login (text)
●   password (text)
●   active (boolean)
●   access (char)
●   created (timestamp)
●   last_login (timestamp)

CREATE VIEW oldapp.users
AS
SELECT
 first_name || ' ' || last_name AS name,
 email, login, password,
 active, access
FROM users;
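A runnable sketch of the compatibility view, under assumptions: SQLite via Python's sqlite3 stands in for PostgreSQL, the view is named oldapp_users because an in-memory SQLite database has no oldapp schema to put it in, and the alias name for the combined column is an assumption about what the old application expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        first_name TEXT, last_name TEXT, email TEXT, login TEXT,
        password TEXT, active BOOLEAN, access TEXT,
        created TEXT, last_login TEXT      -- the new columns
    );

    -- The old application keeps seeing the shape it was written against.
    CREATE VIEW oldapp_users AS
    SELECT first_name || ' ' || last_name AS name,
           email, login, password, active, access
    FROM users;

    INSERT INTO users VALUES
        ('Josh', 'Berkus', 'josh@pgexperts.com', 'jberkus', 'jehosaphat',
         1, 'a', '2009-05-02 13:41:22', NULL);
""")

cur = conn.execute("SELECT * FROM oldapp_users")
columns = [d[0] for d in cur.description]
row = cur.fetchone()
```

The table can keep growing (created, last_login, and whatever comes next) while queries from the old application go through the view unchanged.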
The Rest
                 you already know
●   Write Migrations
       –   deploy these in transactions, if supported
       –   if not, write rollback scripts
●   Write Tests
       –   use a realistic staging environment
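Deploying a migration in a transaction might look like the sketch below. It assumes a database with transactional DDL; SQLite via Python's sqlite3 is used here as a stand-in for PostgreSQL, and the helper names migrate and table_columns are hypothetical.

```python
import sqlite3

# Autocommit mode, so BEGIN / COMMIT / ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (login TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users VALUES ('jberkus')")

def migrate(conn, statements):
    """Apply all statements or none of them; return True on success."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # schema changes are undone as well
        return False

def table_columns(conn):
    return [row[1] for row in conn.execute("PRAGMA table_info(users)")]

good = migrate(conn, [
    "ALTER TABLE users ADD COLUMN created TEXT",
    "ALTER TABLE users ADD COLUMN last_login TEXT",
])

# The second statement fails, so the first one must not survive either.
bad = migrate(conn, [
    "ALTER TABLE users ADD COLUMN nickname TEXT",
    "UPDATE no_such_table SET x = 1",
])

final_columns = table_columns(conn)
```

Where the database cannot roll back schema changes, the same structure still applies, but the ROLLBACK has to be replaced by the hand-written rollback script the slide mentions.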
Some Other Tips
●   Pick a naming scheme
        –   and stick to it
●   Don't do your joins in the application
        –   the database does them better
●   Repurposing fields will bite you
        –   sure you'll remember when that changed
●   Don't micro-optimize
        –   INT3, anyone?
More Information
●   me
         –    josh@pgexperts.com
         –    www.pgexperts.com
         –    it.toolbox.com/blogs/database-soup
●   postgresql: www.postgresql.org
●   tutorial at OSCON
         –    monday, 8:30 am!
         –    see you in San Jose!


             This presentation copyright 2009 Josh Berkus, licensed for distribution under the
              Creative Commons Attribution License, except for photos, most of which were
              stolen from other people's websites via images.google.com. Thanks, Google!

[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Building a SQL Database that Works

  • 1. Building a SQL Database that works Josh Berkus PostgreSQL Experts, Inc. OpenSourceBridge 2009
  • 2. How Not To Do It four popular methods
  • 3. 1. One Big Spreadsheet
  • 4. 2. EAV & E-Blob
      EAV:
        ID  | Property | Setting
        407 | Eyes     | Brown
        407 | Height   | 73in
        407 | Married? | TRUE
        408 | Married? | FALSE
        408 | Smoker   | FALSE
        408 | Age      | 37
        409 | Height   | 66in
      E-Blob:
        ID  | Properties
        407 | <eyes="brown"><height="73"><married="1"><smoker="1">
        408 | <hair="brown"><age="49"><married="0"><smoker="0">
        409 | <age="37"><height="66"><hat="old"><teeth="gold">
  • 6. 4. Leave It to the ORM
  • 8. IBM Databases Run Amok 1. losing data 2. duplicate data 3. wrong data 4. crappy performance 5. downtime for database redesign whenever anyone made an application change
  • 10. All the Relational Model You Need to Know in less than 10 minutes
  • 12. Relations Relation (table, view, rowset)
  • 13. Tuples [diagram: a relation (table, view, rowset) contains tuples (rows)]
  • 14. Attributes [diagram: each tuple (row) is made up of attributes]
  • 15. Domains (types) [diagram: each attribute has a domain such as INT, TEXT, or DATE]
  • 16. Keys [diagram: each tuple (row) has a key attribute]
  • 17. Constraints [diagram: an attribute can carry a constraint, for example Attribute > 5]
  • 22. Simple User Table ● name (text) ● email (text) ● login (text) ● password (text) ● status (char) (user, inactive, admin) Users Admins
  • 23. Non-Atomic Attributes ● name (text) ● email (text) ● login (text) ● password (text) ● status (char) (user, inactive, admin) Users Admins
  • 24. What's Atomic? The simplest form of a datum, which is not divisible without loss of information.
      name: Josh Berkus
        SELECT SUBSTR(name, STRPOS(name, ' ')) ...
      status: ai
        ... WHERE status = 'a' ... or ... WHERE status = 'u' ... ???
  • 25. What's Atomic? The simplest form of a datum, which is not divisible without loss of information.
      first_name | last_name
      Josh       | Berkus

      active | access
      TRUE   | a
  • 26. Table Atomized! ● first_name (text) ● last_name (text) ● email (text) ● login (text) ● password (text) ● active (boolean) Users ● access (char) Admins
  • 27. Atomic, Shmomic. Who Cares? ● Atomic Values: – retain data – make joins easier – make constraints easier ● Non-atomic Values: – make data loss more likely – increase CPU usage – make you more likely to forget something
  • 29. Splitting a Bulletin Board Thread the hard way
      INSERT INTO threads VALUES ( .... );
      if $dbh('success') then
        for $these_posts.date > $cutdate loop
          UPDATE posts SET thread = $newthread WHERE id = $these_posts.id;
          if not $dbh('success') then
            for $these_posts.id > $last_id loop
              UPDATE posts SET thread = $oldthread WHERE id = $these_posts.id;
            DELETE FROM threads WHERE id = $newthread;
  • 30. Splitting a Bulletin Board Thread the transactional way
      BEGIN;
      INSERT INTO threads VALUES ( .... );
      $newthread = currval();
      UPDATE posts SET thread = $newthread
        WHERE thread = $oldthread AND date > $cutdate;
      END;
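The transactional approach can be sketched as runnable PostgreSQL. This is only an illustration: the schema, the sample rows, thread id 1, and the cutoff timestamp are all assumptions, and the slide's `date` column is called `posted` here.

```sql
-- Hypothetical schema and sample data; none of these names come from the slides.
CREATE TABLE threads (
    id    serial PRIMARY KEY,
    title text NOT NULL
);

CREATE TABLE posts (
    id      serial PRIMARY KEY,
    thread  integer NOT NULL REFERENCES threads (id),
    posted  timestamp NOT NULL,
    content text NOT NULL
);

INSERT INTO threads (title) VALUES ('Making your own wine');  -- gets id 1
INSERT INTO posts (thread, posted, content) VALUES
    (1, '2009-05-02 13:41:22', 'First post'),
    (1, '2009-05-03 09:28:00', 'Off-topic reply');

BEGIN;

-- Create the new thread.
INSERT INTO threads (title) VALUES ('Split from: Making your own wine');

-- Move every post after the cut date in one statement.
-- currval() returns the id this session's INSERT just generated.
UPDATE posts
SET    thread = currval(pg_get_serial_sequence('threads', 'id'))
WHERE  thread = 1
AND    posted > '2009-05-02 23:59:59';

-- If any statement above fails, the transaction aborts and neither change is kept.
COMMIT;
```

Because the set-based UPDATE replaces the per-post loop, there is no partial state to undo by hand.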
  • 31. Problem 2: Duplicate Data
      columns: name, user_name, admin
      values for the same person: Josh Berkus; Berkus; Joshua Berkus; Berkus, Josh; Josh
  • 32. Where are my Keys?
  • 33. Candidate (Natural) Keys ● first_name (text) ● last_name (text) ● email (text) ● login (text) Key ● password (text) ● active (boolean) Users ● access (char) Admins
  • 34. A Good Key ● Should have to be unique because the application requires it to be. ● Expresses a unique predicate which describes the tuple (row): – user with login “jberkus” – post from “jberkus” on “2009-05-02 13:41:22” in thread “Making your own wine” ● If you can't find a good key, your table design is missing data.
  • 35. Surrogate Key ● first_name (text) ● user_id (serial) ● last_name (text) ● email (text) ● login (text) Key ● password (text) ● active (boolean) Users ● access (char) Admins
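One way to read the surrogate-key slide is that `user_id` becomes the primary key while the natural key on `login` stays enforced. The sketch below assumes that reading; the constraint choices are illustrative and are not the slide's exact DDL.

```sql
-- Sketch only: constraint choices are assumptions.
CREATE TABLE users (
    user_id    serial  PRIMARY KEY,      -- surrogate key, generated by the database
    first_name text,
    last_name  text,
    email      text,
    login      text    NOT NULL UNIQUE,  -- natural (candidate) key, still enforced
    password   text,
    active     boolean,
    access     char(1)
);

INSERT INTO users (first_name, last_name, email, login, password, active, access)
VALUES ('Josh', 'Berkus', 'josh@pgexperts.com', 'jberkus', 'jehosaphat', TRUE, 'a');

-- Look up the generated surrogate key through the natural key.
SELECT user_id FROM users WHERE login = 'jberkus';
```

Keeping the UNIQUE constraint matters: without it, the surrogate key would let two rows share one login.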
  • 36. We All Just Want to Be Normal
  • 37. Abby Normal
      login   | level | last_name
      jberkus | u     | Berkus
      selena  | a     | Deckelman

      login   | title   | posted | level
      jberkus | Dinner? | 09:28  | u
      selena  | Dinner? | 09:37  | u
      jberkus | Dinner? | 09:44  | a
  • 38. How can I be “Normal”? 1. Each piece of data only appears in one relation – except as a “foreign key” attribute ● No “repeated” attributes
      login   | level | last_name
      jberkus | u     | Berkus
      selena  | a     | Deckelman

      login   | title   | posted
      jberkus | Dinner? | 09:28
      selena  | Dinner? | 09:37
      jberkus | Dinner? | 09:44
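The normalized layout can be sketched in SQL as follows. The table names, types, and key choices are assumptions; the rows are the ones shown on the slide. The point is that `level` is stored once, in `users`, and joined in when a post needs it.

```sql
-- Hypothetical DDL for the two relations on the slide.
CREATE TABLE users (
    login     text PRIMARY KEY,
    level     char(1) NOT NULL,
    last_name text NOT NULL
);

CREATE TABLE posts (
    login  text NOT NULL REFERENCES users (login),  -- the only shared attribute
    title  text NOT NULL,
    posted time NOT NULL,
    PRIMARY KEY (login, posted)
);

INSERT INTO users VALUES
    ('jberkus', 'u', 'Berkus'),
    ('selena',  'a', 'Deckelman');

INSERT INTO posts VALUES
    ('jberkus', 'Dinner?', '09:28'),
    ('selena',  'Dinner?', '09:37'),
    ('jberkus', 'Dinner?', '09:44');

-- level is looked up, never copied, so it cannot disagree between rows.
SELECT p.login, p.title, p.posted, u.level
FROM   posts p
JOIN   users u USING (login)
ORDER  BY p.posted;
```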
  • 41. Users Run Amok ● first_name (text) ● last_name (text) ● email (text) ● login (text) ● password (text) ● active (boolean) ● access (char)
  • 42. Ensure Your Data is Consistent no matter where it came from
      first_name | last_name | email              | login   | password   | active | level
      Josh       | Berkus    | josh@pgexperts.com | jberkus | jehosaphat | TRUE   | a
      NULL       | Kelley    | kelley@ucb         | k       | NULL       | TRUE   | u
      Mark       | Twain     | www.pm.org         | samuel  | halleys    | NULL   | I
      S          | F         | gavin@sf.gov       | gavin   | twitter    | FALSE  | x
  • 43. Users Under Constraint ● first_name (text) length() > 1 ● last_name (text) length() > 1 ● email (text) ILIKE '%@%.%' ● login (text) length() > 5 ● password (text) length() > 5 ● active (boolean) NOT NULL ● access (char) IN ( 'a','u' ) note: email and other validators would, of course, be more complex
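The constraints listed above translate almost directly into column CHECK clauses. In this sketch, making `login` the primary key is an assumption; the checks themselves follow the slide.

```sql
CREATE TABLE users (
    first_name text    CHECK (length(first_name) > 1),
    last_name  text    CHECK (length(last_name) > 1),
    email      text    CHECK (email ILIKE '%@%.%'),
    login      text    PRIMARY KEY CHECK (length(login) > 5),  -- key choice is an assumption
    password   text    CHECK (length(password) > 5),
    active     boolean NOT NULL,
    access     char(1) CHECK (access IN ('a', 'u'))
);

-- Accepted: every column satisfies its constraint.
INSERT INTO users VALUES
    ('Josh', 'Berkus', 'josh@pgexperts.com', 'jberkus', 'jehosaphat', TRUE, 'a');

-- Rejected if uncommented: the email has no '@', active is NULL,
-- and access is not 'a' or 'u'.
-- INSERT INTO users VALUES
--     ('Mark', 'Twain', 'www.pm.org', 'samuel', 'halleys', NULL, 'I');
```

A CHECK constraint passes when its expression evaluates to NULL, so columns that must always be filled in also need NOT NULL.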
  • 44. Foreign Keys attribute Users login Posts Admins
  • 45. Posts Table ● title (text) NOT NULL REFERENCES threads ( title ) ON DELETE CASCADE ON UPDATE CASCADE ● posted (timestamp) NOT NULL ● user (text) NOT NULL REFERENCES users ( login ) ON DELETE CASCADE ON UPDATE CASCADE ● content (text) NOT NULL
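The posts table above can be written out as DDL. The minimal `users` and `threads` tables and the sample rows are assumptions added so the example runs on its own. `user` is a reserved word in PostgreSQL, so the column name has to be double-quoted.

```sql
-- Minimal parent tables, assumed for illustration.
CREATE TABLE users   (login text PRIMARY KEY);
CREATE TABLE threads (title text PRIMARY KEY);

CREATE TABLE posts (
    title   text      NOT NULL
                      REFERENCES threads (title)
                      ON DELETE CASCADE ON UPDATE CASCADE,
    posted  timestamp NOT NULL,
    "user"  text      NOT NULL
                      REFERENCES users (login)
                      ON DELETE CASCADE ON UPDATE CASCADE,
    content text      NOT NULL
);

INSERT INTO users   VALUES ('jberkus'), ('jerkyboy');
INSERT INTO threads VALUES ('Dinner?');
INSERT INTO posts   VALUES
    ('Dinner?', '2009-05-02 09:28', 'jberkus',  'What''s up?'),
    ('Dinner?', '2009-05-02 09:37', 'jerkyboy', 'www.lolcats.com');

-- ON DELETE CASCADE: removing the user removes that user's posts.
DELETE FROM users WHERE login = 'jerkyboy';

-- ON UPDATE CASCADE: renaming a login is propagated to posts."user".
UPDATE users SET login = 'josh' WHERE login = 'jberkus';

-- One row remains, now attributed to 'josh'.
SELECT "user", content FROM posts;
```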
  • 46. Beautiful Cascades posts.content Josh Berkus What's up? users.login I'm going crazy! Josh Berkus jberkus www.pornking.com jerkyboy Why? selena www.whitehouse.com OSB! It's too much! www.whiteslavery.com www.lolcats.com I told you so ...
  • 48. Things We've Already Done ● Atomicization – less CPU on parsing, calculations ● Normalization – less data duplication – smaller tables ● Transactions – more batches, less iteration – less locking
  • 49. Denormalized Derived Relations materialized views for the win Users Admins user_postcount Posts
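A derived relation like `user_postcount` might look like the sketch below. It assumes PostgreSQL 9.3 or later, where `CREATE MATERIALIZED VIEW` is built in; that release postdates this talk, so the slide may well have had a hand-maintained summary table in mind. The schema is hypothetical.

```sql
-- Minimal hypothetical schema.
CREATE TABLE users (login text PRIMARY KEY);

CREATE TABLE posts (
    post_id serial PRIMARY KEY,
    login   text NOT NULL REFERENCES users (login),
    content text NOT NULL
);

-- Derived, denormalized relation: one row per user with a precomputed count.
CREATE MATERIALIZED VIEW user_postcount AS
SELECT u.login, count(p.post_id) AS postcount
FROM   users u
LEFT   JOIN posts p ON p.login = u.login
GROUP  BY u.login;

-- The stored result is a snapshot; recompute it after posts change.
REFRESH MATERIALIZED VIEW user_postcount;

SELECT login, postcount FROM user_postcount ORDER BY postcount DESC;
```

The base tables stay normalized; only the derived relation trades freshness for read speed.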
  • 50. Problem 5: Database Changes Cause Application Downtime and vice-versa
  • 51. Stuff We've Already Done to make our data “agile” ● Atomicization – data isn't in specific interface version formats ● Normalization – where to extend data is more obvious – create a new table if you have to ● Transactions – prevent partial failures from changed schema
  • 52. Views
  • 53. Extending the Users Table
      ● first_name (text) ● last_name (text) ● email (text) ● login (text) ● password (text) ● active (boolean) ● access (char) ● created (timestamp) ● last_login (timestamp)
      CREATE VIEW oldapp.users AS
      SELECT first_name || ' ' || last_name AS name,
             email, login, password, active, access
      FROM users;
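A fuller sketch of the compatibility view follows. It assumes the extended table lives in the default `public` schema and that the old application can be pointed at the `oldapp` schema through `search_path`; the `AS name` alias is an assumption based on the original `name` column.

```sql
-- The extended table, with the two new columns at the end.
CREATE TABLE public.users (
    first_name text,
    last_name  text,
    email      text,
    login      text PRIMARY KEY,   -- key choice is an assumption
    password   text,
    active     boolean,
    access     char(1),
    created    timestamp,
    last_login timestamp
);

CREATE SCHEMA oldapp;

-- The old application keeps seeing a single name column and no new fields.
CREATE VIEW oldapp.users AS
SELECT first_name || ' ' || last_name AS name,
       email, login, password, active, access
FROM   public.users;

INSERT INTO public.users (first_name, last_name, login)
VALUES ('Josh', 'Berkus', 'jberkus');

-- With oldapp first in the search path, "users" resolves to the view.
SET search_path = oldapp, public;
SELECT name, login FROM users;
```

New code queries `public.users` directly, so both application versions can run against the same data during the transition.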
  • 54. The Rest you already know ● Write Migrations – deploy these in transactions, if supported – if not, write rollback scripts ● Write Tests – use a realistic staging environment
  • 55. Some Other Tips ● Pick a naming scheme – and stick to it ● Don't do your joins in the application – the database does them better ● Repurposing fields will bite you – sure you'll remember when that changed ● Don't micro-optimize – INT3, anyone?
  • 56. More Information ● me – josh@pgexperts.com – www.pgexperts.com – it.toolbox.com/blogs/database-soup ● postgresql: www.postgresql.org ● tutorial at OSCON – monday, 8:30 am! – see you in San Jose! This presentation copyright 2009 Josh Berkus, licensed for distribution under the Creative Commons Attribution License, except for photos, most of which were stolen from other people's websites via images.google.com. Thanks, Google!