2. Introduction
• When were databases invented?
• What is a relational database?
• Why do you want to use a relational database?
• What relational DBs are there?
• What is MySQL?
• Why use MySQL?
• Why MySQL 5.0?
Monday, May 21, 2012
3. The first database
• Sumeria (Iraq)
• 2313 BC
• For taxes (for irrigation works)
• Counters & envelopes & markers, oh my!
4. What is relational?
• Everything in tables
• Tables are just rows and columns
• Fields are atomic
5. Why go relational?
• Simple
• Explicit
• Flexible
• And fast enough

pat#  name
1     Ann
2     Betty

pat#  preg#    pat#  preg#
1     1        1     1
1     2        1     2
2     1        2     1
               2nd copy of table!
6. Relational databases today
• Oracle
• SQL Server
• DB2, Informix
• MySQL
• Access, Filemaker
• Ad hockery
7. What is MySQL?
• Open source
• Has all needed high-end features
• Well-documented
8. Why use MySQL?
• Reliable
• Fast
• Cheap
• Easy to use
• Widespread
• Fixable
9. Such as:
MySQL now has the features needed by even the most sofiztikated business-types:
• Stored procedures
• Transactions
• Triggers
• Views
• & point-in-time restore!
10. Installation & Setup
• Download from mysql.com
• Run startup utility
• Has Administrator and Query GUIs
11. SQL Interface
mysql> SELECT VERSION(), CURRENT_DATE;
+----------------+--------------+
| VERSION()      | CURRENT_DATE |
+----------------+--------------+
| 5.0.7-beta-Max | 2005-07-11   |
+----------------+--------------+
1 row in set (0.01 sec)
12. Creating a database
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| test               |
+--------------------+
mysql> create database menagerie;
mysql> use menagerie;
Database changed
13. What is a table?
• Nothing but rows and columns
• Types of fields
• Nulls
• Constraints
• Indexes
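
The ingredients listed above (typed columns, nulls, constraints, indexes) can be sketched in a few lines. This is a minimal illustration, not the deck's own code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the NOT NULL, PRIMARY KEY and CHECK constraints and the index name idx_pet_owner are assumptions added for the example.

```python
import sqlite3

# SQLite (bundled with Python) stands in for MySQL here; the DDL is close
# to, but not identical to, what MySQL accepts.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pet (
        name    VARCHAR(20) NOT NULL PRIMARY KEY,   -- constraint: required and unique
        owner   VARCHAR(20),                        -- nullable string
        species VARCHAR(20) NOT NULL,
        sex     CHAR(1) CHECK (sex IN ('m', 'f')),  -- constraint on allowed values
        birth   DATE,
        death   DATE                                -- NULL while the pet is alive
    )
""")
conn.execute("CREATE INDEX idx_pet_owner ON pet (owner)")  # hypothetical index name

conn.execute(
    "INSERT INTO pet VALUES ('Puffball', 'Diane', 'hamster', 'f', '1999-03-30', NULL)"
)

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO pet VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert(("Puffball", "Joan", "cat", "f", "2001-01-01", None))
bad_sex_ok = try_insert(("Rex", "Joan", "dog", "x", "2001-01-01", None))
row_count = conn.execute("SELECT COUNT(*) FROM pet").fetchone()[0]
```

Both rejected inserts fail at the database, not in application code, which is the point of declaring constraints on the table.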
14. How is a table stored?
• MyISAM manages non-transactional tables. It provides high-speed storage and retrieval, as well as
fulltext searching capabilities. MyISAM is supported in all MySQL configurations, and is the default
storage engine unless you have configured MySQL to use a different one by default.
• The MEMORY storage engine provides in-memory tables. The MERGE storage engine allows a collection
of identical MyISAM tables to be handled as a single table. Like MyISAM, the MEMORY and
MERGE storage engines handle non-transactional tables, and both are also included in MySQL by default.
• The InnoDB and BDB storage engines provide transaction-safe tables. BDB is included in MySQL-Max
binary distributions on those operating systems that support it. InnoDB is also included by default
in all MySQL 5.0 binary distributions.
• The EXAMPLE storage engine is a “stub” engine that does nothing. You can create tables with this
engine, but no data can be stored in them or retrieved from them. The purpose of this engine is to
serve as an example in the MySQL source code that illustrates how to begin writing new storage engines.
As such, it is primarily of interest to developers.
• NDB Cluster is the storage engine used by MySQL Cluster to implement tables that are partitioned
over many computers. It is available in MySQL-Max 5.0 binary distributions. This storage
engine is currently supported on Linux, Solaris, and Mac OS X only. We intend to add support for
this engine on other platforms, including Windows, in future MySQL releases.
• The ARCHIVE storage engine is used for storing large amounts of data without indexes with a very
small footprint.
• The CSV storage engine stores data in text files using comma-separated values format.
• The BLACKHOLE storage engine accepts but does not store data and retrievals always return an empty set.
• The FEDERATED storage engine was added in MySQL 5.0.3. This engine stores data in a remote
database. Currently, it works with MySQL only, using the MySQL C Client API. In future releases,
we intend to enable it to connect to other data sources using other drivers or client connection methods.
15. How/why do we index a table?
• Balanced trees are the most common
• When records are inserted/deleted, page splits can happen
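
To see what an index buys, compare the query plan before and after creating one. A minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL (SQLite also uses B-tree indexes, and its EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN); the table contents and the name idx_pet_owner are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pet (name TEXT, owner TEXT, species TEXT)")
conn.executemany(
    "INSERT INTO pet VALUES (?, ?, ?)",
    [(f"pet{i}", f"owner{i % 100}", "dog") for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query-plan detail strings for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT name FROM pet WHERE owner = 'owner7'"
plan_before = plan(query)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_pet_owner ON pet (owner)")
plan_after = plan(query)    # the B-tree index on owner can now be searched
```

The exact wording of the plan text differs between SQLite versions, but the first plan reports a scan and the second names the index.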
16. How to create a table
mysql> CREATE TABLE pet (name VARCHAR(20), owner VARCHAR(20),
-> species VARCHAR(20), sex CHAR(1), birth DATE, death DATE);
mysql> SHOW TABLES;
+---------------------+
| Tables in menagerie |
+---------------------+
| pet                 |
+---------------------+
mysql> DESCRIBE pet;
+---------+-------------+------+-----+---------+-------+
| Field   | Type        | Null | Key | Default | Extra |
+---------+-------------+------+-----+---------+-------+
| name    | varchar(20) | YES  |     | NULL    |       |
| owner   | varchar(20) | YES  |     | NULL    |       |
| species | varchar(20) | YES  |     | NULL    |       |
| sex     | char(1)     | YES  |     | NULL    |       |
| birth   | date        | YES  |     | NULL    |       |
| death   | date        | YES  |     | NULL    |       |
+---------+-------------+------+-----+---------+-------+
17. Types of fields
• Numbers
• Strings
• Dates
• Blobs/Text
• Enum/Sets
18. What is a null, and why are we afraid of them?
• Null means unknown
• IS NULL, IS NOT NULL
mysql> SELECT 1 = NULL, 1 <> NULL, 1 < NULL, 1 > NULL;
+----------+-----------+----------+----------+
| 1 = NULL | 1 <> NULL | 1 < NULL | 1 > NULL |
+----------+-----------+----------+----------+
|     NULL |      NULL |     NULL |     NULL |
+----------+-----------+----------+----------+
mysql> SELECT 1 IS NULL, 1 IS NOT NULL;
+-----------+---------------+
| 1 IS NULL | 1 IS NOT NULL |
+-----------+---------------+
|         0 |             1 |
+-----------+---------------+
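
One reason to be afraid: a WHERE clause that compares against NULL with = silently matches nothing. A minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL (both follow the same rule here); the two-column pet table is a cut-down assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Any ordinary comparison with NULL yields NULL (Python sees it as None).
comparisons = conn.execute(
    "SELECT 1 = NULL, 1 <> NULL, 1 < NULL, 1 > NULL"
).fetchone()
is_tests = conn.execute("SELECT 1 IS NULL, 1 IS NOT NULL").fetchone()

conn.execute("CREATE TABLE pet (name TEXT, death TEXT)")
conn.executemany(
    "INSERT INTO pet VALUES (?, ?)",
    [("Bowser", "1995-07-29"), ("Puffball", None)],
)

# "= NULL" is never true, so this finds no living pets at all.
wrong = conn.execute("SELECT name FROM pet WHERE death = NULL").fetchall()
# "IS NULL" is the test that actually works.
right = conn.execute("SELECT name FROM pet WHERE death IS NULL").fetchall()
```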
20. Normalization
• First normal form
• Second normal form
• Third normal form
21. First normal form
To understand first normal form (1NF), consider these two
examples of things you might know:
"What is your favorite color?"
"What food will you not eat?"
A difference between these two questions is that, while you can
have only one favorite color, there may be many foods you do
not eat.
In 1NF, every attribute in a relation must be atomic. That is to
say that there can be no composite attributes in the relation.
Data that has a single value such as "person's favorite color" is
inherently in first normal form. Such data can be stored in a
single table with a simple key/value combination. Data that has
multiple values, however, must be stored differently.
Codd argued that there was one best way to keep multi-valued
data such as "food a person will not eat." He suggested that the
database should contain a separate table for the multi-value
data and then store each food as a separate row in that table.
Known as first normal form, this approach has been a standard
for decades.
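
The favorite-color and disliked-food example can be sketched as two tables. This is an illustration under assumptions, not the deck's own schema: it uses Python's built-in sqlite3 in place of MySQL, and the table names, column names and sample people are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-valued fact: one favorite color per person, stored in place.
    CREATE TABLE person (
        name           TEXT PRIMARY KEY,
        favorite_color TEXT
    );
    -- Multi-valued fact: one row per (person, food) pair in its own table.
    CREATE TABLE disliked_food (
        name TEXT REFERENCES person(name),
        food TEXT,
        PRIMARY KEY (name, food)
    );
    INSERT INTO person VALUES ('Ann', 'blue'), ('Betty', 'green');
    INSERT INTO disliked_food VALUES
        ('Ann', 'liver'), ('Ann', 'okra'), ('Betty', 'liver');
""")

ann_foods = [r[0] for r in conn.execute(
    "SELECT food FROM disliked_food WHERE name = 'Ann' ORDER BY food")]
liver_haters = [r[0] for r in conn.execute(
    "SELECT name FROM disliked_food WHERE food = 'liver' ORDER BY name")]
```

Because every field holds a single value, both questions ("what won't Ann eat?" and "who won't eat liver?") are plain equality lookups.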
22. Second normal form
Second normal form (2NF) prescribes full functional dependency on the primary key. It most commonly applies to
tables that have composite primary keys, where two or more attributes comprise the primary key. It requires that
there are no non-trivial functional dependencies of a non-key attribute on a part (subset) of a candidate key. A table
is said to be in the 2NF if and only if it is in the 1NF and every non-key attribute is irreducibly dependent on the
primary key (i.e. not partially dependent on candidate key).
Consider a table named part describing machine parts with the following attributes:
PART_NUMBER (PRIMARY KEY)
SUPPLIER_NAME (PRIMARY KEY)
PRICE
SUPPLIER_ADDRESS
The PART_NUMBER and SUPPLIER_NAME form the composite primary key, because the same part can be supplied
by multiple suppliers. In this example, PRICE is correctly placed on the part table, because it is fully dependent on
the primary key i.e. different suppliers will charge a different price for the same part.
SUPPLIER_ADDRESS, however, is only dependent on the SUPPLIER_NAME, and therefore this table breaks 2NF.
This attribute should be placed on a second table named supplier comprising:
SUPPLIER_NAME (PRIMARY KEY)
SUPPLIER_ADDRESS
In order to find if a table is in 2NF, ask whether any of the non-key attributes of the table could be derived from a
subset of the composite key, rather than the whole composite key. If the answer is yes, it's not in 2NF. This is solved
sometimes by using a correlation file, such as the supplier table above.
Easily understood definition: A unique key. A column of values that uniquely identifies each row in each table.
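
The part/supplier split described above might look like this in practice. A minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL; the column types and the sample suppliers, prices and addresses are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_name    TEXT PRIMARY KEY,
        supplier_address TEXT               -- depends on supplier_name only
    );
    CREATE TABLE part (
        part_number   INTEGER,
        supplier_name TEXT REFERENCES supplier(supplier_name),
        price         REAL,                 -- depends on the whole composite key
        PRIMARY KEY (part_number, supplier_name)
    );
    INSERT INTO supplier VALUES ('Acme', '1 Main St'), ('Bolt Co', '9 Elm Ave');
    INSERT INTO part VALUES
        (100, 'Acme', 2.50), (100, 'Bolt Co', 2.75), (200, 'Acme', 9.00);
""")

# The address lives in one row per supplier, so one UPDATE changes it everywhere.
conn.execute(
    "UPDATE supplier SET supplier_address = '2 Main St' WHERE supplier_name = 'Acme'"
)
rows = conn.execute("""
    SELECT p.part_number, p.supplier_name, p.price, s.supplier_address
    FROM part p JOIN supplier s ON s.supplier_name = p.supplier_name
    ORDER BY p.part_number, p.supplier_name
""").fetchall()
```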
23. Third normal form
Third normal form (3NF) requires that the table is in 2NF, and that there are no non-trivial functional
dependencies of non-key attributes on something other than a superset of a candidate key. A table is in 3NF if
none of the non-primary key attributes is a fact about any other non-primary key attribute. In summary, all non-key
attributes are mutually independent (i.e. there should not be transitive dependencies).
Consider a table that defines a machine part as having the following attributes.
PART_NUMBER (PRIMARY KEY)
MANUFACTURER_NAME
MANUFACTURER_ADDRESS
In this case, the manufacturer address does not belong on this table, because it is a fact about the manufacturer of
the part, rather than the part itself. MANUFACTURER_ADDRESS should therefore be moved into a separate table
with the attributes:
MANUFACTURER_NAME (PRIMARY KEY)
MANUFACTURER_ADDRESS
...and the original table should be redefined as:
PART_NUMBER (PRIMARY KEY)
MANUFACTURER_NAME
The problem with a table not being in 3NF is that for every MANUFACTURER_NAME we have to maintain a
redundant MANUFACTURER_ADDRESS (i.e. an address for each part_number, rather than one for each
MANUFACTURER_NAME).
Easily understood definition: Ensures that each table contains unique data. In other words, it ensures that a
table of customer identification data does not contain order data, and so on.
24. Update/delete anomalies
If data is not fully normalized, then there
will be update/delete anomalies.
To change a piece of information, you may
have to update in two or more places.
If you remove the last piece of information
in one table, you may destroy all
information about the parent.
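
Both anomalies are easy to reproduce with a table that is not in 3NF. A minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL; the manufacturers and addresses are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Deliberately unnormalized: the address is repeated on every part row.
    CREATE TABLE part (
        part_number          INTEGER PRIMARY KEY,
        manufacturer_name    TEXT,
        manufacturer_address TEXT
    );
    INSERT INTO part VALUES
        (1, 'Acme', '1 Main St'),
        (2, 'Acme', '1 Main St'),
        (3, 'Bolt Co', '9 Elm Ave');
""")

# Update anomaly: changing the address on only one row leaves Acme with
# two different addresses.
conn.execute("UPDATE part SET manufacturer_address = '2 Main St' WHERE part_number = 1")
acme_addresses = sorted(r[0] for r in conn.execute(
    "SELECT DISTINCT manufacturer_address FROM part WHERE manufacturer_name = 'Acme'"))

# Delete anomaly: removing Bolt Co's only part also removes its address.
conn.execute("DELETE FROM part WHERE part_number = 3")
bolt_rows = conn.execute(
    "SELECT COUNT(*) FROM part WHERE manufacturer_name = 'Bolt Co'").fetchone()[0]
```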
25. SQL/DML
The life-cycle of data
• Inserts, bulk inserts
• Selects, views
• Updates, deletes
• Transactions, locking
26. Insert/bulk insert
mysql> INSERT INTO pet VALUES
-> ('Puffball','Diane','hamster','f','1999-03-30',NULL);
mysql> insert into pet values
-> ('Wally', 'Joan the Mad', 'unicorn', 'm', '2000-01-02', null);
mysql> LOAD DATA LOCAL INFILE '/path/
-> LINES TERMINATED BY '\r';
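
From application code, a bulk load is usually one parameterized statement run over many rows. A rough sketch of the idea behind LOAD DATA, assuming Python's built-in sqlite3 in place of MySQL; the in-memory PET_TXT string is a hypothetical stand-in for a tab-separated data file in which \N marks a NULL.

```python
import csv
import io
import sqlite3

# Hypothetical file contents: tab-separated fields, \N standing for NULL.
PET_TXT = (
    "Fluffy\tHarold\tcat\tf\t1993-02-04\t\\N\n"
    "Bowser\tDiane\tdog\tm\t1989-08-31\t1995-07-29\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pet (name TEXT, owner TEXT, species TEXT, sex TEXT, birth TEXT, death TEXT)"
)

reader = csv.reader(io.StringIO(PET_TXT), delimiter="\t")
# Translate the \N marker into a real NULL before inserting.
rows = [[None if field == r"\N" else field for field in line] for line in reader]
conn.executemany("INSERT INTO pet VALUES (?, ?, ?, ?, ?, ?)", rows)
conn.commit()

loaded = conn.execute("SELECT name, death FROM pet ORDER BY name").fetchall()
```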
27. Select
SELECT what_to_select
FROM which_table
WHERE conditions_to_satisfy;
mysql> select * from pet where name = 'Wally';
+-------+--------------+---------+------+------------+-------+
| name  | owner        | species | sex  | birth      | death |
+-------+--------------+---------+------+------------+-------+
| Wally | Joan the Mad | unicorn | m    | 2000-01-02 | NULL  |
+-------+--------------+---------+------+------------+-------+
28. Selects (2)
mysql> SELECT pet.name,
-> (YEAR(date)-YEAR(birth)) - (RIGHT(date,5)<RIGHT(birth,5)) AS age,
-> remark
-> FROM pet, event
-> WHERE pet.name = event.name AND event.type = 'litter';
+--------+------+-----------------------------+
| name   | age  | remark                      |
+--------+------+-----------------------------+
| Fluffy |    2 | 4 kittens, 3 female, 1 male |
| Buffy  |    4 | 5 puppies, 2 female, 3 male |
| Buffy  |    5 | 3 puppies, 3 female         |
+--------+------+-----------------------------+
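
The age expression in that query subtracts the years, then takes one off if the birthday has not yet come round in the event year (RIGHT(date,5) is the MM-DD part of the date). The same arithmetic as a small Python function; the name age_at and the dates in the usage lines are illustrative assumptions.

```python
from datetime import date

def age_at(birth: date, when: date) -> int:
    """Whole years between birth and when, computed the way the query does."""
    year_diff = when.year - birth.year
    # RIGHT(date,5) < RIGHT(birth,5) compares the 'MM-DD' strings; comparing
    # (month, day) tuples is equivalent.
    birthday_still_to_come = (when.month, when.day) < (birth.month, birth.day)
    return year_diff - int(birthday_still_to_come)

# Hypothetical dates: the day before and the day of a first birthday.
day_before = age_at(date(2000, 6, 15), date(2001, 6, 14))
on_birthday = age_at(date(2000, 6, 15), date(2001, 6, 15))
```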
29. Update
mysql> UPDATE pet SET death = '2006-02-04' WHERE name = 'Bowser';
1 row updated
mysql> SELECT * FROM pet WHERE name = 'Bowser';
+--------+-------+---------+------+------------+------------+
| name   | owner | species | sex  | birth      | death      |
+--------+-------+---------+------+------------+------------+
| Bowser | Diane | dog     | m    | 1989-08-31 | 2006-02-04 |
+--------+-------+---------+------+------------+------------+
30. Transactions
• What is a transaction?
• Why are they needed?
• ACID
• How is it implemented?
begin work;
update account set balance = balance + 1000
where account_n = 123 and type = 'checking';
update account set balance = balance - 1000
where account_n = 123 and type = 'savings';
commit work;
31. Atomicity, Consistency, Isolation, and Durability
In databases, ACID stands for Atomicity, Consistency, Isolation, and Durability. They are considered to be the key transaction processing
features/properties of a database management system, or DBMS. Without them, the integrity of the database cannot be guaranteed. In practice,
these properties are often relaxed somewhat to provide better performance.
In the context of databases, a single logical operation on the data is called a transaction. An example of a transaction is a transfer of funds from
one account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another). The
ACID properties guarantee that such transactions are processed reliably.
• Atomicity refers to the ability of the DBMS to guarantee that either all of the tasks of a transaction are performed or none of them are. The
transfer of funds can be completed or it can fail for a multitude of reasons, but atomicity guarantees that one account won't be debited if
the other is not credited as well.
• Consistency refers to the database being in a legal state when the transaction begins and when it ends. This means that a transaction
can't break the rules, or integrity constraints, of the database. If an integrity constraint states that all accounts must have a positive
balance, then any transaction violating this rule will be aborted.
• Isolation refers to the ability of the application to make operations in a transaction appear isolated from all other operations. This means
that no operation outside the transaction can ever see the data in an intermediate state; a bank manager can see the transferred funds on
one account or the other, but never on both—even if she ran her query while the transfer was still being processed. More formally,
isolation means the transaction history (or schedule) is serializable. For performance reasons, this ability is the most often relaxed
constraint.
• Durability refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone. This
means it will survive system failure, and that the database system has checked the integrity constraints and won't need to abort the
transaction. Typically, all transactions are written into a log that can be played back to recreate the system to its state right before the
failure. A transaction can only be deemed committed after it is safely in the log.
Implementing the ACID properties correctly is not simple. Processing a transaction often requires a number of small changes to be made,
including updating indices that are used by the system to speed up searches. This sequence of operations is subject to failure for a number of
reasons; for instance, the system may have no room left on its disk drives, or it may have used up its allocated CPU time.
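
Atomicity and consistency can be watched in action with the funds-transfer example. A minimal sketch, assuming Python's built-in sqlite3 as a stand-in for a transactional MySQL engine; the CHECK constraint, the starting balances and the helper names transfer and balances are assumptions made for the illustration.

```python
import sqlite3

# isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE account (
        account_n INTEGER,
        type      TEXT,
        balance   INTEGER CHECK (balance >= 0),   -- integrity constraint
        PRIMARY KEY (account_n, type)
    );
    INSERT INTO account VALUES (123, 'checking', 100), (123, 'savings', 500);
""")

def transfer(amount):
    """Move amount from savings to checking; return True if committed."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance + ? "
                     "WHERE account_n = 123 AND type = 'checking'", (amount,))
        conn.execute("UPDATE account SET balance = balance - ? "
                     "WHERE account_n = 123 AND type = 'savings'", (amount,))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit broke the constraint: undo the credit as well.
        conn.execute("ROLLBACK")
        return False

def balances():
    return dict(conn.execute(
        "SELECT type, balance FROM account WHERE account_n = 123"))

ok_small = transfer(200)     # fits: both updates are committed together
after_small = balances()
ok_large = transfer(1000)    # would overdraw savings: neither update survives
after_large = balances()
```

After the failed transfer the checking balance is unchanged even though its UPDATE had already run inside the transaction.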
32. Locking
• What is locking?
• Row level locking
• Page level locking
• Table level locking
• Database locking
how to insert a kiwi…
33. Dining philosophers & the deadly embrace
• Two processes want the same thing
• But in different orders
• Oops…
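
A common way out of the deadly embrace is to make every process take its locks in the same global order. A minimal sketch with Python threads, where two threading.Lock objects stand in for database locks; this illustrates the ordering idea only and is not how MySQL itself resolves deadlocks.

```python
import threading

lock_a = threading.Lock()   # stands in for a lock on resource A
lock_b = threading.Lock()   # stands in for a lock on resource B
completed = []

def worker(name, first, second):
    # Sort the two locks by id() so every worker acquires them in the same
    # global order, whatever order it was handed them in.
    ordered = sorted((first, second), key=id)
    for _ in range(1000):
        with ordered[0]:
            with ordered[1]:
                pass  # the "transaction" would touch both resources here
    completed.append(name)

t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))  # opposite order
t1.start()
t2.start()
t1.join(timeout=10)
t2.join(timeout=10)
```

Without the sort, t1 could hold A while waiting for B at the same moment t2 holds B while waiting for A, and neither would ever finish.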
34. Delete
mysql> select count(*) from pet where name = 'Wally';
mysql> delete from pet where name = 'Wally';
35. Stored procedures
• What are stored procedures
• Why they are God’s gift to database programmers
• How to write a stored procedure
• How to debug
36. Why they are good
• Performance
• Maintainability
• Security
37. Stored procedures and functions
mysql> delimiter //
mysql> CREATE PROCEDURE simpleproc (OUT param1 INT)
-> BEGIN
-> SELECT COUNT(*) INTO param1 FROM t;
-> END;
-> //
Query OK, 0 rows affected (0.00 sec)
mysql> delimiter ;
mysql> CALL simpleproc(@a);
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT @a;
+------+
| @a   |
+------+
| 3    |
+------+
1 row in set (0.00 sec)
mysql> CREATE FUNCTION hello (s CHAR(20)) RETURNS CHAR(50)
-> RETURN CONCAT('Hello, ',s,'!');
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT hello('world');
+----------------+
| hello('world') |
+----------------+
| Hello, world!  |
+----------------+
38. How to debug
• Plan with failure in mind
• Write clean code
• Build tests first
• Build in debugging
39. Overcoming performance anxiety
• Query plans
• Preventing poor performance
• Acts of madness/acts of desperation
40. Explain
mysql> explain select * from pet;
+----+-------------+-------+--------+---------------+
| id | select_type | table | type   | possible_keys |
+----+-------------+-------+--------+---------------+
|  1 | SIMPLE      | pet   | system | NULL          |
+----+-------------+-------+--------+---------------+
• Most important: order of tables
• Rule based optimizer
• “Explain table” will show statistics for table
• Do “explain select” to get a feel for query plans
41. Proper Prior Planning Prevents Pathetically Poor Performance
• Clean database design
• Correct indexing
• Appropriate use of referential integrity
• Render unto server what is owed to server
42. Acts of desperation/acts of madness
• Omit needless tasks. Omit needless tasks. Omit…
• Buy more RAM
• Measure before and after tuning
• Act locally; think globally
• Cache frequently needed data
• Partition in space & in time
43. Security
• Database level security
• Table level security
• General principles of security
44. Database security
• Root user password
• Permissions on the data files
45. Users and permissions
• Define users
• Grant/revoke permissions
• On tables and even individual columns
• For select, insert, update, delete,…
• Need overall plan
46. Principles of security
• Least privilege
• Clean design
• No unnecessary interactions between users
• Well-defined roles
• Thorough logging
47. Backups
• Restore philosophies
• Types of backups
• Backup tools
48. Restore philosophies
• Disaster recovery
• Database failure
• User error
49. Types of backups
• Full database backup
• Log backups
• ASCII table/data backup
• Special backups, e.g. of configuration files or for particular projects
• Replication
50. Backup tools
• mysql is a command-line client for executing SQL statements interactively or in batch mode.
• mysqladmin is an administrative client.
• mysqlcheck performs table maintenance operations.
• mysqldump and mysqlhotcopy make database backups.
• mysqlimport imports data files.
• mysqlshow displays information about databases and tables.
51. Under the hood
• Transaction logs
• Server process
• Database structures
52. Transaction logs
• Begin work
• Insert/update/delete
• Commit/rollback
The binary log contains all statements that update data or potentially could have updated it (for example,
a DELETE which matched no rows). Statements are stored in the form of “events” that describe the
modifications. The binary log also contains information about how long each statement took that updated
data.
Note: The binary log has replaced the old update log, which is no longer available as of MySQL 5.0.
The binary log contains all information that is available in the update log in a more efficient format and
in a manner that is transaction-safe. If you are using transactions, you must use the MySQL binary log
for backups instead of the old update log.
The binary log does not contain statements that do not modify any data. If you want to log all statements
(for example, to identify a problem query), use the general query log. See Section 5.12.2, “The General
Query Log”.
The primary purpose of the binary log is to be able to update databases during a restore operation as
fully as possible, because the binary log contains all updates done after a backup was made. The binary
log is also used on master replication servers as a record of the statements to be sent to slave servers.
See Chapter 6, Replication in MySQL.
Running the server with the binary log enabled makes performance about 1% slower. However, the
benefits of the binary log for restore operations and in allowing you to set up replication generally
outweigh this minor performance decrement.
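
The restore idea (load the last backup, then replay every change logged after it) can be shown with a toy model. This sketch is not MySQL's binary log format or tooling: it uses Python's built-in sqlite3, and statement_log, execute_logged and the sample rows are hypothetical names and data invented for the illustration.

```python
import sqlite3

def new_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pet (name TEXT PRIMARY KEY, owner TEXT)")
    return conn

live = new_db()
statement_log = []   # toy stand-in for the binary log

def execute_logged(sql, params=()):
    """Run a data-changing statement on the live database and record it."""
    live.execute(sql, params)
    statement_log.append((sql, params))

execute_logged("INSERT INTO pet VALUES (?, ?)", ("Puffball", "Diane"))

# Full backup taken here; remember how far the log had got.
backup_rows = live.execute("SELECT name, owner FROM pet").fetchall()
log_position_at_backup = len(statement_log)

execute_logged("INSERT INTO pet VALUES (?, ?)", ("Wally", "Joan the Mad"))
execute_logged("UPDATE pet SET owner = ? WHERE name = ?", ("Gwen", "Puffball"))

# Restore: load the backup, then replay every statement logged after it.
restored = new_db()
restored.executemany("INSERT INTO pet VALUES (?, ?)", backup_rows)
for sql, params in statement_log[log_position_at_backup:]:
    restored.execute(sql, params)

query = "SELECT name, owner FROM pet ORDER BY name"
live_rows = live.execute(query).fetchall()
restored_rows = restored.execute(query).fetchall()
```

Stopping the replay loop early would give a restore to an earlier point in time.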
53. Server process
• Client/server architecture
• Clients
• Connection methods
• Server
• Actual data
54. Database structures
$ pwd
/usr/local/mysql
$ ls -l
total 88
-rw-r--r-- 1 root wheel 19071 Dec 21 14:39 COPYING
-rw-r--r-- 1 root wheel 5712 Dec 21 21:02 EXCEPTIONS-CLIENT
-rw-r--r-- 1 root wheel 7937 Dec 21 21:02 INSTALL-BINARY
-rw-r--r-- 1 root wheel 1379 Dec 21 14:39 README
drwxr-xr-x 50 root wheel 1700 Feb 8 22:03 bin
-rwxr-xr-x 1 root wheel 801 Dec 21 21:15 configure
drwxr-x--- 10 mysql wheel 340 Feb 10 06:09 data
drwxr-xr-x 4 root wheel 136 Feb 8 22:03 docs
drwxr-xr-x 62 root wheel 2108 Feb 8 22:03 include
drwxr-xr-x 10 root wheel 340 Feb 8 22:03 lib
drwxr-xr-x 3 root wheel 102 Dec 21 21:15 man
drwxr-xr-x 13 root wheel 442 Feb 8 22:03 mysql-test
drwxr-xr-x 3 root wheel 102 Feb 8 22:03 scripts
drwxr-xr-x 5 root wheel 170 Feb 8 22:03 share
drwxr-xr-x 31 root wheel 1054 Feb 8 22:03 sql-bench
drwxr-xr-x 14 root wheel 476 Feb 8 22:03 support-files
drwxr-xr-x 21 root wheel 714 Feb 8 22:03 tests
55. Hey, dude, where’s my data?
# pwd
/usr/local/mysql/data/menagerie
# ls -l
total 48
-rw-rw---- 1 mysql wheel 65 Feb 8 22:09 db.opt
-rw-rw---- 1 mysql wheel 68 Feb 10 08:54 pet.MYD
-rw-rw---- 1 mysql wheel 1024 Feb 10 08:54 pet.MYI
-rw-rw---- 1 mysql wheel 8720 Feb 8 22:56 pet.frm
57. Summary
• Advantages of MySQL
• Principles of good design
• Where to go next
58. Advantages of MySQL
• Easy to download and install
• Reliable, fast, reasonably easy to use
• Has all needed features
• In widespread use
59. Principles of good design
• Correct normalization
• Good use of database features
• Appropriate monitoring
• Act locally, think globally
60. Where to go next
• www.mysql.com
• MySQL Reference Manual
• MySQL in a Nutshell, Russell J. T. Dyer
• MySQL, Paul Dubois