2. Normalization
Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data.
Normalization is a multi-step process beginning with an “unnormalized” relation.
IS 257 – Fall 2006
3. Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
4. Normalization
[Diagram: the conditions added at each normal form]
First Normal Form: functional dependency of nonkey attributes on the primary key; atomic values only
Second Normal Form: full functional dependency of nonkey attributes on the primary key
Third Normal Form: no transitive dependency between nonkey attributes
Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency
5. Unnormalized Relations
The first step in normalization is to convert the data into a two-dimensional table.
In unnormalized relations, data can repeat within a column.
6. Types of anomalies
Redundancy
◦ Repeat info unnecessarily in several tuples
Update anomalies:
◦ Change info in one tuple but not in another
Deletion anomalies:
◦ Delete some values & lose other values too
Insert anomalies:
◦ Inserting row means having to insert other, separate
info / null-ing it out
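The update and deletion anomalies above can be reproduced on a small table. The sketch below is only an illustration: the helper `dept_names` is a hypothetical name, and the rows assume a single-table design in which the department name is repeated on every employee row.

```python
# Assumed single-table design: dept_name is repeated on every employee row.
rows = [
    {"emp_no": 1, "name": "Kevin Jacobs", "dept_no": 201, "dept_name": "R&D"},
    {"emp_no": 2, "name": "Barbara Jones", "dept_no": 224, "dept_name": "IT"},
    {"emp_no": 3, "name": "Jake Rivera", "dept_no": 201, "dept_name": "R&D"},
]


def dept_names(table, dept_no):
    """Return the set of names recorded for one department number."""
    return {r["dept_name"] for r in table if r["dept_no"] == dept_no}


# Update anomaly: rename department 201 in one tuple but not in the other.
updated = [dict(r) for r in rows]
updated[0]["dept_name"] = "Research"
inconsistent = dept_names(updated, 201)  # two different names for one department

# Deletion anomaly: deleting the only employee of department 224
# also deletes the fact that department 224 is called IT.
after_delete = [r for r in rows if r["emp_no"] != 2]
lost = dept_names(after_delete, 224)  # nothing left to say what 224 is

print(inconsistent, lost)
```

Storing the department name once, in its own table, removes both problems.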
7. First Normal Form
To move to First Normal Form a
relation must contain only atomic
values at each row and column.
◦ No repeating groups
◦ A column or minimal set of columns is called a Candidate Key when its values uniquely identify each row in the relation.
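As a minimal sketch of the move to First Normal Form, the code below flattens a repeating group into one atomic value per cell. The function names `to_first_normal_form` and `is_unique` are hypothetical, and the uniqueness check only inspects the sample rows; a real candidate key is a constraint on every possible state of the relation and must also be minimal.

```python
# Unnormalized records: the "skills" cell holds a repeating group.
unnormalized = [
    {"emp_no": 1, "name": "Kevin Jacobs", "skills": ["C", "Perl", "Java"]},
    {"emp_no": 2, "name": "Barbara Jones", "skills": ["Linux", "Mac"]},
]


def to_first_normal_form(records):
    """Emit one row per (record, skill) so every cell holds a single atomic value."""
    return [
        {"emp_no": r["emp_no"], "name": r["name"], "skill": s}
        for r in records
        for s in r["skills"]
    ]


def is_unique(rows, columns):
    """True if the chosen columns take a different combination of values on every row."""
    values = [tuple(r[c] for c in columns) for r in rows]
    return len(values) == len(set(values))


rows_1nf = to_first_normal_form(unnormalized)
print(len(rows_1nf))
print(is_unique(rows_1nf, ["emp_no"]))
print(is_unique(rows_1nf, ["emp_no", "skill"]))
```

In this sample, `emp_no` alone no longer identifies a row once the skills are flattened, while the pair (`emp_no`, `skill`) does.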
8. Second Normal Form
A relation is said to be in Second Normal Form when it is in First Normal Form and every nonkey attribute is fully functionally dependent on the primary key.
◦ That is, every nonkey attribute needs the
full primary key for unique identification
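A rough sketch of removing a partial dependency by projection follows. It assumes a 1NF relation whose primary key is the pair (emp_no, skill); `name` depends on `emp_no` alone, so it does not need the full key. The variable names are illustrative only.

```python
# 1NF rows as (emp_no, name, skill); assumed primary key: (emp_no, skill).
rows_1nf = [
    (1, "Kevin Jacobs", "C"),
    (1, "Kevin Jacobs", "Perl"),
    (1, "Kevin Jacobs", "Java"),
    (2, "Barbara Jones", "Linux"),
    (2, "Barbara Jones", "Mac"),
]

# Project the partially dependent attribute into a table keyed by emp_no only.
employee = sorted({(emp_no, name) for emp_no, name, _ in rows_1nf})

# What remains is the key itself: which employee has which skill.
employee_skill = sorted({(emp_no, skill) for emp_no, _, skill in rows_1nf})

# Joining the two projections on emp_no gives back the original rows.
names = dict(employee)
rejoined = sorted((emp_no, names[emp_no], skill) for emp_no, skill in employee_skill)

print(employee)
print(rejoined == sorted(rows_1nf))
```

Each name is now stored once, and no information is lost by the split.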
9. Third Normal Form
A relation is said to be in Third Normal Form if it is in Second Normal Form and there is no transitive functional dependency between nonkey attributes
◦ When one nonkey attribute can be determined by one or more other nonkey attributes, there is said to be a transitive functional dependency.
The side effect column in the Surgery table
is determined by the drug administered
◦ Side effect is transitively functionally dependent
on drug so Surgery is not 3NF
10. Boyce-Codd Normal Form
Most 3NF relations are also BCNF relations.
A relation is in BCNF when every determinant is a candidate key.
A 3NF relation can fail to be in BCNF only if all of the following hold:
◦ Candidate keys in the relation are composite keys (they are not single attributes)
◦ There is more than one candidate key in the relation, and
◦ The keys are not disjoint, that is, some attributes in the keys are common
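One way to test for BCNF is to compute attribute closures and look for a determinant that is not a superkey. The sketch below assumes a hypothetical relation (student, course, teacher) with two functional dependencies; the relation, the dependencies and the function names `closure` and `bcnf_violations` are illustrative assumptions, not taken from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(relation) <= closure(lhs, fds)
    ]


# Hypothetical relation with overlapping composite candidate keys:
# {student, course} and {student, teacher}.
relation = {"student", "course", "teacher"}
fds = [
    (frozenset({"student", "course"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"course"})),
]

violations = bcnf_violations(relation, fds)
print(violations)
```

Here `teacher` determines `course` but is not a candidate key, so the relation is in 3NF (course is part of a key) yet not in BCNF.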
11. Fourth Normal Form
A relation is in Fourth Normal Form if it is in BCNF and every non-trivial multivalued dependency has a superkey as its determinant
Eliminate non-trivial multivalued
dependencies by projecting into
simpler tables
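The projection step can be sketched on a small example. The relation below is hypothetical: it assumes an employee's skills and spoken languages are independent of each other, so both are multivalued dependencies on the employee and every skill must be paired with every language.

```python
# Hypothetical relation (emp, skill, language) with two independent
# multivalued dependencies: emp ->> skill and emp ->> language.
emp_skill_lang = {
    ("Kevin", "C", "English"),
    ("Kevin", "C", "French"),
    ("Kevin", "Java", "English"),
    ("Kevin", "Java", "French"),
    ("Barbara", "Linux", "English"),
}

# Project each multivalued fact into its own, simpler table.
emp_skill = {(e, s) for e, s, _ in emp_skill_lang}
emp_lang = {(e, l) for e, _, l in emp_skill_lang}

# The natural join of the two projections on emp recreates the original.
rejoined = {(e, s, l) for e, s in emp_skill for e2, l in emp_lang if e == e2}

print(len(emp_skill), len(emp_lang), rejoined == emp_skill_lang)
```

After the split, adding a new language for an employee is a single row in one table instead of one row per skill.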
12. Fifth Normal Form
A relation is in 5NF if every join
dependency in the relation is implied
by the keys of the relation
Implies that relations that have been
decomposed in previous NF can be
recombined via natural joins to
recreate the original relation.
13. Dependencies
Dependency theory is a subfield of
database theory which studies implication
and optimization problems related to logical
constraints, commonly called dependencies,
on databases. The best known class of such
dependencies is the functional dependencies,
which form the foundation of keys on
database relations. Another important class
of dependencies is the multivalued
dependencies. A key algorithm in
dependency theory is the Chase, and much
of the theory is devoted to its study.
14. Functional Dependence
Employee (1NF)
emp_no  name           dept_no  dept_name  skills
1       Kevin Jacobs   201      R&D        C
1       Kevin Jacobs   201      R&D        Perl
1       Kevin Jacobs   201      R&D        Java
2       Barbara Jones  224      IT         Linux
2       Barbara Jones  224      IT         Mac
3       Jake Rivera    201      R&D        DB2
3       Jake Rivera    201      R&D        Oracle
3       Jake Rivera    201      R&D        Java
Name, dept_no, and dept_name are functionally dependent on
emp_no. (emp_no -> name, dept_no, dept_name)
Skills is not functionally dependent on emp_no since it is not
unique to each emp_no.
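The two statements above can be checked mechanically against the rows of the Employee (1NF) table. This is a sketch only: `fd_holds` is a hypothetical helper, and passing the check shows that the sample data is consistent with a dependency, not that the dependency holds for every possible state of the relation.

```python
COLUMNS = ("emp_no", "name", "dept_no", "dept_name", "skills")

employee_1nf = [
    (1, "Kevin Jacobs", 201, "R&D", "C"),
    (1, "Kevin Jacobs", 201, "R&D", "Perl"),
    (1, "Kevin Jacobs", 201, "R&D", "Java"),
    (2, "Barbara Jones", 224, "IT", "Linux"),
    (2, "Barbara Jones", 224, "IT", "Mac"),
    (3, "Jake Rivera", 201, "R&D", "DB2"),
    (3, "Jake Rivera", 201, "R&D", "Oracle"),
    (3, "Jake Rivera", 201, "R&D", "Java"),
]


def fd_holds(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        record = dict(zip(COLUMNS, row))
        key = tuple(record[c] for c in lhs)
        value = tuple(record[c] for c in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs but a different rhs breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(fd_holds(employee_1nf, ["emp_no"], ["name", "dept_no", "dept_name"]))
print(fd_holds(employee_1nf, ["emp_no"], ["skills"]))
```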
15. Transitive Dependence
Employee (2NF)
emp_no  name           dept_no  dept_name
1       Kevin Jacobs   201      R&D
2       Barbara Jones  224      IT
3       Jake Rivera    201      R&D
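Dept_no and dept_name are functionally dependent on emp_no; however, dept_name is also determined by dept_no (emp_no -> dept_no -> dept_name), a transitive dependence, so department can be considered a separate entity.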
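Treating department as a separate entity amounts to projecting the table into two. The sketch below uses Python's `sqlite3` module with SQLite column types, which is an assumption rather than the SQL dialect used elsewhere in these slides; the table name `employee_2nf` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_2nf (
        emp_no INTEGER PRIMARY KEY, name TEXT, dept_no INTEGER, dept_name TEXT
    );
    INSERT INTO employee_2nf VALUES
        (1, 'Kevin Jacobs', 201, 'R&D'),
        (2, 'Barbara Jones', 224, 'IT'),
        (3, 'Jake Rivera', 201, 'R&D');

    -- Department becomes its own table, one row per dept_no.
    CREATE TABLE department AS
        SELECT DISTINCT dept_no, dept_name FROM employee_2nf;

    -- Employee keeps only a reference to its department.
    CREATE TABLE employee AS
        SELECT emp_no, name, dept_no FROM employee_2nf;
""")

departments = conn.execute(
    "SELECT dept_no, dept_name FROM department ORDER BY dept_no"
).fetchall()

# Joining the two projections on dept_no recreates the original rows.
rejoined = conn.execute("""
    SELECT e.emp_no, e.name, e.dept_no, d.dept_name
    FROM employee AS e JOIN department AS d ON d.dept_no = e.dept_no
    ORDER BY e.emp_no
""").fetchall()
original = conn.execute("SELECT * FROM employee_2nf ORDER BY emp_no").fetchall()

print(departments)
print(rejoined == original)
```

`CREATE TABLE ... AS SELECT` copies data but not key constraints, so a real schema would declare the primary and foreign keys explicitly.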
16. Entity - Relationship Model
A logical design method which emphasizes
simplicity and readability.
•Basic objects of the model are:
•Entities
•Relationships
•Attributes
17. Entities
Data objects detailed by the information in
the database.
•Denoted by rectangles in the model.
[Diagram: two rectangles labeled Employee and Department]
19. Relationships
Represent associations between entities.
•Denoted by diamonds in the model.
[Diagram: Employee, works in, Department; attributes shown: Name, SSN, Start date, Name, Budget]
20. Relationship Connectivity
Constraints on the mapping of the associated
entities in the relationship.
•Denoted by variables between the related entities.
•Generally, values for connectivity are expressed as “one” or
“many”
[Diagram: Employee (N) work (1) Department; attributes shown: Name, SSN, Start date, Name, Budget]
21. Connectivity
one-to-one: Department (1) has (1) Manager
one-to-many: Department (1) has (N) Project
many-to-many: Employee (M) works on (N) Project
22. Logical Design to Physical Design
Creating relational SQL schemas from
entity-relationship models.
•Transform each entity into a table with the key
and its attributes.
•Transform each relationship as either a relationship table (many-to-many) or a “foreign key” (one-to-one and one-to-many).
23. Entity tables
Transform each entity into a table with a key
and its attributes.
create table employee
  (emp_no number,
   name varchar2(256),
   ssn number,
   primary key (emp_no));
[Diagram: Employee entity with attributes Name and SSN]
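To try the entity table without an Oracle database, the sketch below recreates it in SQLite through Python's `sqlite3` module. The column types are SQLite substitutes for `number` and `varchar2`, which is an assumption of this example, and the sample row leaves `ssn` empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_no INTEGER,
        name   TEXT,
        ssn    INTEGER,
        PRIMARY KEY (emp_no)
    )
""")

conn.execute("INSERT INTO employee VALUES (1, 'Nora Edwards', NULL)")

# The primary key rejects a second row with the same emp_no.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Ben Smith', NULL)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, row_count)
```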
24. Foreign Keys
Transform each one-to-one or one-to-many
relationship as a “foreign key”.
•Foreign key is a reference in the child (many) table to the
primary key of the parent (one) table.
create table department
  (dept_no number,
   name varchar2(50),
   primary key (dept_no));
create table employee
  (emp_no number,
   dept_no number,
   name varchar2(256),
   ssn number,
   primary key (emp_no),
   foreign key (dept_no) references department);
[Diagram: Department (1) has (N) Employee]
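A foreign key is only useful if the database enforces it. The sketch below shows the effect in SQLite via Python's `sqlite3`; the SQLite types, the `PRAGMA foreign_keys = ON` setting (SQLite leaves enforcement off unless it is enabled) and department number 99 are assumptions of this example, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (
        dept_no INTEGER,
        name    TEXT,
        PRIMARY KEY (dept_no)
    );
    CREATE TABLE employee (
        emp_no  INTEGER,
        dept_no INTEGER,
        name    TEXT,
        ssn     INTEGER,
        PRIMARY KEY (emp_no),
        FOREIGN KEY (dept_no) REFERENCES department (dept_no)
    );
""")

conn.execute("INSERT INTO department VALUES (1, 'Accounting')")
# Child row pointing at an existing parent row: accepted.
conn.execute("INSERT INTO employee VALUES (4, 1, 'Brian Burnett', NULL)")

# Child row pointing at a department that does not exist: rejected.
try:
    conn.execute("INSERT INTO employee VALUES (7, 99, 'Hypothetical Person', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, employee_count)
```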
25. Foreign Key
Department
dept_no  Name
1        Accounting
2        Human Resources
3        IT

Employee
emp_no  dept_no  Name
1       2        Nora Edwards
2       3        Ajay Patel
3       2        Ben Smith
4       1        Brian Burnett
5       3        John O'Leary
6       3        Julia Lenin

Accounting has 1 employee: Brian Burnett
Human Resources has 2 employees: Nora Edwards, Ben Smith
IT has 3 employees: Ajay Patel, John O'Leary, Julia Lenin
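The per-department employee lists follow from joining the two tables on `dept_no`. The sketch below loads the Department and Employee rows into SQLite through Python's `sqlite3` (an assumed stand-in for the database used in the course) and counts employees per department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        emp_no  INTEGER PRIMARY KEY,
        dept_no INTEGER REFERENCES department (dept_no),
        name    TEXT
    );
    INSERT INTO department VALUES
        (1, 'Accounting'), (2, 'Human Resources'), (3, 'IT');
    INSERT INTO employee VALUES
        (1, 2, 'Nora Edwards'), (2, 3, 'Ajay Patel'), (3, 2, 'Ben Smith'),
        (4, 1, 'Brian Burnett'), (5, 3, 'John O''Leary'), (6, 3, 'Julia Lenin');
""")

# Follow the foreign key from each employee row to its department.
headcount = conn.execute("""
    SELECT d.name, COUNT(e.emp_no)
    FROM department AS d
    LEFT JOIN employee AS e ON e.dept_no = d.dept_no
    GROUP BY d.dept_no, d.name
    ORDER BY d.dept_no
""").fetchall()

it_staff = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM employee WHERE dept_no = 3 ORDER BY emp_no"
    )
]

print(headcount)
print(it_staff)
```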
26. Many-to-Many tables
Transform each many-to-many relationship as a
table.
•The relationship table will contain the foreign keys to the
related entities as well as any relationship attributes.
create table proj_has_emp
  (proj_no number,
   emp_no number,
   start_date date,
   primary key (proj_no, emp_no),
   foreign key (proj_no) references project,
   foreign key (emp_no) references employee);
[Diagram: Project (N) has (M) Employee, with relationship attribute Start date]
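A relationship table can be queried from either side. The sketch below builds `proj_has_emp` in SQLite via Python's `sqlite3`; the `project` and `employee` columns, the project names, the start dates and the use of TEXT for dates are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project  (proj_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (emp_no  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE proj_has_emp (
        proj_no    INTEGER,
        emp_no     INTEGER,
        start_date TEXT,  -- relationship attribute, stored as ISO date text
        PRIMARY KEY (proj_no, emp_no),
        FOREIGN KEY (proj_no) REFERENCES project (proj_no),
        FOREIGN KEY (emp_no)  REFERENCES employee (emp_no)
    );
    -- Hypothetical sample data.
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Intranet');
    INSERT INTO employee VALUES (1, 'Nora Edwards'), (2, 'Ajay Patel');
    INSERT INTO proj_has_emp VALUES
        (10, 1, '2006-09-01'), (10, 2, '2006-09-15'), (20, 2, '2006-10-01');
""")

# One project, many employees.
staff_on_10 = [
    name
    for (name,) in conn.execute("""
        SELECT e.name FROM proj_has_emp AS pe
        JOIN employee AS e ON e.emp_no = pe.emp_no
        WHERE pe.proj_no = 10 ORDER BY e.emp_no
    """)
]

# One employee, many projects.
projects_of_2 = [
    name
    for (name,) in conn.execute("""
        SELECT p.name FROM proj_has_emp AS pe
        JOIN project AS p ON p.proj_no = pe.proj_no
        WHERE pe.emp_no = 2 ORDER BY p.proj_no
    """)
]

print(staff_on_10, projects_of_2)
```

The composite primary key (proj_no, emp_no) allows each employee to appear at most once per project.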
30. DFD Weekly
[Data flow diagram; labels recoverable from the figure:]
Processes: Verify/validate data; Prepare Payroll; Print paycheque & payslips; Print payroll summary
Data stores and external entities: transactions; Valid weekly transactions; employee; Accounts dept
Data flows: Batch time sheets; Employee no, hours worked, batch control totals; Invalid employee data, batch control totals; Error report; Employee no, hours worked; Name, pay rate, tax code, etc.; Employee no, pay, tax, etc.; Each employee: Employee nos, pay, tax, etc.; Totals: pay, tax, etc.