2. Database normalization
• Normalization is the process of reorganizing data in a database so that it
meets two basic requirements:
i. There is no redundancy of data.
ii. Data dependencies are logical.
• Normalization usually involves dividing a database into two or
more tables and defining relationships between the tables.
3. Purpose of normalisation
• Minimise redundancy in data
• Remove insert, delete and update anomalies during database activities
• Reduce the need to reorganise data when it is modified or enhanced.
• Break a complex user view down into a number of smaller subgroups (tables)
4. Levels of normalization
• The levels are based on the amount of redundancy in the database. They are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
5. 1st Normal Form (1NF)
• First Normal Form requires every attribute in a relation to have an
atomic domain. The values in an atomic domain are indivisible units.
• 1NF decomposition:
a. Place all items that appear in the repeating group in a new table
b. Designate a primary key for each new table produced.
c. Duplicate in the new table the primary key of the table from which the
repeating group was extracted or vice versa.
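The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a general tool: the function name extract_repeating_group and the sample rows (taken from the studio example in this deck) are chosen for the example only.

```python
def extract_repeating_group(rows, key, group):
    """Split rows into a parent table and a child table for one repeating group.

    rows  : list of dicts, where row[group] is a list of values
    key   : name of the primary-key column of the original table
    group : name of the column that holds the repeating group
    """
    parent, child = [], []
    for row in rows:
        # The parent keeps every column except the repeating group.
        parent.append({k: v for k, v in row.items() if k != group})
        # Steps a and c: each repeated item moves to the new table,
        # together with a copy of the original primary key.
        for item in row[group]:
            child.append({key: row[key], group: item})
    return parent, child


unnormalised = [
    {"Studio": "Marvel", "Director": "Kevin Feige",
     "Movies": ["The Avengers", "Captain America"]},
    {"Studio": "DCEU", "Director": "Zack Snyder",
     "Movies": ["Batman vs Superman", "Suicide Squad"]},
]

studios, movies = extract_repeating_group(unnormalised, "Studio", "Movies")

# Step b: in the new table, (Studio, Movies) together identify a row.
for row in movies:
    print(row)
```

The parent table keeps one row per studio, and the child table holds one row per studio and movie.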
6. Example of a table not in 1NF:
Studio   Director      Movies
Marvel   Kevin Feige   The Avengers, Captain America
DCEU     Zack Snyder   Batman vs Superman, Suicide Squad
• The Movies column holds more than one value per row, so the table
is not in normalised form.
• To bring it into 1NF, we have to decompose the table so that every
value is atomic.
7. Table in 1NF after eliminating the repeating group:
Studio   Director      Movies
Marvel   Kevin Feige   The Avengers
Marvel   Kevin Feige   Captain America
DCEU     Zack Snyder   Batman vs Superman
DCEU     Zack Snyder   Suicide Squad
Every cell now holds a single value, so the table is in 1NF.
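As a rough sketch of the same flattening in Python: the variable names and the helper is_atomic are assumptions made for this illustration, and the check treats any list, tuple, set or dict inside a cell as a non-atomic value.

```python
nested = [
    ("Marvel", "Kevin Feige", ["The Avengers", "Captain America"]),
    ("DCEU", "Zack Snyder", ["Batman vs Superman", "Suicide Squad"]),
]

# One output row per (studio, director, movie): every value is now atomic.
flat = [(studio, director, movie)
        for studio, director, movies in nested
        for movie in movies]


def is_atomic(rows):
    """Return True if no cell holds a list, tuple, set or dict."""
    return all(not isinstance(cell, (list, tuple, set, dict))
               for row in rows for cell in row)


for row in flat:
    print(row)
```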
8. Second Normal Form – 2NF
• Prime attribute: an attribute that is part of a candidate key.
• Non-prime attribute: an attribute that is not part of any candidate key.
In second normal form, every non-prime attribute must be fully functionally
dependent on the whole candidate key. No non-prime attribute may depend on
only part of the key (a partial dependency).
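A partial dependency can be spotted mechanically once the functional dependencies are written down. The sketch below assumes a table with a single candidate key; the function name partial_dependencies is hypothetical, and the attribute names are borrowed from the studio example used in this deck.

```python
def partial_dependencies(key, fds):
    """Return the dependencies that break 2NF for a table with one candidate key.

    key : set of attribute names forming the candidate key
    fds : list of (lhs, rhs) pairs, each a set of attribute names
    """
    violations = []
    for lhs, rhs in fds:
        non_prime = rhs - key          # dependent attributes outside the key
        if lhs < key and non_prime:    # lhs is a proper subset of the key
            violations.append((lhs, non_prime))
    return violations


key = {"Studio", "Movie"}
fds = [
    ({"Studio", "Movie"}, {"Budget"}),  # full dependency on the whole key
    ({"Studio"}, {"City"}),             # depends on part of the key only
]

print(partial_dependencies(key, fds))
```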
9. Example of a table not in 2NF:
Studio   Movie                Budget   City
Marvel   The Avengers         100      New York
Marvel   Captain America      120      New York
DCEU     Batman vs Superman   150      Gotham
DCEU     Suicide Squad        75       Gotham
• Here the primary key is (Studio, Movie), but City depends only on
Studio and not on the whole key.
• Because of this partial dependency, the table is not in 2NF.
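One way to test such a claim against sample rows is sketched below. The helper name holds is an assumption for this example. Note that sample data can only refute a dependency: a dependency that holds in these rows is still a design decision, not a proof.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first right-hand value seen for this left value.
        if seen.setdefault(left, right) != right:
            return False
    return True


movies = [
    {"Studio": "Marvel", "Movie": "The Avengers", "Budget": 100, "City": "New York"},
    {"Studio": "Marvel", "Movie": "Captain America", "Budget": 120, "City": "New York"},
    {"Studio": "DCEU", "Movie": "Batman vs Superman", "Budget": 150, "City": "Gotham"},
    {"Studio": "DCEU", "Movie": "Suicide Squad", "Budget": 75, "City": "Gotham"},
]

print(holds(movies, ["Studio"], ["City"]))    # City depends on Studio alone
print(holds(movies, ["Studio"], ["Budget"]))  # Budget needs the whole key
```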
10. Solution of 2NF
Old Scheme {Studio, Movie, Budget, City}
New Scheme {Movie, Studio, Budget}
New Scheme {Studio, City}
Movie                Studio   Budget
The Avengers         Marvel   100
Captain America      Marvel   120
Batman vs Superman   DCEU     150
Suicide Squad        DCEU     75

Studio   City
Marvel   New York
DCEU     Gotham
Both tables are now in 2NF.
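The two new schemes could be declared as follows, using Python's built-in sqlite3 module with an in-memory database. The table names, column types and constraints are assumptions made for this sketch; the deck itself does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE studio (
        studio TEXT PRIMARY KEY,
        city   TEXT NOT NULL
    );
    CREATE TABLE movie (
        movie  TEXT NOT NULL,
        studio TEXT NOT NULL REFERENCES studio(studio),
        budget INTEGER NOT NULL,
        PRIMARY KEY (studio, movie)
    );
""")
conn.executemany("INSERT INTO studio VALUES (?, ?)",
                 [("Marvel", "New York"), ("DCEU", "Gotham")])
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)",
                 [("The Avengers", "Marvel", 100),
                  ("Captain America", "Marvel", 120),
                  ("Batman vs Superman", "DCEU", 150),
                  ("Suicide Squad", "DCEU", 75)])

# A join rebuilds the original four-column view; City is stored once per studio.
rows = conn.execute("""
    SELECT m.studio, m.movie, m.budget, s.city
    FROM movie AS m JOIN studio AS s ON s.studio = m.studio
    ORDER BY m.studio, m.movie
""").fetchall()

for row in rows:
    print(row)
```

Because each city is stored once per studio, moving a studio to another city becomes a single-row update.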
11. Third Normal Form – 3NF
• This form requires every non-key attribute of a table to depend
directly on a candidate key, i.e. there can be no dependencies
among non-key attributes.
• For a table to be in 3NF, there are two requirements:
• The table must be in second normal form
• No non-key attribute is transitively dependent on the primary key
12. Example of a table not in 3NF
Studio      StudioCity   CityTemp
Marvel      New York     96
DCEU        Gotham       99
Fox         New York     96
Paramount   Hollywood    95
Here Studio is the primary key, and both StudioCity and CityTemp
depend on Studio.
1. Primary Key {Studio}
2. {Studio} → {StudioCity}
3. {StudioCity} → {CityTemp}
4. {Studio} → {CityTemp}
5. CityTemp depends on Studio only through StudioCity,
which violates 3NF.
This is called a transitive dependency.
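The chain of dependencies can be followed with an attribute-closure computation, a standard technique for reasoning about functional dependencies. The sketch below is illustrative; the function name closure is an assumption, and only the two direct dependencies from the list above are given as input.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole left side is already known.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"Studio"}, {"StudioCity"}),
    ({"StudioCity"}, {"CityTemp"}),
]

print(closure({"Studio"}, fds))      # reaches CityTemp through StudioCity
print(closure({"StudioCity"}, fds))  # does not reach Studio
```

StudioCity determines CityTemp without determining Studio, so StudioCity is not a key, which is exactly what makes the dependency transitive.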
13. Solution of 3NF
Old Scheme {Studio, StudioCity, CityTemp}
New Scheme {Studio, StudioCity}
New Scheme {StudioCity, CityTemp}
Studio      StudioCity
Marvel      New York
DCEU        Gotham
Fox         New York
Paramount   Hollywood

StudioCity   CityTemp
New York     96
Gotham       99
Hollywood    95
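The decomposition amounts to two projections of the original table with duplicate rows removed. A minimal sketch, assuming the row values from the table on slide 12 and a hypothetical helper named project:

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicates and keeping first-seen order."""
    seen = []
    for row in rows:
        projected = tuple(row[c] for c in columns)
        if projected not in seen:
            seen.append(projected)
    return seen


studios = [
    {"Studio": "Marvel", "StudioCity": "New York", "CityTemp": 96},
    {"Studio": "DCEU", "StudioCity": "Gotham", "CityTemp": 99},
    {"Studio": "Fox", "StudioCity": "New York", "CityTemp": 96},
    {"Studio": "Paramount", "StudioCity": "Hollywood", "CityTemp": 95},
]

studio_city = project(studios, ["Studio", "StudioCity"])
city_temp = project(studios, ["StudioCity", "CityTemp"])

print(studio_city)
print(city_temp)
```

The temperature of New York now appears once instead of once per studio, so it can no longer be updated inconsistently.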
14. Boyce-Codd Normal Form (BCNF) – 3.5NF
• BCNF requires the determinant of every non-trivial functional dependency
to be a candidate key (or a superset of one), so it does not allow
dependencies between attributes that belong to different candidate keys.
• BCNF is a refinement of third normal form: 3NF restricts only the
dependencies of non-key attributes, and BCNF drops that limitation so the
rule applies to every attribute.
• Third normal form and BCNF can differ only when all of the following
conditions are true:
• The table has two or more candidate keys
• At least two of the candidate keys are composed of more than one attribute
• The keys are not disjoint, i.e. the composite candidate keys share some attributes
15. Example of a table not in BCNF
Scheme {MovieTitle, MovieID, PersonName, Role, Payment}
Key1 {MovieTitle, PersonName}
Key2 {MovieID, PersonName}
MovieTitle           MovieID   PersonName          Role             Payment
The Avengers         M101      Robert Downey Jr.   Tony Stark       200m
The Avengers         M101      Chris Evans         Steve Rogers     120m
Batman vs Superman   D101      Ben Affleck         Bruce Wayne      180m
Batman vs Superman   D101      Henry Cavill        Clark Kent       125m
A Walk to Remember   P101      Mandy Moore         Jamie Sullivan   50m
The dependency between MovieID and MovieTitle violates BCNF, because
neither attribute is a candidate key on its own.
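The violation can be checked by testing whether the left side of each dependency is a superkey. In the sketch below the list of functional dependencies is an assumption read off the two keys (MovieID and MovieTitle determine each other, and MovieID with PersonName determines Role and Payment); the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != all_attrs:
            bad.append((lhs, rhs))
    return bad


attrs = {"MovieTitle", "MovieID", "PersonName", "Role", "Payment"}
fds = [
    ({"MovieID"}, {"MovieTitle"}),
    ({"MovieTitle"}, {"MovieID"}),
    ({"MovieID", "PersonName"}, {"Role", "Payment"}),
]

for lhs, rhs in bcnf_violations(attrs, fds):
    print(sorted(lhs), "->", sorted(rhs))
```

Running the same check on {MovieID, MovieTitle} with only the first two dependencies reports no violations, because MovieID is a key of that smaller table.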
16. Solution of BCNF
Place the two candidate keys in separate entities.
Place each of the remaining data items in one of the resulting entities according to its
dependency on the primary key.
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieID, MovieTitle}
MovieID   PersonName          Role             Payment
M101      Robert Downey Jr.   Tony Stark       200m
M101      Chris Evans         Steve Rogers     120m
D101      Ben Affleck         Bruce Wayne      180m
D101      Henry Cavill        Clark Kent       125m
P101      Mandy Moore         Jamie Sullivan   50m

MovieID   MovieTitle
M101      The Avengers
D101      Batman vs Superman
P101      A Walk to Remember
17. Fourth Normal Form – 4NF
• Fourth normal form (4NF) is a level of database normalization in which
every non-trivial multivalued dependency has a candidate key (or a superset
of one) as its determinant.
• It builds on the first three normal forms (1NF, 2NF and 3NF) and the
Boyce-Codd Normal Form (BCNF). In addition to meeting the requirements of
BCNF, a table must not record two or more independent multivalued facts
about the same entity.
18. Example of a table not in 4NF
Scheme {MovieName, ScreeningCity, Genre}
MovieName            ScreeningCity   Genre
The Avengers         Los Angeles     Sci-Fi
The Avengers         New York        Sci-Fi
Batman vs Superman   Santa Cruz      Drama
Batman vs Superman   Durham          Drama
A Walk to Remember   New York        Romance
• A movie can be screened in many cities, and its genre does not depend on
where it is screened, so the table records two independent facts about
each movie.
• MovieName is not a candidate key on its own, so this table violates 4NF.
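A multivalued dependency X ↠ Y holds when, for each value of X, every Y value appears with every value of the remaining columns. The sketch below tests that condition on the rows of the table above; the function name mvd_holds is an assumption made for this example.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Check the multivalued dependency x ->> y on a list of dict rows."""
    columns = list(rows[0])
    z = [c for c in columns if c not in x and c not in y]
    groups = {}
    for row in rows:
        xv = tuple(row[c] for c in x)
        yv = tuple(row[c] for c in y)
        zv = tuple(row[c] for c in z)
        groups.setdefault(xv, set()).add((yv, zv))
    for pairs in groups.values():
        ys = {pair[0] for pair in pairs}
        zs = {pair[1] for pair in pairs}
        # Within one x-group every y value must appear with every z value.
        if pairs != set(product(ys, zs)):
            return False
    return True


screenings = [
    {"MovieName": "The Avengers", "ScreeningCity": "Los Angeles", "Genre": "Sci-Fi"},
    {"MovieName": "The Avengers", "ScreeningCity": "New York", "Genre": "Sci-Fi"},
    {"MovieName": "Batman vs Superman", "ScreeningCity": "Santa Cruz", "Genre": "Drama"},
    {"MovieName": "Batman vs Superman", "ScreeningCity": "Durham", "Genre": "Drama"},
    {"MovieName": "A Walk to Remember", "ScreeningCity": "New York", "Genre": "Romance"},
]

print(mvd_holds(screenings, ["MovieName"], ["ScreeningCity"]))
print(mvd_holds(screenings, ["ScreeningCity"], ["MovieName"]))
```

The first check succeeds while MovieName alone is not a key of the table, which is the combination that 4NF forbids.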
19. Solution of 4NF
Move the two multivalued relations to separate tables.
Identify a primary key for each new entity.
New Scheme {MovieName, ScreeningCity}
New Scheme {MovieName, Genre}
We split the table into two tables, each holding one multivalued
fact.
MovieName            ScreeningCity
Batman vs Superman   Santa Cruz
The Avengers         Los Angeles
A Walk to Remember   New York
Batman vs Superman   Durham
The Avengers         New York

MovieName            Genre
Batman vs Superman   Drama
The Avengers         Sci-Fi
A Walk to Remember   Romance
20. Fifth normal form
• Fifth normal form (5NF), also known as project-join normal form (PJ/NF), is
a level of database normalization designed to reduce redundancy in
relational databases that record multi-valued facts, by isolating
semantically related multiple relationships. A table is in 5NF if and only
if every non-trivial join dependency in it is implied by the candidate keys.
21. Example of a table not in 5NF
Theatre   Company     Movie
T1        Paramount   A Walk to Remember
T2        Marvel      The Avengers
T2        Marvel      Age of Ultron
T2        Marvel      Dr. Strange
T3        DCEU        Batman vs Superman
T4        Sony        Spiderman Homecoming
• Here each movie is related to a company, and the multivalued dependency
is Theatre ↠ Company, Movie.
22. Solution of 5NF
Theatre   Company
T1        Paramount
T2        Marvel
T3        DCEU
T4        Sony

Theatre   Movie
T2        The Avengers
T1        A Walk to Remember
T2        Age of Ultron
T2        Dr. Strange
T3        Batman vs Superman
T4        Spiderman Homecoming
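A decomposition is only safe if joining the projections gives back exactly the original rows. The sketch below checks this for the two tables above; the helper names project and natural_join are assumptions made for this illustration.

```python
def project(rows, columns):
    """Project dict rows onto columns, as a set of (column, value) tuples."""
    return {tuple((c, row[c]) for c in columns) for row in rows}


def natural_join(left, right):
    """Join two projections on every column name they share."""
    joined = set()
    for a in left:
        for b in right:
            da, db = dict(a), dict(b)
            if all(da[c] == db[c] for c in da.keys() & db.keys()):
                joined.add(tuple(sorted({**da, **db}.items())))
    return joined


bookings = [
    {"Theatre": "T1", "Company": "Paramount", "Movie": "A Walk to Remember"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "The Avengers"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "Age of Ultron"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "Dr. Strange"},
    {"Theatre": "T3", "Company": "DCEU", "Movie": "Batman vs Superman"},
    {"Theatre": "T4", "Company": "Sony", "Movie": "Spiderman Homecoming"},
]

original = {tuple(sorted(row.items())) for row in bookings}
theatre_company = project(bookings, ["Theatre", "Company"])
theatre_movie = project(bookings, ["Theatre", "Movie"])
rebuilt = natural_join(theatre_company, theatre_movie)

print(rebuilt == original)
```

The result of natural_join can be passed to it again as the left argument, so the same check extends to decompositions with three or more projections.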