1. DATABASE DESIGN: DATA MODELING
AND NORMALIZATION
Dr. R. Khanchana
Assistant Professor
Department of Computer Science
Sri Ramakrishna College of Arts and Science for Women
2. Data Modeling
Data Modeling:
• Data modeling is a tool used to represent the components of a database and
their relationships.
• A popular modeling tool is the Entity-Relationship Model (E-R Model).
• The E-R Model provides
– An excellent communication tool
– A simple graphical representation of data
• The E-R Model uses an E-R Diagram (ERD) to represent the database
components graphically.
• An entity is represented by a rectangle. Its name is written in uppercase
letters and should be a singular noun, e.g. EMPLOYEE, DEPARTMENT.
• Entity Representation in an E-R Diagram
EMPLOYEE
3. Representation of Relationship
• A line represents the relationship between two entities. Its label is written
in lowercase letters and should be an active verb.
• A passive verb can be used, but an active verb is preferable.
• Representation of relationship in an E-R diagram
[Figure: E-R diagram examples of 1:1, 1:M, and M:N relationships; relationship labels: manages, employs, contains]
• The types of relationships (1:1, 1:M, M:N) between entities are called
connectivity or multiplicity.
4. Entity, Relationship, Connectivity
• The types of relationships (1:1,1:M,M:N) between entities are
called Connectivity or Multiplicity
• The connectivity is shown with vertical or angled lines next to
each entity
5. Cardinality
• The cardinality of a relationship between two entities can be
given using lower and upper limits
• Example - (n,m)
• n – lower limit
• m – upper limit
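[Figure: E-R diagrams with cardinalities for the entity pairs DEPARTMENT and EMPLOYEE, FACULTY and DIVISION, ITEM and INVOICE; relationship labels: Supervises, Employees, Contains; cardinalities shown: (1, 1), (1, 1), (1, N), (1, 1), (1, N), (1, N)]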
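The (n, m) limits can be checked mechanically. The following is a minimal sketch, assuming a relationship is stored as a list of (entity, related entity) pairs; the function name `within_cardinality` and the sample data are hypothetical and not part of the slides.

```python
from collections import Counter

def within_cardinality(entities, related, lower, upper=None):
    """Return True if every entity takes part in at least `lower` and at
    most `upper` pairs. upper=None stands for N (no upper limit)."""
    counts = Counter(entity for entity, _ in related)
    for entity in entities:
        n = counts[entity]  # Counter gives 0 for entities with no pairs
        if n < lower or (upper is not None and n > upper):
            return False
    return True

# Hypothetical data: each department with the employees it employs.
departments = ["SALES", "HR"]
employs = [("SALES", "E1"), ("SALES", "E2"), ("HR", "E3")]

ok_1_n = within_cardinality(departments, employs, lower=1)           # (1, N)
ok_1_1 = within_cardinality(departments, employs, lower=1, upper=1)  # (1, 1)
```

Here the data fits (1, N) but not (1, 1), because SALES is related to two employees.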
7. Cardinality Types
• Cardinality has two types
– Mandatory relationship
– Optional relationship
• Oracle sets rules for the minimum and maximum values of
cardinality. Decisions of this kind are known as business rules.
10. Optional Relationship
• Optional relationships are shown with a small circle next to the
optional entity.
• An optional relationship can occur in 1:1, 1:M, or M:N
relationships, on one or both sides of the relationship.
• In relational databases, many-to-many (M:N) relationships are
allowed, but they are not easy to implement.
11. Composite Entity & Relational Schema
• Decomposing an M:N relationship into 1:M relationships
involves a third entity, known as a composite
entity or associative entity
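A minimal sketch of this decomposition, using Python's built-in sqlite3 module. The table and column names (invoice, item, invoice_item, quantity) and the sample rows are assumptions chosen for illustration; the slides do not give a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY);
CREATE TABLE item (item_id INTEGER PRIMARY KEY, item_name TEXT);
-- Composite (associative) entity: one row per invoice/item pairing.
-- It sits on the "many" side of two 1:M relationships.
CREATE TABLE invoice_item (
    invoice_id INTEGER REFERENCES invoice(invoice_id),
    item_id    INTEGER REFERENCES item(item_id),
    quantity   INTEGER,
    PRIMARY KEY (invoice_id, item_id)
);
INSERT INTO invoice VALUES (1), (2);
INSERT INTO item VALUES (10, 'pen'), (20, 'ink');
INSERT INTO invoice_item VALUES (1, 10, 3), (1, 20, 1), (2, 10, 5);
""")

# One invoice contains many items ...
items_on_invoice_1 = [row[0] for row in conn.execute(
    "SELECT item_name FROM invoice_item JOIN item USING (item_id) "
    "WHERE invoice_id = 1 ORDER BY item_name")]

# ... and one item appears on many invoices.
invoices_with_pen = [row[0] for row in conn.execute(
    "SELECT invoice_id FROM invoice_item WHERE item_id = 10 "
    "ORDER BY invoice_id")]
```

The composite primary key (invoice_id, item_id) is what lets the third table stand in for the original M:N relationship.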
13. Other Elements
• Some of the other elements considered in the database design are
Simple Attributes:
Attributes that cannot be subdivided
Eg. city, gender
Composite Attributes:
Attributes that can be subdivided
Eg. Full name(Last name, First name, Middle Initial)
Single valued Attributes:
Attributes with a single value
Eg. Employee Id, Student Id
Multivalued Attributes:
Attributes with multiple values
Eg. Course details
15. Dependency
• A dependency is a constraint that applies to or defines the relationship
between attributes.
• The primary key uniquely identifies an entity.
• Columns that are not part of the table's primary key are called
nonkey columns.
• Nonkey columns are functionally dependent on the primary key columns.
• There are three types of dependency in tables
– Total or full dependency: a nonkey column is dependent on
all primary key columns
– Partial dependency: a nonkey column is dependent on part
of the primary key
– Transitive dependency: a nonkey column is dependent on
another nonkey column.
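The three dependency types can be told apart by comparing the determinant (the columns a nonkey column depends on) with the primary key. This is a minimal sketch; the function `classify_dependency` and the column names INVOICE_NO, ITEM_NO, and CUSTOMER_NO are hypothetical.

```python
def classify_dependency(determinant, primary_key):
    """Classify a dependency by the columns that determine a nonkey column."""
    determinant, primary_key = set(determinant), set(primary_key)
    if not determinant:
        raise ValueError("determinant must contain at least one column")
    if determinant == primary_key:
        return "full"        # depends on all primary key columns
    if determinant < primary_key:
        return "partial"     # depends on only part of the primary key
    if determinant.isdisjoint(primary_key):
        return "transitive"  # depends on other nonkey column(s)
    return "other"           # mixes key and nonkey columns

# Hypothetical table with the composite primary key (INVOICE_NO, ITEM_NO).
pk = {"INVOICE_NO", "ITEM_NO"}
full = classify_dependency({"INVOICE_NO", "ITEM_NO"}, pk)
partial = classify_dependency({"INVOICE_NO"}, pk)
transitive = classify_dependency({"CUSTOMER_NO"}, pk)
```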
20. Deletion Anomaly
• A deletion anomaly results when deleting information about one
entity leads to the deletion of information about another entity.
• If someone deletes the Botany department, they may
end up deleting the data of all students who belong to the
Botany department.
21. Insertion Anomaly
• An insertion anomaly occurs when information about an entity
cannot be inserted unless information about another entity is known.
• Jerry is a new student with department id 6, but there is no department with Dept_ID 6; hence the
anomaly. Normally, a department with Dept_ID 6 should be created first, and only then can a student be
assigned to it.
22. Update Anomaly
• An update anomaly occurs when data is only
partially updated in a database.
• The English department now has Dept_ID 8, but the Student
table was not updated to match.
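Both the Jerry example and the English department example leave student rows that point at a department id that does not exist. A minimal sketch of how such rows could be detected follows. The old English id (3) and the students other than Jerry are hypothetical values added for illustration.

```python
# Department table after the update: English was renumbered to Dept_ID 8.
# The previous id (3) is a hypothetical value; the slides do not state it.
departments = {1: "Botany", 8: "English"}

students = [
    {"name": "Jerry", "dept_id": 6},  # no department 6 exists
    {"name": "Asha", "dept_id": 3},   # still points at the old English id
    {"name": "Mina", "dept_id": 1},   # consistent
]

# Students whose dept_id has no matching row in the department table.
orphans = [s["name"] for s in students if s["dept_id"] not in departments]
```

Here `orphans` holds Jerry and Asha, the two rows affected by the anomalies.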
25. Normalization
• Normalization is the process of organizing the
data in a database.
• Normalization is used to minimize
redundancy in a relation or set of relations.
• Normalization divides a larger table into
smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy
in database tables.
27. Types of Normal Forms
• The higher the normal form, the lower the redundancy
i) First Normal Form(1NF)
ii) Second Normal Form(2NF)
iii) Third Normal Form(3NF)
28. First Normal Form (1NF)
A table is said to be in first normal form (1NF) if the following conditions
hold:
• The primary key is defined.
• The primary key may be a composite key if a single column cannot serve as the primary key.
• There are no multi-valued attributes, composite attributes, or
combinations of them.
Example: the relation EMPLOYEE is not in 1NF because of the multi-valued attribute
EMP_PHONE
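A minimal sketch of bringing EMPLOYEE into 1NF by giving each phone number its own row. The EMP_ID and EMP_NAME columns, the sample values, and the helper `to_first_normal_form` are assumptions; only EMP_PHONE is named in the example above.

```python
def to_first_normal_form(rows, column, sep=","):
    """Emit one row per value of a multi-valued column."""
    flat = []
    for row in rows:
        for value in row[column].split(sep):
            flat.append({**row, column: value.strip()})
    return flat

# Hypothetical rows: the first employee has two phone numbers in one cell.
employee = [
    {"EMP_ID": 14, "EMP_NAME": "John", "EMP_PHONE": "7272826385, 9064738238"},
    {"EMP_ID": 20, "EMP_NAME": "Harry", "EMP_PHONE": "8574783832"},
]

employee_1nf = to_first_normal_form(employee, "EMP_PHONE")
# EMP_ID alone no longer identifies a row, so the primary key becomes
# the composite key (EMP_ID, EMP_PHONE).
keys = {(row["EMP_ID"], row["EMP_PHONE"]) for row in employee_1nf}
```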
29. Second Normal Form (2NF)
• All 1NF requirements are fulfilled.
• There is no partial dependency.
• A table that is in 1NF and does not have a composite key is automatically
in 2NF.
Example: A school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
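A minimal sketch of the teacher example, assuming the columns TEACHER_ID, SUBJECT, and TEACHER_AGE with the composite key (TEACHER_ID, SUBJECT). The column names and rows are hypothetical, since the slide's table is not reproduced here. TEACHER_AGE depends only on TEACHER_ID, which is a partial dependency.

```python
# Hypothetical 1NF table with the composite key (TEACHER_ID, SUBJECT).
teacher = [
    {"TEACHER_ID": 25, "SUBJECT": "Chemistry", "TEACHER_AGE": 30},
    {"TEACHER_ID": 25, "SUBJECT": "Biology", "TEACHER_AGE": 30},
    {"TEACHER_ID": 47, "SUBJECT": "English", "TEACHER_AGE": 35},
]

# New table 1: TEACHER_AGE moves to a table keyed by TEACHER_ID alone,
# so the age of each teacher is stored only once.
teacher_detail = {row["TEACHER_ID"]: row["TEACHER_AGE"] for row in teacher}

# New table 2: the composite key columns remain together.
teacher_subject = sorted({(row["TEACHER_ID"], row["SUBJECT"]) for row in teacher})
```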
31. Third Normal Form (3NF)
• A table is said to be in third normal form (3NF)
if the following requirements are satisfied.
– All 2NF requirements are fulfilled.
– There is no transitive dependency, i.e. no nonkey column is
dependent on another nonkey column.
32. Third Normal Form (3NF)
• Super keys in the table above:
– {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on
• Candidate key: {EMP_ID}
• Non-prime attributes: all attributes except EMP_ID.
– EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent on
EMP_ID.
– The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on
the super key (EMP_ID), which violates the rule of third normal form.
– Hence EMP_CITY and EMP_STATE need to be moved to a new
<EMPLOYEE_ZIP> table, with EMP_ZIP as the primary key.
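A minimal sketch of this split. The column names come from the slide, while the sample rows are hypothetical because the original table is not reproduced here.

```python
# Hypothetical 2NF table: EMP_STATE and EMP_CITY depend on EMP_ZIP.
employee = [
    {"EMP_ID": 1, "EMP_NAME": "Asha", "EMP_ZIP": "641001",
     "EMP_STATE": "TN", "EMP_CITY": "Coimbatore"},
    {"EMP_ID": 2, "EMP_NAME": "Mina", "EMP_ZIP": "641001",
     "EMP_STATE": "TN", "EMP_CITY": "Coimbatore"},
    {"EMP_ID": 3, "EMP_NAME": "Ravi", "EMP_ZIP": "600001",
     "EMP_STATE": "TN", "EMP_CITY": "Chennai"},
]

# EMPLOYEE_ZIP: keyed by EMP_ZIP, holds the transitively dependent columns.
employee_zip = {
    row["EMP_ZIP"]: {"EMP_STATE": row["EMP_STATE"], "EMP_CITY": row["EMP_CITY"]}
    for row in employee
}

# EMPLOYEE in 3NF: EMP_ZIP stays behind as a foreign key.
employee_3nf = [
    {key: row[key] for key in ("EMP_ID", "EMP_NAME", "EMP_ZIP")}
    for row in employee
]
```

Each city and state is now stored once per zip code instead of once per employee.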
34. Other Normal forms
• BCNF - Boyce-Codd Normal Form
• 4NF - Fourth Normal Form
• 5NF - Fifth Normal Form
• DKNF - Domain Key Normal Form
35. Dependency Diagram
• Arrows for total dependencies are drawn above the boxes.
• Arrows for partial and transitive dependencies are drawn below the boxes.
• i) Conversion from 1NF to 2NF
• ii) Conversion from 2NF to 3NF
36. Conversion from 1NF to 2NF
• If the table has a composite primary key, remove the partial dependencies.
• First, write each primary key component on a separate line; each component
becomes the primary key of a new table.
• Write the composite key on the third line; it becomes the composite key of the
third table.
1NF to 2NF
Decomposition
38. Conversion from 2NF to 3NF
• The Invoice table still has a transitive dependency but no partial dependency.
• Move the columns involved in the transitive dependency to a new table.
• Keep the primary key of the new table as a foreign key in the existing table.
2NF to 3NF
Decomposition
40. Denormalization
• The normalization process splits tables into smaller
tables.
• The smaller tables are joined on common
columns to retrieve information from different
tables.
• The denormalization process is the reverse.
• It lowers the normal form and increases
data redundancy.
• Because duplicate data is stored, more storage space is
required.
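A minimal sketch of the reverse step, using Python's built-in sqlite3 module. The department and student tables, their columns, and the student_report table are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT,
    dept_id    INTEGER REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Botany'), (2, 'English');
INSERT INTO student VALUES (100, 'Asha', 1), (101, 'Mina', 1), (102, 'Jerry', 2);

-- Denormalized copy: the join result is stored as its own table,
-- so dept_name is repeated for every student in the department.
CREATE TABLE student_report AS
    SELECT s.student_id, s.name, d.dept_name
    FROM student s JOIN department d ON d.dept_id = s.dept_id;
""")

report = conn.execute(
    "SELECT name, dept_name FROM student_report ORDER BY student_id"
).fetchall()
```

Reading student_report needs no join, at the cost of storing 'Botany' twice.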
43. Assignment - Case Study
• Assume a video library maintains a database of
movies rented out. Without any normalization, all
information is stored in one table as shown below.