ER Diagrams (Concluded),
Schema Refinement, and Normalization
Zachary G. Ives
University of Pennsylvania
CIS 550 – Database & Information Systems
October 6, 2005
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
2
Examples of ER Diagrams
• Please interpret these ER diagrams:
[Figure: three ER diagrams, each relating STUDENTS and COURSES through the relationship set Takes]
3
Converting ER Relationship Sets to
Tables: 1:n Relationships
[Figure: ER diagram with entity sets COURSES and PROFESSORS joined by the relationship set Teaches]
• "1" entity = key of relationship set:
CREATE TABLE Teaches(
  fid INTEGER,
  serno CHAR(15),
  semester CHAR(4),
  PRIMARY KEY (serno),
  FOREIGN KEY (fid) REFERENCES PROFESSORS,
  FOREIGN KEY (serno) REFERENCES COURSES)
• Or embed relationship in "many" entity set:
CREATE TABLE Teaches_Course(
  serno INTEGER,
  subj VARCHAR(30),
  cid CHAR(15),
  fid CHAR(15),
  semester CHAR(4),
  PRIMARY KEY (serno),
  FOREIGN KEY (fid) REFERENCES PROFESSORS)
4
1:1 Relationships
If you borrow money or have credit, you might get:
[Figure: ER diagram with entity sets CreditReport and Borrower joined by the relationship set Describes; attributes shown: rid, delinquent?, ssn, name, debt]
What are the table options?
5
ISA Relationships: Subclassing
(Structurally)
• Inheritance states that one entity is a "special kind"
  of another entity: a member of the "subclass" should
  also be a member of the "base class"
[Figure: ER diagram in which Employees (salary) ISA People (id, name)]
6
But How Does This Translate
into the Relational Model?
Compare these options:
• Two tables, disjoint tuples
• Two tables, disjoint attributes
• One table with NULLs
• Object-relational databases (allow subclassing of tables)
7
Weak Entities
A weak entity can only be identified uniquely using the primary
key of another (owner) entity.
• Owner and weak entity sets are in a one-to-many relationship
  set, 1 owner : many weak entities
• Weak entity set must have total participation
[Figure: ER diagram People (ssn, name) Feeds (weeklyCost) Pets (name, species)]
8
Translating Weak Entity Sets
Weak entity set and identifying relationship set are translated
into a single table; when the owner entity is deleted, all
owned weak entities must also be deleted
CREATE TABLE Feed_Pets (
  pname VARCHAR(20),
  species INTEGER,
  weeklyCost REAL,
  ssn CHAR(11) NOT NULL,
  PRIMARY KEY (pname, ssn),
  FOREIGN KEY (ssn) REFERENCES People
    ON DELETE CASCADE)
9
N-ary Relationships
• Relationship sets can relate an arbitrary number of
  entity sets:
[Figure: ER diagram with the relationship set Indep Study connecting Student, Project, and Advisor]
10
Summary of ER Diagrams
• One of the primary ways of designing logical
  schemas
• CASE tools built around ER exist
  (e.g. ERWin, PowerBuilder, etc.)
  ◦ Translate the design automatically into DDL, XML, UML,
    etc.
  ◦ Use a slightly different notation that is better suited to
    graphical displays
  ◦ Some tools support constraints beyond what ER diagrams
    can capture
• Can you get different ER diagrams from the same data?
11
Schema Refinement & Design Theory
• ER diagrams give us a start in logical schema design
• Sometimes we need to refine our designs further
• There's a system and theory for this
• Focus is on redundancy of data
  ◦ Causes update, insertion, deletion anomalies
12
Not All Designs are Equally Good
Why is this a poor schema design?
  Stuff(sid, name, serno, subj, cid, exp-grade)
And why is this one better?
  Student(sid, name)
  Course(serno, cid)
  Subject(cid, subj)
  Takes(sid, serno, exp-grade)
13
Focus on the Bad Design
• Certain items (e.g., name) get repeated
• Some information (e.g., courses) can be stored only if
  a student is enrolled, due to the key
  sid  name   serno   subj  cid  exp-grade
  1    Sam    570103  AI    520  B
  23   Nitin  550103  DB    550  A
  45   Jill   505103  OS    505  A
  1    Sam    505103  OS    505  C
14
Functional Dependencies
Describe "Key-Like" Relationships
A key is a set of attributes where:
  If keys match, then the tuples match
A functional dependency (FD) is a generalization:
  If an attribute set determines another, written X → Y,
  then if two tuples agree on attribute set X, they must
  agree on Y:
    sid → name
What other FDs are there in this data?
➢ FDs are independent of our schema design choice
15
Formal Definition of FD's
Def. Given a relation schema R and subsets X, Y of R:
  An instance r of R satisfies FD X → Y if,
  for any two tuples t1, t2 ∈ r,
    t1[X] = t2[X] implies t1[Y] = t2[Y]
• For an FD to hold for schema R, it must hold for
  every possible instance r of R
  (Can a DBMS verify this? Can we determine this by looking
  at an instance?)
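The definition can be tried out directly on a small instance. The following is a minimal sketch, not part of the original slides: the function name `satisfies_fd` and the dict-per-tuple representation are assumptions made for illustration, and `exp_grade` stands in for exp-grade. It checks a single instance only, so a `True` result shows that the instance does not violate the FD, not that the FD holds for the schema.

```python
from itertools import combinations


def satisfies_fd(rows, lhs, rhs):
    """True if every pair of rows that agrees on `lhs` also agrees on `rhs`."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_on_rhs = all(t1[a] == t2[a] for a in rhs)
        if agree_on_lhs and not agree_on_rhs:
            return False
    return True


# The instance shown on the "Focus on the Bad Design" slide.
stuff = [
    {"sid": 1, "name": "Sam", "serno": 570103, "subj": "AI", "cid": 520, "exp_grade": "B"},
    {"sid": 23, "name": "Nitin", "serno": 550103, "subj": "DB", "cid": 550, "exp_grade": "A"},
    {"sid": 45, "name": "Jill", "serno": 505103, "subj": "OS", "cid": 505, "exp_grade": "A"},
    {"sid": 1, "name": "Sam", "serno": 505103, "subj": "OS", "cid": 505, "exp_grade": "C"},
]

print(satisfies_fd(stuff, ["sid"], ["name"]))    # True
print(satisfies_fd(stuff, ["name"], ["serno"]))  # False: Sam appears with two sernos
```

An instance can refute an FD (as with name → serno here) but can never prove one, which is why FDs have to come from knowledge of the application rather than from the data alone.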
16
General Thoughts on Good Schemas
We want all attributes in every tuple to be determined
by the tuple's key attributes, i.e. part of a superkey
(for key X → Y, a superkey is a "non-minimal" X)
What does this say about redundancy?
But:
• What about tuples that don't have keys (other than the entire
  value)?
• What about the fact that every attribute determines itself?
17
Armstrong's Axioms: Inferring FDs
Some FDs exist due to others; can compute using
Armstrong's axioms:
• Reflexivity: If Y ⊆ X then X → Y (trivial dependencies)
    name, sid → name
• Augmentation: If X → Y then XW → YW
    serno → subj so serno, exp-grade → subj, exp-grade
• Transitivity: If X → Y and Y → Z then X → Z
    serno → cid and cid → subj
    so serno → subj
18
Armstrong's Axioms Lead to…
• Union: If X → Y and X → Z
  then X → YZ
• Pseudotransitivity: If X → Y and WY → Z
  then XW → Z
• Decomposition: If X → Y and Z ⊆ Y
  then X → Z
Let's prove these from Armstrong's Axioms
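One possible set of derivations, using only reflexivity, augmentation, and transitivity, is sketched below. These are standard textbook arguments written out here for illustration; they are not taken from the slides.

```latex
\begin{align*}
\textbf{Union.}\quad & \text{Given } X \to Y \text{ and } X \to Z: \\
& X \to XY  && \text{augment } X \to Y \text{ with } X \\
& XY \to YZ && \text{augment } X \to Z \text{ with } Y \\
& X \to YZ  && \text{transitivity} \\[6pt]
\textbf{Pseudotransitivity.}\quad & \text{Given } X \to Y \text{ and } WY \to Z: \\
& XW \to WY && \text{augment } X \to Y \text{ with } W \\
& XW \to Z  && \text{transitivity with } WY \to Z \\[6pt]
\textbf{Decomposition.}\quad & \text{Given } X \to Y \text{ and } Z \subseteq Y: \\
& Y \to Z   && \text{reflexivity} \\
& X \to Z   && \text{transitivity}
\end{align*}
```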
19
Closure of a Set of FD's
Defn. Let F be a set of FD's.
Its closure, F+, is the set of all FD's:
  {X → Y | X → Y is derivable from F by Armstrong's
  Axioms}
Which of the following are in the closure of our Student-Course
FD's?
  name → name
  cid → subj
  serno → subj
  cid, sid → subj
  cid → sid
20
Attribute Closures: Is Something
Dependent on X?
Defn. The closure of an attribute set X, X+, is:
  X+ = ∪ {Y | X → Y ∈ F+}
• This answers the question "is Y determined
  (transitively) by X?"; compute X+ by:
    closure := X;
    repeat until no change {
      if there is an FD U → V in F
      such that U ⊆ closure
      then add V to closure}
• Does sid, serno → subj, exp-grade?
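The fixpoint loop above translates almost line for line into Python. This is a minimal sketch: the name `attribute_closure`, the representation of an FD as a pair of sets, and the particular Student-Course FD set are all assumptions made for illustration (`exp_grade` stands in for exp-grade).

```python
def attribute_closure(attrs, fds):
    """Compute X+ for the attribute set `attrs` under `fds`, a list of (U, V) set pairs."""
    closure = set(attrs)
    changed = True
    while changed:                      # repeat until no change
        changed = False
        for lhs, rhs in fds:
            # U is contained in the closure, so V can be added.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


# Assumed FD set for the Student-Course example.
fds = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid"}),
    ({"cid"}, {"subj"}),
    ({"sid", "serno"}, {"exp_grade"}),
]

closure = attribute_closure({"sid", "serno"}, fds)
print({"subj", "exp_grade"} <= closure)  # True
```

Under these assumed FDs, X → Y is in F+ exactly when Y ⊆ X+, so a single closure computation answers the question without enumerating F+.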
21
Equivalence of FD sets
Defn. Two sets of FD's, F and G, are equivalent if
their closures are equal, F+ = G+
e.g., these two sets are equivalent:
  {XY → Z, X → Y} and
  {X → Z, X → Y}
• F+ contains a huge number of FD's
  (exponential in the size of the schema)
• Would like to have smallest "representative" FD
  set
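Because F+ is exponentially large, equivalence is normally tested without materializing it: F and G are equivalent when every FD of F follows from G and vice versa, and "follows from" reduces to an attribute-closure test. The sketch below assumes FDs are represented as pairs of sets; the function names are illustrative.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in the closure of `fds`."""
    return rhs <= attribute_closure(lhs, fds)


def equivalent(f, g):
    """True if F+ = G+, checked by mutual implication."""
    return (all(implies(g, lhs, rhs) for lhs, rhs in f)
            and all(implies(f, lhs, rhs) for lhs, rhs in g))


f = [({"X", "Y"}, {"Z"}), ({"X"}, {"Y"})]
g = [({"X"}, {"Z"}), ({"X"}, {"Y"})]
print(equivalent(f, g))  # True
```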
22
Minimal Cover
Defn. An FD set F is minimal if:
1. Every FD in F is of the form X → A,
   where A is a single attribute
2. For no X → A in F is:
   F – {X → A} equivalent to F
3. For no X → A in F and Z ⊂ X is:
   (F – {X → A}) ∪ {Z → A} equivalent to F
Defn. F is a minimal cover for G if F is minimal and is
equivalent to G.
e.g.,
  {X → Z, X → Y} is a minimal cover for
  {XY → Z, X → Z, X → Y}
Intuitively: we express each FD in simplest form, and,
in a sense, each FD is "essential" to the cover
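The three conditions suggest a three-step procedure: split right-hand sides, shrink left-hand sides, then drop redundant FDs. The sketch below is one common way to do this, not an algorithm given in the slides; `minimal_cover` and the set-pair representation are assumptions, and the result can depend on the order in which attributes and FDs are examined.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def minimal_cover(fds):
    # Condition 1: a single attribute on every right-hand side.
    cover = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]
    # Condition 3: drop left-hand-side attributes that are not needed.
    for i in range(len(cover)):
        lhs, rhs = cover[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and rhs <= attribute_closure(smaller, cover):
                lhs = smaller
                cover[i] = (lhs, rhs)
    # Condition 2: drop FDs that follow from the remaining ones.
    cover = list(dict.fromkeys(cover))  # remove exact duplicates, keep order
    for fd in list(cover):
        rest = [other for other in cover if other != fd]
        if fd[1] <= attribute_closure(fd[0], rest):
            cover = rest
    return cover


g = [({"X", "Y"}, {"Z"}), ({"X"}, {"Z"}), ({"X"}, {"Y"})]
for lhs, rhs in minimal_cover(g):
    print(sorted(lhs), "->", sorted(rhs))
```

On the example from the slide this yields X → Z and X → Y.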
23
More on Closures
If F is a set of FD's and X → Y ∉ F+
then for some attribute A ∈ Y, X → A ∉ F+
Proof by contradiction.
Assume otherwise and let Y = {A1, ..., An}
Since we assume X → A1, ..., X → An are in F+,
X → A1...An is in F+ by the union rule;
hence X → Y is in F+, which is a contradiction
24
Why Armstrong's Axioms?
Why are Armstrong's axioms (or an equivalent rule
set) appropriate for FD's? They are:
• Consistent: any relation satisfying FD's in F will satisfy
  those in F+
• Complete: if an FD X → Y cannot be derived by
  Armstrong's axioms from F, then there exists some
  relational instance satisfying F but not
  X → Y
➢ In other words, Armstrong's axioms derive all (and only) the
  FD's that should hold
25
Proving Consistency
We prove that each axiom must hold
for any instance, e.g.:
• For augmentation (if X → Y then XW → YW):
  If an instance satisfies X → Y, then:
  ◦ For any tuples t1, t2 ∈ r,
    if t1[X] = t2[X] then t1[Y] = t2[Y] by defn.
  ◦ If, additionally, it is given that t1[W] = t2[W],
    then t1[YW] = t2[YW]
  ◦ So t1[XW] = t2[XW] implies t1[YW] = t2[YW],
    i.e. the instance satisfies XW → YW
26
Proving Completeness
Suppose X → Y ∉ F+ and define a relational instance
r that satisfies F+ but not X → Y:
• Then for some attribute A ∈ Y, X → A ∉ F+
• Let some pair of tuples in r agree on X+ but disagree
  everywhere else:
  X              A     X+ – X         R – X+ – {A}
  x1 x2 ... xn   a1,1  v1 v2 ... vm   w1,1 w2,1 ...
  x1 x2 ... xn   a1,2  v1 v2 ... vm   w1,2 w2,2 ...
27
Proof of Completeness cont'd
• Clearly this relation fails to satisfy X → A and X → Y.
  We also have to check that it satisfies every FD in F+.
• The tuples agree only on X+.
  Thus the only FD's that might be violated are of the form
  X' → Y' where X' ⊆ X+ and Y' contains A or attributes in
  R – X+ – {A}.
• But if X' → Y' ∈ F+ and X' ⊆ X+ then Y' ⊆ X+
  (X → X+ holds and X+ → X' by reflexivity, so transitivity
  gives X → Y').
  Therefore X' → Y' is satisfied.
28
Decomposition
• Consider our original "bad" attribute set:
    Stuff(sid, name, serno, subj, cid, exp-grade)
• We could decompose it into:
    Student(sid, name)
    Course(serno, cid)
    Subject(cid, subj)
• But this decomposition loses information about
  the relationship between students and courses.
  Why?
29
Lossless Join Decomposition
R1, …, Rk is a lossless join decomposition of R w.r.t. an FD set F if
for every instance r of R that satisfies F,
  π_R1(r) ⋈ ... ⋈ π_Rk(r) = r
Consider:
  sid  name   serno   subj  cid  exp-grade
  1    Sam    570103  AI    570  B
  23   Nitin  550103  DB    550  A
What if we decompose on
(sid, name) and (serno, subj, cid, exp-grade)?
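To see what "lossy" means on this instance, one can project onto the two attribute sets and join the projections back together. The sketch below is illustrative only: `project` and `natural_join` are hypothetical helpers over lists of dicts, and `exp_grade` stands in for exp-grade.

```python
def project(rows, attrs):
    """Projection onto `attrs` with duplicate elimination."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined


r = [
    {"sid": 1, "name": "Sam", "serno": 570103, "subj": "AI", "cid": 570, "exp_grade": "B"},
    {"sid": 23, "name": "Nitin", "serno": 550103, "subj": "DB", "cid": 550, "exp_grade": "A"},
]

r1 = project(r, ["sid", "name"])
r2 = project(r, ["serno", "subj", "cid", "exp_grade"])
rejoined = natural_join(r1, r2)

print(len(r), len(rejoined))  # 2 4
```

The two projections share no attributes, so the join degenerates into a cross product: the original tuples are still there, but so are spurious ones (Sam paired with the DB course, Nitin with the AI course), and the information about who took what is gone.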
30
Testing for Lossless Join
R1, R2 is a lossless join decomposition of R with respect to F
iff at least one of the following dependencies is in F+
  (R1 ∩ R2) → R1 – R2
  (R1 ∩ R2) → R2 – R1
So for the FD set:
  sid → name
  serno → cid, exp-grade
  cid → subj
Is (sid, name) and (serno, subj, cid, exp-grade) a lossless
decomposition?
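The two-schema test can be run mechanically with an attribute closure: compute (R1 ∩ R2)+ and see whether it covers R1 – R2 or R2 – R1. The sketch below uses the FD set listed on this slide; the function names and the set-pair representation are assumptions, and `exp_grade` stands in for exp-grade.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) must determine R1 - R2 or R2 - R1."""
    determined = attribute_closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


fds = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid", "exp_grade"}),
    ({"cid"}, {"subj"}),
]

student = {"sid", "name"}
rest = {"serno", "subj", "cid", "exp_grade"}
print(is_lossless_binary(student, rest, fds))            # False: nothing in common
print(is_lossless_binary(student, rest | {"sid"}, fds))  # True: sid -> name
```

The second call keeps sid in both schemas (a hypothetical variant, not one proposed in the slides); the shared attribute then determines all of the first schema, which is what the test requires.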
31
Dependency Preservation
Ensures we can "easily" check whether an FD X → Y
is violated during an update to a database:
• The projection of an FD set F onto a set of attributes Z,
  F_Z, is
    {X → Y | X → Y ∈ F+, X ∪ Y ⊆ Z}
  i.e., it is those FDs local to Z's attributes
• A decomposition R1, …, Rk is dependency preserving if
    F+ = (F_R1 ∪ ... ∪ F_Rk)+
  The decomposition hasn't "lost" any essential FD's, so we
  can check without doing a join
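Computing each projection F_Ri explicitly is expensive, so dependency preservation is usually tested FD by FD: starting from the left-hand side, repeatedly add whatever can be derived inside a single schema, and check whether the right-hand side is eventually reached. The sketch below follows that standard technique rather than anything spelled out in the slides; the function names and the FD set for the four-table design are assumptions (`exp_grade` stands in for exp-grade).

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def preserves_dependencies(schemas, fds):
    """True if every FD in `fds` follows from the FDs local to the schemas."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in schemas:
                # What can be derived using only attributes inside `schema`.
                gained = attribute_closure(reached & schema, fds) & schema
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not rhs <= reached:
            return False
    return True


# Assumed FD set for the Student / Course / Subject / Takes design.
fds = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid"}),
    ({"cid"}, {"subj"}),
    ({"sid", "serno"}, {"exp_grade"}),
]
good = [{"sid", "name"}, {"serno", "cid"}, {"cid", "subj"}, {"sid", "serno", "exp_grade"}]
# Hypothetical variant that stores subj with serno instead of cid.
bad = [{"sid", "name"}, {"serno", "cid"}, {"serno", "subj"}, {"sid", "serno", "exp_grade"}]

print(preserves_dependencies(good, fds))  # True
print(preserves_dependencies(bad, fds))   # False: cid -> subj spans two tables
```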
32
Example of Lossless and
Dependency-Preserving Decompositions
Given relation scheme
  R(name, street, city, st, zip, item, price)
And FD set
  name → street, city
  street, city → st
  street, city → zip
  name, item → price
Consider the decomposition
  R1(name, street, city, st, zip) and R2(name, item, price)
➢ Is it lossless?
➢ Is it dependency preserving?
What if we replaced the first FD by name, street → city?
33
Another Example
Given scheme: R(sid, fid, subj)
and FD set:
  fid → subj
  sid, subj → fid
Consider the decomposition
  R1(sid, fid) and R2(fid, subj)
➢ Is it lossless?
➢ Is it dependency preserving?
34
FD's and Keys
• Ideally, we want a design s.t. for each nontrivial
  dependency X → Y, X is a superkey for some
  relation schema in R
• We just saw that this isn't always possible
• Hence we have two kinds of normal forms
35
Two Important Normal Forms
Boyce-Codd Normal Form (BCNF). For every relation
scheme R and for every X → A that holds over R,
  either A ∈ X (it is trivial), or
  X is a superkey for R
Third Normal Form (3NF). For every relation scheme
R and for every X → A that holds over R,
  either A ∈ X (it is trivial), or
  X is a superkey for R, or
  A is a member of some key for R
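Both definitions can be checked mechanically for a single relation. The sketch below is an illustration under stated assumptions: FDs are pairs of sets stated over the whole relation, only the listed FDs are examined (sufficient in that setting), and candidate keys are found by brute force, which is fine only for small schemas. The function names are hypothetical.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so not minimal
            if attrs <= attribute_closure(subset, fds):
                keys.append(subset)
    return keys


def is_bcnf(attrs, fds):
    """Every listed FD is trivial or has a superkey on the left."""
    return all(rhs <= lhs or attrs <= attribute_closure(lhs, fds) for lhs, rhs in fds)


def is_3nf(attrs, fds):
    """As BCNF, but an FD is also acceptable if its target is part of some key."""
    prime = set().union(*candidate_keys(attrs, fds))
    return all(
        a in lhs or attrs <= attribute_closure(lhs, fds) or a in prime
        for lhs, rhs in fds
        for a in rhs
    )


# The scheme R(sid, fid, subj) from the "Another Example" slide.
r = {"sid", "fid", "subj"}
fds = [({"fid"}, {"subj"}), ({"sid", "subj"}, {"fid"})]

print(is_bcnf(r, fds))  # False: fid -> subj, and fid is not a superkey
print(is_3nf(r, fds))   # True: subj belongs to the key {sid, subj}
```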
36
Normal Forms Compared
• BCNF is preferable, but sometimes in conflict with
  the goal of dependency preservation
• It's strictly stronger than 3NF
• Let's see algorithms to obtain:
  ◦ A BCNF lossless join decomposition
  ◦ A 3NF lossless join, dependency preserving decomposition
37
BCNF Decomposition Algorithm
(from Korth et al.; our book gives recursive version)
result := {R}
compute F+
while there is a schema Ri in result that is not in BCNF
{
  let A → B be a nontrivial FD on Ri
    s.t. A → Ri is not in F+
    and A and B are disjoint
  result := (result – {Ri}) ∪ {(Ri – B), (A, B)}
}
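A runnable version of this loop has to pick the violating FD somehow. The sketch below avoids computing F+ by trying subsets A of Ri and taking B = (A+ ∩ Ri) – A; that search strategy, the function names, and the FD set used for Stuff are assumptions made for illustration (`exp_grade` stands in for exp-grade), and the search is exponential in the schema size.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def find_violation(schema, fds):
    """Return (A, B) with A -> B nontrivial on `schema` and A not a superkey, else None."""
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            lhs = set(combo)
            determined = attribute_closure(lhs, fds) & schema
            rhs = determined - lhs          # disjoint from lhs by construction
            if rhs and determined != schema:
                return lhs, rhs
    return None


def bcnf_decompose(schema, fds):
    pending = [set(schema)]
    result = []
    while pending:
        current = pending.pop()
        violation = find_violation(current, fds)
        if violation is None:
            result.append(current)          # already in BCNF
        else:
            lhs, rhs = violation
            pending.append(current - rhs)   # (Ri - B)
            pending.append(lhs | rhs)       # (A, B)
    return result


stuff = {"sid", "name", "serno", "subj", "cid", "exp_grade"}
fds = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid"}),
    ({"cid"}, {"subj"}),
    ({"sid", "serno"}, {"exp_grade"}),
]

for schema in bcnf_decompose(stuff, fds):
    print(sorted(schema))
```

With these assumed FDs the result is the four-table design shown earlier: (cid, subj), (serno, cid), (sid, name), and (sid, serno, exp_grade). A different choice of violating FD at each step can produce a different decomposition.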
38
3NF Decomposition Algorithm
by Phil Bernstein, now @ MS Research
Let F be a minimal cover
i := 0
// Build dep.-preserving decomp.
for each FD A → B in F {
  if none of the schemas Rj, 1 ≤ j ≤ i, contains AB
  {
    increment i
    Ri := (A, B)
  }
}
// Ensure lossless decomp.
if no schema Rj, 1 ≤ j ≤ i, contains a candidate key for R {
  increment i
  Ri := any candidate key for R
}
return (R1, …, Ri)
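The synthesis algorithm maps onto a short Python sketch. It assumes the input is already a minimal cover given as pairs of sets, uses "the schema's closure is all of R" as the test for "contains a candidate key", and finds a key by greedily dropping attributes; the function names are hypothetical. The example uses the name/street/city scheme from the earlier slide, with right-hand sides split by hand.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (a list of (lhs, rhs) set pairs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def find_key(attrs, fds):
    """Shrink the full attribute set down to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if attrs <= attribute_closure(key - {a}, fds):
            key.discard(a)
    return key


def synthesize_3nf(attrs, cover):
    schemas = []
    # Build a dependency-preserving decomposition: one schema per FD.
    for lhs, rhs in cover:
        ab = lhs | rhs
        if not any(ab <= schema for schema in schemas):
            schemas.append(ab)
    # Ensure losslessness: some schema must contain a candidate key of R.
    if not any(attrs <= attribute_closure(schema, cover) for schema in schemas):
        schemas.append(find_key(attrs, cover))
    return schemas


r = {"name", "street", "city", "st", "zip", "item", "price"}
cover = [
    ({"name"}, {"street"}),
    ({"name"}, {"city"}),
    ({"street", "city"}, {"st"}),
    ({"street", "city"}, {"zip"}),
    ({"name", "item"}, {"price"}),
]

for schema in synthesize_3nf(r, cover):
    print(sorted(schema))
```

Here (name, item, price) already contains the key {name, item}, so no extra schema is added. Implementations often merge FDs with the same left-hand side first, which would give fewer, wider tables; this sketch stays with one schema per FD as in the pseudocode.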
39
Summary
• We can always decompose into 3NF and get:
  ◦ Lossless join
  ◦ Dependency preservation
• But with BCNF we are only guaranteed lossless joins
• BCNF is stronger than 3NF: every BCNF schema is
  also in 3NF
• The BCNF algorithm is nondeterministic, so there is
  not a unique decomposition for a given schema R

  • 1. ER Diagrams (Concluded), Schema Refinement, and Normalization Zachary G. Ives University of Pennsylvania CIS 550 โ€“ Database & Information Systems October 6, 2005 Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
  • 2. 2 Examples of ER Diagrams ๏‚ง Please interpret these ER diagrams: COURSESSTUDENTS Takes COURSESSTUDENTS Takes STUDENTS COURSESTakes
  • 3. 3 Converting ER Relationship Sets to Tables: 1:n Relationships CREATE TABLE Teaches( fid INTEGER, serno CHAR(15), semester CHAR(4), PRIMARY KEY (serno), FOREIGN KEY (fid) REFERENCES PROFESSORS, FOREIGN KEY (serno) REFERENCES Teaches) CREATE TABLE Teaches_Course( serno INTEGER, subj VARCHAR(30), cid CHAR(15), fid CHAR(15), when CHAR(4), PRIMARY KEY (serno), FOREIGN KEY (fid) REFERENCES PROFESSORS) โ€ข โ€œ1โ€ entity = key of relationship set: โ€ข Or embed relationship in โ€œmanyโ€ entity set: COURSES PROFESSORS Teaches
  • 4. 4 1:1 Relationships If you borrow money or have credit, you might get: What are the table options? CreditReport Borrower delinquent? ssn namedebt Describesrid
  • 5. 5 ISA Relationships: Subclassing (Structurally) ๏‚ง Inheritance states that one entity is a โ€œspecial kindโ€ of another entity: โ€œsubclassโ€ should be member of โ€œbase classโ€ name ISA People id Employees salary
  • 6. 6 But How Does thisTranslate into the Relational Model? Compare these options: ๏‚ง Two tables, disjoint tuples ๏‚ง Two tables, disjoint attributes ๏‚ง One table with NULLs ๏‚ง Object-relational databases (allow subclassing of tables)
  • 7. 7 Weak Entities A weak entity can only be identified uniquely using the primary key of another (owner) entity. ๏‚ง Owner and weak entity sets in a one-to-many relationship set, 1 owner : many weak entities ๏‚ง Weak entity set must have total participation People Feeds Pets ssn name weeklyCost name species
  • 8. 8 Translating Weak Entity Sets Weak entity set and identifying relationship set are translated into a single table; when the owner entity is deleted, all owned weak entities must also be deleted CREATE TABLE Feed_Pets ( name VARCHAR(20), species INTEGER, weeklyCost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (pname, ssn), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE CASCADE)
  • 9. 9 N-ary Relationships ๏‚ง Relationship sets can relate an arbitrary number of entity sets: Student Project Advisor Indep Study
  • 10. 10 Summary of ER Diagrams ๏‚ง One of the primary ways of designing logical schemas ๏‚ง CASE tools exist built around ER (e.g. ERWin, PowerBuilder, etc.) ๏‚ง Translate the design automatically into DDL, XML, UML, etc. ๏‚ง Use a slightly different notation that is better suited to graphical displays ๏‚ง Some tools support constraints beyond what ER diagrams can capture ๏‚ง Can you get different ER diagrams from the same data?
  • 11. 11 Schema Refinement & DesignTheory ๏‚ง ER Diagrams give us a start in logical schema design ๏‚ง Sometimes need to refine our designs further ๏‚ง Thereโ€™s a system and theory for this ๏‚ง Focus is on redundancy of data ๏‚Ÿ Causes update, insertion, deletion anomalies
  • 12. 12 Not All Designs are Equally Good Why is this a poor schema design? And why is this one better? Stuff(sid, name, serno, subj, cid, exp-grade) Student(sid, name) Course(serno, cid) Subject(cid, subj) Takes(sid, serno, exp-grade)
  • 13. 13 Focus on the Bad Design ๏‚ง Certain items (e.g., name) get repeated ๏‚ง Some information requires that a student be enrolled (e.g., courses) due to the key sid name serno subj cid exp-grade 1 Sam 570103 AI 520 B 23 Nitin 550103 DB 550 A 45 Jill 505103 OS 505 A 1 Sam 505103 OS 505 C
  • 14. 14 Functional Dependencies Describe โ€œKey-Likeโ€ Relationships A key is a set of attributes where: If keys match, then the tuples match A functional dependency (FD) is a generalization: If an attribute set determines another, written X !Y then if two tuples agree on attribute set X, they must agree on X: sid ! name What other FDs are there in this data? ๏ƒ˜ FDs are independent of our schema design choice
  • 15. 15 Formal Definition of FDโ€™s Def. Given a relation schema R and subsets X,Y of R: An instance r of R satisfies FD X ๏‚ฎY if, for any two tuples t1, t2 2 r, t1[X ] = t2[X] implies t1[Y] = t2[Y] ๏‚ง For an FD to hold for schema R, it must hold for every possible instance of r (Can a DBMS verify this? Can we determine this by looking at an instance?)
  • 16. 16 GeneralThoughts on Good Schemas We want all attributes in every tuple to be determined by the tupleโ€™s key attributes, i.e. part of a superkey (for key X ๏‚ฎY, a superkey is a โ€œnon-minimalโ€ X) What does this say about redundancy? But: ๏‚ง What about tuples that donโ€™t have keys (other than the entire value)? ๏‚ง What about the fact that every attribute determines itself?
  • 17. 17 Armstrongโ€™s Axioms: Inferring FDs Some FDs exist due to others; can compute using Armstrongโ€™s axioms: ๏‚ง Reflexivity: If Y ๏ƒ X then X ๏‚ฎ Y (trivial dependencies) name, sid ๏‚ฎ name ๏‚ง Augmentation: If X ๏‚ฎY then XW ๏‚ฎYW serno ๏‚ฎ subj so serno, exp-grade ๏‚ฎ subj, exp-grade ๏‚ง Transitivity: If X ๏‚ฎ Y andY ๏‚ฎ Z then X ๏‚ฎ Z serno ๏‚ฎ cid and cid ๏‚ฎ subj so serno ๏‚ฎ subj
  • 18. 18 Armstrongโ€™s Axioms Lead toโ€ฆ ๏‚ง Union: If X ๏‚ฎ Y and X ๏‚ฎ Z then X ๏‚ฎ YZ ๏‚ง Pseudotransitivity: If X ๏‚ฎ Y and WY ๏‚ฎ Z then XW ๏‚ฎ Z ๏‚ง Decomposition: If X ๏‚ฎ Y and Z ๏ƒ Y then X ๏‚ฎ Z Letโ€™s prove these from Armstrongโ€™s Axioms
  • 19. 19 Closure of a Set of FDโ€™s Defn. Let F be a set of FDโ€™s. Its closure, F+,is the set of all FDโ€™s: {X ๏‚ฎ Y | X ๏‚ฎ Y is derivable from F by Armstrongโ€™s Axioms} Which of the following are in the closure of our Student-Course FDโ€™s? name ๏‚ฎ name cid ๏‚ฎ subj serno ๏‚ฎ subj cid, sid ๏‚ฎ subj cid ๏‚ฎ sid
  • 20. 20 Attribute Closures: Is Something Dependent on X? Defn.The closure of an attribute set X, X+, is: X+ = ๏ƒˆ {Y | X ๏‚ฎY ๏ƒŽ F +} ๏‚ง This answers the question โ€œisY determined (transitively) by X?โ€; compute X+ by: ๏‚ง Does sid, serno ๏‚ฎ subj, exp-grade? closure := X; repeat until no change { if there is an FD U ๏‚ฎ V in F such that U is in closure then add V to closure}
  • 21. 21 Equivalence of FD sets Defn. Two sets of FDโ€™s, F and G, are equivalent if their closures are equivalent, F + = G + e.g., these two sets are equivalent: {XY ๏‚ฎ Z, X ๏‚ฎ Y} and {X ๏‚ฎ Z, X ๏‚ฎ Y} ๏‚ง F + contains a huge number of FDโ€™s (exponential in the size of the schema) ๏‚ง Would like to have smallest โ€œrepresentativeโ€ FD set
  • 22. 22 Minimal Cover Defn. A FD set F is minimal if: 1. Every FD in F is of the form X ๏‚ฎ A, where A is a single attribute 2. For no X ๏‚ฎ A in F is: F โ€“ {X ๏‚ฎ A } equivalent to F 3. For no X ๏‚ฎ A in F and Z ๏ƒŒ X is: F โ€“ {X ๏‚ฎ A } ๏ƒˆ {Z ๏‚ฎ A } equivalent to F Defn. F is a minimum cover for G if F is minimal and is equivalent to G. e.g., {X ๏‚ฎ Z, X ๏‚ฎ Y} is a minimal cover for {XY ๏‚ฎ Z, X ๏‚ฎ Z, X ๏‚ฎ Y} in a sense, each FD is โ€œessentialโ€ to the cover we express each FD in simplest form
  • 23. 23 More on Closures If F is a set of FDโ€™s and X ๏‚ฎ Y ๏ƒ F + then for some attribute A ๏ƒŽ Y, X ๏‚ฎ A ๏ƒ F + Proof by counterexample. Assume otherwise and let Y = {A1,..., An} Since we assume X ๏‚ฎ A1, ..., X ๏‚ฎ An are in F + then X ๏‚ฎ A1 ...An is in F + by union rule, hence, X ๏‚ฎY is in F + which is a contradiction
  • 24. 24 Why Armstrongโ€™s Axioms? Why are Armstrongโ€™s axioms (or an equivalent rule set) appropriate for FDโ€™s? They are: ๏‚ง Consistent: any relation satisfying FDโ€™s in F will satisfy those in F + ๏‚ง Complete: if an FD X ๏‚ฎ Y cannot be derived by Armstrongโ€™s axioms from F, then there exists some relational instance satisfying F but not X ๏‚ฎ Y ๏ƒ˜ In other words,Armstrongโ€™s axioms derive all the FDโ€™s that should hold
  • 25. 25 Proving Consistency We prove that the axiomsโ€™ definitions must be true for any instance, e.g.: ๏‚ง For augmentation (if X ๏‚ฎ Y then XW ๏‚ฎ YW): If an instance satisfies X ๏‚ฎY, then: ๏‚ง For any tuples t1, t2 ๏ƒŽr, if t1[X] = t2[X] then t1[Y] = t2[Y] by defn. ๏‚ง If, additionally, it is given that t1[W] = t2[W], then t1[YW] = t2[YW]
  • 26. 26 Proving Completeness Suppose X → Y ∉ F+ and define a relational instance r that satisfies F+ but not X → Y: ▪ Then for some attribute A ∈ Y, X → A ∉ F+ ▪ Let some pair of tuples in r agree on X+ but disagree everywhere else. Columns: X | A | X+ – X | R – X+ – {A}. Tuple 1: x1 x2 ... xn | a1,1 | v1 v2 ... vm | w1,1 w2,1 ... Tuple 2: x1 x2 ... xn | a1,2 | v1 v2 ... vm | w1,2 w2,2 ...
  • 27. 27 Proof of Completeness cont’d ▪ Clearly this relation fails to satisfy X → A and X → Y. We also have to check that it satisfies every FD in F+. ▪ The tuples agree only on X+. Thus the only FDs that might be violated are of the form X’ → Y’ where X’ ⊆ X+ and Y’ contains attributes outside X+ (that is, A or attributes in R – X+ – {A}). ▪ But if X’ → Y’ ∈ F+ and X’ ⊆ X+ then Y’ ⊆ X+ (reflexivity and augmentation). Therefore X’ → Y’ is satisfied.
  • 28. 28 Decomposition ▪ Consider our original “bad” attribute set: Stuff(sid, name, serno, subj, cid, exp-grade) ▪ We could decompose it into: Student(sid, name), Course(serno, cid), Subject(cid, subj) ▪ But this decomposition loses information about the relationship between students and courses. Why?
  • 29. 29 Lossless Join Decomposition R1, …, Rk is a lossless join decomposition of R w.r.t. an FD set F if for every instance r of R that satisfies F, Π_R1(r) ⋈ ... ⋈ Π_Rk(r) = r. Consider the instance with columns (sid, name, serno, subj, cid, exp-grade) and tuples (1, Sam, 570103, AI, 570, B) and (23, Nitin, 550103, DB, 550, A). What if we decompose on (sid, name) and (serno, subj, cid, exp-grade)?
  • 30. 30 Testing for Lossless Join R1, R2 is a lossless join decomposition of R with respect to F iff at least one of the following dependencies is in F+: (R1 ∩ R2) → R1 – R2, or (R1 ∩ R2) → R2 – R1. So for the FD set: sid → name; serno → cid, exp-grade; cid → subj. Is (sid, name) and (serno, subj, cid, exp-grade) a lossless decomposition?
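The two-schema test above can be checked mechanically with attribute closures: either dependency is in F+ exactly when the corresponding difference lies inside the closure of R1 ∩ R2. The following is a small sketch under that reading. The function names and the set-based FD encoding are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) -> R1 - R2 or (R1 ∩ R2) -> R2 - R1."""
    reach = closure(r1 & r2, fds)
    return (r1 - r2) <= reach or (r2 - r1) <= reach


# The FD set from the slide.
FDS = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid", "exp-grade"}),
    ({"cid"}, {"subj"}),
]

R1 = {"sid", "name"}
R2 = {"serno", "subj", "cid", "exp-grade"}
print(is_lossless(R1, R2, FDS))
```

Note that when the two schemas share no attributes, the closure of the empty intersection is empty under these FDs, so neither dependency can hold.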
  • 31. 31 Dependency Preservation Ensures we can “easily” check whether an FD X → Y is violated during an update to a database: ▪ The projection of an FD set F onto a set of attributes Z, F_Z, is {X → Y | X → Y ∈ F+, X ∪ Y ⊆ Z}, i.e., it is those FDs local to Z’s attributes ▪ A decomposition R1, …, Rk is dependency preserving if F+ = (F_R1 ∪ ... ∪ F_Rk)+ ▪ The decomposition hasn’t “lost” any essential FDs, so we can check without doing a join
  • 32. 32 Example of Lossless and Dependency-Preserving Decompositions Given relation scheme R(name, street, city, st, zip, item, price) and FD set: name → street, city; street, city → st; street, city → zip; name, item → price. Consider the decomposition R1(name, street, city, st, zip) and R2(name, item, price) ➢ Is it lossless? ➢ Is it dependency preserving? What if we replaced the first FD by name, street → city?
  • 33. 33 Another Example Given scheme: R(sid, fid, subj) and FD set: fid → subj; sid, subj → fid. Consider the decomposition R1(sid, fid) and R2(fid, subj) ➢ Is it lossless? ➢ Is it dependency preserving?
  • 34. 34 FDs and Keys ▪ Ideally, we want a design s.t. for each nontrivial dependency X → Y, X is a superkey for some relation schema in R ▪ We just saw that this isn’t always possible ▪ Hence we have two kinds of normal forms
  • 35. 35 Two Important Normal Forms Boyce-Codd Normal Form (BCNF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R. Third Normal Form (3NF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R, or A is a member of some key for R.
  • 36. 36 Normal Forms Compared ▪ BCNF is preferable, but sometimes in conflict with the goal of dependency preservation ▪ It’s strictly stronger than 3NF ▪ Let’s see algorithms to obtain: ▪ A BCNF lossless join decomposition ▪ A 3NF lossless join, dependency preserving decomposition
  • 37. 37 BCNF Decomposition Algorithm (from Korth et al.; our book gives recursive version) result := {R}; compute F+; while there is a schema Ri in result that is not in BCNF { let A → B be a nontrivial FD on Ri s.t. A → Ri is not in F+ and A and B are disjoint; result := (result – {Ri}) ∪ {(Ri – B), (A, B)} }
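The loop above can be sketched in Python as follows. This is an illustrative reading, not the Korth et al. code. Instead of materializing F+, it finds a violating FD on Ri by trying each proper nonempty subset A of Ri and taking B = (A+ ∩ Ri) – A, which is exponential in the schema size and only suitable for small examples. The names `find_violation` and `bcnf_decompose` are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (A, B) for a nontrivial FD A -> B on `schema` with A not a superkey, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            a = frozenset(combo)
            reach = closure(a, fds) & schema  # attributes of Ri determined by A
            b = reach - a                     # A and B are disjoint
            if b and reach != schema:         # nontrivial, and A -> Ri does not hold
                return a, b
    return None


def bcnf_decompose(schema, fds):
    pending = [frozenset(schema)]
    done = []
    while pending:
        ri = pending.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            done.append(ri)                   # Ri is already in BCNF
        else:
            a, b = violation
            pending.extend([ri - b, a | b])   # replace Ri by (Ri - B) and (A, B)
    return done


# R(sid, fid, subj) with fid -> subj and sid, subj -> fid
FDS = [({"fid"}, {"subj"}), ({"sid", "subj"}, {"fid"})]
for r in bcnf_decompose({"sid", "fid", "subj"}, FDS):
    print(sorted(r))
```

Which violating FD is picked first affects the result, so a different search order may yield a different (still lossless) BCNF decomposition.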
  • 38. 38 3NF Decomposition Algorithm by Phil Bernstein, now @ MS Research. Let F be a minimal cover; i := 0. Build a dependency-preserving decomposition: for each FD A → B in F { if none of the schemas Rj, 1 ≤ j ≤ i, contains AB { increment i; Ri := (A, B) } }. Ensure a lossless decomposition: if no schema Rj, 1 ≤ j ≤ i, contains a candidate key for R { increment i; Ri := any candidate key for R }. Return (R1, …, Ri).
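A compact Python sketch of the two phases above, assuming the caller already supplies a minimal cover. The helper `candidate_key` (which shrinks the full attribute set one attribute at a time) and the other names are illustrative assumptions. The test "Rj contains a candidate key" is implemented as "Rj is a superkey", which is equivalent.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(schema, fds):
    """Shrink the full attribute set until no attribute can be dropped."""
    key = set(schema)
    for attr in sorted(schema):
        if closure(key - {attr}, fds) >= set(schema):
            key.discard(attr)
    return frozenset(key)


def synthesize_3nf(schema, min_cover):
    schema = frozenset(schema)
    result = []
    # Phase 1: one schema per FD, unless an earlier schema already contains AB.
    for lhs, rhs in min_cover:
        ab = frozenset(lhs) | frozenset(rhs)
        if not any(ab <= rj for rj in result):
            result.append(ab)
    # Phase 2: add a candidate key if no schema is a superkey of R.
    if not any(closure(rj, min_cover) >= schema for rj in result):
        result.append(candidate_key(schema, min_cover))
    return result


# R(name, street, city, st, zip, item, price), FDs in single-attribute form
R = {"name", "street", "city", "st", "zip", "item", "price"}
COVER = [
    ({"name"}, {"street"}),
    ({"name"}, {"city"}),
    ({"street", "city"}, {"st"}),
    ({"street", "city"}, {"zip"}),
    ({"name", "item"}, {"price"}),
]
for r in synthesize_3nf(R, COVER):
    print(sorted(r))
```

As on the slide, schemas produced from FDs with the same left-hand side are not merged here; many textbook variants add that step to reduce the number of tables.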
  • 39. 39 Summary ▪ We can always decompose into 3NF and get: ▪ Lossless join ▪ Dependency preservation ▪ But with BCNF we are only guaranteed lossless joins ▪ BCNF is stronger than 3NF: every BCNF schema is also in 3NF ▪ The BCNF algorithm is nondeterministic, so there is not a unique decomposition for a given schema R