The document provides an introduction to database management systems (DBMS) and data modeling. It discusses the evolution of data models from hierarchical and network models to relational and object-oriented models. The relational model introduced tables and relationships between entities. The entity-relationship model uses diagrams to visually represent entities, attributes, and relationships. The object-oriented model treats data and relationships as objects that can contain attributes, methods, and inherit properties from classes.
2. Data, information and knowledge
• Data
– Collection of text, numbers and symbols with no
meaning.
• Eg: cat, dog, gerbil, rabbit, cockatoo
• 161.2, 175.3, 166.4, 164.7, 169.3
• Data + Meaning = Information
• Eg: cat, dog, gerbil, rabbit, cockatoo is a list of household
pets
• 161.2, 175.3, 166.4, 164.7, 169.3 are the heights of 15-
year-old students
3. Data, information and knowledge
• Knowledge
– acquiring and remembering a set of facts, or
– the use of information to solve problems.
– Two types
• Explicit Knowledge
• Tacit knowledge
– Information + application or use = Knowledge
• Eg: The tallest student is 175.3cm.
• A lion is not a household pet as it is not in the list and it
lives in the wild.
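The step from data to information to knowledge can be sketched in a few lines of Python. This is only an illustration: the variable names are assumptions, and the values are the heights listed on the slide.

```python
# Raw data: numbers with no meaning attached.
data = [161.2, 175.3, 166.4, 164.7, 169.3]

# Information: the same data plus meaning (what the numbers describe).
information = {
    "description": "heights of 15-year-old students",
    "unit": "cm",
    "values": data,
}

# Knowledge: the information is applied to answer a question.
tallest = max(information["values"])
print(f"The tallest student is {tallest}cm.")
```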
4. Features of flat file database
• Advantages
– All records are stored in one place
– Easy to set up
– Easy to understand
– Simple sorting of records can be carried out
– Record can be extracted on the basis of simple criteria
• Disadvantages
– High level of redundancy in data
– Non-unique records
– Redundancy in data often leads to inconsistency
– Inherently inefficient
– Harder to change data format
– Poor at complex queries, because of the inflexibility of files
– Poor at limiting access
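A minimal sketch of how redundancy in a flat file turns into inconsistency. The rows and field names are made up for illustration; the point is only that a value stored twice can be updated in one place and not the other.

```python
# A flat file repeats the customer's city on every row (hypothetical data).
orders = [
    {"order_id": 1, "customer": "Hayes", "city": "Harrison"},
    {"order_id": 2, "customer": "Hayes", "city": "Harrison"},
]

# Updating only one copy of the repeated value...
orders[0]["city"] = "Brooklyn"

# ...leaves two different cities recorded for the same customer.
cities_for_hayes = {row["city"] for row in orders if row["customer"] == "Hayes"}
inconsistent = len(cities_for_hayes) > 1
print(inconsistent)
```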
5. Database
• Systematic collection of data
• Collection of information that is organized so that it
can be easily accessed, managed and updated
• Online telephone directory
6. Database Management System(DBMS)
• Collection of programs used to access the database, manipulate
data and represent data
• Interface between the database and end users
• Programs that help in the creation and maintenance of a
database
– Collection of interrelated data
– Set of programs to access the data
– An environment that is both convenient and efficient to
use
• Examples: Microsoft Access, FoxPro, SQL Server, Oracle, Ingres, DB2
7. Drawbacks of using file systems to store data
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problems
– Integrity constraints become “buried” in program code
rather than being stated explicitly
– Difficult to add new constraints or change existing ones
• Atomicity of updates
– Failures may leave database in an inconsistent state with
partial updates carried out
• Concurrent access by multiple users
– Uncontrolled concurrent accesses can lead to inconsistencies
• Security problems
– Hard to provide user access to some, but not all, data
8. View of Data
• Provide users with an abstract view of data
• Hides how data is stored and maintained
• Data Abstraction
– Physical level: describes how a record is stored
– Logical level: describes what data are stored in the database, and the
relationships among the data
– View level: describes only part of the entire database
10. The three levels of data abstraction
• Physical level
– Block of consecutive storage locations
– Physical organization of data
– DBA
• Logical level
– Type definition
– Interrelationship
– Programmers and database administrators work at this level
• View level
– Set of application programs
– Views of the database are defined
– Users
11. Instances and Schemas
• Schema
– Design of a database
– Three types: Physical schema, logical schema and
view schema.
– Physical schema
• Design of a database at physical level
• How the data stored in blocks of storage is described at this
level.
– Logical schema
– View schema
• Instance – the actual content of the database at a
particular point in time
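The difference between a schema and an instance can be sketched with Python's built-in sqlite3 module. SQLite and the sample row are assumptions made for illustration; the account table follows the DDL example used elsewhere in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the design of the table. It stays the same while rows come and go.
conn.execute("CREATE TABLE account (acc_no CHAR(10) PRIMARY KEY, balance INTEGER)")

def instance(connection):
    """Return the current content (instance) of the account table."""
    return connection.execute(
        "SELECT acc_no, balance FROM account ORDER BY acc_no"
    ).fetchall()

before = instance(conn)                       # instance at one point in time
conn.execute("INSERT INTO account VALUES ('A-102', 400)")
after = instance(conn)                        # a later instance, same schema

# SQLite keeps the schema text in its catalog table sqlite_master.
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'account'"
).fetchone()[0]
```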
12. Data Independence
• Ability to modify schema definition in one level without affecting a
schema definition in the next higher level
• Physical Data Independence
– Ability to modify the physical schema without changing
application program
• Logical Data Independence
– Ability to modify the logical schema without changing application
program
– Needed when the logical structure of the database is changed
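Physical data independence can be sketched as follows, assuming SQLite through Python's sqlite3 module. Adding an index changes how the data are accessed physically, but the application's query and its result stay the same. The table content and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT, customer_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("192-83-7465", "Johnson"), ("677-89-9011", "Hayes")],  # sample rows
)

# The application program only knows this query.
QUERY = "SELECT customer_name FROM customer WHERE customer_id = ?"

before = conn.execute(QUERY, ("677-89-9011",)).fetchall()

# Physical-level change: add an index. The query text is not touched.
conn.execute("CREATE INDEX idx_customer_id ON customer (customer_id)")

after = conn.execute(QUERY, ("677-89-9011",)).fetchall()
```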
13. Data Models
• How the logical structure of a database is modeled
• A collection of tools for describing
– Data
– Data relationships
– Data semantics
– Data constraints
• Define how data is connected to each other and how they
are processed and stored inside the system
14. Data Models
• Relational model
– Tables to represent data and relationship
– Eg of record-based model
• Entity-Relationship data model (mainly for database
design)
– Objects (Entities) and relationship
• Object-based data models (Object-oriented and Object-
relational)
• Semistructured data model (XML)
– Individual data items of the same type may have different
sets of attributes
• Other older models:
– Network model
– Hierarchical model
15. Relational Databases
• Based on the relational model of data
• Uses tables to represent data and relationship
• Includes DML and DDL
• Each table has multiple columns, each with a unique name
• Eg. of record-based model
– Database is structured in fixed-format records of several types
– Each table contains records of a particular type
– Each record defines a fixed number of fields or attributes
17. Relational Databases
• SQL: widely used non-procedural language
• Data-Manipulation Language
– select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'
• Data-Definition Language
– create table account (acc_no char(10), balance integer)
• Application programs generally access databases
through one of
– Language extensions to allow embedded SQL
• DML precompiler converts DML to normal procedure calls
– Application program interface (e.g., ODBC/JDBC) which
allow SQL queries to be sent to a database
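As a rough analogue of the ODBC/JDBC route, the sketch below sends the DDL and DML statements from this slide to a database through an application program interface, here Python's sqlite3 module. SQLite, the column types of the customer table and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data-Definition Language: statements that define the schema.
conn.execute("CREATE TABLE account (acc_no CHAR(10), balance INTEGER)")
conn.execute(
    "CREATE TABLE customer (customer_id CHAR(11), customer_name VARCHAR(30))"
)

# Data-Manipulation Language: statements that insert and query data.
conn.execute("INSERT INTO customer VALUES (?, ?)", ("192-83-7465", "Johnson"))
rows = conn.execute(
    "SELECT customer.customer_name FROM customer "
    "WHERE customer.customer_id = ?",
    ("192-83-7465",),
).fetchall()
```

Passing the customer id as a parameter (the `?` placeholder) rather than pasting it into the SQL string is the usual way to send values through such an interface.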
20. System structure
• DBMS acts as an interface between the user and the
database.
• A database system is partitioned into modules for
different functions
• The functional components of a database system can be
broadly divided into the storage manager and query
processor components.
21. Database Users and User Interfaces
• Naïve users – invoke one of the permanent application
programs that have been written previously
• Application programmers – computer professionals who write
application programs
– Interact with the system through DML calls and RAD tools
• Sophisticated users – form requests in a database query
language
• Specialized users – write specialized database applications
that do not fit into the traditional data processing
framework
22. The Query Processor
• DDL interpreter
– Interprets DDL statements and records the definitions in the data
dictionary
• DML compiler
– Translates DML statements in a query language into an evaluation
plan consisting of low level instructions that the query evaluation
engine understands
• Query evaluation engine
– Executes low level instructions generated by the DML compiler.
23. Storage Manager
• Program module that provides the interface between the low-level
data stored in the database and the application programs and
queries submitted to the system
• The storage manager is responsible for the following tasks:
– Interaction with the file manager
– Efficient storing, retrieving and updating of data
• Authorization and integrity manager
– Which tests for the satisfaction of integrity constraints and checks the authority of
users to access data.
• Transaction manager
– Which ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without
conflicting.
• File Manager
– Which manages the allocation of space on disk storage and the data structures used
to represent information stored on disk.
• Buffer manager
– Which is responsible for fetching data from disk storage into main memory, and
deciding what data to cache in main memory.
24. Storage Manager
• Implements several data structures as part of the physical
system implementation
– Data files, which store the database itself
– Data dictionary, which stores metadata about the structure of the
database, in particular the schema of the database
– Indices, which provide fast access to data items that hold
particular values
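A small sketch of the data dictionary and an index, assuming SQLite through Python's sqlite3 module. In SQLite the catalog table sqlite_master plays the role of the data dictionary; the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data file content: the table itself.
conn.execute("CREATE TABLE account (acc_no CHAR(10), balance INTEGER)")

# Index: gives fast access to rows that hold a particular acc_no value.
conn.execute("CREATE INDEX idx_account_acc_no ON account (acc_no)")

# Data dictionary: metadata about the tables and indices that exist.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
```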
26. Transaction Management
• Atomicity
• Consistency
• Durability
• A transaction is a collection of operations that performs a
single logical function in a database application
• Transaction-management component ensures that the
database remains in a consistent (correct) state despite
system failures (e.g., power failures and operating system
crashes) and transaction failures.
• Concurrency-control manager controls the interaction
among the concurrent transactions, to ensure the
consistency of the database.
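Atomicity can be sketched with a funds transfer, assuming SQLite through Python's sqlite3 module. The account numbers, balances and the CHECK constraint are made up for illustration. Either both updates of a transfer are applied or neither is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "acc_no TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"   # integrity constraint
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 400)]
)
conn.commit()

def transfer(connection, source, target, amount):
    """Move amount from source to target as one transaction."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the credit made above has been rolled back

ok = transfer(conn, "A-101", "A-102", 100)       # succeeds
failed = transfer(conn, "A-101", "A-102", 1000)  # would overdraw A-101
balances = dict(conn.execute("SELECT acc_no, balance FROM account"))
```

The second transfer credits A-102 first and then fails on the debit, so without the rollback the database would be left with a partial update.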
28. Data Modeling and Data Models
• Database design
– Focuses on how the database structure will be used to store and manage
end-user data
– First step: data modeling
• Process of creating a specific data model for a determined problem domain
• Iterative, progressive process
• Data model
• An abstraction of a more complex real-world object
• Represents data structures, their characteristics, relations,
constraints and transformations
• “blueprint”
29. The Importance of Data Models
• Data models
– Relatively simple representations, usually graphical, of
complex real-world data structures
– Facilitate interaction among the designer, the
applications programmer, and the end user
• End-users have different views and needs for data
• Data model organizes data for various users
30. Data Model Basic Building Blocks
• Entity - anything about which data are to be
collected and stored
• Attribute - a characteristic of an entity
• Relationship - describes an association among
entities
– One-to-many (1:M) relationship
• Painter paints painting
– Many-to-many (M:N or M:M) relationship
• Employee learns skill
– One-to-one (1:1) relationship
• Employee manages store
• Constraint - a restriction placed on the data
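One common way to realize the three relationship types in tables is sketched below, assuming SQLite through Python's sqlite3 module. All table and column names are hypothetical. Note that SQLite only enforces the REFERENCES clauses when foreign keys are switched on; the UNIQUE constraint used for the 1:1 case is always enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
-- 1:M: each painting references exactly one painter
CREATE TABLE painting (painting_id INTEGER PRIMARY KEY, title TEXT,
                       painter_id INTEGER NOT NULL
                           REFERENCES painter(painter_id));

CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT);
-- M:N: a separate table pairs employees with skills
CREATE TABLE employee_skill (
    employee_id INTEGER REFERENCES employee(employee_id),
    skill_id    INTEGER REFERENCES skill(skill_id),
    PRIMARY KEY (employee_id, skill_id));

-- 1:1: UNIQUE stops one employee from managing two stores
CREATE TABLE store (store_id INTEGER PRIMARY KEY,
                    manager_id INTEGER UNIQUE
                        REFERENCES employee(employee_id));
""")

conn.execute("INSERT INTO employee VALUES (1, 'Ann')")
conn.execute("INSERT INTO store VALUES (10, 1)")
try:
    conn.execute("INSERT INTO store VALUES (11, 1)")  # same manager again
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
```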
31. Business Rules
• Brief, precise, and unambiguous descriptions of policies,
procedures, or principles within a specific organization
• Apply to any organization that stores and uses data to
generate information
• Help to create and enforce actions within that
organization’s environment
• Based on policies, procedures, or principles within a
specific organization
• Must be rendered in writing
• Must be kept up to date
• Sometimes are external to the organization
• Must be easy to understand and widely disseminated
• Describe characteristics of the data as viewed by the
company
32. Business Rules
• Why
– Enhance understanding & facilitate communication
• Standardize company’s view of data
• Constitute a communications tool between users and designers
• Allow designer to understand business process as well as the nature, role,
and scope of data
– Promote creation of an accurate data model
• How (sources)
– Interviews
• Company managers
• Policy makers
• Department managers
• End users
– Written documentation
• Procedures, Standards, Operations manuals
– Observation
• Business operations
33. Translating Business Rules into Data Model Components
• Nouns translate into entities
• Verbs translate into relationships among entities
• Relationships are bi-directional
36. The Hierarchical Model
• Each parent can have many children
• Each child has only one parent
• Tree is defined by path that traces parent segments
to child segments, beginning from the left
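The parent/child rule of the hierarchical model can be sketched in Python. The segment names are hypothetical; storing a single parent per child is what makes the structure a tree.

```python
# Each child segment stores exactly one parent; the root has no parent.
parent_of = {
    "Final assembly": None,
    "Component A": "Final assembly",
    "Component B": "Final assembly",
    "Part A1": "Component A",
}

def children_of(segment):
    """A parent can have many children."""
    return [child for child, parent in parent_of.items() if parent == segment]

def path_from_root(segment):
    """Trace the path of parent segments that leads to the given segment."""
    path = []
    while segment is not None:
        path.append(segment)
        segment = parent_of[segment]
    return list(reversed(path))

print(path_from_root("Part A1"))
```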
37. The Hierarchical Model-Advantages
– Conceptual simplicity – easy to understand the model layout
– Database security
– Data independence (a change in a data type will be automatically
cascaded throughout the database by the DBMS, thereby eliminating
the need to make changes in the program segments that reference the
changed data type)
– Database integrity – always a link between parent and child
– Efficiency – very efficient when it contains a large volume of data in 1:M
relationships and whose relationships are fixed over time
38. The Hierarchical Model-Disadvantages
– Complex implementation – detailed knowledge of the physical data storage
characteristics is required by the designers and programmers
– Difficult to manage – relocation of segments requires application changes
– Lacks structural independence
– Complex applications programming and use – programmers and end users
must know precisely how the data are physically distributed within the
database
– Implementation limitations – difficult to support M:N relationships
– Lack of standards – no standard DDL and no DML
39. The Network Model
• Created to
– Represent complex data relationships more effectively
– Improve database performance
– Impose a database standard
• Conference on Data Systems Languages (CODASYL)
• American National Standards Institute (ANSI)
• Database Task Group (DBTG)
40. Crucial Database Components
• Schema
– Conceptual organization of entire database as viewed by
the database administrator
• Subschema
– Defines database portion “seen” by the application
programs that actually produce the desired information
from data contained within the database
• Data Management Language (DML)
– Define data characteristics and data structure in order to
manipulate the data
41. Data Management
Language Components
• Schema Data Definition Language (DDL)
– Enables database administrator to define schema
components
• Subschema DDL
– Allows application programs to define database
components that will be used
• DML
– Manipulates database contents
42. Network Model—Basic Structure
• Resembles hierarchical model
• Collection of records in 1:M relationships
– A relationship is called a Set
– Composed of at least two record types
• Owner
– Equivalent to the hierarchical model’s parent
• Member
– Equivalent to the hierarchical model’s child
– A record can appear as a member in more than one set i.e., a
member may have multiple owners
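The set structure of the network model can be sketched as follows. The set names and records are hypothetical; the point is that one member record can belong to several sets and so have several owners.

```python
# Each set is a 1:M relationship from an owner record type to a member type.
dbtg_sets = {
    "SALES": {
        "owner_type": "customer",
        "member_type": "invoice",
        "links": {"Hayes": ["INV-1", "INV-2"]},   # owner -> members
    },
    "HANDLED": {
        "owner_type": "salesrep",
        "member_type": "invoice",
        "links": {"Lopez": ["INV-1"]},
    },
}

def owners_of(member):
    """Return the owner of the member in every set it appears in."""
    return {
        set_name: owner
        for set_name, set_def in dbtg_sets.items()
        for owner, members in set_def["links"].items()
        if member in members
    }

print(owners_of("INV-1"))
```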
44. The Network Model-Advantages and Disadvantages
• Advantages
– Conceptual simplicity
– Handles more relationship types
– Data access flexibility – no need for a preorder traversal
– Promotes database integrity – must first define the owner
and then the member record
– Data independence
– Conformance to standards
• Disadvantages
– System complexity
– Lack of structural independence
45. The Relational Model: Basic Structure
• Relational Database Management System
(RDBMS)
• Performs same basic functions provided by
hierarchical and network DBMS systems, plus
other functions
– RDBMS handles all the complex physical details
• Most important advantage of the RDBMS is its
ability to let the user/designer operate in a
human logical environment
46. The Relational Model: Basic Structure
• Table (relations)
– Matrix consisting of a series of row/column
intersections
– Related to each other by sharing a common entity
characteristic
• Relational schema
– Visual representation of relational database’s
entities, attributes within those entities, and
relationships between those entities
48. Relational Table
• Stores a collection of related entities
– Resembles a file
• Relational table is purely logical structure
– How data are physically stored in the database is of no
concern to the user or the designer
• Rise to dominance due in part to its powerful and
flexible query language
• SQL-based relational database application involves:
– User interface
– A set of tables stored in the database
– SQL engine
50. The Relational Model
• Advantages
– Structural independence – changes in the relational
data structure do not affect the DBMS’s data access in
any way
– Improved conceptual simplicity by concentrating on the
logical view
– Easier database design, implementation, management,
and use
– Ad hoc query capability - SQL
– Powerful database management system
51. The Relational Model (continued)
• Disadvantages
– Substantial hardware and system software
overhead
– Can facilitate poor design and implementation
– May promote “islands of information”
problems
52. The Entity Relationship Model
• Widely accepted and adapted graphical tool
for data modeling
• Introduced by Peter Chen in 1976
• Graphical representation of entities and their
relationships in a database structure
53. The Entity Relationship Model—
Basic Structure
• Entity relationship diagram (ERD)
– Uses graphic representations to model database
components
– Entity is mapped to a relational table
• Entity instance (or occurrence) is row in table
• Entity set is collection of like entities
• Connectivity labels types of relationships
– Diamond connected to related entities through a
relationship line
56. The Object Oriented Model
• Semantic data model (SDM) developed by
Hammer and McLeod in 1981
• Modeled both data and their relationships in a
single structure known as an object
• Basis of object oriented data model (OODM)
• OODM becomes the basis for the object
oriented database management system
(OODBMS)
57. The Object Oriented Model
• Object is described by its factual content
– Like relational model’s entity
• Includes information about relationships
between facts within object and relationships
with other objects
– Unlike relational model’s entity
• Subsequent OODM development allowed an
object to also contain operations
• Object becomes basic building block for
autonomous structures
58. The Object Oriented Model Components
• Object –Abstraction of real-world entity
• Attributes
• Methods
• Class hierarchy
• Inheritance
59. Developments that
Boosted OODM’s Popularity
• Growing costs put a premium on code
reusability
• Complex data types and system requirements
became difficult to manage with a traditional
RDBMS
• Became possible to support increasingly
sophisticated transaction & information
requirements
• Ever-increasing computing power made it
possible to support the large computing
overhead required
60. Object Oriented Data Model—
Basic Structure
• Object: abstraction of a real-world entity
• Attributes describe the properties of an object
• Objects that share similar characteristics are
grouped in classes
• Classes are organized in a class hierarchy
• Inheritance is the ability of an object within
the class hierarchy to inherit the attributes and
methods of classes above it
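Objects, attributes, methods and inheritance can be sketched with ordinary Python classes. The class names and values are hypothetical, and a programming-language class is only an analogy for an OODM class.

```python
class Person:
    """Class higher in the hierarchy: shared attributes and methods."""

    def __init__(self, name):
        self.name = name                      # attribute

    def describe(self):                       # method
        return f"{type(self).__name__} {self.name}"


class Employee(Person):
    """Inherits name and describe() from Person, and adds its own parts."""

    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary

    def give_raise(self, amount):
        self.salary += amount


hayes = Employee("Hayes", 3000)   # an object
hayes.give_raise(200)
summary = hayes.describe()        # inherited method
```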
62. The Object Oriented Model
• Advantages
– Adds semantic content
– Visual presentation includes semantic content
– Database integrity
– Both structural and data independence
63. The Object Oriented Model (continued)
• Disadvantages
– Slow pace of OODM standards development
– Complex navigational data access
– Steep learning curve
– High system overhead slows transactions
– Lack of market penetration
64. Other Models
• Extended Relational Data Model (ERDM)
– Semantic data model developed in response to
increasing complexity of applications
– DBMS based on the ERDM often described as
an object/relational database management
system (O/RDBMS)
– Primarily geared to business applications
65. Data Models: A Summary
• Each new data model capitalized on the
shortcomings of previous models
• Common characteristics:
– Conceptual simplicity without compromising the
semantic completeness of the database
– Represent the real world as closely as possible
– Representation of real-world transformations
(behavior) must be in compliance with
consistency and integrity characteristics of any
data model
67. Database Models and the Internet
• Characteristics of successful “Internet age” databases
– Flexible, efficient, and secure Internet access that is easily
used, developed, and supported
– Support for complex data types and relationships
– Seamless interfacing with multiple data sources and
structures
– Relative conceptual simplicity to make database design and
implementation less cumbersome
– An abundance of available database design, implementation,
and application development tools
– A powerful DBMS graphical user interface (GUI) to help make
the DBA’s job easier
68. Degrees of Data Abstraction
• Way of classifying data models
• Many processes begin at high level of abstraction and
proceed to an ever-increasing level of detail
• Designing a usable database follows the same basic
process
• American National Standards Institute/Standards
Planning and Requirements Committee (ANSI/SPARC)
– Classified data models according to their degree of
abstraction (1970s):
• Conceptual
• External
• Internal
70. The Conceptual Model
• Represents global view of the database
• Enterprise-wide representation of data as
viewed by high-level managers
• Basis for identification and description of
main data objects, avoiding details
• Most widely used conceptual model is the
entity relationship (ER) model
72. Advantages of Conceptual Model
• Provides a relatively easily understood macro
level view of data environment
• Independent of both software and hardware
– Does not depend on the DBMS software used to
implement the model
– Does not depend on the hardware used in the
implementation of the model
– Changes in either the hardware or the DBMS
software have no effect on the database design at
the conceptual level
73. The Internal Model
• Representation of the database as “seen”
by the DBMS
• Adapts the conceptual model to the DBMS
• Software dependent
• Hardware independent
75. The External Model
• End users’ view of the data environment
• Requires that the modeler subdivide set of
requirements and constraints into functional
modules that can be examined within the
framework of their external models
• Good design should:
– Consider such relationships between views
– Provide programmers with a set of restrictions that
govern common entities
77. Advantages of External Models
• Use of database subsets makes application
program development much simpler
– Facilitates designer’s task by making it easier to
identify specific data required to support each
business unit’s operations
– Provides feedback about the conceptual
model’s adequacy
• Creation of external models helps to ensure
security constraints in the database design
79. The Physical Model
• Operates at lowest level of abstraction, describing
the way data are saved on storage media such as
disks or tapes
• Software and hardware dependent
• Requires that database designers have a detailed
knowledge of the hardware and software used to
implement database design
82. Design Process
• Design of database schema
• Design of programs that access and update
data
• Design of security scheme to control access to
data
83. Design Phases
• Characterize fully the data needs of the prospective
database users
– Diagrammatic or textual representation
• Choose a data model and translate the requirements into
a conceptual schema
– Detailed overview of the enterprise
– Creation of E-R diagram
• Specification of functional requirements
– Describes the kinds of operations
84. Design Phases
• Process of moving from an abstract data model to
the implementation of the database
• Logical-design phase
– Maps the high-level conceptual scheme onto the
implementation data model
• Physical-design phase
– Physical features of the database are specified
86. Modeling
• A database can be modeled as:
– a collection of entities,
– relationship among entities.
• An entity is an object that exists and is
distinguishable from other objects.
– Example: specific person, company, event, plant
• Entities have attributes
– Example: people have names and addresses
• An entity set is a set of entities of the same type
that share the same properties.
– Example: set of all persons, companies, trees, holidays
87. Entity Sets customer and loan
customer: customer_id, customer_name, customer_street, customer_city
loan: loan_number, amount
88. Attributes
• An entity is represented by a set of attributes, that is
descriptive properties possessed by all members of an
entity set.
Example:
customer = (customer_id, customer_name,
customer_street, customer_city )
loan = (loan_number, amount )
• Domain – the set of permitted values for each attribute
• Attribute types:
– Simple and composite attributes.
– Single-valued and multi-valued attributes
• Example: multivalued attribute: phone_numbers
– Derived attributes
• Can be computed from other attributes
• Example: age, given date_of_birth
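The attribute types can be sketched with Python dataclasses. The class layout and sample values are assumptions made for illustration: address is a composite attribute, phone_numbers is multivalued, and age is derived from date_of_birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:                      # composite attribute
    street: str
    city: str


@dataclass
class Customer:
    customer_id: str                # simple, single-valued attribute
    customer_name: str
    address: Address
    date_of_birth: date
    phone_numbers: list = field(default_factory=list)   # multivalued

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


hayes = Customer("677-89-9011", "Hayes", Address("Main", "Harrison"),
                 date(2000, 6, 15), ["555-0100", "555-0101"])
```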
90. Relationship Sets
• A relationship is an association among several
entities
Example:
Hayes (customer entity), depositor (relationship set), A-102 (account entity)
• A relationship set is a mathematical relation among
n ≥ 2 entities, each taken from entity sets
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
– Example:
(Hayes, A-102) ∈ depositor
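The definition says that a relationship set is a subset of the Cartesian product of the participating entity sets. A minimal Python sketch, with sample entities chosen for illustration:

```python
from itertools import product

customers = {"Hayes", "Johnson"}          # entity set E1
accounts = {"A-101", "A-102"}             # entity set E2

# Relationship set: some of the possible (customer, account) pairs.
depositor = {("Hayes", "A-102")}

# Every relationship must pair one entity from E1 with one from E2.
all_pairs = set(product(customers, accounts))
is_relationship_set = depositor <= all_pairs

print(("Hayes", "A-102") in depositor)
```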
92. Relationship Set
• Association between entity sets is called participation
– Entity sets E1, E2, …, En participate in relationship set R
• Relationship instance in an E-R schema represents an
association between the named entities in the real-world
enterprise
– Individual customer entity Hayes (677-89-9011) and loan entity L-15
participate in the relationship borrower
– Relationship instance: the person called Hayes who holds customer-id
677-89-9011 has taken the loan numbered L-15
• The function that an entity plays in a relationship is the
entity’s role
– The relationship set works_for is modeled by
ordered pairs of employee entities (worker, manager)
93. Relationship Set
• An attribute can also be a property of a relationship set; this is a
descriptive attribute
• For instance, the depositor relationship set between entity
sets customer and account may have the attribute access-date
• The customer and loan entity sets participate in the relationship set borrower, and may
also participate in another relationship set, guarantor
94. Degree of a Relationship Set
• Refers to number of entity sets that participate in a
relationship set.
• Relationship sets that involve two entity sets are binary
(or degree two). Generally, most relationship sets in a
database system are binary.
• Relationship sets may involve more than two entity
sets.
• Example: Suppose employees of a bank may have jobs (responsibilities) at
multiple branches, with different jobs at different branches. Then there is a
ternary relationship set between entity sets employee, job, and branch
• Relationships between more than two entity sets are
rare. Most relationships are binary
95. Mapping Cardinality Constraints
• Express the number of entities to which another
entity can be associated via a relationship set.
• Most useful in describing binary relationship sets.
• For a binary relationship set the mapping cardinality
must be one of the following types:
– One to one
– One to many
– Many to one
– Many to many
96. Mapping Cardinality
(Figure: one-to-one and one-to-many mappings between sets A and B)
Note: Some elements in A and B may not be mapped to any
elements in the other set
98. Keys
• A super key of an entity set is a set of one or more
attributes whose values uniquely determine each
entity.
• A candidate key of an entity set is a minimal super key
– Superkey for which no proper subset is a superkey
– customer_id and {customer_name, customer_address} are
candidate keys
– customer_id and customer_name together can distinguish customer entities,
but do not form a candidate key, since customer_id alone is a candidate key
• Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
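The superkey and candidate key definitions can be checked against sample data with a short Python sketch. The rows are made up, and the check only covers this one instance; a real key is a property of the enterprise being modeled, not of whatever rows happen to be stored.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if the values of attrs differ for every pair of rows."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, attributes):
    """Minimal superkeys: superkeys with no proper subset that is a superkey."""
    keys = []
    for size in range(1, len(attributes) + 1):       # smallest sets first
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys

attributes = ("customer_id", "customer_name", "customer_address")
rows = [  # hypothetical instance
    {"customer_id": "192-83-7465", "customer_name": "Johnson", "customer_address": "Alma"},
    {"customer_id": "677-89-9011", "customer_name": "Hayes", "customer_address": "Main"},
    {"customer_id": "321-12-3123", "customer_name": "Hayes", "customer_address": "Alma"},
]
keys = candidate_keys(rows, attributes)
```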
99. Relationship set
• Let R be a relationship set involving entity sets E1, E2, ..., En.
• Let primary-key(Ei) denote the set of attributes that forms the
primary key for entity set Ei.
• If the relationship set R has no attributes associated with it, then the
set of attributes
primary-key(E1) U primary-key(E2) U ... U primary-key(En)
describes an individual relationship in set R
• If the relationship set R has attributes a1, a2, ..., am associated with
it, then the set of attributes
primary-key(E1) U primary-key(E2) U ... U primary-key(En) U {a1, a2,
..., am}
describes an individual relationship in set R
• The set of attributes
primary-key(E1) U primary-key(E2) U ... U primary-key(En)
forms a superkey for the relationship set
100. Relationship set
• Entity sets: customer and account
• Relationship set: depositor, with attribute access_date
• If the relationship set is many-to-many, the primary key of depositor is
the union of the primary keys of customer and account
• If the depositor relationship is many-to-one from customer to
account, the primary key of depositor is simply the primary key of
customer
• If the depositor relationship is many-to-one from account to
customer, the primary key of depositor is simply the primary key of
account
• If one-to-one, either primary key can be used
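The rules above can be written as a small Python function. The function name and the cardinality strings are assumptions made for illustration; the cardinality is read from the left entity set to the right one.

```python
def relationship_primary_key(cardinality, pk_left, pk_right):
    """Primary key of a binary relationship set between left and right."""
    if cardinality == "many-to-many":
        return set(pk_left) | set(pk_right)   # union of both primary keys
    if cardinality == "many-to-one":          # many left entities per right entity
        return set(pk_left)
    if cardinality == "one-to-many":          # many right entities per left entity
        return set(pk_right)
    if cardinality == "one-to-one":
        return set(pk_left)                   # either key works; left is picked here
    raise ValueError(f"unknown cardinality: {cardinality}")

customer_pk = {"customer_id"}
account_pk = {"account_number"}   # hypothetical attribute name

# depositor between customer (left) and account (right)
many_to_many = relationship_primary_key("many-to-many", customer_pk, account_pk)
many_to_one = relationship_primary_key("many-to-one", customer_pk, account_pk)
```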
101. Participation constraints
• Participation of an entity set E in a relationship set R
is total if every entity in E participates in
at least one relationship in R
• Participation is partial if some entities in E may not participate in any relationship in R
102. Removing redundant attributes in entity sets
• Entity sets: department, instructor
• Attributes
– instructor: id, name, deptname, salary (id is the primary key)
• phoneno, mobileno, officeno, ...
– department: deptname, building, budget, ... (deptname is the primary key)
• Each instructor has an associated department through the relationship set inst-dept
• deptname appears in both entity sets. Since it is the primary key of department and
inst-dept already records each instructor’s department, it is redundant in
instructor and needs to be removed
106. Roles
• Entity sets of a relationship need not be distinct
• The labels “manager” and “worker” are called roles; they specify
how employee entities interact via the works_for relationship set.
• Roles are indicated in E-R diagrams by labeling the lines that
connect diamonds to rectangles.
• Role labels are optional, and are used to clarify semantics of the
relationship
107. Cardinality Constraints
• We express cardinality constraints by drawing either a directed
line (→), signifying “one,” or an undirected line (—), signifying
“many,” between the relationship set and the entity set.
• One-to-one relationship:
– A customer is associated with at most one loan via the relationship
borrower
– A loan is associated with at most one customer via borrower
108. One-To-Many Relationship
• In the one-to-many relationship a loan is
associated with at most one customer via
borrower, a customer is associated with several
(including 0) loans via borrower
109. Many-To-One Relationships
• In a many-to-one relationship a loan is associated
with several (including 0) customers via borrower,
a customer is associated with at most one loan via
borrower
110. Many-To-Many Relationship
• A customer is associated with several (possibly 0) loans
via borrower
• A loan is associated with several (possibly 0) customers
via borrower
113. Cardinality Constraints on Ternary Relationship
• We allow at most one arrow out of a ternary (or
greater degree) relationship to indicate a cardinality
constraint
• E.g. an arrow from works_on to job indicates each
employee works on at most one job at any branch.
• If there is more than one arrow, there are two ways of
defining the meaning.
– E.g a ternary relationship R between A, B and C with arrows
to B and C could mean
1. each A entity is associated with a unique entity from B
and C or
2. each pair of entities from (A, B) is associated with a
unique C entity, and each pair (A, C) is associated with a
unique B
114. Participation of an Entity Set in a Relationship Set
Total participation (indicated by double line): every entity in the
entity set participates in at least one relationship in the relationship
set
E.g. participation of loan in borrower is total
every loan must have a customer associated to it via borrower
Partial participation: some entities may not participate in any
relationship in the relationship set
Example: participation of customer in borrower is partial
115. Alternative Notation for Cardinality Limits
Cardinality limits can also express participation constraints
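Limits are written l..h, where l is the minimum and h the maximum cardinality
A minimum value of 1 indicates total participation of the entity set in
the relationship set
A maximum value of 1 indicates that the entity participates in
at most one relationship
A maximum value of * indicates no limit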
116. Alternative Notation for Cardinality Limits
Each loan must have exactly one associated customer
A customer can have zero or more loans
Relationship borrower is one-to-many from customer to loan
Participation of loan in borrower is total
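A minimal sketch of how these constraints might be carried into tables, assuming Python's built-in sqlite3 module. The column names customer_id, customer_name, loan_number and amount and the sample rows are illustrative assumptions, not part of the slide. The NOT NULL foreign key on loan encodes "exactly one customer" (total participation, maximum 1); nothing forces a customer to have a loan (0..*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT)")
conn.execute(
    """CREATE TABLE loan (
           loan_number INTEGER PRIMARY KEY,
           amount      REAL,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
       )"""
)

conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer VALUES (2, 'Bob')")   # Bob has no loans (partial participation)
conn.execute("INSERT INTO loan VALUES (100, 5000.0, 1)")
conn.execute("INSERT INTO loan VALUES (101, 1200.0, 1)")

def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A loan with no customer, or with a non-existent customer, is rejected
loan_without_customer = try_insert("INSERT INTO loan VALUES (102, 900.0, NULL)")
loan_with_unknown_customer = try_insert("INSERT INTO loan VALUES (103, 900.0, 99)")

loans_per_customer = dict(conn.execute(
    "SELECT c.customer_name, COUNT(l.loan_number) FROM customer c "
    "LEFT JOIN loan l ON l.customer_id = c.customer_id GROUP BY c.customer_id"
).fetchall())
```

Here the "one" side of the relationship (customer) is referenced from the "many" side (loan), which is the usual way a one-to-many relationship set is folded into a table.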
118. Design Issues
• Use of entity sets vs. attributes
Choice mainly depends on the structure of the
enterprise being modeled, and on the semantics
associated with the attribute in question.
• Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship set to
describe an action that occurs between entities
• Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary,
for n > 2) relationship set by a number of distinct binary
relationship sets, an n-ary relationship set shows more
clearly that several entities participate in a single
relationship.
• Placement of relationship attributes
119. Use of entity sets vs. attributes
• Employee: employee_id and employee_name
• Telephone: telephone_no and location
• Relationship set emp_telephone
• As an attribute telephone_number: each employee has exactly one
telephone number
• As an entity set telephone: an employee may have several telephone numbers
• As a multivalued attribute?
• Common mistakes
– Using the primary key of an entity set as an attribute of another
entity set, instead of using a relationship
– Designating the primary key attributes of the related entity sets as
attributes of the relationship set
120. Use of entity sets vs. relationship sets
• customer: customerid, customername, customerstreet, customercity
• Loan: loannumber, amount
• Branch: branchname, branchcity, assets
• If every loan is held by exactly one customer and is associated with
exactly one branch, loan can be represented as a relationship set
• If several customers hold a loan jointly, each holder needs a separate
relationship with the same values for the descriptive attributes
loannumber and amount
• Two problems
– Data are stored multiple times, wasting storage space
– Updates potentially leave the data in an inconsistent state
• So model loan as an entity set, with borrower as the relationship set
between customer and loan
121. Binary Vs. Non-Binary Relationships
• Some relationships that appear to be non-binary
may be better represented using binary
relationships
– E.g. A ternary relationship parents, relating a child to
his/her father and mother, is best replaced by two
binary relationships, father and mother
• Using two binary relationships allows partial information
(e.g. only the mother being known)
122. Converting Non-Binary Relationships to Binary Form
• In general, any non-binary relationship can be represented using binary
relationships by creating an artificial entity set.
– Replace R between entity sets A, B and C by an entity set E, and three
relationship sets:
1. RA, relating E and A
2. RB, relating E and B
3. RC, relating E and C
– Create a special identifying attribute for E
– Add any attributes of R to E
– For each relationship (ai, bi, ci) in R:
1. create a new entity ei in the entity set E
2. add (ei, ai) to RA
3. add (ei, bi) to RB
4. add (ei, ci) to RC
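The conversion steps above can be sketched in a few lines of Python. This is an illustration only: the function name to_binary, the generated identifiers e1, e2, ... and the sample works_on rows are assumptions made for the example, and attributes of R are left out.

```python
from itertools import count

def to_binary(R):
    """Replace a ternary relationship R (a list of (a, b, c) triples)
    by an artificial entity set E and binary relationships RA, RB, RC."""
    ids = count(1)                      # source of identifying values for E
    E, RA, RB, RC = [], [], [], []
    for a, b, c in R:
        e = f"e{next(ids)}"             # new entity ei with its identifying attribute
        E.append(e)
        RA.append((e, a))
        RB.append((e, b))
        RC.append((e, c))
    return E, RA, RB, RC

# Hypothetical instance of works_on(employee, branch, job)
works_on = [("Jones", "Downtown", "teller"), ("Jones", "Uptown", "auditor")]
E, RA, RB, RC = to_binary(works_on)

# Joining the three binary relationships on e recovers the original triples
a_of, b_of, c_of = dict(RA), dict(RB), dict(RC)
recovered = [(a_of[e], b_of[e], c_of[e]) for e in E]
```

Each entity in E stands for exactly one original relationship, which is why the join on e gives back the original triples.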
123. Converting Non-Binary Relationships
• Also need to translate constraints
– Translating all constraints may not be possible
– There may be instances in the translated schema that
cannot correspond to any instance of R
• Exercise: add constraints to the relationships RA, RB and RC to
ensure that a newly created entity corresponds to exactly one
entity in each of entity sets A, B and C
– We can avoid creating an identifying attribute by
making E a weak entity set identified by the three
relationship sets
124. Placement of relationship attribute
Depositor is a one-to-many relationship set from customer to account: one customer may have
several accounts, but each account can have only one customer
access_date can then be made an attribute of account instead of a relationship
attribute
Attributes of a one-to-many relationship set can be repositioned only to the entity set on
the "many" side
For a one-to-one relationship set, the relationship attribute can be associated with either one
of the participating entities
For a many-to-many relationship set, access_date should remain a relationship attribute
125. Weak Entity Sets
• An entity set that does not have a primary key is referred to
as a weak entity set.
• The existence of a weak entity set depends on the existence
of an identifying entity set
– it must relate to the identifying entity set via a total, one-to-
many relationship set from the identifying to the weak entity set
– Identifying relationship depicted using a double diamond
• The discriminator (or partial key) of a weak entity set is the
set of attributes that distinguishes among all the entities of a
weak entity set.
• The primary key of a weak entity set is formed by the
primary key of the strong entity set on which the weak entity
set is existence dependent, plus the weak entity set’s
discriminator.
126. Weak Entity Sets (Cont.)
• We depict a weak entity set by double rectangles.
• We underline the discriminator of a weak entity set with a
dashed line.
• payment_number – discriminator of the payment entity set
• Primary key for payment – (loan_number, payment_number)
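A minimal sketch of how the payment weak entity set might look once mapped to tables, assuming Python's built-in sqlite3 module. The columns amount and payment_amount, the loan numbers, and the ON DELETE CASCADE choice are illustrative assumptions. In the table form the key of the identifying entity set (loan_number) is stored with payment and combined with the discriminator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade below
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
conn.execute(
    """CREATE TABLE payment (
           loan_number    TEXT    NOT NULL REFERENCES loan(loan_number) ON DELETE CASCADE,
           payment_number INTEGER NOT NULL,      -- discriminator
           payment_amount REAL,
           PRIMARY KEY (loan_number, payment_number)
       )"""
)
conn.execute("INSERT INTO loan VALUES ('L-17', 1000.0)")
conn.execute("INSERT INTO loan VALUES ('L-23', 2000.0)")
# payment_number 1 may repeat across loans: it is unique only within one loan
conn.executemany("INSERT INTO payment VALUES (?, ?, ?)",
                 [("L-17", 1, 50.0), ("L-17", 2, 75.0), ("L-23", 1, 100.0)])

# The same (loan_number, payment_number) pair cannot appear twice
try:
    conn.execute("INSERT INTO payment VALUES ('L-17', 1, 10.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Existence dependence: deleting the loan removes its payments
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute("SELECT loan_number, payment_number FROM payment").fetchall()
```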
127. Weak Entity Sets (Cont.)
• Note: the primary key of the strong entity set is
not explicitly stored with the weak entity set, since
it is implicit in the identifying relationship.
• If loan_number were explicitly stored, payment
could be made a strong entity, but then the
relationship between payment and loan would be
duplicated by an implicit relationship defined by
the attribute loan_number common to payment
and loan
136. Database Schema
• Database schema: logical design of the database
• Database instance: snapshot of the data in the
database at a given instant in time
– Changes with time
• Relation: corresponds to a variable
• Relation schema: corresponds to a type definition
• Relation instance: corresponds to the value of a variable
140. CODD'S 12 RULES OF RELATIONAL DATABASE
A relational database management system
(RDBMS) is a database management system
(DBMS) that is based on the relational model as
introduced by E. F. Codd.
A short definition of an RDBMS may be a DBMS
in which data is stored in the form of tables and
the relationship among the data is also stored in
the form of tables.
E. F. Codd, the famous mathematician, introduced
13 rules (numbered 0 to 12) for the relational model
for databases, commonly known as Codd's 12 rules.
141. Rule 0
• This rule states that all subsequent rules are
based on the notion that, for a
database to be considered relational, it must
use its relational facilities exclusively to
manage the database.
142. Rule 1: INFORMATION RULE
All information in the relational database is
represented in exactly one way: by
values in tables.
This is achieved by values in columns and rows of
tables.
All information, including table names, column names
and column data types, should be available in some
table within the database.
143. Rule 2: GUARANTEED ACCESS RULE
• Each and every datum (atomic value) is
guaranteed to be logically accessible by resorting
to a combination of table name, primary key value
and column name
– Each unique piece of data should be accessible
by: table name + primary key (row) + attribute (column).
– All data are uniquely identified and accessible via this
identity.
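A small sketch of the guaranteed access rule, assuming Python's built-in sqlite3 module. The student table, its roll_no key, the sample rows and the helper datum are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 82), (2, "Ravi", 67)])

def datum(table, pk_column, pk_value, column):
    """Fetch one atomic value by table name + primary key value + column name.
    Identifiers cannot be bound as parameters, so they are interpolated here;
    only do this with trusted names."""
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE {pk_column} = ?", (pk_value,)
    ).fetchone()
    return None if row is None else row[0]

ravi_marks = datum("student", "roll_no", 2, "marks")
```

No pointer, record position or file offset is needed: the three logical names are enough to reach the value.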
144. Rule 3: Systematic treatment of null values
• "Null values (distinct from the empty character
string or a string of blank characters and distinct
from zero or any other number) are supported in
fully relational DBMS for representing missing information
in a systematic way, independent of data type."
– NULLs may mean: missing data, not applicable
– This is distinct from zero or empty strings
– Primary keys: not NULL
– Separate handling of missing and/or non-applicable
data.
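The difference between NULL, zero and the empty string can be seen in a short sketch, assuming Python's built-in sqlite3 module (Python's None is stored as NULL). The student table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 0),      # scored zero
                  (2, "Ravi", None),   # marks missing: stored as NULL
                  (3, "", 75)])        # empty-string name is a value, not NULL

zero_marks = conn.execute("SELECT roll_no FROM student WHERE marks = 0").fetchall()
null_marks = conn.execute("SELECT roll_no FROM student WHERE marks IS NULL").fetchall()
# Comparing with NULL using '=' is never true, so this matches nothing
equals_null = conn.execute("SELECT roll_no FROM student WHERE marks = NULL").fetchall()
null_names = conn.execute("SELECT roll_no FROM student WHERE name IS NULL").fetchall()
# Aggregates skip NULLs: the average is taken over 0 and 75 only
average = conn.execute("SELECT AVG(marks) FROM student").fetchone()[0]
```

The same IS NULL test works for the INTEGER and TEXT columns alike, which is the "independent of data type" part of the rule.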
145. Rule 4: Dynamic Online Catalog Based on the Relational Model
• The database description is represented at the
logical level in the same way as ordinary data, so
that authorized users can apply the same relational
language to its interrogation as they apply to the
regular data.
– Authorized users can access the database structure
using the same query language they use for regular data
– Database structure description is contained in user-
accessible tables
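As an illustration, SQLite keeps its catalog in a table named sqlite_master that can be queried like any other table. The sketch below assumes Python's built-in sqlite3 module; the student table and record view are example objects. Many other systems expose a similar catalog through INFORMATION_SCHEMA views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.execute("CREATE VIEW record AS SELECT name, marks FROM student")

# The catalog is itself a table, queried with an ordinary SELECT
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column names and declared types are available through a PRAGMA statement;
# each row is (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
```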
146. Rule 5: COMPREHENSIVE DATA SUBLANGUAGE
• A relational system may support several languages
and various modes of terminal use. However, there
must be at least one language whose statements
are expressible, per some well-defined syntax, as
character strings and that is comprehensive in
supporting all of the following:
– Data definition (CREATE, ALTER, DROP)
– View definition
– Data manipulation (SELECT, INSERT, UPDATE, DELETE)
– Integrity constraints (primary key, foreign key, null values)
– Authorization (GRANT, REVOKE)
– Transaction boundaries (BEGIN, COMMIT, ROLLBACK, etc.)
147. Rule 6: VIEW UPDATING RULE
• All views that are theoretically updatable are also
updatable by the system
– View = "virtual table", derived from base
tables.
– Example: if a view is formed as a join of 3 tables,
changes to the view should be reflected in the base tables.
• CREATE VIEW record AS SELECT name, marks FROM
student;
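Real systems satisfy this rule only to varying degrees. As one illustration, assuming Python's built-in sqlite3 module: SQLite treats views as read-only, but an INSTEAD OF trigger can tell it how to translate an update on the view into an update on the base table. The trigger name record_update and the sample row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT PRIMARY KEY, marks INTEGER)")
conn.execute("INSERT INTO student VALUES ('Asha', 82)")
conn.execute("CREATE VIEW record AS SELECT name, marks FROM student")

# Without help, SQLite refuses to modify a view
try:
    conn.execute("UPDATE record SET marks = 90 WHERE name = 'Asha'")
    direct_update_ok = True
except sqlite3.Error:
    direct_update_ok = False

# An INSTEAD OF trigger describes how to apply the view update to the base table
conn.execute(
    """CREATE TRIGGER record_update INSTEAD OF UPDATE ON record
       BEGIN
           UPDATE student SET marks = NEW.marks WHERE name = OLD.name;
       END"""
)
conn.execute("UPDATE record SET marks = 90 WHERE name = 'Asha'")
base_marks = conn.execute("SELECT marks FROM student WHERE name = 'Asha'").fetchone()[0]
```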
148. RULE 7: HIGH-LEVEL INSERT, UPDATE AND DELETE
The database must support set-level inserts, updates and
deletes
This rule states that insert, update, and delete operations should be
supported for any retrievable set rather than just for a single row in a
single table.
A single operation can therefore act on multiple rows simultaneously.
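A short sketch of set-level operations, assuming Python's built-in sqlite3 module and a made-up student table. Each statement names a set of rows through its WHERE clause instead of looping over rows one at a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 82), (2, "Ravi", 37), (3, "Meena", 35), (4, "John", 64)])

# One statement updates every row in the set selected by the WHERE clause
cur = conn.execute("UPDATE student SET marks = marks + 5 WHERE marks < 40")
rows_updated = cur.rowcount

# Likewise, one DELETE removes a whole set of rows
cur = conn.execute("DELETE FROM student WHERE marks >= 60")
rows_deleted = cur.rowcount

remaining = conn.execute("SELECT name, marks FROM student ORDER BY roll_no").fetchall()
```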
149. RULE 8: PHYSICAL DATA INDEPENDENCE
• Application programs and terminal activities remain logically unimpaired
whenever any changes are made in either storage representation or access
methods
– The ability to change the physical schema without changing the logical
schema is called physical data independence.
– In other words, users need not be concerned with how the data is
stored or how it is accessed; they only need the logical definition of
the data they use.
– EXAMPLE:
A change to the internal schema, such as using a different file organization,
storage structures, storage devices, or indexing strategy, should be
possible without having to change the conceptual or external schemas.
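Adding an index is a typical physical-level change. The sketch below, assuming Python's built-in sqlite3 module and a made-up student table and index name, runs the same query text before and after the index is created and gets the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)])

# The application's query never mentions how the data is stored or accessed
QUERY = "SELECT name FROM student WHERE marks > 80 ORDER BY name"

before = conn.execute(QUERY).fetchall()
conn.execute("CREATE INDEX idx_student_marks ON student (marks)")  # physical-level change
after = conn.execute(QUERY).fetchall()
```

Only the access path may change; the query and its result do not.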
150. RULE 9: LOGICAL DATA INDEPENDENCE RULE
• Application programs and ad hoc facilities are logically unaffected when
changes are made to the table structures that preserve the original
table values (changing the order of columns or inserting columns)
– What is data independence?
The ability to modify the schema definition at one level without affecting the schema
definition at the next higher level is called data independence
– The ability to change the logical (conceptual) schema without changing
the external schema (user view) is called logical data independence.
– EXAMPLE:
The addition or removal of entities, attributes, or relationships in the
conceptual schema should be possible without having to change existing
external schemas or having to rewrite existing application programs.
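A minimal sketch of logical data independence, assuming Python's built-in sqlite3 module. The student table, the record view acting as the external schema, and the added email column are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 82)")
conn.execute("CREATE VIEW record AS SELECT name, marks FROM student")

# The application only knows the view (its external schema)
APP_QUERY = "SELECT name, marks FROM record"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual-schema change: a new attribute is added to the base table
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = conn.execute(APP_QUERY).fetchall()
```

An application that ran SELECT * directly against student would see the extra column, which is one reason to name columns explicitly or go through a view.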
152. RULE 10 : INTEGRITY INDEPENDENCE RULE
All relational integrity constraints must be definable
in the relational language and stored in the system
catalog, not at the application level
Data integrity refers to maintaining and assuring the
accuracy and consistency of data over its entire life
cycle.
Entity integrity: no component of a primary key is allowed to have
a null value
Referential integrity: every non-null foreign key value must match
an existing primary key value in the referenced table
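A sketch of integrity independence, assuming Python's built-in sqlite3 module. The department table, the sample rows and the helper rejected are hypothetical. The constraints are declared once in the schema, enforced by the DBMS, and their definition can be read back from the catalog. SQLite needs an explicit NOT NULL on a non-integer primary key and an explicit PRAGMA to enforce foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE employee (
           employee_id   TEXT NOT NULL PRIMARY KEY,
           employee_name TEXT,
           dept_id       INTEGER REFERENCES department(dept_id)
       )"""
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

null_pk_rejected = rejected("INSERT INTO employee VALUES (NULL, 'Asha', 10)")  # entity integrity
bad_fk_rejected = rejected("INSERT INTO employee VALUES ('E1', 'Asha', 99)")   # referential integrity
valid_rejected = rejected("INSERT INTO employee VALUES ('E1', 'Asha', 10)")

# The constraint definitions live in the catalog, not in application code
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'employee'"
).fetchone()[0]
```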
153. RULE 11 : DISTRIBUTION INDEPENDENCE RULE
• End users and application programs are unaware of and
unaffected by the data location (distributed vs. local databases)
– Distribution independence implies that users should not have to
be aware of whether a database is distributed across different sites
or not.
– Application programs and ad hoc requests are not affected by a
change in the distribution of physical data. Application programs will
continue to work even if the programs and data are moved to different sites
– The RDBMS may be spread across more than one system or several
networks
154. RULE 12 : NON-SUBVERSION RULE
• If the system supports low-level access to the
data, users must not be allowed to bypass the
integrity rules of the database
– There should be no way to modify the database structure other
than through the multiple-row database language (SQL).
Example:
If a relational system has a low-level (single-record-at-a-time)
language, that low-level language cannot be used to subvert or bypass the
integrity rules and constraints expressed in the higher-level
relational language (multiple-records-at-a-time).