Advance database system (part 3)

Advance Database Systems
Overview of RDBMS

Contents
• The Three-Level ANSI-SPARC Architecture
• Relational Data Structure
• Relational Keys

The Three-Level ANSI-SPARC Architecture
• The three-level architecture comprises an external, a conceptual,
and an internal level.

• The overall description (skeleton structure) of the database is called
the database schema.
• At the highest level, we have multiple external schemas (also called
subschemas) that correspond to different views of the data.
• At the conceptual level, we have the conceptual schema, which
describes all the entities, attributes, and relationships, together with
integrity constraints.
• At the lowest level, we have the internal schema, which is a complete
description of the internal model, containing the definitions of stored
records, the methods of representation, the data fields, and the
indexes and storage structures used. There is only one conceptual
schema and one internal schema per database.
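
The split between one conceptual schema and several external schemas can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the student table, its columns, the sample rows, and the view names registrar_view and mailing_view are hypothetical and do not come from the slides. The internal schema is not shown because SQLite manages its own storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: one logical description of the data (hypothetical table).
conn.execute("""
    CREATE TABLE student (
        regno   TEXT PRIMARY KEY NOT NULL,
        name    TEXT NOT NULL,
        address TEXT,
        cgpa    REAL
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("S001", "Ali", "Multan", 3.4), ("S002", "Sara", "Lahore", 3.8)],
)

# External schemas: two customized views of the same stored data.
conn.execute("CREATE VIEW registrar_view AS SELECT regno, name, cgpa FROM student")
conn.execute("CREATE VIEW mailing_view AS SELECT name, address FROM student")

registrar_rows = conn.execute("SELECT * FROM registrar_view ORDER BY regno").fetchall()
mailing_rows = conn.execute("SELECT * FROM mailing_view ORDER BY name").fetchall()
print(registrar_rows)
print(mailing_rows)
```

Each view exposes only the attributes one group of users needs, while the data itself is stored once.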

• The objective of the three-level architecture is to separate each user’s
view of the database from the way the database is physically
represented. There are several reasons why this separation is
desirable:-
1. Each user should be able to access the same data, but have a
different customized view of the data.
2. Users should not have to deal directly with physical database storage
details.
4. The internal structure of the database should be unaffected by changes
to the physical aspects of storage, such as the changeover to a new
storage device.
5. The Database Administrator (DBA) should be able to change the
conceptual and database storage structures without affecting the users’
views.

• A major objective for the three-level architecture is to provide data
independence, which means that upper levels are unaffected by
changes to lower levels.
• There are two kinds of data independence:-
• 1- Logical Data Independence.
• 2- Physical Data Independence.

1- Logical Data Independence: -
• Changes to the conceptual schema, such as the addition or removal
of entities, attributes, or relationships, should be possible
without having to change existing external schemas or having to
rewrite application programs. Clearly, the users for whom the changes
have been made need to be aware of them, but what is important is
that other users should not need to be.
2- Physical Data Independence: -
• Changes to the internal schema, such as using different file storage
structures or different storage devices, should be possible without
having to change the conceptual or external schemas.
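
Both kinds of data independence can be observed in a small experiment. The sketch below uses Python's sqlite3 module and assumes a hypothetical student table, an email column added later, and a view named student_names; none of these names come from the slides. Adding a column stands in for a conceptual change, and creating an index stands in for an internal (storage) change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (regno TEXT PRIMARY KEY NOT NULL, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("S001", "Ali", "Multan"), ("S002", "Sara", "Lahore")],
)
# External schema used by an existing application.
conn.execute("CREATE VIEW student_names AS SELECT regno, name FROM student")

def read_view():
    """The application's query; it is never rewritten below."""
    return conn.execute(
        "SELECT regno, name FROM student_names ORDER BY regno"
    ).fetchall()

before = read_view()

# Logical change: the conceptual schema gains a new attribute.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical = read_view()

# Physical change: a new index alters how rows are located, not what they mean.
conn.execute("CREATE INDEX idx_student_address ON student(address)")
after_physical = read_view()

print(before == after_logical == after_physical)
```

The view and the query against it stay the same across both changes, which is the behaviour the two definitions above describe.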

Relational Keys
• Keys are used to create relationships among different database tables.
• An entity type may have many instances, from a few to several
thousand or even more.
• When we want to pick one particular instance out of many, which we
often need to do, a key is the solution.
• For example, think of the whole population of Pakistan, with all the
data held in one place, say with NADRA. If at some time we need to
identify a particular person out of all this data, how can we do that?
• While defining an entity we generally also define the key of that
entity.
• A key can be simple, that is, consisting of a single attribute, or
composite, consisting of two or more attributes.

Candidate Key
• A super key is a set of attributes that identifies the entity instances
uniquely. A super key for which no proper subset is a super key is called a
candidate key; in other words, a minimal super key is a candidate key.
• This means there are two conditions for a candidate key. First, it
identifies the entity instances uniquely, as is required of a super key.
Second, it is minimal, that is, no proper subset of the candidate key is
itself a key.
• So, if we have a simple super key, that is, one that consists of a single
attribute, it is always a candidate key.
• If we have a composite super key and removing any attribute from it
leaves a set that is no longer a super key, then that composite super key
is also a candidate key, since it is a minimal super key.
• For example, one of the super keys that we identified from the entity
STUDENT is “regno, name”. This super key is not a candidate key: if we
remove the name attribute from this combination, the regno attribute alone
still identifies the entity instances uniquely, so the combination is not
minimal.
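
The two conditions (uniqueness and minimality) can be checked mechanically on a sample of data. The sketch below is a hedged illustration: the helper names is_super_key and candidate_keys, the address attribute, and the sample rows are hypothetical. A key is a property of the schema, so a sample can only rule out attribute sets that are not keys; it cannot prove that a set will stay unique for all future data.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Return the minimal super keys of this sample, smallest first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found key is not minimal.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found

students = [
    {"regno": "S001", "name": "Ali", "address": "Multan"},
    {"regno": "S002", "name": "Ali", "address": "Lahore"},
    {"regno": "S003", "name": "Sara", "address": "Multan"},
    {"regno": "S004", "name": "Ali", "address": "Multan"},
]
attributes = ("regno", "name", "address")

print(is_super_key(students, ("regno", "name")))  # a super key, but not minimal
print(candidate_keys(students, attributes))
```

On this sample, "regno, name" passes the uniqueness test but is skipped by candidate_keys because it contains regno, which is already a key on its own.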

Primary Key
• A candidate key chosen by the database designer to act as the key is the
primary key.
• An entity type may have more than one candidate key; in that case
the database designer has to designate one of them as the primary key,
since an entity type always has exactly one primary key.
• If there is just one candidate key, it is declared as the primary key.
The primary key can also be defined as the successful candidate key.
• The relation that holds between super and candidate keys also holds
between candidate and primary keys, that is, every primary key (PK) is
a candidate key and every candidate key is a super key.
• Any attribute may take the special value NULL, which means
“not given” or “not defined”.
• A major characteristic of the primary key is that it cannot have the
NULL value.
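
The uniqueness and no-NULL rules for a primary key can be tried out with Python's sqlite3 module. This is a minimal sketch: the student table, its columns, and the helper try_insert are hypothetical. NOT NULL is written explicitly because SQLite, for legacy reasons, does not imply it for every PRIMARY KEY declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Primary key declared with an explicit NOT NULL (see the note above).
conn.execute("CREATE TABLE student (regno TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S001', 'Ali')")

def try_insert(regno, name):
    """Return 'ok' or the constraint error raised by the insert."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (regno, name))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_result = try_insert("S001", "Sara")  # same key value twice
null_result = try_insert(None, "Sara")         # NULL key value
fresh_result = try_insert("S002", "Sara")      # new, non-NULL key value

print(duplicate_result, null_result, fresh_result, sep="\n")
```

Only the third insert succeeds; the first two are rejected by the uniqueness and NOT NULL constraints respectively.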

Unique Key
• A candidate key that identifies a record uniquely but may store a
NULL value is called a unique key.
• For example, the student contact number attribute in the STUDENT
table can serve as a unique key.
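
A unique key can be sketched as a UNIQUE column that is allowed to be NULL. The example below uses Python's sqlite3 module; the column names and sample values are hypothetical. SQLite accepts several NULLs in a UNIQUE column, and other database systems may treat NULLs in unique columns differently, so this shows one system's behaviour rather than a general rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        regno   TEXT PRIMARY KEY NOT NULL,
        name    TEXT,
        contact TEXT UNIQUE          -- unique key: may be NULL
    )
""")
conn.execute("INSERT INTO student VALUES ('S001', 'Ali', '0300-1111111')")
conn.execute("INSERT INTO student VALUES ('S002', 'Sara', NULL)")
conn.execute("INSERT INTO student VALUES ('S003', 'Omar', NULL)")  # second NULL accepted by SQLite

try:
    # Reusing an existing non-NULL contact number violates the unique key.
    conn.execute("INSERT INTO student VALUES ('S004', 'Hina', '0300-1111111')")
    duplicate_contact_rejected = False
except sqlite3.IntegrityError:
    duplicate_contact_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_contact_rejected, row_count)
```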

Alternate Key
• Candidate keys which are not chosen as the primary key are known as
alternate keys.
• For example, we have two candidate keys of EMPLOYEE in Figure 2, regNo
and nIdNumber; if we select regNo as the PK, then nIdNumber will be the
alternate key.

Foreign Key
• Sometimes the information stored in a relation is linked to the
information stored in another relation.
• If one of the relations is modified, the other must be checked, and
perhaps modified, to keep the data consistent.
• Suppose that in addition to Students, we have a second relation:
• Enrolled (cId: string, sId: string, cGrade: Text)
• The sId field of Enrolled is called a foreign key and refers to Students.
• The foreign key in the referencing relation (Enrolled, in our example)
must match the primary key of the referenced relation (Students).

• As the figure shows, there may well be some students who are not
referenced from Enrolled (e.g., the student with sId = 50000).
• However, every sId value that appears in the instance of the Enrolled
table appears in the primary key column of a row in the Students
table.
• If we try to insert the tuple (55555, Art 104, A) into E1, the rule is
violated because there is no tuple in S1 with the id 55555; the
database system should reject such an insertion.
• Similarly, if we delete the tuple (53666, Jones, jones@cs, 18, 3.4) from
S1, we violate the foreign key constraint because the tuple (53666,
History 105, B) in E1 contains the sId value 53666, the sId of the deleted
Students tuple.
• The DBMS should either disallow the deletion or also delete the
Enrolled tuple that refers to the deleted Students tuple.
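
The insertion and deletion cases above can be reproduced with Python's sqlite3 module. This is a sketch under assumptions: the Students column names (name, login, age, gpa), the other attribute values of student 50000, and the helpers build and rejected are hypothetical. NO ACTION models "disallow the deletion", and CASCADE models "also delete the Enrolled tuple".

```python
import sqlite3

def build(on_delete):
    """Create Students/Enrolled with the given ON DELETE action and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("""CREATE TABLE Students (
        sId TEXT PRIMARY KEY NOT NULL, name TEXT, login TEXT, age INTEGER, gpa REAL)""")
    conn.execute(f"""CREATE TABLE Enrolled (
        cId TEXT, sId TEXT, cGrade TEXT,
        FOREIGN KEY (sId) REFERENCES Students(sId) ON DELETE {on_delete})""")
    conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)")
    # Placeholder values for the student who has no Enrolled rows.
    conn.execute("INSERT INTO Students VALUES ('50000', 'Dave', 'dave@cs', 19, 3.3)")
    conn.execute("INSERT INTO Enrolled VALUES ('History 105', '53666', 'B')")
    return conn

def rejected(conn, sql):
    """True if the statement is refused by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

strict = build("NO ACTION")
# No student 55555 exists, so this Enrolled row has nothing to refer to.
insert_rejected = rejected(strict, "INSERT INTO Enrolled VALUES ('Art 104', '55555', 'A')")
# Student 53666 is still referenced from Enrolled.
delete_rejected = rejected(strict, "DELETE FROM Students WHERE sId = '53666'")

cascading = build("CASCADE")
cascading.execute("DELETE FROM Students WHERE sId = '53666'")
remaining = cascading.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]

print(insert_rejected, delete_rejected, remaining)
```

Which of the two deletion policies is appropriate is a design decision for each relationship; the sketch only shows that both are expressible.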

Secondary Key
• Many times we need to access certain instances of an entity type using the values of
one or more attributes other than the PK.
• The difference between accessing instances by the value of a key and by the value of a
non-key attribute is that a search on the value of the PK will always return a single
instance (if it exists), whereas uniqueness is not guaranteed for a non-key attribute.
• An attribute that we use to access the instances of an entity type, and that does not
necessarily return a unique instance, is called a secondary key.
• For example, suppose we want to see how many of our students belong to Multan; in that
case we will access those instances of the STUDENT entity type that contain “Multan” in
their address.

• In this case address is called a secondary key, since we are
accessing instances on the basis of its value and there is no
guarantee that we will get a single instance.
• Keep in mind that a particular access on the value of a
secondary key MAY return a single instance, but that is
a matter of chance.
• A secondary key is not required to return a unique instance.
• In the case of super, candidate, primary, and alternate keys, however,
a particular value will always identify at most one instance.
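
The contrast between access by primary key and access by secondary key can be seen in a small query experiment. The sketch below uses Python's sqlite3 module; the sample rows and the index name idx_student_address are hypothetical. A non-unique index is shown as one common way to support lookups on a secondary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (regno TEXT PRIMARY KEY NOT NULL, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("S001", "Ali", "Multan"),
        ("S002", "Sara", "Lahore"),
        ("S003", "Omar", "Multan"),
        ("S004", "Hina", "Karachi"),
    ],
)
# Non-unique index: speeds up access by address without enforcing uniqueness.
conn.execute("CREATE INDEX idx_student_address ON student(address)")

by_pk = conn.execute("SELECT name FROM student WHERE regno = 'S001'").fetchall()
by_multan = conn.execute(
    "SELECT name FROM student WHERE address = 'Multan' ORDER BY regno"
).fetchall()
by_lahore = conn.execute("SELECT name FROM student WHERE address = 'Lahore'").fetchall()

print(len(by_pk), len(by_multan), len(by_lahore))
```

The search on "Lahore" happens to return one row, but only because of the data; the search on regno returns at most one row by definition.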

Surrogate Key
• A surrogate key is a column (or set of columns) that can be declared
as the primary key in place of a composite primary key whose several
attributes together make a cumbersome (large, unwieldy) key. An example
of a cumbersome key and a surrogate key is shown on the next slide.
• Remember that the primary key MUST be unique. This is why treatment date
and time are included in the composite primary key.
• But this makes a very cumbersome key.
• It would be better to create a surrogate key like treatmentId in the
PATIENT_TREATMENT table.
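
The treatmentId idea can be sketched with Python's sqlite3 module. The table name and treatmentId come from the slide; the other columns (patientId, treatmentCode, treatmentDate, treatmentTime) and the sample rows are assumptions made for illustration. The cumbersome natural key is kept as a UNIQUE constraint so duplicates are still rejected, while other tables would only need to reference the single surrogate column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PATIENT_TREATMENT (
        treatmentId   INTEGER PRIMARY KEY,   -- surrogate key, assigned by SQLite
        patientId     TEXT NOT NULL,
        treatmentCode TEXT NOT NULL,
        treatmentDate TEXT NOT NULL,
        treatmentTime TEXT NOT NULL,
        -- The original composite key survives as a uniqueness rule.
        UNIQUE (patientId, treatmentCode, treatmentDate, treatmentTime)
    )
""")

insert_sql = """
    INSERT INTO PATIENT_TREATMENT
        (patientId, treatmentCode, treatmentDate, treatmentTime)
    VALUES (?, ?, ?, ?)
"""
conn.execute(insert_sql, ("P01", "XRAY", "2024-01-05", "09:30"))
conn.execute(insert_sql, ("P01", "XRAY", "2024-01-05", "14:00"))

try:
    # Same patient, treatment, date, and time as the first row.
    conn.execute(insert_sql, ("P01", "XRAY", "2024-01-05", "09:30"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

ids = [row[0] for row in conn.execute(
    "SELECT treatmentId FROM PATIENT_TREATMENT ORDER BY treatmentId"
)]
print(ids, duplicate_rejected)
```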
