1. INTRODUCTION TO RDBMS
VIDHYA B
ASSISTANT PROFESSOR
Department of Information Technology
Sri Ramakrishna College of Arts and Science
Coimbatore - 641 006
Tamil Nadu, India
2. UNIT 1: INTRODUCTION
Purpose of Database Systems
Database System Applications
View of Data
Database Design
Data Models
Database Languages
Relational Databases
Database Storage and Querying
Transaction Management
Database Administrator
Database Users
Overall System Structure
3. DATA VS. INFORMATION
Sno | Data | Information
1 | Data is facts and figures | Processed data that reveals meaning
2 | Not significant to business | Significant to business
3 | Atomic-level pieces of information | A collection of data
4 | Does not help in making decisions | Helps in making decisions
5 | Generally in unorganized form | In organized form
6 | Does not depend on information | Depends on data
7 | Example: 89, 95, 90 are numbers | Example: 89, 95, 90 are the marks of 3 subjects
4. DEFINITION
Database
• A database is a collection of logically related data, together with
a description of that data, designed to meet the needs of an organization.
Database Management System (DBMS)
• A DBMS is software that enables users to define, create, access and
maintain the database, and to control access to it.
• Goal: to provide a way to store and retrieve database information
that is both convenient and efficient.
Database + DBMS software = Database System
Database System
• It is an integrated collection of related files along with the details
of their definition, interpretation, manipulation and
maintenance.
5. PURPOSE OF DATABASE SYSTEM
Consider the following example:
A savings bank keeps information about all its customers and savings
accounts. One way to keep the information on a computer is to store the
data in permanent system files.
The system uses a number of application programs to manipulate the files,
including
• Program to debit or credit an account
• Program to add a new account
• Program to find the balance of an account
• Program to generate monthly statements
New application programs are added to the system as the need arises.
Keeping organizational information in a file processing system has
a number of major disadvantages (each of which a DBMS is designed to
overcome):
1. Data redundancy and inconsistency
2. Difficulty in accessing data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Concurrent access anomalies
7. Security problems
6. 1. DATA REDUNDANCY AND INCONSISTENCY
• Files and application programs are created by different
programmers over a long period - files have different
formats and the programs may be written in several
programming languages.
• Same information may be duplicated in several places
(files).
• Example : address and telephone number of a particular
customer may appear in a file that consists of savings
account records and in a file that consists of checking
account records.
• Redundancy leads to higher storage and access cost.
• Redundancy may also lead to data inconsistency. Example: a change in a
customer's address may be reflected in the savings account records
but not elsewhere in the system.
7. 2. DIFFICULTY IN ACCESSING DATA
• Example: find the names of all customers who live
within the city's pincode 641 041.
• There is no application program to meet this
requirement.
• The existing application program generates only the list of all customers.
• Two choices:
• Obtain the list of all customers and extract the
needed information manually, or
• Write a new application program.
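For contrast, a database system can answer such a one-off request with a single declarative query. The following is a minimal sketch using Python's built-in sqlite3 module; the customer table, its columns and its rows are assumptions made up for illustration and do not come from the slides.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, pincode TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("xxx", "Perur", "641010"), ("yyy", "Pudur", "641041"), ("zzz", "Pudur", "641041")],
)


def customers_in_pincode(pincode):
    """Return the names of customers living in the given pincode, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE pincode = ? ORDER BY name", (pincode,)
    )
    return [name for (name,) in rows]


print(customers_in_pincode("641041"))
```

No new application program is needed for a different pincode; only the query parameter changes.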
3. DATA ISOLATION
• Data are scattered in various files, and the files may be
in different formats, so it is difficult to write new application
programs to retrieve the appropriate data.
8. PURPOSE OF DATABASE SYSTEMS
4. INTEGRITY PROBLEMS
• Data values stored in the database must satisfy certain
types of consistency constraints.
• Example: the balance of a bank account may never fall below a
prescribed amount.
• In a file processing system, developers enforce these constraints
by adding code to the application programs.
• When new constraints are added, it is difficult to change the
existing programs to enforce them.
• The problem is compounded when constraints involve
several data items from different files.
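In a database system, such a constraint can be declared once in the schema instead of being repeated in every program. A minimal sketch with Python's sqlite3 module follows; the minimum balance of 100, the table layout and the helper name withdraw are assumptions for illustration.

```python
import sqlite3

MIN_BALANCE = 100  # assumed prescribed minimum, not taken from the slides

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "account_number TEXT PRIMARY KEY, "
    f"balance INTEGER NOT NULL CHECK (balance >= {MIN_BALANCE}))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.commit()


def withdraw(account_number, amount):
    """Return True if the withdrawal succeeds, False if the CHECK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, account_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance_of(account_number):
    """Return the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE account_number = ?", (account_number,)
    ).fetchone()
    return row[0]
```

Every program that updates the table is subject to the same rule, because the database itself checks it.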
9. 5. ATOMICITY PROBLEMS
• When a failure occurs in the system, the data must be
restored to the consistent state that existed prior to the
failure.
• Example: a transfer of Rs. 100 from account A to account B. If
a system failure occurs during the execution of the program, the
amount may have been removed from account A
but not credited to account B, resulting in an
inconsistent state.
• For the database to remain consistent, either both the credit and
the debit must occur, or neither must occur; i.e., the funds transfer
must be atomic.
• This property is difficult to ensure in a conventional
file processing system.
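A database transaction gives this all-or-nothing behaviour. The sketch below uses Python's sqlite3 module and simulates the failure with an exception raised between the debit and the credit; the starting balances and the fail_midway flag are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction; undo everything on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
    except RuntimeError:
        pass  # the debit was rolled back along with the failed transaction


def balances():
    """Return the current balances as a dictionary keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

After a failed transfer the balances are unchanged; after a successful one both the debit and the credit are visible.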
10. 6. CONCURRENT ACCESS ANOMALIES
• Many systems allow multiple users to update the data
simultaneously.
• Concurrent updates may result in inconsistent data.
• Example: the balance in a savings bank account is Rs. 500. If two
customers withdraw Rs. 100 and Rs. 50 respectively at
about the same time, the concurrent executions may
leave the account in an incorrect state.
• If the two programs run concurrently, they may both read
the value 500 and write back 400 and 450 respectively.
The account then holds 400 or 450, depending on which write
comes last, rather than the correct value of 350.
• Some form of supervision is needed to guard against this possibility,
which is difficult to provide in a file processing system.
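The arithmetic of this anomaly can be traced with a small deterministic sketch. It does not use real threads; it only replays one possible interleaving of the two programs next to the serial order, and the function names are made up for illustration.

```python
def interleaved_withdrawals(balance, first, second):
    """Both programs read the balance before either one writes it back."""
    read_1 = balance            # program 1 reads the balance
    read_2 = balance            # program 2 reads the same, still unchanged, balance
    balance = read_1 - first    # program 1 writes its result
    balance = read_2 - second   # program 2 overwrites it; program 1's update is lost
    return balance


def serial_withdrawals(balance, first, second):
    """The second program starts only after the first one has finished."""
    balance = balance - first
    balance = balance - second
    return balance


print(interleaved_withdrawals(500, 100, 50))  # one of the two incorrect outcomes
print(serial_withdrawals(500, 100, 50))       # the correct outcome
```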
11. 7. SECURITY PROBLEMS
• Not every user of the database system should be
allowed to access all the data.
• Example: in a banking system, payroll personnel need
to see only that part of the database that has
information about the various bank employees. They do
not need to access information about customer
accounts.
• It is difficult to enforce such security constraints because
application programs are added to the system in an ad hoc
manner.
12. DATABASE SYSTEM APPLICATIONS
Databases are widely used. Here are some representative
applications:
Enterprise Information:
• Sales: customers, products, purchases
• Accounting: payments, receipts, balances, assets, etc.
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
• Online retailers: online order tracking, maintenance of online
product evaluations
Banking and Finance:
• Banking: customer information, accounts, loans, banking
transactions
• Credit card transactions: purchases on credit cards and online
payments
• Finance: storing information about holdings and sales of stocks
and bonds
Airlines: reservations, schedules
Universities: registration, grades
Telecommunications: keeping records of calls, prepaid
calling records, communication networks
Thus, databases touch all aspects of our lives.
13. VIEW OF DATA
The purpose of a database system is to provide users with
an abstract view of the data.
The system hides certain details of how the data are
stored and maintained.
The view of data is discussed under the following topics:
• Data Abstraction
• Instances and Schemas
• Data Independence
14. DATA ABSTRACTION
• The system must retrieve data efficiently
• The need for efficiency leads to the design of complex data structures
for the representation of data in the database
• This complexity is hidden from the users through several
levels of abstraction
• Physical Level
• Logical Level
• View Level
Physical Level
• Lowest level of abstraction
• Describes how the data are actually stored
• Complex low level data structures are described in detail
15. Logical Level
• Describes what data are stored in the database and what
relationships exist among those data.
• Entire database is described in terms of a small number of
relatively simple structures.
• Used by database administrators who must decide what
information is to be kept in the database.
View Level
• Describes only part of the entire database.
• Many users of the database system will not be concerned
with all this information. Instead, such users need to access
only a part of the database.
• The system may provide many views for the same database.
16. VIEW OF DATA- DATA ABSTRACTION
[Figure: An architecture for a database system, showing the levels of data abstraction]
18. Example
Record account with fields account number and balance
Record employee with fields employee name and salary
• Physical level
• An account or employee record can be described as a block of
consecutive storage locations
• Logical level
• Each record is described by a type definition
• Example (Pascal): type account = record
account_number : integer;
balance : real;
end;
• Programmers and database administrators work at this level
• View level
• Computer users see a set of application programs that hide details
of the data types.
• Several views of the database are defined, and database users
see these views
• Example: tellers in a bank see only that part of the database
that has information on customer accounts; they cannot
access information concerning salaries of the employees.
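The view level in the teller example can be sketched with an SQL view. This is only an illustration in Python's sqlite3 module: the view name teller_view, the sample rows and the salary figure are assumptions, and SQLite has no user accounts, so the view shows what a teller would see without actually restricting access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (account_number TEXT, balance INTEGER);
    CREATE TABLE employee (employee_name TEXT, salary INTEGER);
    INSERT INTO account VALUES ('A-101', 500), ('A-102', 700);
    INSERT INTO employee VALUES ('xxx', 30000);
    -- The teller's view exposes account data only, never the employee table.
    CREATE VIEW teller_view AS SELECT account_number, balance FROM account;
    """
)

teller_rows = conn.execute(
    "SELECT account_number, balance FROM teller_view ORDER BY account_number"
).fetchall()
print(teller_rows)
```

In a multi-user DBMS the teller would be granted access to the view but not to the underlying tables.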
19. Instances and Schemas
• Collection of information stored in the database at a
particular moment is called an instance of the
database.
• Overall design of the database is called the database
schema.
• Database systems have several schemas, partitioned
according to the levels of abstraction.
• Lowest level: physical schema [database design at
the physical level]
• Intermediate level: logical schema [database
design at the logical level]
• Highest level: subschemas [views of the database]
• A database system supports one physical schema,
one logical schema and several subschemas.
20. VIEW OF DATA- DATA INDEPENDENCE
Data independence
• Ability to modify a schema definition in one level without
affecting a schema definition in the next higher level is
called data independence.
• Two levels of independence
• Physical data independence
• Logical data independence
Physical data independence
• Ability to modify the physical schema without causing
application programs to be rewritten
• Modifications at the physical level are occasionally
necessary to improve performance
Logical data independence
• Ability to modify the logical schema without causing
application programs to be rewritten.
• Modifications at the logical level are necessary whenever
the logical structure of the database is altered.
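Physical data independence can be illustrated by changing a storage-level detail and re-running an unchanged query. In this minimal sqlite3 sketch the physical change is the creation of an index; the table contents and the index name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-102", 700), ("A-103", 1000)],
)

# The application's query; it is not edited anywhere below.
QUERY = "SELECT account_number FROM account WHERE balance >= 700 ORDER BY account_number"

before = conn.execute(QUERY).fetchall()

# Physical-level change: add an access structure to speed up lookups by balance.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

after = conn.execute(QUERY).fetchall()
print(before == after)
```

The query text and its result stay the same; only the way the system may locate the rows has changed.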
22. DATA MODELS
A data model is a collection of conceptual tools for describing data,
data relationships, data semantics and consistency
constraints.
The various data models are
Data Model
• Object-based Model
  - ER Model
  - Object-oriented Model
  - Semantic Model
  - Functional Model
• Record-based Model
  - Hierarchical Model
  - Network Model
  - Relational Model
• Physical Model
23. DATA MODEL - THE ENTITY RELATIONSHIP MODEL
The Entity Relationship Model: based on a perception
of the real world that consists of a collection of basic
objects called entities and relationships among these
objects.
Entity: a thing or object in the real world that is
distinguishable from other objects.
Example: each person, each bank account
Relationship: an association among several entities.
The overall logical structure of a database can be expressed
graphically by an ER diagram, which is built up from the
following components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity
sets
• Lines, which link attributes to entity sets and entity sets
to relationships.
25. DATA MODEL - THE OBJECT ORIENTED MODEL
• It is based on a collection of objects
• An object contains values stored in instance variables
within the object
• An object also contains bodies of code that operate on the
object; these are called methods
• Objects that contain the same types of values and the
same methods are grouped together into classes.
Example: consider an object representing a bank account. It
contains the instance variables account-number and
balance. It also contains a method pay-interest, which adds
interest to the balance.
If the interest rate has to be changed, only the code within
the pay-interest method needs to change, whereas in other data
models the change may involve the code of one or more
application programs.
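The bank account example can be sketched as a class. This is an illustrative Python version, not the notation of any particular object-oriented database; the 4 percent rate and the Python-style names are assumptions.

```python
class Account:
    """An object holding instance variables and the method that operates on them."""

    INTEREST_PERCENT = 4  # assumed rate; changing it touches only this class

    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance

    def pay_interest(self):
        """Add interest to the balance and return the new balance."""
        self.balance += self.balance * self.INTEREST_PERCENT / 100
        return self.balance


acct = Account("A-101", 500)
acct.pay_interest()
print(acct.balance)
```

Callers only invoke pay_interest; they never need to know how the interest is computed.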
26. DATA MODEL- RECORD-BASED LOGICAL MODELS
Record-Based Logical Models
• Describes data at the logical and view levels
• Used to specify the overall logical structure of the
database and to provide a higher level description of the
implementation.
• The database is structured in fixed format records of
several types.
• Each record type defines a fixed number of fields or
attributes and each field is usually of a fixed length.
• Three record based data models are
• Relational model
• Network model
• Hierarchical model
27. DATA MODEL
Relational Model
• Uses a collection of tables to represent both data and
the relationships among those data.
• Each table has multiple columns, and each column has
a unique name.
Account number | Balance
A-101 | 500
A-102 | 700
A-103 | 1000
Table: A sample relational database
28. DATA MODEL
Network Model
• Data in the network model are represented by
collections of records and relationships among data are
represented by links
• The records in the database are organized as a
collection of arbitrary graphs.
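[Figure: A sample network database, with customer records (xxx, 12-23-23, Perur, cbe) and (yyy, 14-27-27, Pudur, cbe) linked to account records (A-101, 500) and (A-102, 700)]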
29. DATA MODEL
Hierarchical Model
• Similar to the network model in the sense that data and
relationships among data are represented by records
and links respectively.
• It differs from the network model in that the records are
organized as collections of trees rather than arbitrary
graphs.
Physical Data Models
• Describe the data at the lowest level.
• Widely known ones are
• Unifying model
• Frame memory model
32. DATA MANIPULATION LANGUAGE (DML)
DATABASE LANGUAGES
A database system provides two different types of languages
• DDL (Data Definition Language): to specify the database schema
• DML (Data Manipulation Language): to express database queries and
updates
DML
The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
Two classes of DML
• Procedural: the user specifies what data is required and how to get
those data
• Nonprocedural [Declarative DML]: the user specifies what data is
required without specifying how to get those data
A query is a statement requesting the retrieval of information. The
portion of a DML that involves information retrieval is called a
query language.
SQL is the most widely used query language
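The four kinds of access can each be expressed as one SQL statement. The sketch below runs them through Python's sqlite3 module on an account table like the sample relation shown earlier; the particular update and delete are arbitrary examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")

# Insertion of new information
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-102", 700), ("A-103", 1000)],
)

# Modification of stored information
conn.execute("UPDATE account SET balance = balance + 100 WHERE account_number = 'A-101'")

# Deletion of information
conn.execute("DELETE FROM account WHERE account_number = 'A-103'")

# Retrieval of information (a query)
rows = conn.execute(
    "SELECT account_number, balance FROM account ORDER BY account_number"
).fetchall()
print(rows)
```

All four statements are declarative: they state what should change or be returned, not how the rows are located.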
33. DATA DEFINITION LANGUAGE (DDL)
DATABASE LANGUAGES
Data Definition Language (DDL)
• Specifies the database schema by a set of definitions
• The data values stored in the database must satisfy certain
consistency constraints.
DOMAIN CONSTRAINTS
A domain of possible values must be associated with every attribute.
Declaring an attribute to be of a particular domain acts as a constraint
on the values that it can take.
Domain constraints are the most elementary form of integrity constraint
and are easily tested by the system whenever a new data item is entered.
REFERENTIAL INTEGRITY CONSTRAINTS
Sometimes a value that appears in one relation must also appear in
another relation. This requirement is called referential integrity.
Database modifications can cause violations of referential integrity.
ASSERTIONS
An assertion is any condition that the database must always satisfy.
When an assertion is created, the system tests it for validity.
If the assertion is valid, any future modification is allowed only if it
does not cause the assertion to be violated.
Domain and referential integrity constraints are special forms of assertions.
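Referential integrity can be sketched with a foreign key. This minimal example uses Python's sqlite3 module; the customer and account tables, their columns and the helper add_account are assumptions for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE account ("
    "account_number TEXT PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer (customer_id), "
    "balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'xxx')")
conn.commit()


def add_account(account_number, customer_id, balance):
    """Return True if the row is accepted, False if it violates a constraint."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO account VALUES (?, ?, ?)",
                (account_number, customer_id, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

An account that refers to an existing customer is accepted; one that refers to a customer_id with no matching customer row is rejected.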
34. DATA DEFINITION LANGUAGE (DDL)
AUTHORIZATION
The types of access permitted to a user are differentiated in terms of
authorization.
• Read authorization, which allows reading but not modification of
data.
• Insert authorization, which allows insertion of new data but not
modification of existing data.
• Update authorization, which allows modification but not deletion of
data.
• Delete authorization, which allows deletion of data.
A user may be assigned all, none or a combination of these types of
authorization.
The DDL takes a set of instructions (statements) as input and generates
some output.
The output of the DDL is stored in the data dictionary, which contains
metadata, that is, data about the data.
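Many systems let the data dictionary be queried like ordinary data. As a hedged illustration, SQLite keeps its schema in a catalog table named sqlite_master and reports column metadata through PRAGMA table_info; other database systems use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL statement: its effect is recorded in SQLite's catalog.
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")

# Metadata: which tables exist.
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

# Metadata: which columns the account table has (the name is the second field of each row).
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

print(tables, columns)
```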
35. RELATIONAL DATABASES- TABLES
A relational database is based on the relational model and uses a
collection of tables to represent both data and the relationships among
those data.
Each table has multiple columns, and each column has a unique
name.
Name | Course | Phone_No | Major | Prof | Grade
Ram | 353 | 3323232 | CS | Anand | A
Ravi | 256 | 2356788 | Physics | Baskar | B
Priya | 358 | 3569977 | CS | Prasad | A
Ram | 236 | 3323232 | CS | Kumar | In progress
Priya | 351 | 3569977 | CS | Vanitha | A
Raahul | 278 | 2568952 | Maths | Selvam | A
Tarun | 396 | 3265656 | Maths | Mani | A
36. RELATIONAL DATABASES-DDL
Data Definition Language (DDL):
1. Create: the Data Definition Language provides the reserved keyword
create to create tables, views, databases, etc.
Syntax:
Create table <Table name> (Column1 name data-type
<constraint>, Column2 name data-type <constraint>,
Column3 name data-type <constraint>);
Note: a constraint is optional and specifies a restriction on the column.
Data Manipulation Language (DML):
select A1, A2, ..., An from r1, r2, ..., rm where P
where A1, A2, ..., An represent attributes, r1, r2, ..., rm represent
relations, and P is a predicate.
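As a concrete instance of the select-from-where form, the sketch below loads the rows of the sample table from the relational databases slide and runs one query through Python's sqlite3 module, which here plays the role of the host-language interface. The table name enrollment and the chosen predicate are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment ("
    "name TEXT, course INTEGER, phone_no TEXT, major TEXT, prof TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Ram", 353, "3323232", "CS", "Anand", "A"),
        ("Ravi", 256, "2356788", "Physics", "Baskar", "B"),
        ("Priya", 358, "3569977", "CS", "Prasad", "A"),
        ("Ram", 236, "3323232", "CS", "Kumar", "In progress"),
        ("Priya", 351, "3569977", "CS", "Vanitha", "A"),
        ("Raahul", 278, "2568952", "Maths", "Selvam", "A"),
        ("Tarun", 396, "3265656", "Maths", "Mani", "A"),
    ],
)

# Attributes A1, A2 = name, course; relation r1 = enrollment;
# predicate P = major is CS and the grade is A.
cs_a_grades = conn.execute(
    "SELECT name, course FROM enrollment "
    "WHERE major = 'CS' AND grade = 'A' ORDER BY name, course"
).fetchall()
print(cs_a_grades)
```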
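Application Programs
Application programs are programs that are used to interact with the
database.
To access the database, DML statements need to be executed from the
host language. This is done in one of two ways:
• Through an application program interface that sends DML statements
to the database and retrieves the results
• By embedding DML calls within the host language program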
37. DATABASE STORAGE AND QUERYING
A database system is partitioned into modules.
The functional components are broadly classified into the Storage Manager
and the Query Processor.
The Storage Manager is important because databases typically require a
large amount of storage space. It also deals with the movement of data
between secondary storage devices and primary storage, which is slow
compared with processor speed. The Query Processor is important because
it simplifies and facilitates access to data.
Storage Manager
It is a program module that provides the interface between the
low level data stored in the database and the application
programs and queries submitted to the system.
• Responsible for interaction with the file manager. Raw data
are stored on the disk using the file system, which is usually
provided by a conventional operating system.
• Translates the various DML statements into low level file
system commands.
• Responsible for storing, retrieving and updating of data in
the database.
38. DATABASE STORAGE AND QUERYING
The storage manager components include
• Authorization and integrity manager
• Tests for the satisfaction of integrity constraints and checks
the authority of users to access data
• Transaction manager
• Ensures that the database remains in a consistent state
despite system failures, and that concurrent transaction
executions proceed without conflicting
• File manager
• Manages the allocation of space on disk storage and the data
structures used to represent information stored on disk
• Buffer manager
• Responsible for fetching data from disk storage into main
memory, and deciding what data to cache in memory.
The storage manager implements several data structures:
• Data files, which store the data itself.
• Data dictionary, which stores metadata.
• Indices, which provide fast access to data items. A database index
provides pointers to the data items that hold a particular value.
39. DATABASE STORAGE AND QUERYING
The query processor components include
• DML compiler
• Embedded DML precompiler
• DDL interpreter
• Query evaluation engine
DML compiler : Translates DML statements in a query language
into low level instructions that the query evaluation engine
understands.
Embedded DML precompiler
• Converts DML statements embedded in an application program
to normal procedure calls in the host language.
• Precompiler must interact with the DML compiler to generate
the appropriate code.
DDL interpreter : Interprets DDL statements and records them in
a set of tables containing metadata
Query evaluation engine : Executes low level instructions
generated by the DML compiler.
41. TRANSACTION MANAGEMENT
Atomicity
• Example: in a funds transfer, one account a is debited
and another account b is credited.
• Here, either both the credit and the debit occur, or neither occurs.
• This all-or-none requirement is called atomicity.
Consistency
• The execution of the funds transfer must preserve the consistency of
the database.
• That is, the value of the sum a + b must be preserved.
• This correctness requirement is called consistency.
Durability
• After the funds transfer, the new values of a and b must persist,
despite the possibility of system failure. This persistency
requirement is called durability.
42. TRANSACTION MANAGEMENT
Transaction
• Collection of operations that performs a single logical function
in a database application.
• Each transaction is a unit of both atomicity and consistency.
• Transactions should not violate any database and consistency
constraints.
• i.e., if the DB was consistent when a transaction started, the
DB must be consistent when the transaction successfully
terminates.
The programmer is responsible for defining transactions so that they
preserve the consistency of the database.
• Example: a transaction to transfer funds from account a to
account b could be written as two separate programs, one for the debit
and another for the credit.
• Executing these two programs one after another will preserve
consistency, although each program taken alone would not.
43. TRANSACTION MANAGEMENT
Database system is responsible for ensuring the atomicity and
durability properties.
• In the absence of failures, all the transactions complete
successfully and atomicity is achieved easily.
• During failure, a transaction may not always complete its
execution successfully.
• To ensure atomicity property, a failed transaction must have no
effect on the state of the database.
• Thus the database must be restored to the state in which it
was before the transaction started executing.
• When several transactions update the database concurrently ,
the consistency of data may not be preserved. Concurrency
control manager should control the interaction among the
concurrent transactions, to ensure the consistency of the
database.
44. DATABASE ADMINISTRATOR
Has central control over the system
Functions of DBA include the following
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting of authorization for data access
• Integrity constraint specification
Schema definition
• DBA creates the original database schema by writing a set of
definitions that is translated by the DDL compiler to a set of
tables that is stored permanently in the data dictionary.
Storage structure and access method definition
• DBA creates appropriate storage structures and access methods
by writing a set of definitions, which is translated by the data
storage and data definition language compiler.
45. Schema and physical organization modification
• Programmers accomplish the relatively rare modifications either
to the database schema or to the description of the physical
storage organization by writing a set of definitions that is used by
either the DDL compiler or the data storage and data definition
language compiler to generate modifications to the appropriate
internal system tables.
Granting of authorization for data access
• Granting of different types of authorization allows the database
administrator to regulate which parts of the database various
users can access.
• Authorization information is kept in a special system structure
that is consulted by the database system whenever access to the
data is attempted in the system.
Integrity constraint specification
• Data values stored in the database must satisfy certain
consistency constraints.
• Example: the number of hours an employee may work in one week may not
exceed 80.
46. DATABASE USERS
Four types of database users
• Application programmers
• Sophisticated users
• Specialized users
• Naive users
Application programmers
• Interact with the system through DML calls, which are
embedded in a program written in a host language
(example: C). These programs are called application
programs.
• Example: a banking system includes programs that
generate payroll checks, debit accounts, credit
accounts, or transfer funds between accounts.
47. DATABASE USERS
Sophisticated users
• Interact with the system without writing programs.
• Requests are written in a database query language.
• Each query is submitted to a query processor, whose
function is to break down DML statements into
instructions that the storage manager understands.
Specialized users
• Write specialized database applications that do not fit
into the traditional data processing framework.
• Applications – computer aided design systems,
knowledge base and expert systems
Naïve users or unsophisticated users
• Interact with the system by invoking one of the
permanent application programs that have been written
previously.
48. OVERALL SYSTEM STRUCTURE
The functional components of a database system can be
broadly divided into query processor components and
storage manager components.
• The query processor components
• The storage manager components
The overall structure also includes the DBA and its functions, and the
types of users.
50. APPLICATION ARCHITECTURES
Two-tier architecture: E.g. client programs using ODBC/JDBC to
communicate with a database
Three-tier architecture: E.g. web-based applications, and
applications built using “middleware”