SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
Unit 1

1.1    Overview of Database Management System:
Data - Data is meaningful known raw facts that can be processed and stored as
information.
Database - Database is a collection of interrelated and organized data.
In general, it is a collection of files (tables).

Entity: A person, place, thing or event about which information must be kept.
Attribute: Piece of information describing a particular entity. These are mainly the
characteristics about the individual entity. Individual attributes help to identify and
distinguish one entity from another.

                           STUDENT (DATABASE NAME)

               Entity              Attributes
               Personnel           Name, Age, Address, Father’s Name
               Academic            Name, Roll No., Course, Dept. Name

E.g.
Student                 (Database Name)
               Field name or attribute name
Personnel (Table Name)                               Academic (table name)
                                                                 ROLL      COURSE     Dept. Name
Name          Father Name         Age                 Name
                                                                 NO
John          Albert              24       RECORD     John       12        MSC        Computer

Ramesh        Suresh              18                  Ramesh     15        BCA        Computer



DBMS - Database Management System (DBMS) is a collection of interrelated data
[usually called database] and a set of programs to access, update and manage those data
[which form part of management system]
                                           OR

It is a software package to facilitate creation and maintenance of computerized database.
It is general purpose software that facilitates the following:

       1. Defining: Specifying data types and structures, and constraints for data to be
       stored.
       2. Constructing: Storing data in a storage medium.
       3. Manipulating: Involves querying, updating and generating reports.
       4. Sharing: Allowing multiple users and programs to access data simultaneously.

Eg. Of DBMS- Access, dBase, FileMaker Pro, and FoxBASE, ORACLE etc.




1.1.2 Primary goals of DBMS are:

1. To provide a way to store and retrieve database information that is both convenient and
efficient.
2. To manage large and small bodies of information. It involves defining structures for
storage of information and providing mechanism for manipulation of information.
3. It should ensure safety of information stored, despite system crashes or attempts at
unauthorized access.
4. If data are to be shared among several users, then system should avoid possible
anomalous results.
1.2       Database Management Systems:

Database Management Systems provide the following advantages and disadvantages:

Advantages of DBMS:

      •   Data independence: provides an abstract view of the data that hides the details
          data representation and storage.
      •   Efficient Data Access: This is the advantage where we use variety of techniques
          to store and retrieve data.
      •   Data integrity and security: we can ensure data integrity if the data is always
          enforced through integrity constraint.
      •   Data administration: data" administration deals with the modeling of the data
          and treats data as an organizational resource, while "database" administration
          deals with the implementation of the types of databases that are in use.
      •   Concurrent Access and crash recovery: It ensures concurrent access of the data
          in such a way that the data is being accessed by only one user a time. Also
          protects the system from crashes.
      •   Reduced Application Development time: It supports all the important functions
          that are common to many applications.

Disadvantages of a DBMS:

The following are disadvantages of DBMS
   • Setup of the database system requires more knowledge, money, skills, and time.
   • The complexity of the database may result in poor performance.

1.2.1 Advantages & Disadvantages of DBMS:

Advantages                              Disadvantages
Greater flexibility                     Difficult to learn
Good for larger databases               Packaged separately from the operating
                                        system (i.e. Oracle, Microsoft Access,
                                        Lotus/IBM Approach, Borland Paradox,
                                        Claris FileMaker Pro)
Greater processing power                Slower processing speeds
Fits the needs of many medium to large- Requires skilled administrators
sized organizations
Storage for all relevant data           Expensive
Provides user views relevant to tasks
performed
Ensures data integrity by managing
transactions (ACID test = atomicity,
consistency, isolation, durability)
Supports simultaneous access
Enforces design criteria in relation to data
format and structure


1.2.2 Applications of DBMS:

1. Banking – For customer information, accounts, and loans, and banking transactions.
[all transactions]
2. Airlines – For reservation and schedule information. [reservations, schedules]
3. Universities – For student information, course registrations, and grades. [registration,
grades]
4. Credit Card Transactions – For purchases on credit card and generation of monthly
statements.
5. Telecommunication – For keeping records of calls made, generating monthly bills,
maintaining balances on prepaid calling cards, and storing information about
communication networks.
6. Finance – For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
7. Sales – For customer, product, and purchase information. [customers, products,
purchases]
8. Manufacturing – For management of supply chain and for tracking production of
items in factories, inventories of items in warehouses/stores, and orders for items.
[production, inventory, orders, supply chain]
9. Human Resources – For information about employees, salaries, payroll taxes and
benefits, and generation of paychecks. [employee records, salaries, tax deductions]



1.3    Various views of Data
1.3.1 Data abstraction:
It can be summed up as follows.
1. When the DBMS hides certain details of how data is stored and maintained, it provides
what is called as the abstract view of data.
2. This is to simplify user-interaction with the system.
3. Complexity (of data and data structure) is hidden from users through several levels of
abstraction.

Data abstraction is used for following purposes:
1. To provide abstract view of data.
2. To hide complexity from user.
3. To simplify user interaction with DBMS.
Figure: The Three Levels of Abstraction

Levels of data abstraction:

There are three levels of data abstraction.
1. Physical level: It describes how a record (e.g., customer) is stored.
Features:
        a) Lowest level of abstraction.
        b) It describes how data are actually stored.
        c) It describes low-level complex data structures in detail.
        d) At this level, efficient algorithms to access data are defined.
2. Logical level: It describes what data stored in database, and the relationships among
the data.
Features:
        a) It is next-higher level of abstraction. Here whole Database is divided into small
        simple structures.
        b) Users at this level need not be aware of the physical-level complexity used to
        implement the simple structures.
        c) Here the aim is ease of use.
        d) Generally, database administrators (DBAs) work at logical level of abstraction.

3. View level: Application programs hide details of data types. Views can also hide
information (e.g., salary) for security purposes.
Features:
       a) It is the highest level of abstraction.
       b) It describes only a part of the whole Database for particular group of users.
       c) This view hides all complexity.
d) It exists only to simplify user interaction with system.
        e) The system may provide many views for the whole system.

1.4     Data Models
A data model is a collection of concepts that can be used to describe the structure of a
database and provides the necessary means to achieve this abstraction whereas structure
of a database means the data types, relationships and constraints that should hold on the
data.
Collection of conceptual tools for describing data, data relationships, data semantics and
consistency constraints. The various data models that have been proposed fall into three
different groups. Object based logical models, record-based logical models and physical
models.

Object-Based Logical Models: They are used in describing data at the logical and view
levels. They are characterized by the fact that they provide fairly flexible structuring
capabilities and allow data constraints to be specified explicitly. There are many different
models and more are likely to come. Several of the more widely known ones are:
• The E-R model
• The object-oriented model
• The semantic data model
• The functional data model
The E-R Model
The (E-R) data model is based on a perception of a real worker that consists of a
collection of basic objects, called entities, and of relationships among these objects.
The overall logical structure of a database can be expressed graphically by an E-R
diagram. Which is built up by the following components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity sets
• Lines, which link attributes to entity sets and entity sets to relationships.
    E.g. suppose we have two entities like customer and account, then these two entities
    can be modeled as follow:

                                                              Account number
               name
      Customer                      Customer city                                 Balanc
                                                                                        e

                      Customer                      Deposit            Account



                                 FIGURE 1.3: A SAMPLE E-R DIAGRAM

The Object-Oriented Model
Like the E-R model the object-oriented model is based on a collection of objects. An
object contains values stored in instance variables within the object. An object also
contains bodies of code that operate on the object. These bodies of code are called
methods.
Classes: It is the collection of objects which consist of the same types of values and the
same methods.
E.g. account number & balance are instance variables; pay-interest is a method that uses
the above two variables and adds interest to the balance.

Semantic Models
These include the extended relational, the semantic network and the functional models.
They are characterized by their provision of richer facilities for capturing the meaning of
data objects and hence of maintaining database integrity. Systems based on these models
exist in monotype for at the time of writing and will begin to filter through the next
decade.

Record-Based Logical Models
Record based logical models are also used in describing data at the logical and view
levels. In contrast to object-based data models, they are used both to specify the overall
logical structures of the database, and to provide a higher-level description of the
implementation.
Record-based models are so named because the database is structured in fixed-format
records of several types. Each record type defines a fixed number of fields, or attributes,
and each field is usually of a fixed length.
The three most widely accepted record-based data models are the relational, network, and
hierarchical models.
Relational Model
The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each column has a
unique name as follows:
                                   CUSTOMER (TABLE NAME)

Customer–name            Customer-street            Customer-City            Account-number

Johnsons                 Alma                       Pala Alto                A-101
Smith                    North                      Ryc                      A-215
Hayes                    Main                       Harrison                 A-102
Turner                   Dutnam                     Stanford                 A-305
Johnson                  Alma                       PalaAlto                 A-201
Jones                    Main                       Harrison                 A-217
Lindsay                  Park                       Pittifield               A-222
Smith                    North                      Rye                      A-201



                            A SAMPLE RELATIONAL DATABASE
Network Model
Data in the network model is represented by collection of records, and relationship
among data is represented by links, which can be viewed as pointers. The records in the
database are organized as collections of arbitrary graphs. Such type of database is shown
in the Figure 1.4.

Johnsons      Alma              Pala Alto                                         A-101       500

Smith         North             Rye                                               A-215       700

Hayes         Main              Harrison                                          A-102       400

Turner        Dutnam            Stanford                                          A-305       350

Jones         Main              Harrison                                          A-201       900

Lindsay       Park              Pittifield                                        A-217       750
                                                                                  A-222       700

                             FIGURE 1.4: A SAMPLE NETWORK DATABASE

Hierarchical Model
The hierarchical model is similar to the network model in the sense that data and
relationships among data are represented by records and links, respectively. It differs
from the network model in that records are organized as collection of trees rather than
arbitrary graphs.


                                                                                                     CUSTOMER




                        Johnson customer street -------

                                Smith        North              -------

                                             Hayes     Main       -------
    A-101   500

             A-201     900                           Turner    Putnam -------

                                                              Jones       Main      -------
                     A-215    700
                                                               Lindsay           Park               -------
                       A-201     900

                                                          A-217       350
                                A-102        400

                                                                             A-222        700
                                             A-305     350



                             FIGURE : A SAMPLE HIERACHICAL DATABASE
Physical Data Models
Physical data models are used to describe data at the lowest level. In contrast to logical
data models, there are few physical data models in use. Two of the widely known ones
are the unifying model and the frame-memory model.



1.5    Instances and Schemas:
Instances and Schemas are similar to types and variables in programming languages.

1. Schema:
The overall design of a database is called database schema. E.g., the database consists of
information about a set of customers and accounts and the relationship between them. It
is analogous to variable along with its type information in a program.

Types of Schemas (partitioned according to levels of abstraction):

       a. Physical schema: It is database design at the physical level. It is hidden below
       logical schema, and can be changed easily without affecting application programs.
       b. Logical schema: It is database design at the logical level. Programmers
       construct applications using logical schema. It is by far the most important
       schema, in terms of its effect on application programs.
       c. Subschema: It is schema at view level.

2. Instance: It is the actual content of the database at a particular point in time. It is
analogous to the value of a variable.


1.6    Introduction to Database Languages & Environments

1.6.1 Database Languages:

We have Data Definition Languages (DDL) to specify database schemas and Data
Manipulation Language (DML) to express database updates and queries. In practice,
these are not to separate languages but are part of a single database language, like SQL.

1. Data Definition Languages (DDL)
It is the language that is used to specify database schemas by a set of definitions
contained in it.

2. Data Manipulation Language (DML)
It is a language for accessing and manipulating the data organized by the appropriate data
model.
DML is also known as query language.

There are two types of DML
       a) Procedural DMLs
       b) Declarative DMLs (non-procedural DMLs)
1. Procedural DMLs - This language requires user to specify what data is required and
how to get those data.

2. Declarative DMLs (non-procedural DMLs) - This language requires user to specify
what data is required without specifying how to get those data.

Features of DDL:

   •   Used to specify a database schema as a set of definitions expressed in a DDL
   •   DDL statements are compiled, resulting in a set of tables stored in a special file
       called a data dictionary or data directory.
   •   The data directory contains metadata (data about data).
   •   The storage structure and access methods used by the database system are
       specified by a set of definitions in a special type of DDL called a data storage and
       definition language.
   •   Basic idea of DDL: To hide implementation details of the database schemes from
       the users.

Features of DML:

   •   A DML is a language which enables users to access and manipulate data.
   •   The goal is to provide efficient human interaction with the system.
   •   There are two types of DMLs

          o Procedural: Here user specifies what data is needed and how to get it.
          o Non-procedural: Here user only specifies what data is needed.
                   - Easier for user
                   - May not generate code as efficient as that produced by
                        procedural languages.


1.6.2 Database System Environment:

1.6.2.1 DBMS System Structure and its Components:

We can explain the overall structure of DBMS/System structure and its components by
the diagram given below:
Figure: System Structure


1. Database systems are partitioned into modules for different functions. Some functions
(e.g. file systems) may be provided by the operating system.
2. Broadly the functional components of a database system are:

        a. Query Processor: It is one of the functional components of DBMS. It
translates statements in a query language into low-level instructions the database manager
understands. It may also attempt to find an equivalent but more efficient form.
It contains following components:
a. DML compiler - It converts DDL statements to a set of tables containing
         metadata stored in a data dictionary.
It also performs query optimization.
         b. DDL interpreter – It interprets DDL statements and records definitions into
data dictionary.
         c. Query evaluation engine – It executes low-level instructions generated by
DML compiler. They mainly deal with solving all problems related to queries and query
processing. It helps database system simplify and facilitate access to data.

b. Storage Manger (Database Manager)
       1. Storage manager is a program module that provides the interface between the
       low-level data stored in the database and the application programs and queries
       submitted to the system.
       2. The storage manager is responsible to the following tasks:
               1. interaction with the file manager
               2. efficient storing, retrieving and updating of data
       3. The important components include:
               a. File manager: It manages allocation of disk space and data structures
               used to represent information on disk.
               b. Database manager: It is the interface between low-level data and
               application programs and queries.
               c. Transaction manager: Transaction manager ensures that the database
               remains in a consistent (correct) state despite system failures (e.g., power
               failures and operating system crashes) and transaction failures.
               d. DML precompiler: It converts DML statements embedded in an
               application program to normal procedure calls in a host language. The
               precompiler interacts with the query processor.
               e. DDL compiler: It converts DDL statements to a set of tables containing
               metadata stored in a data dictionary.
               f. Authorization and integrity manger – It conducts integrity checks and
               user authority to access data.
               g. Buffer manger – It is critical part of DB and stores temporary data.
       In addition, several data structures are required for physical system
       implementation:
               a. Data files: They store the database itself.
               b. Data dictionary: It stores information about the structure of the
               database. It is used heavily. Great emphasis should be placed on
               developing a good design and efficient implementation of the dictionary.
               In short, it stores metadata.
               c. Indices: They provide fast access to data items holding particular
               values.
1.7    File systems/File processing systems
A file system is basically storing information in data structures called ‘files’ in the
operating system and manipulating this information via application programs that
manipulate the files.

File Processing System

Advantages                                Disadvantages
Simpler to use                            Typically does not support multi-user
                                          access
Less expensive·                           Limited to smaller databases
Fits the needs of many small businesses Limited functionality (i.e. no support for
and home users                            complicated transactions, recovery, etc.)
Popular FMS’s are packaged along with
the operating systems of personal
                                          Decentralization of data
computers (i.e. Microsoft Card file and
Microsoft Works)
Good for database solutions for hand held
                                          Redundancy and Integrity issues
devices such as Palm Pilot


Disadvantages of File Processing System:

1. Data Redundancy – Since different programmers create the files and application
programs over a long period, the various files are likely to have different formats and the
programs may be written in several programming languages. Moreover, the same
information may be duplicated in several files, this duplication of data over several files
is known as data redundancy. Eg. The address and telephone number of a particular
customer may appear in a file that consists of saving- account records and in a file that
consists of checking account records. This redundancy leads to higher storage & excess
cost also leads to inconsistency discussed in the next.

2. Data Inconsistency – The various copies of same data may no longer agree i.e.
various copies of the same data may contain different information. Eg. A changed
customer address may be reflected in savings-account records but not elsewhere in the
system.

3. Difficulty in accessing data – In a conventional file processing system it is difficult to
access the data in a specific manner and it is require creating an application program to
carry out each new task. Eg. Suppose that one of the bank officers needs to find out the
names of all customers who live within a particular postal-code area. We ask the officer
of data-processing department to generate such a list.We have 2 choices:
- List all cust_names & manually do the information required
        - Ask the data processing dept. to have system programmer to write necessity
        application program Because the designers of the original system did not
        anticipate this request, there is no application program on hand to meet it.
4. Data Isolation – Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve the appropriate data is
difficult.

5. Integrity problems – The data stored in the database must satisfy certain types of
consistency constraints. Eg. The balance of a bank account may never fall below a
prescribed amount (say, ICICI 2500/- ). Developers enforce these constraints in the
system by adding appropriate code in the various application programs. However, when
new constraints are added, it is difficult to change the programs to enforce them. The
problem is compounded when constraints involve several data items from different files.

6. Atomicity problems – A computer system, like any other mechanical or electrical
device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the
data be restored to the consistent state that existed prior to the failure. Eg. Before
computer format, we require to have a backup first. It is difficult to ensure atomicity in a
conventional file processing system.

7. Concurrent-access anomalies – For the sake of overall performance of the system
and faster response, many systems allow multiple users to update the data
simultaneously. In such an environment, interaction of concurrent updates may result in
inconsistent data. Eg. Consider bank account A, containing $500. If two customers
withdraw funds (say $50 and $100 respectively) from account A at about the same time,
the result of the concurrent executions may leave the account in an incorrect (or
inconsistent) state. Suppose that the programs executing on behalf of each withdrawal
read the old balance, reduce that value by the amount being withdrawn, and write the
result back. If the two programs run concurrently, they may both read the value $500, and
write back $450 and $400, respectively. Depending on which one writes the value last,
the account may contain either $450 or $400, rather than the correct value of $350.
        To guard against this possibility, the system must maintain some form of
supervision. But supervision is difficult to provide because data may be accessed by
many different application programs that have not been coordinated previously.

8. Security Problems – Not every user of the database system should be able to access
all the data. Eg. In a bank system, payroll personnel need to see only that part of the
database that has information about the various bank employees. They do not need access
to information about customer accounts. But, since application programs are added to the
system in an ad hoc manner, enforcing such security constraints is difficult.
1.7.1 Difference between DBMS and File-processing system:


DBMS                                          File-processing Systems
1. Redundancies and inconsistencies in data   1. Redundancies and inconsistencies in data
are reduced due to single file formats and    exist due to single file formats and
duplication of data is eliminated.            duplication of data.

2. Data is easily accessed due to standard 2. Data cannot be easily accessed due to
query procedures.                          special application programs needed to
                                           access data.

3. Isolation/retrieval of required data is 3. Data isolation is difficult due to different
possible due to common file format, and file formats, and also because new
there are provisions to easily retrieve data. application programs have to be
                                              written.


4. Integrity constraints, whether new or old, 4. Introduction of integrity constraints is
can be created or modified as per need.       tedious and again new application
                                              programs have to be written.

5. Atomicity of updates is possible.          5. Atomicity of updates may not be
                                              maintained.

6. Several users can access data at the same 6. Concurrent accesses may cause problems
time i.e concurrently without problems       such as . Inconsistencies.

7. Security features can be enabled in 7. It may be difficult to enforce security
DBMS very easily.                      features.




1.8    Advantages of a DBMS over file-processing system:
A DBMS has three main features:
     (1) Centralized data management,
     (2) Data independence and
     (3) Data integration /System integration
In a DB system, the DBMS provides the interface b/w the application programs & data.
When changes are made to the data representation, the metadata maintained by the
DBMS is changed but the DBMS continues to provide data to application programs in
previously used way. The DBMS handles the task of information of data where ever
necessary. This independence b/w the programs & the data is called data independence.
this made program to continue irrespective of changes made in it. To provide a high
degree of data independence, a DBMS must include a sophisticated metadata mgmt
system. In DBMS , all files are integrated into one system thus reducing redundancies &
making data mgmt more efficient. In addition, DBMS provides centralized control of the
operational data. Some of advantages of above mention three features are:
Due to its centralized nature, the database system can overcome the disadvantages of the
file-based system as discussed below.

• Minimal Data Redundancy - Since the whole data resides in one central database, the
various programs in the application can access data in different data files. Hence data
present in one file need not be duplicated in another.
This reduces data redundancy. However, this does not mean all redundancy can be
eliminated. There could be business or technical reasons for having some amount of
redundancy. Any such redundancy should be carefully controlled and the DBMS should
be aware of it.

• Data Consistency - Reduced data redundancy leads to better data consistency.

• Data Integration - Since related data is stored in one single database, enforcing data
integrity is much easier. Moreover, the functions in the DBMS can be used to enforce the
integrity rules with minimum programming in the application programs.

• Data Sharing - Related data can be shared across programs since the data is stored in a
centralized manner. Even
new applications can be developed to operate against the same data.

• Enforcement of Standards - Enforcing standards in the organization and structure of
data files is required and also easy in a Database System, since it is one single set of
programs which is always interacting with the data files.

• Application Development Ease - The application programmer need not build the
functions for handling issues like concurrent access, security, data integrity, etc. The
programmer only needs to implement the application business rules. This brings in
application development ease. Adding additional functional modules is also easier than in
file based systems.

• Better Controls - Better controls can be achieved due to the centralized nature of the
system.

• Data Independence - The architecture of the DBMS can be viewed as a 3-level system
comprising the following:
        - The internal or the physical level where the data resides.
        - The conceptual level which is the level of the DBMS functions
        - The external level which is the level of the application programs or the end user.
Data Independence is isolating an upper level from the changes in the organization or
structure of a lower level. For example, if changes in the file organization of a data file do
not demand for changes in the functions in the DBMS or in the application programs,
data independence is achieved. Thus Data Independence can be defined as immunity of
applications to change in physical representation and access technique. The provision of
data independence is a major objective for database systems.

• Reduced Maintenance - Maintenance is less and easy, again, due to the centralized
nature of the system.



1.9    Responsibility of Database Administrator
The database administrator is a person having central control over data and programs
accessing that data. He coordinates all the activities of the database system; the database
administrator has a good understanding of the enterprise’s information resources and
needs.

Functions of a DBA:

Database administrator's duties include:
1. Schema definition: the creation of the original database schema. This involves writing
a set of definitions in a DDL (data storage and definition language), compiled by the
DDL compiler into a set of tables stored in the data dictionary.

2. Storage structure and access method definition: writing a set of definitions
translated by the data storage and definition language compiler.

3. Schema and physical organization modification: writing a set of definitions used by
the DDL compiler to generate modifications to appropriate internal system tables (e.g.
data dictionary). This is done rarely, but sometimes the database schema or physical
organization must be modified.

4. Granting user authority to access the database: granting different types of
authorization for data access to various users

5. Specifying integrity constraints: generating integrity constraints. These are consulted
by the database manager module whenever updates occur.

6. Routine Maintenance: It includes the following:
       a. Acting as liaison with users.
       b. Monitoring performance and responding to changes in requirements.
       c. Periodically backing up the database.
1.10 Three levels architecture of database systems

A DBMS can be considered as a buffer between application programs, end users and a
database designed to fulfill features of data independence. In 1975 the American National
Standards Institute Standards Planning and Requirements Committee (ANSI-SPARC)
proposed three-level architecture identified three levels of abstraction.




                 Figure: The Three Level Architecture For a DBMS

These levels are sometimes referred to as schemas or views.

1. The External or User Level: This level describes the user’s or application program’s
view of the database. Several programs or users may share the same view.

2. The Conceptual Level: This level describes the organization’s view of all the data in
the database, the relationships between the data and the constraints applicable to the
database. This level describes a logical view of the database i.e. a view locking
implementation detail.

3. The Internal or Physical Level: This level describes the way in which data is stored
and the way in which data may be accessed. This level describes a physical view of the
database.
        Each level is described in terms of schema – a map of the database. The three-
level architecture is used to implement data independence through two levels of mapping:
that between the external schema and the conceptual schema and that between the
conceptual schema and the physical schema.

For eg., Consider a Bank System, It uses
       • Customer_Details Table.
       • Customer_Transactions Table.




At the internal level, a Customer_Details or Customer_Transaction record can be
described as a block of consecutive storage locations (for example, words of bytes). The
language compiler hides this level of detail from programmers. Similarly, the database
system hides the lowest-level storage details (how data is stored and accessed) from
database programmers. At the conceptual level, the table definition (the attributes data
type and width definition) and the interrelationship among the data is described. Finally
at the external level, several views of the database are defined, and database end users are
also to see those views. In addition to hiding details of the conceptual level of the
database, the views also provide a security mechanism to prevent users from accessing
parts of the database. For e.g., tellers in the bank will be able to see only that part of the
database that has information on customer accounts; they cannot access information
concerning salaries of bank employees.
Data Model
A data model is collection of tools for describing
       1. data 2. data relationships 3. data semantics 4. data constraints

Types of Data Models:

There are basically two types of data models
       1. Record based Data Models.
       2. Object based Data Models.

1. Record based Data Models –

In Record-based models, the database is organized in fixed-format records of several
types. A fixed number of fields, or attributes, are defined in each record type, and each
field is usually of a fixed length.
The three most popular record-based data models are
            1. Relational Data Model
            2. Network Data Model
            3. Hierarchical Data Model

1. Relational Data Model –
1. The relational model uses tables to represent the data and the relationships among
those data.
2. Each table has multiple columns, and each column is identified by a unique name.
3. It is a low level model.
In this database, each row in the table represents a different customer. Relationships link
rows from two tables on the basis of the key field, in this case – number.

Advantages of Relational Data Model:

a. Structural Independence – Relational database model has structural independence,
i.e. changes made in the database structure do not affect the DBMS’s capability to access
data.
b. Simplicity – The relational model is the simplest model at the conceptual level. It
allows the designer to concentrate on the logical view of the database, leaving the
physical data storage details.
c. Ease of designing, implementation, maintenance, and usage – Due to the inherent
features of data independence and structural independence, and the relational model
makes it easy to design, implement, maintain and use the databases.
d. Adhoc query capability – One of the main reasons for the huge popularity of the
relational database model is the presence of powerful, flexible and easy-to-use query
capability. The query language of the relational database model – Structure Query
Language or SQL – is a fourth generation language (4GL). A 4GL concentrates on the
‘what’ and not on the ‘how’ of the problem. Selective output can be achieved by giving a
simple query. The relational database translates the user queries into the code required to
extract the desired information.

Disadvantages of Relational Data Model:

a. Hardware overheads – The RDBMS needs comparatively powerful hardware as it
hides the implementation complexities and the physical data storage details from the
users.
b. Ease of design can result in bad design – As the relational database is an easy-to-
design and use system, it can result in the development and implementation of poorly
designed database management systems. As the size of the database increases, several
problems may creep in – system shutdown, performance degradation and data corruption.

2. Network Data Model –
1. In the network model, data are represented by collections of records.
2. Relationships among data are represented by links.
3. In this model Graph data structure is used.
4. A network model permits a record to have more than one parent.




Advantages of Network Data Model:

a. Simplicity – The network data model is also conceptually simple and easy to design.
b. Ability to handle more relationship types – The network model can handle the one-
to-many and many-to-many relationships.
c. Ease of data access – In the network database terminology, a relationship is a set.
Each set comprises of two types of records – an owner record and a member record. In a
network model an application can access an owner record and all the member records
within a set.
d. Data Integrity – In a network model, no member can exist without an owner. A user
must therefore first define the owner record and then the member record. This ensures the
data integrity.
e. Data Independence – The network model draws a clear line of demarcation between
the programs and the complex physical storage details. The application programs work
independently of the data. Any changes made in the data characteristics do not affect the
application program.
f. Database standards – The standards devised by the DBTG (Database Task Group of
CODASYL Committee) form the basis of the network model. These standards were
further enhanced by ANSI/SPARC (American National Standards Institute/Standards
Planning and Requirements Committee) in the 1970s. All the network database
management systems adhere to these standards. These standards comprise of a DDL and
a DML that augments the database administration and portability.

Disadvantages of Network Data Model:

a. System complexity – In a network model, data are accessed one record at a time. This
makes it essential for the database designers, administrators, and programmers to be
familiar with the internal data structures to gain access to the data. Therefore, a user-
friendly database management system cannot be created using the network model.
b. Lack of structural independence – Making structural modifications to the database is
very difficult in the network database model as the data access method is navigational.
Any changes made to the database structure require the application programs to be
modified before they can access data. Though the network database model achieves data
independence, it still fails to achieve structural independence.

3. Hierarchical Model:

1. In the hierarchical model, data are represented by collections of records.
2. Relationships among data are represented by links.
3. In this model Tree data structure is used.
4. There are two concepts associated with the hierarchical model – segment types and
parent-child relationships. Segment type is similar to the record types in the network
models. The information retrieved only by navigating from the root segment type to the
nodes segment types. Thus you can access a segment type only via its parent segment
type in the parent-child relationship. The operators provided for manipulating such
structures include operators for traversing hierarchic paths up and down the trees.
Advantages of hierarchical Model:
1. Simplicity – Since the database is based on the hierarchical structure, the relationship
between the various layers is logically simple. Thus, the design of a hierarchical database
is simple.
2. Data Security – Hierarchical model was the first database that offered the data
security that is provided and enforced by the DBMS.
3. Data Integrity – Since the hierarchical model is based on the parent/child relationship,
there is always a link between the parent segment and the child segment under it. The
child segments are always automatically referenced to its parent, this model promotes
data integrity.
4. Efficiency – The hierarchical database model is a very efficient one when the database
contains a large number of one-to-many relationships and when the users require large
number of transactions, using data whose relationships are fixed.

Disadvantages of hierarchical Model

1. Implementation Complexity – Although the hierarchical database model is
conceptually simple and easy to design, it is quite complex to implement. The database
designers should have very good knowledge of the physical data storage characteristics.
2. Database management problems – If you make any changes in the database structure
of a hierarchical database, then it is required to make the necessary changes in all the
application programs that access the database. Thus, maintaining the database and the
applications can become very cumbersome.
3. Lack of structural independence – Structural independence exists when the changes
made to the database structure does not affect the DBMS’s ability to access data.
Hierarchical database systems use physical storage paths to navigate to the different data
segments. So the application programmer should have a good knowledge of the relevant
access paths to access the data. So if the physical structure is changed the applications
will also have to be altered. Thus, in a hierarchical database the benefits of data
independence are limited by structural dependence.
4. Programming complexity – Due to the structural dependence and the navigational
structure, the application programmers and the end users must know precisely how the
data is distributed physically in the database in order to access data. This requires
knowledge of complex pointer systems, which is difficult for users who have little or no
Programming knowledge.
5. Implementation limitation – Many of the common relationships do not conform to
the one-to-many format required by the hierarchical model. The many-to-many
relationships, which are more common in real life, are very difficult to implement in a
hierarchical model.

2. Object Based Data Models:

In Object-based models, the database is organized in real world objects of several types.
A number of fields, or attributes, are defined in each object type, and each field is usually
of a variable length.
The two most popular object-based data models are
      a. Object oriented model
      b. E - R Model

1. Object Oriented Model –
1. The object-oriented model is based on a collection of objects, like the E-R model.
2. An object contains values stored in instance variables within the object.
3. Unlike the record-oriented models, these values are themselves objects.
4. Thus objects contain objects to an arbitrarily deep level of nesting.
5. An object also contains bodies of code that operate on the object. These bodies of code
are called methods.
6. Objects that contain the same types of values and the same methods are grouped into
classes.
7. A class may be viewed as a type definition for objects.
8. Analogy: the programming language concept of an abstract data type.
9. The only way in which one object can access the data of another object is by invoking
the method of that other object. This is called sending a message to the object.
10. Internal parts of the object, the instance variables and method code, are not visible
externally.
11. Result is two levels of data abstraction.
For example, consider an object representing a bank account.
        a. The object contains instance variables number and balance.
        b. The object contains a method pay-interest which adds interest to the balance.
        c. Under most data models, changing the interest rate entails changing code in
        application programs.
        d. In the object-oriented model, this only entails a change within the pay-interest
        method.
12. Unlike entities in the E-R model, each object has its own unique identity, independent
of the values it contains:
                a. Two objects containing the same values are distinct.
                b. Distinction is maintained in physical level by assigning distinct object
                identifiers.

Advantages of Object Oriented Data Model:

a. Capability to handle large number of different data types – Traditional database
models like hierarchical, network and relational database are limited in their capability to
store the different types of data. For e.g., one cannot store pictures, voices and video in
these databases. But the object-oriented database can store any type of data including
text, numbers, pictures, voice and video.
b. Combination of object-oriented programming and database technology – Perhaps
the most significant characteristic of object-oriented database technology is that it
combines object-oriented programming with database technology to provide an
integrated application development system.
c. Object-oriented features improve productivity – Inheritance allows one to develop
solutions to complex problems incrementally by defining new objects in terms of
previously defined objects. Polymorphism and dynamic binding allow one to define
operations for one object and then to share the specification of the operation with other
objects.
These objects can further extend this operation to provide behaviors that are unique to
those objects. Dynamic binding determines at runtime, which of these operations is
actually executed, depending on the class of the object requested to perform the
operation. Polymorphism and dynamic binding are powerful object-oriented features that
allow one to compose objects to provide solutions without having to write code that is
specific to each object. All of these capabilities come together to provide significant
productivity advantages to database application developers. Data access – Object-
oriented database represent relationships explicitly, supporting both navigational and
associative access to information. As the complexity of interrelationships between
information within the database increases, the greater the advantages of representing
relationships explicitly. Another benefit of using explicit relationships is the
improvement in data access performance over relational value-based relationships.

Disadvantages of Object Oriented Data Model:

a. Difficult to maintain – In the real world, the data model is not static. It changes as
organizational information needs change and as missing information is identified.
Consequently, the definition of objects must be changed periodically and existing
databases migrated to conform to the new object definitions. Object-oriented databases
are semantically rich introducing a number of challenges when changing object
definitions and migrating databases.
Object-oriented databases have a greater challenge handling schema migration because it
is not sufficient to simply migrate the data representation to conform to the changes in
class specifications. One must also update the behavioral code associated with each
object.
b. Not suited for all applications – Object-oriented database systems are not suited for
all applications. If it is used in situations where it is not required, then it will result in
performance degradation and high processing requirements.OODBMS is popular in area
such as e-commerce, engineering product data management, and special purpose
databases in securities and medicine. The strength of the object model is in applications
where there is an underlying needed to manage complex relationships among data
objects.

E R Model (Entity Relational Model):

1. The entity-relationship model is based on a perception of the world as consisting of a
collection of basic objects (entities) and relationships among these objects.
2. It is an object-based logical model.
3. It is a high-level data model.
4. An entity is a distinguishable object that exists.
5. Each entity has associated with it a set of attributes describing it. E.g. number and
balance for an account entity.
6. A relationship is an association among several entities.
E.g. A cust_acct relationship associates a customer with each account he or she has.
7. The set of all entities or relationships of the same type is called the entity set or
relationship set.
8. Another essential element is the E-R diagram in which the mapping cardinalities
express the number of entities to which another entity can be associated via a relationship
set.
9. The overall logical structure of a database can be expressed graphically by an E-R
diagram:
                a. Rectangles: represent entity sets.
                b. Ellipses: represent attributes.
                c. Diamonds: represent relationships among entity sets.
                d. Lines: link attributes to entity sets and entity sets to relationships.

Contenu connexe

Tendances

Unit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteUnit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteRaj vardhan
 
Dbms relational model
Dbms relational modelDbms relational model
Dbms relational modelChirag vasava
 
Codd's Rules for Relational Database Management Systems
Codd's Rules for Relational Database Management SystemsCodd's Rules for Relational Database Management Systems
Codd's Rules for Relational Database Management SystemsRajeev Srivastava
 
Basic Concept Of Database Management System (DBMS) [Presentation Slide]
Basic Concept Of Database Management System (DBMS) [Presentation Slide]Basic Concept Of Database Management System (DBMS) [Presentation Slide]
Basic Concept Of Database Management System (DBMS) [Presentation Slide]Atik Israk
 
Relational database
Relational database Relational database
Relational database Megha Sharma
 
Architecture of dbms(lecture 3)
Architecture of dbms(lecture 3)Architecture of dbms(lecture 3)
Architecture of dbms(lecture 3)Ravinder Kamboj
 
Database administrator
Database administratorDatabase administrator
Database administratorTech_MX
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLEVraj Patel
 
Database fundamentals(database)
Database fundamentals(database)Database fundamentals(database)
Database fundamentals(database)welcometofacebook
 
Database architecture
Database architectureDatabase architecture
Database architectureVENNILAV6
 

Tendances (20)

Data independence
Data independenceData independence
Data independence
 
Dbms
DbmsDbms
Dbms
 
Unit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteUnit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 Complete
 
Dbms relational model
Dbms relational modelDbms relational model
Dbms relational model
 
Lecture2 oracle ppt
Lecture2 oracle pptLecture2 oracle ppt
Lecture2 oracle ppt
 
Dbms architecture
Dbms architectureDbms architecture
Dbms architecture
 
Codd's Rules for Relational Database Management Systems
Codd's Rules for Relational Database Management SystemsCodd's Rules for Relational Database Management Systems
Codd's Rules for Relational Database Management Systems
 
Basic Concept Of Database Management System (DBMS) [Presentation Slide]
Basic Concept Of Database Management System (DBMS) [Presentation Slide]Basic Concept Of Database Management System (DBMS) [Presentation Slide]
Basic Concept Of Database Management System (DBMS) [Presentation Slide]
 
Unit1 DBMS Introduction
Unit1 DBMS IntroductionUnit1 DBMS Introduction
Unit1 DBMS Introduction
 
Relational database
Relational database Relational database
Relational database
 
DML, DDL, DCL ,DRL/DQL and TCL Statements in SQL with Examples
DML, DDL, DCL ,DRL/DQL and TCL Statements in SQL with ExamplesDML, DDL, DCL ,DRL/DQL and TCL Statements in SQL with Examples
DML, DDL, DCL ,DRL/DQL and TCL Statements in SQL with Examples
 
DbMs
DbMsDbMs
DbMs
 
Architecture of dbms(lecture 3)
Architecture of dbms(lecture 3)Architecture of dbms(lecture 3)
Architecture of dbms(lecture 3)
 
Database administrator
Database administratorDatabase administrator
Database administrator
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
 
ER-Model-ER Diagram
ER-Model-ER DiagramER-Model-ER Diagram
ER-Model-ER Diagram
 
Database fundamentals(database)
Database fundamentals(database)Database fundamentals(database)
Database fundamentals(database)
 
Database architecture
Database architectureDatabase architecture
Database architecture
 
Database systems introduction
Database systems introductionDatabase systems introduction
Database systems introduction
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 

Similaire à 23246406 dbms-unit-1 (20)

Database Management System Introduction
Database Management System IntroductionDatabase Management System Introduction
Database Management System Introduction
 
DBMS-material for b.tech students to learn
DBMS-material for b.tech students to learnDBMS-material for b.tech students to learn
DBMS-material for b.tech students to learn
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
 
DBMS introduction
DBMS introductionDBMS introduction
DBMS introduction
 
Introduction to Database
Introduction to DatabaseIntroduction to Database
Introduction to Database
 
database introductoin optimization1-app6891.pdf
database introductoin optimization1-app6891.pdfdatabase introductoin optimization1-app6891.pdf
database introductoin optimization1-app6891.pdf
 
Chapter one
Chapter oneChapter one
Chapter one
 
DBMS
DBMS DBMS
DBMS
 
Database management systems
Database management systemsDatabase management systems
Database management systems
 
RDBMS to NoSQL. An overview.
RDBMS to NoSQL. An overview.RDBMS to NoSQL. An overview.
RDBMS to NoSQL. An overview.
 
Advanced Database Management System_Introduction Slide.ppt
Advanced Database Management System_Introduction Slide.pptAdvanced Database Management System_Introduction Slide.ppt
Advanced Database Management System_Introduction Slide.ppt
 
Ch 2-introduction to dbms
Ch 2-introduction to dbmsCh 2-introduction to dbms
Ch 2-introduction to dbms
 
Dbms Useful PPT
Dbms Useful PPTDbms Useful PPT
Dbms Useful PPT
 
Unit 2 DATABASE ESSENTIALS.pptx
Unit 2 DATABASE ESSENTIALS.pptxUnit 2 DATABASE ESSENTIALS.pptx
Unit 2 DATABASE ESSENTIALS.pptx
 
27 fcs157al2
27 fcs157al227 fcs157al2
27 fcs157al2
 
Database, Lecture-1.ppt
Database, Lecture-1.pptDatabase, Lecture-1.ppt
Database, Lecture-1.ppt
 
Data base management system
Data base management systemData base management system
Data base management system
 
Dbms
DbmsDbms
Dbms
 
Fundamentals of DBMS
Fundamentals of DBMSFundamentals of DBMS
Fundamentals of DBMS
 
Database management system
Database management systemDatabase management system
Database management system
 

Dernier

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 

Dernier (20)

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 

23246406 dbms-unit-1

  • 1. Unit 1 1.1 Overview of Database Management System: Data - Data is meaningful known raw facts that can be processed and stored as information. Database - Database is a collection of interrelated and organized data. In general, it is a collection of files (tables). Entity: A person, place, thing or event about which information must be kept. Attribute: Piece of information describing a particular entity. These are mainly the characteristics about the individual entity. Individual attributes help to identify and distinguish one entity from another. STUDENT (DATABASE NAME) Entity Attributes Personnel Name, Age, Address, Father’s Name Academic Name, Roll No., Course, Dept. Name E.g. Student (Database Name) Field name or attribute name Personnel (Table Name) Academic (table name) ROLL COURSE Dept. Name Name Father Name Age Name NO John Albert 24 RECORD John 12 MSC Computer Ramesh Suresh 18 Ramesh 15 BCA Computer DBMS - Database Management System (DBMS) is a collection of interrelated data [usually called database] and a set of programs to access, update and manage those data [which form part of management system] OR It is a software package to facilitate creation and maintenance of computerized database.
  • 2. It is general purpose software that facilitates the following: 1. Defining: Specifying data types and structures, and constraints for data to be stored. 2. Constructing: Storing data in a storage medium. 3. Manipulating: Involves querying, updating and generating reports. 4. Sharing: Allowing multiple users and programs to access data simultaneously. Eg. Of DBMS- Access, dBase, FileMaker Pro, and FoxBASE, ORACLE etc. 1.1.2 Primary goals of DBMS are: 1. To provide a way to store and retrieve database information that is both convenient and efficient. 2. To manage large and small bodies of information. It involves defining structures for storage of information and providing mechanism for manipulation of information. 3. It should ensure safety of information stored, despite system crashes or attempts at unauthorized access. 4. If data are to be shared among several users, then system should avoid possible anomalous results.
  • 3. 1.2 Database Management Systems: Database Management Systems provide the following advantages and disadvantages: Advantages of DBMS: • Data independence: provides an abstract view of the data that hides the details data representation and storage. • Efficient Data Access: This is the advantage where we use variety of techniques to store and retrieve data. • Data integrity and security: we can ensure data integrity if the data is always enforced through integrity constraint. • Data administration: data" administration deals with the modeling of the data and treats data as an organizational resource, while "database" administration deals with the implementation of the types of databases that are in use. • Concurrent Access and crash recovery: It ensures concurrent access of the data in such a way that the data is being accessed by only one user a time. Also protects the system from crashes. • Reduced Application Development time: It supports all the important functions that are common to many applications. Disadvantages of a DBMS: The following are disadvantages of DBMS • Setup of the database system requires more knowledge, money, skills, and time. • The complexity of the database may result in poor performance. 1.2.1 Advantages & Disadvantages of DBMS: Advantages Disadvantages Greater flexibility Difficult to learn Good for larger databases Packaged separately from the operating system (i.e. Oracle, Microsoft Access, Lotus/IBM Approach, Borland Paradox, Claris FileMaker Pro) Greater processing power Slower processing speeds Fits the needs of many medium to large- Requires skilled administrators sized organizations Storage for all relevant data Expensive Provides user views relevant to tasks performed Ensures data integrity by managing transactions (ACID test = atomicity, consistency, isolation, durability) Supports simultaneous access
  • 4. Enforces design criteria in relation to data format and structure 1.2.2 Applications of DBMS: 1. Banking – For customer information, accounts, and loans, and banking transactions. [all transactions] 2. Airlines – For reservation and schedule information. [reservations, schedules] 3. Universities – For student information, course registrations, and grades. [registration, grades] 4. Credit Card Transactions – For purchases on credit card and generation of monthly statements. 5. Telecommunication – For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about communication networks. 6. Finance – For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds. 7. Sales – For customer, product, and purchase information. [customers, products, purchases] 8. Manufacturing – For management of supply chain and for tracking production of items in factories, inventories of items in warehouses/stores, and orders for items. [production, inventory, orders, supply chain] 9. Human Resources – For information about employees, salaries, payroll taxes and benefits, and generation of paychecks. [employee records, salaries, tax deductions] 1.3 Various views of Data 1.3.1 Data abstraction: It can be summed up as follows. 1. When the DBMS hides certain details of how data is stored and maintained, it provides what is called as the abstract view of data. 2. This is to simplify user-interaction with the system. 3. Complexity (of data and data structure) is hidden from users through several levels of abstraction. Data abstraction is used for following purposes: 1. To provide abstract view of data. 2. To hide complexity from user. 3. To simplify user interaction with DBMS.
  • 5. Figure: The Three Levels of Abstraction Levels of data abstraction: There are three levels of data abstraction. 1. Physical level: It describes how a record (e.g., customer) is stored. Features: a) Lowest level of abstraction. b) It describes how data are actually stored. c) It describes low-level complex data structures in detail. d) At this level, efficient algorithms to access data are defined. 2. Logical level: It describes what data stored in database, and the relationships among the data. Features: a) It is next-higher level of abstraction. Here whole Database is divided into small simple structures. b) Users at this level need not be aware of the physical-level complexity used to implement the simple structures. c) Here the aim is ease of use. d) Generally, database administrators (DBAs) work at logical level of abstraction. 3. View level: Application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes. Features: a) It is the highest level of abstraction. b) It describes only a part of the whole Database for particular group of users. c) This view hides all complexity.
  • 6. d) It exists only to simplify user interaction with system. e) The system may provide many views for the whole system. 1.4 Data Models A data model is a collection of concepts that can be used to describe the structure of a database and provides the necessary means to achieve this abstraction whereas structure of a database means the data types, relationships and constraints that should hold on the data. Collection of conceptual tools for describing data, data relationships, data semantics and consistency constraints. The various data models that have been proposed fall into three different groups. Object based logical models, record-based logical models and physical models. Object-Based Logical Models: They are used in describing data at the logical and view levels. They are characterized by the fact that they provide fairly flexible structuring capabilities and allow data constraints to be specified explicitly. There are many different models and more are likely to come. Several of the more widely known ones are: • The E-R model • The object-oriented model • The semantic data model • The functional data model The E-R Model The (E-R) data model is based on a perception of a real worker that consists of a collection of basic objects, called entities, and of relationships among these objects. The overall logical structure of a database can be expressed graphically by an E-R diagram. Which is built up by the following components: • Rectangles, which represent entity sets • Ellipses, which represent attributes • Diamonds, which represent relationships among entity sets • Lines, which link attributes to entity sets and entity sets to relationships. E.g. suppose we have two entities like customer and account, then these two entities can be modeled as follow: Account number name Customer Customer city Balanc e Customer Deposit Account FIGURE 1.3: A SAMPLE E-R DIAGRAM The Object-Oriented Model Like the E-R model the object-oriented model is based on a collection of objects. An object contains values stored in instance variables within the object. An object also
  • 7. contains bodies of code that operate on the object. These bodies of code are called methods. Classes: It is the collection of objects which consist of the same types of values and the same methods. E.g. account number & balance are instance variables; pay-interest is a method that uses the above two variables and adds interest to the balance. Semantic Models These include the extended relational, the semantic network and the functional models. They are characterized by their provision of richer facilities for capturing the meaning of data objects and hence of maintaining database integrity. Systems based on these models exist in monotype for at the time of writing and will begin to filter through the next decade. Record-Based Logical Models Record based logical models are also used in describing data at the logical and view levels. In contrast to object-based data models, they are used both to specify the overall logical structures of the database, and to provide a higher-level description of the implementation. Record-based models are so named because the database is structured in fixed-format records of several types. Each record type defines a fixed number of fields, or attributes, and each field is usually of a fixed length. The three most widely accepted record-based data models are the relational, network, and hierarchical models. Relational Model The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name as follows: CUSTOMER (TABLE NAME) Customer–name Customer-street Customer-City Account-number Johnsons Alma Pala Alto A-101 Smith North Ryc A-215 Hayes Main Harrison A-102 Turner Dutnam Stanford A-305 Johnson Alma PalaAlto A-201 Jones Main Harrison A-217 Lindsay Park Pittifield A-222 Smith North Rye A-201 A SAMPLE RELATIONAL DATABASE
  • 8. Network Model Data in the network model is represented by collection of records, and relationship among data is represented by links, which can be viewed as pointers. The records in the database are organized as collections of arbitrary graphs. Such type of database is shown in the Figure 1.4. Johnsons Alma Pala Alto A-101 500 Smith North Rye A-215 700 Hayes Main Harrison A-102 400 Turner Dutnam Stanford A-305 350 Jones Main Harrison A-201 900 Lindsay Park Pittifield A-217 750 A-222 700 FIGURE 1.4: A SAMPLE NETWORK DATABASE Hierarchical Model The hierarchical model is similar to the network model in the sense that data and relationships among data are represented by records and links, respectively. It differs from the network model in that records are organized as collection of trees rather than arbitrary graphs. CUSTOMER Johnson customer street ------- Smith North ------- Hayes Main ------- A-101 500 A-201 900 Turner Putnam ------- Jones Main ------- A-215 700 Lindsay Park ------- A-201 900 A-217 350 A-102 400 A-222 700 A-305 350 FIGURE : A SAMPLE HIERACHICAL DATABASE
  • 9. Physical Data Models Physical data models are used to describe data at the lowest level. In contrast to logical data models, there are few physical data models in use. Two of the widely known ones are the unifying model and the frame-memory model. 1.5 Instances and Schemas: Instances and Schemas are similar to types and variables in programming languages. 1. Schema: The overall design of a database is called database schema. E.g., the database consists of information about a set of customers and accounts and the relationship between them. It is analogous to variable along with its type information in a program. Types of Schemas (partitioned according to levels of abstraction): a. Physical schema: It is database design at the physical level. It is hidden below logical schema, and can be changed easily without affecting application programs. b. Logical schema: It is database design at the logical level. Programmers construct applications using logical schema. It is by far the most important schema, in terms of its effect on application programs. c. Subschema: It is schema at view level. 2. Instance: It is the actual content of the database at a particular point in time. It is analogous to the value of a variable. 1.6 Introduction to Database Languages & Environments 1.6.1 Database Languages: We have Data Definition Languages (DDL) to specify database schemas and Data Manipulation Language (DML) to express database updates and queries. In practice, these are not to separate languages but are part of a single database language, like SQL. 1. Data Definition Languages (DDL) It is the language that is used to specify database schemas by a set of definitions contained in it. 2. Data Manipulation Language (DML) It is a language for accessing and manipulating the data organized by the appropriate data model.
  • 10. DML is also known as query language. There are two types of DML a) Procedural DMLs b) Declarative DMLs (non-procedural DMLs) 1. Procedural DMLs - This language requires user to specify what data is required and how to get those data. 2. Declarative DMLs (non-procedural DMLs) - This language requires user to specify what data is required without specifying how to get those data. Features of DDL: • Used to specify a database schema as a set of definitions expressed in a DDL • DDL statements are compiled, resulting in a set of tables stored in a special file called a data dictionary or data directory. • The data directory contains metadata (data about data). • The storage structure and access methods used by the database system are specified by a set of definitions in a special type of DDL called a data storage and definition language. • Basic idea of DDL: To hide implementation details of the database schemes from the users. Features of DML: • A DML is a language which enables users to access and manipulate data. • The goal is to provide efficient human interaction with the system. • There are two types of DMLs o Procedural: Here user specifies what data is needed and how to get it. o Non-procedural: Here user only specifies what data is needed. - Easier for user - May not generate code as efficient as that produced by procedural languages. 1.6.2 Database System Environment: 1.6.2.1 DBMS System Structure and its Components: We can explain the overall structure of DBMS/System structure and its components by the diagram given below:
  • 11. Figure: System Structure 1. Database systems are partitioned into modules for different functions. Some functions (e.g. file systems) may be provided by the operating system. 2. Broadly the functional components of a database system are: a. Query Processor: It is one of the functional components of DBMS. It translates statements in a query language into low-level instructions the database manager understands. It may also attempt to find an equivalent but more efficient form. It contains following components:
  • 12. a. DML compiler - It converts DDL statements to a set of tables containing metadata stored in a data dictionary. It also performs query optimization. b. DDL interpreter – It interprets DDL statements and records definitions into data dictionary. c. Query evaluation engine – It executes low-level instructions generated by DML compiler. They mainly deal with solving all problems related to queries and query processing. It helps database system simplify and facilitate access to data. b. Storage Manger (Database Manager) 1. Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. 2. The storage manager is responsible to the following tasks: 1. interaction with the file manager 2. efficient storing, retrieving and updating of data 3. The important components include: a. File manager: It manages allocation of disk space and data structures used to represent information on disk. b. Database manager: It is the interface between low-level data and application programs and queries. c. Transaction manager: Transaction manager ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures. d. DML precompiler: It converts DML statements embedded in an application program to normal procedure calls in a host language. The precompiler interacts with the query processor. e. DDL compiler: It converts DDL statements to a set of tables containing metadata stored in a data dictionary. f. Authorization and integrity manger – It conducts integrity checks and user authority to access data. g. Buffer manger – It is critical part of DB and stores temporary data. In addition, several data structures are required for physical system implementation: a. Data files: They store the database itself. b. Data dictionary: It stores information about the structure of the database. It is used heavily. Great emphasis should be placed on developing a good design and efficient implementation of the dictionary. In short, it stores metadata. c. Indices: They provide fast access to data items holding particular values.
  • 13. 1.7 File systems/File processing systems A file system is basically storing information in data structures called ‘files’ in the operating system and manipulating this information via application programs that manipulate the files. File Processing System Advantages Disadvantages Simpler to use Typically does not support multi-user access Less expensive· Limited to smaller databases Fits the needs of many small businesses Limited functionality (i.e. no support for and home users complicated transactions, recovery, etc.) Popular FMS’s are packaged along with the operating systems of personal Decentralization of data computers (i.e. Microsoft Card file and Microsoft Works) Good for database solutions for hand held Redundancy and Integrity issues devices such as Palm Pilot Disadvantages of File Processing System: 1. Data Redundancy – Since different programmers create the files and application programs over a long period, the various files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several files, this duplication of data over several files is known as data redundancy. Eg. The address and telephone number of a particular customer may appear in a file that consists of saving- account records and in a file that consists of checking account records. This redundancy leads to higher storage & excess cost also leads to inconsistency discussed in the next. 2. Data Inconsistency – The various copies of same data may no longer agree i.e. various copies of the same data may contain different information. Eg. A changed customer address may be reflected in savings-account records but not elsewhere in the system. 3. Difficulty in accessing data – In a conventional file processing system it is difficult to access the data in a specific manner and it is require creating an application program to carry out each new task. Eg. Suppose that one of the bank officers needs to find out the names of all customers who live within a particular postal-code area. We ask the officer of data-processing department to generate such a list.We have 2 choices:
  • 14. - List all cust_names & manually do the information required - Ask the data processing dept. to have system programmer to write necessity application program Because the designers of the original system did not anticipate this request, there is no application program on hand to meet it. 4. Data Isolation – Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult. 5. Integrity problems – The data stored in the database must satisfy certain types of consistency constraints. Eg. The balance of a bank account may never fall below a prescribed amount (say, ICICI 2500/- ). Developers enforce these constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files. 6. Atomicity problems – A computer system, like any other mechanical or electrical device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to the failure. Eg. Before computer format, we require to have a backup first. It is difficult to ensure atomicity in a conventional file processing system. 7. Concurrent-access anomalies – For the sake of overall performance of the system and faster response, many systems allow multiple users to update the data simultaneously. In such an environment, interaction of concurrent updates may result in inconsistent data. Eg. Consider bank account A, containing $500. If two customers withdraw funds (say $50 and $100 respectively) from account A at about the same time, the result of the concurrent executions may leave the account in an incorrect (or inconsistent) state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being withdrawn, and write the result back. If the two programs run concurrently, they may both read the value $500, and write back $450 and $400, respectively. Depending on which one writes the value last, the account may contain either $450 or $400, rather than the correct value of $350. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult to provide because data may be accessed by many different application programs that have not been coordinated previously. 8. Security Problems – Not every user of the database system should be able to access all the data. Eg. In a bank system, payroll personnel need to see only that part of the database that has information about the various bank employees. They do not need access to information about customer accounts. But, since application programs are added to the system in an ad hoc manner, enforcing such security constraints is difficult.
  • 15. 1.7.1 Difference between DBMS and File-processing system: DBMS File-processing Systems 1. Redundancies and inconsistencies in data 1. Redundancies and inconsistencies in data are reduced due to single file formats and exist due to single file formats and duplication of data is eliminated. duplication of data. 2. Data is easily accessed due to standard 2. Data cannot be easily accessed due to query procedures. special application programs needed to access data. 3. Isolation/retrieval of required data is 3. Data isolation is difficult due to different possible due to common file format, and file formats, and also because new there are provisions to easily retrieve data. application programs have to be written. 4. Integrity constraints, whether new or old, 4. Introduction of integrity constraints is can be created or modified as per need. tedious and again new application programs have to be written. 5. Atomicity of updates is possible. 5. Atomicity of updates may not be maintained. 6. Several users can access data at the same 6. Concurrent accesses may cause problems time i.e concurrently without problems such as . Inconsistencies. 7. Security features can be enabled in 7. It may be difficult to enforce security DBMS very easily. features. 1.8 Advantages of a DBMS over file-processing system: A DBMS has three main features: (1) Centralized data management, (2) Data independence and (3) Data integration /System integration In a DB system, the DBMS provides the interface b/w the application programs & data. When changes are made to the data representation, the metadata maintained by the DBMS is changed but the DBMS continues to provide data to application programs in previously used way. The DBMS handles the task of information of data where ever necessary. This independence b/w the programs & the data is called data independence. this made program to continue irrespective of changes made in it. To provide a high
  • 16. degree of data independence, a DBMS must include a sophisticated metadata mgmt system. In DBMS , all files are integrated into one system thus reducing redundancies & making data mgmt more efficient. In addition, DBMS provides centralized control of the operational data. Some of advantages of above mention three features are: Due to its centralized nature, the database system can overcome the disadvantages of the file-based system as discussed below. • Minimal Data Redundancy - Since the whole data resides in one central database, the various programs in the application can access data in different data files. Hence data present in one file need not be duplicated in another. This reduces data redundancy. However, this does not mean all redundancy can be eliminated. There could be business or technical reasons for having some amount of redundancy. Any such redundancy should be carefully controlled and the DBMS should be aware of it. • Data Consistency - Reduced data redundancy leads to better data consistency. • Data Integration - Since related data is stored in one single database, enforcing data integrity is much easier. Moreover, the functions in the DBMS can be used to enforce the integrity rules with minimum programming in the application programs. • Data Sharing - Related data can be shared across programs since the data is stored in a centralized manner. Even new applications can be developed to operate against the same data. • Enforcement of Standards - Enforcing standards in the organization and structure of data files is required and also easy in a Database System, since it is one single set of programs which is always interacting with the data files. • Application Development Ease - The application programmer need not build the functions for handling issues like concurrent access, security, data integrity, etc. The programmer only needs to implement the application business rules. This brings in application development ease. Adding additional functional modules is also easier than in file based systems. • Better Controls - Better controls can be achieved due to the centralized nature of the system. • Data Independence - The architecture of the DBMS can be viewed as a 3-level system comprising the following: - The internal or the physical level where the data resides. - The conceptual level which is the level of the DBMS functions - The external level which is the level of the application programs or the end user. Data Independence is isolating an upper level from the changes in the organization or structure of a lower level. For example, if changes in the file organization of a data file do not demand for changes in the functions in the DBMS or in the application programs,
  • 17. data independence is achieved. Thus Data Independence can be defined as immunity of applications to change in physical representation and access technique. The provision of data independence is a major objective for database systems. • Reduced Maintenance - Maintenance is less and easy, again, due to the centralized nature of the system. 1.9 Responsibility of Database Administrator The database administrator is a person having central control over data and programs accessing that data. He coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs. Functions of a DBA: Database administrator's duties include: 1. Schema definition: the creation of the original database schema. This involves writing a set of definitions in a DDL (data storage and definition language), compiled by the DDL compiler into a set of tables stored in the data dictionary. 2. Storage structure and access method definition: writing a set of definitions translated by the data storage and definition language compiler. 3. Schema and physical organization modification: writing a set of definitions used by the DDL compiler to generate modifications to appropriate internal system tables (e.g. data dictionary). This is done rarely, but sometimes the database schema or physical organization must be modified. 4. Granting user authority to access the database: granting different types of authorization for data access to various users 5. Specifying integrity constraints: generating integrity constraints. These are consulted by the database manager module whenever updates occur. 6. Routine Maintenance: It includes the following: a. Acting as liaison with users. b. Monitoring performance and responding to changes in requirements. c. Periodically backing up the database.
  • 18. 1.10 Three levels architecture of database systems A DBMS can be considered as a buffer between application programs, end users and a database designed to fulfill features of data independence. In 1975 the American National Standards Institute Standards Planning and Requirements Committee (ANSI-SPARC) proposed three-level architecture identified three levels of abstraction. Figure: The Three Level Architecture For a DBMS These levels are sometimes referred to as schemas or views. 1. The External or User Level: This level describes the user’s or application program’s view of the database. Several programs or users may share the same view. 2. The Conceptual Level: This level describes the organization’s view of all the data in the database, the relationships between the data and the constraints applicable to the database. This level describes a logical view of the database i.e. a view locking implementation detail. 3. The Internal or Physical Level: This level describes the way in which data is stored and the way in which data may be accessed. This level describes a physical view of the database. Each level is described in terms of schema – a map of the database. The three- level architecture is used to implement data independence through two levels of mapping:
  • 19. that between the external schema and the conceptual schema and that between the conceptual schema and the physical schema. For eg., Consider a Bank System, It uses • Customer_Details Table. • Customer_Transactions Table. At the internal level, a Customer_Details or Customer_Transaction record can be described as a block of consecutive storage locations (for example, words of bytes). The language compiler hides this level of detail from programmers. Similarly, the database system hides the lowest-level storage details (how data is stored and accessed) from database programmers. At the conceptual level, the table definition (the attributes data type and width definition) and the interrelationship among the data is described. Finally at the external level, several views of the database are defined, and database end users are also to see those views. In addition to hiding details of the conceptual level of the database, the views also provide a security mechanism to prevent users from accessing parts of the database. For e.g., tellers in the bank will be able to see only that part of the database that has information on customer accounts; they cannot access information concerning salaries of bank employees.
  • 20.
  • 21. Data Model A data model is collection of tools for describing 1. data 2. data relationships 3. data semantics 4. data constraints Types of Data Models: There are basically two types of data models 1. Record based Data Models. 2. Object based Data Models. 1. Record based Data Models – In Record-based models, the database is organized in fixed-format records of several types. A fixed number of fields, or attributes, are defined in each record type, and each field is usually of a fixed length. The three most popular record-based data models are 1. Relational Data Model 2. Network Data Model 3. Hierarchical Data Model 1. Relational Data Model – 1. The relational model uses tables to represent the data and the relationships among those data. 2. Each table has multiple columns, and each column is identified by a unique name. 3. It is a low level model. In this database, each row in the table represents a different customer. Relationships link rows from two tables on the basis of the key field, in this case – number. Advantages of Relational Data Model: a. Structural Independence – Relational database model has structural independence, i.e. changes made in the database structure do not affect the DBMS’s capability to access data. b. Simplicity – The relational model is the simplest model at the conceptual level. It allows the designer to concentrate on the logical view of the database, leaving the physical data storage details. c. Ease of designing, implementation, maintenance, and usage – Due to the inherent features of data independence and structural independence, and the relational model makes it easy to design, implement, maintain and use the databases. d. Adhoc query capability – One of the main reasons for the huge popularity of the relational database model is the presence of powerful, flexible and easy-to-use query capability. The query language of the relational database model – Structure Query Language or SQL – is a fourth generation language (4GL). A 4GL concentrates on the ‘what’ and not on the ‘how’ of the problem. Selective output can be achieved by giving a
  • 22. simple query. The relational database translates the user queries into the code required to extract the desired information. Disadvantages of Relational Data Model: a. Hardware overheads – The RDBMS needs comparatively powerful hardware as it hides the implementation complexities and the physical data storage details from the users. b. Ease of design can result in bad design – As the relational database is an easy-to- design and use system, it can result in the development and implementation of poorly designed database management systems. As the size of the database increases, several problems may creep in – system shutdown, performance degradation and data corruption. 2. Network Data Model – 1. In the network model, data are represented by collections of records. 2. Relationships among data are represented by links. 3. In this model Graph data structure is used. 4. A network model permits a record to have more than one parent. Advantages of Network Data Model: a. Simplicity – The network data model is also conceptually simple and easy to design. b. Ability to handle more relationship types – The network model can handle the one- to-many and many-to-many relationships. c. Ease of data access – In the network database terminology, a relationship is a set. Each set comprises of two types of records – an owner record and a member record. In a network model an application can access an owner record and all the member records within a set. d. Data Integrity – In a network model, no member can exist without an owner. A user must therefore first define the owner record and then the member record. This ensures the data integrity. e. Data Independence – The network model draws a clear line of demarcation between the programs and the complex physical storage details. The application programs work independently of the data. Any changes made in the data characteristics do not affect the application program.
  • 23. f. Database standards – The standards devised by the DBTG (Database Task Group of CODASYL Committee) form the basis of the network model. These standards were further enhanced by ANSI/SPARC (American National Standards Institute/Standards Planning and Requirements Committee) in the 1970s. All the network database management systems adhere to these standards. These standards comprise of a DDL and a DML that augments the database administration and portability. Disadvantages of Network Data Model: a. System complexity – In a network model, data are accessed one record at a time. This makes it essential for the database designers, administrators, and programmers to be familiar with the internal data structures to gain access to the data. Therefore, a user- friendly database management system cannot be created using the network model. b. Lack of structural independence – Making structural modifications to the database is very difficult in the network database model as the data access method is navigational. Any changes made to the database structure require the application programs to be modified before they can access data. Though the network database model achieves data independence, it still fails to achieve structural independence. 3. Hierarchical Model: 1. In the hierarchical model, data are represented by collections of records. 2. Relationships among data are represented by links. 3. In this model Tree data structure is used. 4. There are two concepts associated with the hierarchical model – segment types and parent-child relationships. Segment type is similar to the record types in the network models. The information retrieved only by navigating from the root segment type to the nodes segment types. Thus you can access a segment type only via its parent segment type in the parent-child relationship. The operators provided for manipulating such structures include operators for traversing hierarchic paths up and down the trees.
  • 24. Advantages of hierarchical Model: 1. Simplicity – Since the database is based on the hierarchical structure, the relationship between the various layers is logically simple. Thus, the design of a hierarchical database is simple. 2. Data Security – Hierarchical model was the first database that offered the data security that is provided and enforced by the DBMS. 3. Data Integrity – Since the hierarchical model is based on the parent/child relationship, there is always a link between the parent segment and the child segment under it. The child segments are always automatically referenced to its parent, this model promotes data integrity. 4. Efficiency – The hierarchical database model is a very efficient one when the database contains a large number of one-to-many relationships and when the users require large number of transactions, using data whose relationships are fixed. Disadvantages of hierarchical Model 1. Implementation Complexity – Although the hierarchical database model is conceptually simple and easy to design, it is quite complex to implement. The database designers should have very good knowledge of the physical data storage characteristics. 2. Database management problems – If you make any changes in the database structure of a hierarchical database, then it is required to make the necessary changes in all the application programs that access the database. Thus, maintaining the database and the applications can become very cumbersome. 3. Lack of structural independence – Structural independence exists when the changes made to the database structure does not affect the DBMS’s ability to access data. Hierarchical database systems use physical storage paths to navigate to the different data segments. So the application programmer should have a good knowledge of the relevant access paths to access the data. So if the physical structure is changed the applications will also have to be altered. Thus, in a hierarchical database the benefits of data independence are limited by structural dependence. 4. Programming complexity – Due to the structural dependence and the navigational structure, the application programmers and the end users must know precisely how the data is distributed physically in the database in order to access data. This requires knowledge of complex pointer systems, which is difficult for users who have little or no Programming knowledge. 5. Implementation limitation – Many of the common relationships do not conform to the one-to-many format required by the hierarchical model. The many-to-many relationships, which are more common in real life, are very difficult to implement in a hierarchical model. 2. Object Based Data Models: In Object-based models, the database is organized in real world objects of several types. A number of fields, or attributes, are defined in each object type, and each field is usually of a variable length.
  • 25. The two most popular object-based data models are a. Object oriented model b. E - R Model 1. Object Oriented Model – 1. The object-oriented model is based on a collection of objects, like the E-R model. 2. An object contains values stored in instance variables within the object. 3. Unlike the record-oriented models, these values are themselves objects. 4. Thus objects contain objects to an arbitrarily deep level of nesting. 5. An object also contains bodies of code that operate on the object. These bodies of code are called methods. 6. Objects that contain the same types of values and the same methods are grouped into classes. 7. A class may be viewed as a type definition for objects. 8. Analogy: the programming language concept of an abstract data type. 9. The only way in which one object can access the data of another object is by invoking the method of that other object. This is called sending a message to the object. 10. Internal parts of the object, the instance variables and method code, are not visible externally. 11. Result is two levels of data abstraction. For example, consider an object representing a bank account. a. The object contains instance variables number and balance. b. The object contains a method pay-interest which adds interest to the balance. c. Under most data models, changing the interest rate entails changing code in application programs. d. In the object-oriented model, this only entails a change within the pay-interest method. 12. Unlike entities in the E-R model, each object has its own unique identity, independent of the values it contains: a. Two objects containing the same values are distinct. b. Distinction is maintained in physical level by assigning distinct object identifiers. Advantages of Object Oriented Data Model: a. Capability to handle large number of different data types – Traditional database models like hierarchical, network and relational database are limited in their capability to store the different types of data. For e.g., one cannot store pictures, voices and video in these databases. But the object-oriented database can store any type of data including text, numbers, pictures, voice and video. b. Combination of object-oriented programming and database technology – Perhaps the most significant characteristic of object-oriented database technology is that it combines object-oriented programming with database technology to provide an integrated application development system. c. Object-oriented features improve productivity – Inheritance allows one to develop solutions to complex problems incrementally by defining new objects in terms of
  • 26. previously defined objects. Polymorphism and dynamic binding allow one to define operations for one object and then to share the specification of the operation with other objects. These objects can further extend this operation to provide behaviors that are unique to those objects. Dynamic binding determines at runtime, which of these operations is actually executed, depending on the class of the object requested to perform the operation. Polymorphism and dynamic binding are powerful object-oriented features that allow one to compose objects to provide solutions without having to write code that is specific to each object. All of these capabilities come together to provide significant productivity advantages to database application developers. Data access – Object- oriented database represent relationships explicitly, supporting both navigational and associative access to information. As the complexity of interrelationships between information within the database increases, the greater the advantages of representing relationships explicitly. Another benefit of using explicit relationships is the improvement in data access performance over relational value-based relationships. Disadvantages of Object Oriented Data Model: a. Difficult to maintain – In the real world, the data model is not static. It changes as organizational information needs change and as missing information is identified. Consequently, the definition of objects must be changed periodically and existing databases migrated to conform to the new object definitions. Object-oriented databases are semantically rich introducing a number of challenges when changing object definitions and migrating databases. Object-oriented databases have a greater challenge handling schema migration because it is not sufficient to simply migrate the data representation to conform to the changes in class specifications. One must also update the behavioral code associated with each object. b. Not suited for all applications – Object-oriented database systems are not suited for all applications. If it is used in situations where it is not required, then it will result in performance degradation and high processing requirements.OODBMS is popular in area such as e-commerce, engineering product data management, and special purpose databases in securities and medicine. The strength of the object model is in applications where there is an underlying needed to manage complex relationships among data objects. E R Model (Entity Relational Model): 1. The entity-relationship model is based on a perception of the world as consisting of a collection of basic objects (entities) and relationships among these objects. 2. It is an object-based logical model. 3. It is a high-level data model. 4. An entity is a distinguishable object that exists. 5. Each entity has associated with it a set of attributes describing it. E.g. number and balance for an account entity. 6. A relationship is an association among several entities.
  • 27. E.g. A cust_acct relationship associates a customer with each account he or she has. 7. The set of all entities or relationships of the same type is called the entity set or relationship set. 8. Another essential element is the E-R diagram in which the mapping cardinalities express the number of entities to which another entity can be associated via a relationship set. 9. The overall logical structure of a database can be expressed graphically by an E-R diagram: a. Rectangles: represent entity sets. b. Ellipses: represent attributes. c. Diamonds: represent relationships among entity sets. d. Lines: link attributes to entity sets and entity sets to relationships.