Data administration
CHAPTER 6: DATA ADMINISTRATION
Chapter Objectives
At the end of this chapter, you should be able to:
define data administration, database administration, locking, versioning, deadlock,
transaction;
define the difference between data administration and database administration;
describe the function of a DBMS and its major components;
describe the optimistic and pessimistic systems of concurrency control;
describe the problem of database security and the techniques to enhance security;
describe the problem of database recovery and the facilities used to recover a database.
Essential Reading
Modern Database Management (4th Edition), Fred R. McFadden & Jeffrey A. Hoffer (1994),
Benjamin/Cummings. [Chapter 12, pages 425-458]
Fundamentals of Database Systems, Ramez Elmasri & Shamkant B. Navathe (1989),
Benjamin/Cummings.
Practical Database Techniques, S. Misbah Deen.
Useful Websites to learn Database and Programming:
http://erwinglobio.wix.com/ittraining
http://ittrainingsolutions.webs.com/
http://erwinglobio.sulit.com.ph/
http://erwinglobio.multiply.com/
Prof. Erwin M. Globio, MSIT 6-1
DB212 CHAPTER 6: DATA ADMINISTRATION
6.1 Data and Database Administrator
6.1.1 Introduction
There are many causes of poor data utilization:
Multiple definitions of the same data entity and inconsistent representations of the same
data elements in separate databases, which makes linking data across different databases
difficult.
Missing key data elements, which makes existing data useless.
Low levels of data quality due to inappropriate sources of data or timing of data transfers
from one system to another.
Not knowing what data exist, where to find them, and what they really mean.
Therefore, the data administration function is essential to the success of managing the data
resource.
6.1.2 Data Administration
A high-level function that is responsible for the overall management of data resources in an
organization, including maintaining corporate-wide definitions and standards.
6.1.3 Database Administration
A technical function that is responsible for physical database design and for dealing with
technical issues such as security enforcement, database performance, and backup and
recovery.
6.1.4 Functions of Data and Database Administration
There are 6 stages in the life cycle of a typical database system:
Database planning
This develops a strategic plan for database development that supports the overall
organizational business plan. This is usually the responsibility of top management.
Database analysis
The process of analysis is concerned with identifying data entities currently used by the
organization and their relationships.
Database design
This produces the logical and physical design of the database from the requirements
identified during analysis.
Database implementation
This creates the database definitions, loads the data, and develops and tests the
application programs that use the database.
Operation and maintenance
This is a process to update the database to keep it current.
Growth and change
Data administrators must plan for change, such as adding new record types and
accommodating growth. They must monitor the performance of the database and take
corrective action whenever necessary.
The manner in which these functions are performed varies from one organization to the next
and is influenced by the use of specific methodologies and CASE tools.
6.2 DBMS
A DBMS is a software application system that is used to create, maintain, and provide
controlled access to user databases.
6.2.1 Components of a DBMS
DBMS Engine
This is the central component of a DBMS. It provides access to the repository and
the database and coordinates all of the other functional elements of the DBMS.
Interface subsystem
The interface subsystem provides facilities for users and applications to access the various
components of the DBMS. Most DBMS products provide a range of languages and other
interfaces. The subsystem is used by programmers and by users with little or no
programming experience. For example:
A data definition language (DDL), which is used to define database structures such
as records, tables, files and views.
An interactive query language (such as SQL), which is used to display data extracted
from the database and to perform simple updates.
A graphic interface (such as Query-by-Example).
A DBMS programming language (such as the dBASE IV command language or Access
Basic).
An interface to standard third-generation programming languages such as BASIC and
COBOL.
Information Repository Dictionary Subsystem
This is also known as the Data Dictionary which is used to manage and control access to
the repository.
Performance Management Subsystem
This provides facilities to optimize DBMS performance. Two of its important functions
follow:
Query optimization: Structuring SQL queries to minimize response time.
DBMS reorganization: Maintaining statistics on database usage and taking actions
such as reorganizing the database and creating indexes.
Backup and Recovery Subsystem
This subsystem provides facilities for logging transactions and database changes,
periodically making backup copies of the database, and recovering the database in the
event of some type of failure.
Application Development Subsystem
This subsystem provides facilities that allow end users and programmers to develop
complete database applications.
Security Management Subsystem
This subsystem provides facilities to protect and control access to the database and
repository.
6.3 Concurrency Control
Concurrency control is concerned with preventing loss of data integrity due to interference
between users in a multi-user environment.
6.3.1 Single-user versus Multi-user Systems
One criterion for classifying a database system is by the number of users who can use the
system concurrently. A DBMS is single-user if at most one user at a time can use the system
and is multi-user if many users can use the system concurrently.
In a multi-user DBMS, the stored data items are the primary resources that may be accessed
concurrently by user programs, which are constantly retrieving and modifying the database.
The execution of a program that accesses or changes the contents of the database is called a
transaction. The transactions submitted by the various users may execute concurrently and
may access and update the same database records. If this concurrent execution is uncontrolled, it
may lead to problems such as an inconsistent database.
6.3.2 Why Concurrency Control is Needed?
Problems
The lost update problem
Consider the situation illustrated in the diagram below, which is intended to be read
as follows:
Time   Transaction A                    Transaction B
----   ------------------------------   ------------------------------
t1     1. Read account balance
          (Balance = $1,000)
t2                                      1. Read account balance
                                           (Balance = $1,000)
t3     2. Update record
          (withdraw $200 and the
          balance is $800)
t4                                      2. Update record
                                           (withdraw $300 and the
                                           balance is $700)
                                        ERROR!
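Transaction A retrieves some record R at time t1;
Transaction B retrieves that same record R at time t2;
Transaction A updates the record at time t3, on the basis of the values seen at time t1;
Transaction B updates the same record at time t4, on the basis of the values seen at time t2.
Transaction A's update is therefore lost at time t4, because transaction B overwrites it
without even looking at it. The account should hold $500 after both withdrawals, but it
holds $700: the effect of A's update has been lost due to interference between the
transactions.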
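The interleaving in the diagram can be replayed step by step. The following is only an illustrative sketch: the `account` dictionary and the variable names are hypothetical and stand in for a real database record.

```python
# Hypothetical replay of the lost update interleaving (t1 to t4).
account = {"balance": 1000}

# t1: transaction A reads the balance.
a_seen = account["balance"]
# t2: transaction B reads the same balance.
b_seen = account["balance"]
# t3: transaction A withdraws $200 and writes its result.
account["balance"] = a_seen - 200
# t4: transaction B withdraws $300 from its stale copy and overwrites A's write.
account["balance"] = b_seen - 300

lost_update_balance = account["balance"]  # A's withdrawal has vanished
serial_balance = 1000 - 200 - 300         # result of running A, then B

print(lost_update_balance, serial_balance)
```

The first number printed is the balance left by the interleaved execution and the second is the balance a serial execution would leave.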
The temporary update problem
This occurs when one transaction updates a database item and then the transaction
fails for some reason. The updated item is accessed by another transaction before it is
changed back to its original value. For example, T1 updates item X and then fails before
completion, so the system must change X back to its original value. Before it does so,
transaction T2 reads the "temporary" value of X, which will not be recorded
permanently in the database because of the failure of T1.
Transaction 1 (T1)           Transaction 2 (T2)
--------------------------   --------------------------
Read-item (X)
X = X - N
Write-item (X)
                             Read-item (X)
                             X = X + M
                             Write-item (X)
Read-item
Transaction T1 fails and must change the value of X back to its old value;
but meanwhile, T2 has read the “temporary” incorrect value of X.
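The same sequence can be traced with concrete numbers. This is a sketch under assumed values: `N`, `M` and the starting value of X are made up, and the `db` dictionary stands in for the stored item.

```python
# Hypothetical trace of the temporary update problem.
N, M = 5, 3
db = {"X": 50}
old_x = db["X"]           # original value, needed if T1 has to be rolled back

# T1: read X, subtract N, write X (T1 has not completed yet).
db["X"] = db["X"] - N     # X is temporarily 45

# T2: reads the temporary value, adds M, writes X.
t2_seen = db["X"]
db["X"] = t2_seen + M

dirty_result = db["X"]        # built on a value that T1 never committed
correct_result = old_x + M    # what T2 would produce if T1 had never run

print(dirty_result, correct_result)
```

When T1 later fails, its subtraction of N should leave no trace, yet the value T2 wrote still reflects it.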
Inconsistent Analysis Problem
Another problem occurs when one transaction is calculating an aggregate summary
function on a number of records while other transactions are updating some of these
records. The aggregate function may calculate some values before they are updated
and others after they are updated. For example, suppose a transaction T3 is
calculating the total number of reservations on all the flights while transaction
T1 is executing. If the interleaving of operations shown in the figure below occurs, the
result of T3 will be off by an amount N, because T3 reads the value of X after N seats
are subtracted from it and reads the value of Y before those N seats are added to it.
Transaction 1 (T1)           Transaction 3 (T3)
--------------------------   --------------------------
                             Sum = 0
                             Read-item (A)
                             Sum = Sum + A
Read-item (X)
X = X - N
Write-item (X)
                             Read-item (X)
                             Sum = Sum + X
                             Read-item (Y)
                             Sum = Sum + Y
Read-item (Y)
Y = Y + N
Write-item (Y)
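The size of the error can be checked with a small trace of the interleaving. The seat counts below are invented for illustration; only the names A, X, Y and N come from the figure.

```python
# Hypothetical seat counts for three flights.
N = 5
seats = {"A": 30, "X": 80, "Y": 40}
true_total = sum(seats.values())   # moving N seats from X to Y never changes this

total = 0
total += seats["A"]    # T3 reads A
seats["X"] -= N        # T1 subtracts N from X and writes it
total += seats["X"]    # T3 reads X after the subtraction
total += seats["Y"]    # T3 reads Y before the addition
seats["Y"] += N        # T1 adds N to Y and writes it

print(total, true_total, true_total - total)
```

The last number printed is the amount by which the aggregate is wrong, which equals N.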
6.3.3 Basic Approaches to Concurrency Control
In short, concurrency control is concerned with preventing loss of data integrity due to
interference between users in a multi-user environment.
There are two basic approaches to concurrency control: a pessimistic approach and an
optimistic approach.
Locking (Pessimistic Approach)
Locking mechanisms are the most common type of concurrency control mechanism. With
locking, any data that is retrieved by a user for updating must be locked, or denied to
other users, until the update is completed.
Locking data is much like checking a book out of the library. It is unavailable to others
until it is returned by the borrower.
There are many types of lock. Two common types are described below:
Shared locks
Shared locks (also called S locks, or read locks) allow other transactions to read (but not
update) a record (or other resource).
A transaction should place a shared lock on a record when it will only read (but not
update) that record. A shared lock prevents another user from placing an exclusive
lock on that record.
Exclusive locks
Exclusive locks (also called X locks, or write locks) prevent another transaction from
reading (and therefore updating) a record until it is unlocked.
A transaction should place an exclusive lock on a record when it is about to update
that record. An exclusive lock prevents another user from placing any type of lock
on that record.
                  Shared Lock (S lock)    Exclusive Lock (X lock)
Shared Lock       True                    False
Exclusive Lock    False                   False
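The table above can be read as a lookup: a requested lock is granted only if it is compatible with every lock already held on the record. A minimal sketch follows, in which `COMPATIBLE` and `can_grant` are hypothetical names and not part of any particular DBMS.

```python
# Compatibility matrix from the table: COMPATIBLE[held][requested].
COMPATIBLE = {
    "S": {"S": True, "X": False},
    "X": {"S": False, "X": False},
}

def can_grant(requested, held_locks):
    """Return True if `requested` ('S' or 'X') is compatible with every
    lock that other transactions already hold on the record."""
    return all(COMPATIBLE[held][requested] for held in held_locks)

print(can_grant("S", ["S", "S"]))  # several readers may share a record
print(can_grant("X", ["S"]))       # a writer must wait for readers
print(can_grant("S", ["X"]))       # a reader must wait for a writer
```

A record with no locks on it accepts either type of request, because there is nothing to conflict with.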
Deadlock
Locking (say at the record level) solves the problem of erroneous updates but may
lead to another problem, called deadlock. This may result when two (or more) transactions
have locked a common resource and each must wait for the other to unlock the
resource.
For example, user A has locked record X and user B has locked record Y. User A
then requests record Y and user B requests record X. Both requests are denied, since
the requested records are already locked. Thus, unless the DBMS intervenes, both
users will wait indefinitely.
Time   User A                   User B
----   ----------------------   ----------------------
t1     1. Lock record X
t2                              1. Lock record Y
t3     2. Request record Y      2. Request record X
t4     (Wait for Y)             (Wait for X)
Managing deadlock
There are two basic ways to resolve deadlocks:
- Deadlock prevention
When deadlock prevention is employed, user programs must lock all records
they will require at the beginning of a transaction (rather than one at a
time).
- Deadlock resolution
This allows deadlocks to occur but builds mechanisms into the DBMS for
detecting and breaking the deadlocks.
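One common way to detect a deadlock is to record who is waiting for whom and look for a cycle. The sketch below assumes a simplified wait-for structure in which each waiting user waits for exactly one other user; `find_deadlock` and the dictionary layout are hypothetical and not taken from any specific DBMS.

```python
def find_deadlock(waits_for):
    """Return the users that form a cycle in the wait-for graph, or None.

    `waits_for` maps each waiting user to the user holding the lock it
    wants (a simplified, hypothetical structure).
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of waits until it ends or revisits a user.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            return path[path.index(current):]
    return None

# User A waits for B (who holds record Y); user B waits for A (who holds X).
print(find_deadlock({"A": "B", "B": "A"}))
# No cycle: A waits for B, but B is not waiting for anyone.
print(find_deadlock({"A": "B"}))
```

Once a cycle is found, the deadlock can be broken by aborting one of the transactions in it so that the others can proceed.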
Optimistic approach (Versioning)
This approach assumes that most of the time other users do not want the same record, or if they do,
they only want to read the record. With versioning, there is no form of locking. Each
transaction is restricted to a view of the database as of the time the transaction started. When a
transaction modifies a record, the DBMS creates a new record version instead of
overwriting the old record. If there is no conflict, the user's changes are used to update
the central database.
However, suppose there is a conflict, such as two users having made conflicting changes to
their private copies of the database. Then the changes made by one of the users are committed
to the database (committed means made permanent after "successful" completion). The other user must be
told that there was a conflict and that his work cannot be incorporated into the central
database. The rejected update must be repeated later.
The main advantage of versioning over locking is improved performance, as read-only
transactions can run concurrently with updating transactions.
For example, user A reads the record containing the account balance, successfully withdraws $200, and
the new balance of $800 is posted to the account with a COMMIT statement. Meanwhile, user
B has also read the account record and requested a withdrawal, which is posted to her local
version of the account record. When her transaction attempts to COMMIT, it
discovers the update conflict and the transaction is aborted. The transaction can be
restarted later with the correct balance of $800.
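The account example can be sketched with a version counter that is checked at commit time. This is a simplified illustration of conflict detection only, not a full multiversion store (old record versions are not kept); the `VersionedRecord` class, the opening balance of $1,000, and B's withdrawal amount are assumptions.

```python
class VersionedRecord:
    """A record with a version counter, used for optimistic commits."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # A transaction remembers the version its work is based on.
        return self.value, self.version

    def commit(self, new_value, based_on_version):
        # Reject the commit if another transaction committed in between.
        if based_on_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


account = VersionedRecord(1000)
a_value, a_version = account.read()
b_value, b_version = account.read()

a_ok = account.commit(a_value - 200, a_version)  # A commits; balance is 800
b_ok = account.commit(b_value - 300, b_version)  # B conflicts and is aborted

# B restarts with the committed balance of $800.
b_value, b_version = account.read()
b_retry_ok = account.commit(b_value - 300, b_version)

print(a_ok, b_ok, b_retry_ok, account.value)
```

No lock is ever taken: the cost of a conflict is paid only by the transaction that loses, which has to redo its work.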
6.3.4 Why Recovery Is Needed?
Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either (a) all operations in the transaction are completed successfully and
their effect is recorded permanently in the database or (b) the transaction has no effect on the
database or any other transactions. The DBMS must not permit some operations of a
transaction T to be applied to the database while other operations of T are not. However, this can
happen if a transaction fails after executing some of its operations but before executing all of
them.
Types of Failures
There are several possible reasons for a transaction to fail in the middle of execution. For
example:
Computer failure (system crash): A hardware or software error occurs in the
computer system during transaction execution. If the hardware crashes, the contents
of the computer's internal memory may be lost.
Transaction or system error: Some operation in the transaction may cause it to fail,
such as integer overflow or division by zero.
Disk failure: Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash. This may happen during a
read or write operation of the transaction.
Physical problems and catastrophes: This is an endless list that includes power or air
conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, etc.
6.4 Database Recovery
Database recovery means restoring a database quickly and accurately after loss or damage.
The basic recovery facilities include:
A backup facility, which provides periodic backup copies of the entire database. The copy
should be stored in a secure location where it is protected from loss or damage.
Journalizing facilities, which maintain an audit trail of transactions and database changes.
There are two journals: the transaction log and the database change log.
The transaction log contains a record of the essential data for each transaction that is
processed against the database.
The database change log contains before- and after-images of records that have been
modified by transactions.
[Figure: the Database Management System with the Database (current), the Transaction log, the Database change log, and the Database (backup)]
A checkpoint facility, by which the DBMS periodically suspends all processing and
synchronizes its files and journals. Checkpoints should be taken frequently (say, several
times an hour). When failures do occur, it is often possible to resume processing from
the most recent checkpoint. Thus, only a few minutes of processing work must be
repeated. Consider the following example, which shows the possible timings of
transactions in relation to the time of the crash and the time of the last checkpoint.
[Figure: timelines of transactions T1 to T5 in relation to the time of the last checkpoint and the time of the crash]
Transaction T1 was completed before the last checkpoint, so it will not be listed in
the checkpoint log record and will have no records in the log subsequent to the last
checkpoint.
Transaction T2 was active at the time of the last checkpoint, so it is listed in the
checkpoint record. It completed before the crash, so it also has a COMMIT or ABORT
record in the log subsequent to the last checkpoint.
Transaction T3 is also listed in the checkpoint record, but it had not completed by
the time of the failure, so it has no COMMIT or ABORT record in the log.
Transaction T4 was executed fully between the time of the last checkpoint and the
crash, so it has both a BEGIN TRANSACTION and a COMMIT or ABORT record
in the log, subsequent to the last checkpoint record.
Transaction T5 was begun after the checkpoint, but not completed. It therefore
has a BEGIN TRANSACTION, but no COMMIT or ABORT record, in the log
subsequent to the last checkpoint.
Therefore, at the time of the crash, the effects of transactions T3 and T5 have to be undone, since
they are incomplete transactions. Transactions of type T1 present no problem, since they are
known to have completed and their updates are known to have been consolidated on the
database at the time of the last checkpoint. Transactions of type T2 and T4 normally
present no problem, but it is not known whether all the necessary updates have been
carried out on the database (some changed pages may still have been in the buffers and
consequently been lost). Thus the system has to check the log for each such transaction:
if the transaction committed, all of its updates are redone; if it did not, all of its
updates are undone.
In short, recovery means redoing the effects of transactions that committed before the
crash but after the last checkpoint, as well as undoing the effects of the transactions that
were incomplete at the point of the crash.
A recovery manager, which allows the DBMS to restore the database to a correct condition and
restart processing transactions.
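The redo and undo decision for T1 to T5 can be sketched as a scan of the log written after the last checkpoint. The log format, the record names and the function `plan_recovery` are hypothetical simplifications; a real recovery manager also uses the before- and after-images in the database change log.

```python
def plan_recovery(active_at_checkpoint, log_after_checkpoint):
    """Split transactions into those to redo and those to undo.

    active_at_checkpoint: transaction ids listed in the checkpoint record.
    log_after_checkpoint: (transaction id, record type) pairs written after
    the checkpoint, in order. Record types are 'BEGIN', 'COMMIT' and
    'ABORT' (a simplified, hypothetical log format).
    """
    undo = set(active_at_checkpoint)
    redo = set()
    for txn, record in log_after_checkpoint:
        if record == "BEGIN":
            undo.add(txn)
        elif record == "COMMIT":
            undo.discard(txn)
            redo.add(txn)
        # An ABORT record leaves the transaction in the undo set.
    return sorted(redo), sorted(undo)

# T1 finished before the checkpoint, so it appears nowhere below.
redo, undo = plan_recovery(
    ["T2", "T3"],
    [("T2", "COMMIT"), ("T4", "BEGIN"), ("T4", "COMMIT"), ("T5", "BEGIN")],
)
print(redo, undo)
```

In this run, T2 and T4 are assumed to have committed, so they land in the redo list, while T3 and T5 land in the undo list.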
6.4 Database Security
Database security is defined as protection of the database against accidental or intentional loss,
destruction or misuse. Data administration uses several facilities provided by data
management software in carrying out these functions. These include:
Views or subschemas, which help to restrict user views of the database. For example:
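-- Expose only the item name and order number of items that appear on an order.
-- ORDER is a reserved word in SQL, so the table name is quoted.
CREATE VIEW ITEM_ORDER
AS SELECT ITEM.ITEM_NAME, ITEM.ORDER_NO
FROM ITEM, "ORDER"
WHERE ITEM.ORDER_NO = "ORDER".ORDER_NO;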
Authorization rules, which identify users and restrict the actions they may take against the
database. For example, the use of passwords.
User-defined procedures, which define additional constraints or limitations on using the
database. For example, users implement their own password login procedure on their own PC.
Encryption procedures, which encode data in an unrecognizable form. For example,
encryption is used in electronic funds transfer systems. The encryption procedures should also
include a decoding facility.
Authentication schemes, which positively identify a person attempting to gain access to a
database.
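Authorization rules can be pictured as a table that lists, for each user and database object, the actions that user may take. The sketch below is illustrative only: the users, the table name and the function `is_authorized` are hypothetical.

```python
# Hypothetical authorization rules: (user, table) -> permitted actions.
AUTHORIZATION_RULES = {
    ("clerk", "ORDER"): {"read"},
    ("manager", "ORDER"): {"read", "insert", "update"},
}

def is_authorized(user, table, action):
    """Return True only if a rule explicitly grants the action."""
    return action in AUTHORIZATION_RULES.get((user, table), set())

print(is_authorized("clerk", "ORDER", "read"))
print(is_authorized("clerk", "ORDER", "update"))
print(is_authorized("visitor", "ORDER", "read"))
```

Anything not explicitly granted is refused, so an unknown user or an unlisted action is denied by default.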
6.5 Review Questions
1. Contrast the following terms:
a. data administration vs database administration
b. deadlock prevention vs deadlock resolution
c. optimistic concurrency control vs pessimistic concurrency control
d. shared locks vs exclusive locks
2. Describe the DBMS facilities that are required for database backup and recovery.
3. For each of the situations described below, indicate which of the following security
measures is most appropriate:
i. authorization rules
ii. encryption
iii. authentication schemes
a. A national brokerage firm uses a simple password system to protect its
database but finds it needs a more comprehensive system to grant
different privileges (such as read versus create or update) to different
users.
b. A manufacturing firm uses a simple password system to protect its
database but finds it needs a more comprehensive system to grant
different privileges (such as read versus create or update) to different
users.
c. A university has experienced considerable difficulty with unauthorized users
who access files and databases by appropriating passwords from legitimate
users.