PRESENTATION ON DBMS & SQL
SCHOOL OF MANAGEMENT NIT-W
2) FOUNDER, VERSION & FEATURES
3) ADVANTAGES OF DBMS AND SQL
4) DATABASE & DBMS
5) KEYS, NORMALISATION
6) ER MODELLING
7) DATA WAREHOUSING CONCEPTS
8) SQL BASICS
9) SQL FORMAT AND QUERIES
10) DATA WAREHOUSING TOOLS
HISTORY OF DBMS & SQL
• Ancient Times: Human beings began to store information very long ago.
In ancient times, elaborate database systems were developed by
government offices, libraries, hospitals, and business organizations, and
some of the basic principles of these systems are still being used today.
• 1960s: Computerized databases started in the 1960s, when the use of
computers became a more cost-effective option for private organizations.
There were two popular data models in this decade: a network model called
CODASYL and a hierarchical model called IMS. One database system that
proved to be a commercial success was the SABRE system that was used
by IBM to help American Airlines manage its reservations data.
• 1970 to 1972: E.F. Codd published an important paper to propose the use
of a relational database model, and his ideas changed the way people
thought about databases. In his model, the database’s schema, or logical
organization, is disconnected from physical information storage, and this
became the standard principle for database systems.
• 1970s: Two major relational database system prototypes were created
between the years 1974 and 1977, and they were Ingres, which was
developed at UC Berkeley (UCB), and System R, created at IBM San Jose. Ingres used a
query language known as QUEL, and it led to the creation of systems such
as Ingres Corp., MS SQL Server, Sybase, Wang’s PACE, and Britton-Lee.
On the other hand, System R used the SEQUEL query language, and it
contributed to the development of SQL/DS, DB2, Allbase, Oracle, and
• 1976: A new database model called Entity-Relationship, or ER, was
proposed by P. Chen this year. This model made it possible for designers to
focus on data application, instead of logical table structure.
• 1980s: Structured Query Language, or SQL, became the standard query
language. DB2 became the flagship database product for IBM, and the
introduction of the IBM PC resulted in the establishment of many new
database companies and the development of products such as PARADOX,
RBASE 5000, RIM, Dbase III and IV, OS/2 Database Manager, and
• Early 1990s: Around this time, new client tools for application development were
released, and these included the Oracle Developer, PowerBuilder, VB, and others.
A number of tools for personal productivity, such as ODBC and Excel/Access,
were also developed. Prototypes for Object Database Management Systems, or
ODBMS, were created in the early 1990s.
• Late 1990s: Increased investment in online businesses resulted in a rise in
demand for Internet database connectors, such as Front Page, Active Server
Pages, Java Servlets, Dreamweaver, ColdFusion, Enterprise Java Beans, and
Oracle Developer 2000. The use of CGI, GCC, MySQL, Apache, and other systems
brought open source solutions to the Internet. With the increased use of
point-of-sale technology, online transaction processing and online analytic
processing began to come of age.
• 2000s: Although the Internet industry experienced a decline in the early 2000s,
database applications continued to grow. New interactive applications were
developed for PDAs, point-of-sale transactions, and consolidation of vendors.
Presently, the three leading database companies in the western world are
Microsoft, IBM, and Oracle.
FOUNDERS OF DBMS & SQL
DBMS – The relational model, first proposed in 1970 by Edgar F. Codd,
departed from the earlier navigational (hierarchical and network) tradition by
insisting that applications should search for data by content, rather than by
following links.
SQL - SQL was initially developed at IBM by Donald D.
Chamberlin and Raymond F. Boyce in the early 1970s. This version,
initially called SEQUEL (Structured English QUEry Language), was
designed to manipulate and retrieve data stored in IBM's original
quasi-relational database management system, System R, which a group at IBM
San Jose Research Laboratory had developed during the 1970s. The
acronym SEQUEL was later changed to SQL because "SEQUEL" was
a trademark of the UK-based Hawker Siddeley aircraft company.
SPECIAL FEATURES OF SQL
• SQL is an ANSI and ISO standard computer language
for creating and manipulating databases.
• SQL allows the user to create, update, delete, and
retrieve data from a database.
• SQL is very simple and easy to learn.
• SQL works with database programs like DB2, Oracle,
MS Access, Sybase, MS SQL Server, etc.
ADVANTAGES OF DBMS
• Concurrent Use
• Structured and Described Data
• Separation of Data and Applications
• Data Integrity
• Data Persistence
• Data Views
ADVANTAGES OF SQL
• High Speed: SQL Queries can be used to retrieve large amounts of records
from a database quickly and efficiently.
• Well-Defined Standards Exist: SQL databases use a long-established
standard, which has been adopted by ANSI and ISO. Non-SQL
databases do not adhere to any clear standard.
• No Coding Required: Using standard SQL, it is easier to manage
database systems without having to write a substantial amount of code.
• Emergence of ORDBMS: Previously, SQL databases were synonymous
with relational databases. With the emergence of object-oriented DBMS,
object storage capabilities are extended to relational databases.
A database is a shared collection of logically related data designed to
meet the information needs of an organization. "Logically related" means that all
the data are related to one another.
Example - Database of hospital contain record of
In short a database is an organized collection of data.
A database management system (DBMS) is the software that
allows a computer to perform database functions of storing,
retrieving, adding, deleting, and modifying data.
EXAMPLES OF DBMS
1. Microsoft Access
3. Microsoft SQL Server
5. FileMaker Pro
A hierarchical database is a design that uses a one-to-many
relationship for data elements. Hierarchical database models
use a tree structure that links a number of disparate elements to
one "owner" or "parent".
Network databases are similar to hierarchical databases by also
having a hierarchical structure. There are a few key
differences, however. Instead of looking like a tree, a network
database looks more like a cobweb or interconnected network
of records. In network databases, children are called members
and parents are called owners. The most important difference
is that each child or member can have more than one parent (or
RDBMS is the most important database system used in the
software industry today. It is used to establish
relationships between two database objects or
two database tables.
The Relationship may be –
One - One
One - Many
Many - One
Many – Many
WHAT IS A TABLE IN RDBMS ?
In a relational database, a table is a collection of data elements
organised in terms of rows and columns. A table is also
considered a convenient representation of relations. But a
table can have duplicate tuples while a true relation cannot
have duplicate tuples. A table is the simplest form of data
WHAT IS A RECORD IN RDBMS?
A single entry in a table is called a Record or Row. A Record in
a table represents a set of related data.
WHAT IS A FIELD IN RDBMS ?
A table consists of several records (rows); each record can be
broken into several smaller entities known as Fields.
WHAT IS A COLUMN IN RDBMS ?
In a relational table, a column is a set of values of a particular
type. The term Attribute is also used to represent a column.
FLAT FILE DBMS
In a flat file database management system, the user specifies the
data attributes for one table at a time, storing data
independently from the application.
An object database (also object-oriented database management
system) is a database management system in which
information is represented in the form of objects as used
in object-oriented programming. Object databases are different
from relational databases, which are table-oriented.
CONCEPT OF PRIMARY KEY & FOREIGN KEY
A primary key uniquely identifies a record in the table. We can
have only one primary key in a table.
A foreign key is a field in the table that is the primary key in another
table. We can have more than one foreign key in a table.
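As a rough illustration of the two key types, the sketch below uses Python's built-in sqlite3 module. The advisor and student tables and their column names are assumptions made up for this example; they are not part of the slides. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE advisor (
        adv_id   TEXT PRIMARY KEY,   -- primary key: unique per advisor
        adv_name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE student (
        student_no INTEGER PRIMARY KEY,
        adv_id     TEXT NOT NULL REFERENCES advisor(adv_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO advisor VALUES ('123A', 'James')")
conn.execute("INSERT INTO student VALUES (123, '123A')")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second advisor with the same primary key value is rejected.
duplicate_key_ok = try_insert("INSERT INTO advisor VALUES ('123A', 'Smith')")
# A student pointing at an advisor that does not exist is rejected.
dangling_ref_ok = try_insert("INSERT INTO student VALUES (124, '999Z')")
print(duplicate_key_ok, dangling_ref_ok)
```

Both inserts fail with an integrity error: the first violates primary key uniqueness, the second refers to a primary key value that is missing from the other table.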
Database normalisation is a technique for organizing the data in
the database. Normalization is a systematic approach to
decomposing tables to eliminate data redundancy and
undesirable characteristics like insertion, update and deletion
anomalies. It is a multi-step process that puts data into tabular
form by removing duplicated data from the relation tables.
• First Normal Form
• Second Normal Form
• Third Normal Form
The requirements to satisfy the 1st NF:
– Each table has a primary key: minimal set of attributes
which can uniquely identify a record
– The values in each column of a table are atomic (No multi-
value attributes allowed).
– There are no repeating groups: two columns do not store
similar information in the same table.
1st NORMAL FORM EXAMPLE
Un-normalized Students table:
Student#  AdvID  AdvName  AdvRoom  Class1  Class2
123       123A   James    555      102-8   104-9
124       123B   Smith    467      209-0   102-8
Normalized Students table:
Student#  AdvID  AdvName  AdvRoom  Class#
123       123A   James    555      102-8
123       123A   James    555      104-9
124       123B   Smith    467      209-0
124       123B   Smith    467      102-8
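The step from the un-normalized table to the normalized one can be sketched in a few lines of Python. The function name to_first_normal_form and its parameters are hypothetical, chosen only for this illustration; the data is the example above.

```python
# Un-normalized rows from the example above: Class1 and Class2 form a repeating group.
unnormalized = [
    {"Student#": 123, "AdvID": "123A", "AdvName": "James", "AdvRoom": 555,
     "Class1": "102-8", "Class2": "104-9"},
    {"Student#": 124, "AdvID": "123B", "AdvName": "Smith", "AdvRoom": 467,
     "Class1": "209-0", "Class2": "102-8"},
]


def to_first_normal_form(rows, repeating=("Class1", "Class2"), target="Class#"):
    """Emit one row per value of the repeating group, under a single column."""
    result = []
    for row in rows:
        # Columns that are not part of the repeating group are copied unchanged.
        base = {k: v for k, v in row.items() if k not in repeating}
        for column in repeating:
            result.append({**base, target: row[column]})
    return result


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

Each student now appears once per class, and the two similar columns are replaced by a single Class# column.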
2nd NORMAL FORM REQUIREMENTS
The requirements to satisfy the 2nd NF:
– All requirements for 1st NF must be met.
– Redundant data across multiple rows of a table must be
moved to a separate table. The resulting tables must be
related to each other by use of a foreign key.
2nd NORMAL FORM EXAMPLE
Student# AdvID AdvName AdvRoom
123 123A James 555
124 123B Smith 467
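A minimal sketch of this decomposition, assuming the first-normal-form rows from the previous example. The Registration table is an assumption: the slides show only the Students table, so its name and shape here are hypothetical.

```python
# Rows in first normal form: (Student#, AdvID, AdvName, AdvRoom, Class#).
first_nf = [
    (123, "123A", "James", 555, "102-8"),
    (123, "123A", "James", 555, "104-9"),
    (124, "123B", "Smith", 467, "209-0"),
    (124, "123B", "Smith", 467, "102-8"),
]

# Advisor details repeat for every class a student takes, so keep them once
# per student in a Students table keyed by Student#.
students = sorted(
    {(student, adv_id, adv_name, room) for student, adv_id, adv_name, room, _ in first_nf}
)

# Hypothetical Registration table: Student# acts as a foreign key to Students.
registration = [(student, class_no) for student, _, _, _, class_no in first_nf]

print(students)
print(registration)
```

The advisor data is now stored once per student, and the class enrolments refer back to it through Student#.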
The entity-relationship model (or ER model) is a way of
graphically representing the logical relationships of entities (or
objects) in order to create a database. The ER model was first
proposed by Peter Pin-Shan Chen of Massachusetts Institute of
Technology (MIT) in the 1970s.
COMPONENTS OF ERD
An ERD typically consists of four different graphical components:
Express the number of entities to which another entity can be
associated via a relationship set.
• Cardinality Constraints - the number of instances of one entity
that can or must be associated with each instance of another
• Minimum Cardinality
– If zero, then optional
– If one or more, then mandatory
• Maximum Cardinality
– The maximum number
CARDINALITY CONSTRAINTS (CONTD.)
For a binary relationship set the mapping cardinality must be one of
the following types:
– One to one
• A Manager heads one Department and vice versa
– One to many ( or many to one)
• An Employee Works in one Department or One Department has
– Many to many
• A Teacher Teaches many Students and A student is taught by
STEPS TO CREATE AN ERD
• Identify the entity
• Identify the entity's attributes
• Identify the Primary Keys
• Identify the relation between entities
• Identify the Cardinality constraint
• Draw the ERD
• Check the ERD
DEVELOPING AN ERD
The process has ten steps:
1. Identify Entities
2. Find Relationships
3. Draw Rough ERD
4. Fill in Cardinality
5. Define Primary Keys
6. Draw Key-Based ERD
7. Identify Attributes
8. Map Attributes
9. Draw fully attributed ERD
10. Check Results
A SIMPLE EXAMPLE
A company has several departments. Each department has a
supervisor and at least one employee. Employees must be assigned
to at least one, but possibly more departments. At least one
employee is assigned to a project, but an employee may be on
vacation and not assigned to any projects. The important data fields
are the names of the departments, projects, supervisors and
employees, as well as the supervisor and employee number and a
unique project number.
• One approach to this is to work through the information and highlight
those words which you think correspond to entities.
• A company has several departments. Each department has a
supervisor and at least one employee. Employees must be assigned to
at least one, but possibly more departments. At least one employee is
assigned to a project, but an employee may be on vacation and not
assigned to any projects. The important data fields are the names of
the departments, projects, supervisors and employees, as well as the
supervisor and employee number and a unique project number.
• A true entity should have more than one instance
• Aim is to identify the associations, the connections between
pairs of entities.
• A simple approach to do this is using a relationship matrix
(table) that has rows and columns for each of the identified
FIND RELATIONSHIPS (Contd.)
Go through each cell and decide whether or not there is an
association. For example, the first cell on the second row is
used to indicate if there is a relationship between the entity
"Employee" and the entity "Department".
Names placed in the cells are meant to capture/describe the
relationships. So you can use them like this
• A Department is assigned an employee
• A Department is run by a supervisor
• An employee belongs to a department
• An employee works on a project
• A supervisor runs a department
• A project uses an employee
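The relationship matrix can be sketched as a small Python structure. The relationship names come from the list above; the entity order is an assumption, chosen so that the first cell of the second row pairs "Employee" with "Department" as described earlier, and the describe helper is hypothetical.

```python
entities = ["Department", "Employee", "Supervisor", "Project"]

# Relationship names taken from the list above, keyed by (row entity, column entity).
relationships = {
    ("Department", "Employee"): "is assigned",
    ("Department", "Supervisor"): "is run by",
    ("Employee", "Department"): "belongs to",
    ("Employee", "Project"): "works on",
    ("Supervisor", "Department"): "runs",
    ("Project", "Employee"): "uses",
}

# One row and one column per entity; an empty cell means no association.
matrix = [[relationships.get((row, col), "") for col in entities] for row in entities]


def describe(row, col):
    """Read one cell back as a phrase, or None when the cell is empty."""
    name = relationships.get((row, col))
    return f"{row} {name} {col}" if name else None


for row_name, cells in zip(entities, matrix):
    print(f"{row_name:<11}", cells)
```

Empty cells, such as Project against Supervisor, mark pairs of entities with no direct association.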
DRAW ROUGH ERD
Draw a diagram and:
• Place all the entities in rectangles
• Use diamonds and lines to represent the relationships
• General Examples
FILL IN CARDINALITY
– Each department has one supervisor.
– Each supervisor has one department.
– Each employee can belong to one or more departments
– Each department must have one or more employees
– Each project must have one or more employees
– Each employee can have 0 or more projects.
FILL IN CARDINALITY (Contd.)
The cardinality of a relationship can only have one of the following values:
– One and only one
– One or more
– Zero or more
– Zero or one
Each instance of A is related to a minimum of
zero and a maximum of one instance of B
Each instance of B is related to a minimum of
one and a maximum of one instance of A
Each instance of A is related to a minimum of
one and a maximum of many instances of B
Each instance of B is related to a minimum of
zero and a maximum of many instances of A
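One way to make the minimum/maximum reading concrete is the small sketch below. The function names and the MANY sentinel are hypothetical, and restricting the minimum to 0 or 1 is an assumption that matches the four labels listed above.

```python
MANY = None  # sentinel for an unbounded maximum


def cardinality_label(minimum, maximum):
    """Map a (minimum, maximum) pair to one of the four cardinality labels."""
    if minimum not in (0, 1) or maximum not in (1, MANY):
        raise ValueError("expected minimum in {0, 1} and maximum of 1 or MANY")
    if maximum == 1:
        return "One and only one" if minimum == 1 else "Zero or one"
    return "One or more" if minimum == 1 else "Zero or more"


def is_mandatory(minimum):
    """A minimum of zero means optional; one or more means mandatory."""
    return minimum >= 1


# The four A/B readings above, as (minimum, maximum) pairs.
for pair in [(0, 1), (1, 1), (1, MANY), (0, MANY)]:
    print(pair, cardinality_label(*pair), "mandatory" if is_mandatory(pair[0]) else "optional")
```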
• In this step we try to identify and name all the attributes essential
to the system we are studying without trying to match them to
• The best way to do this is to study the forms, files and reports
currently kept by the users of the system and circle each data item
on the paper copy.
• Cross out those which will not be transferred to the new system,
extraneous items such as signatures, and constant information
which is the same for all instances of the form (e.g. your company
name and address). The remaining circled items should represent
the attributes you need. You should always verify these with your
system users. (Sometimes forms or reports are out of date.)
• The only attributes indicated are the names of the departments,
projects, supervisors and employees, as well as the supervisor and
employee NUMBER and a unique project number.
• For each attribute we need to match it with exactly one entity.
Often it seems like an attribute should go with more than one
entity (e.g. Name). In this case you need to add a modifier to the
attribute name to make it unique (e.g. Customer Name, Employee
Name, etc.) or determine which entity an attribute "best" describes.
• If you have attributes left over without corresponding entities, you
may have missed an entity and its corresponding relationships.
Identify these missed entities and add them to the relationship
CHECK ERD RESULTS
• Look at your diagram from the point of view of a system owner or
user. Is everything clear?
• Check through the Cardinality pairs.
• Also, look over the list of attributes associated with each entity to
see if anything has been omitted.
An enterprise structured repository of subject-oriented, time-
variant, historical data used for information retrieval and
decision support. The data warehouse stores atomic and
OLTP(DATABASE) V/S DATA WAREHOUSE
OLTP (Database)                                   Warehouse (DSS)
Application Oriented like-                        Subject Oriented like -
Used to run business                              Used to analyze business
Detailed data                                     Summarized and refined
Current up to date                                Snapshot data
Isolated Data                                     Integrated Data
Clerical User                                     Knowledge User (Manager)
Few Records accessed at a time (tens)             Large volumes accessed at a time (millions)
Read/Update Access                                Mostly Read (Batch Update)
No data redundancy                                Redundancy present
Database Size 100 MB - 100 GB                     Database Size 100 GB - few terabytes
Transaction throughput is the performance metric  Query throughput is the performance metric
Thousands of users                                Hundreds of users
Managed in entirety                               Managed by subsets
EXAMPLE - Bank Accounts                           EXAMPLE - Company financial report of year
TO SUMMARIZE ...
• OLTP Systems are
used to “run” a business
• The Data Warehouse helps to
“optimize” the business
Data Mining is the process of extracting information from the
company's various databases and re-organizing it for other purposes.
For example: for targeted marketing, data mining can use data
on past promotional mailings to identify the targets most likely
to maximize the return on the company’s investment in future
• Customized by Department
• Source is departmentally
structured data warehouse
WHAT IS SQL ?
• SQL stands for Structured Query Language
• SQL lets you access and manipulate databases
• SQL is an ANSI (American National Standards Institute) standard
TYPES OF SQL LANGUAGES
• DDL (Data Definition Language)
• DML (Data Manipulation Language)
• DQL (Data Query Language)
• DCL (Data Control Language)
• Data administration commands
• Transactional control commands
DML is the abbreviation of Data Manipulation Language. It is used to
retrieve, store, modify, delete, insert and update data in a database.
Examples: SELECT, UPDATE, INSERT statements
DDL is the abbreviation of Data Definition Language. It is used to create and
modify the structure of database objects in a database.
Examples: CREATE, ALTER, DROP statements
DCL is the abbreviation of Data Control Language. It is used to create roles,
permissions, and referential integrity, and to control access to the
database by securing it.
Examples: GRANT, REVOKE statements
TCL is the abbreviation of Transactional Control Language. It is used to
manage different transactions occurring within a database.
Examples: COMMIT, ROLLBACK statements
Examples: COMMIT, ROLLBACK statements
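A small sketch of how DDL, DML and TCL statements interact, using Python's built-in sqlite3 module. The account table and its values are hypothetical. DCL is left out because SQLite does not support GRANT and REVOKE; those statements need a server database such as the ones named earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure. The table and column names are hypothetical.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# DML: insert a row, then make it permanent with a TCL COMMIT.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

# DML again, but this time undo the change with a TCL ROLLBACK.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()

# The rolled-back UPDATE left no trace; the committed INSERT is still there.
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```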
WHAT CAN SQL DO ?
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views
IMPORTANT SQL COMMANDS
• SELECT - extracts data from a database
• UPDATE - updates data in a database
• DELETE - deletes data from a database
• INSERT INTO - inserts new data into a database
• CREATE DATABASE - creates a new database
• ALTER DATABASE - modifies a database
• CREATE TABLE - creates a new table
• ALTER TABLE - modifies a table
• DROP TABLE - deletes a table
• CREATE INDEX - creates an index (search key)
• DROP INDEX - deletes an index
SOME BASIC SQL SYNTAX
1) SQL SELECT Statement
2) SQL DELETE Statement
DELETE FROM table_name
3) SQL UPDATE Statement
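The three statement forms named above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the employee table, its columns and its rows are invented for the illustration, and the general forms shown in the comments are standard SQL rather than text recovered from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "Sales"), (2, "Ravi", "Sales"), (3, "Meena", "HR")],
)

# 1) SELECT column(s) FROM table
names = [row[0] for row in conn.execute("SELECT name FROM employee ORDER BY emp_no")]

# 2) DELETE FROM table WHERE condition (without WHERE, every row is deleted)
deleted = conn.execute("DELETE FROM employee WHERE emp_no = 3").rowcount

# 3) UPDATE table SET column = value WHERE condition
updated = conn.execute(
    "UPDATE employee SET dept = 'Marketing' WHERE dept = 'Sales'"
).rowcount

remaining = conn.execute("SELECT emp_no, dept FROM employee ORDER BY emp_no").fetchall()
print(names, deleted, updated, remaining)
```

The rowcount attribute reports how many rows each DELETE or UPDATE touched, which is a quick check that the WHERE condition matched what was intended.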
In a relational database, data is stored in tables.
Each table consists of columns and rows. Each column is a field in a record, and
there is a column name associated with each column.
Each row represents one record. When we say how many records we have, we are
referring to the number of rows.
SELECT … FROM
SQL is structured similar to the English language. The basic command for retrieving
data from a database table is to SELECT data FROM a table. Not surprisingly, the
keywords "SELECT" and "FROM" make up the core of a SQL statement.
The syntax for "SELECT … FROM" is:
SELECT "COLUMN_NAME" FROM "TABLE_NAME"
Different ways of selecting data:
Select more than 1 column:
SELECT "COLUMN_NAME_1", "COLUMN_NAME_2" FROM "TABLE_NAME"
Select all columns:
SELECT * FROM "TABLE_NAME"
Select unique values:
SELECT DISTINCT "COLUMN_NAME" FROM "TABLE_NAME"
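The three ways of selecting data listed above can be checked against a tiny in-memory table. The store_sales table, its columns and its values are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample table; names and values are made up for illustration.
conn.execute("CREATE TABLE store_sales (store TEXT, city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?, ?)",
    [("North", "Pune", 1500), ("South", "Pune", 250), ("North", "Delhi", 300)],
)

# Select more than one column.
two_columns = conn.execute("SELECT store, amount FROM store_sales").fetchall()

# Select all columns with *.
all_columns = conn.execute("SELECT * FROM store_sales").fetchall()

# Select unique values: the two 'North' rows collapse into one.
unique_stores = conn.execute(
    "SELECT DISTINCT store FROM store_sales ORDER BY store"
).fetchall()

print(two_columns)
print(all_columns)
print(unique_stores)
```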
Sometimes we want to retrieve only a subset of the data. In those cases, we use the
WHERE keyword.
The syntax for "WHERE" is:
SELECT "COLUMN_NAME" FROM "TABLE_NAME" WHERE "CONDITION"
CONDITION represents how we want the data to be filtered.
When we want to list the results in a particular order (ascending or descending), we
use the ORDER BY keyword at the end of the SQL statement.
The syntax for "ORDER BY" is:
ORDER BY "COLUMN_NAME" [ASC | DESC]
SQL has built-in mathematical functions to allow us to perform mathematical
operations on the data. Common mathematical functions include AVG, COUNT, MAX, MIN and SUM.
To summarize, the syntax for GROUP BY is as follows:
GROUP BY "COLUMN_NAME_1"
ORDER OF SQL COMMANDS
A SELECT statement has the following order:
• SELECT … FROM
• GROUP BY
• ORDER BY
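A minimal sketch of a query that uses the clauses in this order, run through Python's built-in sqlite3 module. The store_sales table and its values are hypothetical. The WHERE clause sits between FROM and GROUP BY, as standard SQL requires, and SUM stands in for the aggregate functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for illustration.
conn.execute("CREATE TABLE store_sales (store TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?)",
    [("North", 1500), ("South", 250), ("North", 300), ("East", 700), ("South", 50)],
)

query = """
    SELECT store, SUM(amount) AS total   -- SELECT ... FROM
    FROM store_sales
    WHERE amount >= 100                  -- filter rows before grouping
    GROUP BY store                       -- one result row per store
    ORDER BY total DESC                  -- sort the grouped results
"""
totals = conn.execute(query).fetchall()
print(totals)
```

The sale of 50 is filtered out before grouping, so it does not count towards the South total.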
INFORMATICA – DATA WAREHOUSING
Informatica is a powerful, easy-to-use ETL tool that
supports all the steps of the Extraction, Transformation and
Load process. It has a simple visual interface, like forms in
Visual Basic. You just need to drag and drop different objects
(known as transformations) and design the process flow for data
extraction, transformation and load.
Main Components of Informatica Power Center
1. Repository: The repository is the heart of the Informatica tool.
It is a kind of data inventory where all the data related to
mappings, sources, targets, etc. is kept. All the client tools and
the Informatica Server fetch data from the repository.
2. Informatica PowerCenter Server: All executions take place
on the server. It makes physical connections to sources and targets, fetches
data, applies the transformations specified in the mapping and
loads the data into the target system.
3. Informatica PowerCenter Client Tools: These tools enable a
developer to define the transformation process (known as a mapping) in
Designer, define run-time properties for a mapping (known as
sessions) in Workflow Manager, monitor execution of sessions in
Workflow Monitor, and manage the repository in Repository Manager,
which is useful for administrators.