Reading: Chapter 5. Management Information Systems, James O'Brien

CHAPTER 5
Module II: Information Technologies

[Chapter-opening diagram: the text's framework of Management Challenges, Business
Applications, Information Technologies, Development Processes, and Foundation Concepts.]

DATA RESOURCE MANAGEMENT

Chapter Highlights
Section I
Technical Foundations of Database Management
Real World Case: Harrah’s Entertainment and Others: Protecting the Data Jewels
Database Management
Fundamental Data Concepts
Database Structures
Database Development
Section II
Managing Data Resources
Real World Case: Emerson and Sanofi: Data Stewards Seek Data Conformity
Data Resource Management
Types of Databases
Data Warehouses and Data Mining
Traditional File Processing
The Database Management Approach
Real World Case: Acxiom Corporation: Data Demands Respect

Learning Objectives
After reading and studying this chapter, you should be able to:
1. Explain the business value of implementing data resource management processes and
   technologies in an organization.
2. Outline the advantages of a database management approach to managing the data
   resources of a business, compared to a file processing approach.
3. Explain how database management software helps business professionals and supports
   the operations and management of a business.
4. Provide examples to illustrate each of the following concepts:
   a. Major types of databases.
   b. Data warehouses and data mining.
   c. Logical data elements.
   d. Fundamental database structures.
   e. Database development.


150 ● Module II / Information Technologies

SECTION I Technical Foundations of
Database Management
Database Management
Just imagine how difficult it would be to get any information from an information system
if data were stored in an unorganized way, or if there were no systematic way to retrieve
them. Therefore, in all information systems, data resources must be organized and
structured in some logical manner so that they can be accessed easily, processed
efficiently, retrieved quickly, and managed effectively. Data structures and access
methods ranging from simple to complex have been devised to efficiently organize and
access data stored by information systems. In this chapter, we will explore these
concepts, as well as the managerial implications and value of data resource management.
See Figure 5.1.

Read the Real World Case on data resources in the casino gaming and hospitality
industry. We can learn a lot from this case about the importance of protecting the data
resources of the organization.

Fundamental Data Concepts
Before we go any further, let’s discuss some fundamental concepts about how data are
organized in information systems. A conceptual framework of several levels of data has
been devised that differentiates between different groupings, or elements, of data.
Thus, data may be logically organized into characters, fields, records, files, and
databases, just as writing can be organized in letters, words, sentences, paragraphs, and
documents. Examples of these logical data elements are shown in Figure 5.2.

Character
The most basic logical data element is the character, which consists of a single
alphabetic, numeric, or other symbol. You might argue that the bit or byte is a more
elementary data element, but remember that those terms refer to the physical storage
elements provided by the computer hardware, discussed in Chapter 3. Using that
understanding, one way to think of a character is that it is a byte used to represent a
particular character. From a user’s point of view (that is, from a logical as opposed to a
physical or hardware view of data), a character is the most basic element of data that
can be observed and manipulated.

Field
The next higher level of data is the field, or data item. A field consists of a grouping of
related characters. For example, the grouping of alphabetic characters in a person’s
name may form a name field (or typically, last name, first name, and middle initial
fields), and the grouping of numbers in a sales amount forms a sales amount field.
Specifically, a data field represents an attribute (a characteristic or quality) of some
entity (object, person, place, or event). For example, an employee’s salary is an
attribute that is a typical data field used to describe an entity who is an employee of a
business. Generally speaking, fields are organized such that they represent some logical
order. For example, fields might be ordered as last_name, first_name, address, city,
state, zipcode, and so on.

Record
All of the fields used to describe the attributes of an entity are grouped to form a
record. Thus, a record represents a collection of attributes that describe an entity. An
example is a person’s payroll record, which consists of data fields describing attributes
such as the person’s name, Social Security number, and rate of pay. Fixed-length records
contain a fixed number of fixed-length data fields. Variable-length records contain a
variable number of fields and field lengths. Another way of looking at a record is that
it represents a single instance of an entity. Each record in an employee file describes
one specific employee.
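
The character, field, and record levels described above can be sketched in a few lines of Python. This is only an illustration, not something from the text: the class name `PayrollRecord` and its field names are assumptions, and the sample values are borrowed from the chapter's Figure 5.2.

```python
from dataclasses import dataclass, fields


@dataclass
class PayrollRecord:
    """One record: the fields (attributes) that describe one employee entity."""
    name: str      # name field, a grouping of alphabetic characters
    ss_no: str     # Social Security number field
    salary: float  # salary field


# A group of related records; each record is a single instance of the entity.
payroll_records = [
    PayrollRecord("Jones T. A.", "275-32-3874", 20000.0),
    PayrollRecord("Klugman J. L.", "349-88-7913", 28000.0),
]

# The field names are the attributes every record in the group shares.
field_names = [f.name for f in fields(PayrollRecord)]

# A field is itself a grouping of characters, the most basic logical element.
characters_in_name = list(payroll_records[0].name)
```

Here each `PayrollRecord` instance plays the role of one record, and the list plays the role of a group of related records.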

REAL WORLD CASE 1
Harrah’s Entertainment and Others: Protecting the Data Jewels

FIGURE 5.1 While data management is a strategic initiative in every modern organization,
those in the gaming industry believe their success lies in the protection and strategic
management of their data resources.
Source: Jose Luis Palaez, Inc./Corbis.

In the casino industry, one of the most valuable assets is the dossier that casinos keep
on their affluent customers, the high rollers. But in 2003, casino operator Harrah’s
Entertainment Inc. filed a lawsuit in Placer County, California, Superior Court charging
that a former employee had copied the records of up to 450 wealthy customers before
leaving the company to work at competitor Thunder Valley Casino in Lincoln, California.

The complaint said the employee was seen printing the list, which included names, contact
information, and credit and account histories, from a Harrah’s database. It also alleged
that he tried to lure those players to Thunder Valley. The employee denies the charge of
stealing Harrah’s trade secrets, and the case was still pending at this writing, but many
similar cases have been filed in the past 20 years, legal experts say.

While savvy companies are using business intelligence and customer relationship
management systems to identify their most profitable customers, there’s a genuine danger
of that information falling into the wrong hands. Broader access to those applications
and the trend toward employees switching jobs more frequently have made protecting
customer lists an even greater priority.

Fortunately, there are managerial, legal, and technological steps you can take to help
prevent, or at least discourage, departing employees from walking out the door with this
vital information.

For starters, organizations should make sure that certain employees, particularly those
with frequent access to customer information, sign nondisclosure, noncompete, and
nonsolicitation agreements that specifically mention customer lists. Through these
documents, employees “acknowledge that they will be introduced to this information and
agree not to disclose it on departure from the company,” says Suzanne Labrit, a partner
at law firm Shutts & Bowen LLP in West Palm Beach, Florida.

Although most states have enacted trade-secrets laws, Labrit says they have different
attitudes about enforcing these laws with regard to customer lists. “But as a starting
point, at least you have this understanding [with employees] that the customer
information is being treated as confidential,” Labrit says. Then if an employee leaves to
work for a competitor and uses this protected customer data, the employer will more
likely be able to take legal action to stop the activity. “If you don’t treat it as
confidential information internally,” she says, “the court will not treat it as
confidential information, either.”

It’s also important to educate employees about the confidentiality of customer lists,
because many people wrongly assume they’re public information, says Tim Headley, a
partner at the Houston law firm of Gardere Wynne Sewell LLP. “Most people think they can
take the lists with them,” he says. “You have to show that you’ve kept it a secret and
told employees it’s a valuable secret. [Customer lists] are at the core of how you bring
revenue into the company. These are the decision-makers who are willing to buy your
product.”

From a management and process standpoint, organizations should try to limit access to
customer lists to only employees, such as sales representatives, who need the information
to do their jobs. “If you make it broadly available to employees, then it’s not
considered confidential,” says Labrit.

Physical security should also be considered, Labrit says. Visitors such as vendors
shouldn’t be permitted to roam free in the hallways or into conference rooms. And
security policies, such as a requirement that all computer systems have strong password
protection, should be strictly enforced.

Companies should instantly shut down access to computers and networks when employees
leave, whether the reason is a layoff or a move to a new job. At the exit interview, the
employee should be reminded of any signed agreements and corporate policies regarding
customer lists and other confidential information. Employees should be told to turn over
anything, including data that belongs to the company.

In addition, employers should track the activities of employees who’ve given notice but
will be around for a while. This includes monitoring systems to see if the employee is
e-mailing company-owned documents outside the company.

Some organizations rely on technology to help prevent the loss of customer lists and
other critical data. Inflow Inc., a Denver-based provider of managed Web hosting
services, uses a product from Opsware Inc. in Sunnyvale, California, that lets managers
control access to specific systems, such as databases, from a central location.

The company also uses an e-mail-scanning service that allows it to analyze messages that
it suspects might contain proprietary files, says Lenny Monsour, general manager of
application hosting and management. Inflow combines the use of this technology with
practices such as monitoring employees who have access to data considered vital to the
company.

A major financial services provider is using a firewall from San Francisco-based Vontu
Inc. that monitors outbound e-mail, Webmail, Web posts, and instant messages to ensure
that no confidential data leave the company. The software includes search algorithms and
can be customized to automatically detect specific types of data such as lists on a
spreadsheet or even something as granular as a customer’s Social Security number. The
firm began using the product after it went through layoffs in 2000 and 2001.

“Losing customer information was a primary concern of ours,” says the firm’s chief
information security officer, who asked to not be identified. “We were concerned about
people leaving and sending e-mail to their home accounts.” In fact, he says, before using
the firewall, the company had trouble with departing employees taking intellectual
property and using it in their new jobs at rival firms, which sometimes led to lawsuits.

Vijay Sonty, chief technology officer at advertising firm Foote Cone & Belding Worldwide
in New York, says losing customer information to competitors is a growing concern,
particularly in industries where companies go after many of the same clients.

“We have a lot of account executives who are very close to the clients and have access to
client lists,” Sonty says. “If an account executive leaves to join a competitor, he can
take all this confidential information.” The widespread sharing of corporate data, such
as customer contact information, has made it easier for people to do their jobs, but it
has also increased the risk of losing confidential data, Sonty says.

He says the firm, which mandates that some employees sign noncompete agreements, is
looking into policies and guidelines regarding the proper use of customer information, as
well as audit trails to see who’s accessing customer lists. “I think it makes good
business sense to take precautions and steps to prevent this from happening,” Sonty says.
“We could lose a lot of money if key people leave.”

Source: Adapted from Bob Violino, “Protecting the Data Jewels: Valuable Customer Lists,”
Computerworld, July 19, 2004. Copyright © 2004 by Computerworld Inc., Framingham, MA
01701. All rights reserved.

CASE STUDY QUESTIONS
1. Why have developments in IT helped to increase the value of the data resources of many
   companies?
2. How have these capabilities increased the security challenges associated with
   protecting a company’s data resources?
3. How can companies use IT to meet the challenges of data resource security?

REAL WORLD ACTIVITIES
1. Companies are increasingly adopting a position that data is an asset that must be
   managed with the same level of attention as that of cash and other capital. Using the
   Internet, see if you can find examples of how companies treat their data. Does there
   seem to be any relationship between companies that look at their data as an asset and
   companies that are highly successful in their respective industries?
2. The Real World Case above illustrates how valuable data resources are to the casino
   industry. Break into small groups with your classmates, and discuss other industries
   where their data are clearly their lifeblood. For example, it has been estimated that
   any firm in the financial industry would have a life expectancy of less than 100 hours
   if they were placed in a position where they could not access their organizational
   data. Do you agree with this estimate?

FIGURE 5.2 Examples of the logical data elements in information systems. Note especially
the examples of how data fields, records, files, and databases are related.

Human Resource Database

File           Record              Name Field      SS No. Field   Third Field
Payroll File   Employee Record 1   Jones T. A.     275-32-3874    Salary: 20,000
Payroll File   Employee Record 2   Klugman J. L.   349-88-7913    Salary: 28,000
Benefits File  Employee Record 3   Alvarez J.S.    542-40-3718    Insurance: 100,000
Benefits File  Employee Record 4   Porter M.L.     617-87-7915    Insurance: 50,000

File
A group of related records is a data file, or table. Thus, an employee file would contain
the records of the employees of a firm. Files are frequently classified by the application
for which they are primarily used, such as a payroll file or an inventory file, or the type
of data they contain, such as a document file or a graphical image file. Files are also
classified by their permanence, for example, a payroll master file versus a payroll weekly
transaction file. A transaction file, therefore, would contain records of all transactions
occurring during a period and might be used periodically to update the permanent records
contained in a master file. A history file is an obsolete transaction or master file
retained for backup purposes or for long-term historical storage called archival storage.
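
The way a transaction file periodically updates a master file can be sketched as follows. This is a minimal illustration under stated assumptions: the function `apply_transactions`, the `hourly_rate` and `ytd_pay` fields, and all of the pay figures are hypothetical and do not come from the text.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One record in a (hypothetical) weekly payroll transaction file."""
    ss_no: str
    hours_worked: float


def apply_transactions(master, transactions):
    """Return a new master file with year-to-date pay updated.

    The old master is left untouched so it can be retained as a history file.
    """
    updated = {key: dict(record) for key, record in master.items()}
    for t in transactions:
        record = updated[t.ss_no]
        record["ytd_pay"] += t.hours_worked * record["hourly_rate"]
    return updated


# Permanent records, keyed by Social Security number (illustrative values).
master_file = {
    "275-32-3874": {"name": "Jones T. A.", "hourly_rate": 10.0, "ytd_pay": 0.0},
    "349-88-7913": {"name": "Klugman J. L.", "hourly_rate": 14.0, "ytd_pay": 0.0},
}

# All transactions that occurred during the period.
weekly_transactions = [
    Transaction("275-32-3874", 40.0),
    Transaction("349-88-7913", 38.0),
    Transaction("275-32-3874", 5.0),
]

new_master_file = apply_transactions(master_file, weekly_transactions)
history_file = master_file  # the superseded master, kept for archival storage
```

Copying the records before updating is a design choice made here so that the superseded master survives as a history file; a real payroll system would more likely update records in place and archive separately.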

Database
A database is an integrated collection of logically related data elements. A database
consolidates records previously stored in separate files into a common pool of data
elements that provides data for many applications. The data stored in a database are
independent of the application programs using them and of the type of storage devices
on which they are stored.

Thus, databases contain data elements describing entities and relationships among
entities. For example, Figure 5.3 outlines some of the entities and relationships in a
database for an electric utility. Also shown are some of the business applications (billing,
payment processing) that depend on access to the data elements in the database.

FIGURE 5.3 Some of the entities and relationships in a simplified electric utility
database. Note a few of the business applications that access the data in the database.

Electric Utility Database
  Entities: Customers, meters, bills, payments, meter readings
  Relationships: Bills sent to customers, customers make payments, customers use
    meters, . . .
  Applications accessing the database: Billing, Payment processing, Meter reading,
    Service start / stop

Source: Adapted from Michael V. Mannino, Database Application Development
and Design (Burr Ridge, IL: McGraw-Hill/Irwin, 2001), p. 6.
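
As a rough sketch of one common pool of data serving several applications, the example below stores a few electric-utility entities in an in-memory SQLite database and lets two small "applications" read them. The table layouts, column names, function names, and sample amounts are all assumptions made for illustration; the text does not specify them.

```python
import sqlite3

# One shared pool of data: entities (customers, bills, payments) and the
# relationships among them, expressed through the id columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bills (
        bill_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    CREATE TABLE payments (
        payment_id INTEGER PRIMARY KEY,
        bill_id INTEGER REFERENCES bills(bill_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Sample Customer');
    INSERT INTO bills VALUES (10, 1, 80.0);
    INSERT INTO bills VALUES (11, 1, 95.5);
    INSERT INTO payments VALUES (100, 10, 80.0);
    """
)


def total_billed(customer_id):
    """Billing application: total amount billed to one customer."""
    row = conn.execute(
        "SELECT SUM(amount) FROM bills WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0]


def unpaid_bill_ids(customer_id):
    """Payment processing application: bills with no recorded payment."""
    rows = conn.execute(
        """
        SELECT b.bill_id
        FROM bills AS b
        LEFT JOIN payments AS p ON p.bill_id = b.bill_id
        WHERE b.customer_id = ? AND p.payment_id IS NULL
        ORDER BY b.bill_id
        """,
        (customer_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Both functions read the same tables, which is the point of the database approach: neither application owns its own private copy of the customer or bill data.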

Database Structures
The relationships among the many individual data elements stored in databases are
based on one of several logical data structures, or models. Database management system
packages are designed to use a specific data structure to provide end users with
quick, easy access to information stored in databases. Five fundamental database
structures are the hierarchical, network, relational, object-oriented, and multidimensional
models. Simplified illustrations of the first three database structures are shown in
Figure 5.4.

Hierarchical Structure
Early mainframe DBMS packages used the hierarchical structure, in which the
relationships between records form a hierarchy or treelike structure. In the traditional
hierarchical model, all records are dependent and arranged in multilevel structures,
consisting of one root record and any number of subordinate levels. Thus, all of the
relationships among records are one-to-many, since each data element is related to only
one element above it. The data element or record at the highest level of the hierarchy
(the department data element in this illustration) is called the root element. Any data
element can be accessed by moving progressively downward from a root and along the
branches of the tree until the desired record (for example, the employee data element)
is located.

FIGURE 5.4 Example of three fundamental database structures. They represent three basic
ways to develop and express the relationships among the data elements in a database.

Hierarchical Structure
  Level 1 (root): Department Data Element
  Level 2: Project A Data Element, Project B Data Element
  Level 3: Employee 1 Data Element, Employee 2 Data Element

Network Structure
  Departments: Department A, Department B
  Employees: Employee 1, Employee 2, Employee 3
  Projects: Project A, Project B

Relational Structure

  Department Table
  Deptno   Dname   Dloc   Dmgr
  Dept A
  Dept B
  Dept C

  Employee Table
  Empno   Ename   Etitle   Esalary   Deptno
  Emp 1                              Dept A
  Emp 2                              Dept A
  Emp 3                              Dept B
  Emp 4                              Dept B
  Emp 5                              Dept C
  Emp 6                              Dept B
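
The downward search from the root described above can be sketched with a small tree in Python. The `Node` class and `find_path` function are illustrative names, and attaching both employees to Project A is an assumption, since the figure's connecting lines are not reproduced in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element with exactly one parent and any number of children."""
    name: str
    children: list = field(default_factory=list)


def find_path(root, target):
    """Move downward from the root along the branches until target is found.

    Returns the list of element names visited on the successful branch,
    or None if the target is not in the tree.
    """
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


# Root element: the department. Which project each employee sits under is assumed.
department = Node(
    "Department",
    [
        Node("Project A", [Node("Employee 1"), Node("Employee 2")]),
        Node("Project B"),
    ],
)

path_to_employee = find_path(department, "Employee 2")
```

Because every element has only one parent, there is exactly one path from the root to any record, which is what makes every relationship in this model one-to-many.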

Network Structure
The network structure can represent more complex logical relationships and is still
used by some mainframe DBMS packages. It allows many-to-many relationships
among records; that is, the network model can access a data element by following one
of several paths, because any data element or record can be related to any number of
other data elements. For example, in Figure 5.4, departmental records can be related
to more than one employee record, and employee records can be related to more than
one project record. Thus, you could locate all employee records for a particular
department, or all project records related to a particular employee.
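
A minimal sketch of those many-to-many links, using plain dictionaries: the record names follow Figure 5.4, but the specific links between them are assumptions, because the figure's connecting lines are not reproduced in the text. The function names are also hypothetical.

```python
# Links are stored explicitly, and a record may take part in many of them.
department_to_employees = {
    "Department A": ["Employee 1", "Employee 2"],
    "Department B": ["Employee 3"],
}
employee_to_projects = {
    "Employee 1": ["Project A"],
    "Employee 2": ["Project A", "Project B"],
    "Employee 3": ["Project B"],
}


def projects_for_department(department):
    """Follow department -> employee -> project links, without duplicates."""
    projects = []
    for employee in department_to_employees[department]:
        for project in employee_to_projects[employee]:
            if project not in projects:
                projects.append(project)
    return projects


def employees_on_project(project):
    """Reach the same records by another path: from the project side."""
    return [
        employee
        for employee, projects in employee_to_projects.items()
        if project in projects
    ]
```

Every path used here had to be defined in the link dictionaries beforehand; a question that needs a link nobody stored cannot be answered by traversal alone.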

Relational Structure
The relational model is the most widely used of the three database structures. It is used
by most microcomputer DBMS packages, as well as by most midrange and mainframe
systems. In the relational model, all data elements within the database are viewed as
being stored in the form of simple two-dimensional tables sometimes referred to as
relations. The tables in a relational database have rows and columns. Each row
represents a single record in the file, and each column represents a field.

Figure 5.4 illustrates the relational database model with two tables representing some
of the relationships among departmental and employee records. Other tables, or
relations, for this organization’s database might represent the data element relationships
among projects, divisions, product lines, and so on. Database management system
packages based on the relational model can link data elements from various tables to
provide information to users. For example, a manager might want to retrieve and display an
employee’s name and salary from the employee table in Figure 5.4, and the name of the
employee’s department from the department table, by using their common department
number field (Deptno) to link or join the two tables. See Figure 5.5. The relational
model can relate data in any one file with data in another file if both files share a
common data element or field. Because of this, information can be created by retrieving data
from multiple files even if they are not all stored in the same physical location.
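
The manager's request described above (an employee's name and salary plus the department name, linked through Deptno) might look like this in SQL, run here through Python's built-in `sqlite3` module. The table and column names follow Figure 5.4; the names, titles, salaries, and locations in the rows are invented for illustration, and `name_salary_department` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        Deptno TEXT PRIMARY KEY, Dname TEXT, Dloc TEXT, Dmgr TEXT
    );
    CREATE TABLE employee (
        Empno TEXT PRIMARY KEY, Ename TEXT, Etitle TEXT, Esalary REAL,
        Deptno TEXT REFERENCES department(Deptno)
    );
    INSERT INTO department VALUES ('Dept A', 'Accounting', 'Building 1', 'Mgr A');
    INSERT INTO department VALUES ('Dept B', 'Marketing', 'Building 2', 'Mgr B');
    INSERT INTO department VALUES ('Dept C', 'Production', 'Building 3', 'Mgr C');
    INSERT INTO employee VALUES ('Emp 1', 'Avery', 'Clerk', 42000.0, 'Dept A');
    INSERT INTO employee VALUES ('Emp 3', 'Casey', 'Analyst', 51000.0, 'Dept B');
    INSERT INTO employee VALUES ('Emp 5', 'Emery', 'Planner', 39000.0, 'Dept C');
    """
)


def name_salary_department(empno):
    """Join the two tables on their common Deptno field for one employee."""
    return conn.execute(
        """
        SELECT e.Ename, e.Esalary, d.Dname
        FROM employee AS e
        JOIN department AS d ON e.Deptno = d.Deptno
        WHERE e.Empno = ?
        """,
        (empno,),
    ).fetchone()
```

Calling `name_salary_department("Emp 3")` returns one row that combines columns from both tables, even though the department name is stored only once, in the department table.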

Relational Operations
Three basic operations can be performed on a relational database to create useful sets
of data. The select operation is used to create a subset of records that meet a stated
criterion. For example, a select operation might be used on an employee database to
create a subset of records that contain all employees who make more than $30,000 per
year and who have been with the company more than three years. Another way to
think of the select operation is that it temporarily creates a table whose rows have
records that meet the selection criteria.

FIGURE 5.5 Joining the Employee and Department tables in a relational database enables
you to selectively access data in both tables at the same time.

  Department Table
  Deptno   Dname   Dloc   Dmgr
  Dept A
  Dept B
  Dept C

  Employee Table
  Empno   Ename   Etitle   Esalary   Deptno
  Emp 1                              Dept A
  Emp 2                              Dept A
  Emp 3                              Dept B
  Emp 4                              Dept B
  Emp 5                              Dept C
  Emp 6                              Dept B


The join operation can be used to temporarily combine two or more tables so that a
user can see relevant data in a form that looks like it is all in one big table. Using this
operation, a user can ask for data to be retrieved from multiple files or databases without
having to go to each one separately.

Finally, the project operation is used to create a subset of the columns contained in
the temporary tables created by the select and join operations. Just as the select
operation creates a subset of records that meet stated criteria, the project operation
creates a subset of the columns, or fields, that the user wants to see. Using a project
operation, the user can decide not to view all of the columns in the table but only those
that have data necessary to answer a particular question or to construct a specific report.
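
The three operations can be sketched over tables held as lists of Python dictionaries. This is a teaching sketch, not how a DBMS implements them: the `Years` column and all row values are assumptions, and the selection criterion reuses the example above (more than $30,000 per year and more than three years with the company).

```python
def select(table, predicate):
    """Select: keep only the rows (records) that meet the criterion."""
    return [row for row in table if predicate(row)]


def join(left, right, key):
    """Join: combine rows from two tables that share a value in `key`."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(table, columns):
    """Project: keep only the named columns (fields) of each row."""
    return [{column: row[column] for column in columns} for row in table]


employees = [
    {"Ename": "Avery", "Esalary": 42000, "Years": 5, "Deptno": "Dept A"},
    {"Ename": "Blake", "Esalary": 28000, "Years": 6, "Deptno": "Dept A"},
    {"Ename": "Casey", "Esalary": 51000, "Years": 2, "Deptno": "Dept B"},
    {"Ename": "Drew", "Esalary": 36000, "Years": 4, "Deptno": "Dept B"},
]
departments = [
    {"Deptno": "Dept A", "Dname": "Accounting"},
    {"Deptno": "Dept B", "Dname": "Marketing"},
]

# Select, then join, then project, each step producing a temporary table.
well_paid_veterans = select(
    employees, lambda row: row["Esalary"] > 30000 and row["Years"] > 3
)
with_departments = join(well_paid_veterans, departments, "Deptno")
report = project(with_departments, ["Ename", "Dname"])
```

Each function returns a new list and leaves its input untouched, mirroring the idea that these operations create temporary tables rather than changing the stored data.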
Because of the widespread use of the relational model, an abundance of commercial
products exists to create and manage relational databases. Leading mainframe relational
database applications include Oracle 10g from Oracle Corp. and DB2 from IBM. A very
popular midrange database application is SQL Server from Microsoft. The most commonly
used database application for the PC is Microsoft Access.

Multidimensional Structure
The multidimensional model is a variation of the relational model that uses
multidimensional structures to organize data and express the relationships between data.
You can visualize multidimensional structures as cubes of data and cubes within cubes
of data. Each side of the cube is considered a dimension of the data. Figure 5.6 is an
example that shows that each dimension can represent a different category, such as
product type, region, sales channel, and time [5].

FIGURE 5.6 An example of the different dimensions of a multidimensional database.
[Figure: several data cubes. Labels recoverable from the figure include products (Camera,
TV, VCR, Audio), regions (East, West, South, Total; Denver, Los Angeles, San Francisco),
time periods (January, February, March, April, Qtr 1), measures (Sales, COGS, Margin,
Total Expenses, Profit), and scenarios (Actual, Budget, Forecast, Variance).]


FIGURE 5.7 The checking and savings account objects can inherit common attributes and
operations from the bank account object.

Bank Account Object
  Attributes: Customer, Balance, Interest
  Operations: Deposit (amount), Withdraw (amount), Get owner

Checking Account Object (inherits from Bank Account Object)
  Attributes: Credit line, Monthly statement
  Operations: Calculate interest owed, Print monthly statement

Savings Account Object (inherits from Bank Account Object)
  Attributes: Number of withdrawals, Quarterly statement
  Operations: Calculate interest paid, Print quarterly statement

Source: Adapted from Ivar Jacobsen, Maria Ericsson, and Ageneta Jacobsen, The Object
Advantage: Business Process Reengineering with Object Technology (New York: ACM Press, 1995),
p. 65. Copyright © 1995, Association for Computing Machinery. By permission.

Each cell within a multidimensional structure contains aggregated data related to
elements along each of its dimensions. For example, a single cell may contain the total
sales for a product in a region for a specific sales channel in a single month. A major
benefit of multidimensional databases is that they are a compact and easy-to-understand
way to visualize and manipulate data elements that have many interrelationships. So
multidimensional databases have become the most popular database structure for
the analytical databases that support online analytical processing (OLAP) applications,
in which fast answers to complex business queries are expected. We discuss OLAP
applications in Chapter 9.
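
The idea of a cell holding aggregated data for one combination of dimension values can be sketched with a dictionary keyed by (product, region, sales channel, month). The sales figures and the `roll_up` helper are assumptions for illustration; real OLAP products use far more elaborate storage.

```python
from collections import defaultdict

# Detailed sales facts: (product, region, sales channel, month, amount).
sales_facts = [
    ("TV", "East", "Retail", "January", 100.0),
    ("TV", "East", "Retail", "January", 50.0),
    ("TV", "West", "Retail", "January", 70.0),
    ("VCR", "East", "Online", "February", 30.0),
]

# Build the cube: each cell aggregates the facts for one point on all
# four dimensions.
cube = defaultdict(float)
for product, region, channel, month, amount in sales_facts:
    cube[(product, region, channel, month)] += amount


def roll_up(cells, keep):
    """Aggregate away every dimension except the index positions in `keep`."""
    totals = defaultdict(float)
    for key, value in cells.items():
        totals[tuple(key[i] for i in keep)] += value
    return dict(totals)


sales_by_product = roll_up(cube, keep=(0,))
sales_by_product_and_region = roll_up(cube, keep=(0, 1))
```

Looking up `cube[("TV", "East", "Retail", "January")]` gives the total for that single cell, while `roll_up` answers coarser questions by summing along the dimensions that are dropped.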

Object-Oriented Structure
The object-oriented model is considered to be one of the key technologies of a new
generation of multimedia Web-based applications. As Figure 5.7 illustrates, an object
consists of data values describing the attributes of an entity, plus the operations that
can be performed upon the data. This encapsulation capability allows the object-oriented
model to more easily handle complex types of data (graphics, pictures, voice,
text) than other database structures.

The object-oriented model also supports inheritance; that is, new objects can be
automatically created by replicating some or all of the characteristics of one or more
parent objects. Thus, in Figure 5.7, the checking and savings account objects can both
inherit the common attributes and operations of the parent bank account object. Such
capabilities have made object-oriented database management systems (OODBMS) popular
in computer-aided design (CAD) and in a growing number of applications. For example,
object technology allows designers to develop product designs, store them as objects
in an object-oriented database, and replicate and modify them to create new
product designs. In addition, multimedia Web-based applications for the Internet and
corporate intranets and extranets have become a major application area for object
technology.
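
Encapsulation and inheritance as described for Figure 5.7 can be sketched with ordinary Python classes. The attribute and operation names follow the figure; the method bodies, the interest arithmetic, and the sample values are assumptions, since the text does not define them, and the statement attributes and printing operations are omitted for brevity.

```python
class BankAccount:
    """Parent object: common attributes plus the operations on them."""

    def __init__(self, customer, balance=0.0, interest=0.0):
        self.customer = customer
        self.balance = balance
        self.interest = interest  # treated here as an annual rate (assumption)

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def get_owner(self):
        return self.customer


class CheckingAccount(BankAccount):
    """Inherits customer, balance, interest, deposit, withdraw, get_owner."""

    def __init__(self, customer, balance=0.0, interest=0.0, credit_line=0.0):
        super().__init__(customer, balance, interest)
        self.credit_line = credit_line

    def calculate_interest_owed(self, amount_borrowed):
        return amount_borrowed * self.interest


class SavingsAccount(BankAccount):
    """Adds its own attribute and operation on top of the inherited ones."""

    def __init__(self, customer, balance=0.0, interest=0.0):
        super().__init__(customer, balance, interest)
        self.number_of_withdrawals = 0

    def withdraw(self, amount):
        super().withdraw(amount)
        self.number_of_withdrawals += 1

    def calculate_interest_paid(self):
        return self.balance * self.interest


savings = SavingsAccount("Sample Customer", balance=1000.0, interest=0.02)
savings.deposit(500.0)   # inherited operation
savings.withdraw(100.0)  # extended operation that also counts withdrawals
```

Neither child class redefines `deposit` or `get_owner`; both get them from `BankAccount`, which is the inheritance the figure depicts.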
Object technology proponents argue that an object-oriented DBMS can work with
complex data types such as document and graphic images, video clips, audio segments,
and other subsets of Web pages much more efficiently than relational database
management systems. However, major relational DBMS vendors have countered by
adding object-oriented modules to their relational software. Examples include
multimedia object extensions to IBM’s DB2, and Oracle’s object-based “cartridges” for
Oracle 10g. See Figure 5.8.

FIGURE 5.8 This claims analysis graphics display provided by the CleverPath enterprise
portal is powered by the Jasmine ii object-oriented database management system of
Computer Associates.
Source: Courtesy of Computer Associates.

Evaluation of Database Structures
The hierarchical data structure was a natural model for the databases used for the
structured, routine types of transaction processing characteristic of many business
operations in the early years of data processing and computing. Data for these operations
can easily be represented by groups of records in a hierarchical relationship. However,
as time progressed, there were many cases where information was needed about
records that did not have hierarchical relationships. For example, in some organizations,
employees from more than one department can work on more than one project (refer
back to Figure 5.4). A network data structure could easily handle this many-to-many
relationship, whereas a hierarchical model could not. As such, the more flexible
network structure became popular for these types of business operations. However, like
the hierarchical structure, because its relationships must be specified in advance, the
network model was unable to easily handle ad hoc requests for information, thus
pointing out the need for the relational model.

Relational databases allow an end user to easily receive information in response to
ad hoc requests. That’s because not all of the relationships between the data elements
in a relationally organized database need to be specified when the database is created.
Database management software (such as Oracle 10g, DB2, Access, and Approach)
creates new tables of data relationships by using parts of the data from several tables.
Thus, relational databases are easier for programmers to work with and easier to
maintain than the hierarchical and network models.

The major limitation of the relational model is that relational database management
systems cannot process large amounts of business transactions as quickly and
efficiently as those based on the hierarchical and network models, or process complex,
high-volume applications as well as the object-oriented model. This performance
gap has narrowed with the development of advanced relational database software
with object-oriented extensions. The use of database management software based on
the object-oriented and multidimensional models is growing steadily, as these
technologies are playing a greater role for OLAP and Web-based applications.


Experian Experian Inc. (www.experian.com), a unit of London-based GUS PLC, runs one of
Automotive: The the largest credit reporting agencies in the United States. But Experian wanted to
expand its business beyond credit checks for automobile loans. If it could collect
Business Value vehicle data from the various motor-vehicle departments in the United States and
of Relational blend that with other data, such as change-of-address records, then its Experian
Database Automotive division could sell the enhanced data to a variety of customers. For
example, car dealers could use the data to make sure their inventory matches local
Management buying preferences. And toll collectors could match license plates to addresses to
find motorists who sail past tollbooths without paying.
But to offer new services, Experian first needed a way to extract, transfer, and
load data from the systems of 50 different U.S. state departments of motor vehicles
(DMVs), plus Puerto Rico, into a single database. That was a big challenge. “Unlike
the credit industry that writes to a common format, the DMVs do not,” says Ken
Kauppila, vice president of IT at Experian Automotive in Costa Mesa, California.
Of course, Experian didn’t want to replicate the hodgepodge of file formats it
inherited when the project began in January 1999—175 formats among 18,000
files. So Kauppila decided to transform and map the data to a common relational
database format.
Fortunately, off-the-shelf software tools for extracting, transforming, and loading
data (called ETL tools) make it economical to combine very large data repositories.
Using ETL Extract from Evolutionary Technologies, Experian created a database
that can incorporate vehicle information within 48 hours of its entry into any of the
nation’s DMV computers. This is one of the areas in which data management soft-
ware tools can excel, says Guy Creese, analyst at Aberdeen Group in Boston. “It can
simplify the mechanics of multiple data feeds, and it can add to data quality, making
fixes possible before errors are propagated to data warehouses,” he says.
Using the ETL extraction and transformation tools along with IBM’s DB2 data-
base system, Experian Automotive created a database that processes 175 million
transactions per month and has created a variety of profitable new revenue streams.
Experian’s automotive database is the 10th largest database in the world—now, with
up to 16 billion rows of data. But the company says the relational database is man-
aged by just three IT professionals. Experian says this demonstrates how efficiently
database software like DB2 and the ETL tools can work with a large database to
handle vast amounts of data quickly.
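
The transformation step in this example, mapping many source formats onto one common
relational layout, can be sketched in a few lines of Python. This is only an illustration
under stated assumptions, not Experian's actual process: the two source layouts, the field
names, the sample records, and the `vehicles` table are all hypothetical, and SQLite stands
in for a production system like DB2.

```python
import sqlite3

# Hypothetical records from two DMVs that use different field names and formats.
STATE_A = [{"VIN": "1HGCM82633A004352", "plate": "abc123", "yr": "2003"}]
STATE_B = [{"vehicle_id": "2T1BR32E54C123456", "tag_no": "XYZ-789", "model_year": 2004}]

def from_state_a(rec):
    """Map a state A record to the common (vin, plate, model_year) layout."""
    return (rec["VIN"], rec["plate"].upper(), int(rec["yr"]))

def from_state_b(rec):
    """Map a state B record to the common layout, dropping the dash in the plate."""
    return (rec["vehicle_id"], rec["tag_no"].replace("-", ""), int(rec["model_year"]))

def load_vehicles(conn):
    """Extract, transform, and load both feeds into one relational table."""
    conn.execute(
        "CREATE TABLE vehicles (vin TEXT PRIMARY KEY, plate TEXT, model_year INTEGER)")
    rows = [from_state_a(r) for r in STATE_A] + [from_state_b(r) for r in STATE_B]
    conn.executemany("INSERT INTO vehicles VALUES (?, ?, ?)", rows)
    return conn.execute(
        "SELECT vin, plate, model_year FROM vehicles ORDER BY vin").fetchall()

conn = sqlite3.connect(":memory:")
common_rows = load_vehicles(conn)
```

Each source format gets its own small mapping function, so adding a new feed means adding
one function rather than changing the common table.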

Database Development

Database management packages like Microsoft Access or Lotus Approach allow end users to
easily develop the databases they need. See Figure 5.9. However, large organizations
usually place control of enterprisewide database development in the hands of database
administrators (DBAs) and other database specialists. This improves the integrity and
security of organizational databases. Database developers use the data definition
language (DDL) in database management systems like Oracle 10g or IBM’s DB2 to develop
and specify the data contents, relationships, and structure of each database, and to
modify these database specifications when necessary. Such information is cataloged and
stored in a database of data definitions and specifications called a data dictionary, or
metadata repository, which is managed by the database management software and maintained
by the DBA.
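
As a rough illustration of what DDL statements look like, the sketch below defines a table
and then modifies its specification. It assumes SQLite, reached through Python's built-in
`sqlite3` module, as a stand-in for systems like Oracle or DB2, whose DDL dialects differ in
detail. The `customer` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of a (hypothetical) customer table.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT
    )
""")
conn.execute("INSERT INTO customer (name, city) VALUES ('Acme Supply', 'Austin')")

# DDL again: modify the specification when requirements change.
conn.execute("ALTER TABLE customer ADD COLUMN credit_limit REAL DEFAULT 0")

# The existing row now carries the new column with its default value.
row = conn.execute("SELECT name, city, credit_limit FROM customer").fetchone()
```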
A data dictionary is a database management catalog or directory containing
metadata, that is, data about data. A data dictionary relies on a specialized database
software component to manage a database of data definitions, that is, metadata about
the structure, data elements, and other characteristics of an organization’s databases.
For example, it contains the names and descriptions of all types of data records and
their interrelationships, as well as information outlining requirements for end users’
access and use of application programs, and database maintenance and security.
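
Most database management systems expose at least part of this catalog to queries. The
sketch below is one small example, assuming SQLite: it reads table names from the built-in
`sqlite_master` catalog and column definitions from `PRAGMA table_info`. The `employee` and
`department` tables are hypothetical, and a full data dictionary would hold far more
(descriptions, access requirements, security rules) than SQLite records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")

def describe(conn):
    """Return {table: [(column, declared type), ...]} from SQLite's own catalog."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: [(c[1], c[2]) for c in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}

catalog = describe(conn)
```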


FIGURE 5.9
Creating a database table
using the Table Wizard
of Microsoft Access.

Source: Courtesy of Microsoft Corp.

Data dictionaries can be queried by the database administrator to report the status
of any aspect of a firm’s metadata. The administrator can then make changes to the
definitions of selected data elements. Some active (versus passive) data dictionaries
automatically enforce standard data element definitions whenever end users and ap-
plication programs access an organization’s databases. For example, an active data dic-
tionary would not allow a data entry program to use a nonstandard definition of
a customer record, nor would it allow an employee to enter a name of a customer that
exceeded the defined size of that data element.
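
The enforcement behavior of an active data dictionary can be sketched as a validation step
that every data entry program must pass through. This is a simplified illustration, not
how any particular product works; the dictionary contents, the 30-character limit, and the
`validate` function are hypothetical.

```python
# Hypothetical standard definitions held by an active data dictionary.
DATA_DICTIONARY = {
    "customer": {
        "customer_name": {"type": str, "max_length": 30},
        "credit_limit": {"type": float, "max_length": None},
    },
}

def validate(record_type, record):
    """Return a list of violations of the standard definition (empty if none)."""
    definition = DATA_DICTIONARY[record_type]
    problems = []
    for field, value in record.items():
        if field not in definition:
            problems.append(f"{field}: not a standard data element")
            continue
        rule = definition[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
        elif rule["max_length"] is not None and len(value) > rule["max_length"]:
            problems.append(f"{field}: exceeds {rule['max_length']} characters")
    return problems

ok = validate("customer", {"customer_name": "Exxon", "credit_limit": 5000.0})
too_long = validate("customer", {"customer_name": "X" * 31})
nonstandard = validate("customer", {"cust_nm": "Exxon"})
```

A passive dictionary would hold the same definitions but leave it to each program to check
them.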
Developing a large database of complex data types can be a complicated task. Data-
base administrators and database design analysts work with end users and systems
analysts to model business processes and the data they require. Then they determine
(1) what data definitions should be included in the database and (2) what structure or
relationships should exist among the data elements.

Data Planning and Database Design

As Figure 5.10 illustrates, database development may start with a top-down data planning
process. Database administrators and designers work with corporate and end user
management to develop an enterprise model that defines the basic business process of the
enterprise. Then they define the information needs of end users in a business process,
such as the purchasing/receiving process that all businesses have.
Next, end users must identify the key data elements that are needed to perform
their specific business activities. This frequently involves developing entity relationship
diagrams (ERDs) that model the relationships among the many entities involved in
business processes. For example, Figure 5.11 illustrates some of the relationships
in a purchasing/receiving process. ERDs are simply graphical models of the various
files and their relationships contained within a database system. End users and data-
base designers could use database management or business modeling software
to help them develop ERD models for the purchasing/receiving process. This would
help identify what supplier and product data are required to automate their purchasing/
receiving and other business processes using enterprise resource management (ERM)
or supply chain management (SCM) software. You will learn about ERDs and other
data modeling tools in much greater detail if you ever take a course in systems analysis
and design.
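
To see how an entity relationship model turns into database structures, the sketch below
maps a few of the purchasing/receiving entities to relational tables. The entity names
follow Figure 5.11; the columns, the sample data, and the direction of each relationship
are assumptions made for illustration. SQLite is used through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
    );
    CREATE TABLE purchase_order (po_id INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE purchase_order_item (
        po_id      INTEGER NOT NULL REFERENCES purchase_order(po_id),
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (po_id, product_id)
    );
""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Supply')")
conn.execute("INSERT INTO product VALUES (10, 'Widget', 1)")
conn.execute("INSERT INTO purchase_order VALUES (100, '2004-03-15')")
conn.execute("INSERT INTO purchase_order_item VALUES (100, 10, 25)")

def order_lines(conn, po_id):
    """Follow the relationships from an order to its products and suppliers."""
    return conn.execute("""
        SELECT p.description, s.name, i.quantity
        FROM purchase_order_item AS i
        JOIN product  AS p ON p.product_id = i.product_id
        JOIN supplier AS s ON s.supplier_id = p.supplier_id
        WHERE i.po_id = ?
    """, (po_id,)).fetchall()

lines = order_lines(conn, 100)
```

Each relationship line in the diagram becomes a foreign key, so the database itself rejects
a product that names a supplier that does not exist.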


FIGURE 5.10
Database development involves data planning and database design activities. Data models
that support business processes are used to develop databases that meet the information
needs of users.

1. Data Planning: Develops a model of business processes.
   Output: Enterprise model of business processes with documentation.
2. Requirements Specification: Defines information needs of end users in a business
   process.
   Output: Description of users’ needs may be represented in natural language or using
   the tools of a particular design methodology.
3. Conceptual Design: Expresses all information requirements in the form of a high-level
   model.
   Output: Conceptual Data Models, often expressed as entity relationship models.
4. Logical Design: Translates the conceptual models into the data model of a DBMS.
   Output: Logical Data Models, e.g., relational, network, hierarchical,
   multidimensional, or object-oriented models.
5. Physical Design: Determines the data storage structures and access methods.
   Output: Physical Data Models, i.e., storage representations and access methods.

Such user views are a major part of a data modeling process where the relation-
ships between data elements are identified. Each data model defines the logical rela-
tionships among the data elements needed to support a basic business process. For
example, can a supplier provide more than one type of product to us? Can a customer
have more than one type of account with us? Can an employee have several pay rates
or be assigned to several project workgroups?
Answering such questions will identify data relationships that have to be represented in
a data model that supports a business process. These data models then serve as logical
frameworks (called schemas and subschemas) on which to base the physical design of
databases and the development of application programs to support the business processes
of the organization. A schema is an overall logical view of the relationships among the
data elements in a database, while the subschema is a logical view of the data
relationships needed to support specific end user application programs that will access
that database.

FIGURE 5.11
This entity relationship diagram illustrates some of the relationships among the entities
(product, supplier, warehouse, etc.) in a purchasing/receiving business process.
[Diagram. Entities: Purchase Order, Purchase Order Item, Product, Supplier, Product Stock,
Warehouse. Relationship labels: Contains, Ordered on, Supplies, Stocked as, Holds.]

FIGURE 5.12 Example of the logical and physical database views and the software interface
of a banking services information system.

Applications: Checking Application, Savings Application, Installment Loan Application.
Logical User Views: Checking and Savings Data Model; Installment Loan Data Model.
  Data elements and relationships (the subschemas) needed for checking, savings, or
  installment loan processing.
Banking Services Data Model: Data elements and relationships (the schema) needed for the
  support of all bank services.
Software Interface: Database Management System. The DBMS provides access to the bank’s
  databases.
Physical Data Views: Bank Databases. Organization and location of data on the storage
  media.
Remember that data models represent logical views of the data and relationships of
the database. Physical database design takes a physical view of the data (also called the
internal view) that describes how data are to be physically stored and accessed on the
storage devices of a computer system. For example, Figure 5.12 illustrates these dif-
ferent database views and the software interface of a bank database processing system.
In this example, checking, savings, and installment lending are the business processes
whose data models are part of a banking services data model that serves as a logical
data framework for all bank services.
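
One common way to realize a schema and its subschemas in a relational DBMS is with base
tables and views, as in the sketch below. The banking tables, columns, and view names are
hypothetical and only loosely follow Figure 5.12. The physical view never appears in the
code: the DBMS (here SQLite, through Python's `sqlite3` module) decides how the rows are
actually stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema: the overall logical view of all bank services (hypothetical tables).
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (
        account_id  INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        kind        TEXT CHECK (kind IN ('checking', 'savings', 'loan')),
        balance     REAL
    );
    -- Subschemas: narrower logical views for specific applications.
    CREATE VIEW checking_savings_view AS
        SELECT account_id, customer_id, kind, balance
        FROM account WHERE kind IN ('checking', 'savings');
    CREATE VIEW installment_loan_view AS
        SELECT account_id, customer_id, balance
        FROM account WHERE kind = 'loan';
    INSERT INTO customer VALUES (1, 'Pat Lee');
    INSERT INTO account VALUES (1, 1, 'checking', 250.0), (2, 1, 'loan', 9000.0);
""")

# Each application sees only the part of the schema its view exposes.
loan_rows = conn.execute("SELECT * FROM installment_loan_view").fetchall()
deposit_rows = conn.execute(
    "SELECT account_id, kind FROM checking_savings_view").fetchall()
```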

Aetna: Insuring Tons of Data

On a daily basis the operational services central support area at Aetna Inc. is
responsible for 21.8 tons of data (174.6 terabytes [TB]). Over 119.2TB reside on
mainframe-connected disk drives, while the remaining 55.4TB sit on disks attached to
midrange computers. Almost all of these data are located in the company’s headquarters in
Hartford, Connecticut, with most of the information in relational databases. To make
matters even more interesting, outside customers have access to about 20TB of the
information. Four interconnected data centers containing 14 mainframes and more than
1,000 midrange servers process the data. It takes more than 4,100 direct-access storage
devices to hold Aetna’s key databases.


Most of Aetna’s ever-growing mountain of data is health care information. The
insurance company maintains records for both health maintenance organization
participants and customers covered by insurance policies. Aetna has detailed
records of providers, such as doctors, hospitals, dentists, and pharmacies, and it
keeps track of all the claims it has processed. Some of Aetna’s larger customers send
tapes containing insured employee data; the firm is moving toward using the Internet
to collect such data.
If managing gigabytes of data is like flying a hang glider, managing multiple
terabytes of data is like piloting a space shuttle: a thousand times more complex.
You can’t just extrapolate from experiences with small and medium data stores to
understand how to successfully manage tons of data. Even an otherwise mundane
operation such as backing up a database can be daunting if the time needed to finish
copying the data exceeds the time available.
Data integrity, backup, security, and availability are collectively the Holy Grail of
dealing with large data stores. The sheer volume of data makes these goals a challenge,
and a highly decentralized environment complicates matters even more. Developing and
adhering to standardized data maintenance procedures always provide an organization with
the best return on its data dollar investment [9, 11].


SECTION II Managing Data Resources

Data Resource Management

Data are a vital organizational resource that needs to be managed like other important
business assets. Today’s business enterprises cannot survive or succeed without quality
data about their internal operations and external environment.
With each online mouse click, either a fresh bit of data is created or already-stored data are
retrieved from all those business websites. All that’s on top of the heavy demand for indus-
trial-strength data storage already in use by scores of big corporations. What’s driving the
growth is a crushing imperative for corporations to analyze every bit of information they can
extract from their huge data warehouses for competitive advantage. That has turned the
data storage and management function into a key strategic role of the information age [8].
That’s why organizations and their managers need to practice data resource man-
agement, a managerial activity that applies information systems technologies like data-
base management, data warehousing, and other data management tools to the task of
managing an organization’s data resources to meet the information needs of their busi-
ness stakeholders. This chapter will show you the managerial implications of using
data resource management technologies and methods to manage an organization’s data
assets to meet business information requirements.
Read the Real World Case on data administration. We can learn a lot from this case
about the challenges of managing the data within an organization. See Figure 5.13.

Types of Databases

Continuing developments in information technology and its business applications have
resulted in the evolution of several major types of databases. Figure 5.14 illustrates
several major conceptual categories of databases that may be found in many organizations.
Let’s take a brief look at some of them now.

Operational Databases

Operational databases store detailed data needed to support the business processes and
operations of a company. They are also called subject area databases (SADB), transaction
databases, and production databases. Examples are a customer database, human resource
database, inventory database, and other databases containing data generated by business
operations. For example, a human resource database like that shown earlier in Figure 5.2
would include data identifying each employee and his or her time worked, compensation,
benefits, performance appraisals, training and development status, and other related
human resource data. Figure 5.15 illustrates some of the common operational databases
that can be created and managed for a small business using Microsoft Access database
management software.

Distributed Databases

Many organizations replicate and distribute copies or parts of databases to network
servers at a variety of sites. These distributed databases can reside on network servers
on the World Wide Web, on corporate intranets or extranets, or on other company networks.
Distributed databases may be copies of operational or analytical databases, hypermedia or
discussion databases, or any other type of database. Replication and distribution of
databases are done to improve database performance at end user worksites. Ensuring that
the data in an organization’s distributed databases are consistently and concurrently
updated is a major challenge of distributed database management.
Distributed databases have both advantages and disadvantages. One primary ad-
vantage of a distributed database lies with the protection of valuable data. If all of an
organization’s data reside in a single physical location, any catastrophic event like a fire
or damage to the media holding the data would result in an equally catastrophic loss
of use of that data. By having databases distributed in multiple locations, the negative
impact of such an event can be minimized.


REAL WORLD CASE 2
Emerson and Sanofi: Data Stewards Seek Data Conformity

FIGURE 5.13
Source: Flying Colours Ltd./Digital Vision/Getty Images

A customer is a customer is a customer, right? Actually, it’s not that simple. Just ask
Emerson Process Management, an Emerson Electric Co. unit in Austin that supplies process
automation products. In 2000 the company attempted to build a data warehouse to store
customer information from over 85 countries. The effort failed in large part because the
structure of the warehouse couldn’t accommodate the many variations on customers’ names.

For instance, different users in different parts of the world might identify Exxon as
Exxon, Mobil, Esso, or ExxonMobil, to name a few variations. The warehouse would see them
as separate customers, and that would lead to inaccurate results when business users
performed queries.

That’s when the company hired Nancy Rybeck as data administrator. Rybeck is now leading a
renewed data warehouse project that ensures not only the standardization of customer
names, but also the quality and accuracy of customer data, including postal addresses,
shipping addresses, and province codes.

To accomplish this, Emerson has done something unusual: It has started to build a
department with 6 to 10 full-time “data stewards” dedicated to establishing and
maintaining the quality of data entered into the operational systems that feed the data
warehouse.

The practice of having formal data stewards is uncommon. Most companies recognize the
importance of data quality, but many treat it as a “find-and-fix” effort, to be conducted
at the end of a project by someone in IT. Others casually assign the job to the business
users who deal with the data head-on. Still others may throw resources at improving data
only when a major problem occurs.

“It’s usually a seesaw effect,” says Chris Enger, formerly manager of information
management at Philip Morris USA Inc. “When something goes wrong, they put someone in
charge of data quality, and when things get better, they pull those resources away.”

Creating a data quality team requires gathering people with an unusual mix of business,
technology, and diplomatic skills. It’s even difficult to agree on a job title. In
Rybeck’s department, they’re called “data analysts,” but titles at other companies
include “data quality control supervisor,” “data coordinator,” or “data quality manager.”

“When you say you want a data analyst, they’ll come back with a DBA [database
administrator]. But it’s not the same at all,” Rybeck says. “It’s not the data structure,
it’s the content.”

At Emerson, data analysts in each business unit review data and correct errors before
it’s put into the operational systems. They also research customer relationships,
locations, and corporate hierarchies; train overseas workers to fix data in their native
languages; and serve as the main contact with the data administrator and database
architect for new requirements and bug fixes.

As the leader of the group, Rybeck plays a role that includes establishing and
communicating data standards, ensuring data integrity is maintained during database
conversions, and doing the logical design for the data warehouse tables.

The stewards have their work cut out for them. Bringing together customer records from
the 75 business units yielded a 75 percent duplication rate, misspellings, and fields
with incorrect or missing data.

“Most of the divisions would have sworn they had great processes and standards in place,”
Rybeck says. “But when you show them they entered the customer name 17 different ways, or
someone had entered, ‘Loading dock open 8:00–4:00’ into the address field, they realize
it’s not as clean as they thought.”

Although the data steward may report to IT, as is the case at Emerson and at
pharmaceuticals company Sanofi-Synthelabo Inc., it’s not a job for someone steeped in
technical knowledge. Yet it’s not right for a businessperson who’s a technophobe, either.

Seth Cohen is the first data quality control supervisor at Sanofi in New York. He was
hired in 2003 to help design automated processes to ensure the data quality of the
customer knowledge base that Sanofi was beginning to build.

Data stewards at Sanofi need to have business knowledge because they need to make
frequent judgment calls, Cohen says. Indeed, judgment is a big part of the data steward’s
job, including the ability to determine where you don’t need 100 percent perfection.

Cohen says that task is one of the biggest challenges of the job. “One-hundred percent
accuracy is just not achievable,” he says. “Some things you’re just going to have to let
go or you’d have a data warehouse with only 15 to 20 records.”

A good example is when Sanofi purchases data on doctors that includes their birth dates,
Cohen says. If a birth date is given as February 31 or the number of the month is listed
as 13 but the rest of the data are good, do you throw out all of the data or just figure
the birth date isn’t all that important?

It comes down to knowing how much it costs to fix the data versus the payback. “You can
pay millions of dollars a year to get it perfect, but if the returns are in the hundreds
of thousands, is it worth it?” asks Chuck Kelley, senior advisory consultant at Navigator
Systems Inc., a corporate performance management consultancy in Addison, Texas.

Data stewards also need to be politically astute, diplomatic, and good at conflict
resolution, in part because the environment isn’t always friendly. When Cohen joined
Sanofi, some questioned why he was there. In particular, IT didn’t see why he was
“causing them so many headaches and adding several extra steps to the process,” he says.

There are many political traps as well. Take the issue of defining “customer address.” If
data comes from a variety of sources, you’re likely to get different types of coding
schemes, some of which overlap.

People may also argue about how data should be produced, he says. Should field
representatives enter it from their laptops? Or should it first be independently checked
for quality? Should it be uploaded hourly or weekly?

Most of all, data stewards need to understand that data quality is a journey, not a
destination. “It’s not a one-shot deal; it’s ongoing,” Rybeck of Emerson says. “You can’t
quit after the first task.”

Source: Adapted from Mary Brandel, “Data Stewards Seek Data Conformity,” Computerworld,
March 15, 2004. Copyright © 2004 by Computerworld Inc., Framingham, MA 01701. All rights
reserved.

CASE STUDY QUESTIONS
1. Why is the role of a data steward considered to be innovative? Explain.
2. What are the business benefits associated with the data steward program at Emerson?
3. How does effective data resource management contribute to the strategic goals of an
   organization? Provide examples from Emerson and others.

REAL WORLD ACTIVITIES
1. As discussed in the case, the role of data steward is relatively new, and its creation
   is motivated by the desire to protect the valuable data assets of the firm. There are
   many job descriptions in the modern organization associated with the strategic
   management of data resources. Using the Internet, see if you can find evidence of
   other job roles that are focused on the management of an organization’s data. How
   might a person train for these new jobs?
2. As more and more data are collected, stored, processed, and disseminated by
   organizations, new and innovative ways to manage them must be developed. Break into
   small groups with your classmates, and discuss how the data resource management
   methods of today will need to evolve as more types of data emerge. Will we ever get to
   the point where we can manage our data in a completely automated manner?


FIGURE 5.14 Examples of some of the major types of databases used by organizations and
end users.
[Diagram. Labels: Client PC; Network Server; External Databases on the Internet and Online
Services; Distributed Databases on Intranets and Other Networks; Operational Databases of
the Organization; End User Databases; Data Warehouse; Data Marts.]

Another advantage of distributed databases is found in their storage requirements.
Often, a large database system may be distributed into smaller databases based on
some logical relationship between the data and the location. For example, a company
with several branch operations may distribute its data so that each branch operation
location is also the location of its branch database. Because multiple databases in a
distributed system can be joined together, each location has control of its local data
while all other locations can access any database in the company if so desired.
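
The idea of separate branch databases that can still be queried together can be mimicked
on a single machine, as in the sketch below. It is only an analogy: real distributed
databases sit on different servers and are reached over a network, whereas here two
independent in-memory SQLite databases stand in for two branches. The branch names, the
`sales` table, and the sample rows are hypothetical.

```python
import sqlite3

# Two independent in-memory databases stand in for two branch locations.
conn = sqlite3.connect(":memory:")                   # "main" plays the north branch
conn.execute("ATTACH DATABASE ':memory:' AS south")  # a second, separate database

conn.execute("CREATE TABLE main.sales (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE south.sales (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO main.sales VALUES (1, 100.0)")
conn.execute("INSERT INTO south.sales VALUES (2, 250.0)")

def company_total(conn):
    """Combine the branch databases in one query for a company-wide view."""
    return conn.execute("""
        SELECT SUM(amount) FROM (
            SELECT amount FROM main.sales
            UNION ALL
            SELECT amount FROM south.sales
        )
    """).fetchone()[0]

total = company_total(conn)
```

Each branch keeps and updates its own table, yet a single query can still draw on both.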
FIGURE 5.15
Examples of operational databases that can be created and managed for a small business by
microcomputer database management software like Microsoft Access.

Source: Courtesy of Microsoft Corp.

Distributed databases are not without some challenges, however. The primary challenge is
the maintenance of data accuracy. If a company distributes its database to multiple
locations, any change to the data in one location must somehow be updated in all other
locations. This can be accomplished in one of two ways: replication or duplication.
Updating a distributed database using replication involves using a specialized soft-
ware application that looks at each distributed database and then finds the changes
made to it. Once these changes have been identified, the replication process makes all
of the distributed databases look the same by making the appropriate changes to each
one. The replication process is very complex and, depending upon the number and
size of the distributed databases, can consume a lot of time and computer resources.
The duplication process, in contrast, is much less complicated. It basically identifies
one database as a master and then duplicates that database at a prescribed time after
hours so that each distributed location has the same data. One drawback to the
duplication process is that changes can be made only to the master database; a change
made to any other copy would be overwritten during the next duplication.
Nonetheless, properly used, duplication and replication can keep all distributed
locations current with the latest data.
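
The difference between the two approaches can be sketched with plain Python dictionaries
standing in for the databases. This is a toy model under several assumptions: the
customer records are hypothetical, each site is compared against a shared baseline copy,
deletions are ignored, and when two sites change the same record the later site in the
list wins, a rule chosen arbitrarily for this sketch.

```python
def duplicate(master, copies):
    """Duplication: overwrite every copy with the master; local edits are lost."""
    return [dict(master) for _ in copies]

def replicate(baseline, sites):
    """Replication: find what each site changed since the baseline, then apply
    every change to every site so that all of them end up identical."""
    merged = dict(baseline)
    for site in sites:
        for key, value in site.items():
            if baseline.get(key) != value:   # this site changed or added the record
                merged[key] = value
    return [dict(merged) for _ in sites]

baseline = {"C1": "Exxon", "C2": "Shell"}
site_a = {"C1": "ExxonMobil", "C2": "Shell"}          # changed C1
site_b = {"C1": "Exxon", "C2": "Shell", "C3": "BP"}   # added C3

replicated = replicate(baseline, [site_a, site_b])
duplicated = duplicate(baseline, [site_a, site_b])
```

Replication keeps both local changes but has to inspect every site, which is where its
cost comes from. Duplication is a simple copy, but the edits made at `site_a` and `site_b`
are gone afterward.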
One additional challenge associated with distributed databases is the extra com-
puting power and bandwidth necessary to access multiple databases in multiple loca-
tions. We will look more closely at the issue of bandwidth in Chapter 6 when we focus
on telecommunications and networks.

External Databases

Access to a wealth of information from external databases is available for a fee from
commercial online services, and with or without charge from many sources on the World
Wide Web. Websites provide an endless variety of hyperlinked pages of multimedia
documents in hypermedia databases for you to access. Data are available in the form of
statistics on economic and demographic activity from statistical databanks. Or you can
view or download abstracts or complete copies of hundreds of newspapers, magazines,
newsletters, research papers, and other published material and periodicals from
bibliographic and full text databases. Whenever you use a search engine like Google or
Yahoo to look up something on the Internet, you are using an external database, and a
very, very large one!

Hypermedia Databases

The rapid growth of websites on the Internet and corporate intranets and extranets has
dramatically increased the use of databases of hypertext and hypermedia documents. A
website stores such information in a hypermedia database consisting of hyperlinked pages
of multimedia (text, graphic, and photographic images, video clips, audio segments, and
so on). That is, from a database management point of view, the set of interconnected
multimedia pages at a website is a database of interrelated hypermedia page elements,
rather than interrelated data records [2].
FIGURE 5.16 The components of a Web-based information system include Web browsers,
servers, and hypermedia databases.
[Diagram. Client PCs running a Web Browser connect through the Internet, Intranets, and
Extranets to a Network Server running Web Server Software, which accesses a Hypermedia
Database of HTML, XML, Web Pages, Image Files, Video Files, and Audio Files.]

Figure 5.16 shows how you might use a Web browser on your client PC to connect with a Web
network server. This server runs Web server software to access and transfer the Web pages
you request. The website illustrated in Figure 5.16 uses a hypermedia database consisting
of Web page content described by HTML (Hypertext Markup Language) code or XML (Extensible
Markup Language) labels, image files, video files, and audio. The Web server software
acts as a database management system to manage the transfer of hypermedia files for
downloading by the multimedia plug-ins of your Web browser.

FIGURE 5.17 The components of a complete data warehouse system.
[Diagram. Labels: Operational, External, and Other Databases; Data Acquisition (Capture,
clean, transform, transport, load/apply); Data Management; Enterprise Warehouse;
Analytical Data Store; Data Marts; Data Analysis (Query, report, analyze, mine, deliver);
Metadata Management; Metadata Directory; Metadata Repository; Warehouse Design; Web
Information Systems.]

Source: Adapted courtesy of Hewlett-Packard.

Data Warehouses and Data Mining

A data warehouse stores data that have been extracted from the various operational,
external, and other databases of an organization. It is a central source of the data that
have been cleaned, transformed, and cataloged so they can be used by managers and other
business professionals for data mining, online analytical processing, and other forms of
business analysis, market research, and decision support. (We’ll talk in depth about all
of these activities in Chapter 9.) Data warehouses may be subdivided into data marts,
which hold subsets of data from the warehouse that focus on specific aspects of a
company, such as a department or a business process.
Figure 5.17 illustrates the components of a complete data warehouse system. No-
tice how data from various operational and external databases are captured, cleaned,
and transformed into data that can be better used for analysis. This acquisition process
might include activities like consolidating data from several sources, filtering out un-
wanted data, correcting incorrect data, converting data to new data elements, and
aggregating data into new data subsets.
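
The acquisition activities listed above can be sketched as one small function. The code
below is a minimal illustration, not a real ETL tool: the two source extracts, the field
names, and the cleaning rules (trim and capitalize names, uppercase region codes, drop
rows whose amount is not a number) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical extracts from two operational sources.
ORDERS_WEB = [{"cust": "exxon ", "amount": "100.50", "region": "TX"},
              {"cust": "Shell", "amount": "n/a", "region": "TX"}]
ORDERS_STORE = [{"cust": "Exxon", "amount": "49.50", "region": "tx"}]

def acquire(*sources):
    """Consolidate, filter, correct, convert, and aggregate order records."""
    totals = defaultdict(float)
    for record in (r for source in sources for r in source):  # consolidate sources
        try:
            amount = float(record["amount"])                  # convert text to a number
        except ValueError:
            continue                                          # filter out unusable rows
        customer = record["cust"].strip().title()             # correct inconsistent names
        region = record["region"].upper()                     # correct inconsistent codes
        totals[(customer, region)] += amount                  # aggregate per customer/region
    return dict(totals)

warehouse_rows = acquire(ORDERS_WEB, ORDERS_STORE)
```

After cleaning, the two spellings of the same customer collapse into a single aggregated
row, which is the kind of consistency a warehouse needs before analysis.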
These data are then stored in the enterprise data warehouse, from where they can be moved
into data marts or to an analytical data store that holds data in a more useful form for
certain types of analysis. Metadata (data that define the data in the data warehouse) are
stored in a metadata repository and cataloged by a metadata directory. Finally, a variety
of analytical software tools can be provided to query, report, mine, and analyze the data
for delivery via Internet and intranet Web systems to business end users. See Figure 5.18.
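
A data mart and a metadata repository can both be pictured as ordinary tables that sit
beside the warehouse table. The sketch below assumes SQLite through Python's `sqlite3`
module; the table names, columns, and sample rows are hypothetical, and a production
warehouse would use dedicated tools for each of these components.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Enterprise warehouse table (hypothetical layout).
    CREATE TABLE warehouse_sales (dept TEXT, product TEXT, amount REAL);
    INSERT INTO warehouse_sales VALUES
        ('marketing', 'Widget', 120.0),
        ('finance',   'Widget',  80.0),
        ('marketing', 'Gadget',  40.0);
    -- Data mart: the subset of the warehouse that one department needs.
    CREATE TABLE marketing_mart AS
        SELECT product, amount FROM warehouse_sales WHERE dept = 'marketing';
    -- Metadata repository: data that describe the warehouse data.
    CREATE TABLE metadata (table_name TEXT, column_name TEXT, meaning TEXT);
    INSERT INTO metadata VALUES
        ('warehouse_sales', 'amount', 'Sale amount in U.S. dollars');
""")

mart_total = conn.execute("SELECT SUM(amount) FROM marketing_mart").fetchone()[0]
amount_meaning = conn.execute(
    "SELECT meaning FROM metadata WHERE table_name = 'warehouse_sales' "
    "AND column_name = 'amount'").fetchone()[0]
```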

FIGURE 5.18
A data warehouse and its data mart subsets hold data that have been extracted from
various operational databases for business analysis, market research, decision support,
and data mining applications.
[Diagram. Applications: ERP, Inventory control, Logistics, Shipping, Purchasing, CRM.
Center: Data Warehouse. Data Marts: Finance, Marketing, Sales, Accounting, Management
reporting.]

Revenue: Closing the Gap with a Data Warehouse

In the late 1990s the state of Iowa had a tax gap, a polite way of describing companies
and individuals who either didn’t file state tax returns or who underreported their
earnings. To identify noncompliant taxpayers, the Iowa Department of Revenue and Finance
(IDRF) relied on a jumble of nonintegrated mainframe applications, file extracts, and
over 20 disparate stand-alone systems (databases, mainframe data, and information on
individual spreadsheets, to name a few). The real problem was that none of these systems
could communicate with each other. What was needed was a central data warehouse to pull
together information from all those systems for analysis. But getting funding from the
state for such a large-scale project wasn’t an option.
So the IDRF came up with a plan the Iowa Legislature couldn’t help but ap-
prove. The plan was simple: Build a data warehouse that would be entirely funded
using the additional tax revenue it generated by catching tax scofflaws.
Development of the data warehouse began in November 1999, and it became
operational five months later. The system combines data from the department’s
own tax and accounts receivable systems, tax files shared by the federal Internal
Revenue Service, the Iowa Workforce Development Agency, and a number of other
sources. Revenue- and finance-department employees analyze the data using com-
mercially available reporting software.
In the three years since it went live, the IDRF data warehouse has generated $28
million in tax revenue and is expected to generate $10 million each year from now
on. There’s no question the project has paid for itself many times over, and the state
of Iowa is sold on the value of data warehousing. The next step is to use the data
warehouse to better understand why taxpayers might be in noncompliance. That will
involve analyzing taxpayer demographics and changes in tax laws and policies. This
phase of the project is also expected to generate revenues for the state while simulta-
neously helping to improve the tax laws for the citizens of Iowa [12, 13].
