Oracle9i
Data Warehousing Guide
Release 2 (9.2)
March 2002
Part No. A96520-01
Oracle9i Data Warehousing Guide, Release 2 (9.2)
Copyright © 1996, 2002 Oracle Corporation. All rights reserved.
Primary Author: Paul Lane
Contributing Authors: Viv Schupmann (Change Data Capture)
Contributors: Patrick Amor, Hermann Baer, Subhransu Basu, Srikanth Bellamkonda, Randy Bello,
Tolga Bozkaya, Benoit Dageville, John Haydu, Lilian Hobbs, Hakan Jakobsson, George Lumpkin, Cetin
Ozbutun, Jack Raitto, Ray Roccaforte, Sankar Subramanian, Gregory Smith, Ashish Thusoo,
Jean-Francois Verrier, Gary Vincent, Andy Witkowski, Zia Ziauddin
Graphic Designer: Valarie Moore
The Programs (which include both the software and documentation) contain proprietary information of
Oracle Corporation; they are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright, patent and other intellectual and industrial property
laws. Reverse engineering, disassembly or decompilation of the Programs, except to the extent required
to obtain interoperability with other independently created software or as specified by law, is prohibited.
The information contained in this document is subject to change without notice. If you find any problems
in the documentation, please report them to us in writing. Oracle Corporation does not warrant that this
document is error-free. Except as may be expressly permitted in your license agreement for these
Programs, no part of these Programs may be reproduced or transmitted in any form or by any means,
electronic or mechanical, for any purpose, without the express written permission of Oracle Corporation.
If the Programs are delivered to the U.S. Government or anyone licensing or using the programs on
behalf of the U.S. Government, the following notice is applicable:
Restricted Rights Notice: Programs delivered subject to the DOD FAR Supplement are "commercial
computer software" and use, duplication, and disclosure of the Programs, including documentation,
shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement.
Otherwise, Programs delivered subject to the Federal Acquisition Regulations are "restricted computer
software" and use, duplication, and disclosure of the Programs shall be subject to the restrictions in FAR
52.227-19, Commercial Computer Software - Restricted Rights (June, 1987). Oracle Corporation, 500
Oracle Parkway, Redwood City, CA 94065.
The Programs are not intended for use in any nuclear, aviation, mass transit, medical, or other inherently
dangerous applications. It shall be the licensee's responsibility to take all appropriate fail-safe, backup,
redundancy, and other measures to ensure the safe use of such applications if the Programs are used for
such purposes, and Oracle Corporation disclaims liability for any damages caused by such use of the
Programs.
Oracle is a registered trademark, and Express, Oracle Expert, Oracle Store, Oracle7, Oracle8, Oracle8i,
Oracle9i, PL/SQL, Pro*C, and SQL*Plus are trademarks or registered trademarks of Oracle
Corporation. Other names may be trademarks of their respective owners.
Contents
Send Us Your Comments ................................................................................................................. xix
Preface.......................................................................................................................................................... xxi
What’s New in Data Warehousing?........................................................................................ xxxiii
Part I Concepts
1 Data Warehousing Concepts
What is a Data Warehouse?............................................................................................................... 1-2
Subject Oriented............................................................................................................................ 1-2
Integrated....................................................................................................................................... 1-2
Nonvolatile .................................................................................................................................... 1-3
Time Variant.................................................................................................................................. 1-3
Contrasting OLTP and Data Warehousing Environments..................................................... 1-3
Data Warehouse Architectures......................................................................................................... 1-5
Data Warehouse Architecture (Basic)........................................................................................ 1-5
Data Warehouse Architecture (with a Staging Area).............................................................. 1-6
Data Warehouse Architecture (with a Staging Area and Data Marts) ................................. 1-7
Part II Logical Design
2 Logical Design in Data Warehouses
Logical Versus Physical Design in Data Warehouses.................................................................. 2-2
Creating a Logical Design ................................................................................................................. 2-2
Data Warehousing Schemas.............................................................................................................. 2-3
Star Schemas.................................................................................................................................. 2-4
Other Schemas............................................................................................................................... 2-5
Data Warehousing Objects................................................................................................................ 2-5
Fact Tables...................................................................................................................................... 2-5
Dimension Tables ......................................................................................................................... 2-6
Unique Identifiers......................................................................................................................... 2-8
Relationships ................................................................................................................................. 2-8
Example of Data Warehousing Objects and Their Relationships.......................................... 2-8
Part III Physical Design
3 Physical Design in Data Warehouses
Moving from Logical to Physical Design....................................................................................... 3-2
Physical Design................................................................................................................................... 3-2
Physical Design Structures.......................................................................................................... 3-4
Tablespaces.................................................................................................................................... 3-4
Tables and Partitioned Tables..................................................................................................... 3-5
Views .............................................................................................................................................. 3-6
Integrity Constraints .................................................................................................................... 3-6
Indexes and Partitioned Indexes ................................................................................................ 3-6
Materialized Views....................................................................................................................... 3-7
Dimensions .................................................................................................................................... 3-7
4 Hardware and I/O Considerations in Data Warehouses
Overview of Hardware and I/O Considerations in Data Warehouses ..................................... 4-2
Why Stripe the Data?.................................................................................................................... 4-2
Automatic Striping ....................................................................................................................... 4-3
Manual Striping ............................................................................................................................ 4-4
Local and Global Striping............................................................................................................ 4-5
Analyzing Striping ....................................................................................................................... 4-6
RAID Configurations ......................................................................................................................... 4-9
RAID 0 (Striping) ........................................................................................................................ 4-10
RAID 1 (Mirroring)..................................................................................................................... 4-10
RAID 0+1 (Striping and Mirroring) ......................................................................................... 4-10
Striping, Mirroring, and Media Recovery............................................................................... 4-10
RAID 5.......................................................................................................................................... 4-11
The Importance of Specific Analysis........................................................................................ 4-12
5 Parallelism and Partitioning in Data Warehouses
Overview of Parallel Execution........................................................................................................ 5-2
When to Implement Parallel Execution..................................................................................... 5-2
Granules of Parallelism..................................................................................................................... 5-3
Block Range Granules.................................................................................................................. 5-3
Partition Granules......................................................................................................................... 5-4
Partitioning Design Considerations ............................................................................................... 5-4
Types of Partitioning.................................................................................................................... 5-4
Partitioning and Data Segment Compression........................................................................ 5-17
Partition Pruning ........................................................................................................................ 5-19
Partition-Wise Joins.................................................................................................................... 5-21
Miscellaneous Partition Operations ............................................................................................. 5-31
Adding Partitions ....................................................................................................................... 5-32
Dropping Partitions.................................................................................................................... 5-33
Exchanging Partitions................................................................................................................ 5-34
Moving Partitions....................................................................................................................... 5-34
Splitting and Merging Partitions.............................................................................................. 5-35
Truncating Partitions ................................................................................................................. 5-35
Coalescing Partitions.................................................................................................................. 5-36
6 Indexes
Bitmap Indexes.................................................................................................................................... 6-2
Bitmap Join Indexes...................................................................................................................... 6-6
B-tree Indexes .................................................................................................................................... 6-10
Local Indexes Versus Global Indexes ........................................................................................... 6-10
7 Integrity Constraints
Why Integrity Constraints are Useful in a Data Warehouse ...................................................... 7-2
Overview of Constraint States.......................................................................................................... 7-3
Typical Data Warehouse Integrity Constraints ............................................................................. 7-4
UNIQUE Constraints in a Data Warehouse ............................................................................. 7-4
FOREIGN KEY Constraints in a Data Warehouse................................................................... 7-5
RELY Constraints.......................................................................................................................... 7-6
Integrity Constraints and Parallelism........................................................................................ 7-7
Integrity Constraints and Partitioning....................................................................................... 7-7
View Constraints........................................................................................................................... 7-7
8 Materialized Views
Overview of Data Warehousing with Materialized Views......................................................... 8-2
Materialized Views for Data Warehouses................................................................................. 8-2
Materialized Views for Distributed Computing...................................................................... 8-3
Materialized Views for Mobile Computing.............................................................................. 8-3
The Need for Materialized Views .............................................................................................. 8-3
Components of Summary Management ................................................................................... 8-5
Data Warehousing Terminology ................................................................................................ 8-7
Materialized View Schema Design ............................................................................................ 8-8
Loading Data ............................................................................................................................... 8-10
Overview of Materialized View Management Tasks............................................................ 8-11
Types of Materialized Views .......................................................................................................... 8-12
Materialized Views with Aggregates....................................................................................... 8-13
Materialized Views Containing Only Joins ............................................................................ 8-16
Nested Materialized Views ....................................................................................................... 8-18
Creating Materialized Views.......................................................................................................... 8-21
Naming Materialized Views ..................................................................................................... 8-22
Storage And Data Segment Compression............................................................................... 8-23
Build Methods............................................................................................................................. 8-23
Enabling Query Rewrite ............................................................................................................ 8-24
Query Rewrite Restrictions ....................................................................................................... 8-24
Refresh Options........................................................................................................................... 8-25
ORDER BY Clause ...................................................................................................................... 8-31
Materialized View Logs............................................................................................................. 8-31
Using Oracle Enterprise Manager............................................................................................ 8-32
Using Materialized Views with NLS Parameters .................................................................. 8-32
Registering Existing Materialized Views..................................................................................... 8-33
Partitioning and Materialized Views............................................................................................ 8-35
Partition Change Tracking ........................................................................................................ 8-35
Partitioning a Materialized View ............................................................................................. 8-39
Partitioning a Prebuilt Table..................................................................................................... 8-40
Rolling Materialized Views....................................................................................................... 8-41
Materialized Views in OLAP Environments............................................................................... 8-41
OLAP Cubes................................................................................................................................ 8-41
Specifying OLAP Cubes in SQL............................................................................................... 8-42
Querying OLAP Cubes in SQL................................................................................................. 8-43
Partitioning Materialized Views for OLAP ............................................................................ 8-47
Compressing Materialized Views for OLAP.......................................................................... 8-47
Materialized Views with Set Operators .................................................................................. 8-47
Choosing Indexes for Materialized Views................................................................................... 8-49
Invalidating Materialized Views................................................................................................... 8-50
Security Issues with Materialized Views..................................................................................... 8-50
Altering Materialized Views .......................................................................................................... 8-51
Dropping Materialized Views........................................................................................................ 8-52
Analyzing Materialized View Capabilities................................................................................. 8-52
Using the DBMS_MVIEW.EXPLAIN_MVIEW Procedure................................................... 8-53
MV_CAPABILITIES_TABLE.CAPABILITY_NAME Details............................................... 8-56
MV_CAPABILITIES_TABLE Column Details ....................................................................... 8-58
9 Dimensions
What are Dimensions?....................................................................................................................... 9-2
Creating Dimensions ......................................................................................................................... 9-4
Multiple Hierarchies .................................................................................................................... 9-7
Using Normalized Dimension Tables ....................................................................................... 9-9
Viewing Dimensions........................................................................................................................ 9-10
Using The DEMO_DIM Package.............................................................................................. 9-10
Using Oracle Enterprise Manager............................................................................................ 9-11
Using Dimensions with Constraints............................................................................................. 9-11
Validating Dimensions.................................................................................................................... 9-12
Altering Dimensions........................................................................................................................ 9-13
Deleting Dimensions....................................................................................................................... 9-14
Using the Dimension Wizard ......................................................................................................... 9-14
Managing the Dimension Object.............................................................................................. 9-14
Creating a Dimension................................................................................................................. 9-17
Part IV Managing the Warehouse Environment
10 Overview of Extraction, Transformation, and Loading
Overview of ETL ............................................................................................................................... 10-2
ETL Tools ............................................................................................................................................ 10-3
Daily Operations......................................................................................................................... 10-4
Evolution of the Data Warehouse ............................................................................................ 10-4
11 Extraction in Data Warehouses
Overview of Extraction in Data Warehouses............................................................................... 11-2
Introduction to Extraction Methods in Data Warehouses......................................................... 11-2
Logical Extraction Methods....................................................................................................... 11-3
Physical Extraction Methods..................................................................................................... 11-4
Change Data Capture................................................................................................................. 11-5
Data Warehousing Extraction Examples....................................................................................... 11-8
Extraction Using Data Files....................................................................................................... 11-8
Extraction Via Distributed Operations.................................................................................. 11-11
12 Transportation in Data Warehouses
Overview of Transportation in Data Warehouses ...................................................................... 12-2
Introduction to Transportation Mechanisms in Data Warehouses ......................................... 12-2
Transportation Using Flat Files ................................................................................................ 12-2
Transportation Through Distributed Operations .................................................................. 12-2
Transportation Using Transportable Tablespaces ................................................................. 12-3
13 Loading and Transformation
Overview of Loading and Transformation in Data Warehouses ............................................. 13-2
Transformation Flow.................................................................................................................. 13-2
Loading Mechanisms ....................................................................................................................... 13-5
SQL*Loader ................................................................................................................................. 13-5
External Tables............................................................................................................................ 13-6
OCI and Direct-Path APIs ......................................................................................................... 13-8
Export/Import ............................................................................................................................ 13-8
Transformation Mechanisms.......................................................................................................... 13-9
Transformation Using SQL ....................................................................................................... 13-9
Transformation Using PL/SQL.............................................................................................. 13-15
Transformation Using Table Functions................................................................................. 13-16
Loading and Transformation Scenarios...................................................................................... 13-25
Parallel Load Scenario.............................................................................................................. 13-25
Key Lookup Scenario ............................................................................................................... 13-33
Exception Handling Scenario ................................................................................................. 13-34
Pivoting Scenarios .................................................................................................................... 13-35
14 Maintaining the Data Warehouse
Using Partitioning to Improve Data Warehouse Refresh ......................................................... 14-2
Refresh Scenarios........................................................................................................................ 14-5
Scenarios for Using Partitioning for Refreshing Data Warehouses .................................... 14-7
Optimizing DML Operations During Refresh ........................................................................... 14-8
Implementing an Efficient MERGE Operation ...................................................................... 14-9
Maintaining Referential Integrity........................................................................................... 14-10
Purging Data ............................................................................................................................. 14-11
Refreshing Materialized Views ................................................................................................... 14-12
Complete Refresh ..................................................................................................................... 14-13
Fast Refresh ............................................................................................................................... 14-14
ON COMMIT Refresh.............................................................................................................. 14-14
Manual Refresh Using the DBMS_MVIEW Package .......................................................... 14-14
Refresh Specific Materialized Views with REFRESH.......................................................... 14-15
Refresh All Materialized Views with REFRESH_ALL_MVIEWS ..................................... 14-16
Refresh Dependent Materialized Views with REFRESH_DEPENDENT......................... 14-16
Using Job Queues for Refresh................................................................................................. 14-18
When Refresh is Possible......................................................................................................... 14-18
Recommended Initialization Parameters for Parallelism................................................... 14-18
Monitoring a Refresh ............................................................................................................... 14-19
Checking the Status of a Materialized View......................................................................... 14-19
Tips for Refreshing Materialized Views with Aggregates ................................................. 14-19
Tips for Refreshing Materialized Views Without Aggregates........................................... 14-22
Tips for Refreshing Nested Materialized Views .................................................................. 14-23
Tips for Fast Refresh with UNION ALL ............................................................................... 14-25
Tips After Refreshing Materialized Views............................................................................ 14-25
Using Materialized Views with Partitioned Tables ................................................................. 14-26
Fast Refresh with Partition Change Tracking....................................................................... 14-26
Fast Refresh with CONSIDER FRESH................................................................................... 14-30
15 Change Data Capture
About Change Data Capture........................................................................................................... 15-2
Publish and Subscribe Model.................................................................................................... 15-3
Example of a Change Data Capture System........................................................................... 15-4
Components and Terminology for Synchronous Change Data Capture ........................... 15-5
Installation and Implementation................................................................................................... 15-8
Change Data Capture Restriction on Direct-Path INSERT................................................... 15-8
Security ............................................................................................................................................... 15-9
Columns in a Change Table............................................................................................................ 15-9
Change Data Capture Views......................................................................................................... 15-10
Synchronous Mode of Data Capture........................................................................................... 15-12
Publishing Change Data................................................................................................................ 15-12
Step 1: Decide which Oracle Instance will be the Source System...................................... 15-12
Step 2: Create the Change Tables that will Contain the Changes...................................... 15-12
Managing Change Tables and Subscriptions............................................................................ 15-14
Subscribing to Change Data......................................................................................................... 15-15
Steps Required to Subscribe to Change Data ....................................................................... 15-15
What Happens to Subscriptions when the Publisher Makes Changes............................. 15-19
Export and Import Considerations .............................................................................................. 15-20
16 Summary Advisor
Overview of the Summary Advisor in the DBMS_OLAP Package ........................................ 16-2
Using the Summary Advisor .......................................................................................................... 16-6
Identifier Numbers ..................................................................................................................... 16-7
Workload Management ............................................................................................................. 16-7
Loading a User-Defined Workload.......................................................................................... 16-9
Loading a Trace Workload...................................................................................................... 16-12
Loading a SQL Cache Workload............................................................................................ 16-15
Validating a Workload............................................................................................................. 16-17
Removing a Workload............................................................................................................. 16-18
Using Filters with the Summary Advisor............................................................................. 16-18
Removing a Filter ..................................................................................................................... 16-22
Recommending Materialized Views...................................................................................... 16-23
SQL Script Generation ............................................................................................................. 16-27
Summary Data Report ............................................................................................................. 16-29
When Recommendations are No Longer Required............................................................. 16-31
Stopping the Recommendation Process................................................................................ 16-32
Summary Advisor Sample Sessions ...................................................................................... 16-32
Summary Advisor and Missing Statistics............................................................................. 16-37
Summary Advisor Privileges and ORA-30446..................................................................... 16-38
Estimating Materialized View Size............................................................................................. 16-38
ESTIMATE_MVIEW_SIZE Parameters................................................................................. 16-38
Is a Materialized View Being Used? ........................................................................................... 16-39
DBMS_OLAP.EVALUATE_MVIEW_STRATEGY Procedure........................................... 16-39
Summary Advisor Wizard............................................................................................................. 16-40
Summary Advisor Steps.......................................................................................................... 16-41
Part V Warehouse Performance
17 Schema Modeling Techniques
Schemas in Data Warehouses......................................................................................................... 17-2
Third Normal Form .......................................................................................................................... 17-2
Optimizing Third Normal Form Queries................................................................................ 17-3
Star Schemas...................................................................................................................................... 17-4
Snowflake Schemas .................................................................................................................... 17-5
Optimizing Star Queries ................................................................................................................. 17-6
Tuning Star Queries ................................................................................................................... 17-6
Using Star Transformation........................................................................................................ 17-7
18 SQL for Aggregation in Data Warehouses
Overview of SQL for Aggregation in Data Warehouses........................................................... 18-2
Analyzing Across Multiple Dimensions ................................................................................. 18-3
Optimized Performance............................................................................................................. 18-4
An Aggregate Scenario .............................................................................................................. 18-5
Interpreting NULLs in Examples ............................................................................................. 18-6
ROLLUP Extension to GROUP BY................................................................................................ 18-6
When to Use ROLLUP ............................................................................................................... 18-7
ROLLUP Syntax.......................................................................................................................... 18-7
Partial Rollup............................................................................................................................... 18-8
CUBE Extension to GROUP BY ................................................................................................... 18-10
When to Use CUBE................................................................................................................... 18-10
CUBE Syntax ............................................................................................................................. 18-11
Partial CUBE.............................................................................................................................. 18-12
Calculating Subtotals Without CUBE.................................................................................... 18-13
GROUPING Functions .................................................................................................................. 18-13
GROUPING Function .............................................................................................................. 18-14
When to Use GROUPING ....................................................................................................... 18-16
GROUPING_ID Function........................................................................................................ 18-17
GROUP_ID Function................................................................................................................ 18-17
GROUPING SETS Expression ..................................................................................................... 18-19
Composite Columns....................................................................................................................... 18-21
Concatenated Groupings............................................................................................................... 18-24
Concatenated Groupings and Hierarchical Data Cubes..................................................... 18-26
Considerations when Using Aggregation.................................................................................. 18-28
Hierarchy Handling in ROLLUP and CUBE........................................................................ 18-28
Column Capacity in ROLLUP and CUBE............................................................................. 18-29
HAVING Clause Used with GROUP BY Extensions .......................................................... 18-29
ORDER BY Clause Used with GROUP BY Extensions ....................................................... 18-30
Using Other Aggregate Functions with ROLLUP and CUBE............................................ 18-30
Computation Using the WITH Clause........................................................................................ 18-30
19 SQL for Analysis in Data Warehouses
Overview of SQL for Analysis in Data Warehouses.................................................................. 19-2
Ranking Functions............................................................................................................................ 19-5
RANK and DENSE_RANK....................................................................................................... 19-5
Top N Ranking.......................................................................................................................... 19-12
Bottom N Ranking.................................................................................................................... 19-12
CUME_DIST.............................................................................................................................. 19-13
PERCENT_RANK..................................................................................................................... 19-14
NTILE......................................................................................................................................... 19-14
ROW_NUMBER........................................................................................................................ 19-16
Windowing Aggregate Functions................................................................................................ 19-17
Treatment of NULLs as Input to Window Functions ......................................................... 19-18
Windowing Functions with Logical Offset........................................................................... 19-18
Cumulative Aggregate Function Example ........................................................................... 19-18
Moving Aggregate Function Example .................................................................................. 19-19
Centered Aggregate Function................................................................................................. 19-20
Windowing Aggregate Functions in the Presence of Duplicates...................................... 19-21
Varying Window Size for Each Row ..................................................................................... 19-22
Windowing Aggregate Functions with Physical Offsets.................................................... 19-23
FIRST_VALUE and LAST_VALUE ....................................................................................... 19-24
Reporting Aggregate Functions ................................................................................................... 19-24
Reporting Aggregate Example ............................................................................................... 19-26
RATIO_TO_REPORT............................................................................................................... 19-27
LAG/LEAD Functions.................................................................................................................... 19-27
LAG/LEAD Syntax.................................................................................................................. 19-28
FIRST/LAST Functions.................................................................................................................. 19-28
FIRST/LAST Syntax................................................................................................................. 19-29
FIRST/LAST As Regular Aggregates.................................................................................... 19-29
FIRST/LAST As Reporting Aggregates................................................................................ 19-30
Linear Regression Functions ........................................................................................................ 19-31
REGR_COUNT ......................................................................................................................... 19-32
REGR_AVGY and REGR_AVGX ........................................................................................... 19-32
REGR_SLOPE and REGR_INTERCEPT................................................................................ 19-32
REGR_R2.................................................................................................................................... 19-32
REGR_SXX, REGR_SYY, and REGR_SXY............................................................................. 19-33
Linear Regression Statistics Examples................................................................................... 19-33
Sample Linear Regression Calculation.................................................................................. 19-34
Inverse Percentile Functions......................................................................................................... 19-34
Normal Aggregate Syntax....................................................................................................... 19-35
Inverse Percentile Restrictions................................................................................................ 19-38
Hypothetical Rank and Distribution Functions ....................................................................... 19-38
Hypothetical Rank and Distribution Syntax......................................................................... 19-38
WIDTH_BUCKET Function.......................................................................................................... 19-40
WIDTH_BUCKET Syntax........................................................................................................ 19-40
User-Defined Aggregate Functions ............................................................................................. 19-43
CASE Expressions........................................................................................................................... 19-44
CASE Example .......................................................................................................................... 19-44
Creating Histograms With User-Defined Buckets............................................................... 19-45
20 OLAP and Data Mining
OLAP ................................................................................................................................................... 20-2
Benefits of OLAP and RDBMS Integration............................................................................. 20-2
Data Mining....................................................................................................................................... 20-4
Enabling Data Mining Applications ........................................................................................ 20-5
Predictions and Insights ............................................................................................................ 20-5
Mining Within the Database Architecture.............................................................................. 20-5
Java API........................................................................................................................................ 20-7
21 Using Parallel Execution
Introduction to Parallel Execution Tuning................................................................................... 21-2
When to Implement Parallel Execution................................................................................... 21-2
Operations That Can Be Parallelized....................................................................................... 21-3
The Parallel Execution Server Pool .......................................................................................... 21-3
How Parallel Execution Servers Communicate ..................................................................... 21-5
Parallelizing SQL Statements.................................................................................................... 21-6
Types of Parallelism ....................................................................................................................... 21-11
Parallel Query............................................................................................................................ 21-11
Parallel DDL .............................................................................................................................. 21-13
Parallel DML.............................................................................................................................. 21-18
Parallel Execution of Functions .............................................................................................. 21-28
Other Types of Parallelism...................................................................................................... 21-29
Initializing and Tuning Parameters for Parallel Execution .................................................... 21-30
Selecting Automated or Manual Tuning of Parallel Execution ......................................... 21-31
Using Automatically Derived Parameter Settings............................................................... 21-31
Setting the Degree of Parallelism ........................................................................................... 21-32
How Oracle Determines the Degree of Parallelism for Operations.................................. 21-34
Balancing the Workload .......................................................................................................... 21-37
Parallelization Rules for SQL Statements.............................................................................. 21-38
Enabling Parallelism for Tables and Queries ....................................................................... 21-46
Degree of Parallelism and Adaptive Multiuser: How They Interact................................ 21-47
Forcing Parallel Execution for a Session ............................................................................... 21-48
Controlling Performance with the Degree of Parallelism .................................................. 21-48
Tuning General Parameters for Parallel Execution.................................................................. 21-49
Parameters Establishing Resource Limits for Parallel Operations.................................... 21-49
Parameters Affecting Resource Consumption ..................................................................... 21-58
Parameters Related to I/O ...................................................................................................... 21-63
Monitoring and Diagnosing Parallel Execution Performance............................................... 21-64
Is There Regression?................................................................................................................. 21-66
Is There a Plan Change?........................................................................................................... 21-66
Is There a Parallel Plan?........................................................................................................... 21-66
Is There a Serial Plan? .............................................................................................................. 21-66
Is There Parallel Execution?.................................................................................................... 21-67
Is the Workload Evenly Distributed? .................................................................................... 21-67
Monitoring Parallel Execution Performance with Dynamic Performance Views .......... 21-68
Monitoring Session Statistics .................................................................................................. 21-71
Monitoring System Statistics................................................................................................... 21-73
Monitoring Operating System Statistics................................................................................ 21-74
Affinity and Parallel Operations.................................................................................................. 21-75
Affinity and Parallel Queries .................................................................................................. 21-75
Affinity and Parallel DML....................................................................................................... 21-76
Miscellaneous Parallel Execution Tuning Tips......................................................................... 21-76
Setting Buffer Cache Size for Parallel Operations ............................................................... 21-77
Overriding the Default Degree of Parallelism...................................................................... 21-77
Rewriting SQL Statements ...................................................................................................... 21-78
Creating and Populating Tables in Parallel.......................................................................... 21-78
Creating Temporary Tablespaces for Parallel Sort and Hash Join.................................... 21-80
Executing Parallel SQL Statements........................................................................................ 21-81
Using EXPLAIN PLAN to Show Parallel Operations Plans .............................................. 21-81
Additional Considerations for Parallel DML ....................................................................... 21-82
Creating Indexes in Parallel .................................................................................................... 21-85
Parallel DML Tips..................................................................................................................... 21-87
Incremental Data Loading in Parallel.................................................................................... 21-90
Using Hints with Cost-Based Optimization ......................................................................... 21-92
FIRST_ROWS(n) Hint .............................................................................................................. 21-93
Enabling Dynamic Statistic Sampling.................................................................................... 21-93
22 Query Rewrite
Overview of Query Rewrite............................................................................................................ 22-2
Cost-Based Rewrite..................................................................................................................... 22-3
When Does Oracle Rewrite a Query? ...................................................................................... 22-4
Enabling Query Rewrite.................................................................................................................. 22-7
Initialization Parameters for Query Rewrite .......................................................................... 22-8
Controlling Query Rewrite........................................................................................................ 22-8
Privileges for Enabling Query Rewrite.................................................................................... 22-9
Accuracy of Query Rewrite..................................................................................................... 22-10
How Oracle Rewrites Queries...................................................................................................... 22-11
Text Match Rewrite Methods.................................................................................................. 22-12
General Query Rewrite Methods............................................................................................ 22-13
When are Constraints and Dimensions Needed? ................................................................ 22-14
Special Cases for Query Rewrite ................................................................................................. 22-45
Query Rewrite Using Partially Stale Materialized Views................................................... 22-45
Query Rewrite Using Complex Materialized Views........................................................... 22-49
Query Rewrite Using Nested Materialized Views............................................................... 22-50
Query Rewrite When Using GROUP BY Extensions .......................................................... 22-51
Did Query Rewrite Occur?............................................................................................................ 22-56
Explain Plan............................................................................................................................... 22-56
DBMS_MVIEW.EXPLAIN_REWRITE Procedure ............................................................... 22-57
Design Considerations for Improving Query Rewrite Capabilities..................................... 22-63
Query Rewrite Considerations: Constraints......................................................................... 22-63
Query Rewrite Considerations: Dimensions ........................................................................ 22-63
Query Rewrite Considerations: Outer Joins ......................................................................... 22-63
Query Rewrite Considerations: Text Match ......................................................................... 22-63
Query Rewrite Considerations: Aggregates ......................................................................... 22-64
Query Rewrite Considerations: Grouping Conditions ....................................................... 22-64
Query Rewrite Considerations: Expression Matching........................................................ 22-64
Query Rewrite Considerations: Date Folding...................................................................... 22-65
Query Rewrite Considerations: Statistics.............................................................................. 22-65
Glossary
Index
Send Us Your Comments
Oracle9i Data Warehousing Guide, Release 2 (9.2)
Part No. A96520-01
Oracle Corporation welcomes your comments and suggestions on the quality and usefulness of this
document. Your input is an important part of the information used for revision.
• Did you find any errors?
• Is the information clearly presented?
• Do you need more information? If so, where?
• Are the examples correct? Do you need more examples?
• What features did you like most?
If you find any errors or have any other suggestions for improvement, please indicate the document
title and part number, and the chapter, section, and page number (if available). You can send
comments to us in the following ways:
• Electronic mail: infodev_us@oracle.com
• FAX: (650) 506-7227 Attn: Server Technologies Documentation Manager
• Postal service:
Oracle Corporation
Server Technologies Documentation
500 Oracle Parkway, Mailstop 4op11
Redwood Shores, CA 94065
USA
If you would like a reply, please give your name, address, telephone number, and (optionally)
electronic mail address.
If you have problems with the software, please contact your local Oracle Support Services.
Preface
This manual provides information about Oracle9i’s data warehousing capabilities.
This preface contains these topics:
• Audience
• Organization
• Related Documentation
• Conventions
• Documentation Accessibility
Audience
Oracle9i Data Warehousing Guide is intended for database administrators, system
administrators, and database application developers who design, maintain, and use
data warehouses.
To use this document, you need to be familiar with relational database concepts,
basic Oracle server concepts, and the operating system environment under which
you are running Oracle.
Organization
This document contains:
Part 1: Concepts
Chapter 1, Data Warehousing Concepts
This chapter contains an overview of data warehousing concepts.
Part 2: Logical Design
Chapter 2, Logical Design in Data Warehouses
This chapter discusses the logical design of a data warehouse.
Part 3: Physical Design
Chapter 3, Physical Design in Data Warehouses
This chapter discusses the physical design of a data warehouse.
Chapter 4, Hardware and I/O Considerations in Data Warehouses
This chapter describes some hardware and input-output issues.
Chapter 5, Parallelism and Partitioning in Data Warehouses
This chapter describes the basics of parallelism and partitioning in data
warehouses.
Chapter 6, Indexes
This chapter describes how to use indexes in data warehouses.
Chapter 7, Integrity Constraints
This chapter describes some issues involving constraints.
Chapter 8, Materialized Views
This chapter describes how to use materialized views in data warehouses.
Chapter 9, Dimensions
This chapter describes how to use dimensions in data warehouses.
Part 4: Managing the Warehouse Environment
Chapter 10, Overview of Extraction, Transformation, and Loading
This chapter is an overview of the ETL process.
Chapter 11, Extraction in Data Warehouses
This chapter describes extraction issues.
Chapter 12, Transportation in Data Warehouses
This chapter describes transporting data in data warehouses.
Chapter 13, Loading and Transformation
This chapter describes transforming data in data warehouses.
Chapter 14, Maintaining the Data Warehouse
This chapter describes how to refresh in a data warehousing environment.
Chapter 15, Change Data Capture
This chapter describes how to use Change Data Capture capabilities.
Chapter 16, Summary Advisor
This chapter describes how to use the Summary Advisor utility.
Part 5: Warehouse Performance
Chapter 17, Schema Modeling Techniques
This chapter describes the schemas useful in data warehousing environments.
Chapter 18, SQL for Aggregation in Data Warehouses
This chapter explains how to use SQL aggregation in data warehouses.
Chapter 19, SQL for Analysis in Data Warehouses
This chapter explains how to use analytic functions in data warehouses.
Chapter 20, OLAP and Data Mining
This chapter describes using analytic services in combination with Oracle9i.
Chapter 21, Using Parallel Execution
This chapter describes how to tune data warehouses using parallel execution.
Chapter 22, Query Rewrite
This chapter describes how to use query rewrite.
Glossary
Related Documentation
For more information, see these Oracle resources:
• Oracle9i Database Performance Tuning Guide and Reference
Many of the examples in this book use the sample schemas of the seed database,
which is installed by default when you install Oracle. Refer to Oracle9i Sample
Schemas for information on how these schemas were created and how you can use
them yourself.
In North America, printed documentation is available for sale in the Oracle Store at
http://oraclestore.oracle.com/
Customers in Europe, the Middle East, and Africa (EMEA) can purchase
documentation from
http://www.oraclebookshop.com/
Other customers can contact their Oracle representative to purchase printed
documentation.
To download free release notes, installation documentation, white papers, or other
collateral, please visit the Oracle Technology Network (OTN). You must register
online before using OTN; registration is free and can be done at
http://otn.oracle.com/admin/account/membership.html
If you already have a username and password for OTN, then you can go directly to
the documentation section of the OTN Web site at
http://otn.oracle.com/docs/index.htm
To access the database documentation search engine directly, please visit
http://tahiti.oracle.com
For additional information, see:
• The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
• Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)
Conventions
This section describes the conventions used in the text and code examples of this
documentation set. It describes:
• Conventions in Text
• Conventions in Code Examples
• Conventions for Windows Operating Systems
Conventions in Text
We use various conventions in text to help you more quickly identify special terms.
The following table describes those conventions and provides examples of their use.
Convention: Bold
Meaning: Bold typeface indicates terms that are defined in the text or terms that
appear in a glossary, or both.
Example:
  When you specify this clause, you create an index-organized table.

Convention: Italics
Meaning: Italic typeface indicates book titles or emphasis.
Examples:
  Oracle9i Database Concepts
  Ensure that the recovery catalog and target database do not reside on the same disk.

Convention: UPPERCASE monospace (fixed-width) font
Meaning: Uppercase monospace typeface indicates elements supplied by the system.
Such elements include parameters, privileges, datatypes, RMAN keywords, SQL
keywords, SQL*Plus or utility commands, packages and methods, as well as
system-supplied column names, database objects and structures, usernames, and
roles.
Examples:
  You can specify this clause only for a NUMBER column.
  You can back up the database by using the BACKUP command.
  Query the TABLE_NAME column in the USER_TABLES data dictionary view.
  Use the DBMS_STATS.GENERATE_STATS procedure.

Convention: lowercase monospace (fixed-width) font
Meaning: Lowercase monospace typeface indicates executables, filenames, directory
names, and sample user-supplied elements. Such elements include computer and
database names, net service names, and connect identifiers, as well as
user-supplied database objects and structures, column names, packages and
classes, usernames and roles, program units, and parameter values.
Note: Some programmatic elements use a mixture of UPPERCASE and lowercase.
Enter these elements as shown.
Examples:
  Enter sqlplus to open SQL*Plus.
  The password is specified in the orapwd file.
  Back up the datafiles and control files in the /disk1/oracle/dbs directory.
  The department_id, department_name, and location_id columns are in the
  hr.departments table.
  Set the QUERY_REWRITE_ENABLED initialization parameter to true.
  Connect as oe user.
  The JRepUtil class implements these methods.

Convention: lowercase italic monospace (fixed-width) font
Meaning: Lowercase italic monospace font represents placeholders or variables.
Examples:
  You can specify the parallel_clause.
  Run Uold_release.SQL where old_release refers to the release you installed
  prior to upgrading.
Conventions in Code Examples
Code examples illustrate SQL, PL/SQL, SQL*Plus, or other command-line
statements. They are displayed in a monospace (fixed-width) font and separated
from normal text as shown in this example:
SELECT username FROM dba_users WHERE username = 'MIGRATE';
The following table describes typographic conventions used in code examples and
provides examples of their use.
Convention: [ ]
Meaning: Brackets enclose one or more optional items. Do not enter the brackets.
Example:
  DECIMAL (digits [ , precision ])

Convention: { }
Meaning: Braces enclose two or more items, one of which is required. Do not enter
the braces.
Example:
  {ENABLE | DISABLE}

Convention: |
Meaning: A vertical bar represents a choice of two or more options within brackets
or braces. Enter one of the options. Do not enter the vertical bar.
Examples:
  {ENABLE | DISABLE}
  [COMPRESS | NOCOMPRESS]

Convention: ...
Meaning: Horizontal ellipsis points indicate either:
• That we have omitted parts of the code that are not directly related to the
  example
• That you can repeat a portion of the code
Examples:
  CREATE TABLE ... AS subquery;
  SELECT col1, col2, ... , coln FROM employees;

Convention: Vertical ellipsis points
  .
  .
  .
Meaning: Vertical ellipsis points indicate that we have omitted several lines of
code not directly related to the example.
Example:
  SQL> SELECT NAME FROM V$DATAFILE;
  NAME
  ------------------------------------
  /fs1/dbs/tbs_01.dbf
  /fs1/dbs/tbs_02.dbf
  .
  .
  .
  /fs1/dbs/tbs_09.dbf
  9 rows selected.

Convention: Other notation
Meaning: You must enter symbols other than brackets, braces, vertical bars, and
ellipsis points as shown.
Examples:
  acctbal NUMBER(11,2);
  acct CONSTANT NUMBER(4) := 3;
Convention: Italics
Meaning: Italicized text indicates placeholders or variables for which you must
supply particular values.
Examples:
  CONNECT SYSTEM/system_password
  DB_NAME = database_name

Convention: UPPERCASE
Meaning: Uppercase typeface indicates elements supplied by the system. We show
these terms in uppercase in order to distinguish them from terms you define.
Unless terms appear in brackets, enter them in the order and with the spelling
shown. However, because these terms are not case sensitive, you can enter them
in lowercase.
Examples:
  SELECT last_name, employee_id FROM employees;
  SELECT * FROM USER_TABLES;
  DROP TABLE hr.employees;

Convention: lowercase
Meaning: Lowercase typeface indicates programmatic elements that you supply. For
example, lowercase indicates names of tables, columns, or files.
Note: Some programmatic elements use a mixture of UPPERCASE and lowercase.
Enter these elements as shown.
Examples:
  SELECT last_name, employee_id FROM employees;
  sqlplus hr/hr
  CREATE USER mjones IDENTIFIED BY ty3MU9;

Conventions for Windows Operating Systems
The following table describes conventions for Windows operating systems and
provides examples of their use.
Convention: Choose Start >
Meaning: How to start a program.
Example:
  To start the Database Configuration Assistant, choose Start > Programs >
  Oracle - HOME_NAME > Configuration and Migration Tools > Database
  Configuration Assistant.

Convention: File and directory names
Meaning: File and directory names are not case sensitive. The following special
characters are not allowed: left angle bracket (<), right angle bracket (>),
colon (:), double quotation marks ("), slash (/), pipe (|), and dash (-). The
special character backslash (\) is treated as an element separator, even when it
appears in quotes. If the file name begins with \\, then Windows assumes it uses
the Universal Naming Convention.
Example:
  c:\winnt"\"system32 is the same as C:\WINNT\SYSTEM32

Convention: C:\>
Meaning: Represents the Windows command prompt of the current hard disk drive.
The escape character in a command prompt is the caret (^). Your prompt reflects
the subdirectory in which you are working. Referred to as the command prompt in
this manual.
Example:
  C:\oracle\oradata>

Convention: Special characters
Meaning: The backslash (\) special character is sometimes required as an escape
character for the double quotation mark (") special character at the Windows
command prompt. Parentheses and the single quotation mark (') do not require an
escape character. Refer to your Windows operating system documentation for more
information on escape and special characters.
Examples:
  C:\>exp scott/tiger TABLES=emp QUERY=\"WHERE job='SALESMAN' and sal<1600\"
  C:\>imp SYSTEM/password FROMUSER=scott TABLES=(emp, dept)

Convention: HOME_NAME
Meaning: Represents the Oracle home name. The home name can be up to 16
alphanumeric characters. The only special character allowed in the home name is
the underscore.
Example:
  C:\> net start OracleHOME_NAMETNSListener

Convention: ORACLE_HOME and ORACLE_BASE
Meaning: In releases prior to Oracle8i release 8.1.3, when you installed Oracle
components, all subdirectories were located under a top level ORACLE_HOME
directory that by default used one of the following names:
• C:\orant for Windows NT
• C:\orawin98 for Windows 98
This release complies with Optimal Flexible Architecture (OFA) guidelines. All
subdirectories are not under a top level ORACLE_HOME directory. There is a top
level directory called ORACLE_BASE that by default is C:\oracle. If you install
the latest Oracle release on a computer with no other Oracle software installed,
then the default setting for the first Oracle home directory is C:\oracle\orann,
where nn is the latest release number. The Oracle home directory is located
directly under ORACLE_BASE.
All directory path examples in this guide follow OFA conventions.
Refer to Oracle9i Database Getting Started for Windows for additional
information about OFA compliances and for information about installing Oracle
products in non-OFA compliant directories.
Example:
  Go to the ORACLE_BASE\ORACLE_HOME\rdbms\admin directory.
Documentation Accessibility
Our goal is to make Oracle products, services, and supporting documentation
accessible, with good usability, to the disabled community. To that end, our
documentation includes features that make information available to users of
assistive technology. This documentation is available in HTML format, and contains
markup to facilitate access by the disabled community. Standards will continue to
evolve over time, and Oracle Corporation is actively engaged with other
market-leading technology vendors to address technical obstacles so that our
documentation can be accessible to all of our customers. For additional information,
visit the Oracle Accessibility Program Web site at
http://www.oracle.com/accessibility/
Accessibility of Code Examples in Documentation
JAWS, a Windows screen
reader, may not always correctly read the code examples in this document. The
conventions for writing code require that closing braces should appear on an
otherwise empty line; however, JAWS may not always read a line of text that
consists solely of a bracket or brace.
Accessibility of Links to External Web Sites in Documentation
This documentation may contain links to Web sites of other companies or organizations
that Oracle Corporation does not own or control. Oracle Corporation neither
evaluates nor makes any representations regarding the accessibility of these Web
sites.
What’s New in Data Warehousing?
This section describes new features of Oracle9i release 2 (9.2) and provides pointers
to additional information. New features information from previous releases is also
retained to help those users migrating to the current release.
The following sections describe the new features in Oracle Data Warehousing:
• Oracle9i Release 2 (9.2) New Features in Data Warehousing
• Oracle9i Release 1 (9.0.1) New Features in Data Warehousing
Oracle9i Release 2 (9.2) New Features in Data Warehousing
• Data Segment Compression
You can compress data segments in heap-organized tables. Partitioned tables are
a typical example of heap-organized tables that you should consider for data
segment compression. Data segment compression is also useful for highly
redundant data, such as tables with many foreign keys and materialized views
created with the ROLLUP clause. You should avoid compression on tables with many
updates or other DML.
See Also: Chapter 8, "Materialized Views"
• Materialized View Enhancements
You can now nest materialized views when the materialized view contains joins
and aggregates. Fast refresh is now possible on a materialized view containing
the UNION ALL operator. Various restrictions were removed, and the situations
where materialized views can be used effectively were expanded. In particular,
using materialized views in an OLAP environment has been improved.
See Also: "Overview of Data Warehousing with Materialized
Views" on page 8-2 and "Materialized Views in OLAP
Environments" on page 8-41, and Chapter 14, "Maintaining the
Data Warehouse"
• Parallel DML on Non-Partitioned Tables
You can now use parallel DML on non-partitioned tables.
See Also: Chapter 21, "Using Parallel Execution"
• Partitioning Enhancements
You can now simplify SQL syntax by using a DEFAULT partition or a
subpartition template. You can implement SPLIT operations more easily.
See Also: "Partitioning Methods" on page 5-5, Chapter 5,
"Parallelism and Partitioning in Data Warehouses", and Oracle9i
Database Administrator’s Guide
• Query Rewrite Enhancements
Text match processing and join equivalence recognition have been improved.
Materialized views containing the UNION ALL operator can now use query
rewrite.
See Also: Chapter 22, "Query Rewrite"
• Range-List Partitioning
You can now subpartition range-partitioned tables by list.
See Also: "Types of Partitioning" on page 5-4
• Summary Advisor Enhancements
The Summary Advisor tool and its related DBMS_OLAP package were improved
so you can restrict workloads to a specific schema.
See Also: Chapter 16, "Summary Advisor"
Oracle9i Release 1 (9.0.1) New Features in Data Warehousing
• Analytic Functions
Oracle’s analytic capabilities have been improved through the addition of
inverse percentile, hypothetical distribution, and first/last analytic functions.
See Also: Chapter 19, "SQL for Analysis in Data Warehouses"
• Bitmap Join Index
A bitmap join index spans multiple tables and improves the performance of
joins of those tables.
See Also: "Bitmap Indexes" on page 6-2
• ETL Enhancements
Oracle’s extraction, transformation, and loading capabilities have been
improved with a MERGE statement, multi-table inserts, and table functions.
See Also: Chapter 10, "Overview of Extraction, Transformation,
and Loading"
• Full Outer Joins
Oracle added full support for full outer joins so that you can more easily
express certain complex queries.
See Also: Oracle9i Database Performance Tuning Guide and Reference
• Grouping Sets
You can now selectively specify the set of groups that you want to create using
a GROUPING SETS expression within a GROUP BY clause. This allows precise
specification across multiple dimensions without computing the whole CUBE.
See Also: Chapter 18, "SQL for Aggregation in Data Warehouses"
• List Partitioning
List partitioning offers you precise control over which data belongs in a
particular partition.
See Also: "Partitioning Design Considerations" on page 5-4 and
Oracle9i Database Concepts, and Oracle9i Database Administrator’s
Guide
• Materialized View Enhancements
Various restrictions were removed, and the situations where materialized views
can be used effectively were expanded.
See Also: "Overview of Data Warehousing with Materialized
Views" on page 8-2
• Query Rewrite Enhancements
The query rewrite feature, which improves performance by allowing many SQL
statements to use materialized views, was improved significantly. Text match
processing and join equivalence recognition have been improved.
See Also: Chapter 22, "Query Rewrite"
• Summary Advisor Enhancements
The Summary Advisor tool and its related DBMS_OLAP package were improved
so you can specify workloads. In addition, a broader class of schemas is now
supported.
See Also: Chapter 16, "Summary Advisor"
• WITH Clause
The WITH clause enables you to reuse a query block in a SELECT statement
when it occurs more than once within a complex query.
See Also: "Computation Using the WITH Clause" on page 18-30
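As an illustration of the WITH clause, the following sketch names a query block once and then uses it twice in the same SELECT statement. It assumes the sales and channels tables of the sample schemas; the channel_id and channel_desc column names and the channel_totals block name are assumptions made for this example.

```sql
-- A minimal sketch: channel_totals is defined once and referenced twice.
-- Assumes sales(channel_id, amount_sold) and channels(channel_id, channel_desc).
WITH channel_totals AS (
  SELECT c.channel_desc,
         SUM(s.amount_sold) AS total_sold
  FROM   sales s, channels c
  WHERE  s.channel_id = c.channel_id
  GROUP  BY c.channel_desc
)
SELECT channel_desc, total_sold
FROM   channel_totals                                  -- first use
WHERE  total_sold > (SELECT AVG(total_sold)
                     FROM   channel_totals)            -- second use
ORDER  BY total_sold DESC;
```

The query lists the channels whose total sales exceed the average across all channels, without repeating the aggregation in the text of the statement.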
Part I
Concepts
This section introduces basic data warehousing concepts.
It contains the following chapter:
• Data Warehousing Concepts
1
Data Warehousing Concepts
This chapter provides an overview of the Oracle data warehousing implementation.
It includes:
• What is a Data Warehouse?
• Data Warehouse Architectures
Note that this book is meant as a supplement to standard texts about data
warehousing. This book focuses on Oracle-specific material and does not reproduce
in detail material of a general nature. Two standard texts are:
• The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
• Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)
What is a Data Warehouse?
A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived
from transaction data, but it can include data from other sources. It separates
analysis workload from transaction workload and enables an organization to
consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an
extraction, transportation, transformation, and loading (ETL) solution, an online
analytical processing (OLAP) engine, client analysis tools, and other applications
that manage the process of gathering data and delivering it to business users.
A common way of introducing data warehousing is to refer to the characteristics of
a data warehouse as set forth by William Inmon:
• Subject Oriented
• Integrated
• Nonvolatile
• Time Variant
Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more
about your company’s sales data, you can build a warehouse that concentrates on
sales. Using this warehouse, you can answer questions like "Who was our best
customer for this item last year?" This ability to define a data warehouse by subject
matter, sales in this case, makes the data warehouse subject oriented.
Integrated
Integration is closely related to subject orientation. Data warehouses must put data
from disparate sources into a consistent format. They must resolve such problems
as naming conflicts and inconsistencies among units of measure. When they achieve
this, they are said to be integrated.
See Also: Chapter 10, "Overview of Extraction, Transformation,
and Loading"
Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change.
This is logical because the purpose of a warehouse is to enable you to analyze what
has occurred.
Time Variant
In order to discover trends in business, analysts need large amounts of data. This is
very much in contrast to online transaction processing (OLTP) systems, where
performance requirements demand that historical data be moved to an archive. A
data warehouse’s focus on change over time is what is meant by the term time
variant.
Contrasting OLTP and Data Warehousing Environments
Figure 1–1 illustrates key differences between an OLTP system and a data
warehouse.
Figure 1–1 Contrasting OLTP and Data Warehousing Environments
One major difference between the types of system is that data warehouses are not
usually in third normal form (3NF), a type of data normalization common in OLTP
environments.
[Figure 1–1 compares the two environments. OLTP: complex data structures (3NF
databases), few indexes, many joins, normalized DBMS (duplicated data), derived
data and aggregates rare. Data warehouse: multidimensional data structures, many
indexes, some joins, denormalized DBMS (duplicated data), derived data and
aggregates common.]
Data warehouses and OLTP systems have very different requirements. Here are
some examples of differences between typical data warehouses and OLTP systems:
• Workload
Data warehouses are designed to accommodate ad hoc queries. You might not
know the workload of your data warehouse in advance, so a data warehouse
should be optimized to perform well for a wide variety of possible query
operations.
OLTP systems support only predefined operations. Your applications might be
specifically tuned or designed to support only these operations.
• Data modifications
A data warehouse is updated on a regular basis by the ETL process (run nightly
or weekly) using bulk data modification techniques. The end users of a data
warehouse do not directly update the data warehouse.
In OLTP systems, end users routinely issue individual data modification
statements to the database. The OLTP database is always up to date, and reflects
the current state of each business transaction.
• Schema design
Data warehouses often use denormalized or partially denormalized schemas
(such as a star schema) to optimize query performance.
OLTP systems often use fully normalized schemas to optimize
update/insert/delete performance, and to guarantee data consistency.
• Typical operations
A typical data warehouse query scans thousands or millions of rows. For
example, "Find the total sales for all customers last month."
A typical OLTP operation accesses only a handful of records. For example,
"Retrieve the current order for this customer."
• Historical data
Data warehouses usually store many months or years of data. This is to support
historical analysis.
OLTP systems usually store data from only a few weeks or months. The OLTP
system stores only historical data as needed to successfully meet the
requirements of the current transaction.
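The difference in typical operations can be sketched with two statements. The first assumes the sales and times tables of the sample schemas, with a calendar_month_desc column holding values such as '2001-08'. The second uses a hypothetical customer_orders table and a bind variable; neither comes from this guide.

```sql
-- Warehouse-style query: scans and aggregates many fact rows.
-- "Find the total sales for all customers last month."
SELECT SUM(s.amount_sold) AS total_sales
FROM   sales s, times t
WHERE  s.time_id = t.time_id
AND    t.calendar_month_desc = '2001-08';   -- assumed month format

-- OLTP-style query: touches a handful of rows located by key.
-- "Retrieve the current order for this customer."
-- customer_orders and its columns are hypothetical.
SELECT order_id, order_date, order_status
FROM   customer_orders
WHERE  customer_id  = :cust_id
AND    order_status = 'OPEN';
```

The first statement benefits from the summaries, partitioning, and parallel execution described later in this guide, while the second depends mainly on a selective index on the key column.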
Data Warehouse Architectures
Data warehouses and their architectures vary depending upon the specifics of an
organization's situation. Three common architectures are:
• Data Warehouse Architecture (Basic)
• Data Warehouse Architecture (with a Staging Area)
• Data Warehouse Architecture (with a Staging Area and Data Marts)
Data Warehouse Architecture (Basic)
Figure 1–2 shows a simple architecture for a data warehouse. End users directly
access data derived from several source systems through the data warehouse.
Figure 1–2 Architecture of a Data Warehouse
In Figure 1–2, the metadata and raw data of a traditional OLTP system are present,
as is an additional type of data, summary data. Summaries are very valuable in data
warehouses because they pre-compute long operations in advance. For example, a
typical data warehouse query is to retrieve something like August sales. A
summary in Oracle is called a materialized view.
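A summary of this kind might be defined as in the following sketch, which pre-computes monthly sales totals. It assumes the sales and times tables of the sample schemas; the materialized view name and the calendar_month_desc column are assumptions, and the refresh options shown are only one possible choice.

```sql
-- A minimal sketch of a summary stored as a materialized view.
-- Assumes sales(time_id, amount_sold) and times(time_id, calendar_month_desc).
CREATE MATERIALIZED VIEW sales_by_month_mv    -- hypothetical name
  BUILD IMMEDIATE                              -- populate when created
  REFRESH COMPLETE ON DEMAND                   -- rebuild only when requested
  ENABLE QUERY REWRITE                         -- allow the optimizer to use it
AS
SELECT t.calendar_month_desc,
       SUM(s.amount_sold) AS total_sold
FROM   sales s, times t
WHERE  s.time_id = t.time_id
GROUP  BY t.calendar_month_desc;
```

With such a summary in place, a query for August sales can read one pre-computed row instead of scanning the detail rows, provided that query rewrite is enabled and the required privileges are granted.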
[Figure 1–2: data sources (operational systems and flat files) feed the warehouse
(metadata, raw data, and summary data), which serves users (analysis, reporting,
and mining).]
Data Warehouse Architecture (with a Staging Area)
In Figure 1–2, you need to clean and process your operational data before putting it
into the warehouse. You can do this programmatically, although most data
warehouses use a staging area instead. A staging area simplifies building
summaries and general warehouse management. Figure 1–3 illustrates this typical
architecture.
Figure 1–3 Architecture of a Data Warehouse with a Staging Area
[Figure 1–3: data sources (operational systems and flat files) feed a staging area,
which feeds the warehouse (metadata, raw data, and summary data), which serves
users (analysis, reporting, and mining).]
Data Warehouse Architecture (with a Staging Area and Data Marts)
Although the architecture in Figure 1–3 is quite common, you may want to
customize your warehouse’s architecture for different groups within your
organization. You can do this by adding data marts, which are systems designed for
a particular line of business. Figure 1–4 illustrates an example where purchasing,
sales, and inventories are separated. In this example, a financial analyst might want
to analyze historical data for purchases and sales.
Figure 1–4 Architecture of a Data Warehouse with a Staging Area and Data Marts
Note: Data marts are an important part of many warehouses, but
they are not the focus of this book.
See Also: Data Mart Suites documentation for further information
regarding data marts
[Figure 1–4: data sources (operational systems and flat files) feed a staging area,
which feeds the warehouse (metadata, raw data, and summary data). The warehouse
feeds data marts (purchasing, sales, and inventory), which serve users (analysis,
reporting, and mining).]
Part II
Logical Design
This section deals with the issues in logical design in a data warehouse.
It contains the following chapter:
• Logical Design in Data Warehouses
2
Logical Design in Data Warehouses
This chapter tells you how to design a data warehousing environment and includes
the following topics:
• Logical Versus Physical Design in Data Warehouses
• Creating a Logical Design
• Data Warehousing Schemas
• Data Warehousing Objects
Logical Versus Physical Design in Data Warehouses
Your organization has decided to build a data warehouse. You have defined the
business requirements and agreed upon the scope of your application, and created a
conceptual design. Now you need to translate your requirements into a system
deliverable. To do so, you create the logical and physical design for the data
warehouse. You then define:
• The specific data content
• Relationships within and between groups of data
• The system environment supporting your data warehouse
• The data transformations required
• The frequency with which data is refreshed
The logical design is more conceptual and abstract than the physical design. In the
logical design, you look at the logical relationships among the objects. In the
physical design, you look at the most effective way of storing and retrieving the
objects as well as handling them from a transportation and backup/recovery
perspective.
Orient your design toward the needs of the end users. End users typically want to
perform analysis and look at aggregated data, rather than at individual
transactions. However, end users might not know what they need until they see it.
In addition, a well-planned design allows for growth and changes as the needs of
users change and evolve.
By beginning with the logical design, you focus on the information requirements
and save the implementation details for later.
Creating a Logical Design
A logical design is conceptual and abstract. You do not deal with the physical
implementation details yet. You deal only with defining the types of information
that you need.
One technique you can use to model your organization's logical information
requirements is entity-relationship modeling. Entity-relationship modeling involves
identifying the things of importance (entities), the properties of these things
(attributes), and how they are related to one another (relationships).
The process of logical design involves arranging data into a series of logical
relationships called entities and attributes. An entity represents a chunk of
information. In relational databases, an entity often maps to a table. An attribute is
a component of an entity that helps define the uniqueness of the entity. In relational
databases, an attribute maps to a column.
To be sure that your data is consistent, you need to use unique identifiers. A unique
identifier is something you add to tables so that you can differentiate between the
same item when it appears in different places. In a physical design, this is usually a
primary key.
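The mapping from logical to physical terms can be sketched as follows. The customer_demo table and its columns are hypothetical and serve only to show an entity becoming a table, attributes becoming columns, and the unique identifier becoming a primary key.

```sql
-- Hypothetical entity "customer" expressed as a table (names are illustrative).
CREATE TABLE customer_demo (
  customer_id    NUMBER        NOT NULL,   -- unique identifier
  customer_name  VARCHAR2(60)  NOT NULL,   -- attribute
  city           VARCHAR2(30),             -- attribute
  CONSTRAINT customer_demo_pk PRIMARY KEY (customer_id)
);
```

Because customer_id is the primary key, the same customer can be recognized wherever it appears, for example as a foreign key value in a fact table.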
While entity-relationship diagramming has traditionally been associated with
highly normalized models such as OLTP applications, the technique is still useful
for data warehouse design in the form of dimensional modeling. In dimensional
modeling, instead of seeking to discover atomic units of information (such as
entities and attributes) and all of the relationships between them, you identify
which information belongs to a central fact table and which information belongs to
its associated dimension tables. You identify business subjects or fields of data,
define relationships between business subjects, and name the attributes for each
subject.
See Also: Chapter 9, "Dimensions" for further information
regarding dimensions
Your logical design should result in (1) a set of entities and attributes corresponding
to fact tables and dimension tables and (2) a model that maps operational data from
your sources into subject-oriented information in your target data warehouse schema.
You can create the logical design using a pen and paper, or you can use a design
tool such as Oracle Warehouse Builder (specifically designed to support modeling
the ETL process) or Oracle Designer (a general purpose modeling tool).
See Also: Oracle Designer and Oracle Warehouse Builder
documentation sets
Data Warehousing Schemas
A schema is a collection of database objects, including tables, views, indexes, and
synonyms. You can arrange schema objects in the schema models designed for data
warehousing in a variety of ways. Most data warehouses use a dimensional model.
The model of your source data and the requirements of your users help you design
the data warehouse schema. You can sometimes get the source model from your
company's enterprise data model and reverse-engineer the logical data model for
the data warehouse from this. The physical implementation of the logical data
warehouse model may require some changes to adapt it to your system
parameters: size of machine, number of users, storage capacity, type of network,
and software.
Star Schemas
The star schema is the simplest data warehouse schema. It is called a star schema
because the diagram resembles a star, with points radiating from a center. The
center of the star consists of one or more fact tables and the points of the star are the
dimension tables, as shown in Figure 2–1.
Figure 2–1 Star Schema
The most natural way to model a data warehouse is as a star schema, in which only
one join establishes the relationship between the fact table and any one of the
dimension tables.
A star schema optimizes performance by keeping queries simple and providing fast
response time. All the information about each level is stored in one row.
Note: Oracle Corporation recommends that you choose a star
schema unless you have a clear reason not to.
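A query against a star schema typically joins the fact table to each dimension it needs with a single join condition. The following sketch assumes the sales, products, and times tables shown in Figure 2–1; the key columns and the prod_category and calendar_year columns are assumptions for this example.

```sql
-- A minimal star query sketch: one join per dimension, then aggregation.
SELECT p.prod_category,
       t.calendar_year,
       SUM(s.amount_sold)   AS total_amount,
       SUM(s.quantity_sold) AS total_quantity
FROM   sales s, products p, times t
WHERE  s.prod_id = p.prod_id      -- one join to the products dimension
AND    s.time_id = t.time_id      -- one join to the times dimension
GROUP  BY p.prod_category, t.calendar_year
ORDER  BY p.prod_category, t.calendar_year;
```

Adding another dimension to the analysis adds one table and one join condition, which is what keeps star queries simple.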
[Figure 2–1: the sales fact table (amount_sold, quantity_sold) at the center,
surrounded by the customers, products, channels, and times dimension tables.]
Other Schemas
Some schemas in data warehousing environments use third normal form rather
than star schemas. Another schema that is sometimes useful is the snowflake
schema, which is a star schema with normalized dimensions in a tree structure.
Data Warehousing Objects
Fact tables and dimension tables are the two types of objects commonly used in
dimensional data warehouse schemas.
Fact tables are the large tables in your warehouse schema that store business
measurements. Fact tables typically contain facts and foreign keys to the dimension
tables. Fact tables represent data, usually numeric and additive, that can be
analyzed and examined. Examples include sales, cost, and profit.
Dimension tables, also known as lookup or reference tables, contain the relatively
static data in the warehouse. Dimension tables store the information you normally
use to constrain queries. Dimension tables are usually textual and descriptive and
you can use them as the row headers of the result set. Examples are customers or
products.
Fact Tables
A fact table typically has two types of columns: those that contain numeric facts
(often called measurements), and those that are foreign keys to dimension tables. A
fact table contains either detail-level facts or facts that have been aggregated. Fact
tables that contain aggregated facts are often called summary tables. A fact table
usually contains facts with the same level of aggregation. Though most facts are
additive, they can also be semi-additive or non-additive. Additive facts can be
aggregated by simple arithmetical addition. A common example of this is sales.
Non-additive facts cannot be added at all. An example of this is averages.
Semi-additive facts can be aggregated along some of the dimensions and not along
others. An example of this is inventory levels, which can be summed across
products or locations but not across time periods.
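The distinction between additive and semi-additive facts can be sketched in SQL. The sales table is assumed to be the fact table of the sample schemas; the inventory_fact_demo table is hypothetical and is created here only so that the example is self-contained.

```sql
-- Hypothetical snapshot fact table: one row per product, warehouse, and day.
CREATE TABLE inventory_fact_demo (
  snapshot_date     DATE    NOT NULL,
  prod_id           NUMBER  NOT NULL,
  warehouse_id      NUMBER  NOT NULL,
  quantity_on_hand  NUMBER  NOT NULL
);

-- Additive fact: sales amounts can be summed along every dimension.
SELECT SUM(amount_sold) AS total_sold
FROM   sales;

-- Semi-additive fact: summing across products and warehouses for one day
-- gives a meaningful total.
SELECT snapshot_date,
       SUM(quantity_on_hand) AS units_on_hand
FROM   inventory_fact_demo
GROUP  BY snapshot_date;

-- Across time, average the daily totals instead of summing them.
SELECT AVG(units_on_hand) AS avg_daily_units_on_hand
FROM  (SELECT snapshot_date,
              SUM(quantity_on_hand) AS units_on_hand
       FROM   inventory_fact_demo
       GROUP  BY snapshot_date);
```

Summing quantity_on_hand over several days would count the same stock once for each day, which is why the time dimension is treated differently.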
See Also: Chapter 17, "Schema Modeling Techniques" for further
information regarding star and snowflake schemas in data
warehouses and Oracle9i Database Concepts for further conceptual
material
Creating a New Fact Table
You must define a fact table for each star schema. From a modeling standpoint, the
primary key of the fact table is usually a composite key that is made up of all of its
foreign keys.
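A composite key of this kind can be sketched as follows. This is an illustration under stated assumptions rather than a recommended design: the table and column names are hypothetical, and SQLite (through Python's standard sqlite3 module) stands in for an Oracle database so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE times     (time_id TEXT PRIMARY KEY);

-- Fact table: one numeric fact plus a foreign key to each dimension table.
-- The primary key is the combination of all the foreign keys.
CREATE TABLE sales (
    prod_id     INTEGER NOT NULL REFERENCES products(prod_id),
    cust_id     INTEGER NOT NULL REFERENCES customers(cust_id),
    time_id     TEXT    NOT NULL REFERENCES times(time_id),
    amount_sold REAL    NOT NULL,
    PRIMARY KEY (prod_id, cust_id, time_id)
);
""")

conn.execute("INSERT INTO products VALUES (1, 'widget')")
conn.execute("INSERT INTO customers VALUES (100, 'Acme')")
conn.execute("INSERT INTO times VALUES ('2002-03-01')")
conn.execute("INSERT INTO sales VALUES (1, 100, '2002-03-01', 250.0)")


def try_insert(row):
    """Attempt to insert a fact row and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO sales VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Same product, customer, and time as the existing row: violates the composite key.
duplicate_key = try_insert((1, 100, "2002-03-01", 99.0))
# Product 2 does not exist in the dimension table: violates the foreign key.
orphan_row = try_insert((2, 100, "2002-03-01", 10.0))
print(duplicate_key, orphan_row)
```

With this key, the fact table holds at most one row for each combination of dimension values, which fixes the level of detail at which facts are stored.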
Dimension Tables
A dimension is a structure, often composed of one or more hierarchies, that
categorizes data. Dimensional attributes help to describe the dimensional value.
They are normally descriptive, textual values. Several distinct dimensions,
combined with facts, enable you to answer business questions. Commonly used
dimensions are customers, products, and time.
Dimension data is typically collected at the lowest level of detail and then
aggregated into higher level totals that are more useful for analysis. These natural
rollups or aggregations within a dimension table are called hierarchies.
Hierarchies
Hierarchies are logical structures that use ordered levels as a means of organizing
data. A hierarchy can be used to define data aggregation. For example, in a time
dimension, a hierarchy might aggregate data from the month level to the quarter
level to the year level. A hierarchy can also be used to define a navigational drill
path and to establish a family structure.
Within a hierarchy, each level is logically connected to the levels above and below it.
Data values at lower levels aggregate into the data values at higher levels. A
dimension can be composed of more than one hierarchy. For example, in the
product dimension, there might be two hierarchies—one for product categories
and one for product suppliers.
Dimension hierarchies also group levels from general to granular. Query tools use
hierarchies to enable you to drill down into your data to view different levels of
granularity. This is one of the key benefits of a data warehouse.
When designing hierarchies, you must consider the relationships in business
structures, for example, a divisional multilevel sales organization.
Hierarchies impose a family structure on dimension values. For a particular level
value, a value at the next higher level is its parent, and values at the next lower level
are its children. These familial relationships enable analysts to access data quickly.
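The family structure and the rollup behavior described above can be sketched with a simple parent mapping. The values, the `parent` dictionary, and the helper functions `children` and `roll_up` are hypothetical names used only for illustration.

```python
# Hypothetical time hierarchy: each value maps to its parent at the next higher level.
parent = {
    "2002-01": "2002-Q1",
    "2002-02": "2002-Q1",
    "2002-03": "2002-Q1",
    "2002-04": "2002-Q2",
    "2002-Q1": "2002",
    "2002-Q2": "2002",
}


def children(value):
    """Return the values at the next lower level whose parent is `value`."""
    return sorted(child for child, p in parent.items() if p == value)


def roll_up(facts):
    """Aggregate a {value: amount} mapping one level up the hierarchy."""
    totals = {}
    for value, amount in facts.items():
        totals[parent[value]] = totals.get(parent[value], 0) + amount
    return totals


monthly_sales = {"2002-01": 100, "2002-02": 150, "2002-03": 50, "2002-04": 200}
quarterly_sales = roll_up(monthly_sales)
yearly_sales = roll_up(quarterly_sales)

print(children("2002-Q1"))  # drill down from a quarter to its months
print(quarterly_sales, yearly_sales)
```

Drilling down corresponds to following `children`, and aggregating corresponds to following `parent`. The rollup is only meaningful because every value has exactly one parent at the next higher level.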
Levels
A level represents a position in a hierarchy. For example, a time dimension
might have a hierarchy that represents data at the month, quarter, and year
levels. Levels range from general to specific, with the root level as the highest or
most general level. The levels in a dimension are organized into one or more
hierarchies.
Level Relationships
Level relationships specify top-to-bottom ordering of levels from
most general (the root) to most specific information. They define the parent-child
relationship between the levels in a hierarchy.
Hierarchies are also essential components in enabling more complex query rewrites.
For example, the database can roll up existing quarterly sales revenue to a yearly
aggregation when the dimensional dependencies between quarter and year are
known.
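The reasoning behind such a rewrite can be checked by hand. The sketch below is an illustration only and does not show how the database performs the rewrite: it uses SQLite through Python's standard sqlite3 module, and the tables `times`, `sales`, and `sales_by_quarter` are hypothetical. It computes yearly totals once from the detail rows and once from a precomputed quarterly summary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Time dimension: every month belongs to one quarter, every quarter to one year.
CREATE TABLE times (month TEXT PRIMARY KEY, quarter TEXT NOT NULL, year TEXT NOT NULL);
CREATE TABLE sales (month TEXT NOT NULL REFERENCES times(month), amount REAL NOT NULL);

INSERT INTO times VALUES
    ('2001-12', '2001-Q4', '2001'),
    ('2002-01', '2002-Q1', '2002'),
    ('2002-02', '2002-Q1', '2002'),
    ('2002-04', '2002-Q2', '2002');
INSERT INTO sales VALUES
    ('2001-12', 40), ('2002-01', 100), ('2002-02', 150),
    ('2002-04', 200), ('2002-04', 10);

-- Precomputed quarterly summary table.
CREATE TABLE sales_by_quarter AS
    SELECT t.quarter AS quarter, SUM(s.amount) AS amount
    FROM sales s JOIN times t ON t.month = s.month
    GROUP BY t.quarter;
""")

# Yearly totals computed from the detail rows.
from_detail = conn.execute("""
    SELECT t.year, SUM(s.amount)
    FROM sales s JOIN times t ON t.month = s.month
    GROUP BY t.year ORDER BY t.year
""").fetchall()

# The same totals computed from the much smaller quarterly summary,
# using only the quarter-to-year dependency from the dimension.
from_summary = conn.execute("""
    SELECT q.year, SUM(sq.amount)
    FROM sales_by_quarter sq
    JOIN (SELECT DISTINCT quarter, year FROM times) q ON q.quarter = sq.quarter
    GROUP BY q.year ORDER BY q.year
""").fetchall()

print(from_detail)
print(from_summary)
```

Both queries return the same yearly totals. The second form is valid only because each quarter determines exactly one year. If a quarter could map to two years, the quarterly amounts would be counted twice.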
Typical Dimension Hierarchy
Figure 2–2 illustrates a dimension hierarchy based on customers.
Figure 2–2 Typical Levels in a Dimension Hierarchy
[Figure: hierarchy levels from most general to most specific: region, subregion, country_name, customer]
See Also: Chapter 9, "Dimensions" and Chapter 22, "Query
Rewrite" for further information regarding hierarchies
  • 1. Oracle9i Data Warehousing Guide Release 2 (9.2) March 2002 Part No. A96520-01
  • 2. Oracle9i Data Warehousing Guide, Release 2 (9.2)
    Part No. A96520-01
    Copyright © 1996, 2002 Oracle Corporation. All rights reserved.
    Primary Author: Paul Lane
    Contributing Authors: Viv Schupmann (Change Data Capture)
    Contributors: Patrick Amor, Hermann Baer, Subhransu Basu, Srikanth Bellamkonda, Randy Bello, Tolga Bozkaya, Benoit Dageville, John Haydu, Lilian Hobbs, Hakan Jakobsson, George Lumpkin, Cetin Ozbutun, Jack Raitto, Ray Roccaforte, Sankar Subramanian, Gregory Smith, Ashish Thusoo, Jean-Francois Verrier, Gary Vincent, Andy Witkowski, Zia Ziauddin
    Graphic Designer: Valarie Moore
    The Programs (which include both the software and documentation) contain proprietary information of Oracle Corporation; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent and other intellectual and industrial property laws. Reverse engineering, disassembly or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited.
    The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Oracle Corporation does not warrant that this document is error-free. Except as may be expressly permitted in your license agreement for these Programs, no part of these Programs may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Oracle Corporation.
    If the Programs are delivered to the U.S. Government or anyone licensing or using the programs on behalf of the U.S. Government, the following notice is applicable:
    Restricted Rights Notice Programs delivered subject to the DOD FAR Supplement are "commercial computer software" and use, duplication, and disclosure of the Programs, including documentation, shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement. Otherwise, Programs delivered subject to the Federal Acquisition Regulations are "restricted computer software" and use, duplication, and disclosure of the Programs shall be subject to the restrictions in FAR 52.227-19, Commercial Computer Software - Restricted Rights (June, 1987). Oracle Corporation, 500 Oracle Parkway, Redwood City, CA 94065.
    The Programs are not intended for use in any nuclear, aviation, mass transit, medical, or other inherently dangerous applications. It shall be the licensee's responsibility to take all appropriate fail-safe, backup, redundancy, and other measures to ensure the safe use of such applications if the Programs are used for such purposes, and Oracle Corporation disclaims liability for any damages caused by such use of the Programs.
    Oracle is a registered trademark, and Express, Oracle Expert, Oracle Store, Oracle7, Oracle8, Oracle8i, Oracle9i, Oracle Store, PL/SQL, Pro*C, and SQL*Plus are trademarks or registered trademarks of Oracle Corporation. Other names may be trademarks of their respective owners.
  • 3. Contents
    Send Us Your Comments
    Preface
    What’s New in Data Warehousing?
    Part I Concepts
      1 Data Warehousing Concepts
        What is a Data Warehouse?
        Subject Oriented
        Integrated
        Nonvolatile
        Time Variant
        Contrasting OLTP and Data Warehousing Environments
        Data Warehouse Architectures
        Data Warehouse Architecture (Basic)
        Data Warehouse Architecture (with a Staging Area)
        Data Warehouse Architecture (with a Staging Area and Data Marts)
    Part II Logical Design
      2 Logical Design in Data Warehouses
        Logical Versus Physical Design in Data Warehouses
  • 4.
        Creating a Logical Design
        Data Warehousing Schemas
        Star Schemas
        Other Schemas
        Data Warehousing Objects
        Fact Tables
        Dimension Tables
        Unique Identifiers
        Relationships
        Example of Data Warehousing Objects and Their Relationships
    Part III Physical Design
      3 Physical Design in Data Warehouses
        Moving from Logical to Physical Design
        Physical Design
        Physical Design Structures
        Tablespaces
        Tables and Partitioned Tables
        Views
        Integrity Constraints
        Indexes and Partitioned Indexes
        Materialized Views
        Dimensions
      4 Hardware and I/O Considerations in Data Warehouses
        Overview of Hardware and I/O Considerations in Data Warehouses
        Why Stripe the Data?
        Automatic Striping
        Manual Striping
        Local and Global Striping
        Analyzing Striping
        RAID Configurations
        RAID 0 (Striping)
  • 5.
        RAID 1 (Mirroring)
        RAID 0+1 (Striping and Mirroring)
        Striping, Mirroring, and Media Recovery
        RAID 5
        The Importance of Specific Analysis
      5 Parallelism and Partitioning in Data Warehouses
        Overview of Parallel Execution
        When to Implement Parallel Execution
        Granules of Parallelism
        Block Range Granules
        Partition Granules
        Partitioning Design Considerations
        Types of Partitioning
        Partitioning and Data Segment Compression
        Partition Pruning
        Partition-Wise Joins
        Miscellaneous Partition Operations
        Adding Partitions
        Dropping Partitions
        Exchanging Partitions
        Moving Partitions
        Splitting and Merging Partitions
        Truncating Partitions
        Coalescing Partitions
      6 Indexes
        Bitmap Indexes
        Bitmap Join Indexes
        B-tree Indexes
        Local Indexes Versus Global Indexes
      7 Integrity Constraints
        Why Integrity Constraints are Useful in a Data Warehouse
  • 6.
        Overview of Constraint States
        Typical Data Warehouse Integrity Constraints
        UNIQUE Constraints in a Data Warehouse
        FOREIGN KEY Constraints in a Data Warehouse
        RELY Constraints
        Integrity Constraints and Parallelism
        Integrity Constraints and Partitioning
        View Constraints
      8 Materialized Views
        Overview of Data Warehousing with Materialized Views
        Materialized Views for Data Warehouses
        Materialized Views for Distributed Computing
        Materialized Views for Mobile Computing
        The Need for Materialized Views
        Components of Summary Management
        Data Warehousing Terminology
        Materialized View Schema Design
        Loading Data
        Overview of Materialized View Management Tasks
        Types of Materialized Views
        Materialized Views with Aggregates
        Materialized Views Containing Only Joins
        Nested Materialized Views
        Creating Materialized Views
        Naming Materialized Views
        Storage And Data Segment Compression
        Build Methods
        Enabling Query Rewrite
        Query Rewrite Restrictions
        Refresh Options
        ORDER BY Clause
        Materialized View Logs
        Using Oracle Enterprise Manager
        Using Materialized Views with NLS Parameters
  • 7.
        Registering Existing Materialized Views
        Partitioning and Materialized Views
        Partition Change Tracking
        Partitioning a Materialized View
        Partitioning a Prebuilt Table
        Rolling Materialized Views
        Materialized Views in OLAP Environments
        OLAP Cubes
        Specifying OLAP Cubes in SQL
        Querying OLAP Cubes in SQL
        Partitioning Materialized Views for OLAP
        Compressing Materialized Views for OLAP
        Materialized Views with Set Operators
        Choosing Indexes for Materialized Views
        Invalidating Materialized Views
        Security Issues with Materialized Views
        Altering Materialized Views
        Dropping Materialized Views
        Analyzing Materialized View Capabilities
        Using the DBMS_MVIEW.EXPLAIN_MVIEW Procedure
        MV_CAPABILITIES_TABLE.CAPABILITY_NAME Details
        MV_CAPABILITIES_TABLE Column Details
      9 Dimensions
        What are Dimensions?
        Creating Dimensions
        Multiple Hierarchies
        Using Normalized Dimension Tables
        Viewing Dimensions
        Using The DEMO_DIM Package
        Using Oracle Enterprise Manager
        Using Dimensions with Constraints
        Validating Dimensions
        Altering Dimensions
        Deleting Dimensions
  • 8.
        Using the Dimension Wizard
        Managing the Dimension Object
        Creating a Dimension
    Part IV Managing the Warehouse Environment
      10 Overview of Extraction, Transformation, and Loading
        Overview of ETL
        ETL Tools
        Daily Operations
        Evolution of the Data Warehouse
      11 Extraction in Data Warehouses
        Overview of Extraction in Data Warehouses
        Introduction to Extraction Methods in Data Warehouses
        Logical Extraction Methods
        Physical Extraction Methods
        Change Data Capture
        Data Warehousing Extraction Examples
        Extraction Using Data Files
        Extraction Via Distributed Operations
      12 Transportation in Data Warehouses
        Overview of Transportation in Data Warehouses
        Introduction to Transportation Mechanisms in Data Warehouses
        Transportation Using Flat Files
        Transportation Through Distributed Operations
        Transportation Using Transportable Tablespaces
      13 Loading and Transformation
        Overview of Loading and Transformation in Data Warehouses
        Transformation Flow
        Loading Mechanisms
        SQL*Loader
  • 9.
        External Tables
        OCI and Direct-Path APIs
        Export/Import
        Transformation Mechanisms
        Transformation Using SQL
        Transformation Using PL/SQL
        Transformation Using Table Functions
        Loading and Transformation Scenarios
        Parallel Load Scenario
        Key Lookup Scenario
        Exception Handling Scenario
        Pivoting Scenarios
      14 Maintaining the Data Warehouse
        Using Partitioning to Improve Data Warehouse Refresh
        Refresh Scenarios
        Scenarios for Using Partitioning for Refreshing Data Warehouses
        Optimizing DML Operations During Refresh
        Implementing an Efficient MERGE Operation
        Maintaining Referential Integrity
        Purging Data
        Refreshing Materialized Views
        Complete Refresh
        Fast Refresh
        ON COMMIT Refresh
        Manual Refresh Using the DBMS_MVIEW Package
        Refresh Specific Materialized Views with REFRESH
        Refresh All Materialized Views with REFRESH_ALL_MVIEWS
        Refresh Dependent Materialized Views with REFRESH_DEPENDENT
        Using Job Queues for Refresh
        When Refresh is Possible
        Recommended Initialization Parameters for Parallelism
        Monitoring a Refresh
        Checking the Status of a Materialized View
        Tips for Refreshing Materialized Views with Aggregates
  • 10.
        Tips for Refreshing Materialized Views Without Aggregates
        Tips for Refreshing Nested Materialized Views
        Tips for Fast Refresh with UNION ALL
        Tips After Refreshing Materialized Views
        Using Materialized Views with Partitioned Tables
        Fast Refresh with Partition Change Tracking
        Fast Refresh with CONSIDER FRESH
      15 Change Data Capture
        About Change Data Capture
        Publish and Subscribe Model
        Example of a Change Data Capture System
        Components and Terminology for Synchronous Change Data Capture
        Installation and Implementation
        Change Data Capture Restriction on Direct-Path INSERT
        Security
        Columns in a Change Table
        Change Data Capture Views
        Synchronous Mode of Data Capture
        Publishing Change Data
        Step 1: Decide which Oracle Instance will be the Source System
        Step 2: Create the Change Tables that will Contain the Changes
        Managing Change Tables and Subscriptions
        Subscribing to Change Data
        Steps Required to Subscribe to Change Data
        What Happens to Subscriptions when the Publisher Makes Changes
        Export and Import Considerations
      16 Summary Advisor
        Overview of the Summary Advisor in the DBMS_OLAP Package
        Using the Summary Advisor
        Identifier Numbers
        Workload Management
        Loading a User-Defined Workload
        Loading a Trace Workload
  • 11.
        Loading a SQL Cache Workload
        Validating a Workload
        Removing a Workload
        Using Filters with the Summary Advisor
        Removing a Filter
        Recommending Materialized Views
        SQL Script Generation
        Summary Data Report
        When Recommendations are No Longer Required
        Stopping the Recommendation Process
        Summary Advisor Sample Sessions
        Summary Advisor and Missing Statistics
        Summary Advisor Privileges and ORA-30446
        Estimating Materialized View Size
        ESTIMATE_MVIEW_SIZE Parameters
        Is a Materialized View Being Used?
        DBMS_OLAP.EVALUATE_MVIEW_STRATEGY Procedure
        Summary Advisor Wizard
        Summary Advisor Steps
    Part V Warehouse Performance
      17 Schema Modeling Techniques
        Schemas in Data Warehouses
        Third Normal Form
        Optimizing Third Normal Form Queries
        Star Schemas
        Snowflake Schemas
        Optimizing Star Queries
        Tuning Star Queries
        Using Star Transformation
      18 SQL for Aggregation in Data Warehouses
        Overview of SQL for Aggregation in Data Warehouses
    Analyzing Across Multiple Dimensions
    Optimized Performance
    An Aggregate Scenario
    Interpreting NULLs in Examples
  ROLLUP Extension to GROUP BY
    When to Use ROLLUP
    ROLLUP Syntax
    Partial Rollup
  CUBE Extension to GROUP BY
    When to Use CUBE
    CUBE Syntax
    Partial CUBE
    Calculating Subtotals Without CUBE
  GROUPING Functions
    GROUPING Function
    When to Use GROUPING
    GROUPING_ID Function
    GROUP_ID Function
  GROUPING SETS Expression
  Composite Columns
  Concatenated Groupings
    Concatenated Groupings and Hierarchical Data Cubes
  Considerations when Using Aggregation
    Hierarchy Handling in ROLLUP and CUBE
    Column Capacity in ROLLUP and CUBE
    HAVING Clause Used with GROUP BY Extensions
    ORDER BY Clause Used with GROUP BY Extensions
    Using Other Aggregate Functions with ROLLUP and CUBE
  Computation Using the WITH Clause

19 SQL for Analysis in Data Warehouses
  Overview of SQL for Analysis in Data Warehouses
  Ranking Functions
    RANK and DENSE_RANK
    Top N Ranking
    Bottom N Ranking
    CUME_DIST
    PERCENT_RANK
    NTILE
    ROW_NUMBER
  Windowing Aggregate Functions
    Treatment of NULLs as Input to Window Functions
    Windowing Functions with Logical Offset
    Cumulative Aggregate Function Example
    Moving Aggregate Function Example
    Centered Aggregate Function
    Windowing Aggregate Functions in the Presence of Duplicates
    Varying Window Size for Each Row
    Windowing Aggregate Functions with Physical Offsets
    FIRST_VALUE and LAST_VALUE
  Reporting Aggregate Functions
    Reporting Aggregate Example
    RATIO_TO_REPORT
  LAG/LEAD Functions
    LAG/LEAD Syntax
  FIRST/LAST Functions
    FIRST/LAST Syntax
    FIRST/LAST As Regular Aggregates
    FIRST/LAST As Reporting Aggregates
  Linear Regression Functions
    REGR_COUNT
    REGR_AVGY and REGR_AVGX
    REGR_SLOPE and REGR_INTERCEPT
    REGR_R2
    REGR_SXX, REGR_SYY, and REGR_SXY
    Linear Regression Statistics Examples
    Sample Linear Regression Calculation
  Inverse Percentile Functions
    Normal Aggregate Syntax
    Inverse Percentile Restrictions
  Hypothetical Rank and Distribution Functions
    Hypothetical Rank and Distribution Syntax
  WIDTH_BUCKET Function
    WIDTH_BUCKET Syntax
  User-Defined Aggregate Functions
  CASE Expressions
    CASE Example
    Creating Histograms With User-Defined Buckets

20 OLAP and Data Mining
  OLAP
    Benefits of OLAP and RDBMS Integration
  Data Mining
    Enabling Data Mining Applications
    Predictions and Insights
    Mining Within the Database Architecture
    Java API

21 Using Parallel Execution
  Introduction to Parallel Execution Tuning
    When to Implement Parallel Execution
    Operations That Can Be Parallelized
    The Parallel Execution Server Pool
    How Parallel Execution Servers Communicate
    Parallelizing SQL Statements
  Types of Parallelism
    Parallel Query
    Parallel DDL
    Parallel DML
    Parallel Execution of Functions
    Other Types of Parallelism
  Initializing and Tuning Parameters for Parallel Execution
    Selecting Automated or Manual Tuning of Parallel Execution
    Using Automatically Derived Parameter Settings
    Setting the Degree of Parallelism
    How Oracle Determines the Degree of Parallelism for Operations
    Balancing the Workload
    Parallelization Rules for SQL Statements
    Enabling Parallelism for Tables and Queries
    Degree of Parallelism and Adaptive Multiuser: How They Interact
    Forcing Parallel Execution for a Session
    Controlling Performance with the Degree of Parallelism
  Tuning General Parameters for Parallel Execution
    Parameters Establishing Resource Limits for Parallel Operations
    Parameters Affecting Resource Consumption
    Parameters Related to I/O
  Monitoring and Diagnosing Parallel Execution Performance
    Is There Regression?
    Is There a Plan Change?
    Is There a Parallel Plan?
    Is There a Serial Plan?
    Is There Parallel Execution?
    Is the Workload Evenly Distributed?
    Monitoring Parallel Execution Performance with Dynamic Performance Views
    Monitoring Session Statistics
    Monitoring System Statistics
    Monitoring Operating System Statistics
  Affinity and Parallel Operations
    Affinity and Parallel Queries
    Affinity and Parallel DML
  Miscellaneous Parallel Execution Tuning Tips
    Setting Buffer Cache Size for Parallel Operations
    Overriding the Default Degree of Parallelism
    Rewriting SQL Statements
    Creating and Populating Tables in Parallel
    Creating Temporary Tablespaces for Parallel Sort and Hash Join
    Executing Parallel SQL Statements
    Using EXPLAIN PLAN to Show Parallel Operations Plans
    Additional Considerations for Parallel DML
    Creating Indexes in Parallel
    Parallel DML Tips
    Incremental Data Loading in Parallel
    Using Hints with Cost-Based Optimization
    FIRST_ROWS(n) Hint
    Enabling Dynamic Statistic Sampling

22 Query Rewrite
  Overview of Query Rewrite
    Cost-Based Rewrite
    When Does Oracle Rewrite a Query?
  Enabling Query Rewrite
    Initialization Parameters for Query Rewrite
    Controlling Query Rewrite
    Privileges for Enabling Query Rewrite
    Accuracy of Query Rewrite
  How Oracle Rewrites Queries
    Text Match Rewrite Methods
    General Query Rewrite Methods
    When are Constraints and Dimensions Needed?
  Special Cases for Query Rewrite
    Query Rewrite Using Partially Stale Materialized Views
    Query Rewrite Using Complex Materialized Views
    Query Rewrite Using Nested Materialized Views
    Query Rewrite When Using GROUP BY Extensions
  Did Query Rewrite Occur?
    Explain Plan
    DBMS_MVIEW.EXPLAIN_REWRITE Procedure
  Design Considerations for Improving Query Rewrite Capabilities
    Query Rewrite Considerations: Constraints
    Query Rewrite Considerations: Dimensions
    Query Rewrite Considerations: Outer Joins
    Query Rewrite Considerations: Text Match
    Query Rewrite Considerations: Aggregates
    Query Rewrite Considerations: Grouping Conditions
    Query Rewrite Considerations: Expression Matching
    Query Rewrite Considerations: Date Folding
    Query Rewrite Considerations: Statistics

Glossary

Index
Send Us Your Comments

Oracle9i Data Warehousing Guide, Release 2 (9.2)
Part No. A96520-01

Oracle Corporation welcomes your comments and suggestions on the quality and usefulness of this document. Your input is an important part of the information used for revision.

- Did you find any errors?
- Is the information clearly presented?
- Do you need more information? If so, where?
- Are the examples correct? Do you need more examples?
- What features did you like most?

If you find any errors or have any other suggestions for improvement, please indicate the document title and part number, and the chapter, section, and page number (if available). You can send comments to us in the following ways:

- Electronic mail: infodev_us@oracle.com
- FAX: (650) 506-7227 Attn: Server Technologies Documentation Manager
- Postal service:
  Oracle Corporation
  Server Technologies Documentation
  500 Oracle Parkway, Mailstop 4op11
  Redwood Shores, CA 94065
  USA

If you would like a reply, please give your name, address, telephone number, and (optionally) electronic mail address.

If you have problems with the software, please contact your local Oracle Support Services.
Preface

This manual provides information about Oracle9i’s data warehousing capabilities. This preface contains these topics:

- Audience
- Organization
- Related Documentation
- Conventions
- Documentation Accessibility
Audience

Oracle9i Data Warehousing Guide is intended for database administrators, system administrators, and database application developers who design, maintain, and use data warehouses.

To use this document, you need to be familiar with relational database concepts, basic Oracle server concepts, and the operating system environment under which you are running Oracle.

Organization

This document contains:

Part 1: Concepts

Chapter 1, Data Warehousing Concepts
This chapter contains an overview of data warehousing concepts.

Part 2: Logical Design

Chapter 2, Logical Design in Data Warehouses
This chapter discusses the logical design of a data warehouse.

Part 3: Physical Design

Chapter 3, Physical Design in Data Warehouses
This chapter discusses the physical design of a data warehouse.

Chapter 4, Hardware and I/O Considerations in Data Warehouses
This chapter describes some hardware and input-output issues.

Chapter 5, Parallelism and Partitioning in Data Warehouses
This chapter describes the basics of parallelism and partitioning in data warehouses.

Chapter 6, Indexes
This chapter describes how to use indexes in data warehouses.
Chapter 7, Integrity Constraints
This chapter describes some issues involving constraints.

Chapter 8, Materialized Views
This chapter describes how to use materialized views in data warehouses.

Chapter 9, Dimensions
This chapter describes how to use dimensions in data warehouses.

Part 4: Managing the Warehouse Environment

Chapter 10, Overview of Extraction, Transformation, and Loading
This chapter is an overview of the ETL process.

Chapter 11, Extraction in Data Warehouses
This chapter describes extraction issues.

Chapter 12, Transportation in Data Warehouses
This chapter describes transporting data in data warehouses.

Chapter 13, Loading and Transformation
This chapter describes transforming data in data warehouses.

Chapter 14, Maintaining the Data Warehouse
This chapter describes how to refresh in a data warehousing environment.

Chapter 15, Change Data Capture
This chapter describes how to use Change Data Capture capabilities.

Chapter 16, Summary Advisor
This chapter describes how to use the Summary Advisor utility.
Part 5: Warehouse Performance

Chapter 17, Schema Modeling Techniques
This chapter describes the schemas useful in data warehousing environments.

Chapter 18, SQL for Aggregation in Data Warehouses
This chapter explains how to use SQL aggregation in data warehouses.

Chapter 19, SQL for Analysis in Data Warehouses
This chapter explains how to use analytic functions in data warehouses.

Chapter 20, OLAP and Data Mining
This chapter describes using analytic services in combination with Oracle9i.

Chapter 21, Using Parallel Execution
This chapter describes how to tune data warehouses using parallel execution.

Chapter 22, Query Rewrite
This chapter describes how to use query rewrite.

Glossary

Related Documentation

For more information, see these Oracle resources:

- Oracle9i Database Performance Tuning Guide and Reference

Many of the examples in this book use the sample schemas of the seed database, which is installed by default when you install Oracle. Refer to Oracle9i Sample Schemas for information on how these schemas were created and how you can use them yourself.

In North America, printed documentation is available for sale in the Oracle Store at

http://oraclestore.oracle.com/
Customers in Europe, the Middle East, and Africa (EMEA) can purchase documentation from

http://www.oraclebookshop.com/

Other customers can contact their Oracle representative to purchase printed documentation.

To download free release notes, installation documentation, white papers, or other collateral, please visit the Oracle Technology Network (OTN). You must register online before using OTN; registration is free and can be done at

http://otn.oracle.com/admin/account/membership.html

If you already have a username and password for OTN, then you can go directly to the documentation section of the OTN Web site at

http://otn.oracle.com/docs/index.htm

To access the database documentation search engine directly, please visit

http://tahiti.oracle.com

For additional information, see:

- The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
- Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)

Conventions

This section describes the conventions used in the text and code examples of this documentation set. It describes:

- Conventions in Text
- Conventions in Code Examples
- Conventions for Windows Operating Systems
Conventions in Text

We use various conventions in text to help you more quickly identify special terms. The following table describes those conventions and provides examples of their use.

Convention: Bold
Meaning: Bold typeface indicates terms that are defined in the text or terms that appear in a glossary, or both.
Example: When you specify this clause, you create an index-organized table.

Convention: Italics
Meaning: Italic typeface indicates book titles or emphasis.
Examples:
  Oracle9i Database Concepts
  Ensure that the recovery catalog and target database do not reside on the same disk.

Convention: UPPERCASE monospace (fixed-width) font
Meaning: Uppercase monospace typeface indicates elements supplied by the system. Such elements include parameters, privileges, datatypes, RMAN keywords, SQL keywords, SQL*Plus or utility commands, packages and methods, as well as system-supplied column names, database objects and structures, usernames, and roles.
Examples:
  You can specify this clause only for a NUMBER column.
  You can back up the database by using the BACKUP command.
  Query the TABLE_NAME column in the USER_TABLES data dictionary view.
  Use the DBMS_STATS.GENERATE_STATS procedure.

Convention: lowercase monospace (fixed-width) font
Meaning: Lowercase monospace typeface indicates executables, filenames, directory names, and sample user-supplied elements. Such elements include computer and database names, net service names, and connect identifiers, as well as user-supplied database objects and structures, column names, packages and classes, usernames and roles, program units, and parameter values.
  Note: Some programmatic elements use a mixture of UPPERCASE and lowercase. Enter these elements as shown.
Examples:
  Enter sqlplus to open SQL*Plus.
  The password is specified in the orapwd file.
  Back up the datafiles and control files in the /disk1/oracle/dbs directory.
  The department_id, department_name, and location_id columns are in the hr.departments table.
  Set the QUERY_REWRITE_ENABLED initialization parameter to true.
  Connect as oe user.
  The JRepUtil class implements these methods.

Convention: lowercase italic monospace (fixed-width) font
Meaning: Lowercase italic monospace font represents placeholders or variables.
Examples:
  You can specify the parallel_clause.
  Run Uold_release.SQL where old_release refers to the release you installed prior to upgrading.
  • 27. xxvii Conventions in Code Examples Code examples illustrate SQL, PL/SQL, SQL*Plus, or other command-line statements. They are displayed in a monospace (fixed-width) font and separated from normal text as shown in this example: SELECT username FROM dba_users WHERE username = ’MIGRATE’; The following table describes typographic conventions used in code examples and provides examples of their use. Convention Meaning Example [ ] Brackets enclose one or more optional items. Do not enter the brackets. DECIMAL (digits [ , precision ]) { } Braces enclose two or more items, one of which is required. Do not enter the braces. {ENABLE | DISABLE} | A vertical bar represents a choice of two or more options within brackets or braces. Enter one of the options. Do not enter the vertical bar. {ENABLE | DISABLE} [COMPRESS | NOCOMPRESS] ... Horizontal ellipsis points indicate either: s That we have omitted parts of the code that are not directly related to the example s That you can repeat a portion of the code CREATE TABLE ... AS subquery; SELECT col1, col2, ... , coln FROM employees; . . . Vertical ellipsis points indicate that we have omitted several lines of code not directly related to the example. SQL> SELECT NAME FROM V$DATAFILE; NAME ------------------------------------ /fsl/dbs/tbs_01.dbf /fs1/dbs/tbs_02.dbf . . . /fsl/dbs/tbs_09.dbf 9 rows selected. Other notation You must enter symbols other than brackets, braces, vertical bars, and ellipsis points as shown. acctbal NUMBER(11,2); acct CONSTANT NUMBER(4) := 3;
  • 28. Convention: Italics
    Meaning: Italicized text indicates placeholders or variables for which you must supply particular values.
    Examples: CONNECT SYSTEM/system_password
              DB_NAME = database_name

Convention: UPPERCASE
    Meaning: Uppercase typeface indicates elements supplied by the system. We show these terms in uppercase in order to distinguish them from terms you define. Unless terms appear in brackets, enter them in the order and with the spelling shown. However, because these terms are not case sensitive, you can enter them in lowercase.
    Examples: SELECT last_name, employee_id FROM employees;
              SELECT * FROM USER_TABLES;
              DROP TABLE hr.employees;

Convention: lowercase
    Meaning: Lowercase typeface indicates programmatic elements that you supply. For example, lowercase indicates names of tables, columns, or files.
             Note: Some programmatic elements use a mixture of UPPERCASE and lowercase. Enter these elements as shown.
    Examples: SELECT last_name, employee_id FROM employees;
              sqlplus hr/hr
              CREATE USER mjones IDENTIFIED BY ty3MU9;

Conventions for Windows Operating Systems
The following table describes conventions for Windows operating systems and provides examples of their use.

Convention: Choose Start >
    Meaning: How to start a program.
    Example: To start the Database Configuration Assistant, choose Start > Programs > Oracle - HOME_NAME > Configuration and Migration Tools > Database Configuration Assistant.

Convention: File and directory names
    Meaning: File and directory names are not case sensitive. The following special characters are not allowed: left angle bracket (<), right angle bracket (>), colon (:), double quotation marks ("), slash (/), pipe (|), and dash (-). The special character backslash (\) is treated as an element separator, even when it appears in quotes. If the file name begins with \\, then Windows assumes it uses the Universal Naming Convention.
    Example: c:\winnt"\"system32 is the same as C:\WINNT\SYSTEM32
  • 29. Convention: C:\>
    Meaning: Represents the Windows command prompt of the current hard disk drive. The escape character in a command prompt is the caret (^). Your prompt reflects the subdirectory in which you are working. Referred to as the command prompt in this manual.
    Example: C:\oracle\oradata>

Convention: Special characters
    Meaning: The backslash (\) special character is sometimes required as an escape character for the double quotation mark (") special character at the Windows command prompt. Parentheses and the single quotation mark (') do not require an escape character. Refer to your Windows operating system documentation for more information on escape and special characters.
    Examples: C:\>exp scott/tiger TABLES=emp QUERY=\"WHERE job='SALESMAN' and sal<1600\"
              C:\>imp SYSTEM/password FROMUSER=scott TABLES=(emp, dept)

Convention: HOME_NAME
    Meaning: Represents the Oracle home name. The home name can be up to 16 alphanumeric characters. The only special character allowed in the home name is the underscore.
    Example: C:\> net start OracleHOME_NAMETNSListener
  • 30. Convention: ORACLE_HOME and ORACLE_BASE
    Meaning: In releases prior to Oracle8i release 8.1.3, when you installed Oracle components, all subdirectories were located under a top level ORACLE_HOME directory that by default used one of the following names:
             - C:\orant for Windows NT
             - C:\orawin98 for Windows 98
             This release complies with Optimal Flexible Architecture (OFA) guidelines. All subdirectories are not under a top level ORACLE_HOME directory. There is a top level directory called ORACLE_BASE that by default is C:\oracle. If you install the latest Oracle release on a computer with no other Oracle software installed, then the default setting for the first Oracle home directory is C:\oracle\orann, where nn is the latest release number. The Oracle home directory is located directly under ORACLE_BASE.
             All directory path examples in this guide follow OFA conventions.
             Refer to Oracle9i Database Getting Started for Windows for additional information about OFA compliances and for information about installing Oracle products in non-OFA compliant directories.
    Example: Go to the ORACLE_BASE\ORACLE_HOME\rdbms\admin directory.
  • 31. Documentation Accessibility
Our goal is to make Oracle products, services, and supporting documentation accessible, with good usability, to the disabled community. To that end, our documentation includes features that make information available to users of assistive technology. This documentation is available in HTML format, and contains markup to facilitate access by the disabled community. Standards will continue to evolve over time, and Oracle Corporation is actively engaged with other market-leading technology vendors to address technical obstacles so that our documentation can be accessible to all of our customers. For additional information, visit the Oracle Accessibility Program Web site at http://www.oracle.com/accessibility/

Accessibility of Code Examples in Documentation
JAWS, a Windows screen reader, may not always correctly read the code examples in this document. The conventions for writing code require that closing braces should appear on an otherwise empty line; however, JAWS may not always read a line of text that consists solely of a bracket or brace.

Accessibility of Links to External Web Sites in Documentation
This documentation may contain links to Web sites of other companies or organizations that Oracle Corporation does not own or control. Oracle Corporation neither evaluates nor makes any representations regarding the accessibility of these Web sites.
  • 33. What’s New in Data Warehousing?
This section describes new features of Oracle9i release 2 (9.2) and provides pointers to additional information. New features information from previous releases is also retained to help those users migrating to the current release.
The following sections describe the new features in Oracle Data Warehousing:
- Oracle9i Release 2 (9.2) New Features in Data Warehousing
- Oracle9i Release 1 (9.0.1) New Features in Data Warehousing
  • 34. Oracle9i Release 2 (9.2) New Features in Data Warehousing
- Data Segment Compression
  You can compress data segments in heap-organized tables. Partitioned tables are a typical example of heap-organized tables that you should consider for data segment compression. Data segment compression is also useful for highly redundant data, such as tables with many foreign keys and materialized views created with the ROLLUP clause. Avoid compression on tables that are subject to frequent updates or other DML.
  See Also: Chapter 8, "Materialized Views"
- Materialized View Enhancements
  You can now nest materialized views when the materialized view contains joins and aggregates. Fast refresh is now possible on a materialized view containing the UNION ALL operator. Various restrictions were removed in addition to expanding the situations where materialized views could be effectively used. In particular, using materialized views in an OLAP environment has been improved.
  See Also: "Overview of Data Warehousing with Materialized Views" on page 8-2 and "Materialized Views in OLAP Environments" on page 8-41, and Chapter 14, "Maintaining the Data Warehouse"
- Parallel DML on Non-Partitioned Tables
  You can now use parallel DML on non-partitioned tables.
  See Also: Chapter 21, "Using Parallel Execution"
- Partitioning Enhancements
  You can now simplify SQL syntax by using a DEFAULT partition or a subpartition template. You can implement SPLIT operations more easily.
  See Also: "Partitioning Methods" on page 5-5, Chapter 5, "Parallelism and Partitioning in Data Warehouses", and Oracle9i Database Administrator’s Guide
  • 35. - Query Rewrite Enhancements
  Text match processing and join equivalence recognition have been improved. Materialized views containing the UNION ALL operator can now use query rewrite.
  See Also: Chapter 22, "Query Rewrite"
- Range-List Partitioning
  You can now subpartition range-partitioned tables by list.
  See Also: "Types of Partitioning" on page 5-4
- Summary Advisor Enhancements
  The Summary Advisor tool and its related DBMS_OLAP package were improved so you can restrict workloads to a specific schema.
  See Also: Chapter 16, "Summary Advisor"

Oracle9i Release 1 (9.0.1) New Features in Data Warehousing
- Analytic Functions
  Oracle’s analytic capabilities have been improved through the addition of inverse percentile, hypothetical distribution, and first/last analytic functions.
  See Also: Chapter 19, "SQL for Analysis in Data Warehouses"
- Bitmap Join Index
  A bitmap join index spans multiple tables and improves the performance of joins of those tables.
  See Also: "Bitmap Indexes" on page 6-2
- ETL Enhancements
  Oracle’s extraction, transformation, and loading capabilities have been improved with a MERGE statement, multi-table inserts, and table functions.
  See Also: Chapter 10, "Overview of Extraction, Transformation, and Loading"
  • 36. - Full Outer Joins
  Oracle added full support for full outer joins so that you can more easily express certain complex queries.
  See Also: Oracle9i Database Performance Tuning Guide and Reference
- Grouping Sets
  You can now selectively specify the set of groups that you want to create using a GROUPING SETS expression within a GROUP BY clause. This allows precise specification across multiple dimensions without computing the whole CUBE.
  See Also: Chapter 18, "SQL for Aggregation in Data Warehouses"
- List Partitioning
  List partitioning offers you precise control over which data belongs in a particular partition.
  See Also: "Partitioning Design Considerations" on page 5-4 and Oracle9i Database Concepts, and Oracle9i Database Administrator’s Guide
- Materialized View Enhancements
  Various restrictions were removed in addition to expanding the situations where materialized views could be effectively used.
  See Also: "Overview of Data Warehousing with Materialized Views" on page 8-2
- Query Rewrite Enhancements
  The query rewrite feature, which allows many SQL statements to use materialized views and thereby improves performance significantly, was enhanced. Text match processing and join equivalence recognition have been improved.
  See Also: Chapter 22, "Query Rewrite"
  • 37. - Summary Advisor Enhancements
  The Summary Advisor tool and its related DBMS_OLAP package were improved so you can specify workloads. In addition, a broader class of schemas is now supported.
  See Also: Chapter 16, "Summary Advisor"
- WITH Clause
  The WITH clause enables you to reuse a query block in a SELECT statement when it occurs more than once within a complex query.
  See Also: "Computation Using the WITH Clause" on page 18-30
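Two of the features listed above, GROUPING SETS and the WITH clause, are easiest to see in a short query. The following is a minimal sketch, not taken from the guide: the table sales_new_demo and its columns are hypothetical names invented for illustration.

```sql
-- Hypothetical table, for illustration only (not part of the guide's sample schema).
CREATE TABLE sales_new_demo (
  channel      VARCHAR2(20),
  country      VARCHAR2(20),
  amount_sold  NUMBER(10,2)
);

-- GROUPING SETS: one pass that returns totals by channel and totals by country,
-- without computing every combination as CUBE would.
SELECT channel, country, SUM(amount_sold) AS total_sold
FROM   sales_new_demo
GROUP  BY GROUPING SETS ((channel), (country));

-- WITH clause: define the query block channel_totals once and reference it twice.
WITH channel_totals AS (
  SELECT channel, SUM(amount_sold) AS total_sold
  FROM   sales_new_demo
  GROUP  BY channel
)
SELECT channel, total_sold
FROM   channel_totals
WHERE  total_sold > (SELECT AVG(total_sold) FROM channel_totals);
```

In the first query, rows for the channel grouping carry NULL in the country column and the other way around, which is how the two groupings can be told apart in the result.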
  • 39. Part I Concepts
This section introduces basic data warehousing concepts. It contains the following chapter:
- Data Warehousing Concepts
  • 41. 1 Data Warehousing Concepts
This chapter provides an overview of the Oracle data warehousing implementation. It includes:
- What is a Data Warehouse?
- Data Warehouse Architectures
Note that this book is meant as a supplement to standard texts about data warehousing. This book focuses on Oracle-specific material and does not reproduce in detail material of a general nature. Two standard texts are:
- The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
- Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)
  • 42. What is a Data Warehouse?
A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
See Also: Chapter 10, "Overview of Extraction, Transformation, and Loading"
A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon:
- Subject Oriented
- Integrated
- Nonvolatile
- Time Variant

Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more about your company’s sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

Integrated
Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.
  • 43. Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

Time Variant
In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse’s focus on change over time is what is meant by the term time variant.

Contrasting OLTP and Data Warehousing Environments
Figure 1–1 illustrates key differences between an OLTP system and a data warehouse.

Figure 1–1 Contrasting OLTP and Data Warehousing Environments

                               OLTP                       Data Warehouse
Data structures                Complex (3NF databases)    Multidimensional
Indexes                        Few                        Many
Joins                          Many                       Some
Duplicated data                Normalized DBMS            Denormalized DBMS
Derived data and aggregates    Rare                       Common

One major difference between the types of system is that data warehouses are not usually in third normal form (3NF), a type of data normalization common in OLTP environments.
  • 44. Data warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:
- Workload
  Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations.
  OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.
- Data modifications
  A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse.
  In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.
- Schema design
  Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance.
  OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.
- Typical operations
  A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month."
  A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."
- Historical data
  Data warehouses usually store many months or years of data. This is to support historical analysis.
  OLTP systems usually store data from only a few weeks or months. The OLTP system stores only historical data as needed to successfully meet the requirements of the current transaction.
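The two "typical operations" quoted in the list above can be sketched as SQL. This is an illustration under stated assumptions: the table orders_demo, its columns, the 'OPEN' status value, and the bind variable :customer_id are all hypothetical and do not come from the guide.

```sql
-- Hypothetical order table, for illustration only.
CREATE TABLE orders_demo (
  order_id     NUMBER        PRIMARY KEY,
  customer_id  NUMBER        NOT NULL,
  order_date   DATE          NOT NULL,
  status       VARCHAR2(10),
  amount       NUMBER(10,2)
);

-- Data warehouse style: scan and aggregate every order from the previous calendar month.
-- "Find the total sales for all customers last month."
SELECT SUM(amount) AS total_sales
FROM   orders_demo
WHERE  order_date >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)
AND    order_date <  TRUNC(SYSDATE, 'MM');

-- OLTP style: fetch a handful of rows for one customer.
-- "Retrieve the current order for this customer."
-- :customer_id is a bind variable supplied by the application;
-- 'OPEN' is an assumed status code meaning the order is still current.
SELECT order_id, order_date, status, amount
FROM   orders_demo
WHERE  customer_id = :customer_id
AND    status = 'OPEN';
```

The first statement reads a large date range and returns one row. The second is selective, so in practice it would likely be served by an index on customer_id.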
  • 45. Data Warehouse Architectures
Data warehouses and their architectures vary depending upon the specifics of an organization's situation. Three common architectures are:
- Data Warehouse Architecture (Basic)
- Data Warehouse Architecture (with a Staging Area)
- Data Warehouse Architecture (with a Staging Area and Data Marts)

Data Warehouse Architecture (Basic)
Figure 1–2 shows a simple architecture for a data warehouse. End users directly access data derived from several source systems through the data warehouse.

Figure 1–2 Architecture of a Data Warehouse
[Figure: data sources (operational systems and flat files) feed the warehouse, which holds metadata, raw data, and summary data; users access the warehouse for analysis, reporting, and mining.]

In Figure 1–2, the metadata and raw data of a traditional OLTP system are present, as is an additional type of data, summary data. Summaries are very valuable in data warehouses because they pre-compute long operations. For example, a typical data warehouse query is to retrieve something like August sales. A summary in Oracle is called a materialized view.
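As a rough illustration of a summary, the following sketch pre-computes monthly sales totals in a materialized view, so that a question such as "August sales" reads one stored row instead of scanning the detail data. The names sales_fact_demo and monthly_sales_mv, and the year in the final query, are assumptions made for this example; the refresh options shown are one possible choice, not a recommendation from the guide.

```sql
-- Hypothetical detail table, for illustration only.
CREATE TABLE sales_fact_demo (
  time_id      DATE    NOT NULL,
  prod_id      NUMBER  NOT NULL,
  amount_sold  NUMBER(10,2)
);

-- The summary: one row per calendar month, computed when the view is built
-- and recomputed only when a refresh is requested.
CREATE MATERIALIZED VIEW monthly_sales_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT TRUNC(time_id, 'MM') AS sales_month,
       SUM(amount_sold)     AS total_sold
FROM   sales_fact_demo
GROUP  BY TRUNC(time_id, 'MM');

-- "August sales" answered from the summary (the year is an arbitrary example).
SELECT total_sold
FROM   monthly_sales_mv
WHERE  sales_month = DATE '2001-08-01';
```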
  • 46. Data Warehouse Architecture (with a Staging Area)
In Figure 1–2, you need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead. A staging area simplifies building summaries and general warehouse management. Figure 1–3 illustrates this typical architecture.

Figure 1–3 Architecture of a Data Warehouse with a Staging Area
[Figure: data sources (operational systems and flat files) feed a staging area, which loads the warehouse (metadata, raw data, and summary data); users access the warehouse for analysis, reporting, and mining.]
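One way to picture the role of a staging area is a pair of tables: a loosely typed staging table that receives source data as delivered, and a strictly typed warehouse table that receives only cleaned rows. The sketch below assumes hypothetical tables stg_sales_demo and wh_sales_demo and a 'YYYY-MM-DD' date format in the source file; none of these names or formats come from the guide.

```sql
-- Hypothetical staging table: everything arrives as text, exactly as delivered.
CREATE TABLE stg_sales_demo (
  sale_date_txt  VARCHAR2(10),
  product_code   VARCHAR2(20),
  amount_txt     VARCHAR2(20)
);

-- Hypothetical warehouse table: typed and constrained.
CREATE TABLE wh_sales_demo (
  sale_date     DATE          NOT NULL,
  product_code  VARCHAR2(20)  NOT NULL,
  amount_sold   NUMBER(10,2)  NOT NULL
);

-- Clean and convert while moving rows from staging into the warehouse.
-- Rows with missing values are skipped here; a real cleansing step would also
-- have to deal with malformed dates and numbers, which would raise errors below.
INSERT INTO wh_sales_demo (sale_date, product_code, amount_sold)
SELECT TO_DATE(sale_date_txt, 'YYYY-MM-DD'),
       UPPER(TRIM(product_code)),
       TO_NUMBER(amount_txt)
FROM   stg_sales_demo
WHERE  sale_date_txt      IS NOT NULL
AND    TRIM(product_code) IS NOT NULL
AND    amount_txt         IS NOT NULL;

COMMIT;
```

Keeping the untyped copy separate means a failed or partial cleansing run leaves the warehouse table untouched.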
  • 47. Data Warehouse Architecture (with a Staging Area and Data Marts)
Although the architecture in Figure 1–3 is quite common, you may want to customize your warehouse’s architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business. Figure 1–4 illustrates an example where purchasing, sales, and inventories are separated. In this example, a financial analyst might want to analyze historical data for purchases and sales.

Figure 1–4 Architecture of a Data Warehouse with a Staging Area and Data Marts
[Figure: data sources (operational systems and flat files) feed a staging area, which loads the warehouse (metadata, raw data, and summary data); the warehouse feeds purchasing, sales, and inventory data marts, which users access for analysis, reporting, and mining.]

Note: Data marts are an important part of many warehouses, but they are not the focus of this book.
See Also: Data Mart Suites documentation for further information regarding data marts
  • 49. Part II Logical Design
This section deals with the issues in logical design in a data warehouse. It contains the following chapter:
- Logical Design in Data Warehouses
  • 51. 2 Logical Design in Data Warehouses
This chapter tells you how to design a data warehousing environment and includes the following topics:
- Logical Versus Physical Design in Data Warehouses
- Creating a Logical Design
- Data Warehousing Schemas
- Data Warehousing Objects
  • 52. Logical Versus Physical Design in Data Warehouses
Your organization has decided to build a data warehouse. You have defined the business requirements and agreed upon the scope of your application, and created a conceptual design. Now you need to translate your requirements into a system deliverable. To do so, you create the logical and physical design for the data warehouse. You then define:
- The specific data content
- Relationships within and between groups of data
- The system environment supporting your data warehouse
- The data transformations required
- The frequency with which data is refreshed
The logical design is more conceptual and abstract than the physical design. In the logical design, you look at the logical relationships among the objects. In the physical design, you look at the most effective way of storing and retrieving the objects as well as handling them from a transportation and backup/recovery perspective.
Orient your design toward the needs of the end users. End users typically want to perform analysis and look at aggregated data, rather than at individual transactions. However, end users might not know what they need until they see it. In addition, a well-planned design allows for growth and changes as the needs of users change and evolve.
By beginning with the logical design, you focus on the information requirements and save the implementation details for later.

Creating a Logical Design
A logical design is conceptual and abstract. You do not deal with the physical implementation details yet. You deal only with defining the types of information that you need.
One technique you can use to model your organization's logical information requirements is entity-relationship modeling. Entity-relationship modeling involves identifying the things of importance (entities), the properties of these things (attributes), and how they are related to one another (relationships).
The process of logical design involves arranging data into a series of logical relationships called entities and attributes. An entity represents a chunk of
  • 53. information. In relational databases, an entity often maps to a table. An attribute is a component of an entity that helps define the uniqueness of the entity. In relational databases, an attribute maps to a column.
To be sure that your data is consistent, you need to use unique identifiers. A unique identifier is something you add to tables so that you can differentiate between the same item when it appears in different places. In a physical design, this is usually a primary key.
While entity-relationship diagramming has traditionally been associated with highly normalized models such as OLTP applications, the technique is still useful for data warehouse design in the form of dimensional modeling. In dimensional modeling, instead of seeking to discover atomic units of information (such as entities and attributes) and all of the relationships between them, you identify which information belongs to a central fact table and which information belongs to its associated dimension tables. You identify business subjects or fields of data, define relationships between business subjects, and name the attributes for each subject.
See Also: Chapter 9, "Dimensions" for further information regarding dimensions
Your logical design should result in (1) a set of entities and attributes corresponding to fact tables and dimension tables and (2) a model that maps operational data from your source into subject-oriented information in your target data warehouse schema.
You can create the logical design using a pen and paper, or you can use a design tool such as Oracle Warehouse Builder (specifically designed to support modeling the ETL process) or Oracle Designer (a general purpose modeling tool).
See Also: Oracle Designer and Oracle Warehouse Builder documentation sets

Data Warehousing Schemas
A schema is a collection of database objects, including tables, views, indexes, and synonyms. You can arrange schema objects in the schema models designed for data warehousing in a variety of ways. Most data warehouses use a dimensional model.
The model of your source data and the requirements of your users help you design the data warehouse schema. You can sometimes get the source model from your company's enterprise data model and reverse-engineer the logical data model for the data warehouse from this. The physical implementation of the logical data
  • 54. warehouse model may require some changes to adapt it to your system parameters: size of machine, number of users, storage capacity, type of network, and software.

Star Schemas
The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables, as shown in Figure 2–1.

Figure 2–1 Star Schema
[Figure: the fact table sales (amount_sold, quantity_sold) at the center, joined to the dimension tables customers, products, channels, and times.]

The most natural way to model a data warehouse is as a star schema. Only one join establishes the relationship between the fact table and any one of the dimension tables.
A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row.
Note: Oracle Corporation recommends that you choose a star schema unless you have a clear reason not to.
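The star schema in Figure 2–1 can be sketched as table definitions. The table names and the two measures follow the figure; every other column, datatype, and key is an assumption chosen to keep the example short, and this is not the DDL of Oracle's sample schema.

```sql
-- Dimension tables (columns other than the keys are illustrative assumptions).
CREATE TABLE customers (cust_id    NUMBER PRIMARY KEY, cust_name    VARCHAR2(50));
CREATE TABLE products  (prod_id    NUMBER PRIMARY KEY, prod_name    VARCHAR2(50));
CREATE TABLE channels  (channel_id NUMBER PRIMARY KEY, channel_desc VARCHAR2(20));
CREATE TABLE times     (time_id    DATE   PRIMARY KEY, calendar_year NUMBER(4));

-- Fact table: one foreign key per dimension plus the numeric measures.
CREATE TABLE sales (
  cust_id        NUMBER NOT NULL REFERENCES customers (cust_id),
  prod_id        NUMBER NOT NULL REFERENCES products  (prod_id),
  channel_id     NUMBER NOT NULL REFERENCES channels  (channel_id),
  time_id        DATE   NOT NULL REFERENCES times     (time_id),
  amount_sold    NUMBER(10,2),
  quantity_sold  NUMBER
);

-- A star query: each dimension is reached from the fact table with exactly one join.
SELECT p.prod_name, t.calendar_year, SUM(s.amount_sold) AS total_sold
FROM   sales s
       JOIN products p ON p.prod_id = s.prod_id
       JOIN times    t ON t.time_id = s.time_id
GROUP  BY p.prod_name, t.calendar_year;
```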
  • 55. Other Schemas
Some schemas in data warehousing environments use third normal form rather than star schemas. Another schema that is sometimes useful is the snowflake schema, which is a star schema with normalized dimensions in a tree structure.
See Also: Chapter 17, "Schema Modeling Techniques" for further information regarding star and snowflake schemas in data warehouses and Oracle9i Database Concepts for further conceptual material

Data Warehousing Objects
Fact tables and dimension tables are the two types of objects commonly used in dimensional data warehouse schemas.
Fact tables are the large tables in your warehouse schema that store business measurements. Fact tables typically contain facts and foreign keys to the dimension tables. Fact tables represent data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.
Dimension tables, also known as lookup or reference tables, contain the relatively static data in the warehouse. Dimension tables store the information you normally use to constrain queries. Dimension tables are usually textual and descriptive and you can use them as the row headers of the result set. Examples are customers or products.

Fact Tables
A fact table typically has two types of columns: those that contain numeric facts (often called measurements), and those that are foreign keys to dimension tables. A fact table contains either detail-level facts or facts that have been aggregated. Fact tables that contain aggregated facts are often called summary tables. A fact table usually contains facts with the same level of aggregation. Though most facts are additive, they can also be semi-additive or non-additive. Additive facts can be aggregated by simple arithmetical addition. A common example of this is sales. Non-additive facts cannot be added at all. An example of this is averages. Semi-additive facts can be aggregated along some of the dimensions and not along others. An example of this is inventory levels: you can add them up across products or warehouses at a given point in time, but adding levels from different points in time does not yield a meaningful total.
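The difference between additive and semi-additive facts shows up in which aggregate is meaningful along which dimension. The sketch below uses a hypothetical table inventory_demo holding periodic inventory snapshots; the table, its columns, and the choice of AVG for the time dimension are illustrative assumptions rather than anything prescribed by the guide.

```sql
-- Hypothetical snapshot table: one row per product, warehouse, and snapshot date.
CREATE TABLE inventory_demo (
  snapshot_date  DATE    NOT NULL,
  prod_id        NUMBER  NOT NULL,
  warehouse_id   NUMBER  NOT NULL,
  units_on_hand  NUMBER  NOT NULL
);

-- Meaningful: add the levels across warehouses for the same date.
SELECT snapshot_date, prod_id, SUM(units_on_hand) AS units_on_hand
FROM   inventory_demo
GROUP  BY snapshot_date, prod_id;

-- Across time, a sum would count the same stock once per snapshot,
-- so an average (or the most recent snapshot) is used instead.
SELECT prod_id, warehouse_id, AVG(units_on_hand) AS avg_units_on_hand
FROM   inventory_demo
GROUP  BY prod_id, warehouse_id;
```

A fully additive fact such as sales amount could use SUM in both queries.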
  • 56. Creating a New Fact Table
You must define a fact table for each star schema. From a modeling standpoint, the primary key of the fact table is usually a composite key that is made up of all of its foreign keys.

Dimension Tables
A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Several distinct dimensions, combined with facts, enable you to answer business questions. Commonly used dimensions are customers, products, and time.
Dimension data is typically collected at the lowest level of detail and then aggregated into higher level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Hierarchies
Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure.
Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies, one for product categories and one for product suppliers.
Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse.
When designing hierarchies, you must consider the relationships in business structures, for example, a divisional multilevel sales organization.
Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly.
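The month-to-quarter-to-year aggregation described above can be sketched with GROUP BY ROLLUP. The tables times_demo and sales_by_day_demo and their columns are hypothetical names used only for this illustration.

```sql
-- Hypothetical time dimension: each day carries its month, quarter, and year.
CREATE TABLE times_demo (
  time_id           DATE         PRIMARY KEY,
  calendar_month    VARCHAR2(7)  NOT NULL,   -- for example '2001-08'
  calendar_quarter  VARCHAR2(7)  NOT NULL,   -- for example '2001-Q3'
  calendar_year     NUMBER(4)    NOT NULL
);

-- Hypothetical detail-level fact table.
CREATE TABLE sales_by_day_demo (
  time_id      DATE NOT NULL REFERENCES times_demo (time_id),
  amount_sold  NUMBER(10,2)
);

-- ROLLUP follows the hierarchy from the most general level to the most specific:
-- it returns month totals, quarter subtotals, year subtotals, and a grand total.
SELECT t.calendar_year, t.calendar_quarter, t.calendar_month,
       SUM(s.amount_sold) AS total_sold
FROM   sales_by_day_demo s
       JOIN times_demo t ON t.time_id = s.time_id
GROUP  BY ROLLUP (t.calendar_year, t.calendar_quarter, t.calendar_month);
```

Listing the columns from year down to month matters: ROLLUP drops columns from the right, so each subtotal row corresponds to a parent level in the hierarchy.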
  • 57. Levels
A level represents a position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels. Levels range from general to specific, with the root level as the highest or most general level. The levels in a dimension are organized into one or more hierarchies.

Level Relationships
Level relationships specify top-to-bottom ordering of levels from most general (the root) to most specific information. They define the parent-child relationship between the levels in a hierarchy.
Hierarchies are also essential components in enabling more complex rewrites. For example, the database can aggregate an existing sales revenue on a quarterly basis to a yearly aggregation when the dimensional dependencies between quarter and year are known.

Typical Dimension Hierarchy
Figure 2–2 illustrates a dimension hierarchy based on customers.

Figure 2–2 Typical Levels in a Dimension Hierarchy
[Figure: levels from most general to most specific: region, subregion, country_name, customer.]

See Also: Chapter 9, "Dimensions" and Chapter 22, "Query Rewrite" for further information regarding hierarchies
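The levels in Figure 2–2 can be declared to the database with a CREATE DIMENSION statement, which is how the parent-child relationships become known for purposes such as query rewrite. The sketch below assumes a single hypothetical, denormalized table customers_geo_demo; the table, the dimension name customers_geo_dim, and the hierarchy name geog_rollup are illustrative, not the guide's own definitions.

```sql
-- Hypothetical denormalized customer table, for illustration only.
CREATE TABLE customers_geo_demo (
  cust_id       NUMBER        PRIMARY KEY,
  country_name  VARCHAR2(40)  NOT NULL,
  subregion     VARCHAR2(30)  NOT NULL,
  region        VARCHAR2(20)  NOT NULL
);

-- Declare the levels and their ordering from most specific (customer)
-- to most general (region), matching the hierarchy in the figure.
CREATE DIMENSION customers_geo_dim
  LEVEL customer   IS (customers_geo_demo.cust_id)
  LEVEL country    IS (customers_geo_demo.country_name)
  LEVEL subregion  IS (customers_geo_demo.subregion)
  LEVEL region     IS (customers_geo_demo.region)
  HIERARCHY geog_rollup (
    customer   CHILD OF
    country    CHILD OF
    subregion  CHILD OF
    region
  );
```

The declaration is not enforced against the data. It states that each customer belongs to one country, each country to one subregion, and each subregion to one region, and the data must actually satisfy this for results that rely on it to be correct.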