2. What Is A Data Warehouse?
History
Current scenario
Characteristics
Operational Database vs. Data Warehouse
Architecture
Data Model
Gopal K KGDS
3. The term "data warehouse" refers to a special
type of database that acts as the central
repository for company data. It can be thought of
as a database archive that is segregated from the
operational databases, and used primarily for
reporting and data mining purposes.
4. The relational database revolution in the early
1980s ushered in an era of improved access to
the valuable information contained deep within
data. Still, improvements were needed.
It was soon discovered that databases modeled
to be efficient at transactional processing were
not always optimized for complex reporting or
analytical needs.
5. Inmon champions the large centralized Data Warehouse approach
leveraging solid relational design principles. His Corporate
Information Factory remains an example of this "top down"
philosophy.
Kimball, on the other hand, favors the development of individual
data marts at the departmental level that get integrated together
using the Information Bus architecture. This "bottom up" approach
dovetails nicely with Kimball's preference for star-schema modeling.
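As a rough illustration of the star-schema modeling mentioned above, the sketch below builds one fact table surrounded by two dimension tables in an in-memory SQLite database. The table names, column names, and sample rows (`fact_sales`, `dim_date`, `dim_product`) are hypothetical and chosen only for illustration; they do not come from Kimball's or Inmon's material.

```python
import sqlite3

# Hypothetical star schema: a central fact table keyed to dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount      REAL
);
INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 20.0), (2, 1, 50.0);
""")


def sales_by_category(connection):
    """Aggregate the fact table by an attribute of one dimension."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


result = sales_by_category(conn)
print(result)
```

A reporting query in this layout typically joins the fact table to one or more dimensions and aggregates a measure, which is the access pattern a departmental data mart is meant to serve.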
6. Many of the current changes in today's data industry also affect Data
Warehousing. Cloud storage and high-velocity, real-time data analysis
are two obvious factors playing a role in the practice's evolution. On
the end-user side, web-based and mobile access to decision support or
reporting data is a major requirement on many projects. Advances in
the practice of ontology have enhanced the capabilities of ETL systems
to parse information out of unstructured as well as structured data
sources.
7. Subject-oriented
The data in the database is organized so that all the data elements
relating to the same real-world event or object are linked together.
Time-variant
The changes to the data in the database are tracked and recorded
so that reports can be produced showing changes over time.
8. Non-volatile
Data in the database is never over-written or deleted. Once
committed, the data is static, read-only, but retained for future
reporting.
Integrated
The database contains data from most or all of an organization's
operational applications, and this data is made consistent.
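The characteristics above can be sketched in a few lines of Python. This is a minimal illustration under assumed names, not a real warehouse loader: the two source systems, their field names, the status codes, and the `load` and `history` helpers are all hypothetical. Rows from both sources are mapped to one consistent encoding (integrated), stamped with a load date (time-variant), and only ever appended (non-volatile).

```python
# Hypothetical extracts from two operational systems with inconsistent encodings.
crm_rows = [{"cust": "C1", "status": "A"}]
billing_rows = [{"customer_id": "C1", "state": "closed"}]

# Assumed mapping from each source's status codes to one warehouse vocabulary.
STATUS_MAP = {"A": "active", "I": "closed", "active": "active", "closed": "closed"}

warehouse = []  # append-only list standing in for a warehouse table


def load(rows, id_field, status_field, source, as_of):
    """Append normalized rows; existing rows are never updated or deleted."""
    for row in rows:
        warehouse.append({
            "customer_id": row[id_field],
            "status": STATUS_MAP[row[status_field]],  # integrated: one encoding
            "source": source,
            "as_of": as_of,                           # time-variant: load date kept
        })


def history(customer_id):
    """Return the recorded status of one customer over time, oldest first."""
    rows = [r for r in warehouse if r["customer_id"] == customer_id]
    return [(r["as_of"], r["status"]) for r in sorted(rows, key=lambda r: r["as_of"])]


load(crm_rows, "cust", "status", "crm", "2024-01-01")
load(billing_rows, "customer_id", "state", "billing", "2024-02-01")
print(history("C1"))
```

Because the second load appends a row instead of overwriting the first, a report can show that the customer's status changed between the two load dates.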
9. The processing load of reporting degraded the
response time of the operational systems.
The database designs of operational systems
were not optimized for information analysis and
reporting.
10. Most organizations had more than one
operational system, so company-wide reporting
could not be supported from a single system.
Development of reports in operational systems
often required writing specific computer
programs, which was slow and expensive.
11. Consolidation of data from a wide variety of data
sources.
Ability to analyze data beyond the level of
standard monitoring reports.
Operational response time unaffected.