1. DATA WAREHOUSING AND DATA MINING
   M. Mageshwari, Lecturer, M.S.P.V.L Polytechnic College
2.
3.
4. A producer wants to know...
   Which are our lowest/highest margin customers?
   Who are my customers and what products are they buying?
   Which customers are most likely to go to the competition?
   What impact will new products/services have on revenue and margins?
   What product promotions have the biggest impact on revenue?
   What is the most effective distribution channel?
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20. Explorers, Farmers and Tourists
    Explorers: Seek out the unknown and previously unsuspected rewards hiding in the detailed data.
    Farmers: Harvest information from known access paths.
    Tourists: Browse information harvested by farmers.
21. Application-Orientation vs. Subject-Orientation
    Application-orientation (operational database): Loans, Credit Card, Trust, Savings
    Subject-orientation (data warehouse): Customer, Vendor, Product, Activity
22. Functioning of Data Warehousing
    Data source -> Cleaning -> Transformation -> Data warehouse
    Diagram labels: New, Update
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33. OLAP
    Data warehouse <-> OLAP server <-> Front-end tool <-> User
    Flow labels: Request, SQL, Result set, Result
34.
35.
36.
37. Data Warehouse Architecture
    Sources: Relational Databases, Legacy Data, Purchased Data, ERP Systems
    Loading: Extraction, Cleansing, Optimized Loader
    Core: Data Warehouse Engine, Metadata Repository
    Access: Analyze, Query
38. Architecture of Data Warehousing
    Data stores: External data, Warehouse data
    Components: Data Acquisition, Data Manager, Data Dictionary, Information Directory, Middleware, Design, Management, Data Access
51. Difference between a data warehouse and a data mart
    Data warehouse: Suitable for supporting an entire corporate environment.
    Data mart: Useful for small organizations with very few departments.

    Data warehouse: Supports the entire information requirement of an organization.
    Data mart: Supports the information requirement of one department in an organization.

    Data warehouse: Has a large data model, a wider implementation, large data volumes and more users.
    Data mart: Has a small data model, a shorter implementation, less data and fewer users.

    Note: If you listen to some vendors, you may be left thinking that building data warehouses is a waste of time. Data mart vendors who tell you this are looking out for their own best interests.
57. Difference between a data warehouse and views
    Data warehouse: Stores data permanently.
    Views: Are created from warehouse data when needed and are not permanent.

    Data warehouse: Is multidimensional.
    Views: Are relational.

    Data warehouse: Can be indexed to maximize performance.
    Views: Cannot be indexed.

    Data warehouse: Provides specific support for a functionality.
    Views: Cannot give specific support for a functionality.

    Data warehouse: Provides a large amount of data.
    Views: Are created by extracting a minimal amount of data from the data warehouse.
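    The contrast between a stored, indexable warehouse table and a view that is derived on demand can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table, index and view names (sales_fact, idx_sales_region, north_sales) and the sample rows are hypothetical, and the behaviour shown is SQLite's. Some other database systems offer materialized or indexed views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table: stored permanently, can carry an index.
    CREATE TABLE sales_fact (region TEXT, product TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        ('North', 'Loans', 120.0),
        ('North', 'Savings', 80.0),
        ('South', 'Loans', 50.0);
    CREATE INDEX idx_sales_region ON sales_fact (region);

    -- The view stores only its query; rows are derived each time it is read.
    CREATE VIEW north_sales AS
        SELECT product, amount FROM sales_fact WHERE region = 'North';
""")

view_rows = conn.execute(
    "SELECT product, amount FROM north_sales ORDER BY product"
).fetchall()

# A new warehouse row is visible through the view without refreshing anything.
conn.execute("INSERT INTO sales_fact VALUES ('North', 'Trust', 10.0)")
view_count_after = conn.execute("SELECT COUNT(*) FROM north_sales").fetchone()[0]

# SQLite refuses to build an index on a view.
try:
    conn.execute("CREATE INDEX idx_view ON north_sales (product)")
    view_indexable = True
except sqlite3.Error:
    view_indexable = False
```

    Here the view holds no data of its own, which is why it picks up the new row immediately and why it has nothing to index.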
63. Application Areas
    Industry: Application
    Finance: Credit card analysis
    Insurance: Claims, fraud analysis
    Telecommunication: Call record analysis
    Consumer goods: Promotion analysis
    Data service providers: Value-added data
    Utilities: Power usage analysis
70. Data Integration Across Sources
    Sources: Trust, Credit card, Savings, Loans
    Integration problems:
    Same data, different name
    Different data, same name
    Data found here, nowhere else
    Different keys, same data
71. Data Transformation Example
    Encoding: appl A - m,f; appl B - 1,0; appl C - x,y; appl D - male, female
    Unit: appl A - pipeline - cm; appl B - pipeline - in; appl C - pipeline - feet; appl D - pipeline - yds
    Field: appl A - balance; appl B - bal; appl C - currbal; appl D - balcurr
    All four applications feed one consistent representation in the Data Warehouse.
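    A minimal Python sketch of this kind of transformation follows. The application names, field names, units and encodings come from the slide. Everything else is an assumption made for illustration: the function name to_warehouse, the target fields (balance, pipeline_cm, gender), the choice of centimetres as the common unit, and the guess that 1 and x stand for male while 0 and y stand for female.

```python
# Conversion factors to centimetres (the chosen warehouse unit is an assumption).
CM_PER_UNIT = {"cm": 1.0, "in": 2.54, "feet": 30.48, "yds": 91.44}

# Per-application rules taken from the slide.
BALANCE_FIELD = {"A": "balance", "B": "bal", "C": "currbal", "D": "balcurr"}
LENGTH_UNIT = {"A": "cm", "B": "in", "C": "feet", "D": "yds"}
GENDER_CODES = {
    "A": {"m": "M", "f": "F"},
    "B": {"1": "M", "0": "F"},        # assumed meaning of 1 and 0
    "C": {"x": "M", "y": "F"},        # assumed meaning of x and y
    "D": {"male": "M", "female": "F"},
}


def to_warehouse(app, record):
    """Map one source record to a single warehouse representation."""
    return {
        # Field: four different source names become one name.
        "balance": record[BALANCE_FIELD[app]],
        # Unit: every length is converted to centimetres.
        "pipeline_cm": round(record["pipeline"] * CM_PER_UNIT[LENGTH_UNIT[app]], 2),
        # Encoding: four different codings become M/F.
        "gender": GENDER_CODES[app][str(record["gender"]).lower()],
    }


rows = [
    to_warehouse("A", {"balance": 100.0, "pipeline": 250, "gender": "m"}),
    to_warehouse("B", {"bal": 75.5, "pipeline": 10, "gender": 1}),
    to_warehouse("D", {"balcurr": 20.0, "pipeline": 2, "gender": "female"}),
]
```

    After the call, every row has the same three keys regardless of which application produced it.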
74. From the Data Warehouse to Data Marts
    Levels: Organizationally structured (Data Warehouse), Departmentally structured, Individually structured
    Diagram labels: Less, More, History, Normalized, Detailed, Data, Information
75. Data Warehouse and Data Marts
    Data mart (OLAP): lightly summarized, departmentally structured
    Data warehouse: atomic, detailed data, organizationally structured
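    The three tiers can be sketched in a few lines of Python with the built-in sqlite3 module standing in for the RDBMS. This is a toy illustration, not how any particular ROLAP product works: build_rollup_sql, the sales_fact table and the sample rows are all hypothetical.

```python
import sqlite3


def build_rollup_sql(dimensions, measure, table="sales_fact"):
    """Hypothetical ROLAP-engine step: turn a report request into SQL.

    Identifiers are interpolated directly, so they must come from trusted
    metadata, never from end-user input.
    """
    dims = ", ".join(dimensions)
    return (
        f"SELECT {dims}, SUM({measure}) AS total "
        f"FROM {table} GROUP BY {dims} ORDER BY {dims}"
    )


# Database layer: atomic data in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("North", "Loans", 120.0),
        ("North", "Loans", 30.0),
        ("North", "Savings", 80.0),
        ("South", "Loans", 50.0),
    ],
)

# Application logic layer: the engine generates SQL for the requested roll-up.
sql = build_rollup_sql(["region"], "amount")

# Presentation layer: the client receives the aggregated result set.
report = conn.execute(sql).fetchall()
```

    Nothing is pre-computed here: each request is translated into SQL and aggregated by the database at query time.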
95. MD-OLAP: 2 Tier DSS
    MDDB Engine (database layer and application logic layer): Store atomic data in a proprietary data structure (MDDB), pre-calculate as many outcomes as possible, and obtain OLAP functionality via proprietary algorithms running against this data.
    Decision Support Client (presentation layer): Obtain multi-dimensional reports from the DSS client.
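    The idea of pre-calculating outcomes can be sketched in Python with a plain dictionary playing the role of the multi-dimensional store. Real MDDB engines use proprietary structures and algorithms that the slide does not describe, so this is only an assumption-laden illustration: build_cube, the dimension names and the sample facts are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("region", "product")

facts = [
    {"region": "North", "product": "Loans", "amount": 120.0},
    {"region": "North", "product": "Savings", "amount": 80.0},
    {"region": "South", "product": "Loans", "amount": 50.0},
]


def build_cube(rows, dimensions, measure):
    """Pre-compute the sum of `measure` for every combination of dimensions."""
    cube = defaultdict(float)
    for size in range(len(dimensions) + 1):
        for group in combinations(dimensions, size):
            for row in rows:
                # The key records which dimensions are fixed and to what value.
                key = tuple((dim, row[dim]) for dim in group)
                cube[key] += row[measure]
    return dict(cube)


cube = build_cube(facts, DIMENSIONS, "amount")

# Queries are now dictionary lookups, with no aggregation at request time.
grand_total = cube[()]
north_total = cube[(("region", "North"),)]
loans_total = cube[(("product", "Loans"),)]
```

    The trade-off against the relational approach is visible even at this scale: lookups are immediate, but the number of stored cells grows with every added dimension.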