36. Market Basket Analysis • Consider a shopping cart filled with several items • Market basket analysis tries to answer the following questions: – Who makes purchases? – What do customers buy together? – In what order do customers purchase items? • Given a database of customer transactions, where each transaction is a set of items, deduce association rules.
37. Examples of Market Basket Analysis • Co-occurrences – 80% of all customers purchase items a, b, and c together. • Association Rules – 60% of all customers who purchase X and Y also buy Z. • Sequential Patterns – 60% of customers who first buy X also purchase Y within two weeks. Confidence and Support • We prune the set of all possible association rules using two measures of interest: – Confidence of a rule: X -> Y has confidence c if P(Y|X) = c. – Support of a rule: X -> Y has support s if P(XY) = s; this is also the support of the itemset XY.
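The two measures can be computed directly from a transaction database. The following is a minimal sketch, not part of the original slides: the function names `support` and `confidence` and the five sample baskets are made up for illustration, and probabilities are estimated as simple relative frequencies.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`, i.e. P(XY)."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimate P(consequent | antecedent) as support(X and Y) / support(X)."""
    antecedent = frozenset(antecedent)
    both = support(transactions, antecedent | frozenset(consequent))
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    return both / base


# Hypothetical sample data: five shopping baskets.
baskets = [frozenset(b) for b in (
    {"X", "Y", "Z"},
    {"X", "Y", "Z"},
    {"X", "Y"},
    {"X", "Z"},
    {"Y"},
)]

rule_support = support(baskets, {"X", "Y", "Z"})          # 2 of 5 baskets
rule_confidence = confidence(baskets, {"X", "Y"}, {"Z"})  # 2 of the 3 baskets with X and Y
```

In this sample, the rule {X, Y} -> Z holds in 2 of the 3 baskets that contain both X and Y, so a minimum-confidence threshold of 0.7 would prune it while a threshold of 0.6 would keep it.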
38. Applications • Direct Marketing • Fraud Detection for Medical Insurance • Floor/Shelf Planning • Web Site Layout • Cross-selling
39. Frequent Itemsets Applications – Classification – Seeds for constructing Bayesian networks – Web log analysis – Collaborative filtering. Association Rules Approaches • Problem Reduction • Breadth-First Search • Depth-First Search
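As an illustration of the breadth-first approach, the sketch below searches for frequent itemsets level by level: all frequent 1-itemsets first, then 2-itemsets, and so on. It is a simplified search in the spirit of the Apriori algorithm, which is an assumption on my part since the slides do not name a specific algorithm. The function name `frequent_itemsets` and the sample baskets are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def frequent_itemsets(transactions: List[FrozenSet[str]],
                      min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least `min_support`."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # level 1 candidates
    result: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        # Count each candidate of size k and keep the frequent ones.
        frequent = {}
        for candidate in current:
            count = sum(1 for t in transactions if candidate <= t)
            if count / n >= min_support:
                frequent[candidate] = count / n
        result.update(frequent)

        # Build candidates of size k + 1 by joining frequent k-itemsets.
        # A candidate survives only if all of its k-subsets are frequent.
        k += 1
        next_level = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                next_level.add(union)
        current = list(next_level)
    return result


# Hypothetical sample data: four baskets.
baskets = [frozenset(b) for b in ({"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"})]
found = frequent_itemsets(baskets, min_support=0.5)
```

With these baskets and a minimum support of 0.5, the candidate {a, b, c} is never counted because its subset {b, c} is infrequent. That pruning step is what keeps a level-wise search tractable.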
40. Environment • Interoperability: key components (client/network/server) work together. • Scalability: any of the key elements may be replaced when the need to either grow or reduce processing for that element dictates, without major impact on the other elements. • Adaptability: new technology (multimedia, broadband networks, distributed databases, etc.) may be incorporated into the system. • Affordability: using the less expensive MISs available on each platform ensures cost effectiveness. • Data Integrity: entity, domain and referential integrity are maintained on the database server. • Accessibility: data may be accessed from WANs and multiple client applications. • Performance: performance may be optimized by hardware and process. • Security: data security is centralized on the server.
41. Data Warehouse • One or more tools to extract fields from any kind of data structure (flat, hierarchical, relational, or object), including external data. • The synthesis of the data into a nonvolatile, integrated, subject-oriented database with a metadata “catalog.” • All AI applications on Data Warehouse
54. What Does Data Mining Do? • Explores Your Data • Finds Patterns • Performs Predictions
55. What Does Data Mining Do? Illustrated [Diagram: Training Data (DB data, Client data, Application data) goes into the DM Engine, which produces a Mining Model. Data To Predict (DB data, Client data, Application data, or “Just one row”) goes into the DM Engine together with the Mining Model, which produces Predicted Data.]
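56. Server Mining Architecture [Diagram: an Analysis Services Server hosts the Mining Model and the Data Mining Algorithm and reads from a Data Source. Your Application, with its App Data, connects through OLE DB / ADOMD / XMLA. Models are deployed from BI Dev Studio (Visual Studio).]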
57. Data Mining Process (CRISP-DM) • Business Understanding • Data Understanding • Data Preparation • Modeling • Evaluation • Deployment [Diagram labels: “Putting Data Mining to Work”, “Doing Data Mining”, Data. Source: www.crisp-dm.org]
58. Data Mining Process in SQL (CRISP-DM) [Diagram: the CRISP-DM phases (Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment) around Data, annotated with SQL Server tools: SSAS (Data Mining), SSAS (OLAP), DSV, SSIS, SSRS, Flexible APIs. Source: www.crisp-dm.org]
59. What Do Data Mining Applications Do? • Explores Your Data • Finds Patterns • Performs Predictions [Diagram labels: Automatic Mining, Pattern Exploration, Perform Predictions]
60. Algorithm Training [Flowchart: the Case Processor (generates and prepares all training cases) feeds the Algorithm Module. StartCases -> Process One Case -> Converged/complete? If No, loop back. If Yes, Done! -> Persist patterns.]
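The training flow on this slide can be sketched as a loop over prepared cases with a convergence check after each pass. This is only an illustration under stated assumptions: the learner is a toy stand-in that estimates a single number, the names `case_processor`, `train` and `persist` are hypothetical, and the slide does not say where the "No" branch re-enters the loop (here it restarts the cases).

```python
import json
from typing import Dict, Iterator, List


def case_processor(raw_rows: List[str]) -> Iterator[float]:
    """Generate and prepare training cases (here: skip blanks, cast to float)."""
    for row in raw_rows:
        if row.strip():
            yield float(row)


def train(raw_rows: List[str], learning_rate: float = 0.1,
          tolerance: float = 1e-6, max_passes: int = 1000) -> Dict[str, float]:
    """Toy stand-in for an algorithm module: nudge an estimate toward each case."""
    estimate = 0.0
    passes = 0
    for passes in range(1, max_passes + 1):
        previous = estimate
        for case in case_processor(raw_rows):               # StartCases
            estimate += learning_rate * (case - estimate)   # Process One Case
        if abs(estimate - previous) < tolerance:            # Converged/complete?
            break                                           # Yes: Done!
    return {"estimate": estimate, "passes": passes}


def persist(patterns: Dict[str, float]) -> str:
    """Persist patterns (here: serialize to a JSON string)."""
    return json.dumps(patterns)


# Hypothetical raw input, including a blank row the case processor drops.
patterns = train(["10", "", "10", "10"])
stored = persist(patterns)
```

Keeping case preparation in a generator separate from the learner mirrors the split between the Case Processor and the Algorithm Module in the flowchart.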
61. DM Data Flow [Diagram labels: Historical Dataset, New Dataset, Data Transform (DTS), Cube, Mining Models, Model Browsing, Prediction, Reporting, LOB Application]
62. Prediction [Pipeline diagram: Parser -> Validation-I & Initialization -> (AST) -> Binding & Validation-II -> (DMX tree) -> Execution Planning -> (DMX tree) -> Read / Evaluate one row (from Input data) -> Push response -> Untokenize results. Example: the input row Income = $50,000, Gender = F is converted to numeric tokens, evaluated, and untokenized to Income = $50,000, Gender = F, Plan = Attend.]
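The last stages of this pipeline (read one row, evaluate it, untokenize the result) can be illustrated with a small round trip. Everything in the sketch is an assumption made for illustration: the ID tables, the `tokenize`, `evaluate` and `untokenize` helpers, the extra "Skip" state, and the threshold rule standing in for a trained mining model. It does not reproduce the actual token layout shown on the slide.

```python
from typing import Dict, Union

# Hypothetical dictionaries mapping attribute names and states to integer IDs.
ATTRIBUTE_IDS = {"Income": 1, "Gender": 2, "Plan": 3}
STATE_IDS = {"Gender": {"M": 1, "F": 2}, "Plan": {"Attend": 1, "Skip": 2}}

Tokens = Dict[int, Union[int, float]]


def tokenize(row: Dict[str, str]) -> Tokens:
    """Replace names and categorical states with IDs; parse numbers."""
    tokens: Tokens = {}
    for name, value in row.items():
        attr = ATTRIBUTE_IDS[name]
        if name in STATE_IDS:
            tokens[attr] = STATE_IDS[name][value]
        else:
            tokens[attr] = float(value.replace("$", "").replace(",", ""))
    return tokens


def evaluate(tokens: Tokens) -> Tokens:
    """Stand-in for the mining model: a made-up rule that predicts Plan."""
    attends = tokens[1] >= 40000 and tokens[2] == 2
    return {**tokens, 3: 1 if attends else 2}


def untokenize(tokens: Tokens, original: Dict[str, str]) -> Dict[str, str]:
    """Map predicted IDs back to readable names, keeping input columns as given."""
    names = {attr: name for name, attr in ATTRIBUTE_IDS.items()}
    result = dict(original)
    for attr, value in tokens.items():
        name = names[attr]
        if name not in original:  # only predicted columns need decoding
            states = {sid: state for state, sid in STATE_IDS[name].items()}
            result[name] = states[value]
    return result


row = {"Income": "$50,000", "Gender": "F"}
prediction = untokenize(evaluate(tokenize(row)), row)
```

Working on integer IDs inside the engine and translating back only at the end keeps evaluation independent of how values are spelled in the input.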