CS 361A (Advanced Data Structures and Algorithms)
Lecture 20 (Dec 7, 2005)
Data Mining: Association Rules
Rajeev Motwani (partially based on notes by Jeff Ullman)
Association Rules Overview
Association Rules
Market-Basket Model
Example
Application 1 (Retail Stores)
Application 2 (Information Retrieval)
Application 3 (Web Search)
Scale of Problem
Association Rules
Example
Finding Association Rules
Computation Model
Finding Frequent Pairs
Monotonicity Property
A-Priori Algorithm
Memory Usage – A-Priori
[Figure: main-memory layout. Pass 1: candidate items. Pass 2: frequent items, candidate pairs.]
PCY Idea
Memory Usage – PCY
[Figure: main-memory layout. Pass 1: candidate items, hash table. Pass 2: frequent items, bitmap, candidate pairs.]
PCY Algorithm
Multistage PCY Algorithm
Memory Usage – Multistage PCY
[Figure: main-memory layout per pass. Labels: Candidate Items, Hash Table 1, Hash Table 2, Frequent Items, Bitmap, Bitmap 1, Bitmap 2, Candidate Pairs.]
Finding Larger Itemsets
Approximation Techniques
Sampling Algorithm
SON Algorithm
Toivonen’s Algorithm
Low-Support, High-Correlation
Matrix Representation
Column Similarity
Example:
  C_i  C_j
   0    1
   1    0
   1    1
   0    0
   1    1
   0    1
sim(C_i, C_j) = 2/5 = 0.4
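The value 2/5 in the example above is consistent with reading sim as the Jaccard measure: rows where both columns hold a 1, divided by rows where at least one does. The slide's own definition did not survive extraction, so the following is only a minimal sketch under that assumption; the function name `column_similarity` is hypothetical.

```python
def column_similarity(ci, cj):
    """Jaccard similarity of two equal-length 0/1 columns (assumed definition)."""
    both = sum(1 for a, b in zip(ci, cj) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(ci, cj) if a == 1 or b == 1)
    # Two all-zero columns have no rows to compare; report 0.0 by convention.
    return both / either if either else 0.0


# The two columns from the example, read top to bottom.
C_I = [0, 1, 1, 0, 1, 0]
C_J = [1, 0, 1, 0, 1, 1]

print(column_similarity(C_I, C_J))  # 0.4
```

Rows where both columns are 0 are ignored on purpose, so very sparse columns are not reported as similar merely because they share many empty rows.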
Identifying Similar Columns?
Key Observation
Min Hashing
Min-Hash Signatures
Example
Input matrix:
       C1  C2  C3
  R1    1   0   1
  R2    0   1   1
  R3    1   0   0
  R4    1   0   1
  R5    0   1   0
Signatures:
                      S1  S2  S3
  Perm 1 = (12345)     1   2   1
  Perm 2 = (54321)     4   5   4
  Perm 3 = (34512)     3   5   4
Similarities:
            1-2   1-3   2-3
  Col-Col  0.00  0.50  0.25
  Sig-Sig  0.00  0.67  0.00
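The numbers in this example can be reproduced with a short sketch. It assumes that each permutation lists row numbers in the order they are scanned, and that a column's min-hash value is the first row in that order holding a 1; this reading matches every signature entry shown, but the slide text defining it was lost. All function names are hypothetical.

```python
from itertools import combinations

# Columns of the example matrix, rows R1..R5 top to bottom.
MATRIX = {
    "C1": [1, 0, 1, 1, 0],
    "C2": [0, 1, 0, 0, 1],
    "C3": [1, 1, 0, 1, 0],
}
# Each permutation gives the order in which rows are scanned (assumed reading).
PERMS = [(1, 2, 3, 4, 5), (5, 4, 3, 2, 1), (3, 4, 5, 1, 2)]


def min_hash(column, perm):
    """First row number, in permuted order, at which the column has a 1."""
    for row in perm:
        if column[row - 1] == 1:
            return row
    return None  # all-zero column


def signature(column):
    return [min_hash(column, perm) for perm in PERMS]


def jaccard(a, b):
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def signature_agreement(s, t):
    """Fraction of permutations on which two signatures agree."""
    return sum(x == y for x, y in zip(s, t)) / len(s)


SIGS = {name: signature(col) for name, col in MATRIX.items()}
for a, b in combinations(MATRIX, 2):
    print(a, b,
          round(jaccard(MATRIX[a], MATRIX[b]), 2),
          round(signature_agreement(SIGS[a], SIGS[b]), 2))
```

With only three permutations the signature estimate is coarse (0.67 against a true 0.50, and 0.00 against 0.25), which is why practical signatures use many more permutations.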
Implementation Trick
Example
       C1  C2
  R1    1   0
  R2    0   1
  R3    1   1
  R4    1   0
  R5    0   1

h(x) = x mod 5
g(x) = 2x+1 mod 5

  Hash value   C1 slots   C2 slots
  h(1) = 1        1          -
  g(1) = 3        3          -
  h(2) = 2        1          2
  g(2) = 0        3          0
  h(3) = 3        1          2
  g(3) = 2        2          0
  h(4) = 4        1          2
  g(4) = 4        2          0
  h(5) = 0        1          0
  g(5) = 1        2          0
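The trace above can be reproduced with one pass over the rows, keeping one slot per (column, hash function) pair and lowering it whenever a row with a 1 hashes to a smaller value. This is a sketch inferred from the trace alone, since the slide describing the trick did not survive; `min_hash_slots` is a hypothetical name, and g is read as (2x+1) mod 5, which matches g(1) = 3 and g(2) = 0.

```python
# (row number, C1 bit, C2 bit) for rows R1..R5 of the example.
ROWS = [(1, 1, 0), (2, 0, 1), (3, 1, 1), (4, 1, 0), (5, 0, 1)]

# h(x) = x mod 5 and g(x) = (2x + 1) mod 5, as in the example.
HASHES = [lambda x: x % 5, lambda x: (2 * x + 1) % 5]


def min_hash_slots(rows, hashes, n_cols):
    """Return, per column, the minimum hash value seen over rows holding a 1."""
    inf = float("inf")  # plays the role of the "-" (empty) slot in the trace
    slots = [[inf] * len(hashes) for _ in range(n_cols)]
    for row_number, *bits in rows:
        values = [h(row_number) for h in hashes]  # hash each row once
        for col, bit in enumerate(bits):
            if bit:
                for k, value in enumerate(values):
                    slots[col][k] = min(slots[col][k], value)
    return slots


print(min_hash_slots(ROWS, HASHES, 2))  # [[1, 2], [0, 0]]
```

Hashing row numbers stands in for an explicit permutation, so the rows never have to be physically reordered and a single sequential scan of the data suffices.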
Comparing Signatures
Locality-Sensitive Hashing
[Figure labels: Bands, H3]
Example
Band-Hash Analysis
LSH Summary
Densifying – Amplification of 1’s
Example
  0 0 1 1 0 0 1 0
  0 1 0 1
  1 1
  1
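One reading of the 15 digits in this example is an 8-row column followed by its successive halvings, where each new entry is the OR of two adjacent entries. That reading is inferred from the numbers themselves (8 + 4 + 2 + 1 = 15, and every level checks out), not from surviving slide text, so treat the sketch below as an assumption. The names `fold_by_or` and `densify` are hypothetical, and the column length is assumed to be a power of two.

```python
def fold_by_or(column):
    """Halve a column by OR-ing each adjacent pair of entries."""
    return [column[2 * i] | column[2 * i + 1] for i in range(len(column) // 2)]


def densify(column):
    """Return the column followed by its successive OR-halvings."""
    levels = [column]
    while len(levels[-1]) > 1:
        levels.append(fold_by_or(levels[-1]))
    return levels


for level in densify([0, 0, 1, 1, 0, 0, 1, 0]):
    print(level)
```

Each halving raises the density of 1's, which matches the slide title "Amplification of 1's": a sparse column becomes progressively denser as it shrinks.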
Using Hamming LSH
Summary
References

