Apriori algorithm
Jung Hoon Kim
N5, Room 2239
E-mail: junghoon.kim@kaist.ac.kr

2014.01.07

KAIST Knowledge Service Engineering
Data Mining Lab.

Introduction
- Frequent pattern and association rule mining is one of the few data mining techniques that did not originate in the machine learning community
- Apriori algorithm
- AprioriTid algorithm
- AprioriAll algorithm
- FP-Tree algorithm

Notation


Principle
- Downward closure property
  - If an itemset is frequent, then all of its subsets must also be frequent.
  - If an itemset is not frequent, then none of its supersets can be frequent.
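
The property can be checked on a small example. The sketch below is illustrative only: the toy transactions, the threshold, and the helper names `support` and `all_subsets_frequent` are assumptions and do not come from the slides.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def all_subsets_frequent(itemset, transactions, min_support):
    """Check that every non-empty proper subset meets the threshold."""
    return all(
        support(sub, transactions) >= min_support
        for k in range(1, len(itemset))
        for sub in combinations(itemset, k)
    )

min_support = 3  # assumed absolute support threshold

# {bread, milk} is frequent, so {bread} and {milk} must be frequent too.
pair_is_frequent = support({"bread", "milk"}, transactions) >= min_support
subsets_are_frequent = all_subsets_frequent({"bread", "milk"}, transactions, min_support)

# {diaper, beer} is infrequent, so a superset of it cannot be frequent.
infrequent_pair = support({"diaper", "beer"}, transactions)
its_superset = support({"bread", "diaper", "beer"}, transactions)
```

Adding an item to an itemset can only shrink the set of transactions that contain it, which is why `its_superset` can never exceed `infrequent_pair`.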

Apriori algorithm
- Pseudo code
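
The pseudo code on this slide did not survive extraction. As a stand-in, here is a minimal sketch of the level-wise Apriori loop (join, prune, count). It is one common formulation rather than the slide's original listing; the function name `apriori`, the absolute support threshold, and the toy data are assumptions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical toy data, invented for illustration.
toy = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]
result = apriori(toy, min_support=3)
```

The sketch stores candidates in plain Python sets to stay short; it rescans the full transaction list once per level, which is the cost that the later slides set out to reduce.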

Example

Discussion
- Too many database scans make the computation expensive.
- minsup and minconf need to be specified in advance.
- Use a hash tree to store the candidate itemsets.
  Some implementations use a trie structure instead.
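
To make the role of the two thresholds concrete, the sketch below tests one rule against minsup and minconf, using the standard definition conf(X -> Y) = support(X union Y) / support(X). The data, the threshold values, and the helper names are assumptions chosen for illustration.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= set(t))

def confidence(antecedent, consequent, transactions):
    """conf(X -> Y) = support(X union Y) / support(X)."""
    both = set(antecedent) | set(consequent)
    denom = support_count(antecedent, transactions)
    return support_count(both, transactions) / denom if denom else 0.0

# Hypothetical toy data and thresholds, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]
minsup = 3     # absolute support count
minconf = 0.7  # minimum confidence

rule_support = support_count({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_accepted = rule_support >= minsup and rule_confidence >= minconf
```

Both values must be fixed before mining starts, and changing either one changes which rules are reported.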

AprioriTid


FP-Growth
- To avoid scanning the database multiple times
  - The cost of database scans is too high.
- To avoid generating lots of candidates
  - In the Apriori algorithm, the bottleneck is candidate generation.
- How can we solve these problems?

FP-Growth
- The algorithm is simple
1. Scan the database once and find the frequent 1-itemsets (single-item patterns).
2. Sort the frequent items in descending order of frequency to obtain the F-list (F-list = f-c-a-b-m-p).
3. Scan the database again and construct the FP-tree.
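
The three steps above can be sketched as follows. The transaction table from the slides is not preserved, so the data below is an assumption: it is the five-transaction example commonly used in the FP-growth literature, which yields the F-list quoted in step 2 at a minimum support of 3. The names `FPNode` and `build_fp_tree` are hypothetical, and ties in frequency are resolved by taking the slide's order as given.

```python
from collections import Counter

class FPNode:
    """One node of the FP-tree: an item, its count, and links to parent and children."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, f_list):
    """Step 3: insert each transaction, ordered by the F-list, into a prefix tree."""
    rank = {item: i for i, item in enumerate(f_list)}
    root = FPNode(None)
    header = {item: [] for item in f_list}  # item -> all tree nodes carrying it
    for t in transactions:
        # Keep only frequent items and put them in F-list order.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header

# Assumed data: each string is one transaction, each character one item.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
min_support = 3  # assumed threshold

# Step 1: first scan, count single items and keep the frequent ones.
counts = Counter(item for t in transactions for item in set(t))
frequent = {i: c for i, c in counts.items() if c >= min_support}

# Step 2: F-list in descending frequency; this order is taken from the slide.
f_list = list("fcabmp")

# Step 3: second scan, build the tree.
root, header = build_fp_tree(transactions, f_list)
```

Transactions that share a prefix in F-list order share a path in the tree, so the tree is usually much smaller than the database it summarizes.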
FP-Growth Algorithm

FP-Tree
- Scanning the transaction with TID=100

FP-Tree
- Scanning the transaction with TID=200

FP-Tree
- Final FP-Tree

Mining an FP-Tree
I. Forming conditional pattern bases
II. Constructing conditional FP-trees
III. Recursively mining conditional FP-trees

Conditional pattern base
- The set of prefix paths in the FP-tree that co-occur with a frequent item, which is treated as the suffix pattern
  - For example
    - m : <f, c, a> : support 2
    - m : <f, c, a, b> : support 1
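
The example for m can be reproduced with a short sketch. To stay self-contained it reads the prefixes directly from the F-list-ordered transactions, which gives the same prefix paths the FP-tree would store for m. The transactions are an assumption (the five-transaction example commonly used in the FP-growth literature, with minimum support 3), and `conditional_pattern_base` is a hypothetical helper name.

```python
from collections import Counter

def conditional_pattern_base(transactions, f_list, item):
    """Count the F-list-ordered prefixes that precede `item` in each transaction."""
    rank = {x: i for i, x in enumerate(f_list)}
    base = Counter()
    for t in transactions:
        ordered = sorted((x for x in set(t) if x in rank), key=rank.get)
        if item in ordered:
            prefix = tuple(ordered[:ordered.index(item)])
            if prefix:
                base[prefix] += 1
    return base

# Assumed data: each string is one transaction, each character one item.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
f_list = list("fcabmp")  # F-list order as given on the earlier slide
min_support = 3          # assumed threshold

m_base = conditional_pattern_base(transactions, f_list, "m")

# Items that stay frequent inside m's conditional pattern base;
# these are the items that appear in m's conditional tree.
weighted = Counter()
for prefix, count in m_base.items():
    for x in prefix:
        weighted[x] += count
m_tree_items = {x: c for x, c in weighted.items() if c >= min_support}
```

Here b occurs only once alongside m, so it falls below the threshold and does not appear in m's conditional tree.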

Conditional pattern tree
- {m}'s conditional pattern tree

Pseudo Code
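
The pseudo code on this slide did not survive extraction. The sketch below is a simplified stand-in, not the slide's listing: it follows the same pattern-growth recursion (pick a suffix item, form its conditional pattern base, recurse), but keeps each conditional pattern base as a plain list of weighted prefixes instead of compressing it into a conditional FP-tree. The function name `fp_growth`, the tie-breaking rule, and the data are assumptions.

```python
from collections import Counter

def fp_growth(weighted, min_support, suffix=frozenset()):
    """Return {frozenset: support} for all frequent itemsets that extend `suffix`.

    `weighted` is a list of (items, weight) pairs: a database at the top level,
    a conditional pattern base in the recursive calls.
    """
    counts = Counter()
    for items, weight in weighted:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Descending frequency; ties broken alphabetically (an arbitrary choice).
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}

    patterns = {}
    for item in reversed(order):  # start from the least frequent item
        pattern = suffix | {item}
        patterns[pattern] = frequent[item]
        # Conditional pattern base: ordered prefixes that precede `item`.
        base = []
        for items, weight in weighted:
            kept = sorted((i for i in set(items) if i in rank), key=rank.get)
            if item in kept:
                prefix = kept[:kept.index(item)]
                if prefix:
                    base.append((prefix, weight))
        patterns.update(fp_growth(base, min_support, pattern))
    return patterns

# Assumed data: each string is one transaction, each character one item.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
result = fp_growth([(t, 1) for t in transactions], min_support=3)
```

No candidate itemsets are generated at any point: every reported pattern is grown from a suffix by items that are already known to be frequent in that suffix's conditional pattern base.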

Conclusion
- In data mining, association rules are useful for analyzing and predicting customer behavior. They play an important part in shopping basket data analysis, product clustering, catalog design, and store layout.

Thank you

