SlideShare une entreprise Scribd logo
1  sur  9
Télécharger pour lire hors ligne
the 37th International Conference on
Logic Programming
Sept. 20–27, 2021 (virtual event)
Extending High-Utility Pattern Mining with Facets and Advanced
Utility Functions (Extended Abstract)
Francesco Cauteruccio and Giorgio Terracina
DEMACS, University of Calabria, Italy
{cauteruccio, terracina}@mat.unical.it
Context and Motivation
• Pattern Mining is one of the most studied data mining branches:
• Find interesting patterns (set of items) in a database of transactions;
• Examples: Frequent pattern mining, sequential pattern mining, etc…
• High-Utility Pattern Mining (HUPM)
• Find patterns having a high-utility (w.r.t. some utility measure)
• Example: in a sales database, the utility of a pattern may be represented by the profit of items sold together.
• Basic assumption: each item is associated with one, static utility .
• However…
• The utility of an item can be defined from very different point of views,
• Transactions are not only flat lists of items but they can provide different level of abstractions.
Pattern Mining and High-Utility Pattern Mining
Context and Motivation
• We present a framework for HUPM extending basic notions and introducing:
• Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than a minimum
threshold 𝑡ℎ! and it occurs in at least 𝑡ℎ" transactions.
• The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended high-utility
patterns in a given database 𝐷.
An extended framework for HUPM
Transaction set representation
Facets
Advanced utility functions
For each transaction, different levels of
aggregation can be defined.
+
+
Attributes that can be associated with
any level of aggregation of the
framework.
A taxonomy of functions that can be
combined in several ways to fit
different notions of utility.
Proposed Framework
• A multi-layer database representation
• 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛
• Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"}
• 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"}
• 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$}
• 𝑂% contains a set of transactions {𝑇!, … , 𝑇&}
• The facets define the utility of an item 𝑖
• Each item 𝑖 can be associated with one or more facets.
• Facets can also be defined for transactions, objects and containers.
• Values of facets are represented by utility vectors
• Item utility vector: 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖
• Same applies for transactions, objects and containers.
Extending the database structure and introducing facets
Running example depicting a sales database
Item utility vector for the item 𝑖! containing
values for price and weight facets
Proposed Framework
• The advanced utility functions combine utility vectors in different ways:
1. Intra-pattern utility function
• Given a pattern and its occurrences, combine item utility vectors across all the occurrences.
2. Pattern utility function
• Combine the intra-pattern utility function with the facets from the transaction set representation.
• Generates a matrix 𝑈& (occurrences × facets).
3. Utility function 𝒖 𝑷
• Computes the utility of a pattern over 𝑈&.
• Several classifications for 𝑢(𝑃)
• Horizontal-first, vertical-first, inter-transaction utility, pattern-vs-object utility, etc…
Advanced utility functions
ASP Approach
The encoding
%%% Input schema:
%container(ContainerId)
%object(ObjectId,ContainerId)
%transaction(Tid, ObjectId)
%item(Item, Tid, Position, Q)
%itemUtilityVector(Item, I1, ..., Il)
%transactionUtilityVector(Tid, T1, ..., Tm)
%objectUtilityVector(ObjectId, O1, ..., On)
%containerUtilityVector(ContainerId, C1, ..., Co)
%%% Parameters
occurrencesThreshold(...). utilityTreshold(...).
%%% Item pre-filtering
usefulItem(I):- item(I,_,_,_),....any condition on the items.
%%% Candidate pattern generation
{inCandidatePattern(I)}:- usefulItem(I).
%%% Occurrences computation and check
inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid).
incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid).
contains(I,Tid):- item(I,Tid,_,_).
:- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho).
%%% Utility computation
patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item),
itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q).
intraPatternUtilityVector(Tid,I1,...,Il):-
&computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il).
occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):-
inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il), transactionUtilityVector(Tid, T1, ..., Tm),
transaction(Tid, ObjectId), objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId)
containerUtilityVector(ContainerId, C1, ..., Co).
:- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
• Classic guess-and-check approach
• We generate one answer set for each
valid pattern,
• The advanced utility functions are
executed by means of external
functions (e.g., in DLVHEX, WASP,
clingo)
• Pattern validity criteria and filters can
be applied by encoding them.
Experimental Evaluation
• Dataset: aspect-based sentiment analysis of
scientific reviews [1]
• Each sentence is annotated with different
aspects, each aspect as a value.
Dataset, quantitative and qualitative analysis
[1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of
the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. pp. 207–216 (2020), ACM
• Quantitative analysis: Running time, comparison with
HUPM systems (in a classical setting).
• Qualitative analysis:
• Mining patterns where the advanced utility function is the
Pearson correlation between the sentiment on a sentence
aspect 𝑋 and the final decision on the corresponding paper.
• Similar results have been obtained where the advanced
utility function is the Multiple correlation.
The use case scenario in the e-HUPM model
The use case scenario in the e-HUPM model
Conclusion
• We introduced a general framework for HUPM with several extensions.
• The framework allows to work with multi-dimensional data and with different utility measures.
• A versatile and modular ASP encoding has been developed.
• We employed a real use case on scientific reviews to carry out both quantitative and qualitative
analyses.
• Facets and advanced utility functions help reducing the amount of relevant patterns
• Useful in providing deep insights on the data.
• Not an ending point!
• Apply the framework to new contexts,
• Derive ad-hoc algorithms for particular classes of advanced utility functions.
Thanks for your attention!
These slides are available at https://bit.ly/ehupm-iclp21
Francesco Cauteruccio
Research Fellow @ DEMACS, University of Calabria
cauteruccio@mat.unical.it
francescocauteruccio.info
@finalfire

Contenu connexe

Tendances

Mining frequent patterns association
Mining frequent patterns associationMining frequent patterns association
Mining frequent patterns associationDeepaR42
 
Feature selection on boolean symbolic objects
Feature selection on boolean symbolic objectsFeature selection on boolean symbolic objects
Feature selection on boolean symbolic objectsijcsity
 
MS SQL SERVER: Microsoft naive bayes algorithm
MS SQL SERVER: Microsoft naive bayes algorithmMS SQL SERVER: Microsoft naive bayes algorithm
MS SQL SERVER: Microsoft naive bayes algorithmDataminingTools Inc
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methodsrajshreemuthiah
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsJustin Cletus
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...ranjit banshpal
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionIOSR Journals
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Modelsguest0edcaf
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slidestafosepsdfasg
 

Tendances (14)

Mining frequent patterns association
Mining frequent patterns associationMining frequent patterns association
Mining frequent patterns association
 
Feature selection on boolean symbolic objects
Feature selection on boolean symbolic objectsFeature selection on boolean symbolic objects
Feature selection on boolean symbolic objects
 
MS SQL SERVER: Microsoft naive bayes algorithm
MS SQL SERVER: Microsoft naive bayes algorithmMS SQL SERVER: Microsoft naive bayes algorithm
MS SQL SERVER: Microsoft naive bayes algorithm
 
2.mathematics for machine learning
2.mathematics for machine learning2.mathematics for machine learning
2.mathematics for machine learning
 
4 module 3 --
4 module 3 --4 module 3 --
4 module 3 --
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methods
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and Correlations
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature Selection
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Models
 
Matlab:Regression
Matlab:RegressionMatlab:Regression
Matlab:Regression
 
7 decision tree
7 decision tree7 decision tree
7 decision tree
 
2 introductory slides
2 introductory slides2 introductory slides
2 introductory slides
 
3 module 2
3 module 23 module 2
3 module 2
 

Similaire à Extending High-Utility Pattern Mining with Facets and Advanced Utility Functions (Extended Abstract)

Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptSanthosh Kumar Srinivasan
 
software engineering Design Patterns.pdf
software  engineering Design   Patterns.pdfsoftware  engineering Design   Patterns.pdf
software engineering Design Patterns.pdfmulugetaberihun3
 
Mining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce FrameworkMining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce FrameworkIRJET Journal
 
Introduction to Data Structure
Introduction to Data Structure Introduction to Data Structure
Introduction to Data Structure Prof Ansari
 
Functional and Algebraic Domain Modeling
Functional and Algebraic Domain ModelingFunctional and Algebraic Domain Modeling
Functional and Algebraic Domain ModelingDebasish Ghosh
 
Modeling System Requirements
Modeling System RequirementsModeling System Requirements
Modeling System RequirementsAsjad Raza
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxBhagyasriPatel2
 
Searching Repositories of Web Application Models
Searching Repositories of Web Application ModelsSearching Repositories of Web Application Models
Searching Repositories of Web Application ModelsMarco Brambilla
 
In memory OLAP engine
In memory OLAP engineIn memory OLAP engine
In memory OLAP engineWO Community
 
Basic Behavioral Modeling
Basic Behavioral ModelingBasic Behavioral Modeling
Basic Behavioral ModelingAMITJain879
 
Object oriented methodologies
Object oriented methodologiesObject oriented methodologies
Object oriented methodologiesnaina-rani
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberJISC KeepIt project
 
Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Gianmario Spacagna
 
Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structuressonykhan3
 

Similaire à Extending High-Utility Pattern Mining with Facets and Advanced Utility Functions (Extended Abstract) (20)

Ch08
Ch08Ch08
Ch08
 
Ch08
Ch08Ch08
Ch08
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Introduction to Design Patterns in Javascript
Introduction to Design Patterns in JavascriptIntroduction to Design Patterns in Javascript
Introduction to Design Patterns in Javascript
 
software engineering Design Patterns.pdf
software  engineering Design   Patterns.pdfsoftware  engineering Design   Patterns.pdf
software engineering Design Patterns.pdf
 
Mining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce FrameworkMining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce Framework
 
Introduction to Data Structure
Introduction to Data Structure Introduction to Data Structure
Introduction to Data Structure
 
Functional and Algebraic Domain Modeling
Functional and Algebraic Domain ModelingFunctional and Algebraic Domain Modeling
Functional and Algebraic Domain Modeling
 
algo 1.ppt
algo 1.pptalgo 1.ppt
algo 1.ppt
 
Variable and Arguments_4.pptx
Variable and Arguments_4.pptxVariable and Arguments_4.pptx
Variable and Arguments_4.pptx
 
Modeling System Requirements
Modeling System RequirementsModeling System Requirements
Modeling System Requirements
 
Oomd unit1
Oomd unit1Oomd unit1
Oomd unit1
 
UNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptxUNIT_5_Data Wrangling.pptx
UNIT_5_Data Wrangling.pptx
 
Searching Repositories of Web Application Models
Searching Repositories of Web Application ModelsSearching Repositories of Web Application Models
Searching Repositories of Web Application Models
 
In memory OLAP engine
In memory OLAP engineIn memory OLAP engine
In memory OLAP engine
 
Basic Behavioral Modeling
Basic Behavioral ModelingBasic Behavioral Modeling
Basic Behavioral Modeling
 
Object oriented methodologies
Object oriented methodologiesObject oriented methodologies
Object oriented methodologies
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
 
Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...
 
Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structures
 

Dernier

300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.Silpa
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Silpa
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Silpa
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Silpa
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learninglevieagacer
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxRenuJangid3
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptxArvind Kumar
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspectsmuralinath2
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxDiariAli
 
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.Silpa
 
CYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptxCYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptxSilpa
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLkantirani197
 

Dernier (20)

300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptx
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptx
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspects
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
 
CYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptxCYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptx
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
 

Extending High-Utility Pattern Mining with Facets and Advanced Utility Functions (Extended Abstract)

  • 1. the 37th International Conference on Logic Programming Sept. 20–27, 2021 (virtual event) Extending High-Utility Pattern Mining with Facets and Advanced Utility Functions (Extended Abstract) Francesco Cauteruccio and Giorgio Terracina DEMACS, University of Calabria, Italy {cauteruccio, terracina}@mat.unical.it
  • 2. Context and Motivation • Pattern Mining is one of the most studied data mining branches: • Find interesting patterns (set of items) in a database of transactions; • Examples: Frequent pattern mining, sequential pattern mining, etc… • High-Utility Pattern Mining (HUPM) • Find patterns having a high-utility (w.r.t. some utility measure) • Example: in a sales database, the utility of a pattern may be represented by the profit of items sold together. • Basic assumption: each item is associated with one, static utility . • However… • The utility of an item can be defined from very different point of views, • Transactions are not only flat lists of items but they can provide different level of abstractions. Pattern Mining and High-Utility Pattern Mining
  • 3. Context and Motivation • We present a framework for HUPM extending basic notions and introducing: • Given a pattern 𝑃, we say that 𝑃 is an extended high-utility pattern if its utility 𝑢(𝑃) is greater than a minimum threshold 𝑡ℎ! and it occurs in at least 𝑡ℎ" transactions. • The problem of extended high-utility pattern mining (e-HUPM) is to discover all the extended high-utility patterns in a given database 𝐷. An extended framework for HUPM Transaction set representation Facets Advanced utility functions For each transaction, different levels of aggregation can be defined. + + Attributes that can be associated with any level of aggregation of the framework. A taxonomy of functions that can be combined in several ways to fit different notions of utility.
  • 4. Proposed Framework • A multi-layer database representation • 𝐷𝑎𝑡𝑎𝑏𝑎𝑠𝑒 → 𝐶𝑜𝑛𝑡𝑎𝑖𝑛𝑒𝑟 → 𝑂𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑟𝑎𝑛𝑠𝑎𝑐𝑡𝑖𝑜𝑛 • Given a database 𝐷 and a set of transactions {𝑇!, … , 𝑇"} • 𝐷 is organized as a set of containers 𝐶 = {𝐶!, … , 𝐶"} • 𝐶# can be associated with a set of objects 𝑂 = {𝑂!, … , 𝑂$} • 𝑂% contains a set of transactions {𝑇!, … , 𝑇&} • The facets define the utility of an item 𝑖 • Each item 𝑖 can be associated with one or more facets. • Facets can also be defined for transactions, objects and containers. • Values of facets are represented by utility vectors • Item utility vector: 𝐼𝑈! = [𝑖𝑢", 𝑖𝑢#, … , 𝑖𝑢$], and 𝑖𝑢% describes a certain facet of 𝑖 • Same applies for transactions, objects and containers. Extending the database structure and introducing facets Running example depicting a sales database Item utility vector for the item 𝑖! containing values for price and weight facets
  • 5. Proposed Framework • The advanced utility functions combine utility vectors in different ways: 1. Intra-pattern utility function • Given a pattern and its occurrences, combine item utility vectors across all the occurrences. 2. Pattern utility function • Combine the intra-pattern utility function with the facets from the transaction set representation. • Generates a matrix 𝑈& (occurrences × facets). 3. Utility function 𝒖 𝑷 • Computes the utility of a pattern over 𝑈&. • Several classifications for 𝑢(𝑃) • Horizontal-first, vertical-first, inter-transaction utility, pattern-vs-object utility, etc… Advanced utility functions
  • 6. ASP Approach The encoding %%% Input schema: %container(ContainerId) %object(ObjectId,ContainerId) %transaction(Tid, ObjectId) %item(Item, Tid, Position, Q) %itemUtilityVector(Item, I1, ..., Il) %transactionUtilityVector(Tid, T1, ..., Tm) %objectUtilityVector(ObjectId, O1, ..., On) %containerUtilityVector(ContainerId, C1, ..., Co) %%% Parameters occurrencesThreshold(...). utilityTreshold(...). %%% Item pre-filtering usefulItem(I):- item(I,_,_,_),....any condition on the items. %%% Candidate pattern generation {inCandidatePattern(I)}:- usefulItem(I). %%% Occurrences computation and check inTransaction(Tid):- transaction(Tid,_), not incomplete(Tid). incomplete(TiD):- transaction(Tid,_), inCandidatePattern(I), not contains(I,Tid). contains(I,Tid):- item(I,Tid,_,_). :- #count{ Tid : inTransaction(Tid)}=N, N < Tho, occurrencesThreshold(Tho). %%% Utility computation patternItemUtilityVectors(Tid,Item,I1,...,Il,Q):- inCandidatePattern(Item), itemUtilityVector(Item, I1, ..., Il), inTransaction(Tid), item(Item, Tid, Position, Q). intraPatternUtilityVector(Tid,I1,...,Il):- &computeIntraPatternUtility[patternItemUtilityVectors](Tid,I1,...,Il). occurrenceUtilityVector(Tid,I1,...,Il,T1,...Tm,O1,...On,C1,...,Co):- inTransaction(Tid), intraPatternUtilityVector(Tid,I1,...,Il), transactionUtilityVector(Tid, T1, ..., Tm), transaction(Tid, ObjectId), objectUtilityVector(ObjectId, O1, ..., On), object(ObjectId , ContainerId) containerUtilityVector(ContainerId, C1, ..., Co). :- &computeUtility[occurrenceUtilityVector](U), U < Thu, utilityTreshold(Thu). 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 • Classic guess-and-check approach • We generate one answer set for each valid pattern, • The advanced utility functions are executed by means of external functions (e.g., in DLVHEX, WASP, clingo) • Pattern validity criteria and filters can be applied by encoding them.
  • 7. Experimental Evaluation • Dataset: aspect-based sentiment analysis of scientific reviews [1] • Each sentence is annotated with different aspects, each aspect as a value. Dataset, quantitative and qualitative analysis [1] Chakraborty, S., Goyal, P., Mukherjee, A.: Aspect-based sentiment analysis of scientific reviews. In: JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. pp. 207–216 (2020), ACM • Quantitative analysis: Running time, comparison with HUPM systems (in a classical setting). • Qualitative analysis: • Mining patterns where the advanced utility function is the Pearson correlation between the sentiment on a sentence aspect 𝑋 and the final decision on the corresponding paper. • Similar results have been obtained where the advanced utility function is the Multiple correlation. The use case scenario in the e-HUPM model The use case scenario in the e-HUPM model
  • 8. Conclusion • We introduced a general framework for HUPM with several extensions. • The framework allows to work with multi-dimensional data and with different utility measures. • A versatile and modular ASP encoding has been developed. • We employed a real use case on scientific reviews to carry out both quantitative and qualitative analyses. • Facets and advanced utility functions help reducing the amount of relevant patterns • Useful in providing deep insights on the data. • Not an ending point! • Apply the framework to new contexts, • Derive ad-hoc algorithms for particular classes of advanced utility functions.
  • 9. Thanks for your attention! These slides are available at https://bit.ly/ehupm-iclp21 Francesco Cauteruccio Research Fellow @ DEMACS, University of Calabria cauteruccio@mat.unical.it francescocauteruccio.info @finalfire