SlideShare a Scribd company logo
1 of 3
Download to read offline
V. Priyadharshini et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349
RESEARCH ARTICLE

www.ijera.com

OPEN ACCESS

Analysis of Process Mining Model Using Frequentgroup Based
Noise Filtering Algorithm
V. Priyadharshini*, A. Malathi**
*

(PhD Research Scholar, PG & Research Department of Computer Science, Government Arts College
Coimbatore – 18)
**
(Assistant Professor, PG & Research Department of Computer Science, Government Arts College
Coimbatore – 18)

ABSTRACT
Process mining is a process management system used to analyze business processes based on event logs. The
knowledge is extracted from event logs by using knowledge retrieval techniques. The process mining algorithms
are capable of automatically discover models to give details of all the events registered in some log traces
provided as input. The theory of regions is a valuable tool in process discovery: it aims at learning a formal
model (Petri nets) from a set of traces. The main objective of this paper is to propose new concept
Frequentgroup based noise filtering algorithm. The experiment is done based on standard bench mark dataset
HELIX and RALIC datasets. The performance of the proposed system is better than existing method.
Keywords: Cycle detection algorithm, frequent group based noise filtering algorithm, Helix dataset, Process
discovery, Process mining, RALIC dataset

I. INTRODUCTION
Process discovery is one of the most
challenging process mining tasks. Based on event
log, a process model is constructed by capturing the
behavior of the log [4]. Event Logs essentially
capture the business activities happened at a certain
time period [1]. The basic idea is to extract
knowledge from event logs recorded by an
information system.
Process mining aims at rising this by providing
techniques and tools for locating method, control,
data, structure, and social structures from event logs.
The research domain that is concerned with
knowledge discovery from event logs is called
process mining [2]. More traditional data mining
techniques can be used in process mining. New
techniques are developed to perform process mining
i.e. mining of process models. It is the traditional
analysis of business processes based on the opinion
of process expert [3]. The business process mining
attempts to reconstruct a complete process model
from data logs that contain real process execution
data. Many techniques highlight the likelihood of
mixing variety of method mining
approaches
to
mine tougher event logs, like those which contain
noise.
The necessary background in Section II
describes related work. Section III presents existing
alpha algorithm that describes previous work done.
Section IV describes the proposed implemented
work. The result and discussion is presented in

www.ijera.com

Section V. Conclusion and future work is discussed
in Section VI.

II. RELATED WORK
The algorithm in [5] is an approximation
algorithm that has proven its efficiency in estimating
large number of cycles in polynomial time when
applied to real world networks. The algorithm counts
the number of cycles in random, sparse graphs as a
function of their length. While using it in real world
networks, the result is not guaranteed for generic
graphs.
The algorithm in [6] presented an algorithm
based on cycle vector space methods. This algorithm
is slow since it investigates all vectors and only a
small portion of them could be cycles.
The algorithm in [9] is DFS-XOR based on the fact
that small cycles can be joined together to form
bigger cycle. It is more time efficient when it comes
to real life problems of counting cycles in a graph
because its complexity is not depending on the factor
of number of cycles.

III. CYCLE DETECTION
ALGORITHM
It is a pointer algorithmic rule that
uses solely two pointers that move through the
sequence at totally different speeds. The algorithmic
rule solely must check for recurrent values of this
special type, one double as aloof from the beginning
of the sequence because the different, to search
out an amount [7]. The function value is used to find
347 | P a g e
V. Priyadharshini et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349

IV. FREQUENT GROUP BASED
NOISE FILTERING
It finds all Frequentgroup patterns for a
given minimum support and minimum confidence
and remove the objects that are not a part of any
Frequentgroup pattern [10]. The set of Frequentgroup
patterns for any data set depends upon the value of
minimum support and minimum confidence.
Wherever possible, we fix the minimum support to be
zero and employ the minimum confidence to control
the number of objects that are designated as noise
[12].
However, in some data sets setting the
support threshold to zero leads to an explosion in the
number of hypergroup patterns. For this reason, a low
support threshold that is high enough to reduce the
number of Frequentgroup patterns to a manageable
level is used.
The proposed system automatically decides
correct initial weight, noise filtering, feature selection
properties [11]. The proposed system is more
efficient than the existing system.
4.1 Datasets
Helix and RALIC datasets is a compilation
of release histories of a number of non-trivial Java
Open Source Software System. It contains class files
for each release of the system along with meta-data.
A metric history is derived from extraction of
releases and this data is directly used in research
works.
4.2 Results and discussion
In this work helix and RALIC dataset are
used for the experiment. We are comparing the three
methods such as existing and proposed works [13].
The existing region based folding and proposed
unsupervised noise filtering and Frequentgroup based
noise filtering as shown in Fig. 1. The fitness value of
existing system is low compared to the proposed
approach. By using proposed unsupervised noise
filtering and Frequentgroup based noise filtering
methods, we obtain the high fitness value. So we can
get the better performance comparatively [14].
By doing the experiment based on fitness value
calculation and fraction of unconnected transitions
(Tu). In Fig.1 experiment based on fraction of
unconnected transitions (Tu), we calculate the Tu
corresponding to existing and proposed approaches.
The Tu is decreased for the proposed methods
www.ijera.com

unsupervised noise filtering and Frequentgroup based
noise filtering methods.
This will also increase the performance of the
proposed system compared to existing system.
Finally we can conclude as the proposed
unsupervised noise filtering and Frequentgroup based
noise filtering methods has more effective than the
existing system.

V. CONCLUSION AND FUTURE
WORK
The datasets were used with three different
region based algorithms for unconnected transitions
and the result is shown. In future, we will look for
improvements of the existing process discovery and
visualization techniques that allow for the
construction of comprehensible models based on
realistic characteristics of an event logs.

Fraction of unconnected
transitions Tu

a cycle in a sequence of iterations. Cycles are
available in a graph and in much real life application;
it is required to know the existence of cycles in a
graph [8]. The algorithm is developed in the context
of network design problem but useful in any graph
application where existence is to be finding out.

www.ijera.com

0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0

region-based
folding
unsupervised
noise fildering

Helix RALIC

frequentgrou
p based noise
fildering

Dataset

Fig 1. Fraction of unconnected Transitions
using datasets

REFERENCES
[1]

[2]

[3]

[4]

Josep Carmona et al, New region-based
algorithms for deriving bounded petri nets,
IEEE Transactions on Computers, Vol. 59,
No.3, March 2010.
Jochen De Weerdt et al, Active trace
clustering for improved process discovery,
IEEE Transactions on Knowledge and Data
Engineering, Vol. 25, No. 12, Dec 2013
Zhiang Wu et al, A novel noise filter based
on interesting pattern mining for bag-offeatures images, Expert systems with
applications 2013.
Filip Caron, Jan Vanthienen and Bart
Baesens, A comprehensive framework for
the application of process mining in risk
management and compliance checking,
Faculty of Business and Economics.

348 | P a g e
V. Priyadharshini et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349
[5]

[6]

[7]

[8]

[9]

K. A. Mahdi, M. Safar, I. Sorkhoh, and A.
Kassem, Cycle-based versus degree-based
classification of social networks, Journal of
Digital Information Management, vol. 7, no.
6, 2009.
Robert Sedge wick, Thomas G. Szymanski,
Andrew C. Yao, The complexity of finding
cycles in periodic functions, SIAM journal
of computing, Vol. 11, No. 2, 2006.
Van der Aalst, W., de Medeiros, A.,
Weijters, A., Genetic process mining.
Applications and Theory of Petri Nets pp.
985–985, 2005.
De Medeiros, A., Weijters, A., van der
Aalst, W., Genetic process mining: an
experimental evaluation. Data Mining and
Knowledge Discovery vol.14, Issue. 2, 245–
304, 2007.
Kumar, Anand, Jani, N. N, An Algorithm to
Detect Cycle in an Undirected Graph,
International Journal of Computational
Intelligence Research, Vol. 6 No. 2, 2010.

www.ijera.com

[10]

[11]

[12]

[13]

[14]

www.ijera.com

Maytham Safar, Fahad Mahdi and Khaled
Mahdi, An Algorithm for Detecting Cycles
in Undirected Graphs using CUDA
Technology, International Journal on New
Computer
Architectures
and
Their
Applications, Vol. 2 No. 1, 2011.
Gianluigi Greco et al, Discovering
expressive process models by clustering log
traces, IEEE Transactions on Knowledge
and data engineering, Vol. 18, No. 8, Aug
2006.
Joonsoo Bae, Ling Liu et al, Development
of Distance Measures for Process Mining,
Discovery, and Integration, International
Journal of Web Services Research, Vol. 10,
No. 10, 2010.
Philip Weber, Behzad Bordbar, Peter Tiˇ no,
A Framework for the Analysis of Process
Mining Algorithms, 2009.
Wil M.P Van der Aalst, Process discovery:
capturing the invisible, IEEE Computational
Intelligence Magazine, 2010.

349 | P a g e

More Related Content

What's hot

Materials Informatics Overview
Materials Informatics OverviewMaterials Informatics Overview
Materials Informatics Overview
Tony Fast
 
Internet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series MethodsInternet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series Methods
Ajay Ohri
 

What's hot (20)

Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...Software Tools, Methods and Applications of Machine Learning in Functional Ma...
Software Tools, Methods and Applications of Machine Learning in Functional Ma...
 
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection SystemThe Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
 
Ijricit 01-002 enhanced replica detection in short time for large data sets
Ijricit 01-002 enhanced replica detection in  short time for large data setsIjricit 01-002 enhanced replica detection in  short time for large data sets
Ijricit 01-002 enhanced replica detection in short time for large data sets
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
 
Materials Informatics Overview
Materials Informatics OverviewMaterials Informatics Overview
Materials Informatics Overview
 
How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
 
Internet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series MethodsInternet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series Methods
 
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataThe DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
 
Outlier Detection Approaches in Data Mining
Outlier Detection Approaches in Data MiningOutlier Detection Approaches in Data Mining
Outlier Detection Approaches in Data Mining
 
Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...Computational Materials Design and Data Dissemination through the Materials P...
Computational Materials Design and Data Dissemination through the Materials P...
 
DuraMat Data Analytics
DuraMat Data AnalyticsDuraMat Data Analytics
DuraMat Data Analytics
 
Journals analysis ppt
Journals analysis pptJournals analysis ppt
Journals analysis ppt
 
A Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge SystemsA Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge Systems
 
Review of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmReview of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering Algorithm
 
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and RSvm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
 
Overview of DuraMat software tool development (poster version)
Overview of DuraMat software tool development(poster version)Overview of DuraMat software tool development(poster version)
Overview of DuraMat software tool development (poster version)
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design Problems
 
A Review on Automatic Target Recognition and Detection Image Preprocessing Ap...
A Review on Automatic Target Recognition and Detection Image Preprocessing Ap...A Review on Automatic Target Recognition and Detection Image Preprocessing Ap...
A Review on Automatic Target Recognition and Detection Image Preprocessing Ap...
 

Viewers also liked

Conteudo Programatico
Conteudo ProgramaticoConteudo Programatico
Conteudo Programatico
btizatto1
 

Viewers also liked (6)

Conteudo Programatico
Conteudo ProgramaticoConteudo Programatico
Conteudo Programatico
 
Ernesto Colman Viajes: Déjate sorprender por los encantos de Brujas
Ernesto Colman Viajes: Déjate sorprender por los encantos de BrujasErnesto Colman Viajes: Déjate sorprender por los encantos de Brujas
Ernesto Colman Viajes: Déjate sorprender por los encantos de Brujas
 
power
powerpower
power
 
Teaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & TextspeakTeaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & Textspeak
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
 

Similar to Ay4201347349

Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
BRNSSPublicationHubI
 
Recommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assocRecommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assoc
ijerd
 
A New Data Stream Mining Algorithm for Interestingness-rich Association Rules
A New Data Stream Mining Algorithm for Interestingness-rich Association RulesA New Data Stream Mining Algorithm for Interestingness-rich Association Rules
A New Data Stream Mining Algorithm for Interestingness-rich Association Rules
Venu Madhav
 

Similar to Ay4201347349 (20)

Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
 
IRJET- Surveillance of Object Motion Detection and Caution System using B...
IRJET-  	  Surveillance of Object Motion Detection and Caution System using B...IRJET-  	  Surveillance of Object Motion Detection and Caution System using B...
IRJET- Surveillance of Object Motion Detection and Caution System using B...
 
[IJET-V1I4P3] Authors :Mekhala
[IJET-V1I4P3] Authors :Mekhala[IJET-V1I4P3] Authors :Mekhala
[IJET-V1I4P3] Authors :Mekhala
 
IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...
IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...
IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...
 
Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...
Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...
Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...
 
I0343047049
I0343047049I0343047049
I0343047049
 
Review Over Sequential Rule Mining
Review Over Sequential Rule MiningReview Over Sequential Rule Mining
Review Over Sequential Rule Mining
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoop
 
Analysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTAnalysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOT
 
REVIEW: Frequent Pattern Mining Techniques
REVIEW: Frequent Pattern Mining TechniquesREVIEW: Frequent Pattern Mining Techniques
REVIEW: Frequent Pattern Mining Techniques
 
An Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set MiningAn Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
 
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
 
A NOVEL APPROACH TO MINE FREQUENT PATTERNS FROM LARGE VOLUME OF DATASET USING...
A NOVEL APPROACH TO MINE FREQUENT PATTERNS FROM LARGE VOLUME OF DATASET USING...A NOVEL APPROACH TO MINE FREQUENT PATTERNS FROM LARGE VOLUME OF DATASET USING...
A NOVEL APPROACH TO MINE FREQUENT PATTERNS FROM LARGE VOLUME OF DATASET USING...
 
Adaptive and Fast Predictions by Minimal Itemsets Creation
Adaptive and Fast Predictions by Minimal Itemsets CreationAdaptive and Fast Predictions by Minimal Itemsets Creation
Adaptive and Fast Predictions by Minimal Itemsets Creation
 
50120140504006
5012014050400650120140504006
50120140504006
 
A Novel Framework on Web Usage Mining
A Novel Framework on Web Usage MiningA Novel Framework on Web Usage Mining
A Novel Framework on Web Usage Mining
 
Recommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assocRecommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assoc
 
Mining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce FrameworkMining High Utility Patterns in Large Databases using Mapreduce Framework
Mining High Utility Patterns in Large Databases using Mapreduce Framework
 
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
 
A New Data Stream Mining Algorithm for Interestingness-rich Association Rules
A New Data Stream Mining Algorithm for Interestingness-rich Association RulesA New Data Stream Mining Algorithm for Interestingness-rich Association Rules
A New Data Stream Mining Algorithm for Interestingness-rich Association Rules
 

Recently uploaded

Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
UK Journal
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 

Recently uploaded (20)

Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The InsideCollecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 

Ay4201347349

  • 1. V. Priyadharshini et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349 RESEARCH ARTICLE www.ijera.com OPEN ACCESS Analysis of Process Mining Model Using Frequentgroup Based Noise Filtering Algorithm V. Priyadharshini*, A. Malathi** * (PhD Research Scholar, PG & Research Department of Computer Science, Government Arts College Coimbatore – 18) ** (Assistant Professor, PG & Research Department of Computer Science, Government Arts College Coimbatore – 18) ABSTRACT Process mining is a process management system used to analyze business processes based on event logs. The knowledge is extracted from event logs by using knowledge retrieval techniques. The process mining algorithms are capable of automatically discover models to give details of all the events registered in some log traces provided as input. The theory of regions is a valuable tool in process discovery: it aims at learning a formal model (Petri nets) from a set of traces. The main objective of this paper is to propose new concept Frequentgroup based noise filtering algorithm. The experiment is done based on standard bench mark dataset HELIX and RALIC datasets. The performance of the proposed system is better than existing method. Keywords: Cycle detection algorithm, frequent group based noise filtering algorithm, Helix dataset, Process discovery, Process mining, RALIC dataset I. INTRODUCTION Process discovery is one of the most challenging process mining tasks. Based on event log, a process model is constructed by capturing the behavior of the log [4]. Event Logs essentially capture the business activities happened at a certain time period [1]. The basic idea is to extract knowledge from event logs recorded by an information system. Process mining aims at rising this by providing techniques and tools for locating method, control, data, structure, and social structures from event logs. The research domain that is concerned with knowledge discovery from event logs is called process mining [2]. More traditional data mining techniques can be used in process mining. New techniques are developed to perform process mining i.e. mining of process models. It is the traditional analysis of business processes based on the opinion of process expert [3]. The business process mining attempts to reconstruct a complete process model from data logs that contain real process execution data. Many techniques highlight the likelihood of mixing variety of method mining approaches to mine tougher event logs, like those which contain noise. The necessary background in Section II describes related work. Section III presents existing alpha algorithm that describes previous work done. Section IV describes the proposed implemented work. The result and discussion is presented in www.ijera.com Section V. Conclusion and future work is discussed in Section VI. II. RELATED WORK The algorithm in [5] is an approximation algorithm that has proven its efficiency in estimating large number of cycles in polynomial time when applied to real world networks. The algorithm counts the number of cycles in random, sparse graphs as a function of their length. While using it in real world networks, the result is not guaranteed for generic graphs. The algorithm in [6] presented an algorithm based on cycle vector space methods. This algorithm is slow since it investigates all vectors and only a small portion of them could be cycles. The algorithm in [9] is DFS-XOR based on the fact that small cycles can be joined together to form bigger cycle. It is more time efficient when it comes to real life problems of counting cycles in a graph because its complexity is not depending on the factor of number of cycles. III. CYCLE DETECTION ALGORITHM It is a pointer algorithmic rule that uses solely two pointers that move through the sequence at totally different speeds. The algorithmic rule solely must check for recurrent values of this special type, one double as aloof from the beginning of the sequence because the different, to search out an amount [7]. The function value is used to find 347 | P a g e
  • 2. V. Priyadharshini et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349 IV. FREQUENT GROUP BASED NOISE FILTERING It finds all Frequentgroup patterns for a given minimum support and minimum confidence and remove the objects that are not a part of any Frequentgroup pattern [10]. The set of Frequentgroup patterns for any data set depends upon the value of minimum support and minimum confidence. Wherever possible, we fix the minimum support to be zero and employ the minimum confidence to control the number of objects that are designated as noise [12]. However, in some data sets setting the support threshold to zero leads to an explosion in the number of hypergroup patterns. For this reason, a low support threshold that is high enough to reduce the number of Frequentgroup patterns to a manageable level is used. The proposed system automatically decides correct initial weight, noise filtering, feature selection properties [11]. The proposed system is more efficient than the existing system. 4.1 Datasets Helix and RALIC datasets is a compilation of release histories of a number of non-trivial Java Open Source Software System. It contains class files for each release of the system along with meta-data. A metric history is derived from extraction of releases and this data is directly used in research works. 4.2 Results and discussion In this work helix and RALIC dataset are used for the experiment. We are comparing the three methods such as existing and proposed works [13]. The existing region based folding and proposed unsupervised noise filtering and Frequentgroup based noise filtering as shown in Fig. 1. The fitness value of existing system is low compared to the proposed approach. By using proposed unsupervised noise filtering and Frequentgroup based noise filtering methods, we obtain the high fitness value. So we can get the better performance comparatively [14]. By doing the experiment based on fitness value calculation and fraction of unconnected transitions (Tu). In Fig.1 experiment based on fraction of unconnected transitions (Tu), we calculate the Tu corresponding to existing and proposed approaches. The Tu is decreased for the proposed methods www.ijera.com unsupervised noise filtering and Frequentgroup based noise filtering methods. This will also increase the performance of the proposed system compared to existing system. Finally we can conclude as the proposed unsupervised noise filtering and Frequentgroup based noise filtering methods has more effective than the existing system. V. CONCLUSION AND FUTURE WORK The datasets were used with three different region based algorithms for unconnected transitions and the result is shown. In future, we will look for improvements of the existing process discovery and visualization techniques that allow for the construction of comprehensible models based on realistic characteristics of an event logs. Fraction of unconnected transitions Tu a cycle in a sequence of iterations. Cycles are available in a graph and in much real life application; it is required to know the existence of cycles in a graph [8]. The algorithm is developed in the context of network design problem but useful in any graph application where existence is to be finding out. www.ijera.com 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 region-based folding unsupervised noise fildering Helix RALIC frequentgrou p based noise fildering Dataset Fig 1. Fraction of unconnected Transitions using datasets REFERENCES [1] [2] [3] [4] Josep Carmona et al, New region-based algorithms for deriving bounded petri nets, IEEE Transactions on Computers, Vol. 59, No.3, March 2010. Jochen De Weerdt et al, Active trace clustering for improved process discovery, IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 12, Dec 2013 Zhiang Wu et al, A novel noise filter based on interesting pattern mining for bag-offeatures images, Expert systems with applications 2013. Filip Caron, Jan Vanthienen and Bart Baesens, A comprehensive framework for the application of process mining in risk management and compliance checking, Faculty of Business and Economics. 348 | P a g e
  • 3. V. Priyadharshini et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.347-349 [5] [6] [7] [8] [9] K. A. Mahdi, M. Safar, I. Sorkhoh, and A. Kassem, Cycle-based versus degree-based classification of social networks, Journal of Digital Information Management, vol. 7, no. 6, 2009. Robert Sedge wick, Thomas G. Szymanski, Andrew C. Yao, The complexity of finding cycles in periodic functions, SIAM journal of computing, Vol. 11, No. 2, 2006. Van der Aalst, W., de Medeiros, A., Weijters, A., Genetic process mining. Applications and Theory of Petri Nets pp. 985–985, 2005. De Medeiros, A., Weijters, A., van der Aalst, W., Genetic process mining: an experimental evaluation. Data Mining and Knowledge Discovery vol.14, Issue. 2, 245– 304, 2007. Kumar, Anand, Jani, N. N, An Algorithm to Detect Cycle in an Undirected Graph, International Journal of Computational Intelligence Research, Vol. 6 No. 2, 2010. www.ijera.com [10] [11] [12] [13] [14] www.ijera.com Maytham Safar, Fahad Mahdi and Khaled Mahdi, An Algorithm for Detecting Cycles in Undirected Graphs using CUDA Technology, International Journal on New Computer Architectures and Their Applications, Vol. 2 No. 1, 2011. Gianluigi Greco et al, Discovering expressive process models by clustering log traces, IEEE Transactions on Knowledge and data engineering, Vol. 18, No. 8, Aug 2006. Joonsoo Bae, Ling Liu et al, Development of Distance Measures for Process Mining, Discovery, and Integration, International Journal of Web Services Research, Vol. 10, No. 10, 2010. Philip Weber, Behzad Bordbar, Peter Tiˇ no, A Framework for the Analysis of Process Mining Algorithms, 2009. Wil M.P Van der Aalst, Process discovery: capturing the invisible, IEEE Computational Intelligence Magazine, 2010. 349 | P a g e