FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION




       Integrating Computer Log Files
             for Process Mining
           A Genetic Algorithm Inspired Technique

                                                       Jan Claes
                                                       jan.claes@ugent.be
                                                       http://processmining.ugent.be
                                                       Ghent University, Belgium

Faculty of Economics and Business Administration                                      Jan Claes for INISET@CAiSE 2011
Department of Management Information and Operations Management                                            21 June, 2011




                            1. Process Mining




A plane crashed... What happened?




Analyse the ‘black box’
A process failed... What happened?


Analyse the ‘black box’: look for historical data
Process Mining:
        Reconstruct and analyse processes
        From historical process data
             • Log files
             • Audit trails
             • Database history fields/tables



Process Mining

Processes are supported by IT systems
IT systems record actual process data
Process data can be used to automatically
   Discover process model
   Check conformance with existing process info
   Extend existing process model
Attention
        Only As-Is
        Only (correctly) recorded information
Process Mining steps

 Preparation
            Collect data: find traces
            Merge data: from different sources
            Structure data: group per instance
            Convert data: to tool specific format
 Process mining
 Make decisions, take action
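
As a rough illustration of the "Structure data" step, the sketch below groups a flat list of events per process instance and orders each group chronologically. The field names (`case_id`, `activity`, `timestamp`) and the function name `group_per_instance` are assumptions made for this example, not something the slides define.

```python
from collections import defaultdict

def group_per_instance(events):
    """Group flat event records into traces, one per process instance."""
    traces = defaultdict(list)
    for event in events:
        traces[event["case_id"]].append(event)
    # Order the events inside each trace chronologically.
    return {
        case_id: sorted(trace, key=lambda e: e["timestamp"])
        for case_id, trace in traces.items()
    }

# Hypothetical events, deliberately out of order.
events = [
    {"case_id": "SO1", "activity": "Inv", "timestamp": 3},
    {"case_id": "SO2", "activity": "SO", "timestamp": 4},
    {"case_id": "SO1", "activity": "SO", "timestamp": 1},
]
traces = group_per_instance(events)
```

Here `traces["SO1"]` holds the two SO1 events in chronological order, ready to be converted to a tool-specific format.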






                          2. Merging log files




Example

Product ordering, registered events:
        Sales order: document creation (administration)
        Delivery: truck load confirmation (warehouse)
        Invoice: document creation (administration)
Logging
        from administration software
        from warehouse software
How to merge both log files?
Example 1

Administration                                                   Warehouse
          SO1       SO > Inv                                       SO1       Deliver

          SO2       SO > Inv                                       SO2       Deliver

          SO3       SO > Inv                                       SO3       Deliver

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


       Merge based on matching trace identifiers
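
A minimal sketch of Example 1, assuming each log is a mapping from trace identifier to a list of `(timestamp, activity)` events. The timestamps and the function name `merge_by_trace_id` are hypothetical; the slide only shows the activities.

```python
def merge_by_trace_id(log_a, log_b):
    """Merge two logs whose traces share the same identifiers.

    Each log maps a trace identifier to a list of (timestamp, activity) events.
    """
    merged = {}
    for trace_id in sorted(set(log_a) | set(log_b)):
        events = log_a.get(trace_id, []) + log_b.get(trace_id, [])
        merged[trace_id] = sorted(events)  # tuples sort by timestamp first
    return merged

# Hypothetical timestamps for the traces of Example 1.
administration = {"SO1": [(1, "SO"), (3, "Inv")], "SO2": [(4, "SO"), (6, "Inv")]}
warehouse = {"SO1": [(2, "Deliver")], "SO2": [(5, "Deliver")]}
merged = merge_by_trace_id(administration, warehouse)
```

With these inputs the merged trace SO1 reads SO > Deliver > Inv, as on the slide.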
Example 2

Administration                                                   Warehouse
           SO1      SO > Inv                                        Del1 Deliver (SO1)

           SO2      SO > Inv                                        Del2 Deliver (SO2)

           SO3      SO > Inv                                        Del3 Deliver (SO3)

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


       Merge based on matching attribute values
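
Example 2 could be sketched as follows, assuming every warehouse trace carries an attribute that holds the sales order number. The attribute name `sales_order`, the data layout and the function name `link_by_attribute` are assumptions for illustration only.

```python
def link_by_attribute(log_a, log_b, attribute):
    """Link traces of log_b to traces of log_a via an attribute value.

    log_a maps trace id -> list of (timestamp, activity) events.
    log_b maps trace id -> {"attributes": {...}, "events": [...]}.
    Returns {trace id in log_b: trace id in log_a} for every match found.
    """
    links = {}
    for b_id, trace in log_b.items():
        reference = trace["attributes"].get(attribute)
        if reference in log_a:
            links[b_id] = reference
    return links

administration = {"SO1": [(1, "SO"), (3, "Inv")], "SO2": [(4, "SO"), (6, "Inv")]}
warehouse = {
    "Del1": {"attributes": {"sales_order": "SO1"}, "events": [(2, "Deliver")]},
    "Del2": {"attributes": {"sales_order": "SO2"}, "events": [(5, "Deliver")]},
    "Del9": {"attributes": {}, "events": [(9, "Deliver")]},  # no reference: stays unlinked
}
links = link_by_attribute(administration, warehouse, "sales_order")
```

Traces without a usable reference, such as the hypothetical `Del9`, are simply left out of the result.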
Example 3

Administration                                                   Warehouse
           SO1      SO (t1) > Inv (t3)                              Arr1 Deliver (t2)

           SO2      SO (t4) > Inv (t6)                              Arr2 Deliver (t5)

           SO3      SO (t7) > Inv (t9)                              Arr3 Deliver (t8)

           t1 < t2 < t3  <<  t4 < t5 < t6  <<  t7 < t8 < t9

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


                 Merge based on time information
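
One possible reading of Example 3 is sketched below: a warehouse trace is linked to the administration trace whose time span contains it. This only works when traces do not overlap in time, as in the example. The concrete timestamps and the function name `link_by_time` are assumptions.

```python
def link_by_time(log_a, log_b):
    """Link each trace of log_b to the trace of log_a whose time span contains it."""
    links = {}
    for b_id, b_events in log_b.items():
        b_start = min(t for t, _ in b_events)
        b_end = max(t for t, _ in b_events)
        for a_id, a_events in log_a.items():
            a_start = min(t for t, _ in a_events)
            a_end = max(t for t, _ in a_events)
            if a_start <= b_start and b_end <= a_end:
                links[b_id] = a_id
                break  # keep the first containing trace only
    return links

# Hypothetical timestamps with t1 < t2 < t3 << t4 < t5 < t6 << t7 < t8 < t9.
administration = {
    "SO1": [(1, "SO"), (3, "Inv")],
    "SO2": [(14, "SO"), (16, "Inv")],
    "SO3": [(27, "SO"), (29, "Inv")],
}
warehouse = {"Arr1": [(2, "Deliver")], "Arr2": [(15, "Deliver")], "Arr3": [(28, "Deliver")]}
links = link_by_time(administration, warehouse)
```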
Merging computer log files

Merge based on
        Example 1: matching trace identifiers                        indicator 1
        Example 2: matching attribute values                         indicator 2
        Example 3: time information                                  indicator 3
General solution
  algorithm combining different indicators
Genetic algorithm
  indicators build up fitness function
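
The idea of indicators building up a fitness function might look like the sketch below: each indicator scores one candidate link, and the fitness of a solution is a weighted sum over all links and indicators. The trace fields (`id`, `ref`, `start`, `end`), the three indicator functions and the equal weights are assumptions; the slides do not specify how the indicators are combined.

```python
def same_trace_id(a, b):
    """Indicator 1: both traces carry the same identifier."""
    return 1.0 if a["id"] == b["id"] else 0.0

def matching_attribute(a, b):
    """Indicator 2: an attribute of b refers to the identifier of a."""
    return 1.0 if b.get("ref") == a["id"] else 0.0

def time_containment(a, b):
    """Indicator 3: b happens within the time span of a."""
    return 1.0 if a["start"] <= b["start"] and b["end"] <= a["end"] else 0.0

def fitness(links, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of all indicator scores over all candidate links."""
    indicators = (same_trace_id, matching_attribute, time_containment)
    return sum(w * ind(a, b) for a, b in links for ind, w in zip(indicators, weights))

so1 = {"id": "SO1", "start": 1, "end": 3}
so2 = {"id": "SO2", "start": 4, "end": 6}
del1 = {"id": "Del1", "ref": "SO1", "start": 2, "end": 2}

good = fitness([(so1, del1)])  # attribute and time both support this link
bad = fitness([(so2, del1)])   # no indicator supports this link
```

A link supported by more indicators gets a higher score, so the search can prefer it without relying on any single indicator.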





                        3. Genetic algorithm




Genetic algorithm




[Figure: three generations of candidate solutions; cross-over and mutation lead from the 1st to the 2nd generation, survival of the fittest from the 2nd to the 3rd generation]
Genetic algorithm
Fitness function score

  1st generation    2nd generation    3rd generation
        14                18                18
        27                29                28
         6                 5                32

[Figure: cross-over and mutation lead from the 1st to the 2nd generation, survival of the fittest from the 2nd to the 3rd generation]
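
For readers unfamiliar with the terms, here is a generic textbook-style genetic algorithm on bit lists, showing cross-over, mutation and survival of the fittest. It is not the merge technique of this talk; the toy fitness (the number of ones), the population size and the mutation rate are arbitrary assumptions.

```python
import random

def evolve(population, fitness, generations, rng, mutation_rate=0.1):
    """Generic genetic algorithm over equal-length bit lists."""
    size = len(population)
    for _ in range(generations):
        children = []
        for _ in range(size):
            mother, father = rng.sample(population, 2)
            cut = rng.randrange(1, len(mother))   # cross-over point
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        # Survival of the fittest: keep the best of parents and children.
        population = sorted(population + children, key=fitness, reverse=True)[:size]
    return population

rng = random.Random(0)
start = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
final = evolve(start, fitness=sum, generations=20, rng=rng)
```

Because the parents compete with their children for survival, the best score in the population can never decrease from one generation to the next.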
Genetic algorithm inspired technique

Find links between traces of both log files and
 merge them chronologically in a new log file
Steps
        Make initial solution (best individual links)
        Make pseudo-random changes
         (try to improve score for one specific factor)
        Evaluate (keep original or changed solution)
        Stop condition (fixed number of steps)
Only one solution, no cross-over
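
The steps above can be sketched as a single-solution search. This is a simplified reading, not the author's implementation: the change step here is purely random instead of targeting one specific factor, and the score table, the penalty for linking two traces to the same partner, and the name `search_links` are all hypothetical.

```python
import random

def search_links(a_ids, b_ids, score, steps, rng, penalty=2.0):
    """Search for one link per trace in b_ids, keeping a single solution."""
    def total(solution):
        link_score = sum(score[(a, b)] for b, a in solution.items())
        # Assumed penalty: discourage two traces sharing the same partner.
        duplicates = len(solution) - len(set(solution.values()))
        return link_score - penalty * duplicates

    # Initial solution: the best individual link for every trace.
    solution = {b: max(a_ids, key=lambda a: score[(a, b)]) for b in b_ids}
    for _ in range(steps):                       # stop after a fixed number of steps
        candidate = dict(solution)
        candidate[rng.choice(b_ids)] = rng.choice(a_ids)  # pseudo-random change
        if total(candidate) > total(solution):   # keep original or changed solution
            solution = candidate
    return solution

# Hypothetical per-link scores, e.g. produced by a fitness function.
score = {("SO1", "Del1"): 3, ("SO1", "Del2"): 2,
         ("SO2", "Del1"): 0, ("SO2", "Del2"): 1}
links = search_links(["SO1", "SO2"], ["Del1", "Del2"], score,
                     steps=200, rng=random.Random(0))
```

In this toy case the initial solution links both deliveries to SO1; the search then moves Del2 to SO2 because that removes the penalty.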




                      4. Experiment results




Experiment: proof of concept


Simulated data
        Given model
        Generate
             • random set of logs
             • single log (=solution)
        Use merge algorithm to merge set of logs
        Check resulting log with solution log
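
A small sketch of such an experiment set-up, assuming the model SO > Deliver > Inv from the earlier examples. The generator, the random time gaps and the ratio of incorrect links to identified links are illustrative assumptions, and the "found" links are faked by hand rather than produced by a real merge algorithm.

```python
import random

def simulate(n_cases, rng):
    """Simulate the model SO > Deliver > Inv and split it over two logs."""
    administration, warehouse, expected_links = {}, {}, {}
    clock = 0
    for i in range(1, n_cases + 1):
        t_so = clock + rng.randint(1, 5)
        t_deliver = t_so + rng.randint(1, 5)
        t_inv = t_deliver + rng.randint(1, 5)
        clock = t_inv
        administration[f"SO{i}"] = [(t_so, "SO"), (t_inv, "Inv")]
        warehouse[f"Arr{i}"] = [(t_deliver, "Deliver")]
        expected_links[f"Arr{i}"] = f"SO{i}"   # the known solution
    return administration, warehouse, expected_links

def incorrect_link_ratio(found_links, expected_links):
    """Share of identified links that do not appear in the known solution."""
    if not found_links:
        return 0.0
    incorrect = sum(1 for b, a in found_links.items() if expected_links.get(b) != a)
    return incorrect / len(found_links)

administration, warehouse, expected = simulate(4, random.Random(1))
found = dict(expected)
found["Arr4"] = "SO1"   # pretend the merge algorithm got one link wrong
ratio = incorrect_link_ratio(found, expected)
```

Because the solution is generated together with the logs, every link a merge algorithm proposes can be checked automatically.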



Experiment: proof of concept

Advantages of using simulated data
        Solution is known
        Controllable parameters
         (e.g. noise, overlap, matching id)
Disadvantages of using simulated data
        Limited internal validity (are results realistic?)
        No external validity (results not generalisable)


Experiment results

[Figure: incorrect links relative to total links identified]








                                    5. Discussion




Future work

Optimise genetic algorithm
        Fewer incorrect links
        Faster implementation (AIS algorithm)
        Fitness function factors
Validation with real test cases
        Ghent University DPO (Human Resources)
        Century21 (Real Estate) & FlexPack (Packaging)
        BNP Paribas Fortis (Finance)
        ...
Contact information




                                             Jan Claes
                                             jan.claes@ugent.be

                                             http://processmining.ugent.be
                                             Twitter: @janclaesbelgium




