Liquid process model collections

Prof. Marcello LaRosa
BPM Discipline, Information Systems School
Queensland University of Technology

Research
• Technology oriented
• Business oriented
Teaching
• BPM specializations
• Master’s of BPM
Service
• Professional training
• Consultancy
BPM Discipline @ QUT
http://bpm-research-group.org

Each process is varied by product & brand…
End to end insurance process
Source: Guidewire reference models
Total number of insurance models: 3,000+
30 variations
500 tasks
Home
Motor
Commercial
Liability
CTP / WC
A few years back… Suncorp insurance

Managing large process model collections
versions & variants management
merging
refactoring /
standardization
clone
detection
Process model
repository
querying
similarity
search
80%
R. Dijkman, M. La Rosa, H. Reijers “Managing large collections of business process models – Current techniques and challenges”, COMIND 2012
[Chart: yearly values, 2000 to 2012; vertical axis 0 to 40]

The Apromore Initiative (apromore.org)
An open-source, highly scalable, SaaS platform to manage
process model collections
M. La Rosa, H. Reijers, W. van der Aalst, R. Dijkman, J. Mendling, M. Dumas, L. Garcia-Banuelos “APROMORE: an advanced process model
repository”, EXP.SYS.APP. 2011

“Build awareness”
Understand differences and causes for these differences
“Achieve simplification”
Identify and consolidate common business functions
“Achieve centralisation”
Centralise support for non-core processes across LOBs
“Identify opportunities for partnering”
Make better decisions about the processes you can partner/run in-house
Expected benefits (beginning of the project)

“The tool is great but it would be pretty useless because our
process models aren’t great and we know they aren’t great” (*)
“If they [the models] would have been updated all along it
would have been worthwhile but now they’re out of date, it’s
not really worth the effort of bringing them up to date” (**)
(*) Realistic Suncorp employee
(**) Disillusioned Suncorp employee
The reality (end of the project)

Large process model collections, hard to maintain
Large collections of “dead” process models
The other side of the coin

Vision: what if the collection could self-adapt?
Change is endemic to organizations and continuously affects them:
Requirements change
Environments change
Processes change

If an organization’s processes change, the change will be recorded in the system logs
Use process mining techniques to “discover” process changes from logs, and apply these changes to process model collections
Release the full potential of management techniques for process model collections
Let’s get more concrete
/
event log
live event stream
database
process model
patterns
conformance
analysis
process
performance…
if A then B
extract
process
knowledge

A process model collection that
• Is aligned with organizational behavior
• Can self-adapt to evolving organizational behavior
Solution: “Liquid” process model collection
W.M.P. van der Aalst, M. La Rosa, A.H.M. ter Hofstede, M.T. Wynn, Liquid Business Process Model Collections. In Modeling and
Simulation-based Systems Engineering Handbook, 2014

1. Discovering a collection in the first place
2. Coping with evolution
3. Aligning logs with an existing process model collection
Approach: 3 interrelated challenges

Event log
Challenge 1: Discovering a collection
Process discovery
algorithm
Current situation: discovering “all-in-one” models

Event log:
a b a b c d e c a f g h
a b a b k d e c h f g h
b c p q p r a k q r s
b c p h p r a k q r s
a x p h y z t t u
Trace clustering → Cluster 1, Cluster 2, Noise
Cluster 1 → Process variant 1
Cluster 2 → Process variant 2
Trace clustering
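To make the idea of trace clustering concrete, here is a minimal sketch. It is not one of the clustering techniques cited in this deck: it assumes a simple greedy scheme that compares the sets of activities of two traces with Jaccard similarity. The function names and the 0.5 similarity cut-off are hypothetical choices for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of activity labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_traces(traces, min_similarity=0.5):
    """Greedy clustering (an assumed, simplistic scheme): a trace joins the
    first cluster whose seed trace is similar enough, else starts a new one."""
    clusters = []  # each cluster is a list of trace indices; index 0 is the seed
    for i, trace in enumerate(traces):
        activities = set(trace)
        for cluster in clusters:
            if jaccard(activities, set(traces[cluster[0]])) >= min_similarity:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    # Singleton clusters are reported as noise rather than as process variants.
    variants = [c for c in clusters if len(c) > 1]
    noise = [c[0] for c in clusters if len(c) == 1]
    return variants, noise


# The five traces of the event log shown on the slide.
log = [
    "a b a b c d e c a f g h".split(),
    "a b a b k d e c h f g h".split(),
    "b c p q p r a k q r s".split(),
    "b c p h p r a k q r s".split(),
    "a x p h y z t t u".split(),
]
variants, noise = cluster_traces(log)
```

With this cut-off the first two traces and the next two traces form two clusters, and the last trace is left over as noise. A discovery algorithm would then be run once per cluster.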

Our approach: Slice, Mine and Dice (SMD)
L. Garcia-Banuelos, M. Dumas, M. La Rosa, J. De Weerdt, C.C. Ekanayake. Controlled Automated Discovery of Collections of
Business Process Models. Information Systems, 2014
slice the log horizontally
per variant
dice the discovered models
hierarchically
mine

Process
discovery
Trace
clustering
Event log
Complexity
threshold
e.g. Size ≤ 30
Process model
Slice, Mine and Dice (SMD)
>?
1) Slice
2) Mine
3) Dice

Process models
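The interplay between slicing, mining and the complexity threshold can be sketched as a loop. This is a simplified illustration under stated assumptions, not the SMD implementation: the directly-follows graph stands in for the discovery algorithm, model size is counted as the number of distinct activities, the slicing step is a naive split of the sorted trace variants, and the dice step (extracting shared fragments as sub-processes) is left out. All function names are hypothetical.

```python
def mine(traces):
    """Stand-in discovery step: the directly-follows graph of the sublog."""
    edges = set()
    for trace in traces:
        edges.update(zip(trace, trace[1:]))
    return edges


def size(model):
    """Model complexity, assumed here to be the number of distinct activities."""
    return len({activity for edge in model for activity in edge})


def slice_and_mine(traces, max_size):
    """Keep slicing a sublog until its mined model meets the size threshold."""
    models = []
    pending = [list(traces)]
    while pending:
        sublog = pending.pop()
        model = mine(sublog)
        variants = sorted(set(map(tuple, sublog)))
        if size(model) <= max_size or len(variants) == 1:
            # Accept the model (a single variant cannot be sliced further).
            models.append(model)
        else:
            # Naive stand-in for trace clustering: split the variants in two.
            left = set(variants[: len(variants) // 2])
            pending.append([t for t in sublog if tuple(t) in left])
            pending.append([t for t in sublog if tuple(t) not in left])
    return models


log = [
    "a b a b c d e c a f g h".split(),
    "a b a b k d e c h f g h".split(),
    "b c p q p r a k q r s".split(),
    "b c p h p r a k q r s".split(),
    "a x p h y z t t u".split(),
]
models = slice_and_mine(log, max_size=9)
```

On this toy log a threshold of 9 activities yields three models. The loop terminates because every split leaves each half with fewer distinct variants than its parent.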

Process model M3
Process model M2
A closer look…

RPST of M2:
F10
F11 F13
F14 F12
RPST of M3:
F20
F21 F22
F24 F23
F25
Refined Process Structure Tree (RPST)
J. Vanhatalo, H. Volzer, J. Koehler: The Refined Process Structure Tree. Data Knowl. Eng., 2009
M3

M2
M3
F14 F25
M. Dumas, L. García-Bañuelos, M. La Rosa, and R. Uba. Fast Detection of Exact Clones in Business Process Model Repositories.
Information Systems, 2013
RPSDAG
RPST of M2:
F10
F11 F13
F14 F12
RPST of M3:
F20
F21 F22
F24 F23
F25

F14
M2
M3
S3
+
S3
+
Extracting exact clones
S3
Exact clones:
• SESE
• Non-trivial
• Identical
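A rough way to picture exact clone detection is an index from canonical fragments to the models that contain them. The sketch below is not the RPSDAG data structure itself. It assumes the SESE fragments of each model have already been computed (for instance from the RPST) and are encoded as frozensets of labelled edges, and it treats fragments with a single edge as trivial. Both the encoding and that cut-off are hypothetical.

```python
from collections import defaultdict


def find_exact_clones(fragments_by_model, min_edges=2):
    """Return the non-trivial fragments that occur more than once."""
    index = defaultdict(list)  # canonical fragment -> models containing it
    for model, fragments in fragments_by_model.items():
        for fragment in fragments:
            if len(fragment) >= min_edges:  # skip trivial fragments
                index[fragment].append(model)
    return {f: models for f, models in index.items() if len(models) > 1}


# Hypothetical fragments: identical edge sets count as identical fragments.
shared = frozenset({("A", "B"), ("B", "C")})
repository = {
    "M2": [shared, frozenset({("X", "Y"), ("Y", "Z")})],
    "M3": [frozenset({("B", "C"), ("A", "B")}), frozenset({("P", "Q")})],
}
clones = find_exact_clones(repository)
```

Because each fragment is stored once and looked up by its canonical form, finding identical fragments reduces to a dictionary lookup rather than a pairwise comparison of models.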

F12
F22
M3
+
S3
+
S3
?
?
Extracting approximate clones
M. La Rosa, M. Dumas, C. Ekanayake, L. Garcia-Banuelos, J. Recker, A.H.M. ter Hofstede, Detecting Approximate Clones in
Business Process Model Repositories. Information Systems, 2015
Appr. clones:
• SESE
• Non-trivial
• Similar
• Unrelated
+
+
M2

Merging
algorithm
Fragment F12 of model M2
Fragment F22 of model M3
Configurable
gateway
Configurable
label
M. La Rosa, M. Dumas, R. Uba, and R. M. Dijkman. Business Process Model Merging: An Approach to Business Process
Consolidation. ACM TOSEM, 2013.
Merging approximate clones
S4

Trace clustering
• M. Song, C.W. Gunther, and W.M.P. van der Aalst, Improving Process Mining
with Trace Clustering, J. Korean Inst. of Industrial Engineers 34(4), 2008
• R.P.J.C. Bose, W.M.P. van der Aalst, Trace Clustering Based on Conserved
Patterns: Towards Achieving Better Process Models, BPM 2009 Workshops
• A.K.A. de Medeiros, A. Guzzo, G. Greco, W.M.P. van der Aalst, A.J.M.M.
Weijters, B.F. van Dongen, D. Saccà. Process Mining Based on Clustering: A
Quest for Precision, BPM Workshops 2007
Discovery
• A.J.M.M. Weijters, J.T.S. Ribeiro. Flexible Heuristics Miner (FHM), CIDM,
2011.
Evaluation setup
Log          Traces   Events   Event classes   Duplication ratio
Motor         4,293   33,202             292                 114
Commercial    4,852   54,134              81                 668
BPI 2012      5,312   91,949              36               2,554

Evaluation – repository size and models number
S: Song et al.
B: Bose et al.
M: de Medeiros et al.
• up to 64% reduction in repository size
• up to 66% reduction in # of top level process models
• up to 120 sub-processes extracted
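[Charts: repository size and number of models per log (Motor, Comm, BPI) for techniques S, B and M; values annotated on the charts: 14%, 22%, 66%, 64%]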

Evaluation – individual model complexity
[Charts: individual model complexity per log (Motor, Comm, BPI); value annotated on the charts: 30%]

Challenge 2: Coping with evolution
[Figure: concept drift between the log at time1 and the log at time2 > time1. Labels: non-transient changes since last log; liquid process model collection (from new behavior); process stakeholder; intentional changes since last version; liquid process model collection (currently in use); liquid process model collection (consolidated)]

A time point when there is a statistically significant difference
between the observed process behavior before and after this
point
Concept drift in a single process

Example
<A,B,E,F,G>
<A,B,C,F,G>
<A,B,C,F,G>
<A,B,D,F,G>
<A,B,E,F,G>
<A,B,D,F,G>
Drift
Log
<A,B,E,F,G>
<A,B,D,F,G>
<A,B,E,F,G>
<A,B,D,F,G>
<A,B,D,F,G>
<A,B,D,F,G>

1. Fully automated
2. Highly scalable (online use)
3. Highly accurate
- types of drifts detected
- delay in detecting the drift
4. Explainable
Requirements

• Statistically significant difference in process behavior, i.e.
“when are two processes different?”
• Use an appropriate data structure to encode process behavior
Partial-order runs of a process, where concurrency is explicitly captured → configuration equivalence
• Process drift = time point when there is a statistically
significant difference in the distribution of the runs before and
after (for a given time window size)
Our approach: ProDrift
A. Maaradji, M. Dumas, M. La Rosa, A. Ostovar, Fast and accurate business process drift detection. In BPM 2015

1. Starting from an event log, we consider completed traces
2. For each new trace $\sigma_i$:
• update the concurrency relation
• transform trace $\sigma_i$ into run $\pi_i$ by encoding the associated concurrency relation
From a stream of traces to a stream of runs
Stream of traces
Stream of runs
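The two steps above can be illustrated with a small sketch. The slides do not say how the concurrency relation is derived or how a run is encoded, so both are assumptions here: two activities are treated as concurrent once they have been seen directly following each other in both orders, and a run is encoded as layers of mutually unordered activities (a Foata-style normal form), so that traces differing only in the order of concurrent activities map to the same run. The class and method names are hypothetical.

```python
class RunEncoder:
    """Turns traces into runs using an incrementally updated concurrency relation."""

    def __init__(self):
        self.follows = set()  # observed directly-follows pairs (a, b)

    def concurrent(self, a, b):
        # Assumed rule: a and b are concurrent if seen in both orders.
        return a != b and (a, b) in self.follows and (b, a) in self.follows

    def update(self, trace):
        """Update the concurrency relation with a newly completed trace."""
        self.follows.update(zip(trace, trace[1:]))

    def to_run(self, trace):
        """Encode a trace as a tuple of layers of unordered activities."""
        levels = []
        for i, activity in enumerate(trace):
            level = 0
            for j in range(i):
                # An event is placed after every earlier event it depends on.
                if not self.concurrent(trace[j], activity):
                    level = max(level, levels[j] + 1)
            levels.append(level)
        layers = [[] for _ in range(max(levels, default=-1) + 1)]
        for activity, level in zip(trace, levels):
            layers[level].append(activity)
        return tuple(tuple(sorted(layer)) for layer in layers)


encoder = RunEncoder()
t1, t2 = ["A", "B", "C", "D"], ["A", "C", "B", "D"]
encoder.update(t1)
run_before = encoder.to_run(t1)  # B and C are still seen as sequential
encoder.update(t2)
run_after = encoder.to_run(t2)   # B and C are now seen as concurrent
```

Once both orders of B and C have been observed, the two traces are encoded as the same run, which is what lets the later statistical test count them as one category.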

1. Define two juxtaposed sliding windows (reference and
detection) forming the 2𝑤 most recent runs
2. Consider the runs as observations of a categorical variable,
one per window
3. Apply the Chi-square test of independence between the two
windows
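Stream of runs: $\pi_{i+1}, \dots, \pi_{i+w}$ (reference window), $\pi_{i+w+1}, \dots, \pi_{i+2w}$ (detection window)
[Figure: the two windows on the stream of runs, with the point of the hypothesis test marked]
Chi-square test of independence: P-value < threshold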
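The three steps above can be sketched with the standard library only. This is an illustrative sketch, not the ProDrift code: runs are assumed to be hashable values, the function names are hypothetical, and the P-value is obtained from a closed-form survival function of the chi-square distribution for integer degrees of freedom.

```python
import math
from collections import Counter


def chi2_sf(stat, df):
    """P(X >= stat) for a chi-square variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    h = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # Q(1/2, h)
    while a < df / 2.0:
        # Recurrence Q(a + 1, h) = Q(a, h) + h^a e^(-h) / Gamma(a + 1)
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def window_p_value(reference, detection):
    """Chi-square test of independence between two non-empty windows of runs."""
    ref, det = Counter(reference), Counter(detection)
    categories = set(ref) | set(det)
    if len(categories) < 2:
        return 1.0  # a single distinct run: nothing to compare
    total = len(reference) + len(detection)
    stat = 0.0
    for window, n in ((ref, len(reference)), (det, len(detection))):
        for c in categories:
            expected = n * (ref[c] + det[c]) / total
            stat += (window[c] - expected) ** 2 / expected
    return chi2_sf(stat, len(categories) - 1)


def sliding_p_values(runs, w):
    """One test per position of the two juxtaposed windows of size w."""
    return [
        window_p_value(runs[i:i + w], runs[i + w:i + 2 * w])
        for i in range(len(runs) - 2 * w + 1)
    ]


# Hypothetical stream: run "r1" is replaced by run "r3" after 200 runs.
stream = ["r1", "r2"] * 100 + ["r3", "r2"] * 100
p_values = sliding_p_values(stream, w=50)
```

The P-value stays at 1 while both windows see the same distribution of runs and drops sharply once the detection window has moved past the change.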

The detection delay d is the distance between the actual drift and the last trace that had to be read in order to detect it
To filter out sporadic stochastic oscillations of the P-value, a drift is reported only when P-value < threshold for 𝜙 consecutive tests
Detection delay and noise filter
[Figure: stream of runs $\pi_{i+1}, \dots, \pi_{i+2w}$ showing the reference window, the detection window, the point of the hypothesis test, the actual drift and the detection delay d]
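The noise filter and the detection delay are simple enough to sketch directly. The function names and the default values of the threshold and of 𝜙 below are hypothetical, chosen only for illustration.

```python
def detect_drifts(p_values, threshold=0.05, phi=3):
    """Return the indices of the tests at which a drift is reported, i.e.
    where the P-value has been below the threshold for phi tests in a row."""
    drifts, streak = [], 0
    for i, p in enumerate(p_values):
        streak = streak + 1 if p < threshold else 0
        if streak == phi:  # report once per streak
            drifts.append(i)
    return drifts


def detection_delay(detected_at, actual_drift):
    """Number of traces read past the actual drift before it is reported."""
    return detected_at - actual_drift


# A single low P-value (index 1) is ignored; three in a row trigger a report.
p_values = [0.9, 0.01, 0.8, 0.01, 0.02, 0.03, 0.01, 0.7]
reported = detect_drifts(p_values)
```

A larger 𝜙 suppresses more spurious detections but adds at least 𝜙 - 1 tests to the delay, so the two accuracy measures pull in opposite directions.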

The choice of the window size is critical for drift detection:
• a higher variation needs more observations
• a lower variation needs fewer observations
We use an adaptive window technique, based on the evolution ratio, to obtain a more reliable statistical test
Adaptive window
[Figure: stream of runs with the reference and detection windows, labelled $T_j$ and $T_{j+1}$]

Implementation in Apromore
Watch the screencast at https://youtu.be/97NLShSMJnQ

We generated a benchmark dataset of 72 logs by simulating a
textbook example (loan origination process) using BIMP
Injected 18 different change patterns
For each pattern, we generated 4 logs of different lengths
(2,500, 5,000, 7,500 and 10,000 traces)
Evaluation: synthetic dataset

Change patterns from Weber et al.
12 simple change patterns:
+ 6 complex change patterns (3 nested patterns each):
IRO, IOR, ORI, OIR, RIO, ROI
Weber, B., Reichert, M., Rinderle-Ma, S.: Change patterns and change support features-enhancing flexibility in
process-aware information systems. DKE 66(3), 2008

Drift injection – gold standard
Each drift injected 9 times by composing 10 sublogs
juxtaposition
simulation
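The construction of the gold standard can be sketched as follows. The sketch assumes that the sublogs alternate between a log simulated from the original model and a log simulated from the changed model, and that all sublogs have the same length; the slide states neither explicitly. The function name and inputs are hypothetical.

```python
def juxtapose(base_log, changed_log, sublog_len, n_sublogs=10):
    """Compose a log from alternating sublogs and return it together with
    the positions of the injected drifts (the gold standard)."""
    log = []
    for k in range(n_sublogs):
        source = base_log if k % 2 == 0 else changed_log
        start = (k // 2) * sublog_len  # next unused slice of that source
        log.extend(source[start:start + sublog_len])
    drift_points = [k * sublog_len for k in range(1, n_sublogs)]
    return log, drift_points


# Placeholder traces: "b" from the base model, "c" from the changed model.
base = ["b"] * 1250
changed = ["c"] * 1250
log, drift_points = juxtapose(base, changed, sublog_len=250)
```

With ten sublogs of 250 traces each this gives a log of 2,500 traces with nine drifts, one at every boundary between consecutive sublogs.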

Time performance: time required to perform a new statistical test
- min: 0.26ms
- max: 2.3ms
- mean: 0.5ms (real time)
Accuracy:
- F-score
- Mean delay
Evaluation measures

Impact of window size on F-score and mean delay

Impact of window size on mean delay

Impact of adaptive window size on F-score

Impact of adaptive window size on mean delay

• Log from claims management system of a large Australian
insurance company
• 4,509 traces, 29,108 events with 12 event classes
Evaluation on real-life log

• Results validated with a business analyst from the insurance
company
• Distribution of the number of active cases over log timeline
confirms the results
Evaluation on real-life log

How to explain what happened?
N.R. van Beest, M. Dumas, L. Garcia-Banuelos, M. La Rosa, Log delta analysis: Interpretable differencing of business process event logs. In
BPM 2015
Event structure1 Event structure2
MERGE MERGE
Runs1
Runs2
PSP
Before the drift, task “Emit invoice” could be repeated; after the drift, it no longer can...

Challenge 3: Aligning logs with existing collection
[Figure: traces of a log aligned with the models of a process model collection. Labels: trace, sub-trace, activity, event, lack of accuracy, superfluous activity, missing activity]
Full alignment (trace, model): A A, B B, C C, D D, E E
Partial alignment (trace, model): P P, Q Q, R >>, S >>, E >>, F F, - X, G -
Overall alignment score = ?
Wil M. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Replaying history on process models for conformance checking and
performance analysis. Wiley Interdisc. Rev.: Data Mining and Knowledge Discovery, 2012
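To give a feel for what an alignment is, the sketch below aligns a trace with one execution sequence of a model using dynamic programming. It is a simplification: alignment techniques such as the one cited above search over all behavior of a model, whereas this sketch assumes a single, given model sequence and unit costs for every non-matching move. The function name is hypothetical and ">>" marks a skipped position.

```python
SKIP = ">>"


def align(trace, model_run):
    """Return (moves, cost), where cost counts the non-synchronous moves."""
    n, m = len(trace), len(model_run)
    # cost[i][j]: cheapest way to align trace[i:] with model_run[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(cost[i + 1][j] + 1)   # move in the log only
            if j < m:
                options.append(cost[i][j + 1] + 1)   # move in the model only
            if i < n and j < m and trace[i] == model_run[j]:
                options.append(cost[i + 1][j + 1])   # synchronous move
            cost[i][j] = min(options)
    moves, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and trace[i] == model_run[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append((trace[i], model_run[j]))
            i, j = i + 1, j + 1
        elif i < n and cost[i][j] == cost[i + 1][j] + 1:
            moves.append((trace[i], SKIP))
            i += 1
        else:
            moves.append((SKIP, model_run[j]))
            j += 1
    return moves, cost[0][0]


# The trace and model sequence from the slide's partial alignment.
moves, cost = align(list("PQRSEFG"), list("PQFX"))
synchronous = [a for a, b in moves if a == b]
```

Here P, Q and F are matched synchronously, while R, S, E and G occur only in the log and X only in the model. How to aggregate such per-trace costs into one score for a whole collection is the open question raised on the slide.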

Academic Director (Corporate engagements)
BPM Discipline, IS School
Science & Engineering Faculty
Queensland University of Technology
m.larosa@qut.edu.au
marcellolarosa.com
@mlr80
