2. Motivation
Imagine a train crashes because of an engineering error and many people are injured
You are a national railway system administrator, say ABC
You might be in trouble!
Mining Project-Oriented Business Processes Motivation 2 / 22
5. Who is responsible?
Are you, as ABC, responsible for the accident?
Show that your work complies with safety regulations
E.g., in the railway domain: EN 50126, EN 50128, EN 50129
6. How to provide evidence of compliance?
Analyze the work in retrospect
The company does not use a BPM engine to execute its processes:
No process designed a priori
Rather a project that is handled ad-hoc by engineers
An expert (auditor) analyses the existing documentation and
manually checks if everything was done properly
Spreadsheets, word-processor documents, diagrams, version control system (VCS) data
8. Idea: mine project-oriented business processes
Does the accident have something to do with the software?
12. Project-Oriented Business Processes
Classic business process      Project-oriented business process
Engine                        No engine
Recursive, cyclic             One time with fixed goals and resources
Many instances                One prototype/product
Process model (e.g. BPMN)     Plan (e.g. Gantt chart)
Activities                    Work packages
Subprocesses                  Sub-work packages
Process mining
13. State of the art: reduction to process mining
Mining a process from software repositories (Kindler et al., 2006)
14. State of the art: visualization I
Dotted chart (Song & van der Aalst, 2007)
15. State of the art: visualization II
Storylines (Ogawa & Ma, 2010)
18. Challenges
Timing (how long did an activity really take, compared with what we see in the log?)
Aggregation (how can we aggregate events into activities, and how can we view the project at a coarser granularity?)
Coverage (how efficiently was the time used?)
22. Assumptions
1. Meaningful tree structure
2. Members perform local changes
3. Systematic commits
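As a rough illustration of how the first two assumptions could support aggregation, the sketch below rolls file-level commit events up to an ancestor directory at a chosen depth of the repository tree. It is not the algorithm from the talk: the function name `aggregate_events`, the `(path, timestamp)` event shape, and the `depth` parameter are all hypothetical choices made for this example.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import PurePosixPath


def aggregate_events(events, depth):
    """Group (path, timestamp) events by their ancestor directory at `depth`.

    Hypothetical helper: returns {directory: (first_event, last_event, count)}.
    Files that sit above the requested depth are grouped under their own
    parent directory ("." for files in the repository root).
    """
    groups = defaultdict(list)
    for path, timestamp in events:
        parts = PurePosixPath(path).parts
        if len(parts) > depth:
            key = "/".join(parts[:depth])
        else:
            key = "/".join(parts[:-1]) or "."
        groups[key].append(timestamp)
    return {key: (min(ts), max(ts), len(ts)) for key, ts in groups.items()}


# Made-up events, for illustration only.
sample_events = [
    ("docs/spec/req.txt", datetime(2014, 1, 10, 9)),
    ("docs/plan.txt", datetime(2014, 1, 12, 17)),
    ("src/main.c", datetime(2014, 2, 1, 10)),
]
coarse_view = aggregate_events(sample_events, depth=1)
```

With `depth=1` the three sample events collapse into two groups, `docs` and `src`; a larger `depth` gives a finer-grained view of the same log.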
23. Visualization of a project
Aggregation (data from the SHAPE project)
Time span: Jan 2014 – Jan 2015
8 people
156 objects (files and directories)
226 commits, generating 453 events
24. Correction of activity starting times
Adjustment and coverage
25. Evaluation on open source projects
Log           Duration  Idle periods  Files     Commits   t̂c       χ
(file name)   (days)    (number)      (number)  (number)  (hours)  (%)
Our work      24        0             89        63        9        100
Whitehall     1279      6             6539      15566     2        95
Petitions     834       17            1562      914       13       59
Study         624       13            7501      736       11       58
The Guardian  1667      59            12889     621       30       44
Book          414       15            154       592       5        32
Papers        1859      55            1791      649       20       30
Requirements  771       22            505       231       17       21
Yelp          206       6             24        54        20       20
Adobe         1076      13            356       237       24       15
More real-world logs at https://github.com/showcases
26. Limitations and Future work
Limitations
Strong assumptions on the structure
The approach does not take into account the amount of change in the documents
Checking rules
Future work
Use statistical methods to improve the quality of the discovered projects
Discover the type of work/project by using comments written by users
User assessment of the quality of the discovered Gantt charts
28. Conclusion
We help the auditor to analyze the project
Different levels of abstraction (aggregation)
Time and resource of events
Work effort measure (coverage)
We used project VCS logs
Output as Gantt chart
Source code: https://github.com/s41m1r/MiningCVS
Email me: saimir.bala@wu.ac.at
29. References
Kindler, E., Rubin, V. & Schäfer, W. (2006). Activity Mining for Discovering Software Process Models. Software Engineering 79, 175–180.
Ogawa, M. & Ma, K.-L. (2010). Software evolution storylines. In Proceedings of the 5th international symposium on Software visualization (pp. 35–42).
Song, M. & van der Aalst, W. M. (2007). Supporting process mining by showing events at a glance. In 7th Annual Workshop on Information Technologies and Systems (pp. 139–145).
Baier, T., Mendling, J. & Weske, M. (2014). Bridging abstraction layers in process mining. Information Systems 46, 123–139.
30. Expected active time between commits
The expected active time between commits, $\hat{t}_c$, is given as follows.
\[ \hat{t}_c = \frac{\sum_{a \in A_f} \left( \omega(a) - \alpha'(a) \right)}{\sum_{a \in A_f} \left( c(a) - 1 \right)} \tag{1} \]
with
$\omega(a)$: end time of activity $a$
$\alpha'(a)$: time of the first event of activity $a$
$c(a)$: number of commits in activity $a$
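Equation (1) can be checked by hand with a small script. The following is a minimal sketch, assuming each activity is given as a hypothetical `(first_event, end, commits)` tuple; the slide does not define the set A_f, so the sketch simply treats it as whatever collection of activities is passed in. The function name and the sample data are illustrative only.

```python
from datetime import datetime


def expected_active_time(activities):
    """Expected active time between commits, in hours, following Equation (1).

    `activities` is an iterable of (first_event, end, commits) tuples
    (a hypothetical representation chosen for this sketch).
    """
    activities = list(activities)
    # Numerator: total time between first event and end over all activities.
    total_span = sum((end - first).total_seconds() for first, end, _ in activities)
    # Denominator: total number of gaps between consecutive commits.
    total_gaps = sum(commits - 1 for _, _, commits in activities)
    if total_gaps <= 0:
        raise ValueError("need at least one activity with two or more commits")
    return total_span / total_gaps / 3600.0


# Made-up activities: 8 h with 3 commits and 4 h with 2 commits.
sample_activities = [
    (datetime(2014, 1, 1, 9), datetime(2014, 1, 1, 17), 3),
    (datetime(2014, 1, 2, 9), datetime(2014, 1, 2, 13), 2),
]
t_c = expected_active_time(sample_activities)  # (8 + 4) hours / (2 + 1) gaps = 4.0
```

The guard on `total_gaps` covers the case where every activity has a single commit, for which the ratio in Equation (1) is undefined.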
31. Coverage factor
Definition (Coverage)
The coverage $\chi$ of work packages by activities is a function $\chi : W \to [0, 1]$, defined as follows.
\[ \chi(w) = \frac{\sum_{a \in \beta^{-1}(w)} \left( \omega(a) - \alpha(a) \right)}{\tau(w)} \tag{2} \]
where $\tau(w)$ is the duration of work package $w$.
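A short worked example may make Equation (2) more concrete. The sketch below assumes that β⁻¹(w) denotes the activities mapped to work package w, and that these activities do not overlap and lie inside the work package (otherwise the ratio could exceed 1). The function name `coverage` and the interval representation are hypothetical.

```python
from datetime import datetime


def coverage(activity_intervals, wp_start, wp_end):
    """Coverage of one work package, following Equation (2).

    `activity_intervals` is an iterable of (start, end) datetimes for the
    activities assumed to belong to the work package.
    """
    tau = (wp_end - wp_start).total_seconds()
    if tau <= 0:
        raise ValueError("work package must have a positive duration")
    active = sum((end - start).total_seconds() for start, end in activity_intervals)
    return active / tau


# Made-up work package of 10 days with two activities of 2 and 3 days.
chi = coverage(
    [
        (datetime(2014, 1, 2), datetime(2014, 1, 4)),
        (datetime(2014, 1, 6), datetime(2014, 1, 9)),
    ],
    wp_start=datetime(2014, 1, 1),
    wp_end=datetime(2014, 1, 11),
)  # 5 active days out of 10 gives 0.5
```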
32. Average idle time
Let $n_c$ be the number of commits per work package. We compute the average idle time as follows.
\[ t_{Idle} = \frac{\tau - n_c \cdot \hat{t}_c}{n}, \quad n > 0 \tag{3} \]
where $n$ is the number of idle periods in the work package and $\tau$ is the duration of the work package.
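Equation (3) is plain arithmetic once all quantities share one time unit. The sketch below assumes hours throughout; the function name, parameter names, and sample numbers are hypothetical and are not taken from the evaluation.

```python
def average_idle_time(tau_hours, n_commits, t_c_hours, n_idle):
    """Average idle time of a work package in hours, following Equation (3).

    tau_hours: duration of the work package
    n_commits: number of commits in the work package (n_c)
    t_c_hours: expected active time between commits
    n_idle:    number of idle periods (must be positive)
    """
    if n_idle <= 0:
        raise ValueError("Equation (3) is only defined for n > 0")
    return (tau_hours - n_commits * t_c_hours) / n_idle


# Made-up numbers: a 240 h work package, 20 commits, 4 h per commit, 8 idle periods.
t_idle = average_idle_time(tau_hours=240, n_commits=20, t_c_hours=4, n_idle=8)
# (240 - 20 * 4) / 8 = 20.0 hours
```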