DRONE: Predicting Priority of Reported Bugs by Multi-Factor Analysis
1. DRONE: Predicting Priority of Reported Bugs
by Multi-Factor Analysis
Yuan Tian1, David Lo1, Chengnian Sun2
1Singapore Management University
2National University of Singapore
2. Bug tracking systems allow developers to prioritize which bugs are to be fixed first.
• Manual process
• Depends on other bugs
• Time consuming
What is priority and when is it assigned?
[Diagram: a bug report moves from New to Assigned. A bug triager, facing up to 300 reports to triage daily, performs the validity check, duplicate check, priority assignment, and developer assignment.]
3. Priority vs. Severity
“Severity is assigned by customers [users] while
priority is provided by developers . . . customer
[user] reported severity does impact the developer
when they assign a priority level to a bug report,
but it’s not the only consideration. For example, it
may be a critical issue for a particular reporter that
a bug is fixed but it still may not be the right thing
for the eclipse team to fix.”
Eclipse PMC Member
Example of an Importance field: P5 (lowest priority level), major (high severity)
4. Outline
• Background
• Approach
  - Overall Framework
  - Features
  - Classification Module
• Experiment
  - Dataset
  - Research Questions
  - Results
• Conclusion
9. Temporal Factor
TMP1 Number of bugs reported within 7 days before the reporting of BR.
TMP2 Number of bugs reported with the same severity within 7 days
before the reporting of BR.
TMP3 Number of bugs reported with the same or higher severity within 7
days before the reporting of BR.
TMP4-6 The same as TMP1-3 except the time duration is 1 day.
TMP7-9 The same as TMP1-3 except the time duration is 3 days.
TMP10-12 The same as TMP1-3 except the time duration is 30 days.
Textual Factor
TXT1-n Stemmed words from the description field of BR excluding stop
words.
Severity Factor
SEV BR’s severity field.
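As a sketch, the TMP counts for an incoming report BR can be computed with a sliding time window. The in-memory report log, field layout, and integer severity encoding below are illustrative, not the Eclipse schema:

```python
from datetime import datetime, timedelta

# Hypothetical report log: (timestamp, severity), with severity as an
# ordinal int (higher = more severe). Values are made up for illustration.
reports = [
    (datetime(2007, 1, 1), 3),
    (datetime(2007, 1, 5), 5),
    (datetime(2007, 1, 6), 4),
    (datetime(2007, 1, 7), 4),
]

def temporal_features(reports, br_time, br_severity, days=7):
    """TMP-style counts over the `days`-day window before BR is reported."""
    window = [(t, s) for t, s in reports
              if br_time - timedelta(days=days) <= t < br_time]
    tmp1 = len(window)                                    # all reports
    tmp2 = sum(1 for _, s in window if s == br_severity)  # same severity
    tmp3 = sum(1 for _, s in window if s >= br_severity)  # same or higher
    return tmp1, tmp2, tmp3

# BR reported on 2007-01-08 with severity 4
print(temporal_features(reports, datetime(2007, 1, 8), 4))  # (4, 2, 3)
```

Varying `days` to 1, 3, and 30 yields the TMP4-6, TMP7-9, and TMP10-12 variants.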
10. Author Factor
AUT1 Mean priority of all bug reports made by the author of BR prior to the reporting of BR.
AUT2 Median priority of all bug reports made by the author of BR prior to the reporting of BR.
AUT3 Number of bug reports made by the author of BR prior to the reporting of BR.
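A minimal sketch of the AUT features, assuming the author's prior priority levels have already been collected and are encoded 1-5 for P1-P5:

```python
from statistics import mean, median

def author_features(prior_priorities):
    """AUT1-3: mean/median prior priority and prior report count."""
    if not prior_priorities:      # new reporter: no history yet
        return None, None, 0
    return (mean(prior_priorities),
            median(prior_priorities),
            len(prior_priorities))

print(author_features([3, 3, 1, 5]))  # (3.0, 3.0, 4)
```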
Related Reports Factor [REP, Sun et al.]
REP1 Mean priority of the top-20 most similar bug reports to BR, as measured using REP, prior to the reporting of BR.
REP2 Median priority of the top-20 most similar bug reports to BR, as measured using REP, prior to the reporting of BR.
REP3-4 The same as REP1-2 except only the top 10 bug reports are considered.
REP5-6 The same as REP1-2 except only the top 5 bug reports are considered.
REP7-8 The same as REP1-2 except only the top 3 bug reports are considered.
REP9-10 The same as REP1-2 except only the top 1 bug report is considered.
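Once a similarity score is available, the REP features reduce to an aggregation over the top-k neighbours. The candidate list below uses made-up scores as stand-ins for REP's learned measure:

```python
from statistics import mean, median

# Hypothetical candidates: (similarity score, priority level) for bug
# reports filed before BR. Scores here are illustrative, not REP output.
candidates = [(0.9, 3), (0.7, 1), (0.6, 3), (0.2, 5)]

def rep_features(scored_reports, k):
    """Mean/median priority of the top-k most similar earlier reports."""
    top = sorted(scored_reports, key=lambda x: x[0], reverse=True)[:k]
    priorities = [p for _, p in top]
    return mean(priorities), median(priorities)

m, med = rep_features(candidates, 3)
print(round(m, 2), med)  # 2.33 3
```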
11. Product Factor
PRO1 BR’s product field. (Note: categorical feature.)
PRO2 Number of bug reports made for the same product as that of BR prior to the reporting of BR.
PRO3 Number of bug reports made for the same product and of the same severity as that of BR prior to the reporting of BR.
PRO4 Number of bug reports made for the same product and of the same or higher severity as that of BR prior to the reporting of BR.
PRO5 Proportion of bug reports made for the same product as that of BR prior to the reporting of BR that are assigned priority P1.
PRO6-9 The same as PRO5 except they are for priorities P2-P5 respectively.
PRO10 Mean priority of bug reports made for the same product as that of BR prior to the reporting of BR.
PRO11 Median priority of bug reports made for the same product as that of BR prior to the reporting of BR.
PRO12-22 The same as PRO1-11 except they are for the component field of BR.
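The proportion-style features (PRO5-9) can be sketched as a per-priority histogram over the product's earlier reports; the 1-5 priority encoding and the sample values are illustrative:

```python
from collections import Counter

def product_proportions(prior_priorities):
    """PRO5-9: fraction of the product's earlier reports at each priority."""
    counts = Counter(prior_priorities)
    n = len(prior_priorities)
    return {p: counts.get(p, 0) / n for p in range(1, 6)}

print(product_proportions([3, 3, 3, 1, 5]))
# {1: 0.2, 2: 0.0, 3: 0.6, 4: 0.0, 5: 0.2}
```

Running the same computation over reports sharing BR's component field gives the PRO16-20 counterparts.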
19. Dataset
• Eclipse Project
  - 2001-10-10 to 2007-12-14
  - 178,609 bug reports
[Figure: timeline split of the data into REP training, DRONE training (model building and validation), and DRONE testing.]
[Figure: priority distribution of the dataset: P1 4.50%, P2 6.89%, P3 85.45%, P4 1.95%, P5 1.21%.]
20. Research Questions & Measurements
• Accuracy (Precision, Recall, F-measure); compare with SEVERISprio [Menzies & Marcus] and SEVERISprio+
• Efficiency (run time)
• Top features (Fisher score)
21. RQ1: How accurate?
[Figure: per-class F-measure (0-100%) over P1-P5 for DRONE, SEVERISprio, and SEVERISprio+.]
1. Baselines predict everything as P3!
2. Average F-measure improves from 18.75% to 29.47%.
3. A relative improvement of 57.17%.
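The relative improvement follows directly from the two averages:

```python
# Relative improvement of DRONE's average F-measure over the baselines'.
baseline, drone = 18.75, 29.47
relative = (drone - baseline) / baseline * 100
print(f"{relative:.2f}%")  # 57.17%
```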
22. RQ2: How efficient?
Run time (in seconds):
| Approach     | Feature Extraction (train) | Model Building | Feature Extraction (test) | Model Application |
| SEVERISprio  | <0.01                      | 812.18         | <0.01                     | <0.01             |
| SEVERISprio+ | <0.01                      | 773.62         | <0.01                     | <0.01             |
| DRONE        | 0.01                       | 69.25          | <0.01                     | <0.01             |
Our approach is much faster in model building!
27. Conclusion
yuan.tian.2012@smu.edu.sg
• Priority prediction is an ordinal, imbalanced classification problem -> linear regression + thresholding is one option.
• DRONE improves the average F-measure of the baselines from 18.75% to 29.47%, a relative improvement of 57.17%.
• Product factor features are the most discriminative, followed by related-reports factor features.
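A minimal sketch of the "linear regression + thresholding" idea: a linear model scores a report, and cut-points map the score onto the ordinal P1-P5 scale. The single feature, toy data, and fixed cut-points below are illustrative; DRONE learns its thresholds from the training data:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for one feature: returns slope, intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def to_priority(score, cuts=(1.5, 2.5, 3.5, 4.5)):
    """Threshold a real-valued score into ordinal classes 1 (P1) .. 5 (P5)."""
    for level, cut in enumerate(cuts, start=1):
        if score < cut:
            return level
    return 5

# Toy training data: one feature value -> priority level (1-5)
xs = [0.1, 0.3, 0.5, 0.7, 0.9]
ys = [1, 2, 3, 4, 5]
slope, intercept = fit_linear(xs, ys)
print(to_priority(slope * 0.52 + intercept))  # 3
```

Keeping the regression real-valued and thresholding afterwards preserves the ordering of the classes, which a flat multi-class classifier ignores.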
30. Thank you!
I acknowledge the support of Google and the ICSM organizers in the form of a Female Student Travel Grant, which enabled me to attend this conference.
38. Previous Research Work: Severity Prediction
• Menzies and Marcus (ICSM 2008)
  - Analyze reports at NASA
  - Textual features + feature selection + RIPPER
• Lamkanfi et al. (MSR 2010, CSMR 2011)
  - Predict coarse-grained severity labels (severe vs. non-severe)
  - Analyze reports in open-source systems
  - Compare and contrast various algorithms
• Tian et al. (WCRE 2012)
  - Information retrieval + k-nearest neighbour
39. Text Pre-processing
• Tokenization: splitting a document into tokens according to delimiters.
• Stop-word removal, e.g.: are, is, I, he
• Stemming, e.g.: working, works, worked -> work
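The three steps can be sketched as one small pipeline. The tiny stop-word list and the crude suffix-stripping stemmer below are stand-ins for real resources such as a full stop list and the Porter stemmer:

```python
import re

STOP_WORDS = {"are", "is", "i", "he", "the", "a"}  # tiny illustrative list

def simple_stem(token):
    """Crude suffix stripping (a stand-in for, e.g., the Porter stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = re.findall(r"[a-z0-9]+", text.lower())      # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
    return [simple_stem(t) for t in tokens]              # stemming

print(preprocess("He is working; the builds worked"))
# ['work', 'build', 'work']
```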
40. Appendix: Similarity Between Bug Reports (REP)
• Textual features: compute BM25Fext scores
  - Feature 1: extracted unigrams
  - Feature 2: extracted bigrams
• Non-textual features
  - Feature 3: product field
  - Feature 4: component field
Note: weights are learned from duplicate bug reports.
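As a greatly simplified stand-in for REP (which uses BM25Fext with field weights learned from duplicate reports), the sketch below measures similarity as cosine overlap of unigram and bigram counts, with no learned weights at all:

```python
import math
from collections import Counter

def grams(tokens):
    """Combined unigram and bigram count vector for a token list."""
    unis = Counter(tokens)
    bis = Counter(zip(tokens, tokens[1:]))
    # Prefix bigram keys so they never collide with unigram strings.
    return unis + Counter({("bi",) + k: v for k, v in bis.items()})

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

doc1 = ["crash", "on", "startup"]
doc2 = ["crash", "on", "exit"]
print(round(cosine(grams(doc1), grams(doc2)), 3))  # 0.6
```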
41. Appendix: Top Features (ranked by Fisher score)
PRO5 Proportion of bug reports made for the same product as that of BR prior to the reporting of BR that are assigned priority P1.
PRO16 Proportion of bug reports made for the same component as that of BR prior to the reporting of BR that are assigned priority P1.
REP1 Mean priority of the top-20 most similar bug reports to BR, as measured using REP, prior to the reporting of BR.
REP3 Mean priority of the top-10 most similar bug reports to BR, as measured using REP, prior to the reporting of BR.
PRO18 Proportion of bug reports made for the same component as that of BR prior to the reporting of BR that are assigned priority P3.
PRO10 Mean priority of bug reports made for the same product as that of BR prior to the reporting of BR.
PRO21 Mean priority of bug reports made for the same component as that of BR prior to the reporting of BR.
PRO7 Proportion of bug reports made for the same product as that of BR prior to the reporting of BR that are assigned priority P3.
REP5 Mean priority of the top-5 most similar bug reports to BR, as measured using REP, prior to the reporting of BR.
Text “1663”
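The Fisher score behind this ranking contrasts between-class scatter with within-class scatter of a feature's values; the toy values below are made up purely to show the behaviour:

```python
from statistics import mean, pvariance

def fisher_score(groups):
    """Fisher score of one feature; groups = per-class value lists."""
    all_vals = [v for g in groups for v in g]
    mu = mean(all_vals)
    between = sum(len(g) * (mean(g) - mu) ** 2 for g in groups)
    within = sum(len(g) * pvariance(g) for g in groups)
    return between / within if within else float("inf")

# A feature whose values separate two classes scores higher than one
# whose values overlap across the classes.
well_separated = [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9]]
overlapping = [[1.0, 5.0, 3.0], [1.1, 4.9, 3.1]]
print(fisher_score(well_separated) > fisher_score(overlapping))  # True
```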