The document presents an approach called DECA (Development Email Content Analyzer) that uses natural language parsing to classify paragraphs in development emails according to intentions. DECA was evaluated on emails from two open source projects. It achieved good results for classifying email content, outperforming traditional machine learning techniques in terms of precision, recall and F-measure. The study aimed to understand how natural language parsing could help recognize informative text fragments in development discussions to guide software maintenance and evolution.
1. Development Emails Content Analyzer:
Intention Mining in Developer Discussions
Andrea Di Sorbo
Sebastiano Panichella
Corrado Visaggio
Massimiliano Di Penta
Gerardo Canfora
Harald Gall
2. Outline
Context: Written Development Discussions
Case Study: Development Mailing List of 2 Open Source Projects
Results: Automatic Classification of Relevant Contents in Developers’ Communication
7. Development Communication Means
Recommender systems:
- Bug Triaging [1]
- Suggest Mentors [2]
- Code re-documentation [3]
- Etc.
[1] Anvik et al., “Who should fix this bug?”
[2] Canfora et al., “Who is going to mentor newcomers in open source projects?”
[3] Panichella et al., “Mining source code descriptions from developer communications”
9. Development Communication Means
[1] Bacchelli et al., “Content classification of development emails”
[2] Cerulo et al., “A Hidden Markov Model to detect coded information islands in free text”
10. Different Kinds of Data
Structured
Semi-Structured
Unstructured
11. A Considerable Effort for Developers
Many messages
Developers get lost in unnecessary details, missing potentially useful information…
12. Previous Work
Hana et al.
“…Lazy” RTC occurs when a core developer post a change to a mailing lists and nobody responds, it assumed that other developers reviewed the code…”
13. Previous Work
Approaches for:
- Generating summaries of emails.
  → Lam et al.
  → Rambow et al.
- Generating summaries of bug reports.
  → Rastkar et al.
15. DECA
(Development Email Content Analyzer)
An approach to Classify Paragraphs According to Intentions
http://www.ifi.uzh.ch/seal/people/panichella/tools/DECA.html
16. Why use NLP for Classifying Paragraphs According to Intentions?
17. Example
i. We could use a leaky bucket algorithm to limit the bandwidth
ii. The leaky bucket algorithm fails in limiting the bandwidth
18. Example
Sentences i and ii share a high percentage of words.
19. Example
Sentences i and ii discuss the same topics.
21. Example
i. We could use a leaky bucket algorithm to limit the bandwidth
ii. The leaky bucket algorithm fails in limiting the bandwidth
The two sentences have different intentions.
“Techniques based on lexicon analysis, such as VSM [1], LSI [2], or LDA [3], would not be sufficient to classify paragraphs according to intentions.”
[1] Baeza-Yates et al., “Modern Information Retrieval”.
[2] de Marneffe et al., “The Stanford typed dependencies representation”.
[3] Blei et al., “Latent Dirichlet allocation”.
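To make the point about lexicon-based techniques concrete, the sketch below compares the two example sentences with a plain bag-of-words cosine similarity, the kind of measure a vector space model relies on. It is a minimal illustration, not part of DECA: the tokenizer and the helper names (`bag_of_words`, `cosine_similarity`) are assumptions made here, and no stop-word removal or stemming is applied.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# The two example sentences from the slides.
SENTENCE_I = "We could use a leaky bucket algorithm to limit the bandwidth"
SENTENCE_II = "The leaky bucket algorithm fails in limiting the bandwidth"

vec_i = bag_of_words(SENTENCE_I)
vec_ii = bag_of_words(SENTENCE_II)
shared_terms = sorted(set(vec_i) & set(vec_ii))
similarity = cosine_similarity(vec_i, vec_ii)

if __name__ == "__main__":
    print("shared terms:", shared_terms)
    print(f"cosine similarity: {similarity:.2f}")
```

The score reflects only the shared vocabulary. The words that carry the intention ("could use" versus "fails") are weighted like any other term, which is why a lexical measure alone cannot tell a proposal from a problem report.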
23. Case Study
Goal: Understanding to what extent NL parsing could be used in recognizing informative text fragments in emails from a software maintenance and evolution perspective.
Quality focus: Detection of text paragraphs in development discussions containing helpful information for developers.
Perspective: Guide developers in maintaining and evolving their products.
24. Research Questions
RQ1: Can an NLP approach (i.e., DECA) be effective in classifying writers’ intentions in development emails?
RQ2: Is DECA more effective than existing Machine Learning techniques in classifying development emails content?
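36. Why NL parsing?
Well-defined predicate-argument structures
[Figure: typed dependency parses of the two example sentences]
i. We could use a leaky bucket algorithm to limit the bandwidth
   use: nsubj(we), aux(could), dobj(algorithm), xcomp(limit)
   algorithm: det(a), amod(leaky), nn(bucket)
   limit: aux(to), dobj(bandwidth)
   bandwidth: det(the)
ii. The leaky bucket algorithm fails in limiting the bandwidth
   fails: nsubj(algorithm), prep(in)
   algorithm: det(the), amod(leaky), nn(bucket)
   in: pcomp(limiting)
   limiting: dobj(bandwidth)
   bandwidth: det(the)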
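37. NL parsing
Natural Language Templates
[Figure: dependency templates derived from the two parses]
- [someone] could use [something]
  use: nsubj([someone]), aux(could), dobj([something])
- [something] fails
  fails: nsubj([something])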
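The two templates can be read as small patterns over typed dependencies. The sketch below shows one way such matching could work. It is an illustration under stated assumptions, not DECA's implementation: the parses are transcribed by hand rather than produced by a parser, the dictionary layout of `TEMPLATES` and the functions `matches` and `classify` are hypothetical, and the intention labels are illustrative names chosen here.

```python
from typing import Dict, List, Optional, Tuple

# A typed dependency as (governor, relation, dependent).
Dependency = Tuple[str, str, str]

# Parses of the two example sentences, transcribed by hand (no parser is called).
PARSE_I: List[Dependency] = [
    ("use", "nsubj", "we"), ("use", "aux", "could"),
    ("use", "dobj", "algorithm"), ("use", "xcomp", "limit"),
    ("algorithm", "det", "a"), ("algorithm", "amod", "leaky"),
    ("algorithm", "nn", "bucket"),
    ("limit", "aux", "to"), ("limit", "dobj", "bandwidth"),
    ("bandwidth", "det", "the"),
]
PARSE_II: List[Dependency] = [
    ("fails", "nsubj", "algorithm"), ("fails", "prep", "in"),
    ("algorithm", "det", "the"), ("algorithm", "amod", "leaky"),
    ("algorithm", "nn", "bucket"),
    ("in", "pcomp", "limiting"), ("limiting", "dobj", "bandwidth"),
    ("bandwidth", "det", "the"),
]

# Hypothetical template encoding: a root word plus the relations it must govern.
# None marks a placeholder such as [someone] or [something]; a string must match
# the dependent exactly. The labels are illustrative assumptions.
TEMPLATES: List[Dict] = [
    {"label": "solution proposal", "root": "use",
     "slots": {"nsubj": None, "aux": "could", "dobj": None}},
    {"label": "problem discovery", "root": "fails",
     "slots": {"nsubj": None}},
]


def matches(template: Dict, parse: List[Dependency]) -> bool:
    """Return True if the template's root governs every required relation."""
    children = {rel: dep for gov, rel, dep in parse if gov == template["root"]}
    for relation, required_word in template["slots"].items():
        if relation not in children:
            return False
        if required_word is not None and children[relation] != required_word:
            return False
    return True


def classify(parse: List[Dependency]) -> Optional[str]:
    """Return the label of the first matching template, or None."""
    for template in TEMPLATES:
        if matches(template, parse):
            return template["label"]
    return None


if __name__ == "__main__":
    print(classify(PARSE_I))
    print(classify(PARSE_II))
```

Because the templates look at the predicate and its arguments rather than at the vocabulary, the shared noun phrase "leaky bucket algorithm" plays no role in the decision.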
55. RQ2: Is the proposed approach more effective than existing ML in classifying development emails content?
56. ML for Email Classification
An Approach Based on ML for Email Content Classification
→ Antoniol et al., CASCON 2008
→ Zhou et al., ICSME 2014
60. ML for Email Classification
An Approach Based on ML for Email Content Classification
1) Text Features
2) Split training and test sets
3) Oracle building
4) Classification: training and prediction
→ Antoniol et al., CASCON 2008
→ Zhou et al., ICSME 2014
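The four steps of the ML baseline can be sketched end to end as follows. This is a toy illustration under several assumptions: the slides do not name the learner used, so a multinomial Naive Bayes classifier with add-one smoothing stands in for it; the labelled paragraphs, the two labels, and the function names are invented here, and the result on such a tiny set says nothing about real performance.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

# 3) Oracle: manually labelled paragraphs. Toy data invented for illustration.
ORACLE: List[Tuple[str, str]] = [
    ("We could use a leaky bucket algorithm to limit the bandwidth", "proposal"),
    ("Maybe we should add a cache in front of the parser", "proposal"),
    ("We could try a smaller buffer here", "proposal"),
    ("The leaky bucket algorithm fails in limiting the bandwidth", "problem"),
    ("The build fails on the latest commit", "problem"),
    ("The parser crashes when the input is empty", "problem"),
]

# 2) Split training and test sets (fixed split so the example is deterministic).
TRAINING_SET = [ORACLE[0], ORACLE[1], ORACLE[3], ORACLE[4]]
TEST_SET = [ORACLE[2], ORACLE[5]]

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def features(text: str) -> Counter:
    """1) Text features: bag of lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def train(examples: List[Tuple[str, str]]) -> Model:
    """4a) Training: count documents per class and words per class."""
    class_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocabulary: Set[str] = set()
    for text, label in examples:
        feats = features(text)
        class_counts[label] += 1
        word_counts[label].update(feats)
        vocabulary.update(feats)
    return class_counts, word_counts, vocabulary


def predict(model: Model, text: str) -> str:
    """4b) Prediction: pick the class with the highest log-posterior."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = "", -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word, count in features(text).items():
            if word not in vocabulary:
                continue  # words unseen in training are ignored
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += count * math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_SET)
predictions = [predict(model, text) for text, _ in TEST_SET]

if __name__ == "__main__":
    for (text, expected), predicted in zip(TEST_SET, predictions):
        print(f"{predicted:9s} (expected {expected}): {text}")
```

Note that the classifier sees only word counts, so its decisions depend on which words happened to appear in the training paragraphs of each class.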
71. Summary
• RQ1: the automatic classification performed by DECA achieves very good results in terms of precision, recall and F-measure (over all the experiments).
• RQ2: DECA outperforms traditional ML techniques in terms of recall, precision and F-measure when classifying e-mail content.
“…it took the MSR community more than 10 years to figure out that machine learning is not the best method for analyzing human-written text. Thank you for helping move the field forward…” [One of the ASE Reviewers]
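For reference, the three metrics named in both research questions can be computed per category as in the sketch below. The function name and the sample labels are assumptions made for illustration; the labels are made-up data, not results from the study.

```python
from typing import List, Tuple


def precision_recall_f_measure(
    true_labels: List[str], predicted_labels: List[str], target: str
) -> Tuple[float, float, float]:
    """Precision, recall and F-measure (harmonic mean) for one category."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up labels for five paragraphs (illustrative only).
TRUE = ["problem", "problem", "proposal", "proposal", "problem"]
PREDICTED = ["problem", "proposal", "proposal", "proposal", "problem"]

precision, recall, f_measure = precision_recall_f_measure(TRUE, PREDICTED, "problem")

if __name__ == "__main__":
    print(f"precision={precision:.2f} recall={recall:.2f} F-measure={f_measure:.2f}")
```

Precision penalizes paragraphs wrongly assigned to a category, recall penalizes paragraphs of that category that were missed, and the F-measure balances the two.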
73. Code re-documentation
→ Panichella et al., ICPC 2012
Extract methods’ descriptions from developers’ discussions
→ Vector Space Models
→ ad hoc heuristics
“… several are the discourse patterns that characterize false negative method descriptions…”
88. Future work
1) DECA as preprocessing support to discard irrelevant sentences in summarization approaches
2) DECA in combination with topic models for mining contents with the same intentions and the same topics