Of Bugs and Men (and Plugins too)
1. Of Bugs and Men
(and Plugins too)
Michel Wermelinger, Yijun Yu
The Open University, UK
Markus Strohmaier
Technical University Graz, Austria
2. Plugins
Working Conf. on
Mining Softw. Repositories 2008
Int’l Conf. on Softw. Maintenance 2008
3. Motivation & Method
What is the validity, generality and usefulness
of design principles?
Study long-term evolution
Study architectural evolution
Study complex systems
Case study: Eclipse
modern CBS with reusable, extensible
components
4. Eclipse
Static dependency: X depends on Y
Dynamic dependency:
X uses extension points provided by Y
Self-cycles possible
We analysed whole Eclipse SDK (JDT, PDE, etc)
5. Eclipse releases
Various types of releases
Major (e.g. 3.1) and maintenance releases (e.g. 3.1.1)
Milestones (3.2M1) and Release candidates (3.2RC1)
Maintenance of current major release in parallel with milestones
and release candidates of next one
We analysed
20 major and maintenance releases over 6 years (1.0 to
3.3.1.1)
27 milestone and release candidates over 2 years (3.1 to 3.3)
grouped in 2 sequences: 1.0 – 3.1 and 3.1 – 3.3.1.1
7. Some Research Questions
Is there continuous growth (Lehman’s 6th
law)?
Is there any pattern (e.g. superlinear growth)?
Does complexity increase (Lehman’s 2nd
law)?
Is there any effort to reduce it?
Does coupling decrease?
Does cohesion increase?
8. Modules
A simple structural model
Module = directed graph
Elements = internal or external
Arcs = internal or external relations
External elements and arcs show context
For Eclipse SDK module
elements = plugins or external components
arcs = static and/or dynamic dependencies
9. Module measures
Size = # internal elements
NIP = number of internal plugins
Complexity = # internal arcs
NISD/NIDD = number of internal static/dynamic
dependencies
Cohesion = complexity / size
Coupling = # external arcs
NESD (NEDD is always zero)
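A minimal sketch of how the four measures above could be computed from a list of dependencies. The function name, the toy plugin names and the arc list are hypothetical; the slides do not show the authors' scripts.

```python
from typing import Iterable, Set, Tuple

def module_measures(internal: Set[str], arcs: Iterable[Tuple[str, str]]) -> dict:
    """Compute size, complexity, cohesion and coupling of a module.

    internal: names of the internal elements (e.g. plugins)
    arcs: (source, target) dependencies; an arc counts as internal only
          if both endpoints are internal elements, otherwise as external.
    """
    arcs = set(arcs)  # ignore duplicate dependencies
    internal_arcs = {(s, t) for s, t in arcs if s in internal and t in internal}
    external_arcs = arcs - internal_arcs
    size = len(internal)
    complexity = len(internal_arcs)
    return {
        "size": size,                                   # e.g. NIP
        "complexity": complexity,                       # e.g. NISD or NIDD
        "cohesion": complexity / size if size else 0.0,
        "coupling": len(external_arcs),                 # e.g. NESD
    }

# Hypothetical toy module: three plugins and one external component.
plugins = {"ui", "core", "help"}
deps = [("ui", "core"), ("help", "ui"), ("ui", "ui"), ("core", "xerces")]
measures = module_measures(plugins, deps)
print(measures)
```

The self-cycle ("ui", "ui") counts as an internal arc here, which matches the model allowing self-cycles but is otherwise an assumption of this sketch.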
10. Size Evolution (1)
Number of plugins kept, added, deleted w.r.t. previous release
Number kept since initial release → stable architectural core
Segmented growth
Overall 4- to 5-fold growth, but not superlinear
Many changes in 3.0; few deletions overall
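The kept/added/deleted counts above are plain set differences between consecutive releases. A sketch under the assumption that each release is given as a set of plugin names (the release data below is invented for illustration):

```python
def release_diffs(releases):
    """releases: list of (name, set_of_plugins) in chronological order.

    Returns one row per release after the first, with the number of
    plugins kept, added and deleted w.r.t. the previous release, plus the
    number kept since the initial release (the candidate stable core).
    """
    rows = []
    core = set(releases[0][1])
    for (_, prev), (name, curr) in zip(releases, releases[1:]):
        core &= curr  # a plugin that ever disappears leaves the core for good
        rows.append({
            "release": name,
            "kept": len(prev & curr),
            "added": len(curr - prev),
            "deleted": len(prev - curr),
            "core": len(core),
        })
    return rows

# Hypothetical history, not Eclipse data.
history = [
    ("1.0", {"a", "b", "c"}),
    ("2.0", {"a", "b", "d", "e"}),
    ("3.0", {"a", "d", "e", "f"}),
]
rows = release_diffs(history)
for row in rows:
    print(row)
```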
11. Size Evolution (2)
Long equilibrium and short punctuation periods
Equilibrium: changes accommodated within current architecture
Punctuation: changes require architectural revisions
mostly in milestones
some in release candidates
hardly in maintenance
12. Architectural core
jdt.ui
jdt.launching
jdt.doc.isv jdt.doc.user pde.doc.user platform.doc.isv platform.doc.user help.ui pde.runtime ant.ui search compare pde.core debug.ui jdt.debug
help pde ui ant.core jdt.core debug.core
core.runtime swt core.resources
core with static and dynamic dependencies
self-cycles point to reuse of extension points
layered architecture
core is >40% of release 1.0 and ca. 10% of 3.3.1.1
13. Complexity Evolution
Charts show NISD (left) and NIDD (right)
Release 3.1 is major restructuring
Static dependencies decreased by 19%
Plugins increased by 57%
More deletions, i.e. effort to reduce complexity
14. Cohesion evolution (1)
Size (left) and complexity (right) grow in step
Two exceptions
Release 3.0 maintains size
Release 3.1 reduces complexity
15. Cohesion evolution (2)
Result: cohesion slightly decreases over time
Except for major increase during 3.0.* releases
Independently of static, dynamic, or both dependencies
Low cohesion: <3 (incoming or outgoing) dependencies per plugin
explicit effort to keep architecture loosely cohesive?
16. Coupling Evolution
Charts show NESD
Refactoring in 3.0:
All existing external dependencies removed via new internal
proxies
External component org.apache.xerces was removed
Overall, coupling is small compared to size and
complexity
17. Acyclic Dependency Principle
Dependency graph should be acyclic [Martin 96 and others]
decreases change propagation
eases release management and work allocation
Measured cycle length over joint dependency graph
Graph shows segmented growth of harmless self-cycles (length 1)
Single cycle with length > 1 was broken apart in release 3.0
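One possible way to look for cycles over the joint (static plus dynamic) dependency graph is sketched below. It uses mutual reachability to group nodes that lie on a common cycle, which is a technique chosen for this sketch and not necessarily how the measurement was done; node names are hypothetical.

```python
from collections import defaultdict

def find_cycles(static_deps, dynamic_deps):
    """Return (self_cycles, cyclic_groups) for the union of both graphs.

    self_cycles: sorted nodes with an arc to themselves (cycle length 1)
    cyclic_groups: sets of 2+ nodes that can all reach each other,
                   i.e. each group contains a cycle of length > 1
    """
    succ = defaultdict(set)
    for src, dst in set(static_deps) | set(dynamic_deps):
        succ[src].add(dst)
        succ[dst]  # register the target as a node too

    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    nodes = list(succ)
    reach = {n: reachable(n) for n in nodes}
    self_cycles = sorted(n for n in nodes if n in succ[n])

    groups, assigned = [], set()
    for n in nodes:
        if n in assigned:
            continue
        group = {n} | {m for m in reach[n] if n in reach[m]}
        assigned |= group
        if len(group) > 1:
            groups.append(group)
    return self_cycles, groups

# The cycle a -> b -> c -> a only exists in the joint graph.
static = [("a", "b"), ("b", "c")]
dynamic = [("c", "a"), ("d", "d")]
self_cycles, groups = find_cycles(static, dynamic)
print(self_cycles, groups)
```

Computing reachability from every node is quadratic, which is acceptable for a few hundred plugins; a strongly-connected-components algorithm would scale better.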
18. Stable Dependency Principle
dependencies should be in direction of stability
[Martin 97]
changes propagate opposite to dependencies
if A depends on B, A can’t be harder to change than B
instability of element = fanout / (fanin + fanout)
irresponsible: fanin = 0, instability = 1, may change
independent: fanout = 0, instability = 0, no reason to
change
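The instability formula and the notion of an SDP violation can be sketched as follows. Two choices here are assumptions of the sketch, not taken from the slides: self-dependencies are skipped when counting fanin and fanout, and an arc A → B is flagged only when A is strictly more stable than B. All names are hypothetical.

```python
from collections import defaultdict

def instability(elements, deps):
    """Map each element to fanout / (fanin + fanout).

    The value is None when the measure is undefined, i.e. the element has
    neither incoming nor outgoing dependencies.
    """
    fanin, fanout = defaultdict(int), defaultdict(int)
    for src, dst in set(deps):
        if src != dst:  # assumption: self-dependencies are not counted
            fanout[src] += 1
            fanin[dst] += 1
    result = {}
    for e in set(elements) | set(fanin) | set(fanout):
        total = fanin[e] + fanout[e]
        result[e] = fanout[e] / total if total else None
    return result

def sdp_violations(elements, deps):
    """Arcs A -> B where A is more stable (lower instability) than B."""
    inst = instability(elements, deps)
    return sorted(
        (a, b) for a, b in set(deps)
        if a != b and inst[a] < inst[b]
    )

elements = {"a", "b", "c", "d", "e", "f", "g"}
deps = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("d", "f")]
inst = instability(elements, deps)
violations = sdp_violations(elements, deps)
print(inst)
print(violations)
```

In this toy graph "a" is irresponsible (instability 1), "e" is independent (instability 0), "g" has no defined instability, and "c" depends on the less stable "d", which is the single violation.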
19. SDP Evolution
Charts show number of SDP violations
Absolute (left) and relative (right)
static, dynamic and both dependencies
Numbers kept low, with ratio tending to decrease
1-5% violations for static dependencies, 9-17% for dynamic
20. Changeability measures
slight adaptation of [van Belle 04]
likelihood of changing an element
# of actual changes / max possible #
impact of an element’s changes
avg # of elements changed with it
acuteness = impact / likelihood
high for interfaces, low for method bodies
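A worked sketch of the three measures, assuming the change history is available as one set of changed elements per release transition. Counting only the other elements in a co-change set towards impact is an assumption of this sketch, and the history is invented.

```python
def changeability(change_sets, element):
    """Return (likelihood, impact, acuteness) for one element.

    change_sets: one set of changed elements per release transition.
    likelihood: fraction of transitions in which the element changed
    impact: average number of other elements that changed with it
    acuteness: impact / likelihood (None if the element never changed)
    """
    hits = [cs for cs in change_sets if element in cs]
    if not hits:
        return 0.0, 0.0, None
    likelihood = len(hits) / len(change_sets)
    impact = sum(len(cs) - 1 for cs in hits) / len(hits)
    return likelihood, impact, impact / likelihood

# Hypothetical history over four transitions.
history = [{"api", "impl", "ui"}, {"impl"}, {"impl", "ui"}, {"impl"}]
api = changeability(history, "api")
impl = changeability(history, "impl")
print(api, impl)
```

The rarely changing "api" gets a high acuteness (it changes seldom, but others change with it), while the frequently changing "impl" gets a low one, in line with the interface versus method body contrast above.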
21. Changes and Stability (1)
changes and stability are related
responsible elements: high change impact
independent elements: low change likelihood
stable elements: high change acuteness
van Belle: correlational linkage
implicit, from co-change observation
takes change propagation closure into account
Martin: causal linkage
must be given explicitly
only looks at immediate neighbours
22. Changes and Stability (2)
measured fanin/fanout of the 69 plugins in release 2.0
measured impact/likelihood of same plugins over next 45
releases
normalised measures, ordered plugins by fanin and fanout
lower fanin ⇒ less responsible ⇒ lower impact: not quite so
lower fanout ⇒ less dependent ⇒ lower likelihood: somewhat
23. Changes and Stability (3)
measured instability when defined (52 plugins in 2.0)
All but one irresponsible and independent plugins remained so over time
higher instability ⇒ lower acuteness: mixed
some trend but many exceptions
likelihood vs independence is better than impact vs responsibility
static causal linkage can’t predict future correlational linkage
former only accounts for internal drives, latter includes external drives
24. Conclusions (1)
Successful evolution of Eclipse due to…?
systematic architectural change process
segmented growth of size and complexity
cohesion kept low; cycles removed
SDP violations and coupling reduced
significant stable layered architectural core
Some consistency between causal and
correlational changeability measures
25. Conclusions (2)
many design principles/guidelines proposed, but…
no empirical evidence of usefulness for maintenance
selected representative case study
large, complex, successful, component-based system
accurate architectural information + enough evolution history
generic and lightweight approach
no reverse engineering, no static code analysis
modules and changeability measures
flexible scripting tool manipulating text files with relational data
potential practical implications of findings
confirmed some laws and principles; observed some patterns
investigated static and historic changeability measures
26. Bugs and Men
New Ideas and Emerging Results track
of Int’l Conf. on Software Eng. 2009
27. Motivation
Software engineering is socio-technical activity
Global and open source software development led to
increased interest in and relevance of social aspects
Need for representing socio-technical relations
Bipartite graphs of software artefacts and people
Ad-hoc arc semantics, depending on relation
Ad-hoc flat layout, often hard to read
Relevant relations lost among many nodes and arcs
Sought improvements:
More compact, intuitive, and explicit representation
Distinguish ‘hierarchical’ importance of artefacts, people
and their relations.
28. General Approach
Obtain a bipartite socio-technical network
Compute socio-technical concept lattice
Apply formal concept analysis (FCA) theory
Use free tool ConExp (Concept Explorer)
Concept: clusters all artefacts associated to same
people
Hierarchy: partial ordering of clusters
Study different and evolving socio-technical
relations
Repeat for various relations and system releases
29. Case study
Requirements:
Should have non-trivial social and technical
structure
Should not have fluid social structure
Should provide different data sources (not just
code)
Eclipse
Has IBM lead and Bugzilla repository
30. The socio-technical network (1)
Build PBC network
P nodes: 16,025 people
B nodes: 101,966 Eclipse SDK bug reports
C nodes: 16 Eclipse SDK components
p-b arc: p reported/assigned to/discussed b
b-c arc: b is reported for c
Repeat for various releases and roles
31. The socio-technical network (2)
Build the PC network
Folding of PBC, i.e. p-c arc with weight b
person p is associated to b reports for
component c
Number of paths from p to c
Build the PC(k) network
Remove all arcs with weight < k
Remove all weight information
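The folding and thresholding steps can be sketched as below, assuming each bug report belongs to exactly one component (so the b-c arcs fit in a dictionary). The people, bug numbers and component names are made up.

```python
from collections import Counter

def fold_pbc(person_bug, bug_component):
    """Fold the PBC network into weighted p-c arcs.

    person_bug: iterable of (person, bug) arcs
    bug_component: dict mapping each bug to its component (assumption:
                   one component per bug report)
    Returns a Counter {(person, component): number of bug reports}.
    """
    weights = Counter()
    for person, bug in set(person_bug):
        weights[(person, bug_component[bug])] += 1
    return weights

def threshold(weights, k):
    """PC(k): keep arcs with weight >= k and drop the weights."""
    return {arc for arc, w in weights.items() if w >= k}

pb = [("ann", 1), ("ann", 2), ("ann", 3), ("bob", 3), ("bob", 4)]
bc = {1: "ui", 2: "ui", 3: "core", 4: "core"}
pc = fold_pbc(pb, bc)
pc2 = threshold(pc, 2)
print(pc)
print(pc2)
```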
32. Formal Concept Analysis
Given objects O and attributes A and relation O × A
e.g. O = components, A = assignees
Concept c = (o ⊆ O, a ⊆ A)
each object in o has all attributes a
o is the extent and a is the intent of the concept
Hierarchy: (o, a) ≤ (o’, a’) if o ⊆ o’ (or a’ ⊆ a)
From top to bottom: extent decreases, intent increases
Socio-technical concept lattice
Usually, people at level n (bottom=0) associated to n components
‘specialists’ at lower, ‘generalists’ at upper levels
Each node includes all its ancestors’ people and all its
descendants’ components
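For small contexts, the concepts can be enumerated directly by closing the objects' attribute sets under intersection. This is a naive sketch for illustration only (the study used ConExp); the components and people are hypothetical.

```python
def concepts(context):
    """Enumerate all formal concepts of a small context.

    context: dict mapping each object to its set of attributes.
    Returns (extent, intent) pairs as frozensets, ordered from the largest
    extent (top of the lattice) to the smallest (bottom).
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is the full attribute set or an intersection of rows.
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & known for known in intents}
    result = []
    for intent in intents:
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        result.append((extent, intent))
    return sorted(result, key=lambda c: len(c[0]), reverse=True)

# O = components, A = assignees (toy data).
context = {
    "ui": {"ann", "bob"},
    "core": {"bob", "cy"},
    "debug": {"bob"},
}
lattice = concepts(context)
for extent, intent in lattice:
    print(sorted(extent), sorted(intent))
```

Here "bob" sits in the top concept because he is associated to all three components, while "ann" and "cy" appear one level above the bottom, each tied to a single component.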
33. Release 1.0, assignees, k=10
USA coordinating
2 Canadian teams?
only 4 ‘generalists’
(2 components each)
the French team
only 1 developer associated:
what if they leave project?
the Swiss team
most developers associated:
is this largest or most
complex component?
34. Release 3.0, assignees, k=100
only 2 ‘generalists’
(3 components each)
Common developers:
highly dependent
components?
Used higher k because bug reports accumulate over time
Geographical and workload distribution like release 1.0
35. Release 3.0, discussants, k=100
Developers discuss more
components than they
are assigned to: due to
dependencies?
Developers don’t discuss all reports they are assigned to
36. Conclusions
Novel application of Formal Concept Analysis
Clustering and ordering of socio-technical relations
General tool-supported approach
Some advantages over bipartite graphs
More scalable: not one node per person and artefact
More explicit: related people & artefacts in same node
More intuitive: uniform vertical layout & arc semantics
Helps spot expertise and potential problems
Generalist and specialist people
Artefacts with too many or too few people associated
Undesired or absent communication/coordination
37. Concluding conclusions
Software engineering is inherently socio-technical
endeavour
Availability of FLOSS projects allows the study of
historical heterogeneous data
Used process and artefact data to present different
views on same case study
Evolution of architecture
Hierarchy of maintainers
Impact of dependencies
Opportunities for many studies, mining and
visualisation techniques that can help academics,
developers and managers
Editor's notes
Men: women included
Removed: Haruhiko Kaiya, Shinshu University, Japan
Long-term evolution should provide more insight than looking at single snapshots
Lessons should be useful to managers and developers
We haven’t studied features yet
Note that dynamic dependencies are at architectural level, they don’t capture object run-time calls
Graph shows chronological sequence (left to right) and logical sequence (arrows)
rationale for split: we have no M or RC releases available before 3.1
Metadata: plugin.xml, MANIFEST.MF, feature.xml
Release system number Provides <release> <plugin> <ext pt>
Want to verify some laws and principles
For cohesion to be within [0,1] it should be complexity / size², but we don’t expect architecture to be complete graph
NEDD is zero because external components don’t have extension points
Did deletions break 3rd-party applications / plugins?
Superlinear: observed in some OSS code, but probably doesn’t make sense for architecture
highest growth from 1.0 to 2.0
This figure is the ‘join’ of the two figures in the paper
Remind audience that cohesion = complexity / size
This may be due to nature of eclipse, with various subprojects (JDT, PDE, etc.)
Note different scale on right bar chart
Preparation of 3.3 first reduces then increases coupling but unclear significance of this
We took union of dependencies to make cycle finding more likely
The cycle’s length was 3: ui, ui.editors, ui.workbench.editor.
Cycle was removed by moving one component to other feature
rationale: if B changes, A may have to change too, so should be less resistant to change than B
this slide is what van Belle argues
this slide is about practical lessons from a system that has many users and remains useful, with a flexible architecture
Systematic process: changes mostly in milestones
An increasing cohesion and a superlinear growth might be counter-productive for architecture; restructurings show active effort to maintain certain characteristics
the last point means that increasing stability might somewhat reduce acuteness of changes
even principle that wasn’t confirmed (increasing cohesion) makes sense in this context
patterns: stable core, segmented growth
future work: more case studies (incl unsuccessful systems), more principles, more info sources (eg Bugzilla data)
also: repeat acuteness vs instability study but separating plugins that follow SDP from those that don’t
Took from http://archive.eclipse.org – arch birt callisto datatools dsdp eclipse modeling rt technology tools tptp webtools
Extracted from CVS by date, not by tag
1.0 is only Eclipse SDK
Incl. static+dynamic
Blue=total, red=unused
Could they be used outside these 11 projects?
Only xml and jar binaries imported
Feedback task: to ask Eclipse developers
Left dotted arrow is to remove duplicate and malformed XML files
One group only looks at individual dimensions, another at 2, another at 3
Left Platform, right PDE; bugs ordered by time and attrib value; frames can be resized; red/bright/saturated means problem
One can see that middle severity and priority values are used
Early bugs reopened and higher priority and severity; recent bugs mostly only assigned
To do: present as pixmap with labels while hovering mouse
Same legend in all pictures, so ordering is by status (hue) then priority (saturation) on the top, in the middle is hue then brightness, on bottom is first saturation then brightness
This makes it possible to see e.g. the priorities of the resolved but not verified bugs: mostly the lowest, i.e. probably bugs verified by priority
Most enhancements already verified
Relatively few bright saturated colours (middle range values again)
Ordering by priority shows the high-priority bugs are in the verified/closed/reopened state
The tool allows generating all possible orderings and takes user defined value scale
Hs = hue vs saturation correlation (status vs priority)
Group 2 noted best the higher sp correlation for Platform, but the least for ps
Group 3 noted best the ps correlation, but thought that PDE had better ss correlation
Group 1 performs best in the ss correlation but it’s the one where there is hardly any correlation and hence multiple dimensions are not actually needed
arc semantics: subset
Release: cumulative due to lack of history
Each person fixed 10+ bugs of given component
Reduced labelling: each node has attributes of ancestors and objects of descendants
French team only on jdt:core, Swiss team on jdt:ui mostly
Few people more than one component, e.g. is Kai-Uwe a bridge element between Canadian and Swiss team?
Release 3.0 has many more components
Keeping same threshold is a way to see how many more people are contributing
Increasing threshold is a way to take accumulated bugs into account (our graphs are cumulative)
Note that some components may not have enough bugs to be shared by many developers
Each layer is one more component: Daniel Megert also climbed the hierarchy
Same US developers work on platform:debug, ant, jdt:debug -> component dependency? Too much reliant on same people?
Fewer people and components than in assignees lattice, hence statement on slide
People discuss more components than those they are assigned to -> component dependencies?
Novel: artefacts are objects, People are attributes
Shared contributors: fold PC(k) by computing PC(k)ᵀ * PC(k) and removing the diagonal (i.e. self loops)
Code dependency: only for 3.3.1.1, when Bugzilla snapshot was produced
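A sketch of the shared-contributors folding, written over arc sets instead of matrices. It yields the off-diagonal entries of the product of the transposed PC(k) matrix with PC(k); the function name and data are hypothetical.

```python
from collections import Counter, defaultdict

def shared_contributors(pc_arcs):
    """Fold PC(k) into a component-component network.

    pc_arcs: set of (person, component) arcs of PC(k).
    Returns a Counter {(c1, c2): number of people associated to both},
    for c1 != c2 only, i.e. with the diagonal (self loops) removed.
    """
    by_person = defaultdict(set)
    for person, component in pc_arcs:
        by_person[person].add(component)
    weights = Counter()
    for components in by_person.values():
        for c1 in components:
            for c2 in components:
                if c1 != c2:
                    weights[(c1, c2)] += 1
    return weights

pc_k = {("ann", "ui"), ("ann", "core"), ("bob", "core"),
        ("bob", "ui"), ("cy", "debug")}
cc = shared_contributors(pc_k)
print(cc)
```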
This is for k=128
weights not shown: number of contributors associated to at least k requests for each component
Why is number of people in paper (22,532) different from ICSM’08 value?
This is for both dependencies; two more networks computed
weights not shown: how many plugins of A depend on B (and vice-versa?)
Random graphs have 16 nodes and similar number of edges to CC; weights generated randomly?
Correlation is between CR and CC (why?)