Featureous: An Integrated Approach To Location, Analysis And Modularization Of Features In Java Applications
1. Ph.D. Defense:
Featureous: An Integrated Approach
To Location, Analysis And Modularization
Of Features In Java Applications
Andrzej Olszak
2. Road to this defense
[Figure: latitude (0-90) plotted against year (2007-2012)]
3. Publications
Main Publications
• [TOOLS’12] Olszak, A., Bouwers, E., Jørgensen, B. N. and Visser, J. (2012). Detection of Seed Methods for Quantification of Feature
Confinement. In TOOLS’12: Proceedings of TOOLS Europe: Objects, Models, Components, Patterns, Springer LNCS, Vol. 7304, pp.
252-268.
• [CSMR’12] Olszak, A. and Jørgensen, B. N. (2012). Modularization of Legacy Features by Relocation and Reconceptualization: How
Much is Enough? In CSMR’12: Proceedings of the 16th European Conference on Software Maintenance and Reengineering, IEEE
Computer Society Press, pp. 171-180.
• [SCICO’12] Olszak, A. and Jørgensen, B. N. (2012). Remodularizing Java Programs for Improved Locality of Feature Implementations
in Source Code. Science of Computer Programming, Elsevier, Vol. 77, no. 3, pp. 131-151.
• [CSIS’11] Olszak, A. and Jørgensen, B.N. (2011). Featureous: An Integrated Environment for Feature-centric Analysis and Modification
of Object-oriented Software. International Journal on Computer Science and Information Systems, Vol. 6, No. 1, pp. 58-75.
• [SEA’10] Olszak, A. and Jørgensen, B. N. (2010). A unified approach to feature-centric analysis of object-oriented software. In SEA’10:
Proceedings of the IASTED International Conference Software Engineering and Applications, ACTA Press, pp. 494-503.
• [AC’10] Olszak, A. and Jørgensen, B. N. (2010). Featureous: Infrastructure for feature-centric analysis of object-oriented software. In
AC’10: Proceedings of IADIS International Conference Applied Computing, pp. 19-26.
• [FOSD’09] Olszak, A. and Jørgensen, B. N. (2009). Remodularizing Java programs for comprehension of features. In FOSD’09:
Proceedings of the First International Workshop on Feature-Oriented Software Development, ACM, pp. 19-26.
Short Papers and Tool Demo Papers
• [WCRE’11a] Olszak, A., Jensen, M. L. R. and Jørgensen, B. N. (2011). Meta-Level Runtime Feature Awareness for Java. In WCRE’11:
Proceedings of the 18th Working Conference on Reverse Engineering, IEEE Computer Society Press, pp. 271-274.
• [WCRE’11b] Olszak, A. and Jørgensen, B. N. (2011). Understanding Legacy Features with Featureous. In WCRE’11: Proceedings of the
18th Working Conference on Reverse Engineering, IEEE Computer Society Press, pp. 435-436.
• [ICPC’10] Olszak, A. and Jørgensen, B. N. (2010). Featureous: A Tool for Feature-Centric Analysis of Java Software. In ICPC’10:
Proceedings of IEEE 18th International Conference on Program Comprehension, IEEE Computer Society Press, pp. 44-45.
4. Agenda
1. Motivation
2. Overview of Featureous
3. Feature Location
4. Feature-Oriented Analysis
5. Demo
6. Feature-Oriented Remodularization
7. Automating Measurement of Features
8. Conclusions
6. The Role of Modularity
•There are various criteria for dividing software into modules
•‘Proper’ modularization facilitates:
• Comprehension: understanding systems one module at a time
• Change: modifying modules independently
• Work division: dividing work along module boundaries
11. The Role of Features
[Figure: distribution of software lifecycle effort. Initial development: 10-40%. Evolution & maintenance: 60-90% (functionality-oriented evolutionary changes: 75%; other evolutionary changes: 25%)]
• Feature – unit of user-identifiable functionality of software
• Feature-oriented change in a nutshell:
– User → Request → Developer → Code
12. Features in OO software
• Features as inter-class collaborations:
– Implicit mappings and boundaries
– Scattered (increased change scope and delocalization effects)
– Tangled (increased change propagation and interleaving effects)
RQ: How can features of Java applications be located, analyzed and
modularized to support comprehension and modification during
software evolution?
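Scattering and tangling can be illustrated with plain counts over a feature-to-class mapping. The sketch below is only an illustration under stated assumptions, not the normalized metrics used by Featureous: scattering of a feature is taken as the number of classes that implement it, and tangling of a class as the number of features it contributes to. The class name and the example mapping are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration: raw scattering and tangling counts.
class ScatteringTanglingSketch {

    // Scattering: for each feature, how many classes implement it.
    static Map<String, Integer> scattering(Map<String, Set<String>> featureToClasses) {
        Map<String, Integer> result = new TreeMap<>();
        featureToClasses.forEach((feature, classes) -> result.put(feature, classes.size()));
        return result;
    }

    // Tangling: for each class, how many features it contributes to.
    static Map<String, Integer> tangling(Map<String, Set<String>> featureToClasses) {
        Map<String, Integer> result = new TreeMap<>();
        for (Set<String> classes : featureToClasses.values()) {
            for (String cls : classes) {
                result.merge(cls, 1, Integer::sum);
            }
        }
        return result;
    }

    // Made-up mapping used only for demonstration.
    static Map<String, Set<String>> exampleMapping() {
        Map<String, Set<String>> mapping = new HashMap<>();
        mapping.put("Save", Set.of("Document", "FileWriter"));
        mapping.put("Print", Set.of("Document", "Printer", "PageLayout"));
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> mapping = exampleMapping();
        System.out.println("Scattering: " + scattering(mapping));
        System.out.println("Tangling:   " + tangling(mapping));
    }
}
```

In this made-up mapping, Document is the only tangled class because both features touch it, while Print is the more scattered feature.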
17. The Challenge of Feature Location
• Feature location – identifying source code units that
implement features
• Manual approaches
– Problems with scaling and reproducibility
• Existing semi-automated approaches
– Require dedicated test suites
– Rely on artifacts other than source code
18. The Method
• Dynamic analysis – execution tracing
– Resolutions of polymorphism and conditionals
• The notion of feature-entry points
– Annotated “entrances to features” to guide tracing
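The idea of annotated feature-entry points guiding execution tracing can be sketched as follows. This is a minimal illustration, not the Featureous tracer: the annotation name FeatureEntryPoint, the Editor and Storage interfaces, and the use of JDK dynamic proxies as a stand-in for real execution tracing are all assumptions made for the example. Only calls made through the proxied interfaces are observed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical annotation marking a method as an "entrance" to a feature.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface FeatureEntryPoint {
    String value();
}

// Hypothetical application interfaces used only for the demonstration.
interface Storage {
    void write(String text);
}

interface Editor {
    @FeatureEntryPoint("Save")
    void save();

    void redraw(); // not an entry point; traced only when called inside a feature
}

class FeatureTracerSketch {
    // Feature name -> methods executed while that feature was active.
    static final Map<String, Set<String>> traces = new TreeMap<>();
    // Stack of currently active features (single-threaded sketch).
    static final Deque<String> active = new ArrayDeque<>();

    // Wraps a target so that every call through the interface is observed.
    @SuppressWarnings("unchecked")
    static <T> T traced(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            FeatureEntryPoint entry = method.getAnnotation(FeatureEntryPoint.class);
            if (entry != null) {
                active.push(entry.value()); // entering a feature
            }
            try {
                if (!active.isEmpty()) {
                    traces.computeIfAbsent(active.peek(), k -> new LinkedHashSet<>())
                          .add(iface.getSimpleName() + "." + method.getName());
                }
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the original exception
            } finally {
                if (entry != null) {
                    active.pop(); // leaving the feature
                }
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    static Map<String, Set<String>> demo() {
        traces.clear();
        Storage storage = traced(Storage.class, text -> { /* pretend to write */ });
        Editor editor = traced(Editor.class, new Editor() {
            public void save() { storage.write("content"); }
            public void redraw() { }
        });
        editor.redraw(); // outside any feature: not recorded
        editor.save();   // entry point: the call and its callee are recorded under "Save"
        return traces;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the trace is collected at run time, the recorded callee is the one actually dispatched to, which is how dynamic analysis resolves polymorphism and conditionals.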
23. Evaluation
• Applied to 6 medium-sized unfamiliar OSS
– Discussed cases of BlueJ and JHotDraw SVG
                                  BlueJ          JHotDraw SVG
Application size                  78 KLOC        62 KLOC
Number of identified use cases    127            90
Number of identified features     41             31
Number of feature-entry points    228            91
Class code coverage               66%            75%
Total time                        8 hours        5 hours
Estimated manual location time*   111-195 hours  89-155 hours
25. Features as Units of Code Analysis
• Feature-oriented analysis treats features as first-class code
investigation entities
• Several metrics and visualizations exist
– Not always compatible with one another
– Need firm evolutionary grounding
– Lack of usable implementations
26. Structuring Feature-Oriented Analysis
• Unifying conceptual framework for describing views
– Granularity, Perspective, Abstraction
• Objective definition of views
• 3x3x3 possible configurations
[Figure: three-axis view framework]
– Abstraction: a1 traceability, a2 correlation, a3 characterization
– Perspective: p1 feature, p2 code unit, p3 feature relations
– Granularity: g1 package, g2 class, g3 method
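The 3x3x3 space of view configurations can be enumerated directly. The following is a small sketch, assuming each axis is modeled as an enum; the type and constant names are illustrative and are not taken from the Featureous code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the three-axis view framework.
class ViewConfigurationsSketch {
    enum Granularity { PACKAGE, CLASS, METHOD }
    enum Perspective { FEATURE, CODE_UNIT, FEATURE_RELATIONS }
    enum Abstraction { TRACEABILITY, CORRELATION, CHARACTERIZATION }

    // One point in the framework: a concrete view configuration.
    record ViewConfiguration(Granularity granularity,
                             Perspective perspective,
                             Abstraction abstraction) { }

    // Enumerates every combination of the three axes.
    static List<ViewConfiguration> allConfigurations() {
        List<ViewConfiguration> result = new ArrayList<>();
        for (Granularity g : Granularity.values()) {
            for (Perspective p : Perspective.values()) {
                for (Abstraction a : Abstraction.values()) {
                    result.add(new ViewConfiguration(g, p, a));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allConfigurations().size() + " possible configurations");
    }
}
```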
27. Instantiating the Framework
• Phases of change: Request for change; Planning phase (Software comprehension, Change impact analysis); Change implementation (Restructuring for change, Change propagation); Verification and validation
• Granularity: n/a for Request for change; g2 for each of the five other activities; g1 and g3 each for three activities
…
30. Evaluation
1. Parnas’ KWIC textbook example
– 4 modularization alternatives
– Results consistent with analyses of Parnas and Garlan
                          Shared data  Abstract data type  Implicit invocation  Pipes and filters
Scattering                0.29         0.35                0.20                 0.20
Tangling                  0.18         0.30                0.15                 0.15
Change in function [23]   +            -                   +                    +
2. Feature-oriented comprehension of JHotDraw SVG
3. Adoption of feature-oriented change in JHotDraw SVG
4. Analytical evaluation of support for comprehension with the framework of Storey et al.
33. Modularizing Features in Existing Systems
• Feature-oriented analysis alone does not reduce the
scope of changes
• Physical modularization of features needed for:
– Reducing comprehension scope
– Reducing change propagation
– Facilitating per-feature division of work
34. Restructuring to Features
• Two fundamental class-level refactoring types
– Class relocation
• Simple
• Does not untangle classes
– Class reconceptualization
• Can be complex
• Untangles classes
• Fragility of class-level decomposition
– Domain entities as aggregates of feature-roles
– Problematic semantics of entity-fragments
36. Evaluation: the Case of NDVis
• Neurological analysis tool by VisiTrend:
– Monolithic -> NetBeans MS, to improve customizability
and reuse of core algorithms
– 10 KLOC, 27 use cases, 10 features
– No unit tests, unfamiliar source code, complex domain
• Application of Featureous:
– Initial analysis + Iterative modularization
37. Modularization in 35 Hours
3rd-Party Metrics:
Scattering 16%
Tangling 27%
Cohesion 31%
Coupling 5%
• Explicit and pluggable features
• Reusable core
• Biggest challenge: decentralizing initialization of features
39. Remodularization as a Multi-Objective Optimization Problem
• Automatically relocating classes to optimize measurable design-level principles
– Metrics as fitness functions
[Figure: criteria; labels: feature, non-common, reuse]
40. Multi-Objective Grouping Genetic Algorithm
• Evolving a population of package structures
• Pareto-optimality
– Diverse metrics
– Avoiding extremes
• Grouping operators
• Resulting designs established (and reversed) using
automated code transformations
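Pareto-optimality over several metrics can be sketched with a dominance check. The example below covers only the selection of non-dominated candidates, not the genetic algorithm or its grouping operators. It assumes each candidate package structure has already been scored on two objectives that are both minimized (for instance a scattering score and a tangling score); the class name and the scores are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: filtering candidate designs down to the Pareto front.
class ParetoFrontSketch {

    // a dominates b if a is no worse in every objective and strictly
    // better in at least one (all objectives are minimized).
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
            if (a[i] < b[i]) {
                strictlyBetter = true;
            }
        }
        return strictlyBetter;
    }

    // Keeps only the candidates that no other candidate dominates.
    static List<double[]> paretoFront(List<double[]> candidates) {
        List<double[]> front = new ArrayList<>();
        for (double[] candidate : candidates) {
            boolean dominated = false;
            for (double[] other : candidates) {
                if (dominates(other, candidate)) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated) {
                front.add(candidate);
            }
        }
        return front;
    }

    public static void main(String[] args) {
        List<double[]> scores = List.of(
                new double[] {0.1, 0.4},  // low first objective, high second
                new double[] {0.3, 0.2},  // the opposite trade-off
                new double[] {0.3, 0.4}); // worse than both: dominated
        System.out.println("Front size: " + paretoFront(scores).size());
    }
}
```

Keeping the whole front rather than a single weighted sum is what lets diverse metrics coexist without one of them pushing the design to an extreme.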
43. Towards Feature-Oriented Quality Assessment
• For usage in benchmarks, feature-oriented
measurement needs to be adapted to:
– Scaling to hundreds of large-scale systems
– Assuming only source code available
45. Evaluation
1. Assessing extent and accuracy of coverage
• Ground truth constructed from 14 diverse OSS
– Apart from expected, the method detected application-specific abstractions
[Figure: coverage breakdown. Intersection: 79%; Non-covered: 13%; Method-only: 6%; GT-only: 2%]
2. Measured scattering and tangling in Checkstyle over a 10-year period (27 releases)
47. Open Questions and Future Directions
• Further empirical evaluation of the approach
• Continuous analysis and remodularization
• Featureous within software lifecycle
• Involving 3rd-party application authors
• Towards general on-demand remodularization
– Other concern dimensions needed
48. Conclusions
• Featureous delivers practical means of:
– Feature location based on feature-entry points
– Structured feature-oriented analysis
– Manual and automated remodularization
• Applicable to existing unfamiliar systems
• Real-world migration to feature-oriented
modularizations feasible
– Mix automated and manual
53. Feature trace model
[Figure: UML class diagram]
• TraceModel: featureID: String; contains * Type
• Type: signature: String; package: String; instances: Set<String>; enclosingType of * Execution
• Execution: signature: String; featureEntryPoint: boolean; constructor: boolean; executionCount: int
• Invocation: links a caller Execution to a callee Execution
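The trace model can be written down as plain data types. This is a sketch that mirrors the attribute names on the slide, with some assumptions: the attribute package is renamed packageName because package is a reserved word in Java, the containment lists (types, executions) are one way to realize the "*" associations, and storing invocations on TraceModel is a guess, since the diagram's layout is not recoverable. The example values are invented.

```java
import java.util.List;
import java.util.Set;

// Hypothetical rendering of the feature trace model as Java records.
class TraceModelSketch {

    // One traced method or constructor execution.
    record Execution(String signature,
                     boolean featureEntryPoint,
                     boolean constructor,
                     int executionCount) { }

    // A call edge between two executions.
    record Invocation(Execution caller, Execution callee) { }

    // A traced type, the enclosing type of its executions.
    record Type(String signature,
                String packageName,       // "package" on the slide
                Set<String> instances,
                List<Execution> executions) { }

    // The trace of a single feature.
    record TraceModel(String featureID,
                      List<Type> types,
                      List<Invocation> invocations) { } // placement is an assumption

    // Invented example: a "Save" feature touching two types.
    static TraceModel example() {
        Execution save = new Execution("Editor.save()", true, false, 1);
        Execution write = new Execution("Storage.write(String)", false, false, 3);
        Type editor = new Type("Editor", "app.ui", Set.of("Editor@1"), List.of(save));
        Type storage = new Type("Storage", "app.io", Set.of("Storage@1"), List.of(write));
        return new TraceModel("Save",
                List.of(editor, storage),
                List.of(new Invocation(save, write)));
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```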
54. Application: tracking evolution of feature confinement
• Checkstyle, 10 years, 27 releases: 1.0 – 5.4
– Violation detectors are the features
• Scattering and tangling calibrated on 55 systems
[Figure: scattering and tangling across releases, annotated: "Architectural restructuring"; "Removal of J2EE detectors, refactorings – less tangling, more scattering"; "No erosion despite 2x growth in LOC"]
55. Details of Manual vs. Automatic
[Figure: scattering (fsca) per feature for NDVis; series: Original, Manual, Automated]
[Figure: tangling (ftang) per package for NDVis; series: Original, Manual, Automated]
56. Results
• Average results:
[Figure: coverage breakdown. Intersection: 79%; Non-covered: 13%; Appr.-only: 6%; GT-only: 2%]
• Interesting cases:
k9mail Checkstyle Spring
invoke
process
verify