Icpc12b.ppt
1. Women and Men – Different but Equal:
On the Impact of Identifier Style on
Source Code Reading
Zohreh Sharafi, Zéphyrin Soh,
Yann-Gaël Guéhéneuc, and Giuliano Antoniol
Ptidej Team and Soccer Lab,
École Polytechnique de Montréal, Montréal, Canada.
2. Introduction
Documentation
• Not available
• Outdated
Program evolution is often solely based on source code.
Comments
• Not available
• Not useful
Program comprehension heavily depends on identifiers.
3. Introduction
What is the impact of identifiers on program
comprehension?
Identifier style (Sharif 2010 [2], Binkley 2009 [1])
• Camel Case
• Underscore
Identifier quality (Lawrie 2007 [3])
• Length
• Structure
1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To CamelCase or Under_score,” ICPC, 2009.
2. B. Sharif and J. Maletic, “An eye tracking study on camelCase and under_score identifier styles,” ICPC, 2010.
3. D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “Effective identifier names for comprehension and memory,” Innovations in Systems and Software Engineering, 2007.
4. Identifier Style
Investigate the impact of identifier style: “Camel Case vs. Underscore”

            Binkley, 2009 [1]                Sharif, 2010 [2]
Accuracy    Higher accuracy for Camel Case   No difference
Time        More time spent on Camel Case    More time spent on Camel Case
Effort      Not applicable                   Lower visual effort on Underscore

1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To CamelCase or Under_score,” ICPC, 2009.
2. B. Sharif and J. Maletic, “An eye tracking study on camelCase and under_score identifier styles,” ICPC, 2010.
5. Gender Preference
Does gender affect identifier style preference?
• Camel Case?
• Underscore?
Do gender and identifier style affect program understanding?
6. Related Work
Self-efficacy (1994) [1]
• Lower self-efficacy
• Higher self-efficacy
Selectivity hypothesis (2001) [2]
• Use a comprehensive strategy
• Use a simple heuristic
Software adoption & use (2003) [3]
• No experience = no self-confidence
• Apply general knowledge for specific tasks
Testing strategies (2009) [4]
• Use testing only for finding bugs
• Use testing for:
  1. finding bugs
  2. fixing bugs
  3. evaluating their fix
1. G. Torkzadeh and X. Koufteros, “Factorial validity of a computer self-efficacy scale and the impact of computer training,” 1994.
2. E. O’Donnell and E. Johnson, “The effects of auditor gender and task complexity on information processing efficiency,” 2001.
3. K. Hartzel, “How self-efficacy and gender issues affect software adoption and use,” Communications of the ACM, 2003.
4. V. Grigoreanu, J. Brundage, E. Bahna, M. Burnett, P. ElRif, and J. Snover, “Males’ and females’ script debugging strategies,” 2009.
7. Our Empirical Study
Goal: Design and perform an experiment to investigate the impact of gender on identifier styles and identifier comprehension strategies
Research question: Does the developers’ gender impact their effort, their required time, and their ability to recall identifiers (accuracy) in source code reading?
Memorability impacts the comprehension of identifiers [1, 2, 3]
Lacking memorability may impair:
• Identifier recall
• Code readability
• Program comprehension
1. D. M. Jones, “The New C Standard (Identifiers): An Economic and Cultural Commentary,” 2003.
2. B. Sharif and J. Maletic, “An eye tracking study on camelCase and under_score identifier styles,” ICPC, 2010.
3. D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “Effective identifier names for comprehension and memory,” 2007.
8. Experiment Design
Independent variables
1. Gender: male (M) or female (F)
2. Identifier style: Camel Case (CC) or Underscore (US)
Dependent variables
1. Accuracy
2. Required time (speed)
3. Visual effort
Mitigating variables
1. Study level
2. Style preference of subjects
11. Results and Analyses:
Accuracy and Time

Table 1 - Accuracy
              Men     Women
Camel Case    83%     80%
Underscore    68%     85%

Table 2 - Time (min)
              Men     Women
Camel Case    5.25    6.6
Underscore    6.7     7.8

Table 3 - Overall Time and Accuracy
         Accuracy   Time (min)
Men      74%        5.94
Women    82%        7.18

Accuracy: No significant differences
Time: No significant differences
12. Results and Analyses:
Visual Effort
Visual effort:
• Calculated from eye-tracking data
• Calculated based on the amount of visual attention: less attention → less effort
• Visual attention can trigger the mental processes
Two types of eye gaze data:
• Fixation
• Saccade
We use fixations to calculate effort, as in previous work
13. Results and Analyses:
Visual Effort Metric - Source code Stimulus
Convex hull: the smallest convex set that contains all of a subject’s fixations [1]
[Figure: source code stimulus]
Smaller convex hull → closer fixations → less effort
1. J. H. Goldberg and X. P. Kotval, “Computer interface evaluation using eye movements: methods and constructs,” 1999.
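The convex hull metric can be illustrated with a short sketch. It is not the tooling used in the study: it assumes fixations are available as (x, y) screen coordinates in pixels, uses Andrew's monotone chain algorithm as one common way to build the hull, and the sample points are made up purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors OA and OB; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices (Andrew's monotone chain), dropping collinear points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def hull_area(points: List[Point]) -> float:
    """Area of the convex hull of the fixations (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical fixation coordinates (pixels), not data from the experiment.
close_fixations = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
spread_fixations = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5)]

print(hull_area(close_fixations))   # smaller hull: fixations are close together
print(hull_area(spread_fixations))  # larger hull: fixations are spread out
```

Under the slide's reading, the subject with the smaller hull area would be interpreted as having spent less visual effort on the source code stimulus.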
14. Results and Analyses:
Visual Effort Metrics - Question Stimulus
Metrics to measure visual effort (Sharif 2010 [1])
[Figure: question stimulus]
1. B. Sharif and J. Maletic, “An eye tracking study on camelCase and under_score identifier styles,” IEEE 18th International Conference on Program Comprehension, 2010.
15. Results and Analyses:
Visual Effort
Effort:
• FC(Q): No significant differences
• FR(Correct): No significant differences
• FR(distracter): Female subjects spend more visual effort
Female subjects put more visual effort on distracters
Female subjects analyze all distracters before making a decision
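The slides do not spell out how FC and FR are computed, so the following is only a sketch under stated assumptions: FC is taken to be the number of fixations inside an area of interest (AOI), and FR the share of all recorded fixations that fall inside the named AOIs. The `AOI` class, the rectangle coordinates, and the fixation list are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AOI:
    """Rectangular area of interest on the question stimulus (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def fixation_counts(fixations: List[Tuple[float, float]], aois: List[AOI]) -> Dict[str, int]:
    """Count fixations per AOI; each fixation is assigned to the first AOI that contains it."""
    counts = {aoi.name: 0 for aoi in aois}
    for x, y in fixations:
        for aoi in aois:
            if aoi.contains(x, y):
                counts[aoi.name] += 1
                break
    return counts


def fixation_rate(counts: Dict[str, int], names: List[str], total: int) -> float:
    """Assumed definition: fixations inside the named AOIs divided by all fixations."""
    if total == 0:
        return 0.0
    return sum(counts[name] for name in names) / total


# Hypothetical AOIs and fixations (pixels), not data from the experiment.
aois = [
    AOI("question", 0, 0, 100, 20),
    AOI("correct", 0, 30, 100, 50),
    AOI("distracter_1", 0, 60, 100, 80),
    AOI("distracter_2", 0, 90, 100, 110),
]
fixations = [(10, 10), (50, 10), (20, 40), (30, 70), (40, 70), (50, 100), (500, 500)]

counts = fixation_counts(fixations, aois)
total = len(fixations)
fc_question = counts["question"]
fr_correct = fixation_rate(counts, ["correct"], total)
fr_distracter = fixation_rate(counts, ["distracter_1", "distracter_2"], total)
print(fc_question, fr_correct, fr_distracter)
```

With these definitions, a subject who inspects every distracter before answering ends up with a higher FR(distracter) than one who settles on a single choice early.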
17. Conclusion
Investigate the impact of identifier style: “Camel Case vs. Underscore”

            Binkley, 2009 [1]                Sharif, 2010 [2]                    Our experiment
            (135 subjects)                   (15 subjects)                       (24 subjects)
Accuracy    Higher accuracy for Camel Case   No difference                       No difference
Time        More time spent on Camel Case    More time spent on Camel Case       No difference
Effort      Not applicable                   Lower visual effort on Underscore   No difference

1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To CamelCase or Under_score,” ICPC, 2009.
2. B. Sharif and J. Maletic, “An eye tracking study on camelCase and under_score identifier styles,” ICPC, 2010.
18. Conclusion
Gender does NOT impact:
• Accuracy
• Time
Effort:
Female subjects
• Spend more effort on alternative answers
• Seem to:
  • Carefully weigh all options
  • Rule out wrong answers
• Statistically strong interaction between accuracy and effort on distracters
Male subjects
• Seem to quickly go for one choice
• No correlation between effort and accuracy
19. Threats to Validity
Internal validity:
• Random ordering of source code
• Comfortable environment
External validity (generalization of the results):
• Students as subjects
  • Students are the next generation of software professionals [1]
• 24 subjects, 9 female subjects
  • Not large enough to be generalized
  • A call for further studies
1. B. A. Kitchenham, S. L. Pfleeger, L. M. Pickard, P. W. Jones, D. C. Hoaglin, K. E. Emam, and J. Rosenberg, “Preliminary guidelines for empirical research in software engineering,” IEEE Trans. Softw. Eng., 2002.
20. Last but not Least
Gender does NOT impact:
• Accuracy
• Time
Male and female subjects use different strategies to select the correct answer
Understanding systematic biases between men and women:
• Design tools and methods better adapted to different developers
• Support different program understanding strategies