Icpc12b.ppt

Women and Men – Different but Equal:
On the Impact of Identifier Style on
Source Code Reading
Zohreh Sharafi, Zéphyrin Soh,
Yann-Gaël Guéhéneuc, and Giuliano Antoniol
Ptidej Team and Soccer Lab,
École Polytechnique de Montréal, Montréal, Canada.

1

Introduction
Documentation

Program evolution is often
solely based on source code.

Comments

Not available
Outdated

Program comprehension
heavily depends on identifiers.
Not available
Not useful
2

Introduction
What is the impact of identifiers on program
comprehension?





Identifier Style (Sharif 20101, Binkley 20092)
 Camel

Case
 Underscore


Identifier quality (Lawrie 20073)
 Length
 Structure

3

1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To camelcase or under score,” ICPC, 2009
2. B. Sharif and J. Maletic, “An eye tracking study on camelcase and under score identifier styles, ICPC, 2010
3. D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “Effective identifier names for comprehension and memory,” Innovations in Systems and Software
Engineering, 2007.

Identifier Style
Investigate the Impact of Identifier Style
“Camel Case vs. Underscore”
Time
Binkley, 20091

Time

Higher accuracy for
Camel Case

No difference

More time spent on
Camel Case

More time spent on
Camel Case

Not applicable

Accuracy

Sharif, 20102

Lower visual effort on
Underscore

Effort

4

1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To camelcase or under scoreI, CPC, 2009
2. B. Sharif and J. Maletic, “An eye tracking study on camelcase and under score identifier styles,” ICPC , 2010

Gender Preference


Does gender affects identifier style preference?
Camel Case?
Underscore?



Camel Case?
Underscore?

Does gender and identifier style affects program
understanding?
5

Related Work
Self-efficacy
(1994)1

Lower self-efficacy

Higher self-efficacy

Selectivity
Hypothesis
(2001)2

Use comprehensive way

Use simple heuristic

Software
Adaptation &
Use (2003)3

No experience
=
No self-confidence

Apply general knowledge
for specific tasks

Testing
Strategies
(2009)4

Use testing only for finding
bugs

6

1.
2.
3.
4.

Use testing for:
1. finding bugs
2. fixing bugs
3. evaluating their fix

G. Torkzadeh and X. Koufteros, “Factorial validity of a computer self-efficacy scale and the impact of computer training1994.
E. O’Donnell and E. Johnson, “The effects of auditor gender and task complexity on information processing efficiency,” 2001.
K. Hartzel, “How self-efficacy and gender issues affect software adoption and use,” Communications of the ACM, 2003.
V. Grigoreanu, J. Brundage, E. Bahna, M. Burnett, P. ElRif, and J. Snover, “Males and females script debugging strategies,” 2009.

Our Empirical Study
Goal: Design and perform an experiment to investigate the impact
of gender on identifier styles and identifier comprehension
strategies

Research question: Does the developers’ gender impact their

effort, their required time, and as well as their ability to recall
identifiers (accuracy) in source code reading?



Memorability impacts the comprehension of
identifiers1,2,3

7

Lacking memorability may impair:
Identifiers recall Code readability

Program comprehension

1. D. M. Jones, 2003, the New C Standard (Identifiers) An Economic and Cultural Commentary.
2. B. Sharif and J. Maletic, “An eye tracking study on camelcase and under score identifier styles, ICPC, 2010
3. D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “Effective identifier names for comprehension and memory,” 2007

Experiment Design
Independent variables



1.
2.

Gender: male (M) or female (F)
Identifier style: Camel Case (CC) or Underscore (US)

Dependent variables



1.
2.
3.

Accuracy
Required time (Speed)
Visual effort

Mitigating variables



1.
2.

8

Study level
Style preference of subjects

Experiment Design
Subjects’ Demography
(24 Subjects)
Academic background

Gender

Ph.D.

M.Sc.

B.Sc.

Male

Female

11

10

3

15

9

Source code stimulus

9

Question stimulus

Experiment Design
Eye-tracking system


FaceLAB1:






Video-based
Two cameras
An infrared sensor

Non-intrusive




No goggles,
No wires
No sensing device

10

1. http://www.seeingmachines.com/

Results and Analyses:
Accuracy and Time
Accuracy

Time (min)

Accuracy

Men

Women

Time

Men

Women

Camel Case

83%

80%

Camel Case

5.25

6.6

Underscore

68%

85%

Underscore

6.7

7.8

Table 1 - Accuracy

Table 2 - Time
Accuracy

Time (min)

Men

74%

5.94

Women

82%

7.18

Table 3 - Overall Time and Accuracy
Accuracy

No significant differences

Time
11

Visual Effort


Visual effort:



Calculated from eye-tracking data
Calculated based on the amount of visual attention:


Less attention

Less effort



Visual attention can trigger the mental processes



Two types of eye gaze data:





Fixation
Saccade

We use fixation to calculate effort as previous works
12


Visual Effort Metric - Source code Stimulus


Convex hull: the smallest convex sets of fixations that

contains all of a subject’s fixations1



Source Code Stimulus

Smaller convex hull
Close fixations
Less effort

13

1.

J. H. Goldberg and X. P. Kotval, “Computer interface evaluation using eye movements: methods and constructs,”, 1999.


Visual Effort Metrics - Question Stimulus

Metrics to measure visual effort
(Sharif 20101)

Question Stimulus

1. B. Sharif and J. Maletic, “An eye tracking study on camelcase and under score identifier styles,” IEEE 18th International Conference on Program
Comprehension , 2010

Result and Analysis:
Visual Effort

FC(Q)


FR(Correct) No significant differences
Effort

• Female

FR(distracter) Female subjects spends more visual
effort

subjects put more visual effort on distracters
• Female subjects analyze all distracters before making decision

15

Visual Effort

16

Heatmap

Conclusion
Investigate the Impact of Identifier Style
“Camel Case vs. Underscore”
Binkley, 20091

Time

Our Experiment

(135 subjects)

Accuracy

Sharif, 20102
(15 subjects)

(24 subjects)

Higher
accuracy for
Camel Case
More time
spent on
Camel Case
Not
applicable

No difference

No difference

More time spent
on
Camel Case
Lower visual effort
on Underscore

No difference

No difference

Effort
17

1. D. Binkley, M. Davis, D. Lawrie, and C. Morrell, “To camelcase or under score,” ICPC, 2009
2. B. Sharif and J. Maletic, “An eye tracking study on camelcase and under score identifier styles,” ICPC, 2010

Conclusion


Gender does NOT impact:
Accuracy
• Spend

Time

more effort on alternative answers
• Seem to:
• Carefully weight all options
• Rule out wrong answers
• Statistically strong interaction between accuracy and
effort on distracters
Effort

18

Seem to quickly go for one choice
• No correlation between effort and accuracy

Threats to Validity


Internal validity:





Random ordering of source code
Comfortable environment

External validity (generalization of the results):


Students as subjects




19

Students are the next generation of software professionals1

24 subjects
 9 female subjects  Not large enough to be generalized
 A call for further studies
1. B. A. Kitchenham, S. L. Pfleeger, L. M. Pickard, P. W. Jones, D. C. Hoaglin, K. E. Emam, and J. Rosenberg, “Preliminary guidelines for
empirical research in software engineering,” IEEE Trans. Softw. Eng., 2002

Last but not Least
Gender does NOT impact:



Accuracy

Male and female subjects use different strategiess to
select the correct answer





Time

Understanding systematic biases between men and women:



20

Design tools and methods better adapted to different developers
Support different program understanding strategies

Icpc12b.ppt

Recommandé

Recommandé

Contenu connexe

En vedette

En vedette (7)

Similaire à Icpc12b.ppt

Similaire à Icpc12b.ppt (20)

Plus de Ptidej Team

Plus de Ptidej Team (20)

Dernier

Dernier (20)

Icpc12b.ppt