ICSE’14 Workshop Keynote
Address: Emerging Trends in
Software Metrics (WeTSOM’14)
What Metrics
Matter?
(And the answer
may surprise you)
tim.menzies@gmail.com
1
http://bit.ly/icse14metrics
Late 2014 Late 2015
Coming soon to an Amazon near you
2
This talk is in two parts
Part1: a little history
(my unhappiness with past “results”)
Part2: a new view
3
Data about software projects is
not stored in metric1, metric2,…
• But is shared between them
in some shared, underlying
shape.
• Not every project has the
same underlying simple shape
– Many projects have different,
albeit simple, shapes
• We can exploit that shape,
to great effect:
– For better local predictions
– For transferring lessons learned
– For privacy-preserving data mining
4
So, what metrics
to collect?
• Whatever you can get, quickly,
cheaply:
– Then model within the reduced
dimensions
– Then cycle back to the users, for
sanity, for clarity, for questions for
the next round of analysis
5
PART1:
YE OLDE “RESULTS”
(FROM THE 1990S)
6
A long time ago…
In a century far, far away…
• We thought the “right name” was
inherently powerful
– Stranger in a Strange Land (Heinlein)
– Wizard of Earthsea (Le Guin)
– Snow Crash (Stephenson)
• Sapir-Whorf hypothesis:
– The right words let you think better
• And we need such power
– to avoid the lies and illusions of a cruel
and confusing world.
7
Shotgun
correlations
Courtney, R.E.; Gustafson, D.A.,
"Shotgun correlations in software
measures," Software Engineering
Journal, vol. 8, no. 1, pp. 5-13, Jan 1993
8
Norman F. Schneidewind. 1992. Methodology for Validating Software Metrics.
IEEE Trans. Softw. Eng. 18, 5 (May 1992), 410-422.
9
The 1990’s obsession:
What metrics to collect?
10
How to Design a
Metrics Repository (mid-1990s)
• RA-1 : process-centric [1]
• TAME resource model: resource-centric [2]
• Harrison model : product centric [3]
[1] Ramakishanan et al, Building an effective measure systems, TR96, SD, Monash, 1996
[2] Jeffrey, Basili, Validating the TAME resource model, ICSE’88
[3] W. Harrison, Towards Well-Defined Shareable Product Data, Workshop SE : Future Directions, 1992
11
Battle lines were drawn
Blood was spilt
• It is shown that a collection of nine
properties suggested by E.J. Weyuker is
inadequate for determining the quality
of a software complexity measure.
• A complexity measure which satisfies all
nine of the properties, but which has
absolutely no practical utility in
measuring the complexity of a program
is presented.
• It is concluded that satisfying all of the
nine properties is a necessary, but not
sufficient, condition for a good
complexity measure.
John C. Cherniavsky and Carl H. Smith.
1991. On Weyuker's Axioms for Software
Complexity Measures. IEEE Trans. Softw.
Eng. 17, 6 (June 1991), 636-638
And again
• Measurement theory is used to
highlight both weaknesses and
strengths of software metrics work,
including work on metrics validation.
• We identify a problem with the
well-known Weyuker properties, but also
show that a criticism of these
properties by Cherniavsky and Smith is
invalid.
• We show that the search for general
software complexity measures is
doomed to failure.
Norman Fenton, Software Measurement: A Necessary
Scientific Basis, IEEE
TSE 20(3), 1994
12
And the eventual
winner?
• No one
• When the dust settled, no
one really cared.
• Norman Fenton abandoned,
renounced, his prior work
on metrics.
• The IEEE Metrics conference
got cancelled
– subsumed by EMSE
– R.I.P.
13
PART2:
A NEW VIEW
14
Looking in the wrong direction?
• SE project data = surface features of an underlying effect
• Stop fussing on surface details (mere metrics)
• Go beneath the surface
15
Reflect LESS on raw dimensions
16
Reflect MORE on INTRINSIC dimensions
• Levina and Bickel report
that it is possible to
simplify seemingly
complex data:
– “... the only reason any
methods work in very high
dimensions is that, in fact,
the data are not truly high-
dimensional. Rather, they
are embedded in a high-
dimensional space, but can
be efficiently summarized in
a space of a much lower
dimension ...”
Elizaveta Levina and Peter J. Bickel. Maximum likelihood estimation of intrinsic dimension. In NIPS,
2004.
17
If SE data compresses
in this way then….
• In compressed space, many measures tightly
associated
– Does not matter exactly what you collect
– Since they will map the same structures
• So collect what you can :
– As fast as you can
– Then model within the reduced dimensions
• The “7M” Hypothesis:
– Menzies mentions that many measures
mean much the same thing
18
How to test for 7M
• Specifics do not matter
• But general shape does
• Examples :
– Instability in “what matters most”
– Intrinsic dimensionality
– Active learning
– Transfer learning
– Filtering and wrapping
– Privacy algorithms
19
Raw dimensions problematic
(for effort estimation)
• Conclusion instability
• Learning: effort =
b0 + b1*x1 + b2*x2 + …
• Repeat 20 times, each on 66% of the data
– Record the learned “b” values
• NASA93 (effort data)
21
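The resampling experiment on this slide can be sketched as follows. This is a minimal illustration on synthetic data, not NASA93; the function name `coefficient_spread`, the two synthetic columns, and the noise levels are all assumptions made for the example.

```python
import numpy as np

def coefficient_spread(X, y, repeats=20, fraction=0.66, seed=1):
    """Fit effort = b0 + b1*x1 + b2*x2 + ... on random subsamples.

    Returns an array of shape (repeats, n_features + 1) holding the
    learned b values (intercept first) from each subsample.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    k = int(round(fraction * n))
    A = np.column_stack([np.ones(n), X])      # prepend the intercept column
    betas = []
    for _ in range(repeats):
        rows = rng.choice(n, size=k, replace=False)
        b, *_ = np.linalg.lstsq(A[rows], y[rows], rcond=None)
        betas.append(b)
    return np.array(betas)

# Synthetic stand-in for an effort data set (hypothetical, not NASA93):
# x2 is almost a copy of x1, so their coefficients can trade off freely.
rng = np.random.default_rng(0)
x1 = rng.normal(size=60)
x2 = x1 + rng.normal(scale=0.05, size=60)
X = np.column_stack([x1, x2])
y = 3.0 + 2.0 * x1 + rng.normal(scale=0.5, size=60)

betas = coefficient_spread(X, y)
print("min b:", betas.min(axis=0))
print("max b:", betas.max(axis=0))
```

A wide gap between the minimum and maximum of a "b" column is the conclusion instability the slide refers to: which raw dimension "matters most" depends on which 66% of the rows happened to be sampled.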
Raw dimensions problematic
(for defect prediction)
Tim Menzies, Andrew Butcher, David R. Cok, Andrian Marcus,
Lucas Layman, Forrest Shull, Burak Turhan, Thomas Zimmermann:
Local versus Global Lessons for Defect Prediction and Effort Estimation.
IEEE Trans. Software Eng. 39(6): 822-834 (2013)
22
By the way, same instabilities
for social metrics
Results from Helix repo
• Many classes
are “write once”
– A few OO classes are
“rewrite many”
• Defect detectors that take
into account this developer
social interaction
– perform very well indeed
– Near optimum
Results for AT&T
• Studied patterns of
programmer interaction
with the code
• Not a major influence
on defects
23
Lumpe, Vasa, Menzies, Rush, Turhan. Learning better
inspection optimization policies. International Journal of Software
Engineering and Knowledge Engineering 22(05), 621-644, 2012
Weyuker, Ostrand, Bell. 2008. Do too many cooks spoil the broth?
Using the number of developers to enhance defect prediction models.
Empirical Softw. Eng. 13, 5 (October 2008), 539-559
Intrinsic dimensionality (in theory)
Elizaveta Levina and Peter J. Bickel. Maximum
likelihood estimation of intrinsic dimension. In NIPS,
2004.
25
Intrinsic dimensionality (in practice)
26
• Defect prediction, open source Java projects
• TRAIN:
– Project 21 features onto two synthesized dimensions
• using FASTMAP
• X = first PCA component
• Y = right angles to X
– Recursively divide the two dimensions (at median)
• Stopping at SQRT(N)
– In each grid cell, replace N projects with median centroid
• TEST:
– Estimate = interpolate between 2 nearest centroids
• For 10 data sets, 5*5 cross-val
– Performs no worse, and sometimes better, than
Random forests, NaiveBayes
• Conclusion:
– 21 dimensions can map to two without loss of signal
Vasil Papakroni, Data Carving: Identifying and Removing Irrelevancies
in the Data, Master's thesis, WVU, 2013 http://goo.gl/i6caq7
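The TRAIN and TEST steps above can be sketched as follows. This is an illustrative reading of the slide, not the code from the thesis: the function names, the synthetic 21-column data, the alternating split axis, and the inverse-distance interpolation in the raw feature space are all assumptions.

```python
import numpy as np

def fastmap_2d(X, rng):
    """Project rows of X onto two synthesized dimensions, FastMap style."""
    def dist(a, b):
        return np.linalg.norm(a - b, axis=-1)
    anchor = X[rng.integers(len(X))]
    east = X[np.argmax(dist(X, anchor))]        # a row far from a random row
    west = X[np.argmax(dist(X, east))]          # a row far from east
    c = dist(east, west) + 1e-12
    a, b = dist(X, east), dist(X, west)
    x = (a**2 + c**2 - b**2) / (2 * c)          # cosine rule: position along east-west
    y = np.sqrt(np.maximum(a**2 - x**2, 0.0))   # distance at right angles to that line
    return np.column_stack([x, y])

def leaves(idx, P, min_size, axis=0):
    """Recursively split at the median, alternating axes, down to min_size rows."""
    if len(idx) <= min_size:
        return [idx]
    cut = np.median(P[idx, axis])
    lo, hi = idx[P[idx, axis] <= cut], idx[P[idx, axis] > cut]
    if len(lo) == 0 or len(hi) == 0:            # cannot split any further
        return [idx]
    return leaves(lo, P, min_size, 1 - axis) + leaves(hi, P, min_size, 1 - axis)

def train(X, y, seed=1):
    """Replace the rows in each grid cell with one median centroid."""
    rng = np.random.default_rng(seed)
    P = fastmap_2d(X, rng)
    groups = leaves(np.arange(len(X)), P, min_size=np.sqrt(len(X)))
    cx = np.array([np.median(X[g], axis=0) for g in groups])
    cy = np.array([np.median(y[g]) for g in groups])
    return cx, cy

def predict(cx, cy, row):
    """Interpolate between the two nearest centroids (inverse-distance weights)."""
    d = np.linalg.norm(cx - row, axis=1)
    i, j = np.argsort(d)[:2]
    wi, wj = 1 / (d[i] + 1e-12), 1 / (d[j] + 1e-12)
    return (wi * cy[i] + wj * cy[j]) / (wi + wj)

# Hypothetical data: 21 observed columns driven by only 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 21)) + rng.normal(scale=0.1, size=(200, 21))
y = latent[:, 0] + latent[:, 1]
cx, cy = train(X[:150], y[:150])
estimates = np.array([predict(cx, cy, row) for row in X[150:]])
print(len(cx), "centroids; mean abs error:", np.mean(np.abs(estimates - y[150:])))
```

The point of the sketch is the shape of the pipeline: all modeling happens on a handful of centroids found in two synthesized dimensions, rather than on the 21 raw columns.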
Active learning in effort estimation
• If it is difficult to find actual project effort, ask for it on the fewest projects
• Put aside a hold-out set
• Prune columns that are most often other columns’ nearest neighbors
• Sort rows by how often they are other rows’ nearest neighbor
• For the first “i” rows in the sort
– Train (then test on hold-out)
– Stop when no performance gain after N new rows
28
Ekrem Kocaguneli, Tim Menzies, Jacky Keung, David R. Cok, Raymond J. Madachy: Active Learning and Effort Estimation:
Finding the Essential Content of Software Effort Estimation Data. IEEE Trans. Software Eng. 39(8): 1040-1053 (2013)
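A rough sketch of the loop described on this slide follows. It is not the published algorithm: the names `popularity` and `active_learn`, the rule of keeping only the least popular columns, the 1-nearest-neighbor estimator, and the `patience` parameter (the slide's "N new rows") are assumptions chosen to keep the example short.

```python
import numpy as np

def popularity(M):
    """Count how often each row of M is some other row's nearest neighbor."""
    d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a row is not its own neighbor
    return np.bincount(d.argmin(axis=1), minlength=len(M))

def active_learn(X, y, X_hold, y_hold, patience=3):
    """Ask for effort values row by row, most popular rows first."""
    cols = popularity(X.T)                      # columns compared as vectors
    keep = np.flatnonzero(cols == cols.min())   # prune the popular columns
    Xk, Hk = X[:, keep], X_hold[:, keep]
    order = np.argsort(-popularity(Xk), kind="stable")
    best, stale, used = np.inf, 0, 0
    for i in range(1, len(order) + 1):
        rows = order[:i]                        # only these rows need a known effort
        d = np.linalg.norm(Hk[:, None, :] - Xk[rows][None, :, :], axis=2)
        guess = y[rows][d.argmin(axis=1)]       # 1-nearest-neighbor estimate
        err = np.mean(np.abs(guess - y_hold))
        if err < best:
            best, stale, used = err, 0, i
        else:
            stale += 1
            if stale >= patience:               # no gain after `patience` new rows
                break
    return order[:used], best

# Hypothetical effort data: 30 training projects, 10 held out.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = 10 + 3 * X[:, 0] + rng.normal(scale=0.1, size=40)
rows, err = active_learn(X[:30], y[:30], X[30:], y[30:])
print(len(rows), "of 30 projects needed a known effort value")
```

In practice the columns would be normalized before distances are computed; that step is left out here for brevity.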
Between Turkish Toasters
AND NASA Space Ships
30
Q: How to TRANSFER
Lessons Learned?
• Ignore most of the data
• relevancy filtering: Turhan ESEj’09; Peters TSE’13
• variance filtering: Kocaguneli TSE’12,TSE’13
• performance similarities: He ESEM’13
• Contort the data
• spectral learning (working in PCA
space or some other rotation)
Menzies, TSE’13; Nam, ICSE’13
• Build a bickering committee
• Ensembles Minku, PROMISE’12
31
BTW, Sometimes, TRANSFER better
than LOCAL
Minku:PROMISE’12 Nam:ICSE’13
Peters:TSE’13
32
Some technology:
feature selectors
Wrappers
• Slow: O(2^N) for N features
• Selection biased by target learner
Filters
• Faster (some even linear time)
• Selection may confuse target
M.A. Hall and G. Holmes, Benchmarking Attribute Selection Techniques for Discrete Class Data Mining, IEEE
Transactions On Knowledge And Data Engineering 15(6): 1437-1447, 2003
34
Filter
results
• Data from Norman
Fenton’s Bayes nets
discussing software
defects = yes, no
• Given classes x,y
– Fx, Fy
• frequency of discretized
ranges in x,y
– Log Odds Ratio [1]
• log(Fx/Fy )
• Is zero if no difference in
x,y
• Most variables do not
contribute to
determination of defects
[1] Martin Možina, Janez Demšar, Michael Kattan, and Blaž Zupan. 2004. Nomograms for visualization of naive Bayesian classifier. In
Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD '04)
35
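The log odds ratio filter can be sketched in a few lines. The function name `log_odds`, the small `eps` smoothing term that avoids division by zero, and the toy "low"/"high" ranges are assumptions for illustration; the sketch also assumes both classes occur at least once.

```python
import math
from collections import Counter

def log_odds(values, labels, x="yes", y="no", eps=1e-6):
    """log(Fx/Fy) for each discretized range of one variable."""
    fx = Counter(v for v, c in zip(values, labels) if c == x)
    fy = Counter(v for v, c in zip(values, labels) if c == y)
    nx, ny = sum(fx.values()), sum(fy.values())
    return {r: math.log((fx[r] / nx + eps) / (fy[r] / ny + eps))
            for r in set(values)}

# A variable whose ranges are equally frequent in both classes: scores are zero.
flat = log_odds(["low", "low", "high", "high"], ["yes", "no", "yes", "no"])

# A variable whose ranges separate the classes: scores are far from zero.
sharp = log_odds(["low", "low", "high", "high"], ["no", "no", "yes", "yes"])
print(flat, sharp)
```

Variables whose ranges all score near zero are the ones the slide describes as not contributing to the determination of defects.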
Filter
results (more)
• Defect prediction, NASA
data
• Baseline= learn defect
model on all data
(McCabes + Halstead +
LOC measures)
• Filter = sort columns on
“info gain”
• Experiment = use the first
N items in the sort,
stopping when recall, false
alarm, same as baseline
Tim Menzies, Jeremy Greenwald, Art Frank:
Data Mining Static Code Attributes to Learn
Defect Predictors. IEEE Trans. Software Eng.
33(1): 2-13 (2007)
36
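The "sort columns on info gain" step can be sketched as below. The column names and toy values are hypothetical, and the sketch only ranks columns; the stopping rule from the slide (add columns until recall and false alarm match the baseline) needs a learner and is left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(column, labels):
    """Entropy of the labels minus their expected entropy after splitting on column."""
    n = len(labels)
    after = 0.0
    for v in set(column):
        subset = [l for x, l in zip(column, labels) if x == v]
        after += len(subset) / n * entropy(subset)
    return entropy(labels) - after

def rank_columns(table, labels):
    """Column names sorted from most to least informative."""
    return sorted(table, key=lambda name: info_gain(table[name], labels), reverse=True)

# Hypothetical discretized static code attributes and defect labels.
table = {"noise": ["a", "b", "a", "b"], "loc": ["hi", "hi", "lo", "lo"]}
labels = [1, 1, 0, 0]
print(rank_columns(table, labels))   # ['loc', 'noise']
```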
Filter results on defect data:
many different features are equally
good for prediction
• Consistent with
this data
“pinging” a
much lower
dimensional
space
37
Tim Menzies, Jeremy Greenwald, Art Frank: Data Mining Static Code Attributes to Learn Defect
Predictors. IEEE Trans. Software Eng. 33(1): 2-13 (2007)
Some Wrapper
results: effort
estimation,
NASA data
• X = f (a,b,c,..)
• X’s variance comes
from a,b,c
• If fewer a,b,c
– then less confusion
about X
• E.g. effort estimation
• Pred(30) = % of estimates
within 30% of actual
Zhihao Chen, Tim Menzies, Daniel
Port, Barry Boehm, Finding the Right
Data for Software Cost Modelling, IEEE
Software, Nov, 2005
38
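As a worked example of the Pred(30) measure defined above, the following sketch scores four made-up estimates. The function name `pred` and the numbers are illustrative assumptions.

```python
def pred(actual, estimated, level=30):
    """Percentage of estimates within `level` percent of the actual value."""
    hits = sum(abs(e - a) / a <= level / 100 for a, e in zip(actual, estimated))
    return 100 * hits / len(actual)

actual = [100, 200, 400, 50]
estimated = [120, 300, 390, 50]
print(pred(actual, estimated))   # 75.0
```

Three of the four estimates fall within 30% of the actual value (the second is off by 50%), so Pred(30) is 75.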
Peters’ Power Principle
(for row and column pruning)
Filtering via range “power”
• Divide data with N rows into
• one region per class x, y, etc.
• For each region x, of size nx
• px = nx/N
• py (of everything else) =(N-nx )/N
• Let Fx and Fy be the frequency of range r in
(1) region x and (2) everywhere else
• Do the Bayesian thing:
• a = Fx * px
• b = Fy * py
• Power of range r for predicting x is:
• POW[r,x] = a^2/(a+b)
Pruning
• Column pruning
• Sort columns by power of column
(POC)
• POC = max POW value in that
column
• Row pruning
• Sort rows by power of row (POR)
• If row is classified as x
• POR =
Prod( POW[r,x] for r in row )
• Keep 20% most powerful
rows and columns:
• 0.2 * 0.2 = 0.04
• i.e. 4% of the original data
Fayola Peters, Tim Menzies, Liang Gong, Hongyu Zhang, Balancing Privacy and Utility in Cross-Company Defect
Prediction, IEEE Trans. Software Eng. 39(8): 1054-1068, 2013
40
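The power calculation and the two pruning steps can be sketched as below, following the formulas on the slide. The function names `power` and `prune`, the toy data, and the tie-breaking by original order are assumptions; the sketch assumes the data is already discretized and that every class occurs at least once.

```python
import math
import random
from collections import Counter

def power(rows, classes, target):
    """POW[r, target] = a^2 / (a + b) for each (column, range) pair."""
    N = len(rows)
    nx = sum(c == target for c in classes)
    px, py = nx / N, (N - nx) / N
    inside, outside = Counter(), Counter()
    for row, c in zip(rows, classes):
        for col, r in enumerate(row):
            (inside if c == target else outside)[(col, r)] += 1
    scores = {}
    for key in set(inside) | set(outside):
        fx = inside[key] / nx                          # frequency in region x
        fy = outside[key] / (N - nx) if N > nx else 0  # frequency everywhere else
        a, b = fx * px, fy * py
        scores[key] = a * a / (a + b) if a + b else 0.0
    return scores

def prune(rows, classes, keep=0.2):
    """Indexes of the most powerful rows and columns (top `keep` fraction of each)."""
    pows = {c: power(rows, classes, c) for c in set(classes)}
    ncols = len(rows[0])
    # POC: the largest POW value found anywhere in the column.
    poc = [max(p[k] for p in pows.values() for k in p if k[0] == col)
           for col in range(ncols)]
    cols = sorted(range(ncols), key=lambda c: -poc[c])[:max(1, round(keep * ncols))]
    # POR: product of the POW values of the row's ranges, for the row's own class.
    por = [math.prod(pows[c][(col, row[col])] for col in range(ncols))
           for row, c in zip(rows, classes)]
    top = sorted(range(len(rows)), key=lambda i: -por[i])[:max(1, round(keep * len(rows)))]
    return sorted(top), sorted(cols)

# Hypothetical data: column 0 separates the classes, columns 1 to 4 are random.
random.seed(1)
classes = ["x"] * 5 + ["y"] * 5
rows = [[("hi" if c == "x" else "lo")] + [random.choice("ab") for _ in range(4)]
        for c in classes]
kept_rows, kept_cols = prune(rows, classes)
print(kept_rows, kept_cols)
```

With 10 rows and 5 columns, keeping 20% of each leaves 2 rows and 1 column, which is the 4% of the original cells quoted on the slide.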
Q: What does that look like?
A: Empty out the “billiard table”
• This is a privacy algorithm:
– CLIFF: prune X% of rows, we are 100-X% private
– MORPH: mutate the survivors no more than half the distance to their
nearest unlike neighbor
– One of the few known privacy algorithms that does not damage data
mining efficacy
[Figure: data before and after pruning]
Fayola Peters, Tim Menzies, Liang Gong, Hongyu Zhang, Balancing Privacy and Utility in Cross-Company Defect
Prediction, IEEE Trans. Software Eng. 39(8): 1054-1068, 2013
41
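The MORPH step can be sketched as follows. The function name `morph`, the uniformly drawn step fraction, and the random choice of direction are assumptions picked only to respect the bound stated on the slide (no more than half the distance to the nearest unlike neighbor); the published method may use different settings.

```python
import numpy as np

def morph(X, labels, max_step=0.5, seed=1):
    """Nudge each row by at most max_step of the gap to its nearest unlike neighbor."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    out = X.copy()
    for i, row in enumerate(X):
        unlike = X[labels != labels[i]]         # rows from any other class
        if len(unlike) == 0:
            continue                            # nothing to measure against
        z = unlike[np.linalg.norm(unlike - row, axis=1).argmin()]
        r = rng.uniform(0, max_step)            # random fraction of the gap
        sign = rng.choice([-1.0, 1.0])          # random direction along the line to z
        out[i] = row + sign * r * (row - z)
    return out

X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
labels = np.array([0, 0, 1, 1])
private = morph(X, labels)
print(private)
```

Because no row moves more than halfway toward the other class, a mutated row stays closer to its original position than to its nearest unlike neighbor, which is one way to read the slide's claim that the mutation does not damage data mining efficacy.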
Advanced privacy methods
Incremental learning
• Pass around the reduced data set
• “Alien”: new data is too “far away”
from the reduced data
– “Too far”: 10% of the separation of the
most distant pair
• If anomalous, add to cache
– For defect data, cache does not
grow beyond 3% of total data
Privacy-preserving
Cross-company Learning
• LACE: Learn from N software
projects
– Mixtures of open+closed source
projects
• As you learn, play “pass the parcel”
– The cache of reduced data
• Each company only adds its “aliens”
to the passed cache
– Morphing as they go
• Each company has full control of
privacy
• Generated very good defect
predictors
Peters, Ph.D. thesis, WVU,
September 2014, in progress.
ASE’14: submitted
42
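The "alien" test for the passed cache can be sketched as below. The function name `add_aliens`, the choice to fix the threshold from the initial cache, and the toy rows are assumptions; the morphing that each company applies before sharing is omitted.

```python
import numpy as np

def add_aliens(cache, new_rows, far=0.10):
    """Append only the new rows that sit too far from everything in the cache."""
    cache = [np.asarray(r, dtype=float) for r in cache]
    # Separation of the most distant pair in the cache received so far.
    span = max(np.linalg.norm(a - b) for a in cache for b in cache)
    threshold = far * span
    for row in new_rows:
        row = np.asarray(row, dtype=float)
        nearest = min(np.linalg.norm(row - c) for c in cache)
        if nearest > threshold:                 # anomalous: an "alien"
            cache.append(row)
    return cache

cache = [[0.0, 0.0], [10.0, 0.0]]               # most distant pair is 10 apart
incoming = [[0.5, 0.0], [5.0, 5.0], [5.2, 5.0]]
cache = add_aliens(cache, incoming)
print(len(cache))   # 3
```

Only the second incoming row is added: the first is within 10% of the span from a cached row, and the third is within that distance of the row just added.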
SUMMARY &
CONCLUSIONS
43
Underlying dimensions more
interesting than raw dimensions
• We can ignore most of
the data
– And still find the signal
– Filters, active learning
– Sometimes even
enhancing the signal
• Wrappers
• We can significantly and
usefully contort the
data
– And still get our signal
– E.g. transfer learning,
privacy
44
Data about software projects is
not stored in metric1, metric2,…
• But is shared between them
in some shared, underlying
shape.
• Not every project has the
same underlying simple shape
– Many projects have different,
albeit simple, shapes
• We can exploit that shape,
to great effect:
– For better local predictions
– For transferring lessons learned
– For privacy-preserving data mining
45
We were looking
in the wrong direction
• SE project data = surface features of an underlying effect
• Stop fussing on surface details (mere metrics)
• Go beneath the surface
46
Reflect LESS on raw dimensions
47
Reflect MORE on INTRINSIC dimensions
• Levina and Bickel report
that it is possible to
simplify seemingly
complex data:
– “... the only reason any
methods work in very high
dimensions is that, in fact,
the data are not truly high-
dimensional. Rather, they
are embedded in a high-
dimensional space, but can
be efficiently summarized in
a space of a much lower
dimension ...”
Elizaveta Levina and Peter J. Bickel. Maximum likelihood estimation of intrinsic dimension. In NIPS,
2004.
48
If SE data compresses
in this way then….
• The “7M” Hypothesis:
– Menzies mentions that many
measures
mean much the same thing
• In compressed space, many measures
tightly associated
– Does not matter exactly what you
collect
– Since they will map the same
structures
49
So, what metrics to collect?
• Whatever you can get, quickly,
cheaply:
– Then model within the reduced
dimensions
– Then cycle back to the users, for
sanity, for clarity, for questions for
the next round of analysis
50
51
52
SHOULD WE
STUDY L-SYSTEM?
Speculation
53
54
55
56
57
58
59
End of my tale

 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 

What Metrics Matter?

  • 1. ICSE’14 Workshop Keynote Address: Emerging Trends in Software Metrics (WeTSOM’14) What Metrics Matter? (And the answer may surprise you) tim.menzies@gmail.com 1 http://bit.ly/icse14metrics
  • 2. Late 2014 Late 2015 Coming soon to an Amazon near you 2
  • 3. This talk is in two parts Part1: a little history (my unhappiness with past “results”) Part2: a new view 3
  • 4. Data about software projects is not stored in metric1, metric2,… • But is shared between them in some shared, underlying shape. • Not every project has the same underlying simple shape – Many projects have different, albeit simple, shapes • We can exploit that shape, to great effect: – For better local predictions – For transferring lessons learned – For privacy-preserving data mining 4
  • 5. So, what metrics to collect? • Whatever you can get, quickly, cheaply: – Then model within the reduced dimensions – Then cycle back to the users, for sanity, for clarity, for questions for the next round of analysis 5
  • 7. A long time ago… In a century far, far away… • We thought the “right name” was inherently powerful – Stranger in a Strange Land (Heinlein) – A Wizard of Earthsea (Le Guin) – Snow Crash (Stephenson) • Sapir-Whorf hypothesis: – The right words let you think better • And we need such power – to avoid the lies and illusions of a cruel and confusing world. 7
  • 8. Shotgun correlations Courtney, R.E.; Gustafson, D.A., "Shotgun correlations in software measures," Software Engineering Journal , vol.8, no.1, pp.5,13, Jan 1993 8
  • 9. Norman F. Schneidewind. 1992. Methodology for Validating Software Metrics. IEEE Trans. Softw. Eng. 18, 5 (May 1992), 410-422. 9
  • 10. The 1990’s obsession: What metrics to collect? 10
  • 11. How to Design a Metrics Repository (mid-1990s) • RA-1 : process-centric [1] • TAME resource model: resource-centric [2] • Harrison model : product centric [3] [1] Ramakishanan et al, Building an effective measure systems, TR96, SD, Monash, 1996 [2] Jeffrey, Basili, Validating the TAME resource model, ICSE’88 [3] W. Harrison, Towards Well-Defined Shareable Product Data, Workshop SE : Future Directions, 1992 11
  • 12. Battle lines were drawn Blood was spilt • It is shown that a collection of nine properties suggested by E.J. Weyuker is inadequate for determining the quality of a software complexity measure. • A complexity measure which satisfies all nine of the properties, but which has absolutely no practical utility in measuring the complexity of a program is presented. • It is concluded that satisfying all of the nine properties is a necessary, but not sufficient, condition for a good complexity measure. John C. Cherniavsky and Carl H. Smith. 1991. On Weyuker's Axioms for Software Complexity Measures. IEEE Trans. Softw. Eng. 17, 6 (June 1991), 636-638 And again • Measurement theory is used to highlight both weaknesses and strengths of software metrics work, including work on metrics validation. • We identify a problem with the well-known Weyuker properties, but also show that a criticism of these properties by Cherniavsky and Smith is invalid. • We show that the search for general software complexity measures is doomed to failure. Norman Fenton, Software Measurement: A Necessary Scientific Basis, IEEE TSE 20(3), 1994 12
  • 13. And the eventual winner? • No one • When the dust settled, no one really cared. • Norman Fenton abandoned, renounced, his prior work on metrics. • The IEEE Metrics conference got cancelled – subsumed by ESEM – R.I.P. 13
  • 15. Looking in the wrong direction? • SE project data = surface features of an underlying effect • Stop fussing on surface details (mere metrics) • Go beneath the surface 15
  • 16. Reflect LESS on raw dimensions 16
  • 17. Reflect MORE on INTRINSIC dimensions • Levina and Bickel report that it is possible to simplify seemingly complex data: – “... the only reason any methods work in very high dimensions is that, in fact, the data are not truly high-dimensional. Rather, they are embedded in a high-dimensional space, but can be efficiently summarized in a space of a much lower dimension ...” Elizaveta Levina and Peter J. Bickel. Maximum likelihood estimation of intrinsic dimension. In NIPS, 2004. 17
  • 18. If SE data compresses in this way then…. • In compressed space, many measures tightly associated – Does not matter exactly what you collect – Since they will map the same structures • So collect what you can : – As fast as you can – Then model within the reduced dimensions • The “7M” Hypothesis: – Menzies mentions that many measures mean much the same thing 18
  • 19. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 19
  • 20. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Transfer learning – Filtering and wrapping – Privacy algorithms – Active learning 20
  • 21. Raw dimensions problematic (for effort estimation) • Conclusion instability • Learning effort = b0 + b1*x1 + b2*x2 + … • Repeat 20 times, each time on a random 66% of the data – Record the learned “b” values • NASA93 (effort data) 21
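
The resampling experiment on slide 21 can be sketched as follows. This is a minimal illustration, not the code behind the slide: it assumes ordinary least squares as the learner, uses synthetic data in place of NASA93, and every function name (`solve`, `fit_linear`, `coefficient_spread`) is hypothetical.

```python
import random

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.
    Assumes the system is not singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(rows, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ... (normal equations)."""
    xs = [[1.0] + list(r) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

def coefficient_spread(rows, ys, repeats=20, fraction=0.66, seed=1):
    """Refit on `repeats` random subsamples; return (min, max) per coefficient."""
    rng = random.Random(seed)
    size = int(len(rows) * fraction)
    learned = []
    for _ in range(repeats):
        idx = rng.sample(range(len(rows)), size)
        learned.append(fit_linear([rows[i] for i in idx], [ys[i] for i in idx]))
    k = len(learned[0])
    return [(min(b[j] for b in learned), max(b[j] for b in learned))
            for j in range(k)]

# Synthetic stand-in for effort data: x2 is nearly a copy of x1.
rng = random.Random(0)
rows, ys = [], []
for _ in range(30):
    x1 = rng.uniform(0, 10)
    rows.append((x1, x1 + rng.gauss(0, 0.1)))
    ys.append(5 + 2 * x1 + rng.gauss(0, 1))
for name, (lo, hi) in zip(("b0", "b1", "b2"), coefficient_spread(rows, ys)):
    print(f"{name}: {lo:.2f} .. {hi:.2f}")
```

With two nearly redundant inputs, the learner can trade weight between b1 and b2 from one subsample to the next, which is one way the "what matters most" answer can become unstable.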
  • 22. Raw dimensions problematic (for defect prediction) Tim Menzies, Andrew Butcher, David R. Cok, Andrian Marcus, Lucas Layman, Forrest Shull, Burak Turhan, Thomas Zimmermann: Local versus Global Lessons for Defect Prediction and Effort Estimation. IEEE Trans. Software Eng. 39(6): 822-834 (2013) 22
  • 23. By the way, same instabilities for social metrics Results from Helix repo • Many classes are “write once” – A few OO classes are “rewrite many” • Defect detectors that take into account this developer social interaction – perform very well indeed – Near optimum Lumpe, Vasa, Menzies, Rush, Turhan, Learning better inspection optimization policies, International Journal of Software Engineering and Knowledge Engineering 22(05), 621-644, 2012 Results for AT&T • Studied patterns of programmer interaction with the code • Not a major influence on defects Weyuker, Ostrand, Bell. 2008. Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models. Empirical Softw. Eng. 13, 5 (October 2008), 539-559 23
  • 24. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 24
  • 25. Intrinsic dimensionality (in theory) Elizaveta Levina and Peter J. Bickel. Maximum likelihood estimation of intrinsic dimension. In NIPS, 2004. 25
  • 26. Intrinsic dimensionality (in practice) • Defect data from open source Java projects • TRAIN: – Project 21 features onto two synthesized dimensions • using FASTMAP • X = first PCA component • Y = at right angles to X – Recursively divide the two dimensions (at median) • Stopping at SQRT(N) – In each grid cell, replace N projects with median centroid • TEST: – Estimate = interpolate between 2 nearest centroids • For 10 data sets, 5*5 cross-val – Performs no worse, and sometimes better, than Random Forests, NaiveBayes • Conclusion: – 21 dimensions can map to two without loss of signal Vasil Papakroni, Data Carving: Identifying and Removing Irrelevancies in the Data, Masters thesis, WVU, 2013 http://goo.gl/i6caq7 26
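
The TRAIN steps on slide 26 can be sketched roughly as below. This is an illustration under stated assumptions, not the thesis code: it uses the standard FastMap cosine-rule projection with Euclidean distance, splits into quadrants at the median of both synthesized dimensions, and computes centroids in the projected space for brevity. The names `fastmap_2d`, `grid_cells` and `centroid` are hypothetical.

```python
import math
import random
import statistics

def dist(p, q):
    """Euclidean distance between two numeric rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fastmap_2d(rows, seed=1):
    """Project rows onto two synthesized dimensions.
    x runs along the line between two distant rows; y is at right angles."""
    rng = random.Random(seed)
    anchor = rng.choice(rows)
    east = max(rows, key=lambda r: dist(anchor, r))   # far from a random row
    west = max(rows, key=lambda r: dist(east, r))     # far from east
    c = dist(east, west)
    if c == 0:
        return [(0.0, 0.0) for _ in rows]
    out = []
    for r in rows:
        a, b = dist(r, east), dist(r, west)
        x = (a * a + c * c - b * b) / (2 * c)          # cosine rule
        y = math.sqrt(max(0.0, a * a - x * x))         # Pythagoras
        out.append((x, y))
    return out

def grid_cells(points, min_size):
    """Recursively split at the median x and median y until a cell
    holds at most min_size points (or cannot be split further)."""
    if len(points) <= min_size:
        return [points]
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    mx, my = xs[len(xs) // 2], ys[len(ys) // 2]
    quads = {}
    for p in points:
        quads.setdefault((p[0] < mx, p[1] < my), []).append(p)
    if len(quads) == 1:                                # nothing separated
        return [points]
    cells = []
    for q in quads.values():
        cells.extend(grid_cells(q, min_size))
    return cells

def centroid(cell):
    """Median of each coordinate across the points in a cell."""
    return tuple(statistics.median(col) for col in zip(*cell))

# Usage on random 21-dimensional rows (synthetic, for illustration only).
rng = random.Random(0)
data = [tuple(rng.random() for _ in range(21)) for _ in range(64)]
projected = fastmap_2d(data)
cells = grid_cells(projected, min_size=int(math.sqrt(len(data))))
print(len(cells), "cells;", "first centroid:", centroid(cells[0]))
```

The TEST step (interpolating between the two nearest centroids) is left out here; it would need the class values carried along with each cell.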
  • 27. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 27
  • 28. Active learning in effort estimation • If it is difficult to find actual project effort, ask for it on the fewest projects • Put aside a hold-out set • Prune columns that are most often other columns’ nearest neighbors • Sort rows by how often they are other rows’ nearest neighbor • For first “I” rows in sort – Train (then test on hold out) – Stop when no performance gain after N new rows Ekrem Kocaguneli, Tim Menzies, Jacky Keung, David R. Cok, Raymond J. Madachy: Active Learning and Effort Estimation: Finding the Essential Content of Software Effort Estimation Data. IEEE Trans. Software Eng. 39(8): 1040-1053 (2013) 28
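
The row ordering and stopping rule on slide 28 can be sketched as follows. This is a simplified illustration: it assumes Euclidean distance over already-normalized numeric rows, and the scoring function is a caller-supplied placeholder rather than a real effort estimator. The names `popularity_order` and `rows_to_label` are hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two numeric rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def popularity_order(rows):
    """Return row indices, most popular first.
    A row's popularity is how often it is some other row's nearest neighbor."""
    counts = [0] * len(rows)
    for i, row in enumerate(rows):
        others = (j for j in range(len(rows)) if j != i)
        nearest = min(others, key=lambda j: dist(row, rows[j]))
        counts[nearest] += 1
    return sorted(range(len(rows)), key=lambda i: -counts[i])

def rows_to_label(order, score, patience=3):
    """Grow the labelled set in popularity order; stop once `patience`
    consecutive additions bring no gain in score (higher is better).
    `score` takes the list of labelled indices and returns hold-out performance."""
    best, best_n, stale = None, 0, 0
    for n in range(1, len(order) + 1):
        s = score(order[:n])
        if best is None or s > best:
            best, best_n, stale = s, n, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return order[:best_n]

# Usage with a toy score that saturates after two labelled rows.
rows = [(0.0,), (1.0,), (3.0,), (10.0,)]
order = popularity_order(rows)
print(order, rows_to_label(order, lambda labelled: min(len(labelled), 2), patience=2))
```

Column pruning works the same way on the transposed table, except that the most popular columns are the ones removed.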
  • 29. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 29
  • 30. Between Turkish Toasters AND NASA Space Ships 30
  • 31. Q: How to TRANSFER Lessons Learned? • Ignore most of the data • relevancy filtering: Turhan ESEj’09; Peters TSE’13 • variance filtering: Kocaguneli TSE’12,TSE’13 • performance similarities: He ESEM’13 • Contort the data • spectral learning (working in PCA space or some other rotation) Menzies, TSE’13; Nam, ICSE’13 • Build a bickering committee • Ensembles Minku, PROMISE’12 31
  • 32. BTW, Sometimes, TRANSFER better than LOCAL Minku:PROMISE’12 Nam:ICSE’13 Peters:TSE’13 32
  • 33. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 33
  • 34. Some technology: feature selectors Wrappers • Slow: O(2^N) for N features • Selection biased by target learner Filters • Faster (some even linear time) • Selection may confuse target M.A. Hall and G. Holmes, Benchmarking Attribute Selection Techniques for Discrete Class Data Mining, IEEE Transactions on Knowledge and Data Engineering 15(6): 1437-1447, 2003 34
  • 35. Filter results • Data from Norman Fenton’s Bayes nets discussing software defects = yes, no • Given classes x,y – Fx, Fy • frequency of discretized ranges in x,y – Log Odds Ratio [1] • log(Fx/Fy) • Is zero if no difference in x,y • Most variables do not contribute to determination of defects [1] Martin Možina, Janez Demšar, Michael Kattan, and Blaž Zupan. 2004. Nomograms for visualization of naive Bayesian classifier. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD '04) 35
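
The log odds ratio on slide 35 can be computed per discretized range as in this small sketch. The smoothing constant `eps` is an assumption added here to avoid dividing by zero when a range never appears in one class; the function name and the toy data are hypothetical.

```python
import math

def range_log_odds(xs, ys, r, eps=1e-6):
    """log(Fx/Fy) for range r, where Fx and Fy are the relative
    frequencies of r among the values seen in class x and class y."""
    fx = xs.count(r) / len(xs)
    fy = ys.count(r) / len(ys)
    return math.log((fx + eps) / (fy + eps))

# Toy discretized attribute for defective (x) and non-defective (y) modules.
defective = ["hi", "hi", "hi", "lo"]
clean = ["hi", "lo", "lo", "lo"]
print(range_log_odds(defective, clean, "hi"))   # positive: "hi" favors defects
print(range_log_odds(defective, clean, "lo"))   # negative: "lo" favors clean
```

A range that is equally frequent in both classes scores zero, which is the slide's sense of a variable that does not contribute.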
  • 36. Filter results (more) • Defect prediction, NASA data • Baseline = learn defect model on all data (McCabe + Halstead + LOC measures) • Filter = sort columns on “info gain” • Experiment = use the first N items in the sort, stopping when recall and false alarm are the same as baseline Tim Menzies, Jeremy Greenwald, Art Frank: Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Trans. Software Eng. 33(1): 2-13 (2007) 36
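
The "sort columns on info gain" filter from slide 36 can be sketched as below, assuming the columns have already been discretized. The column names and data are made up for illustration, and the stopping rule (compare recall and false alarm against the baseline) is left to the caller.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(column, labels):
    """Drop in class entropy after splitting the rows on a discrete column."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_columns(table, labels):
    """Column names sorted from most to least informative.
    `table` maps a column name to its list of discretized values."""
    return sorted(table, key=lambda name: info_gain(table[name], labels),
                  reverse=True)

# Hypothetical toy table: one column tracks the class, one is noise.
labels = ["defect", "defect", "clean", "clean"]
table = {
    "noise": ["a", "b", "a", "b"],
    "loc":   ["high", "high", "low", "low"],
}
print(rank_columns(table, labels))
```

The experiment on the slide would then train on the first N names from this ranking, growing N until performance matches the all-columns baseline.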
  • 37. Filter results on defect data: many different features equally good for prediction • Consistent with this data “pinging” a much lower dimensional space Tim Menzies, Jeremy Greenwald, Art Frank: Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Trans. Software Eng. 33(1): 2-13 (2007) 37
  • 38. Some Wrapper results: effort estimation, NASA data • X = f(a,b,c,...) • X’s variance comes from a,b,c • If fewer a,b,c – then less confusion about X • E.g. effort estimation • Pred(30) = % of estimates within 30% of actual Zhihao Chen, Tim Menzies, Daniel Port, Barry Boehm, Finding the Right Data for Software Cost Modelling, IEEE Software, Nov, 2005 38
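
The Pred(30) measure named on slide 38 can be written out as follows. This sketch assumes "within 30%" is measured relative to the actual value and treats the boundary as inclusive; the effort numbers are invented.

```python
def pred(actuals, estimates, threshold=30):
    """Percentage of estimates whose relative error is within threshold percent."""
    hits = sum(1 for a, e in zip(actuals, estimates)
               if abs(a - e) / a <= threshold / 100)
    return 100.0 * hits / len(actuals)

# Four projects with an actual effort of 100 each.
# Relative errors are 10%, 31%, 20% and 100%, so two of four are within 30%.
print(pred([100, 100, 100, 100], [110, 131, 80, 200]))
```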
  • 39. How to test for 7M • Specifics do not matter • But general shape does • Examples : – Instability in “what matters most” – Intrinsic dimensionality – Active learning – Transfer learning – Filtering and wrapping – Privacy algorithms 39
  • 40. Peters’ Power Principle (for row and column pruning) Filtering via range “power” • Divide data with N rows into one region for each class x, y, etc. • For each region x, of size nx • px = nx/N • py (of everything else) = (N-nx)/N • Let Fx and Fy be the frequency of range r in (1) region x and (2) everywhere else • Do the Bayesian thing: • a = Fx * px • b = Fy * py • Power of range r for predicting x is: • POW[r,x] = a^2/(a+b) Pruning • Column pruning • Sort columns by power of column (POC) • POC = max POW value in that column • Row pruning • Sort rows by power of row (POR) • If row is classified as x • POR = Prod( POW[r,x] for r in row ) • Keep 20% most powerful rows and columns: • 0.2 * 0.2 = 0.04 • i.e. 4% of the original data Fayola Peters, Tim Menzies, Liang Gong, Hongyu Zhang, Balancing Privacy and Utility in Cross-Company Defect Prediction, IEEE Trans. Software Eng. 39(8): 1054-1068 (2013) 40
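
The arithmetic on slide 40 can be followed step by step in this sketch. It takes the frequencies Fx and Fy as given inputs (how they are counted from discretized data is not shown), and all function names are hypothetical.

```python
import math

def range_power(fx, fy, nx, n):
    """POW[r,x] = a^2 / (a + b), with a = Fx * px and b = Fy * py."""
    px = nx / n              # prior of region x
    py = (n - nx) / n        # prior of everything else
    a = fx * px
    b = fy * py
    return a * a / (a + b) if a + b > 0 else 0.0

def column_power(range_powers):
    """POC: the largest POW value found in a column."""
    return max(range_powers)

def row_power(range_powers):
    """POR: the product of POW[r,x] over the ranges in a row of class x."""
    return math.prod(range_powers)

def keep_most_powerful(scores, fraction=0.2):
    """Indices of the top `fraction` of items by score (at least one)."""
    keep = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])

# A range seen in 80% of region x and 20% of the rest, with x being half the data:
# a = 0.8 * 0.5 = 0.4, b = 0.2 * 0.5 = 0.1, power = 0.16 / 0.5
print(range_power(fx=0.8, fy=0.2, nx=50, n=100))
print(keep_most_powerful([0.1, 0.9, 0.5, 0.3, 0.7]))
```

Squaring `a` rewards ranges that are both frequent in x and rare elsewhere, so a range that is common everywhere scores lower than one that is distinctive.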
  • 41. Q: What does that look like? A: Empty out the “billiard table” • This is a privacy algorithm: – CLIFF: prune X% of rows, we are 100-X% private – MORPH: mutate the survivors no more than half the distance to their nearest unlike neighbor – One of the few known privacy algorithms that does not damage data mining efficacy (Figure panels: before, after) Fayola Peters, Tim Menzies, Liang Gong, Hongyu Zhang, Balancing Privacy and Utility in Cross-Company Defect Prediction, IEEE Trans. Software Eng. 39(8): 1054-1068 (2013) 41
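
The MORPH rule on slide 41 (move each surviving row at most half the distance to its nearest unlike neighbor) can be sketched as below. The step size drawn uniformly from 0 to 0.5 and the random direction are assumptions made for this illustration; the slide only states the upper bound. Names and data are hypothetical.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two numeric rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def morph(rows, labels, max_step=0.5, seed=1):
    """Mutate every row by a random fraction (at most max_step) of its
    offset from the nearest row that has a different class label."""
    rng = random.Random(seed)
    out = []
    for row, label in zip(rows, labels):
        unlike = [r for r, l in zip(rows, labels) if l != label]
        if not unlike:                      # only one class: leave unchanged
            out.append(tuple(row))
            continue
        z = min(unlike, key=lambda r: dist(row, r))
        step = rng.uniform(0, max_step)
        sign = rng.choice((-1, 1))          # toward or away from the neighbor
        out.append(tuple(x + sign * step * (x - zi) for x, zi in zip(row, z)))
    return out

rows = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
labels = ["clean", "clean", "buggy", "buggy"]
for before, after in zip(rows, morph(rows, labels)):
    print(before, "->", after)
```

Because no row travels more than half way to an unlike neighbor, the bound is meant to keep a mutated row closer to its original position than to that neighbor.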
  • 42. Advanced privacy methods Incremental learning • Pass around the reduced data set • “Alien”: new data is too “far away” from the reduced data – “Too far”: 10% of the separation of the most distant pair • If anomalous, add to cache – For defect data, cache does not grow beyond 3% of total data Privacy-preserving cross-company learning • LACE: Learn from N software projects – Mixtures of open+closed source projects • As you learn, play “pass the parcel” – The cache of reduced data • Each company only adds its “aliens” to the passed cache – Morphing as they go • Each company has full control of privacy • Generated very good defect predictors Peters, Ph.D. thesis, WVU, September 2014, in progress. ASE’14: submitted 42
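
The "alien" test on slide 42 can be sketched as follows. This illustration assumes Euclidean distance, a cache of at least two rows, and that "too far" is measured from the nearest cached row; these details are not spelled out on the slide, and the function names are hypothetical.

```python
import math
from itertools import combinations

def dist(p, q):
    """Euclidean distance between two numeric rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_alien(row, cache, fraction=0.1):
    """True if row is farther from every cached row than `fraction`
    of the separation of the most distant pair in the cache."""
    diameter = max(dist(a, b) for a, b in combinations(cache, 2))
    nearest = min(dist(row, c) for c in cache)
    return nearest > fraction * diameter

def update_cache(cache, new_rows, fraction=0.1):
    """Return a copy of the cache extended with the aliens among new_rows."""
    cache = list(cache)
    for row in new_rows:
        if is_alien(row, cache, fraction):
            cache.append(row)
    return cache

cache = [(0.0, 0.0), (10.0, 0.0)]
incoming = [(0.5, 0.0), (5.0, 0.0), (5.2, 0.0)]
print(update_cache(cache, incoming))   # only the row at (5.0, 0.0) is added
```

In the "pass the parcel" setting, each company would run something like `update_cache` on its own data (after morphing) and hand the grown cache to the next company.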
  • 44. Underlying dimensions more interesting than raw dimensions • We can ignore most of the data – And still find the signal – Filters, active learning – Sometimes even enhancing the signal • Wrappers • We can significantly and usefully contort the data – And still get our signal – E.g. transfer learning, privacy 44
  • 45. Data about software projects is not stored in metric1, metric2,… • But is shared between them in some shared, underlying shape. • Not every project has the same underlying simple shape – Many projects have different, albeit simple, shapes • We can exploit that shape, to great effect: – For better local predictions – For transferring lessons learned – For privacy-preserving data mining 45
  • 46. We were looking in the wrong direction • SE project data = surface features of an underlying effect • Stop fussing on surface details (mere metrics) • Go beneath the surface 46
  • 47. Reflect LESS on raw dimensions 47
  • 48. Reflect MORE on INTRINSIC dimensions • Levina and Bickel report that it is possible to simplify seemingly complex data: – “... the only reason any methods work in very high dimensions is that, in fact, the data are not truly high-dimensional. Rather, they are embedded in a high-dimensional space, but can be efficiently summarized in a space of a much lower dimension ...” Elizaveta Levina and Peter J. Bickel. Maximum likelihood estimation of intrinsic dimension. In NIPS, 2004. 48
  • 49. If SE data compresses in this way then…. • The “7M” Hypothesis: – Menzies mentions that many measures mean much the same thing • In compressed space, many measures tightly associated – Does not matter exactly what you collect – Since they will map the same structures 49
  • 50. So, what metrics to collect? • Whatever you can get, quickly, cheaply: – Then model within the reduced dimensions – Then cycle back to the users, for sanity, for clarity, for questions for the next round of analysis 50
  • 59. 59 End of my tale