1) The document discusses analyzing open source software development projects through data retrieved from development repositories.
2) It provides an analysis of the Joomla project based on data from its Git and issue tracking repositories, including metrics on commits, committers, issues, and companies contributing.
3) The analysis finds trends over time in areas like commits, committers, files and lines changed, and issue tracking that can help understand the project's development.
Focus Group Open Source 04.06.2012 Jesus Gonzalez Barahona
1. Analyzing free software development projects
Jesus M. Gonzalez-Barahona
jgb@gsyc.es
http://identi.ca/jgbarah http://twitter.com/jgbarah
Bitergia
GSyC/LibreSoft, Universidad Rey Juan Carlos
Focus Open Source Group, Rome, June 4th, 2012
Jesus Gonzalez-Barahona (Bitergia) Analyzing free software development projects Focus Open Source 2012 1 / 22
2. © 2012 Bitergia
Some rights reserved. This presentation is distributed under the
“Attribution-ShareAlike 3.0” license, by Creative Commons, available at
http://creativecommons.org/licenses/by-sa/3.0/
3. GSyC/LibreSoft
Research group at Universidad Rey Juan Carlos
About 20 persons, including students
Focus on FLOSS (free, libre, open source software)
One of the main research lines:
Understanding FLOSS development
Quantitative, empirical approach
Based on data retrieved from FLOSS development repositories
Participating in several R&D projects
http://libresoft.es
4. Bitergia: a spin-off
Company starting operations in June 2012
Building on the experience of LibreSoft
Offering professional products and services
Focused on:
Metrics about software development
(including community metrics)
Specialized support for development forges
(including metrics for projects)
http://bitergia.com
5. Analyzing Joomla (preliminary work)
Content management framework
Source code management repositories:
git: git://github.com/joomla/joomla-cms.git
git: git://github.com/joomla/joomla-platform.git
From: 2005-09-15 04:11:08
To: 2012-05-20 11:36:34
20,605 commits, 215 committers
Issue tracking repository:
GitHub: https://api.github.com/repos/joomla/joomla-cms/issues
GitHub: https://api.github.com/repos/joomla/joomla-platform/issues
Retrieved on: 2012-06-03
First issue submitted on: 2011-08-24 15:25:25
1,464 issue reports (including pull requests)
http://joomla.org/
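The per-month commit and committer figures that follow can be derived from the git history alone. As a minimal sketch (not necessarily the tooling used for this analysis), the Python below assumes the log was exported one commit per line with `git log --pretty=format:'%H|%ce|%cd' --date=iso`; the function name `monthly_activity` and the use of the committer email as identity are illustrative assumptions.

```python
from collections import defaultdict


def monthly_activity(log_text):
    """Count commits and distinct committers per month.

    Assumes one commit per line, formatted as "<hash>|<email>|<ISO date>".
    """
    commits = defaultdict(int)
    committers = defaultdict(set)
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        _sha, email, date = line.split("|", 2)
        month = date[:7]  # "YYYY-MM" prefix of the ISO date
        commits[month] += 1
        committers[month].add(email.lower())
    return {
        month: {
            "commits": commits[month],
            "committers": len(committers[month]),
            "commits_per_committer": commits[month] / len(committers[month]),
        }
        for month in sorted(commits)
    }


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the Joomla repositories.
    sample = (
        "a1|alice@example.org|2012-04-02 10:00:00 +0200\n"
        "b2|bob@example.org|2012-04-15 09:30:00 +0000\n"
        "c3|alice@example.org|2012-05-01 08:00:00 +0200\n"
    )
    for month, row in monthly_activity(sample).items():
        print(month, row)
```

Identifying committers by email is a simplification: one person using several addresses would be counted more than once.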
6. Commits per month
[Figure: commits per month over time, 2006 to 2012; y-axis: commits, 0 to 600]
7. Committers per month
[Figure: committers per month over time, 2006 to 2012; y-axis: committers, 0 to 30]
8. Commits per committer per month
[Figure: commits per committer per month over time, 2006 to 2012; y-axis: commits per committer, 0 to 50]
9. Commits per committer per month (3D)
[Figure: 3D plot of commits by month and committer; axes: Month, Committer, Commits]
10. Commits per month (master branch)
[Figure: commits (branch 1) per month over time, 2006 to 2012; y-axis: commits, 0 to 400]
11. Lines added & removed per month (master branch)
[Figure: Branch 1: lines added (black) / removed (green) per month over time, 2006 to 2012]
12. Files involved in each commit, mean per month (master branch)
[Figure: files changed per commit (mean per month) over time, 2006 to 2012; y-axis 0 to 300]
13. Lines changed per commit, mean per month (master)
[Figure: lines changed per commit (mean per month) over time, 2006 to 2012; y-axis 0 to 250]
14. Lines added & removed per file per month (master branch)
[Figure: Branch 1 (per change): lines added (black) / removed (green) over time, 2006 to 2012; y-axis 0 to 120]
15. Density distribution of commit size (master branch)
[Figure: probability density of commit size on a log 10 scale: files (black), lines added (red), lines removed (green)]
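Commit sizes like those in this distribution (files touched, lines added, lines removed) can be read from `git log --numstat`. The sketch below is one possible approach under stated assumptions, not the method used for the slides: it assumes the log was produced with `git log --numstat --pretty=format:'commit %H'`, and the helper names `commit_sizes` and `log10_values` are hypothetical.

```python
import math


def commit_sizes(numstat_text):
    """Return files/added/removed counts for each commit in the log text."""
    sizes = []
    current = None
    for line in numstat_text.splitlines():
        if line.startswith("commit "):
            current = {"files": 0, "added": 0, "removed": 0}
            sizes.append(current)
        elif line.strip() and current is not None:
            # numstat lines look like "<added>\t<removed>\t<path>"
            added, removed, _path = line.split("\t", 2)
            current["files"] += 1
            if added != "-":  # binary files are reported as "-"
                current["added"] += int(added)
                current["removed"] += int(removed)
    return sizes


def log10_values(sizes, key):
    """Log 10 of one size measure, skipping zeros (log undefined)."""
    return [math.log10(s[key]) for s in sizes if s[key] > 0]


if __name__ == "__main__":
    # Hypothetical sample log, for illustration only.
    sample = (
        "commit a1\n10\t2\tREADME\n1\t0\tsrc/a.php\n\n"
        "commit b2\n-\t-\tlogo.png\n100\t0\tlib.php\n"
    )
    sizes = commit_sizes(sample)
    print(sizes)
    print(log10_values(sizes, "added"))
```

The resulting lists can be passed to any density estimator or histogram routine to obtain curves comparable to the figure.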
16. Companies of committers to Joomla
Freelance
Newlifeinit
Ebay
Timble
Volunteer
University
Unknown
Rockettheme
Kontentdesign
Popcliq
Ezsystems
Nbcuniversal
Rmdstudios
Lighthost
Holidaycheckag
Syncleon
Outer ring: commits / Inner ring: committers
17. Issues
Time to fix bugs
[Figure: density of time to fix; x-axis: time to fix (days), 0 to 300]
18. Issues
[Figure: density of time to fix (days), two panels: "Quickly fixed" (0 to 30 days) and "Slowly fixed" (0 to 350 days)]
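Time to fix can be computed from the `created_at` and `closed_at` timestamps that the GitHub issues API returns for each issue. The following is a minimal sketch, assuming issues have already been downloaded as a list of dictionaries; the 30-day boundary between "quickly" and "slowly" fixed is an assumption read off the axis ranges of the two panels, and the function names are illustrative.

```python
from datetime import datetime

GITHUB_TIME = "%Y-%m-%dT%H:%M:%SZ"  # e.g. "2011-08-24T15:25:25Z"


def days_to_fix(issue):
    """Days between creation and closing, or None if still open."""
    if not issue.get("closed_at"):
        return None
    opened = datetime.strptime(issue["created_at"], GITHUB_TIME)
    closed = datetime.strptime(issue["closed_at"], GITHUB_TIME)
    return (closed - opened).total_seconds() / 86400


def split_by_speed(issues, threshold_days=30):
    """Split closed issues into (quick, slow) lists of days to fix.

    threshold_days=30 is an assumed cut-off, not stated in the slides.
    """
    quick, slow = [], []
    for issue in issues:
        days = days_to_fix(issue)
        if days is None:
            continue  # open issues have no time to fix yet
        (quick if days <= threshold_days else slow).append(days)
    return quick, slow
```

Treating every closed issue as "fixed" is itself a simplification, since issues and pull requests can also be closed without a fix.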
19. Issues
[Figure: open and closed bugs over time; x-axis: weeks, 0 to 50; y-axis: bugs, 0 to 60]
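A weekly series of opened and closed reports can be built from the same issue timestamps. The sketch below assumes the plot counts reports opened and closed in each week since the first report (the slide does not say whether the counts are per week or cumulative); `weekly_opened_closed` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

GITHUB_TIME = "%Y-%m-%dT%H:%M:%SZ"


def weekly_opened_closed(issues):
    """Return (week, opened, closed) tuples, week 0 = week of first report."""
    if not issues:
        return []
    created = [datetime.strptime(i["created_at"], GITHUB_TIME) for i in issues]
    start = min(created)
    opened, closed = Counter(), Counter()
    for issue, when in zip(issues, created):
        opened[(when - start).days // 7] += 1
        if issue.get("closed_at"):
            done = datetime.strptime(issue["closed_at"], GITHUB_TIME)
            closed[(done - start).days // 7] += 1
    last_week = max(list(opened) + list(closed))
    # Counter returns 0 for weeks with no activity.
    return [(w, opened[w], closed[w]) for w in range(last_week + 1)]
```

A cumulative view, if that is what is wanted, follows by taking running sums over the two count columns.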
20. Sidenote: the history of OpenOffice.org / LibreOffice
[Very preliminary, as found in the LibreOffice repository, 2000-2012]
[Figure: 3D plot of commits by month and committer; axes: Month, Committer, Commits]
[Contributions of more than 1,000 commits trimmed]
21. In summary
FLOSS development repositories have a wealth of information
Their analysis is potentially interesting to any stakeholder
Getting the data out of the repository is not that difficult...
...but analysis may be
We’re interested in deep analysis
We’re interested in working with developers, managers, users
Which aspects of your project would you like to know?
22. This is the end
Have you learned something useful?
[I would love to know what interested you the most]
[...and the least]