Socio-Technical
Evolution and Migration
in the Ruby Ecosystem
Eleni Constantinou, Tom Mens
COMPLEXYS Research Institute, UMONS
BENEVOL 2016, Utrecht
Introduction
Software ecosystem
• Collection of software projects that are developed and
evolve together in the same environment [1]
Ecosystem environment
• Development team ⇒ Social aspect
• Source code artefacts ⇒ Technical aspect
Modifications
• Social: Contributors joining/leaving
• Technical: New/obsolete source code files
[1] M. Lungu. Towards reverse engineering software ecosystems. Int'l Conf. Software Maintenance, pages 428-431, 2008.
Introduction
Evolution
• Longevity
• Growth
Ecosystem sustainability
Long-term effect of social/technical modifications
A sustainable software ecosystem
can increase or maintain its
user/developer community over
longer periods of time and can
survive inherent changes such
as new technologies or new
products (e.g. from competitors)
that can change the population
(the community of users,
developers etc) [2]
[2] D. Dhungana, I. Groher, E. Schludermann, S. Biffl. Software ecosystems vs. natural ecosystems: learning from the
ingenious mind of nature. Eur. Conf. on Software Architecture: Companion Volume, pages 96-102, 2010.
Background
[Figure: timeline from START to END divided into Time Unit 1, 2, 3, …, N-2, N-1, N. Two snapshots of technical artefacts are shown, labelled with projects P1, P2, P3 and P1, P3, P4.]
Definitions
Project Metrics
ObsoleteProjects(t)
NewProjects(t)
ActiveProjects(t)
ProjectRenewal(t)
ProjectAbandonment(t)
[Figure: example sets of projects P1 to P4 illustrating each metric.]
Definitions
Team Metrics
Leavers(t)
Joiners(t)
Stayers(t)
Team(t)
TeamRenewal(t)
TeamAbandonment(t)
Definitions
File Metrics
Obsolete(t)
New(t)
Maintained(t)
FileRenewal(t)
FileAbandonment(t)
[Figure: example files marked X, ✔ or ⃝ illustrating each metric.]
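The figures that defined these metrics did not survive extraction, so the following Python sketch is only one plausible reading and not the authors' definition. It assumes that an entity (project, contributor or file) is new in time unit t if it is active in t and in no earlier unit, that it is obsolete in t if it was active in t-1 and is never active again, that renewal is the share of new entities among those active in t, and that abandonment is the share of obsolete entities among those active in t-1. The function name `renewal_abandonment` and the input layout are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple


def renewal_abandonment(
    active: Sequence[Set[Hashable]], t: int
) -> Tuple[float, float]:
    """Return (renewal, abandonment) for time unit t (0-based, t >= 1).

    active[k] is the set of entities with activity in time unit k.
    The definitions used here are assumptions, not taken from the slides.
    """
    if not 1 <= t < len(active):
        raise ValueError("t must satisfy 1 <= t < len(active)")

    seen_before = set().union(*active[:t])   # active in any earlier unit
    seen_from_t = set().union(*active[t:])   # active in t or any later unit

    new = active[t] - seen_before            # first activity happens in t
    obsolete = active[t - 1] - seen_from_t   # last activity happened in t-1

    renewal = len(new) / len(active[t]) if active[t] else 0.0
    abandonment = len(obsolete) / len(active[t - 1]) if active[t - 1] else 0.0
    return renewal, abandonment


# Toy example reusing the P1..P4 labels from the slides.
units = [{"P1", "P2", "P3"}, {"P1", "P3", "P4"}]
print(renewal_abandonment(units, 1))
```

In the toy example P4 is new and P2 is obsolete, so both ratios are one third. Requiring that an obsolete entity never reappears matches the deck's focus on permanent modifications, but it also means values for the most recent time units depend on where the observation window ends.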
Source Code Files
Refactoring activities
• Renamed files
• Moved files
Validity of renewal and abandonment measurements
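A renamed or moved file would otherwise be counted as one obsolete file plus one new file, which inflates both FileAbandonment and FileRenewal. The sketch below shows one way to compensate. It assumes rename events are available as chronological (old_path, new_path) pairs, for example from a version-control tool's rename detection. The helper `canonical_ids` is hypothetical and is not the authors' procedure.

```python
from typing import Dict, Iterable, Tuple


def canonical_ids(renames: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map every path that appears in a rename to its original path.

    renames must be in chronological order. Paths that were never
    renamed are absent from the result and identify themselves.
    """
    origin: Dict[str, str] = {}
    for old_path, new_path in renames:
        # If old_path is itself the result of an earlier rename, reuse its origin.
        root = origin.get(old_path, old_path)
        origin[old_path] = root
        origin[new_path] = root
    return origin


history = [("lib/foo.rb", "lib/bar.rb"), ("lib/bar.rb", "src/bar.rb")]
ids = canonical_ids(history)
print(ids["src/bar.rb"])  # lib/foo.rb
```

Counting files by canonical identifier instead of by path keeps a refactored file in the maintained set. The sketch does not handle a path that is reused by an unrelated file after a rename.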
Research Questions
RQ1 How does the ecosystem grow over time?
RQ2 How do the technical artefacts of the ecosystem evolve?
RQ3 How does the ecosystem’s contributor team evolve?
RQ4 How do changes in the contributor team impact the technical artefacts?
Dataset
• Ruby ecosystem in GitHub
• GHTorrent dataset [2] (2016-09-05 dump)
• Timespan: October 2007 – September 2016
• Time unit: year quarters
• Commit activity
• Three levels: Base project/Forks/Ecosystem
[2] G. Gousios. The GHTorrent dataset and tool suite. Working Conf. Mining Software Repositories, pages 233-236, 2013.
Dataset Perils – Mitigation & filters
Filter | Description | Perils
1 | Eliminate non-Ruby projects |
2 | Eliminate inactive projects | Low project activity, inactive project, repository is not a project
3 | Eliminate isolated projects | Personal projects
4 | Eliminate forks without merges to the base project | Inactive project, few projects use pull requests
5 | Eliminate short-lived contributors | Noise of occasional/short-lived contributors
6 | Only consider source code files in commits | Non-software development project
Dataset
 | Base | Forks | Ecosystem
Projects | 10,792 | 49,101 | 60,073
Contributors | 42,206 | 34,317 | 55,924
Touched Files | 681,539 | 191,016 | 712,300
Commits | 2,638,097 | 887,030 | 3,525,127
LOC | 389,930,604 | 77,510,268 | 467,440,872
RQ1 How does the ecosystem grow over time?
[Figure: two panels, Commits and Lines of Code (LOC).]
RQ1 How does the ecosystem grow over time?
[Figure: panels for Base Projects and Forks.]
Quarter 25 (November 2013-February 2014)
Small number of new projects
RQ1 How does the ecosystem grow over time?
[Figure: three panels plotting ProjectRenewal and ProjectAbandonment (0 to 1) against Quarters (5 to 30); labelled panels: Base Projects, Forks.]
Before quarter 25
• Base Projects: 30-40% new projects, less than 10% abandoned
• Forks: more than 60% new forks
RQ1 How does the ecosystem grow over time?
Evidence of contributor migration to JavaScript
After quarter 17 (December 2011)
Larger growth of JavaScript ecosystem
RQ2 How do the technical artefacts (files) evolve?
[Figure: panels for Base Projects and Forks.]
Base projects: Bulk of development activity
After quarter 25: decrease of new files
RQ3 How does the contributor team evolve?
[Figure: panels for Base Projects, Forks and Ecosystem.]
Contributors leave forks but continue to participate in base projects
After quarter 20: more Leavers; fewer Joiners
RQ3 How does the contributor team evolve?
[Figure: three panels (Base Projects, Forks, Ecosystem) plotting TeamRenewal and TeamAbandonment (0 to 1) against Quarters (5 to 30).]
Decreasing renewal; increasing abandonment
After quarter 25: Abandonment > Renewal
RQ3 How does the contributor team evolve?
Ecosystem | Active in Ruby
JavaScript | 18,038
Python | 10,707
Java | 7,363
C | 6,406

Ecosystem | Abandoned Ruby | Percentage
JavaScript | 13,814 | 77%
Python | 8,131 | 76%
Java | 5,132 | 70%
C | 4,174 | 65%

Most Ruby Leavers…
• Worked in JavaScript projects in parallel to Ruby projects
• Continued to work in JavaScript after abandoning Ruby
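The Percentage column appears to be the Abandoned Ruby count divided by the Active in Ruby count, rounded to a whole percent. A quick Python check under that assumption (the dictionary and function names are illustrative only):

```python
# Counts copied from the two tables on this slide.
active_in_ruby = {"JavaScript": 18038, "Python": 10707, "Java": 7363, "C": 6406}
abandoned_ruby = {"JavaScript": 13814, "Python": 8131, "Java": 5132, "C": 4174}


def abandonment_share(active, abandoned):
    """Percentage of abandoned over active per ecosystem, rounded to an integer."""
    return {eco: round(100 * abandoned[eco] / active[eco]) for eco in active}


shares = abandonment_share(active_in_ruby, abandoned_ruby)
for eco, pct in shares.items():
    print(f"{eco}: {pct}%")
```

Under this reading the computed values match the slide's 77%, 76%, 70% and 65%.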
RQ4 How do changes in the contributor team
impact the technical artefacts?
Diversity index of Leavers
(relative entropy)
Increased Leaver specialization throughout time:
Large contribution to important projects
CONTRIBUTIONS TO OTHER ECOSYSTEMS OF RUBY ABANDONERS
Language | Active in Ruby | Language | Abandoned Ruby
JavaScript | 18,038 | JavaScript | 13,814
Shell | 10,707 | Shell | 8,982
Python | 10,211 | HTML | 8,237
CSS | 9,875 | Python | 8,131
Java | 7,363 | CSS | 8,082
HTML | 7,056 | Java | 5,132
C | 6,406 | C | 4,174
PHP | 5,839 | Go | 3,993
VimL | 5,050 | VimL | 3,768
C++ | 4,649 | PHP | 3,517
CoffeeScript | 3,946 | C++ | 3,318
Go | 3,334 | CoffeeScript | 2,670
Objective-C | 3,095 | Objective-C | 1,993
Perl | 2,408 | Emacs Lisp | 1,289
Puppet | 1,862 | Perl | 1,276
[…] [14] who observed that newcomers do not tend to become […]ers.
Finally, the abandonment rate exceeds the joining rate of the ecosystem after quarter 25, and the number of active developers is reduced. Combined with our observations concerning the ecosystem projects in Section III, this reveals evidence of a possible correlation between developer abandonment and project abandonment. To further investigate the behavior of contributors abandoning the Ruby ecosystem, we measured their activity on GitHub projects with another main […]
[…] diversity can be measured according to Shannon’s entropy […] the Simpson index, while the specialization of a species […] level relative to the species in the other level is measured […] the relative entropy (a.k.a. Kullback-Leibler divergence) [15]. By measuring the specialization of Leavers, we […] the relative risk they cause to the ecosystem (according to their relative contribution) until they abandoned the ecosystem. As explained in [16], the specialization of a contributor expressed in terms of relative entropy is defined as:

\[ S_{c_j} = \sum_{i=1}^{n} \frac{w_{ij}}{C_j} \left( \log \frac{w_{ij}}{C_j} - \log \frac{P_i}{W} \right) \]

where $n$ is the number of projects and $m$ the number of contributors in the ecosystem, $w_{ij}$ the workload of contributor $c_j$ to project $p_i$ counted in number of lines of code, $P_i = \sum_{j=1}^{m} w_{ij}$ is the total workload of all contributors to project $p_i$, $C_j = \sum_{i=1}^{n} w_{ij}$ is the total workload of contributor $c_j$ to all projects she is contributing to, and $W = \sum_{i=1}^{n} P_i$ is the total ecosystem workload.
We computed the contributor specialization with […] constraint on the project and ecosystem workload. More precisely, we consider the ecosystem and contributor workload for the quarters where the contributor was active in the ecosystem, that is from the quarter of her first contribution until the quarter before abandoning the ecosystem. […]
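The relative-entropy definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the nested-dictionary layout, the function name `specialization`, the use of the natural logarithm, and the convention that projects with zero contribution add nothing to the sum are all choices made here.

```python
import math
from typing import Dict

# project -> contributor -> workload in lines of code (hypothetical layout)
Workload = Dict[str, Dict[str, float]]


def specialization(w: Workload, contributor: str) -> float:
    """Relative entropy of one contributor's workload against the ecosystem.

    S = sum_i (w_ij / C_j) * (log(w_ij / C_j) - log(P_i / W)),
    where terms with w_ij == 0 are skipped (assumed to contribute 0).
    """
    project_totals = {p: sum(by_c.values()) for p, by_c in w.items()}  # P_i
    ecosystem_total = sum(project_totals.values())                     # W
    own = {p: by_c.get(contributor, 0.0) for p, by_c in w.items()}     # w_ij
    contributor_total = sum(own.values())                              # C_j
    if contributor_total == 0:
        raise ValueError(f"{contributor!r} has no recorded workload")

    score = 0.0
    for project, w_ij in own.items():
        if w_ij > 0:
            share = w_ij / contributor_total
            expected = project_totals[project] / ecosystem_total
            score += share * (math.log(share) - math.log(expected))
    return score


workload = {
    "p1": {"alice": 50.0, "bob": 50.0},
    "p2": {"alice": 50.0, "carol": 150.0},
}
print(specialization(workload, "alice"))
print(specialization(workload, "carol"))
```

In this toy data carol works on a single project and scores higher than alice, who splits her effort evenly. A contributor whose workload is distributed exactly like the ecosystem's overall workload would score 0.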
Threats to validity
• Multiple user accounts
• Less common within the same GitHub repository
• Identity merging [3]
• Programming language identification
• GitHub dataset
• Filters to eliminate noise
• Activity outside GitHub
• Merged pull requests appear as non-merged in GitHub
• Not all activity results from registered users
[3] M. Goeminne and T. Mens, “A comparison of identity merge algorithms for software repositories,” Science of Computer
Programming, vol. 78, no. 8, pages 971–986, 2013
Conclusion
Ruby software ecosystem in GitHub
• Investigate the permanent modifications of
the socio-technical network
• Impact of permanent changes in contributor
team on the technical artefacts
• Preliminary evidence about contributor
migration across different ecosystems
(Ruby → JavaScript)
Identify risks in project/ecosystem evolution
due to important team changes
Ongoing/Future Work
Contributor migration across different ecosystems
Advanced socio-technical analyses
• Socio-technical congruence
• Socio-technical debt
• Their effect on the ecosystem evolution
Thank you!