On the Topology of Package Dependency Networks
A Comparison of Programming Language Ecosystems
Alexandre Decan, Tom Mens, Maëlick Claes
Software Engineering Lab
29 November 2016 – Int’l Workshop Software Ecosystem Architectures (WEA)
Research
Team
Previous Work
• A. Decan, T. Mens, M. Claes, P. Grosjean
– IWSECO-WEA 2015: "On the Development and Distribution of R
Packages: An Empirical Analysis of the R Ecosystem"
– SANER 2016: "When GitHub Meets CRAN: An Analysis of
Inter-Repository Package Dependency Problems"
• A. Serebrenik, T. Mens
– WEA 2015: "Challenges in Software Ecosystems Research"
• Generalizability
• Comparing different ecosystems
Software Packaging Ecosystems
• Ecosystem: “a collection of software projects
which are developed and evolve together in
the same environment” [Lungu]
• Software distributed as packages
– Dependency relationships between
packages
– Package versioning
Software Packaging Ecosystems
for programming languages
• Many programming-language-specific
package managers
– npm (JavaScript)
– PyPI (Python)
– RubyGems (Ruby)
– CRAN (R)
Software Packaging Ecosystems
for programming languages
IEEE Spectrum ranking of the most popular programming languages
(http://spectrum.ieee.org/image/Mjc5MjI0Ng.png)
“The real standard library people want is more like what you find in Python
or Ruby, and it’s more batteries included, feature complete, and that is not
in JavaScript. That’s in the NPM world or the larger world.”
Ecosystem comparison
                      CRAN         PyPI         npm
Snapshot date         2016-04-26   2016-02-17   2016-06-28
Packages              9k           56k          317k
Dependencies          21k          53k          728k
New packages in 2015  1.6k         17k          113k
Updates in 2015       8k           131k         711k
Data extraction
• CRAN: https://github.com/ecos-umons/extractoR
• npm: https://registry.npmjs.org
• PyPI: dependency information is missing, taken instead from
https://kgullikson88.github.io/blog/pypi-analysis.html
Terminology
• b is a dependency of a
• a is a reverse dependency of b
• c is a transitive dependency of a
• a is a transitive reverse dependency of c
• {a, b, c, d, e, f} is a (weakly connected) component
• g is an isolated package
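The terminology above can be illustrated with a small sketch. The example graph is an assumption: only the chain a -> b -> c follows from the definitions, while the edges involving d, e and f are made up so that {a, b, c, d, e, f} forms one weakly connected component. The helper names (`reverseGraph`, `reachable`, `weakComponents`) are hypothetical and not taken from the authors' tooling.

```javascript
// Hypothetical example graph: each key depends on the packages in its list.
// Only a -> b -> c follows from the terminology; the other edges are assumed.
// Every dependency must also appear as a key.
const deps = {
  a: ["b"],
  b: ["c"],
  c: [],
  d: ["c"],
  e: ["d", "f"],
  f: [],
  g: [],
};

// Reverse the edges: reverse[x] lists the packages that directly depend on x.
function reverseGraph(graph) {
  const reverse = {};
  for (const pkg of Object.keys(graph)) reverse[pkg] = [];
  for (const [pkg, targets] of Object.entries(graph)) {
    for (const t of targets) reverse[t].push(pkg);
  }
  return reverse;
}

// All packages reachable from `start` by following edges (excluding `start`).
function reachable(graph, start) {
  const seen = new Set();
  const stack = [...graph[start]];
  while (stack.length > 0) {
    const pkg = stack.pop();
    if (seen.has(pkg)) continue;
    seen.add(pkg);
    stack.push(...graph[pkg]);
  }
  seen.delete(start); // only matters when the graph contains a cycle
  return seen;
}

// Weakly connected components: ignore edge direction and group packages.
function weakComponents(graph) {
  const reverse = reverseGraph(graph);
  const undirected = {};
  for (const pkg of Object.keys(graph)) {
    undirected[pkg] = [...graph[pkg], ...reverse[pkg]];
  }
  const assigned = new Set();
  const components = [];
  for (const pkg of Object.keys(graph)) {
    if (assigned.has(pkg)) continue;
    const members = [pkg, ...reachable(undirected, pkg)].sort();
    members.forEach((m) => assigned.add(m));
    components.push(members);
  }
  return components;
}

const reverse = reverseGraph(deps);
const components = weakComponents(deps);
// An isolated package is a component of size one.
const isolated = components.filter((c) => c.length === 1).map((c) => c[0]);

console.log("transitive dependencies of a:", [...reachable(deps, "a")]);
console.log("transitive reverse dependencies of c:", [...reachable(reverse, "c")]);
console.log("components:", components);
console.log("isolated packages:", isolated);
```

Transitive dependencies follow the edges forwards and transitive reverse dependencies follow them backwards, so the same traversal serves both once the graph is reversed.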
Dependency usage
in programming language ecosystems
PyPI has proportionally more isolated packages
(due to Python’s extensive standard library?)
“The real standard library people want is more like what you find in Python or Ruby, and
it’s more batteries included, feature complete, and that is not in JavaScript. That’s in the
NPM world or the larger world.”
Topology
of programming language ecosystems
The majority of packages are part of a single huge component
Largest component:
• 76.5% (CRAN), 35.6% (PyPI), 63.8% (npm) of all packages
• 91% (CRAN), 88% (PyPI), 92% (npm) of all non-isolated packages
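The two percentages above also imply how many packages are isolated in each ecosystem. If the largest component covers a share p of all packages and a share q of the non-isolated ones, then the non-isolated packages make up p / q of the ecosystem. The sketch below only does this arithmetic on the rounded figures quoted above, so the results are approximate. The function name `isolatedShare` is hypothetical.

```javascript
// Figures quoted above for the largest component, in percent.
const largestComponent = {
  CRAN: { ofAll: 76.5, ofNonIsolated: 91 },
  PyPI: { ofAll: 35.6, ofNonIsolated: 88 },
  npm: { ofAll: 63.8, ofNonIsolated: 92 },
};

// size / all = ofAll and size / nonIsolated = ofNonIsolated,
// hence nonIsolated / all = ofAll / ofNonIsolated.
function isolatedShare({ ofAll, ofNonIsolated }) {
  const nonIsolatedShare = ofAll / ofNonIsolated;
  return 100 * (1 - nonIsolatedShare);
}

for (const [name, figures] of Object.entries(largestComponent)) {
  console.log(`${name}: about ${isolatedShare(figures).toFixed(0)}% isolated`);
}
```

This gives roughly 16% isolated packages for CRAN, 60% for PyPI and 31% for npm, which is consistent with the earlier observation that PyPI has proportionally more isolated packages.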
Differences in dependencies
between programming language ecosystems
npm packages have a much higher ratio of transitive dependencies
Differences in reverse dependencies
between programming language ecosystems
There are proportionally more very popular npm packages
(i.e. higher number of transitive reverse dependencies)
Differences in reverse dependencies
between programming language ecosystems
Number of packages required by more than 2% of the ecosystem
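A minimal sketch of how such a count could be computed, assuming that "required by" means transitive reverse dependencies. The toy ecosystem and the function names are hypothetical. The fixed-point loop is chosen for brevity, not for speed on a registry with hundreds of thousands of packages.

```javascript
// Hypothetical toy ecosystem: each key depends on the packages in its list.
const ecosystem = {
  app1: ["util"],
  app2: ["util", "core"],
  app3: ["core"],
  util: ["core"],
  core: [],
  lonely: [],
};

// Packages that directly or transitively require `target`.
function transitiveReverseDeps(graph, target) {
  const found = new Set();
  let changed = true;
  while (changed) {
    changed = false;
    for (const [pkg, deps] of Object.entries(graph)) {
      if (found.has(pkg) || pkg === target) continue;
      if (deps.some((d) => d === target || found.has(d))) {
        found.add(pkg);
        changed = true;
      }
    }
  }
  return found;
}

// Packages required by more than `ratio` of all packages in the ecosystem.
function widelyRequired(graph, ratio) {
  const total = Object.keys(graph).length;
  return Object.keys(graph).filter(
    (pkg) => transitiveReverseDeps(graph, pkg).size / total > ratio
  );
}

console.log(widelyRequired(ecosystem, 0.02)); // the 2% threshold from the slide
console.log(widelyRequired(ecosystem, 0.5));  // a stricter, made-up threshold
```

In the toy data, `core` is required by four of the six packages and `util` by two, so both pass the 2% threshold while only `core` passes 50%.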
Possible explanation
micro-packages in npm
“In a lot of JavaScript environments, space is at a premium. [...]
Several larger libraries […] have actually intentionally split
themselves into sub-modules because people usually only ever
load them to use a single merge function.”
Example: isarray
150 direct and 77k transitive reverse dependencies in August 2016
var toString = {}.toString;
// Use the native Array.isArray when available, else compare the toString tag.
module.exports = Array.isArray || function (arr) {
  return toString.call(arr) == '[object Array]';
};
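Known problems: leftpad
function leftpad (str, len, ch) {
  str = String(str);
  var i = -1;
  // Default to a space unless a pad character (including 0) was given.
  if (!ch && ch !== 0) ch = ' ';
  // Number of pad characters still needed.
  len = len - str.length;
  while (++i < len) { str = ch + str; }
  return str;
}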
Its developer removed all his packages from npm:
“This impacted many thousands of projects. [...] We began
observing hundreds of failures per minute, as dependent projects –
and their dependents, and their dependents... – all failed when
requesting the now-unpublished package.”
http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
Known problems: leftpad
The npm maintainers un-unpublished leftpad, but …
“a number of dependency chains [...] explicitly
requested 0.0.3.”
http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
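Why an explicit request for 0.0.3 matters can be sketched as follows. This is a deliberately simplified matcher, not npm's actual resolution logic: it only understands an exact version string and the wildcard `*`. The registry contents and the replacement version number are hypothetical.

```javascript
// Simplified requirement check (ASSUMPTION: only exact versions and "*").
function satisfies(version, requirement) {
  return requirement === "*" || version === requirement;
}

// Versions from `published` that a given requirement would accept.
function candidates(published, requirement) {
  return published.filter((v) => satisfies(v, requirement));
}

// Hypothetical registry state where only a replacement version exists.
const onlyReplacement = ["1.0.0"];
// Hypothetical registry state after the original version is restored.
const afterRestore = ["0.0.3", "1.0.0"];

console.log(candidates(onlyReplacement, "*"));     // any version is acceptable
console.log(candidates(onlyReplacement, "0.0.3")); // nothing matches the exact request
console.log(candidates(afterRestore, "0.0.3"));    // the exact request resolves again
```

A dependent that accepts any version can fall back to a replacement release, whereas one that names an exact version stays broken until that very version is available again.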
Conclusion
• Simple metrics can be used to compare the topology of
different package-based software ecosystems
• Similarities in the dependency graph structure
– Most non-isolated packages are part of a large weakly
connected component
• Differences that can be explained by the specificities of
each ecosystem
– Python’s extensive standard library
– CRAN’s particular versioning policy
– npm’s abundance of micro-packages
Future work
• See our SANER 2017 article
“An empirical comparison of dependency issues in
OSS packaging ecosystems”
• Include RubyGems
• Study the evolution over time
• Frequency of package updates
• Resilience of packages to failures in
dependencies
• Impact of solutions that rely on dependency
constraints and semantic versioning
• Beyond SANER 2017: study the interplay between
social and technical aspects
Thanks for your attention!
Questions?

Contenu connexe

Similaire à On the topology of package dependency networks: A comparison of programming language ecosystems

Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Tom Mens
 
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...OdessaJS Conf
 
From monolith web app to micro-frontends
From monolith web app to micro-frontendsFrom monolith web app to micro-frontends
From monolith web app to micro-frontendsRustam Aliyev
 
The Legion Programming Model for HPC
The Legion Programming Model for HPCThe Legion Programming Model for HPC
The Legion Programming Model for HPCinside-BigData.com
 
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...FOSDEM 2020 Presentation: Comparing dependency management issues across packa...
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...Fasten Project
 
Introduction to r
Introduction to rIntroduction to r
Introduction to rgslicraf
 
Genomics Applications in the Cloud with the DNAnexus Platform
Genomics Applications in the Cloud with the DNAnexus PlatformGenomics Applications in the Cloud with the DNAnexus Platform
Genomics Applications in the Cloud with the DNAnexus Platformkislyuk
 
PL Lecture 01 - preliminaries
PL Lecture 01 - preliminariesPL Lecture 01 - preliminaries
PL Lecture 01 - preliminariesSchwannden Kuo
 
Fasten Industry Meeting with GitHub about Dependancy Management
Fasten Industry Meeting with GitHub about Dependancy ManagementFasten Industry Meeting with GitHub about Dependancy Management
Fasten Industry Meeting with GitHub about Dependancy ManagementFasten Project
 
Clustered PHP - DC PHP 2009
Clustered PHP - DC PHP 2009Clustered PHP - DC PHP 2009
Clustered PHP - DC PHP 2009marcelesser
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaopenseesdays
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R OpenRevolution Analytics
 
Socio-Technical Empirical Comparison of Software Package Ecosystems
Socio-Technical Empirical Comparison of Software Package EcosystemsSocio-Technical Empirical Comparison of Software Package Ecosystems
Socio-Technical Empirical Comparison of Software Package EcosystemsTom Mens
 
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...Tom Mens
 
Distributions and package management in the containers era
Distributions and package management in the containers eraDistributions and package management in the containers era
Distributions and package management in the containers eranussbauml
 
Empirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersEmpirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersTom Mens
 
Demystifying Containerization Principles for Data Scientists
Demystifying Containerization Principles for Data ScientistsDemystifying Containerization Principles for Data Scientists
Demystifying Containerization Principles for Data ScientistsDr Ganesh Iyer
 

Similaire à On the topology of package dependency networks: A comparison of programming language ecosystems (20)

Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!Is my software ecosystem healthy? It depends!
Is my software ecosystem healthy? It depends!
 
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...
Rustam Aliyev and Ivan Martynov - From monolith web app to micro-frontends – ...
 
From monolith web app to micro-frontends
From monolith web app to micro-frontendsFrom monolith web app to micro-frontends
From monolith web app to micro-frontends
 
The Legion Programming Model for HPC
The Legion Programming Model for HPCThe Legion Programming Model for HPC
The Legion Programming Model for HPC
 
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...FOSDEM 2020 Presentation: Comparing dependency management issues across packa...
FOSDEM 2020 Presentation: Comparing dependency management issues across packa...
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
Genomics Applications in the Cloud with the DNAnexus Platform
Genomics Applications in the Cloud with the DNAnexus PlatformGenomics Applications in the Cloud with the DNAnexus Platform
Genomics Applications in the Cloud with the DNAnexus Platform
 
Node.js security tour
Node.js security tourNode.js security tour
Node.js security tour
 
PL Lecture 01 - preliminaries
PL Lecture 01 - preliminariesPL Lecture 01 - preliminaries
PL Lecture 01 - preliminaries
 
Fasten Industry Meeting with GitHub about Dependancy Management
Fasten Industry Meeting with GitHub about Dependancy ManagementFasten Industry Meeting with GitHub about Dependancy Management
Fasten Industry Meeting with GitHub about Dependancy Management
 
Clustered PHP - DC PHP 2009
Clustered PHP - DC PHP 2009Clustered PHP - DC PHP 2009
Clustered PHP - DC PHP 2009
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKenna
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R Open
 
2019 swan-cs3
2019 swan-cs32019 swan-cs3
2019 swan-cs3
 
Socio-Technical Empirical Comparison of Software Package Ecosystems
Socio-Technical Empirical Comparison of Software Package EcosystemsSocio-Technical Empirical Comparison of Software Package Ecosystems
Socio-Technical Empirical Comparison of Software Package Ecosystems
 
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,...
 
Distributions and package management in the containers era
Distributions and package management in the containers eraDistributions and package management in the containers era
Distributions and package management in the containers era
 
Empirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package ManagersEmpirically Analysing the Socio-Technical Health of Software Package Managers
Empirically Analysing the Socio-Technical Health of Software Package Managers
 
Plc part 1
Plc part 1Plc part 1
Plc part 1
 
Demystifying Containerization Principles for Data Scientists
Demystifying Containerization Principles for Data ScientistsDemystifying Containerization Principles for Data Scientists
Demystifying Containerization Principles for Data Scientists
 

Plus de Tom Mens

How to be(come) a successful PhD student
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD studentTom Mens
 
Recognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentRecognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentTom Mens
 
A Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubA Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubTom Mens
 
The (r)evolution of CI/CD on GitHub
 The (r)evolution of CI/CD on GitHub The (r)evolution of CI/CD on GitHub
The (r)evolution of CI/CD on GitHubTom Mens
 
Nurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureNurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureTom Mens
 
Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Tom Mens
 
On the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubOn the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubTom Mens
 
On backporting practices in package dependency networks
On backporting practices in package dependency networksOn backporting practices in package dependency networks
On backporting practices in package dependency networksTom Mens
 
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsComparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsTom Mens
 
Lost in Zero Space
Lost in Zero SpaceLost in Zero Space
Lost in Zero SpaceTom Mens
 
Evaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesEvaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesTom Mens
 
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Tom Mens
 
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...Tom Mens
 
Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Tom Mens
 
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Tom Mens
 
SecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsSecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsTom Mens
 
SECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarSECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarTom Mens
 
ConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersTom Mens
 
On the diversity of software popularity metrics: An empirical study of npm
On the diversity of software popularity metrics: An empirical study of npmOn the diversity of software popularity metrics: An empirical study of npm
On the diversity of software popularity metrics: An empirical study of npmTom Mens
 
How to increase the technical health of your software?
How to increase the technical health of your software?How to increase the technical health of your software?
How to increase the technical health of your software?Tom Mens
 

Plus de Tom Mens (20)

How to be(come) a successful PhD student
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD student
 
Recognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentRecognising bot activity in collaborative software development
Recognising bot activity in collaborative software development
 
A Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubA Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHub
 
The (r)evolution of CI/CD on GitHub
 The (r)evolution of CI/CD on GitHub The (r)evolution of CI/CD on GitHub
The (r)evolution of CI/CD on GitHub
 
Nurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureNurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the Future
 
Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?
 
On the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubOn the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHub
 
On backporting practices in package dependency networks
On backporting practices in package dependency networksOn backporting practices in package dependency networks
On backporting practices in package dependency networks
 
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsComparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
 
Lost in Zero Space
Lost in Zero SpaceLost in Zero Space
Lost in Zero Space
 
Evaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messagesEvaluating a bot detection model on git commit messages
Evaluating a bot detection model on git commit messages
 
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...Bot or not? Detecting bots in GitHub pull request activity based on comment s...
Bot or not? Detecting bots in GitHub pull request activity based on comment s...
 
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...How magic is zero? An Empirical Analysis of Initial Development Releases in S...
How magic is zero? An Empirical Analysis of Initial Development Releases in S...
 
Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)Comparing dependency issues across software package distributions (FOSDEM 2020)
Comparing dependency issues across software package distributions (FOSDEM 2020)
 
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
Measuring Technical Lag in Software Deployments (CHAOSScon 2020)
 
SecoHealth 2019 Research Achievements
SecoHealth 2019 Research AchievementsSecoHealth 2019 Research Achievements
SecoHealth 2019 Research Achievements
 
SECO-Assist 2019 research seminar
SECO-Assist 2019 research seminarSECO-Assist 2019 research seminar
SECO-Assist 2019 research seminar
 
ConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker ContainersConPan: Analysing Packages Installed in Docker Containers
ConPan: Analysing Packages Installed in Docker Containers
 
On the diversity of software popularity metrics: An empirical study of npm
On the diversity of software popularity metrics: An empirical study of npmOn the diversity of software popularity metrics: An empirical study of npm
On the diversity of software popularity metrics: An empirical study of npm
 
How to increase the technical health of your software?
How to increase the technical health of your software?How to increase the technical health of your software?
How to increase the technical health of your software?
 

Dernier

Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 

Dernier (20)

Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 

On the topology of package dependency networks: A comparison of programming language ecosystems

  • 1. On the Topology of Package Dependency Networks A Comparison of Programming Language Ecosystems Alexandre Decan, Tom Mens, Maëlick Claes Software Engineering Lab 1 29 November 2016 – Int’l Workshop Software Ecosystem Architectures (WEA)
  • 3. Previous Work • A. Decan, T. Mens, M. Claes, P. Grosjean – IWSECO-WEA 2015: "On the Development and Distribution of R Packages: An Empirical Analysis of the R Ecosystem" – SANER 2016:"When GitHub Meets CRAN: An Analysis of Inter- Repository Package Dependency Problems” • A. Serebrenik, T. Mens – WEA 2015: "Challenges in Software Ecosystems Research" • Generalizability • Comparing different ecosystems 3
  • 4. Software Packaging Ecosystems
      • Ecosystem: "a collection of software projects which are developed and evolve together in the same environment" [Lungu]
      • Software distributed as packages
          – Dependency relationships between packages
          – Package versioning
  • 5. Software Packaging Ecosystems for programming languages
      • Many programming-language-specific package managers
          – npm (JavaScript)
          – PyPI (Python)
          – RubyGems (Ruby)
          – CRAN (R)
  • 6. Software Packaging Ecosystems for programming languages
      IEEE Spectrum ranking of most popular programming languages (http://spectrum.ieee.org/image/Mjc5MjI0Ng.png)
      "The real standard library people want is more like what you find in Python or Ruby, and it's more batteries included, feature complete, and that is not in JavaScript. That's in the NPM world or the larger world."
  • 7. Ecosystem comparison

                              CRAN          PyPI          npm
      Snapshot date           2016-04-26    2016-02-17    2016-06-28
      Packages                9k            56k           317k
      Dependencies            21k           53k           728k
      New packages in 2015    1.6k          17k           113k
      Updates in 2015         8k            131k          711k
  • 8. Data extraction
      • CRAN: https://github.com/ecos-umons/extractoR
      • npm: https://registry.npmjs.org
      • PyPI: missing dependency information => https://kgullikson88.github.io/blog/pypi-analysis.html
  • 9. Terminology
      • b is a dependency of a
      • a is a reverse dependency of b
      • c is a transitive dependency of a
      • a is a transitive reverse dependency of c
      • {a, b, c, d, e, f} is a (weakly connected) component
      • g is an isolated package
  • 10. Dependency usage in programming language ecosystems
      PyPI has proportionally more isolated Python packages (due to its extensive standard library?)
      "The real standard library people want is more like what you find in Python or Ruby, and it's more batteries included, feature complete, and that is not in JavaScript. That's in the NPM world or the larger world."
  • 11. Topology of programming language ecosystems
      The majority of packages are part of a single huge component.
      Largest component:
      • 76.5% (CRAN), 35.6% (PyPI), 63.8% (npm) of all packages
      • 91% (CRAN), 88% (PyPI), 92% (npm) of all non-isolated packages
  • 12. Differences in dependencies between programming language ecosystems
      npm packages have a much higher ratio of transitive dependencies.
  • 13. Differences in reverse dependencies between programming language ecosystems
      There are proportionally more very popular npm packages (i.e. packages with a higher number of transitive reverse dependencies).
  • 14. Differences in reverse dependencies between programming language ecosystems
      Number of packages required by more than 2% of the ecosystem
  • 15. Possible explanation: micro-packages in npm
      "In a lot of JavaScript environments, space is at a premium. [...] Several larger libraries [...] have actually intentionally split themselves into sub-modules because people usually only ever load them to use a single merge function."
      Example: isarray (150 direct and 77K transitive reverse dependencies in August 2016)
          var toString = {}.toString;
          module.exports = Array.isArray || function (arr) {
            return toString.call(arr) == '[object Array]';
          };
  • 16. Known problems: leftpad
          function leftpad (str, len, ch) {
            str = String(str);
            var i = -1;
            if (!ch && ch !== 0) ch = ' ';
            len = len - str.length;
            while (++i < len) {
              str = ch + str;
            }
            return str;
          }
      Its developer removed all his packages from npm:
      "This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package."
      http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
  • 17. Known problems: leftpad (same leftpad code as on slide 16)
      npm managers un-unpublished leftpad, but "a number of dependency chains [...] explicitly requested 0.0.3."
      http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
  • 18. Conclusion
      • Simple metrics can be used to compare the topology of different package-based software ecosystems
      • Similarities in the dependency graph structure
          – Most non-isolated packages are part of a large weakly connected component
      • Differences that can be explained by the specificities of each ecosystem
          – Python's extensive standard library
          – CRAN's particular versioning policy
          – npm's abundance of micro-packages
  • 19. Future work
      • See our SANER 2017 article "An empirical comparison of dependency issues in OSS packaging ecosystems"
          – Include RubyGems
          – Study the evolution over time
          – Frequency of package updates
          – Resilience of packages to failures in dependencies
          – Impact of solutions that rely on dependency constraints and semantic versioning
      • Beyond SANER 2017: study the interplay between social and technical aspects
  • 20. Thanks for your attention! Questions?
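
The terminology on slide 9 can be illustrated with a minimal sketch. It assumes a toy graph stored as a plain object in which `pkg -> [deps]` means "pkg depends on each dep". Only the chain a -> b -> c comes from the slide; the edges involving d, e and f, and all function names, are hypothetical and chosen so that {a, b, c, d, e, f} forms one weakly connected component and g is isolated. This is not the authors' tooling.

```javascript
// Toy dependency graph (illustrative only). Assumes every package appears as a key.
const dependencies = {
  a: ['b'],
  b: ['c'],
  c: [],
  d: ['c'], // assumed edge, not given on the slide
  e: ['d'], // assumed edge
  f: ['e'], // assumed edge
  g: [],
};

// Reverse every edge: the result maps a package to its direct reverse dependencies.
function reverseOf(graph) {
  const reversed = {};
  for (const pkg of Object.keys(graph)) reversed[pkg] = [];
  for (const [pkg, deps] of Object.entries(graph)) {
    for (const dep of deps) reversed[dep].push(pkg);
  }
  return reversed;
}

// All packages reachable from `start` by following edges (direct and transitive).
function transitiveClosure(graph, start) {
  const seen = new Set();
  const queue = [...graph[start]];
  while (queue.length > 0) {
    const pkg = queue.shift();
    if (seen.has(pkg)) continue;
    seen.add(pkg);
    queue.push(...graph[pkg]);
  }
  seen.delete(start); // a package is not counted as its own dependency
  return seen;
}

// Weakly connected components: edge direction is ignored while traversing.
function weakComponents(graph) {
  const reversed = reverseOf(graph);
  const assigned = new Set();
  const components = [];
  for (const pkg of Object.keys(graph)) {
    if (assigned.has(pkg)) continue;
    const component = [];
    const queue = [pkg];
    assigned.add(pkg);
    while (queue.length > 0) {
      const current = queue.shift();
      component.push(current);
      for (const next of [...graph[current], ...reversed[current]]) {
        if (!assigned.has(next)) {
          assigned.add(next);
          queue.push(next);
        }
      }
    }
    components.push(component.sort());
  }
  return components;
}

// Isolated packages have neither dependencies nor reverse dependencies.
function isolatedPackages(graph) {
  const reversed = reverseOf(graph);
  return Object.keys(graph).filter(
    (pkg) => graph[pkg].length === 0 && reversed[pkg].length === 0
  );
}

console.log([...transitiveClosure(dependencies, 'a')]);            // transitive dependencies of a
console.log([...transitiveClosure(reverseOf(dependencies), 'c')]); // transitive reverse dependencies of c
console.log(weakComponents(dependencies));
console.log(isolatedPackages(dependencies));
```

Running the same traversal on the reversed graph is what turns "transitive dependencies" into "transitive reverse dependencies", so a single helper covers both notions.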

Editor's notes

  1. In this talk I will present an empirical study comparing three different programming language ecosystems.
  2. Alexander Serebrenik: you probably all know who he is, since he is ICSME chair. Alexandre Decan first carried out research on formal database theory, but I managed to convert him to the more practical side of SE research. Bogdan Vasilescu obtained his PhD with Serebrenik and, after a 2-year postdoc at UC Davis, has now joined CMU in Pittsburgh.
  3. But before delving into the comparative study itself, let's start with a little bit of background. I recently finished my PhD on the topic of maintainability issues in packaging software ecosystems. Part of the thesis consists of previous papers on ecosystems, in particular the R ecosystem. Last year Alexander presented the most important challenges in ecosystems research. This is one of the future directions of my own thesis.
  4. To begin with: by ecosystem we mean Lungu's definition. In our case, software projects are software packages. Particularity: dependency relationships between packages. Nowadays major open source software libraries are distributed as part of software packaging ecosystems.
  5. To begin with: by ecosystem we mean Lungu's definition. In our case, software projects are software packages. Particularity: dependency relationships between packages. Nowadays major open source software libraries are distributed as part of software packaging ecosystems.
  6. We selected the packaging ecosystems of three popular interpreted programming languages. They have all gained in popularity recently. R, which we previously studied, is a language originally oriented towards statistics and data analysis. JavaScript is a language mostly oriented towards web applications. Python is a more general language, but nowadays it is also used for both web development and data analysis.
  7. We built a dependency graph for each month since 2000. This shows how the size of these graphs has grown over time. They all exhibit an exponential increase.
  8. For R we used tools we previously developed to get data from official sources. For npm we got direct output from the official package API. For PyPI we realized that dependency data were missing for many packages, so we used third-party data. These three languages have ecosystems with different philosophies.
  9. We will use the following terminology … Building the dependency graph for those ecosystems (Python/PyPI, R/CRAN, JavaScript/npm).
  10. What we observe: Python packages tend to be more isolated. This is explained by a more complete standard library. JavaScript is on the opposite side. R has a good standard library for data analysis, but many packages extend it (e.g. ggplot2).
  11. Explain graphic: number of components vs component size. On the y axis: number of components. On the x axis: size of component. Far right: biggest component of each ecosystem. One of the similarities: most non-isolated packages are part of the same large component.
  12. We looked at the distribution of the number of transitive dependencies of each package. There are differences between npm and the others.
  13. Also, npm has more packages on which thousands of packages depend => npm might be more vulnerable.
  14. From our empirical study we saw that many popular npm packages are very fragile => in particular micro-packages.
  15. What happened? Everything started with a disagreement over a module named "kik". Its developer unpublished *all* his 272 modules from npm, including leftpad. This caused thousands of dependent projects to break, including Node and Babel. The community stepped in within minutes to fix the problem. This required the npm managers to go against their own policy by un-unpublishing the module.
  16. What happened? Everything started with a disagreement over a module named "kik". Its developer unpublished *all* his 272 modules from npm, including leftpad. This caused thousands of dependent projects to break, including Node and Babel. The community stepped in within minutes to fix the problem. This required the npm managers to go against their own policy by un-unpublishing the module.
  17. The observed differences can have an impact on both ecosystem users and developers => importance of managing ecosystems with policies and not only tools.
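
Notes 13 to 16 link package popularity, measured by transitive reverse dependencies, to fragility: unpublishing one widely required micro-package breaks everything that depends on it, directly or indirectly. The sketch below shows how that impact could be computed on a toy graph. The package names, the graph and the function names are hypothetical; the threshold is passed as a parameter because the 2% cut-off from slide 14 is only meaningful on a real ecosystem snapshot.

```javascript
// Hypothetical ecosystem: `pkg -> [deps]` means "pkg depends on each dep".
const ecosystem = {
  'pad-left-like': [],               // stand-in for a micro-package
  'string-utils': ['pad-left-like'],
  'cli-tool': ['string-utils'],
  'web-app': ['cli-tool'],
  'standalone': [],
};

// Packages that break if `target` disappears: its transitive reverse dependencies.
// Simple fixed-point iteration; fine for a sketch, not tuned for 317k packages.
function transitiveDependents(graph, target) {
  const affected = new Set();
  let changed = true;
  while (changed) {
    changed = false;
    for (const [pkg, deps] of Object.entries(graph)) {
      if (affected.has(pkg)) continue;
      if (deps.some((dep) => dep === target || affected.has(dep))) {
        affected.add(pkg);
        changed = true;
      }
    }
  }
  affected.delete(target); // only relevant if the graph contains a cycle through target
  return affected;
}

// Packages transitively required by more than `ratio` of the whole ecosystem.
function requiredByMoreThan(graph, ratio) {
  const total = Object.keys(graph).length;
  return Object.keys(graph).filter(
    (pkg) => transitiveDependents(graph, pkg).size / total > ratio
  );
}

console.log([...transitiveDependents(ecosystem, 'pad-left-like')]);
console.log(requiredByMoreThan(ecosystem, 0.5));
```

On a real snapshot, `requiredByMoreThan(graph, 0.02)` would correspond to the count shown on slide 14, and `transitiveDependents` to the set of projects that failed when leftpad was unpublished.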