Presentation by Damien Legay (Software Engineering Lab, University of Mons) during the BENEVOL 2020 software evolution research seminar.
Abstract: The Linux operating system comes in a variety of distributions. These distributions incorporate the Linux kernel and a host of third-party packages, which provide much of the end-user functionality of the distribution. Distribution maintainers have the difficult task of ensuring that these packages are and remain in good working order, stay free of security vulnerabilities, and continuously adapt to meet the ever-evolving needs of their users. These concerns are sometimes in conflict, which leads distributions to adopt different philosophies. As a result, not only do distributions offer different sets of packages, but the packages they have in common are present in different versions. We define package freshness as the difference, in time and number of versions, between a package’s latest available version and the one deployed in a given distribution.
Through a mixed-method research approach, we provide qualitative and quantitative analyses to compare user perception with quantitatively measurable freshness of packages in mainstream Linux distributions: Arch Linux, CentOS, Debian Stable, Debian Unstable, Fedora and Ubuntu.
Package Freshness in Linux Distributions: Perception versus Reality
1. Legay, Decan, Mens
Damien Legay, Alexandre Decan, Tom Mens
Software Engineering Lab
University of Mons
BENEVOL 2020 Software Evolution Research Seminar – 4 December 2020
3. Legay, Decan, Mens
Distribution Divergence
Linux distributions emphasise different qualities:
§Stability: Debian (stable), CentOS, Linux Mint
§Security: QubesOS, Subgraph, Alpine Linux
§Freshness: Arch Linux, Gentoo, OpenSUSE Tumbleweed
Package freshness: how up to date a package is compared to upstream
4. Legay, Decan, Mens
Package freshness
Mixed-methods research study:
§Qualitative survey of users of Linux distributions
§ Published in ICSME 2020 - NIER track
§ "On Package Freshness in Linux Distributions," International Conference on Software Maintenance and Evolution (ICSME 2020). DOI: 10.1109/ICSME46990.2020.00072
§Quantitative analyses of package freshness based on extracted historical data of Linux package distributions
§ Submitted / Under review
5. Legay, Decan, Mens
Qualitative Survey of Users of Linux Distributions
§170 participants surveyed:
§ CHAOSSCon Europe 2020
§ FOSDEM 2020
§ Linux fora
§ Subreddits
§Focus on:
§ Perception of package freshness
§ Importance of package freshness
§ Motivations to update packages
§ Mechanisms used to update packages
6. Legay, Decan, Mens
Linux Distributions Used
Distribution First Second Third Total
Ubuntu (family) 47 46 43 117
Debian (family) 30 37 26 93
Red Hat (family) 33 33 25 91
Arch Linux 29 8 8 45
OpenSUSE (family) 21 13 5 39
Linux Mint 5 10 7 22
Slackware 2 2 1 5
Other distributions 3 6 5 14
Ranking of most-used distributions (up to 3)
7. Legay, Decan, Mens
Package Categories
Asked about 6 package categories:
§Open source end-user software: LibreOffice, Firefox, GIMP…
§Proprietary end-user software: Adobe Reader, Skype, Spotify…
§Development tools: Emacs, Eclipse, Git…
§System tools and libraries: OpenSSL, zsh, sudo…
§Programming language runtimes: Python, Java…
§Programming language libraries: NumPy, Lodash…
8. Legay, Decan, Mens
User Perception
Perception of delay of package update deployment
[Figure: perceived delay of package update deployment, per package category (system tools, end-user, development, programming)]
9. Legay, Decan, Mens
Importance of Package Freshness
How important is package freshness to respondents?
[Figure: importance of package freshness per package category: system tools, end-user open source software, development tools, programming language runtimes, programming language libraries, end-user proprietary software]
Around 75% of respondents consider freshness at least moderately important, except for proprietary software.
10. Legay, Decan, Mens
Update Mechanisms Used
Official community repositories are used whenever possible.
[Figure: update mechanisms used per package category. Mechanisms: official repositories, community repositories, third-party managers, binaries, sources. Categories: system tools, end-user open source software, development tools, programming language runtimes, programming language libraries, end-user proprietary software]
11. Legay, Decan, Mens
Quantitative Analysis
§Extracted data from 6 popular Linux distributions: Arch, CentOS, Debian Stable, Debian Unstable, Fedora, Ubuntu
§Selected 890 packages common to all 6 distributions
§Selected snapshots of these packages:
§ At time of release for distributions with a point release policy
§ Daily for distributions with a rolling release policy
§Observation period: [2015-01-01, 2020-01-01[
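As a rough illustration of the daily snapshots mentioned above, the following Python sketch enumerates one snapshot date per day over the half-open observation period [2015-01-01, 2020-01-01[. The function name `daily_snapshots` is hypothetical; the slides do not describe how the snapshots were actually collected.

```python
from datetime import date, timedelta

def daily_snapshots(start: date, end: date) -> list[date]:
    """Return one date per day in the half-open interval [start, end[."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days)]

# Observation period from the slides; the end date itself is excluded.
snapshots = daily_snapshots(date(2015, 1, 1), date(2020, 1, 1))
print(len(snapshots), snapshots[0], snapshots[-1])
# prints: 1826 2015-01-01 2019-12-31
```

For a point-release distribution, the same list would instead hold only the release dates of that distribution.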
12. Legay, Decan, Mens
Proportion of packages not using the latest available version
Vast discrepancy between distributions: from 10% (Arch) to 80% (CentOS)
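A minimal sketch of how such a proportion could be computed for one snapshot of one distribution. It assumes that a package counts as outdated when its deployed version string differs from the latest upstream version string; the function name and the sample data are made up for illustration and are not results from the study.

```python
def outdated_proportion(deployed: dict[str, str], latest: dict[str, str]) -> float:
    """Fraction of packages whose deployed version is not the latest upstream one."""
    common = deployed.keys() & latest.keys()
    if not common:
        return 0.0
    outdated = sum(1 for name in common if deployed[name] != latest[name])
    return outdated / len(common)

# Hypothetical snapshot: four packages, one of them behind upstream.
deployed = {"pkg-a": "1.0", "pkg-b": "2.3", "pkg-c": "0.9", "pkg-d": "4.1"}
latest = {"pkg-a": "1.2", "pkg-b": "2.3", "pkg-c": "0.9", "pkg-d": "4.1"}
print(outdated_proportion(deployed, latest))  # prints: 0.25
```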
13. Legay, Decan, Mens
Update Delay
Time since a more recent version of the package has become available.
Example: package postgresql in Ubuntu 17.04
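The definition above can be sketched in Python as follows. This is one possible reading, not the authors' implementation: it takes "a more recent version" to mean the earliest upstream release that is newer than the deployed one. The function name `update_delay` and the release history are hypothetical.

```python
from datetime import date

def update_delay(deployed: str, releases: list[tuple[str, date]], snapshot: date) -> int:
    """Days since the first release newer than `deployed` became available.

    `releases` is the upstream history as (version, release date) pairs,
    sorted from oldest to newest. Returns 0 if no newer release existed
    at `snapshot`.
    """
    versions = [version for version, _ in releases]
    idx = versions.index(deployed)  # raises ValueError for unknown versions
    for _, released in releases[idx + 1:]:
        if released <= snapshot:
            return (snapshot - released).days
    return 0

# Hypothetical upstream history of a single package.
history = [
    ("1.0", date(2019, 1, 1)),
    ("1.1", date(2019, 3, 1)),
    ("1.2", date(2019, 6, 1)),
]
print(update_delay("1.0", history, date(2019, 7, 1)))  # prints: 122
print(update_delay("1.2", history, date(2019, 7, 1)))  # prints: 0
```

A distribution shipping version 1.0 on 2019-07-01 has an update delay of 122 days under this reading, because version 1.1 has been available since 2019-03-01.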
14. Legay, Decan, Mens
Update Delay
§Most packages in most distributions have < 3 months of update delay
§In particular, 90% of Arch packages have very low update delay (under 10 days)
§Contrasted by CentOS: half of its packages were superseded by a newer version more than 1 year ago
15. Legay, Decan, Mens
Version Lag
§70% to 90% of packages lag behind by at most two versions in most distributions
§CentOS: more than half the packages lag behind by 3+ versions, 20% by 10+ versions!
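Version lag can be sketched as a count of upstream releases that are newer than the deployed version at the time of a snapshot. The code below is an illustrative assumption about how to count, not the method used in the study; the function name `version_lag` and the release history are hypothetical.

```python
from datetime import date

def version_lag(deployed: str, releases: list[tuple[str, date]], snapshot: date) -> int:
    """Number of upstream versions newer than `deployed` and released by `snapshot`.

    `releases` is the upstream history as (version, release date) pairs,
    sorted from oldest to newest. Raises ValueError if `deployed` was not
    yet released at `snapshot`.
    """
    available = [version for version, released in releases if released <= snapshot]
    return len(available) - 1 - available.index(deployed)

# Hypothetical upstream history of a single package.
history = [
    ("1.0", date(2019, 1, 1)),
    ("1.1", date(2019, 3, 1)),
    ("1.2", date(2019, 6, 1)),
]
print(version_lag("1.0", history, date(2019, 7, 1)))  # prints: 2
print(version_lag("1.0", history, date(2019, 4, 1)))  # prints: 1
```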
16. Legay, Decan, Mens
Comparing Package Freshness
Ranking freshness of packages in distributions (1 = freshest, 6 = least fresh)
Arch almost always ranked first, CentOS very often ranked last.
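One way to produce such a ranking for a single package is sketched below. It assumes that distributions are ranked by version lag and that tied distributions share the same rank; the slides do not state which freshness measure or tie-breaking rule was used. The function name and the lag values are made up for illustration.

```python
def rank_distributions(lag_by_distro: dict[str, int]) -> dict[str, int]:
    """Rank distributions for one package: 1 = freshest (smallest lag).

    Ties share the same rank, and the next rank is skipped accordingly
    (standard competition ranking).
    """
    ordered = sorted(lag_by_distro.values())
    return {distro: ordered.index(lag) + 1 for distro, lag in lag_by_distro.items()}

# Hypothetical version lags of one package across the six distributions.
lags = {
    "Arch": 0,
    "Debian Unstable": 0,
    "Fedora": 1,
    "Ubuntu": 1,
    "Debian Stable": 3,
    "CentOS": 10,
}
ranks = rank_distributions(lags)
print(ranks["Arch"], ranks["Fedora"], ranks["Debian Stable"], ranks["CentOS"])
# prints: 1 3 5 6
```

Repeating this for every package and snapshot would give, per distribution, a distribution of ranks between 1 and 6.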
17. Legay, Decan, Mens
Perception versus Reality
§Arch
§ Perception: packages deployed in official repositories in a few days
§ Reality: 90% of package updates deployed within 10 days
§Fedora and Ubuntu
§ Perception: in the order of weeks
§ Reality: 60% deployed in less than a month
§Debian Stable
§ Perception: in the order of months
§ Reality: 60%-70% deployed within a six-month delay (30% > 3 months)
§CentOS
§ Perception: in the order of months
§ Reality: 50% of packages outdated by over a year
18. Legay, Decan, Mens
Conclusions
§Users consider it important to keep packages fresh for different reasons:
§ security (90% of respondents)
§ bug fixing (88% of respondents)
§ benefiting from new features (66% of respondents)
§Users rely on official repositories whenever possible
→ Important to have fresh packages in official repositories
19. Legay, Decan, Mens
Conclusions
§Package freshness varies a lot in popular Linux distributions
§ Arch packages the most fresh
§ CentOS packages much less fresh than other distributions
§Perception versus reality of package freshness?
§ User perception is mostly accurate
§ Exception: users underestimate the update delay for CentOS
§ Nearly a third of respondents do not know, at least for specific package categories
20. Legay, Decan, Mens
Future Work
§Finer-grained study of package freshness
§ by package category
§ in distribution-agnostic package managers (Flatpak, Snap, AppImage, …)
§Study trade-offs between freshness, security and stability
§ Latest version not necessarily most secure or stable
§ New versions introduce new features, and fix bugs and security issues…
§ … but also introduce new (undiscovered) bugs and vulnerabilities
§Creation of historical database of package versions deployed in distributions and package upstream release dates