The (r)evolution of CI/CD on GitHub
Promises and Perils of the GitHub Actions ecosystem
Tom Mens
Software Engineering Lab
March 2023
SECO-ASSIST
secoassist.github.io
Collaborative software development
4
Commits
Issues
Pull Requests
Comments
Code Reviews
Discussions
Project Management
...
Continuous Integration
Quality analysis
Build Test Deploy
GitHub Actions
Examples of CI/CD tools
5
Specifying GitHub Actions workflows
[Diagram: a repository holds workflows 1 to 3 in .github/workflows/; each workflow contains jobs (job 1, job 2, job 3) and each job contains steps (step 1, step 2, step 3). Annotations: workflows run in parallel; jobs run in parallel by default or sequentially; steps run sequentially; strategy; a step is either uses: (action) or run: (shell cmd).]
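The structure in the diagram can be sketched in Python. This is a minimal illustration, not part of the original slides: the workflow below is a hypothetical one, written as the dictionary a YAML parser would produce, and the job names and commands are invented. It assumes that sequential jobs are declared through the `needs` key of the workflow syntax, and groups jobs into waves that may run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow, as a YAML parser would load it from a file
# under .github/workflows/ (all names and commands are illustrative).
workflow = {
    "name": "CI",
    "on": ["push", "pull_request"],
    "jobs": {
        "lint": {"steps": [{"uses": "actions/checkout@v3"}, {"run": "npm run lint"}]},
        "test": {"steps": [{"uses": "actions/checkout@v3"}, {"run": "npm test"}]},
        "deploy": {"needs": ["lint", "test"], "steps": [{"run": "npm publish"}]},
    },
}


def job_waves(wf):
    """Group jobs into waves: jobs in the same wave can run in parallel,
    and a job only starts once every job it needs has finished."""
    graph = {}
    for name, job in wf["jobs"].items():
        needs = job.get("needs", [])
        # `needs` may be a single job id or a list of job ids.
        graph[name] = {needs} if isinstance(needs, str) else set(needs)
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def step_kinds(job):
    """Label each step of a job, in execution order, as 'uses' or 'run'."""
    return ["uses" if "uses" in step else "run" for step in job["steps"]]
```

Here `job_waves(workflow)` returns `[["lint", "test"], ["deploy"]]`: the first two jobs have no dependencies and can run in parallel, while `deploy` waits for both.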
Running workflows
7
GitHub Marketplace
Reusing Actions from GitHub Marketplace
On the rise and fall of CI services in GitHub
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Abstract—Continuous integration (CI) services are used in
collaborative open source projects to automate parts of the
development workflow. Such services have been in widespread use
for over a decade, with new CIs being introduced over the years,
sometimes overtaking other CIs in popularity. We conducted a
longitudinal empirical study over a period of nine years, aiming
to better understand this rapidly evolving CI landscape. By
analysing the development history of 91,810 GitHub repositories
of active npm packages having used at least one CI service,
we quantitatively studied the evolution of seven popular CIs,
specifically focusing on their co-usage and migration in the
considered repositories. We provide statistical evidence of the rise
of GitHub Actions, which has become the dominant CI service in
less than 18 months. This coincides with the fall of Travis,
which has seen a marked decrease in usage, likely due to a
combination of policy changes and migrations to GitHub Actions.
Index Terms—Continuous integration, distributed software
development, software repositories, GitHub
I. INTRODUCTION
Continuous integration (CI), deployment and delivery have
become the cornerstone of collaborative software development
and DevOps practices. CI automates the integration of code
changes from multiple contributors into a central repository
where automated builds, tests and code quality checks run.
Well-known examples of CI services are Jenkins, Travis,
CircleCI and AppVeyor. CI services can also be built into
social coding platforms such as GitHub and GitLab [1]. GitLab
has featured CI capabilities since November 2012. Based
on popular demand, and in response to CI support integrated
in GitLab, GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this article) in October 2018. In August 2019, they officially
began supporting Continuous Integration through GHA, and
the product was released publicly in November 2019.
GHA [2] allows automating a wide range of tasks based
on a variety of triggers such as commits, issues, pull requests,
comments and many more. GHA can be used to facilitate code
reviews, code quality analysis, communication, dependency
and security monitoring and management, testing, etc. GHA
facilitates the integration with external services, and can even
obviate the need for such external services altogether.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 56 million users in
September 2020 [3]. Given its popularity and the ease with
which GHA allows automating the CI workflow, we hypothesise
that GHA has had a significant impact on today’s CI
landscape. More particularly, we believe that it has increased
the awareness of the need for CI, it has reduced the entry
barrier for projects to start using CI, and it may have led
projects to migrate from other CI services towards GHA.
This article aims to quantitatively and objectively verify
these hypotheses, and discusses their consequences, through a
longitudinal analysis of how different CIs have been used over
a nine-year period in 91,810 GitHub repositories corresponding
to the software development history of reusable Node.js
packages distributed through the npm package registry. This
empirical study focuses on four research questions:
RQ1 How did the CI landscape evolve? We identified 20
different CIs being used in the considered set of repositories,
some of which were considerably more prevalent than others.
Together with Travis, GHA covers more than 80% of all
usages. Moreover, in only 18 months GHA has overtaken all
other CIs in popularity.
RQ2 What are the most frequent combinations of CIs? We
observed that many repositories have used multiple CIs during
their lifetime. AppVeyor is nearly always used in combination
with some other CI. If a repository uses a CI simultaneously
with another one, it is mostly in combination with Travis,
GHA or CircleCI.
RQ3 How frequently are CIs being replaced by an alternative?
We observed a non-negligible amount of CI migrations. GHA
attracted most of these migrations. The majority of migrations
were moving away from Travis and towards GHA.
RQ4 How has the CI landscape changed since GHA was
introduced? Based on a regression discontinuity design, we
found that the usage of Travis, Azure and CircleCI has been
negatively affected by the introduction of GHA.
This article is structured as follows. Section II motivates the
selected dataset and discusses the data extraction and cleaning
steps that were carried out. Sections III to VI provide answers
to each research question. Section VII discusses the ramifications
of these answers. Section VIII presents the threats to
validity of the conducted research. Section IX presents the
related work. Finally, Section X concludes.
II. DATA EXTRACTION
In order to analyse the use of CIs in software development
repositories on GitHub, we need a large dataset containing
On the Use of GitHub Actions in Software
Development Repositories
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Pooya Rostami Mazrae
Software Engineering Lab
University of Mons
Mons, Belgium
pooya.rostamimazrae@umons.ac.be
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Abstract—GitHub Actions was introduced in 2019 and constitutes
an integrated alternative to CI/CD services for GitHub
repositories. The deep integration with GitHub allows repositories
to easily automate software development workflows. This
paper empirically studies the use of GitHub Actions on a dataset
comprising 68K repositories on GitHub, of which 43.9% are using
GitHub Actions workflows. We analyse which workflows are
automated and identify the most frequent automation practices.
We show that reuse of actions is a common practice, even if
this reuse is concentrated in a limited number of actions. We
study which actions are most frequently used and how workflows
refer to them. Furthermore, we discuss the related security
and versioning aspects. As such, we provide an overview of
the use of GitHub Actions, constituting a necessary first step
towards a better understanding of this emerging ecosystem and
its implications on collaborative software development in the
GitHub social coding platform.
Index Terms—GitHub Actions, continuous integration, collaborative
software development, workflow automation
I. INTRODUCTION
Open source software (OSS) development is a continuous,
highly distributed and collaborative endeavour [1]. Development
of OSS projects faces many socio-technical challenges
[2]–[4]. The multitude of tools (e.g., version control systems,
software distribution managers, bug and issue trackers) and
development-related activities makes it very challenging for
contributor communities to keep up with the rapid pace of
producing and maintaining high-quality software releases.
Automated workflows were introduced to automate numerous
repetitive social or technical activities that are inherently
part of the collaborative software development process. Continuous
integration, deployment and delivery (CI/CD) have
become the cornerstone of collaborative software development
and DevOps practices. Well-known examples of CI/CD
services are Travis, Jenkins, CircleCI and TeamCity. They
automate the integration of code changes from multiple contributors
into a central repository where automated builds, tests
and code quality checks run.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 73 million users in
2021 [5]. GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this paper) in October 2018 based on popular demand, and in
response to GitLab’s integrated CI/CD support [6]. In August
2019, GitHub officially began supporting CI through GHA,
and the product was released publicly in November 2019.
GHA [7] allows the automation of a wide range of tasks
based on a variety of triggers such as commits, issues, pull
requests, comments, schedules, and many more. Its deep
integration into GitHub implies that GHA can be used not
only for executing test suites or deploying new releases
as in traditional CI/CD services, but also to facilitate code
reviews, communication, dependency and security monitoring
and management, etc. GHA also promotes the use and sharing
of reusable components, called actions, in workflows. These
actions are distributed in public repositories and on the GitHub
Marketplace. They allow workflow developers to easily integrate
specific tasks (e.g., set up a specific programming
language environment, publish a release on a package registry,
run tests and check code quality) without having to write the
corresponding code.
Since its public release in November 2019, GHA has
become the dominant CI/CD service, only 18 months
after its introduction [8]. Its Marketplace of reusable actions
has been growing exponentially ever since, reaching 12K
reusable actions in February 2022. It is therefore fair to
say that GHA has become a software ecosystem of its own,
comparable to ecosystems of reusable software libraries (such
as npm, RubyGems, CRAN, Maven, and PyPI) that have been
empirically studied by many researchers in recent years (e.g.,
[9]–[14]).
The emerging GHA ecosystem is worthy of being empirically
studied in its own right since it is likely to suffer from
the same issues related to dependency management, security
vulnerabilities, outdated or obsolete components, backward
compatibility, and so on. This article therefore quantitatively
studies the use of GHA in 68K repositories on GitHub. We
analyse which workflows are automated and identify the most
frequent automation practices. We show that reuse of actions
is a common practice and identify which actions are reused
and how. As such, we provide an overview of the use of
GHA, a necessary first step towards a better understanding
of the emerging GHA ecosystem and its implications on
software development in GitHub repositories. More concretely,
we answer the following research questions:
9
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Pooya Rostami Mazrae1 · Tom Mens1 · Mehdi Golzadeh1 · Alexandre Decan1
Accepted: 28 December 2022
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
Abstract
Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative
software development process. CI/CD tools automate a wide range of activities in the
development workflow such as testing, linting, updating dependencies, creating and deploying
releases, and so on. Previous quantitative studies have revealed important changes in the
landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many
software projects migrating to other CI/CD tools. In order to understand the reasons behind
these changes in CI/CD usage, this paper presents a qualitative study based on in-depth
interviews with 22 experienced software practitioners reporting on their usage, co-usage and
migration of 31 different CI/CD tools. Following an inductive and deductive coding process,
we analysed the interviews and found a high amount of competition between CI/CD tools. We
observe multiple reasons for co-using different CI/CD tools within the same project, and we
identify the main reasons and detractors for migrating to different alternatives. Among all
reported migrations, we observe a clear trend of migrations away from Travis and migrations
towards GitHub Actions and we identify the main reasons behind them.
Keywords CI/CD · Collaborative software development · Workflow automation ·
Qualitative analysis · Empirical software engineering
Communicated by: Alexander Serebrenik
Alexandre Decan (F.R.S.-FNRS Research Associate)
Pooya Rostami Mazrae
pooya.rostami.m@gmail.com; pooya.rostamimazrae@umons.ac.be
Tom Mens
tom.mens@umons.ac.be
Mehdi Golzadeh
golzadeh.mehdi@gmail.com
Alexandre Decan
alexandre.decan@umons.ac.be
1 Software Engineering Lab, Université de Mons, Mons, Belgium
https://doi.org/10.1109/ICSME55016.2022.00029
https://doi.org/10.1109/SANER53432.2022.00084
On the rise and fall of CI services in GitHub
10
https://doi.org/10.1109/SANER53432.2022.00084
Dataset (May 2021)
• 1.6M+
• Scoped packages
• 803K packages on GitHub
• Cloned 676K
• Excluded 11,557 forks
• Excluded inactive repositories
• 201,403 repositories
• Presence of CI configuration files
• 119,033 CI usages in 91,810 repositories
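The "presence of CI configuration files" step can be illustrated with a short Python sketch. It is only an approximation of the idea: the mapping below lists conventional configuration locations for a handful of CI services and is an assumption made for illustration, not the list of files or services used in the study.

```python
from pathlib import Path

# Conventional configuration locations for a few CI services.
# Illustrative assumption only; the study may have used a different list.
CI_CONFIG_PATTERNS = {
    "GitHub Actions": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Travis": [".travis.yml"],
    "CircleCI": [".circleci/config.yml"],
    "AppVeyor": ["appveyor.yml", ".appveyor.yml"],
    "Azure": ["azure-pipelines.yml"],
}


def detect_cis(repo_root):
    """Return the set of CI services whose configuration files are
    present in a local clone located at repo_root."""
    root = Path(repo_root)
    found = set()
    for ci, patterns in CI_CONFIG_PATTERNS.items():
        # A service counts as used if at least one pattern matches a file.
        if any(any(root.glob(pattern)) for pattern in patterns):
            found.add(ci)
    return found
```

Running such a check on every commit of a repository, rather than only on its latest snapshot, would be one way to observe co-usage and migrations over time.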
How prevalent is CI usage
in GitHub repositories?
CI services are used in
more than half of all
considered repositories.
Evolution of GitHub CI/CD landscape
13
Since 2021, GitHub Actions has become
the dominant CI/CD tool in GitHub
Most frequent co-usage of CIs
14
Analysing CI churn in the last 3 years
Migrations between CIs
Migrations toward GitHub Actions
Migrations away from Travis
What happened to Travis?
• Travis changed its free plan
• GHA was introduced
20
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Methodology
Interview questionnaire
• Around 30 questions related to CI usage, co-usage and migration
Selection of respondents
• Selected candidates through Twitter, LinkedIn, email, direct messages
• Colleagues' referrals (snowballing)
Geographic diversity
• Using online video conferencing tool
Inclusion criteria
• Having actively contributed to, or been responsible for, a software project relying on CI
• Sufficient knowledge about which CI tool is used in that software project and how
• Having been involved in setting up or maintaining the CI process of the project
Demographics of respondents
• 22 respondents
• 16 from 7 European countries
• 4 from North America
• 2 from Asia
• Software development experience: average of 12 years and 4 months
• Good mix of industrial and open source contributors
CI/CD tools being used
• 14 additional tools reported only once
• 3 custom-built in-house CI/CD solutions
23
The good ...
The bad ...
The ugly
CI/CD migrations
30
Reasons for
CI migration
31
Why is GitHub Actions so popular?
• deep integration with GitHub
• ease of setup and use
• trendy
• speed
• reliability
• free tier for open source projects
• large marketplace of reusable Actions
• support for major operating systems
• company support (Microsoft)
• automation beyond CI/CD
33
Difficulties in CI migration
• Learning curve
• Fundamental differences between the source and target of the migration
• Trial-and-error nature of configuring a new CI tool
• Lack of familiarity with the new CI tool
• Important missing features
34
On the Use of GitHub Actions in Software
Development Repositories
35
https://doi.org/10.1109/ICSME55016.2022.00029
Research Questions
What are the characteristics of repositories using
workflows?
Which kinds of workflows are automated?
What are the most frequent jobs in workflows?
What are the automation practices?
Which types of Actions are reused?
Dataset
• 67,870 repositories
• 4 out of 10 repositories use GitHub Actions workflows
• 70,278 workflow files
• 108,500 jobs
• 567,352 steps
37
Quantification of jobs and workflows
Workflows in repositories
single workflow (49.3%)
more than one workflow (50.7%)
Jobs in workflows
single job (77.8%)
more than one job (22.2%)
38
Characteristics of GitHub repositories using GitHub Actions
Characteristic   Median (with workflows)   Median (without workflows)   Effect size
Pull Requests    124                       41                           medium
Contributors     20                        11                           small
Commits          598                       344                          small
Issues           105                       59                           small
Repos with GHA workflows tend to have more contributors, pull requests, commits, and issues
Most frequent event types triggering workflows (values in %)
Event               Share
push                63.4
PR                  56.3
schedule            16.1
workflow_dispatch   15.4
release             6.2
others              8.6
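A hedged Python sketch of how such shares could be computed from parsed workflow files. It is not the authors' tooling: it assumes each workflow is already loaded as a dictionary and relies on the fact that the `on` key of a workflow may be a single event name, a list of event names, or a mapping from event names to their configuration. The function names are hypothetical.

```python
from collections import Counter


def trigger_events(workflow):
    """Return the event names declared under a workflow's `on` key."""
    on = workflow.get("on")
    if on is None:
        return []
    if isinstance(on, str):
        return [on]
    # For a list this copies the names; for a mapping it takes the keys.
    return list(on)


def trigger_shares(workflows):
    """Percentage of workflows that declare each event type."""
    if not workflows:
        return {}
    counts = Counter()
    for wf in workflows:
        # Count each event at most once per workflow.
        counts.update(set(trigger_events(wf)))
    return {event: 100 * n / len(workflows) for event, n in counts.items()}
```

Because one workflow can declare several events, shares computed this way can add up to more than 100%.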
Different ways of executing code
Step type   Action target             % of steps   % of repositories
run:        --                        49.9%        93.5%
uses:       Local path                0.8%         2.0%
uses:       Docker image              0.1%         1.8%
uses:       Same repository           0.2%         0.4%
uses:       Same owner                0.7%         4.3%
uses:       Other public repository   48.3%        99.3%
Reusing Actions in steps is a common practice
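The target of a `uses:` reference can be classified with a few string checks. The sketch below is one possible reading of these categories, not the classification code of the study: it assumes that local paths start with `./`, that Docker images use the `docker://` prefix, and that all other references have the form `owner/repo[/path]@ref`. It cannot tell whether a repository is public, so the last category is simply called "other repository".

```python
def classify_uses(uses, repo):
    """Classify the target of a `uses:` reference.

    uses: the string found after `uses:` in a step.
    repo: 'owner/name' of the repository that contains the workflow.
    """
    if uses.startswith("./"):
        return "local path"
    if uses.startswith("docker://"):
        return "docker image"
    # Drop the version part (tag, branch or commit SHA) after '@'.
    target = uses.split("@", 1)[0]
    parts = target.split("/")
    owner = parts[0].lower()
    name = parts[1].lower() if len(parts) > 1 else ""
    repo_owner, _, repo_name = repo.lower().partition("/")
    if owner == repo_owner and name == repo_name:
        return "same repository"
    if owner == repo_owner:
        return "same owner"
    return "other repository"
```

For example, `classify_uses("actions/checkout@v3", "octo/app")` returns `"other repository"`, where `octo/app` is a made-up repository name.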
Which Actions are reused?
Top 5 most frequent Actions in steps and repositories
Action                    % of steps   % of repositories
actions/checkout          35.5%        98%
actions/cache             7.2%         22%
actions/setup-node        6.6%         26%
actions/upload-artifact   5.9%         19%
actions/setup-python      5.8%         21%
• A few Actions concentrate most of the reuse
• Most of them being developed by GitHub
On the Outdatedness of Workflows
in the GitHub Actions Ecosystem
Alexandre Decan¹, Hassan Onsori Delicheh, Tom Mens
Software Engineering Lab, University of Mons, Mons, Belgium
Abstract
GitHub Actions was introduced as a way to automate CI/CD workflows in
GitHub, the largest social coding platform. Thanks to its deep integration into
GitHub, GitHub Actions can be used to automate a wide range of social and
technical activities. Among its main features, it allows automation workflows
to rely on reusable components – the so-called Actions – to enable developers to
focus on the tasks that should be automated rather than on how to automate
them. As with any other kind of reusable software component, Actions are continuously
updated, causing many automation workflows to use outdated versions
of these Actions. Based on a dataset of nearly one million workflows obtained
from 22K+ repositories between November 2019 and September 2022, we provide
quantitative empirical evidence that reusing Actions in GitHub workflows
is common practice, even if this reuse tends to concentrate on a limited number
of Actions. We show that Actions are frequently updated, and we quantify the
extent to which automation workflows are outdated with respect to these Actions.
Using two complementary metrics, technical lag and opportunity lag, we found
that most of the workflows are using an outdated Action release, are lagging
behind the latest available release for at least 7 months, and had the opportunity
to be updated during at least 9 months. This calls for a more rigorous
management of Action outdatedness in automation workflows, as well as for
better policies and tooling to keep workflows up-to-date.
Keywords: software ecosystem, dependency management, continuous
integration, collaborative software development, workflow automation,
technical lag
Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan),
hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be
(Tom Mens)
1F.R.S.-FNRS Research Associate
Preprint submitted to Journal of Systems & Software March 21, 2023
Outdatedness in the
GitHub Actions ecosystem
• Four out of five workflows and nearly
two thirds of the steps are using an
outdated release of an Action.
• Steps using Actions provided by GitHub
are responsible for most of the
outdatedness.
• More than one third of the other steps
and nearly half of the other workflows
are using an outdated release of an
Action.
[Chart annotations: release of actions/checkout@v2, release of actions/checkout@v3, release of actions/setup-*@v2, release of actions/setup-*@v3]
[Diagram: Action lifeline with releases v1 to v4 and a GitHub workflow; labels: selected, latest, technical lag, observation date]
Outdatedness in the
GitHub Actions ecosystem
Technical lag of workflows / steps: the time period between the start of
reusing a selected Action and the latest release of that Action.
• Technical lag of outdated steps
tends to increase over time.
• Half of the outdated steps using
other Actions are using a version
that is lagging behind the latest one
for at least 7.3 months.
• Main cause of technical lag =
Actions provided by GitHub
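As a rough illustration of the technical lag metric, the sketch below measures the time between the release date of the selected version and the release date of the latest version available at the observation date. This is one reading of the definition and of the diagram labels (selected, latest, observation date), not the authors' implementation. The function name technical_lag and all release dates are hypothetical.

```python
from datetime import date


def technical_lag(releases: dict[str, date], selected: str, observed: date) -> int:
    """Days between the selected release and the latest release known at `observed`.

    `releases` maps a version label to its release date (hypothetical data).
    Returns 0 when the selected release is still the latest one.
    """
    # Only releases that already existed at the observation date count.
    available = {v: d for v, d in releases.items() if d <= observed}
    if selected not in available:
        raise ValueError(f"{selected} was not released before {observed}")
    latest_date = max(available.values())
    return (latest_date - available[selected]).days


if __name__ == "__main__":
    # Made-up release dates for an Action with four major versions.
    releases = {
        "v1": date(2020, 1, 15),
        "v2": date(2020, 9, 1),
        "v3": date(2021, 6, 10),
        "v4": date(2022, 3, 1),
    }
    # A workflow still on v2, observed before v4 exists: lag is v2 -> v3.
    print(technical_lag(releases, "v2", date(2022, 1, 1)), "days")
```

Because the lag is measured between two release dates, it only changes when a new release appears, not with every day that passes.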
Outdatedness in the
GitHub Actions ecosystem
Opportunity lag of workflows / steps: the time period during which a
workflow could have updated an outdated step to a more recent
version of an Action, but didn’t.
[Diagram: Action lifeline with releases v1 to v4 and a GitHub workflow; labels: selected, first update opportunity, opportunity lag, observation time]
• The opportunity lag of outdated steps
tends to increase over time.
• On average, maintainers of outdated
steps have had the opportunity to
update them for 9 months, but have not
done so.
• Main cause of opportunity lag =
Actions provided by GitHub
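The opportunity lag can be sketched in the same spirit: it runs from the first update opportunity (the first release that is newer than the selected one) up to the observation time. This is an illustrative reading of the definition, not the authors' implementation. It assumes the workflow was already using the selected release when that newer release appeared; the function name opportunity_lag and all dates are hypothetical.

```python
from datetime import date


def opportunity_lag(releases: dict[str, date], selected: str, observed: date) -> int:
    """Days during which an update to a newer release was possible but not done.

    `releases` maps a version label to its release date (hypothetical data).
    Returns 0 when no newer release existed at the observation time.
    """
    selected_date = releases[selected]
    # Releases published after the selected one and before the observation time.
    newer = [d for d in releases.values() if selected_date < d <= observed]
    if not newer:
        return 0
    first_opportunity = min(newer)
    return (observed - first_opportunity).days


if __name__ == "__main__":
    # Made-up release dates for an Action with four major versions.
    releases = {
        "v1": date(2020, 1, 15),
        "v2": date(2020, 9, 1),
        "v3": date(2021, 6, 10),
        "v4": date(2022, 3, 1),
    }
    # A workflow still on v2: the first opportunity was the release of v3.
    print(opportunity_lag(releases, "v2", date(2022, 1, 1)), "days")
```

Unlike the technical lag, this value grows with every day the step stays outdated, even when no further release is published.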
[Chart annotation: new releases for docker/*]
Thank you for
your attention.
Any questions?

Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

The (r)evolution of CI/CD on GitHub

  • 1. The (r)evolution of CI/CD on GitHub Promises and Perils of the GitHub Actions ecosystem Tom Mens Software Engineering Lab March 2023 SECO-ASSIST secoassist.github.io
  • 4. Collaborative software development: Commits, Issues, Pull Requests, Comments, Code Reviews, Discussions, Project Management, ... Continuous Integration: Quality analysis, Build, Test, Deploy. GitHub Actions
  • 6. Specifying GitHub Actions workflows: a repository stores its workflows under .github/workflows/. Workflows run in parallel. Each workflow contains jobs, which run in parallel by default or sequentially, and which can declare a strategy. Each job contains steps, which run sequentially. A step either reuses an Action (uses:) or runs a shell command (run:).
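The execution order on slide 6 can be illustrated with a minimal Python sketch. It assumes a workflow has already been parsed into a dictionary; the workflow content, the job names and the helper `job_waves` are hypothetical and only serve the illustration. Jobs without a `needs` key can start together, while `needs` makes a job wait for others; the steps inside a job always run in the listed order.

```python
# Hypothetical workflow, written as the dictionary a YAML parser would produce
# for a file under .github/workflows/. Names and commands are illustrative only.
workflow = {
    "name": "ci",
    "on": ["push", "pull_request"],
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},  # reuse an Action
                {"run": "npm ci"},                # run a shell command
                {"run": "npm test"},
            ],
        },
        "lint": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "npm run lint"},
            ],
        },
        "deploy": {
            "needs": ["build", "lint"],           # makes this job sequential
            "runs-on": "ubuntu-latest",
            "steps": [{"run": "echo deploying"}],
        },
    },
}


def job_waves(jobs):
    """Group jobs into waves: jobs in one wave may run in parallel, waves run in order."""
    done, waves = set(), []
    while len(done) < len(jobs):
        ready = sorted(
            name for name, job in jobs.items()
            if name not in done and set(job.get("needs", [])) <= done
        )
        if not ready:
            raise ValueError("cyclic or unknown 'needs' dependency")
        waves.append(ready)
        done.update(ready)
    return waves


waves = job_waves(workflow["jobs"])
```

Here `build` and `lint` end up in the first wave and `deploy` in the second, which mirrors the "parallel by default / sequential" label on the slide.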
  • 9. On the rise and fall of CI services in GitHub Mehdi Golzadeh Software Engineering Lab University of Mons Mons, Belgium mehdi.golzadeh@umons.ac.be Alexandre Decan Software Engineering Lab University of Mons Mons, Belgium alexandre.decan@umons.ac.be Tom Mens Software Engineering Lab University of Mons Mons, Belgium tom.mens@umons.ac.be Abstract—Continuous integration (CI) services are used in collaborative open source projects to automate parts of the development workflow. Such services have been in widespread use for over a decade, with new CIs being introduced over the years, sometimes overtaking other CIs in popularity. We conducted a longitudinal empirical study over a period of nine years, aiming to better understand this rapidly evolving CI landscape. By analysing the development history of 91,810 GitHub repositories of active npm packages having used at least one CI service, we quantitatively studied the evolution of seven popular CIs, specifically focusing on their co-usage and migration in the considered repositories. We provide statistical evidence of the rise of GitHub Actions, that has become the dominant CI service in less than 18 months time. This coincides with the fall of Travis that has seen an important decrease in usage, likely due to a combination of policy changes and migrations to GitHub Actions. Index Terms—Continuous integration, distributed software development, software repositories, GitHub I. INTRODUCTION Continuous integration (CI), deployment and delivery have become the cornerstone of collaborative software development and DevOps practices. CI automates the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run. Well-known examples of CI services are Jenkins, Travis, CircleCI and AppVeyor. CI services can also be built-in in social coding platforms such as GitHub and GitLab [1]. GitLab already featured CI capabilities since November 2012. 
Based on popular demand, and in response to CI support integrated in GitLab, GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this article) in October 2018. In August 2019, they officially began supporting Continuous Integration through GHA, and the product was released publicly in November 2019. GHA [2] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments and many more. GHA can be used to facilitate code reviews, code quality analysis, communication, dependency and security monitoring and management, testing, etc. GHA facilitates the integration with external services, and can even obviate the need of using such external services altogether. GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 56 million users in September 2020 [3]. Given its popularity and the ease with which GHA allows the CI workflow to be automated, we hypothesise that GHA has had a significant impact on today’s CI landscape. More particularly, we believe that it has increased the awareness of the need for CI, it has reduced the entry barrier for projects to start using CI, and it may have led projects to migrate from other CI services towards GHA. This article aims to quantitatively and objectively verify these hypotheses, and discusses their consequences, through a longitudinal analysis of how different CIs have been used over a nine-year period in 91,810 GitHub repositories corresponding to the software development history of reusable Node.JS packages distributed through the npm package registry. This empirical study focuses on four research questions: RQ1 How did the CI landscape evolve? We identified 20 different CIs being used in the considered set of repositories, some of which were considerably more prevalent than others.
Together with Travis, GHA covers more than 80% of all usages. Moreover, in only 18 months GHA has overtaken all other CIs in popularity. RQ2 What are the most frequent combinations of CIs? We observed that many repositories have used multiple CIs during their lifetime. AppVeyor is nearly always used in combination with some other CI. If a repository uses a CI simultaneously with another one, it is mostly in combination with Travis, GHA or CircleCI. RQ3 How frequently are CIs being replaced by an alternative? We observed a non-negligible amount of CI migrations. GHA attracted most of these migrations. The majority of migrations were moving away from Travis and towards GHA. RQ4 How has the CI landscape changed since GHA was introduced? Based on a regression discontinuity design, we found that the usage of Travis, Azure and CircleCI has been negatively affected by the introduction of GHA. This article is structured as follows. Section II motivates the selected dataset and discusses the data extraction and cleaning steps that were carried out. Sections III to VI provide answers to each research question. Section VII discusses the ramifications of these answers. Section VIII presents the threats to validity of the conducted research. Section IX presents the related work. Finally, Section X concludes. II.
DATA EXTRACTION In order to analyse the use of CIs in software development repositories on GitHub, we need a large dataset containing
On the Use of GitHub Actions in Software Development Repositories Alexandre Decan Software Engineering Lab University of Mons Mons, Belgium alexandre.decan@umons.ac.be Tom Mens Software Engineering Lab University of Mons Mons, Belgium tom.mens@umons.ac.be Pooya Rostami Mazrae Software Engineering Lab University of Mons Mons, Belgium pooya.rostamimazrae@umons.ac.be Mehdi Golzadeh Software Engineering Lab University of Mons Mons, Belgium mehdi.golzadeh@umons.ac.be Abstract—GitHub Actions was introduced in 2019 and constitutes an integrated alternative to CI/CD services for GitHub repositories. The deep integration with GitHub allows repositories to easily automate software development workflows. This paper empirically studies the use of GitHub Actions on a dataset comprising 68K repositories on GitHub, of which 43.9% are using GitHub Actions workflows. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice, even if this reuse is concentrated in a limited number of actions. We study which actions are most frequently used and how workflows refer to them. Furthermore, we discuss the related security and versioning aspects. As such, we provide an overview of the use of GitHub Actions, constituting a necessary first step towards a better understanding of this emerging ecosystem and its implications on collaborative software development in the GitHub social coding platform. Index Terms—GitHub Actions, continuous integration, collaborative software development, workflow automation I. INTRODUCTION Open source software (OSS) development is a continuous, highly distributed and collaborative endeavour [1]. Development of OSS projects faces many socio-technical challenges [2]–[4].
The multitude of tools (e.g., version control systems, software distribution managers, bug and issue trackers) and development-related activities makes it very challenging for contributor communities to keep up with the rapid pace of producing and maintaining high-quality software releases. Automated workflows were introduced to automate numerous repetitive social or technical activities that are inherently part of the collaborative software development process. Continuous integration, deployment and delivery (CI/CD) have become the cornerstone of collaborative software development and DevOps practices. Well-known examples of CI/CD services are Travis, Jenkins, CircleCI and TeamCity. They automate the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run. GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 73 million users in 2021 [5]. GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this paper) in October 2018 based on popular demand, and in response to GitLab’s integrated CI/CD support [6]. In August 2019, GitHub officially began supporting CI through GHA, and the product was released publicly in November 2019. GHA [7] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments, schedules, and many more. Its deep integration into GitHub implies that GHA can be used not only for executing test suites or deploying new releases as in traditional CI/CD services, but also to facilitate code reviews, communication, dependency and security monitoring and management, etc. GHA also promotes the use and sharing of reusable components, called actions, in workflows. These actions are distributed in public repositories and on the GitHub Marketplace.
They allow workflow developers to easily integrate specific tasks (e.g., set up a specific programming language environment, publish a release on a package registry, run tests and check code quality) without having to write the corresponding code. Since its public release in November 2019, GHA has become the most dominant CI/CD service, only 18 months after its introduction [8]. Its Marketplace of reusable actions has been growing exponentially ever since, reaching 12K reusable actions in February 2022. It is therefore fair to say that GHA has become a software ecosystem of its own, comparable to ecosystems of reusable software libraries (such as npm, RubyGems, CRAN, Maven, and PyPI) that have been empirically studied by many researchers in recent years (e.g., [9]–[14]). The emerging GHA ecosystem is worthy of being empirically studied in its own right since it is likely to suffer from the same issues related to dependency management, security vulnerabilities, outdated or obsolete components, backward compatibility, and so on. This article therefore quantitatively studies the use of GHA in 68K repositories on GitHub. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice and identify which actions are reused and how. As such, we provide an overview of the use of GHA, a necessary first step towards a better understanding of the emerging GHA ecosystem and its implications on software development in GitHub repositories.
More concretely, we answer the following research questions:
Empirical Software Engineering (2023) 28:52 https://doi.org/10.1007/s10664-022-10285-5 On the usage, co-usage and migration of CI/CD tools: A qualitative analysis Pooya Rostami Mazrae1 · Tom Mens1 · Mehdi Golzadeh1 · Alexandre Decan1 Accepted: 28 December 2022 © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023 Abstract Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative software development process. CI/CD tools automate a wide range of activities in the development workflow such as testing, linting, updating dependencies, creating and deploying releases, and so on. Previous quantitative studies have revealed important changes in the landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many software projects migrating to other CI/CD tools. In order to understand the reasons behind these changes in CI/CD usage, this paper presents a qualitative study based on in-depth interviews with 22 experienced software practitioners reporting on their usage, co-usage and migration of 31 different CI/CD tools. Following an inductive and deductive coding process, we analyse the interviews and found a high amount of competition between CI/CD tools. We observe multiple reasons for co-using different CI/CD tools within the same project, and we identify the main reasons and detractors for migrating to different alternatives. Among all reported migrations, we observe a clear trend of migrations away from Travis and migrations towards GitHub Actions and we identify the main reasons behind them. Keywords CI/CD · Collaborative software development · Workflow automation · Qualitative analysis · Empirical software engineering Communicated by: Alexander Serebrenik Alexandre Decan (F.R.S.-FNRS Research Associate)
Pooya Rostami Mazrae pooya.rostami.m@gmail.com; pooya.rostamimazrae@umons.ac.be Tom Mens tom.mens@umons.ac.be Mehdi Golzadeh golzadeh.mehdi@gmail.com Alexandre Decan alexandre.decan@umons.ac.be 1 Software Engineering Lab, Université de Mons, Mons, Belgium https://doi.org/10.1109/ICSME55016.2022.00029 https://doi.org/10.1109/SANER53432.2022.00084
  • 10. On the rise and fall of CI services in GitHub (Mehdi Golzadeh, Alexandre Decan, Tom Mens) https://doi.org/10.1109/SANER53432.2022.00084
  • 11. Dataset: 1.6M+; scoped packages; 803K packages on GitHub; excluded 11,557 forks; excluded inactive repositories; 201,403 repositories; presence of CI configuration files; 119,033 CI usages in 91,810 repositories; May 2021; cloned 676K
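The "presence of CI configuration files" step on slide 11 can be sketched as a lookup of well-known configuration file paths. The mapping below is an illustrative subset based on the conventional file locations of each service; it is an assumption for this sketch, not the list of patterns used in the study, and `detect_ci` is a hypothetical helper.

```python
from fnmatch import fnmatch

# Conventional configuration file locations for a few CI services.
# Illustrative subset only; the study's actual detection rules are not shown on the slide.
CI_PATTERNS = {
    "GitHub Actions": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Travis": [".travis.yml"],
    "CircleCI": [".circleci/config.yml"],
    "AppVeyor": ["appveyor.yml", ".appveyor.yml"],
    "Azure": ["azure-pipelines.yml"],
    "GitLab CI": [".gitlab-ci.yml"],
}


def detect_ci(paths):
    """Return the sorted names of the CI services whose config files appear in `paths`."""
    found = set()
    for path in paths:
        for ci, patterns in CI_PATTERNS.items():
            if any(fnmatch(path, pattern) for pattern in patterns):
                found.add(ci)
    return sorted(found)


# A repository tree containing two CI configurations counts as two CI usages.
usages = detect_ci([".travis.yml", ".github/workflows/test.yml", "src/index.js"])
```

Applying such a check to every commit of a repository, rather than to a single snapshot, is what makes co-usage and migration between CIs observable over time.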
  • 12. How prevalent is CI usage in GitHub repositories? CI services are used in more than half of all considered repositories.
  • 13. Evolution of GitHub CI/CD landscape: since 2021, GitHub Actions has become the dominant CI/CD tool on GitHub
  • 19. What happened to Travis? Travis changed its free plan; GHA was introduced
  • 20. On the usage, co-usage and migration of CI/CD tools: A qualitative analysis (Pooya Rostami Mazrae, Tom Mens, Mehdi Golzadeh, Alexandre Decan). Empirical Software Engineering (2023) 28:52 https://doi.org/10.1007/s10664-022-10285-5
  • 21. Methodology. Interview questionnaire: around 30 questions related to CI usage, co-usage and migration. Selection of respondents: candidates selected through Twitter, LinkedIn, email and direct messages; colleagues' referrals (snowballing). Geographic diversity: interviews held using an online video conferencing tool. Inclusion criteria: actively contributed to, or having been responsible for, a software project relying on CI; sufficient knowledge about which CI tool is used in that software project and how; having been involved in setting up or maintaining the CI process of the project
  • 22. Demographics of respondents • 22 respondents • 16 from 7 European countries • 4 from North America • 2 from Asia • software development experience: average of 12 years and 4 months • Good mix of industrial and open source contributors
  • 23. CI/CD tools being used • 14 additional tools reported only once • 3 custom-built in-house CI/CD solutions
  • 29. Why is GitHub Actions so popular? • deep integration with GitHub • ease of setup and use • trendy • speed • reliability • free tier for open source projects • large marketplace of reusable Actions • support for major operating systems • company support (Microsoft) • automation beyond CI/CD
  • 30. Difficulties in CI migration • Learning curve • Fundamental differences between the source and target of the migration • Trial-and-error nature of configuring a new CI tool • Lack of familiarity with the new CI tool • Important missing features
  • 31. On the Use of GitHub Actions in Software Development Repositories (Alexandre Decan, Tom Mens, Pooya Rostami Mazrae, Mehdi Golzadeh) https://doi.org/10.1109/ICSME55016.2022.00029
  • 32. Research Questions 36 What are the characteristics of repositories using workflows? Which kinds of workflows are automated? What are the most frequent jobs in workflows? What are the automation practices? Which types of Actions are reused?
  • 33. Dataset • 67,870 repositories • 4 out of 10 repositories use GitHub Actions workflows • 70,278 workflow files • 108,500 jobs • 567,352 steps
  • 34. Quantification of jobs and workflows • Workflows in repositories: single workflow (49.3%), more than one workflow (50.7%) • Jobs in workflows: single job (77.8%), more than one job (22.2%)
  • 35. Characteristics of GitHub repositories using GitHub Actions. Median with workflows vs. median without workflows (effect size interpretation) • Pull Requests: 124 vs. 41 (medium) • Contributors: 20 vs. 11 (small) • Commits: 598 vs. 344 (small) • Issues: 105 vs. 59 (small) • Repos with GHA workflows tend to have more contributors, pull requests, commits, and issues
  • 36. Most frequent event types triggering workflows (bar chart, values in %) • push: 63.4 • PR: 56.3 • schedule: 16.1 • workflow_dispatch: 15.4 • release: 6.2 • others: 8.6
  • 37. Different ways of executing code. Step type and Action target (% of steps, % of repositories) • run: 49.9%, 93.5% • uses: local path: 0.8%, 2.0% • uses: Docker image: 0.1%, 1.8% • uses: same repository: 0.2%, 0.4% • uses: same owner: 0.7%, 4.3% • uses: other public repository: 48.3%, 99.3% • Reusing Actions in steps is a common practice
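The step categories on this slide can be illustrated with a small Python sketch. It is not the authors' analysis code: the function name `classify_step` and the matching rules are assumptions for illustration. The `./` prefix for local actions and the `docker://` prefix for Docker images follow the GitHub Actions workflow syntax. Steps are assumed to be plain dicts, as a YAML parser would produce them.

```python
def classify_step(step, repo):
    """Return the category of a workflow step (hypothetical helper).

    `step` is a dict for one step of a workflow file; `repo` is the
    'owner/name' of the repository that contains the workflow.
    """
    if "run" in step:
        return "run"
    target = step.get("uses")
    if target is None:
        return "unknown"
    if target.startswith("./"):
        return "local path"
    if target.startswith("docker://"):
        return "docker image"
    # Drop the version reference: "actions/checkout@v3" -> "actions/checkout"
    path = target.split("@", 1)[0]
    owner, _, rest = path.partition("/")
    name = rest.split("/", 1)[0]  # ignore any sub-path inside the repository
    repo_owner, _, repo_name = repo.partition("/")
    if owner == repo_owner and name == repo_name:
        return "same repository"
    if owner == repo_owner:
        return "same owner"
    return "other public repository"


steps = [
    {"run": "npm test"},
    {"uses": "actions/checkout@v3"},
    {"uses": "./.github/actions/build"},
    {"uses": "docker://alpine:3.8"},
    {"uses": "octo-org/tools/lint@v1"},
]
categories = [classify_step(s, "octo-org/app") for s in steps]
```

Counting these categories over all steps and over all repositories would give the two percentage columns; the repository name `octo-org/app` and the sample steps are made up.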
  • 38. Which Actions are reused? Top 5 most frequent Actions (% of steps, % of repositories) • actions/checkout: 35.5%, 98% • actions/cache: 7.2%, 22% • actions/setup-node: 6.6%, 26% • actions/upload-artifact: 5.9%, 19% • actions/setup-python: 5.8%, 21% • A few Actions concentrate most of the reuse • Most of them are developed by GitHub
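A ranking like the one above could be computed along the following lines. This is a minimal sketch, not the study's tooling: `reuse_shares` and the sample data are hypothetical. It also simplifies by computing step shares over `uses:` steps only, whereas the slide reports shares over all steps.

```python
from collections import Counter


def action_name(uses):
    """Strip the version reference: 'actions/checkout@v3' -> 'actions/checkout'."""
    return uses.split("@", 1)[0]


def reuse_shares(repo_steps):
    """Map each Action to (share of steps, share of repositories).

    `repo_steps` maps a repository name to the list of `uses:` values
    found in its workflows (hypothetical input format).
    """
    step_counts = Counter()
    repo_counts = Counter()
    total_steps = 0
    for uses_list in repo_steps.values():
        names = [action_name(u) for u in uses_list]
        total_steps += len(names)
        step_counts.update(names)       # every occurrence counts
        repo_counts.update(set(names))  # at most once per repository
    n_repos = len(repo_steps)
    return {
        name: (step_counts[name] / total_steps, repo_counts[name] / n_repos)
        for name in step_counts
    }


shares = reuse_shares({
    "a/x": ["actions/checkout@v2", "actions/cache@v3"],
    "b/y": ["actions/checkout@v3", "actions/checkout@v3"],
})
```

Grouping by Action name rather than by exact version is what lets `actions/checkout@v2` and `actions/checkout@v3` count as the same Action.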
  • 39. On the Outdatedness of Workflows in the GitHub Actions Ecosystem. Alexandre Decan (F.R.S.-FNRS Research Associate), Hassan Onsori Delicheh, Tom Mens. Software Engineering Lab, University of Mons, Mons, Belgium. Abstract: GitHub Actions was introduced as a way to automate CI/CD workflows in GitHub, the largest social coding platform. Thanks to its deep integration into GitHub, GitHub Actions can be used to automate a wide range of social and technical activities. Among its main features, it allows automation workflows to rely on reusable components, the so-called Actions, to enable developers to focus on the tasks that should be automated rather than on how to automate them. As any other kind of reusable software components, Actions are continuously updated, causing many automation workflows to use outdated versions of these Actions. Based on a dataset of nearly one million workflows obtained from 22K+ repositories between November 2019 and September 2022, we provide quantitative empirical evidence that reusing Actions in GitHub workflows is common practice, even if this reuse tends to concentrate on a limited number of Actions. We show that Actions are frequently updated, and we quantify to which extent automation workflows are outdated with respect to these Actions. Using two complementary metrics, technical lag and opportunity lag, we found that most of the workflows are using an outdated Action release, are lagging behind the latest available release for at least 7 months, and had the opportunity to be updated during at least 9 months. This calls for a more rigorous management of Action outdatedness in automation workflows, as well as for better policies and tooling to keep workflows up-to-date.
Keywords: software ecosystem, dependency management, continuous integration, collaborative software development, workflow automation, technical lag. Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan), hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be (Tom Mens). Preprint submitted to Journal of Systems & Software, March 21, 2023.
  • 40. Outdatedness in the GitHub Actions ecosystem • Four out of five workflows and nearly two thirds of the steps are using an outdated release of an Action. • Steps using Actions provided by GitHub are responsible for most of the outdatedness. • More than one third of the other steps and nearly half of the other workflows are using an outdated release of an Action. (Chart annotations: release of actions/checkout@v2, release of actions/checkout@v3, release of actions/setup-*@v2, release of actions/setup-*@v3.)
  • 41. Outdatedness in the GitHub Actions ecosystem. Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action. (Diagram: Action lifeline with releases v1 to v4, the release selected by the GitHub workflow, the latest release, and the observation date; the technical lag spans from the selected release to the latest one.)
  • 42. Outdatedness in the GitHub Actions ecosystem. Technical lag of workflows / steps (defined on the previous slide): • Technical lag of outdated steps tends to increase over time. • Half of the outdated steps using other Actions are using a version that is lagging behind the latest one for at least 7.3 months. • Main cause of technical lag = Actions provided by GitHub
  • 43. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t. (Diagram: Action lifeline with releases v1 to v4, the release selected by the GitHub workflow, the first update opportunity, and the observation time; the opportunity lag spans from the first update opportunity to the observation time.)
  • 44. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps (defined on the previous slide): • The opportunity lag of outdated steps tends to increase over time. • On average, maintainers of outdated steps have had the opportunity to update them for 9 months, but have not done so. • Main cause of opportunity lag = Actions provided by GitHub (Chart annotation: new releases for docker/*.)
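The two lag metrics defined on the preceding slides can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation. It assumes technical lag runs from the release date of the selected version to the latest release available at observation time, as the diagram suggests. It also assumes releases appear in chronological version order, and that the workflow was already using the selected version when the next release came out. The function names and release dates are made up.

```python
from datetime import date


def technical_lag(releases, selected, observed):
    """Days between the selected release and the latest release at `observed`."""
    available = [d for d in releases.values() if d <= observed]
    return (max(available) - releases[selected]).days


def opportunity_lag(releases, selected, observed):
    """Days since the first release newer than the selected one (0 if none)."""
    newer = [d for d in releases.values() if releases[selected] < d <= observed]
    if not newer:
        return 0  # the step is up to date: there was nothing to update to
    return (observed - min(newer)).days


# Hypothetical release dates for one Action
releases = {
    "v1": date(2020, 1, 1),
    "v2": date(2020, 7, 1),
    "v3": date(2021, 1, 1),
    "v4": date(2021, 7, 1),
}
observed = date(2021, 10, 1)

tech = technical_lag(releases, "v2", observed)   # 365 days (v2 -> v4)
opp = opportunity_lag(releases, "v2", observed)  # 273 days (since v3 appeared)
```

The two metrics differ in their reference points. Technical lag compares two release dates, while opportunity lag measures how long an update has been possible up to the observation time.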
  • 45. Thank you for your attention. Any questions?