In this presentation we explore how the CI/CD landscape on GitHub has evolved since the introduction of GitHub Actions. This presentation is based on several peer-reviewed articles published in 2022 and 2023.
The (r)evolution of CI/CD on GitHub
Promises and Perils of the GitHub Actions ecosystem
Tom Mens
Software Engineering Lab
March 2023
SECO-ASSIST
secoassist.github.io
On the rise and fall of CI services in GitHub
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Abstract—Continuous integration (CI) services are used in
collaborative open source projects to automate parts of the
development workflow. Such services have been in widespread use
for over a decade, with new CIs being introduced over the years,
sometimes overtaking other CIs in popularity. We conducted a
longitudinal empirical study over a period of nine years, aiming
to better understand this rapidly evolving CI landscape. By
analysing the development history of 91,810 GitHub repositories
of active npm packages having used at least one CI service,
we quantitatively studied the evolution of seven popular CIs,
specifically focusing on their co-usage and migration in the
considered repositories. We provide statistical evidence of the rise
of GitHub Actions, which has become the dominant CI service in
less than 18 months. This coincides with the fall of Travis,
which has seen a substantial decrease in usage, likely due to a
combination of policy changes and migrations to GitHub Actions.
Index Terms—Continuous integration, distributed software
development, software repositories, GitHub
I. INTRODUCTION
Continuous integration (CI), deployment and delivery have
become the cornerstone of collaborative software development
and DevOps practices. CI automates the integration of code
changes from multiple contributors into a central repository
where automated builds, tests and code quality checks run.
Well-known examples of CI services are Jenkins, Travis,
CircleCI and AppVeyor. CI services can also be built into
social coding platforms such as GitHub and GitLab [1]. GitLab
has featured CI capabilities since November 2012. Based
on popular demand, and in response to CI support integrated
in GitLab, GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this article) in October 2018. In August 2019, they officially
began supporting Continuous Integration through GHA, and
the product was released publicly in November 2019.
GHA [2] allows the automation of a wide range of tasks based
on a variety of triggers such as commits, issues, pull requests,
comments and many more. GHA can be used to facilitate code
reviews, code quality analysis, communication, dependency
and security monitoring and management, testing, etc. GHA
facilitates the integration with external services, and can even
obviate the need for such external services altogether.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 56 million users in
September 2020 [3]. Given its popularity and the ease with
which GHA automates the CI workflow, we hypothesise
that GHA has had a significant impact on today’s CI
landscape. More particularly, we believe that it has increased
the awareness of the need for CI, reduced the entry
barrier for projects to start using CI, and may have led
projects to migrate from other CI services towards GHA.
This article aims to quantitatively and objectively verify
these hypotheses, and discusses their consequences, through a
longitudinal analysis of how different CIs have been used over
a nine-year period in 91,810 GitHub repositories corresponding
to the software development history of reusable Node.js
packages distributed through the npm package registry. This
empirical study focuses on four research questions:
RQ1 How did the CI landscape evolve? We identified 20
different CIs being used in the considered set of repositories,
some of which were considerably more prevalent than others.
Together with Travis, GHA covers more than 80% of all
usages. Moreover, in only 18 months GHA has overtaken all
other CIs in popularity.
RQ2 What are the most frequent combinations of CIs? We
observed that many repositories have used multiple CIs during
their lifetime. AppVeyor is nearly always used in combination
with some other CI. If a repository uses a CI simultaneously
with another one, it is mostly in combination with Travis,
GHA or CircleCI.
RQ3 How frequently are CIs being replaced by an alternative?
We observed a non-negligible number of CI migrations. GHA
attracted most of these migrations. The majority of migrations
were moving away from Travis and towards GHA.
RQ4 How has the CI landscape changed since GHA was
introduced? Based on a regression discontinuity design, we
found that the usage of Travis, Azure and CircleCI has been
negatively affected by the introduction of GHA.
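The idea behind the regression discontinuity design in RQ4 can be sketched as follows. This is a minimal illustration, not the model used in the study: the paper's exact specification is not given here, so the sketch assumes the simplest variant, fitting one straight line before the cutoff and one after it, then comparing levels and slopes at the cutoff. The function names and the monthly usage series are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def discontinuity_at(cutoff, months, usage):
    """Compare the fitted trend just before and just after `cutoff`.

    Time is centred on the cutoff, so each intercept is the fitted
    usage level at the cutoff itself.
    """
    before = [(m - cutoff, u) for m, u in zip(months, usage) if m < cutoff]
    after = [(m - cutoff, u) for m, u in zip(months, usage) if m >= cutoff]
    b_level, b_slope = fit_line(*zip(*before))
    a_level, a_slope = fit_line(*zip(*after))
    return {
        "level_change": a_level - b_level,
        "slope_change": a_slope - b_slope,
    }


# Hypothetical series: usage of some CI grows by 2 repositories per month,
# then drops by 10 and shrinks by 3 per month once a competitor appears
# in month 12. These numbers are made up for the example.
months = list(range(24))
usage = [100 + 2 * m if m < 12 else 114 - 3 * (m - 12) for m in months]
result = discontinuity_at(12, months, usage)
```

A negative `level_change` or `slope_change` at the cutoff is the kind of signal that would suggest a CI was negatively affected by the introduction of a competitor.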
This article is structured as follows. Section II motivates the
selected dataset and discusses the data extraction and cleaning
steps that were carried out. Sections III to VI provide answers
to each research question. Section VII discusses the ramifications
of these answers. Section VIII presents the threats to
validity of the conducted research. Section IX presents the
related work. Finally, Section X concludes.
II. DATA EXTRACTION
In order to analyse the use of CIs in software development
repositories on GitHub, we need a large dataset containing
https://doi.org/10.1109/SANER53432.2022.00084
Dataset
• 1.6M+
• Scoped packages
• 803K packages on GitHub
• Excluded 11,557 forks
• Excluded inactive repositories
• 201,403 repositories
• Presence of CI configuration files
• 119,033 CI usages in 91,810 repositories
• May 2021
• Cloned 676K
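The "presence of CI configuration files" step can be sketched as a lookup on the file paths of a repository. This is only an illustration under assumptions: the file names below are the conventional default locations for each service, and both the list of services and the `detect_ci` helper are hypothetical rather than the detection rules used in the study.

```python
# Conventional default configuration files, keyed by CI service.
# Assumption: the study may have used a different or longer list.
ROOT_CONFIG_FILES = {
    "Travis": {".travis.yml"},
    "CircleCI": {".circleci/config.yml"},
    "AppVeyor": {"appveyor.yml", ".appveyor.yml"},
    "Azure": {"azure-pipelines.yml"},
    "GitLab CI": {".gitlab-ci.yml"},
    "Jenkins": {"Jenkinsfile"},
}


def detect_ci(paths):
    """Return the set of CI services whose configuration files appear in `paths`.

    `paths` is an iterable of file paths relative to the repository root.
    """
    found = set()
    for path in paths:
        # GitHub Actions workflows are YAML files in .github/workflows/.
        if path.startswith(".github/workflows/") and path.endswith((".yml", ".yaml")):
            found.add("GitHub Actions")
        for service, names in ROOT_CONFIG_FILES.items():
            if path in names:
                found.add(service)
    return found


example = detect_ci([".travis.yml", ".github/workflows/ci.yml", "README.md"])
```

Running the same check on every commit of a repository, rather than on a single snapshot, is what would make co-usage and migration between CIs observable over time.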
How prevalent is CI usage
in GitHub repositories?
CI services are used in
more than half of all
considered repositories.
Evolution of GitHub CI/CD landscape
Since 2021, GitHub Actions has become
the dominant CI/CD tool on GitHub
Methodology
Interview questionnaire
• Around 30 questions related to CI usage, co-usage and migration
Selection of respondents
• Selected candidates through Twitter, LinkedIn, email, direct messages
• Colleagues' referrals (snowballing)
Geographic diversity
• Using an online video conferencing tool
Inclusion criteria
• Actively contributed to, or having been responsible for a software project relying on CI
• Sufficient knowledge about which CI tool is used in that software project and how
• Having been involved in setting up or maintaining the CI process of the project
Demographics of respondents
• 22 respondents
• 16 from 7 European countries
• 4 from North America
• 2 from Asia
• Software development experience: average of 12 years and 4 months
• Good mix of industrial and open source contributors
CI/CD tools being used
• 14 additional tools reported only once
• 3 custom-built in-house CI/CD solutions
Why is GitHub Actions so popular?
• deep integration with GitHub
• ease of setup and use
• trendy
• speed
• reliability
• free tier for open source projects
• large marketplace of reusable Actions
• support for major operating systems
• company support (Microsoft)
• automation beyond CI/CD
Difficulties in CI migration
• Learning curve
• Fundamental differences between the source and target of the migration
• Trial-and-error nature of configuring a new CI tool
• Lack of familiarity with the new CI tool
• Important missing features
On the Use of GitHub Actions in Software
Development Repositories
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Pooya Rostami Mazrae
Software Engineering Lab
University of Mons
Mons, Belgium
pooya.rostamimazrae@umons.ac.be
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Abstract—GitHub Actions was introduced in 2019 and constitutes
an integrated alternative to CI/CD services for GitHub
repositories. The deep integration with GitHub allows repositories
to easily automate software development workflows. This
paper empirically studies the use of GitHub Actions on a dataset
comprising 68K repositories on GitHub, of which 43.9% are using
GitHub Actions workflows. We analyse which workflows are
automated and identify the most frequent automation practices.
We show that reuse of actions is a common practice, even if
this reuse is concentrated in a limited number of actions. We
study which actions are most frequently used and how workflows
refer to them. Furthermore, we discuss the related security
and versioning aspects. As such, we provide an overview of
the use of GitHub Actions, constituting a necessary first step
towards a better understanding of this emerging ecosystem and
its implications on collaborative software development in the
GitHub social coding platform.
Index Terms—GitHub Actions, continuous integration, collaborative
software development, workflow automation
I. INTRODUCTION
Open source software (OSS) development is a continuous,
highly distributed and collaborative endeavour [1]. Development
of OSS projects faces many socio-technical challenges
[2]–[4]. The multitude of tools (e.g., version control systems,
software distribution managers, bug and issue trackers) and
development-related activities makes it very challenging for
contributor communities to keep up with the rapid pace of
producing and maintaining high-quality software releases.
Automated workflows were introduced to automate numerous
repetitive social or technical activities that are inherently
part of the collaborative software development process.
Continuous integration, deployment and delivery (CI/CD) have
become the cornerstone of collaborative software development
and DevOps practices. Well-known examples of CI/CD
services are Travis, Jenkins, CircleCI and TeamCity. They
automate the integration of code changes from multiple contributors
into a central repository where automated builds, tests
and code quality checks run.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 73 million users in
2021 [5]. GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this paper) in October 2018 based on popular demand, and in
response to GitLab’s integrated CI/CD support [6]. In August
2019, GitHub officially began supporting CI through GHA,
and the product was released publicly in November 2019.
GHA [7] allows the automation of a wide range of tasks
based on a variety of triggers such as commits, issues, pull
requests, comments, schedules, and many more. Its deep
integration into GitHub implies that GHA can be used not
only for executing test suites or deploying new releases
as in traditional CI/CD services, but also to facilitate code
reviews, communication, dependency and security monitoring
and management, etc. GHA also promotes the use and sharing
of reusable components, called actions, in workflows. These
actions are distributed in public repositories and on the GitHub
Marketplace. They allow workflow developers to easily integrate
specific tasks (e.g., set up a specific programming
language environment, publish a release on a package registry,
run tests and check code quality) without having to write the
corresponding code.
After its public release in November 2019, GHA became
the dominant CI/CD service within only 18 months
[8]. Its Marketplace of reusable actions
has been growing exponentially ever since, reaching 12K
reusable actions in February 2022. It is therefore fair to
say that GHA has become a software ecosystem of its own,
comparable to ecosystems of reusable software libraries (such
as npm, RubyGems, CRAN, Maven, and PyPI) that have been
empirically studied by many researchers in recent years (e.g.,
[9]–[14]).
The emerging GHA ecosystem is worthy of being empirically
studied in its own right since it is likely to suffer from
the same issues related to dependency management, security
vulnerabilities, outdated or obsolete components, backward
compatibility, and so on. This article therefore quantitatively
studies the use of GHA in 68K repositories on GitHub. We
analyse which workflows are automated and identify the most
frequent automation practices. We show that reuse of actions
is a common practice and identify which actions are reused
and how. As such, we provide an overview of the use of
GHA, a necessary first step towards a better understanding
of the emerging GHA ecosystem and its implications on
software development in GitHub repositories. More concretely,
we answer the following research questions:
https://doi.org/10.1109/ICSME55016.2022.00029
Research Questions
What are the characteristics of repositories using workflows?
Which kinds of workflows are automated?
What are the most frequent jobs in workflows?
What are the automation practices?
Which types of Actions are reused?
Dataset
• 67,870 repositories
• 4 out of 10 repositories use GitHub Actions workflows
• 70,278 workflow files
• 108,500 jobs
• 567,352 steps
Quantification of jobs and workflows
Workflows in repositories
single workflow (49.3%)
more than one workflow (50.7%)
Jobs in workflows
single job (77.8%)
more than one job (22.2%)
Characteristics of GitHub repositories using GitHub Actions

Characteristic   Median (with workflows)   Median (without workflows)   Effect size (interpretation)
Pull Requests    124                       41                           medium
Contributors     20                        11                           small
Commits          598                       344                          small
Issues           105                       59                           small
Repos with GHA workflows tend to have more
contributors, pull requests, commits, and issues
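The effect size labels above (small, medium) can be illustrated with a short sketch. The slide does not name the statistic, so this example assumes Cliff's delta, a common non-parametric effect size, with the commonly cited thresholds 0.147, 0.33 and 0.474. The sample values are made up for the example.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def interpret(delta):
    """Map |delta| to a label using the commonly cited thresholds."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


# Hypothetical pull request counts, not taken from the dataset.
with_workflows = [5, 8, 12, 20]
without_workflows = [3, 4, 9, 10]
delta = cliffs_delta(with_workflows, without_workflows)
label = interpret(delta)
```

The pairwise comparison is quadratic in the sample sizes, which is fine for a sketch but would need a rank-based implementation for tens of thousands of repositories.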
Different ways of executing code

Step type   Action target             % of steps   % of repositories
run:        --                        49.9%        93.5%
uses:       Local path                0.8%         2.0%
uses:       Docker image              0.1%         1.8%
uses:       Same repository           0.2%         0.4%
uses:       Same owner                0.7%         4.3%
uses:       Other public repository   48.3%        99.3%
Reusing Actions in steps is a common practice
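The step categories above can be told apart from the `run:` and `uses:` keys of a workflow step. The sketch below relies on the documented `uses:` forms (`./path` for a local action, `docker://image` for a Docker image, `owner/repo@ref` for an action in a repository). The `classify_step` helper and the example repository names are hypothetical, and the sketch cannot check whether the referenced repository is actually public.

```python
def classify_step(step, repository):
    """Classify one workflow step.

    `step` is a dict as obtained from parsing the workflow YAML,
    `repository` is the "owner/name" of the repository holding the workflow.
    """
    if "run" in step:
        return "run"
    target = step.get("uses", "")
    if target.startswith("./"):
        return "local path"
    if target.startswith("docker://"):
        return "docker image"
    # Remaining form: owner/repo[/path]@ref
    parts = target.split("@", 1)[0].split("/")
    if len(parts) < 2 or not all(parts[:2]):
        return "unknown"
    owner, name = parts[0].lower(), parts[1].lower()
    repo_owner, repo_name = repository.lower().split("/", 1)
    if owner == repo_owner and name == repo_name:
        return "same repository"
    if owner == repo_owner:
        return "same owner"
    return "other public repository"


steps = [
    {"uses": "actions/checkout@v3"},
    {"run": "npm test"},
    {"uses": "./.github/actions/build"},
    {"uses": "docker://alpine:3.8"},
    {"uses": "acme/tools/lint@v1"},
]
kinds = [classify_step(s, "acme/app") for s in steps]
```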
Which Actions are reused?
Top 5 most frequent Actions in steps and repositories

Action                    % of steps   % of repositories
actions/checkout          35.5%        98%
actions/cache             7.2%         22%
actions/setup-node        6.6%         26%
actions/upload-artifact   5.9%         19%
actions/setup-python      5.8%         21%
• A few Actions concentrate most of the reuse
• Most of them are developed by GitHub
On the Outdatedness of Workflows
in the GitHub Actions Ecosystem
Alexandre Decan (1), Hassan Onsori Delicheh, Tom Mens
Software Engineering Lab, University of Mons, Mons, Belgium
Abstract
GitHub Actions was introduced as a way to automate CI/CD workflows in
GitHub, the largest social coding platform. Thanks to its deep integration into
GitHub, GitHub Actions can be used to automate a wide range of social and
technical activities. Among its main features, it allows automation workflows
to rely on reusable components – the so-called Actions – to enable developers to
focus on the tasks that should be automated rather than on how to automate
them. Like any other kind of reusable software component, Actions are
continuously updated, causing many automation workflows to use outdated versions
of these Actions. Based on a dataset of nearly one million workflows obtained
from 22K+ repositories between November 2019 and September 2022, we
provide quantitative empirical evidence that reusing Actions in GitHub workflows
is common practice, even if this reuse tends to concentrate on a limited number
of Actions. We show that Actions are frequently updated, and we quantify the
extent to which automation workflows are outdated with respect to these Actions.
Using two complementary metrics, technical lag and opportunity lag, we found
that most of the workflows are using an outdated Action release, are lagging
behind the latest available release by at least 7 months, and had the
opportunity to be updated for at least 9 months. This calls for a more rigorous
management of Action outdatedness in automation workflows, as well as for
better policies and tooling to keep workflows up-to-date.
Keywords: software ecosystem, dependency management, continuous
integration, collaborative software development, workflow automation,
technical lag
Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan),
hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be
(Tom Mens)
(1) F.R.S.-FNRS Research Associate
Preprint submitted to Journal of Systems & Software March 21, 2023
Outdatedness in the GitHub Actions ecosystem
• Four out of five workflows and nearly two thirds of the steps are using an outdated release of an Action.
• Steps using Actions provided by GitHub are responsible for most of the outdatedness.
• More than one third of the other steps and nearly half of the other workflows are using an outdated release of an Action.
Figure: chart annotated with the releases of actions/checkout@v2, actions/checkout@v3, actions/setup-*@v2 and actions/setup-*@v3.
Figure: technical lag on the lifeline of an Action (releases v1 to v4), shown between the release selected by a GitHub workflow and the latest release at the observation date.
Outdatedness in the
GitHub Actions ecosystem
Technical lag of workflows / steps: the time period between the start of
reusing a selected Action and the latest release of that Action.
• Technical lag of outdated steps tends to increase over time.
• Half of the outdated steps using other Actions are using a version that is lagging behind the latest one for at least 7.3 months.
• Main cause of technical lag: Actions provided by GitHub
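Technical lag can be sketched as a date difference. The slides give no formula, so this example assumes one reading of the definition: the time between the release date of the Action version selected by a step and the release date of the latest version available at the observation date. The `technical_lag` helper, the action and its release dates are all hypothetical.

```python
from datetime import date


def technical_lag(releases, selected, observed_on):
    """Time between the selected release and the latest release available.

    `releases` maps a version label to its release date. Only releases
    published on or before `observed_on` are taken into account.
    """
    available = [d for d in releases.values() if d <= observed_on]
    latest = max(available)
    return latest - releases[selected]


# Made-up release history for a hypothetical action "example/action".
releases = {
    "v1": date(2019, 8, 1),
    "v2": date(2020, 1, 15),
    "v3": date(2022, 3, 1),
}
lag = technical_lag(releases, "v2", date(2022, 9, 1))
up_to_date = technical_lag(releases, "v3", date(2022, 9, 1))
```

The function returns a `datetime.timedelta`, so `lag.days` gives the lag in days. A step that already uses the latest release has a lag of zero.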
Outdatedness in the
GitHub Actions ecosystem
Opportunity lag of workflows / steps: the time period during which a
workflow could have updated an outdated step to a more recent
version of an Action, but didn’t.
Figure: opportunity lag on the lifeline of an Action (releases v1 to v4), shown between the first update opportunity after the release selected by a GitHub workflow and the observation time.
• The opportunity lag of outdated steps tends to increase over time.
• On average, maintainers of outdated steps have had the opportunity to update them for 9 months, but have not done so.
• Main cause of opportunity lag: Actions provided by GitHub
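Opportunity lag can be sketched in the same way. This example assumes that the first update opportunity is the first release published after the selected one, and that the lag runs from that date to the observation time. The slides do not give an exact formula, so the `opportunity_lag` helper, the action and its release dates are hypothetical.

```python
from datetime import date, timedelta


def opportunity_lag(releases, selected, observed_on):
    """Time during which a newer release was available but not adopted.

    `releases` maps a version label to its release date. Returns a zero
    duration when no newer release exists at `observed_on`.
    """
    selected_date = releases[selected]
    newer = sorted(
        d for d in releases.values() if selected_date < d <= observed_on
    )
    if not newer:
        return timedelta(0)
    first_opportunity = newer[0]
    return observed_on - first_opportunity


# Made-up release history for a hypothetical action "example/action".
releases = {
    "v1": date(2019, 8, 1),
    "v2": date(2020, 1, 15),
    "v3": date(2022, 3, 1),
}
missed = opportunity_lag(releases, "v1", date(2020, 10, 15))
none_missed = opportunity_lag(releases, "v3", date(2022, 9, 1))
```

Unlike technical lag, this value keeps growing with the observation time for as long as the step is not updated, even if no further releases are published.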