R is used in a vast range of ways, from purely ad hoc use by hobbyists to organized, structured development in an enterprise. Each way of using R brings different reproducibility challenges. Going through a range of typical workflows, we will show that understanding reproducibility must start with understanding your workflow. For each workflow, we will show how we deal with its reproducibility challenges using R Suite (http://rsuite.io), the open-source solution we developed to support our large-scale R development.
5. Copyright (c) WLOG Solutions
John
Could not deliver his R lab homework due to a package incompatibility on his professor's laptop.
Kate and Henry
Missed deadlines due to problems installing packages for their R Shiny app on a customer's server running Red Hat Enterprise Linux 6.8.
The Team
Had serious issues with package version conflicts due to many users and many projects running on a Red Hat Enterprise machine without internet access.
Three different stories, the same reproducibility problem.
Reproducibility is the ability to run your code repeatedly, at a different time, on a different computer, in such a way as to obtain the same outputs given the same inputs.
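The definition above can be turned into a simple check: run the same code twice on the same inputs and compare a digest of the outputs. The following is a minimal, language-neutral sketch written in Python, not part of the original talk. The function names `run_analysis` and `fingerprint`, the seed, and the input values are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
import random


def run_analysis(inputs, seed=42):
    """Hypothetical analysis step: a bootstrap estimate of the mean."""
    # A local, explicitly seeded generator avoids hidden global state.
    rng = random.Random(seed)
    resamples = [
        sum(rng.choices(inputs, k=len(inputs))) / len(inputs)
        for _ in range(200)
    ]
    estimate = sum(resamples) / len(resamples)
    return {"n": len(inputs), "bootstrap_mean": round(estimate, 6)}


def fingerprint(result):
    """Stable digest of a result, so that two runs can be compared."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


inputs = [2.0, 3.5, 5.0, 7.25]  # illustrative inputs
first = fingerprint(run_analysis(inputs))
second = fingerprint(run_analysis(inputs))
print("reproducible:", first == second)
```

Storing such a digest next to the results gives a later run, at a different time or on a different computer, something concrete to compare against.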
Bare metal
Operating system
Solution dependencies
Code
Data
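One way to read the five layers listed above is as a checklist: record each layer for two environments and see which layers differ. The sketch below, in Python, is only an illustration of that idea and is not how R Suite works. The class `EnvironmentSnapshot`, the function `differing_layers`, and every value in the two snapshots (package versions, commit id, data hash) are hypothetical placeholders.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EnvironmentSnapshot:
    """One field per layer named above; all values are illustrative."""
    bare_metal: str
    operating_system: str
    solution_dependencies: tuple
    code: str
    data: str


def differing_layers(a, b):
    """Return the names of the layers whose recorded values differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Hypothetical development and production snapshots.
development = EnvironmentSnapshot(
    bare_metal="x86_64",
    operating_system="Ubuntu 16.04",
    solution_dependencies=(("data.table", "1.10.4"), ("shiny", "1.0.5")),
    code="git:3f2a1c9",
    data="sha256:ab12",
)
production = EnvironmentSnapshot(
    bare_metal="x86_64",
    operating_system="RedHat Enterprise 6.8",
    solution_dependencies=(("data.table", "1.9.6"), ("shiny", "1.0.5")),
    code="git:3f2a1c9",
    data="sha256:ab12",
)

# Layers are reported in the order they are declared on the class.
print(differing_layers(development, production))
```

In this made-up case the code and data match, so any difference in outputs would have to come from the operating system or the solution dependencies.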
When is reproducibility important while you program in R?
Deploy (share) solution to production
Development: Debian/Ubuntu, RedHat/CentOS, Windows
Production: Debian/Ubuntu, RedHat/CentOS, Windows
Restore development environment
Development: Debian/Ubuntu, RedHat/CentOS, Windows
Development': Debian/Ubuntu, RedHat/CentOS, Windows
Three workflows, three reproducibility solutions.
John, student/hobbyist
Dev/Production
Version control
Family & Friends or Professor
MRAN
Kate and Henry, consultancy team/freelancer/scientist
Dev
Production
Continuous integration
Version control
Local CRAN
MRAN
On-premise, cloud, Spark, etc.
The Team, corporate/in-house team
Dev
Production
Continuous integration
Version control
Local CRAN
One word on Docker
Development, Production
Build for different OS
Deployment package (.zip)
Second word on Docker
Development, Production
Build Docker image
CRAN management
Multiple R versions
Debian/Ubuntu, Windows, RedHat/CentOS
Docker
Jenkins
Isolated projects
No installation on prod
Internetless environments
System requirements
Git/SVN
Binary packages
http://rsuite.io
https://github.com/WLOGSolutions/RSuite
https://www.slideshare.net/WLOGSolutions