SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
Automatically Archiving
Reproducible Studies
with Docker
Daniel Nüst | University of Münster | @nordholmen
useR!2017 Brussels, Thu 12:12 pm, room 2.02
http://sched.co/AxqM
Contents
Motivation
Docker
containerit
2
3
#1: reproducibility helps to avoid disaster
#2: reproducibility makes it easier to write papers
#3: reproducibility helps reviewers see it your way
#4: reproducibility enables continuity of your work
#5: reproducibility helps to build your reputation
4
ERC creation process
❏ Submit workspace to publication platform
❏ Publication platform…
❏ extracts metadata
❏ executes analysis
❏ check output vs. upload (syntax)
❏ capture runtime environment
(manifest + image)
❏ User checks metadata
❏ Publish ERC persistently
5
6
Slide by Docker inventor & Docker, Inc. CTO Solomon Hykes, DockerCon 2014
7
giphy.com
https://docs.docker.com/engine/docker-overview/ 8
science
data science
research
reproducibility
replication
package &
separate
applications and
their dependencies
for cloud
infrastructures
https://docs.docker.com/engine/docker-overview/ 9
science
data science
research
reproducibility
replication
package &
separate
applications and
their
dependencies for
cloud
infrastructures
Iconsbysynonymsof(CC-BY)
Docker basics
Dockerfile
ENV
RUN
CMD
Docker
Image
pause
stop/kill
start
logs
cp
exec
rm
stats
build
Docker CLI
run Docker
Container
Docker Engine
Docker Registry
run
11
Docker for Data Science
(all the Docker advantages… write once, biz ops, cloud, etc.)
Reproducibility
Project separation + don’t clutter dev machine
Environment (re)creation, documentation
Adopt good practices on the way
Easy collaboration
Easy transition from testing to production
Docker for RR lesson
https://github.com/
nuest/
docker-reproducible
-research
https://nuest.github.io/
docker-reproducible
-research
12
13
https://hub.docker.com/r/rocker/rstudio/
Base containers (r-base, r-devel, r-ver, ..)
Use case containers (r-devel-ubsan-clang, ..)
Stacks (tidyverse, geospatial, ..)
docker run -it -p 8787:8787 rocker/rstudio
http://localhost:8787/ (rstudio/rstudio)
Rocker: https://github.com/rocker-org
https://bioconductor.org/help/docker/
14
https://hub.docker.com/u/bioconductor/
15
giphy.com
16
containerit
https://github.com/
o2r-project/
containerit
http://o2r.info/2017/05/30/
containerit-package/
Packaging interactive session
17
> library(containerit); library("gstat"); library("sp")
> data(meuse)
> coordinates(meuse) = ~x+y
> data(meuse.grid)
> gridded(meuse.grid) = ~x+y
> v <- variogram(log(zinc)~1, meuse)
> m <- fit.variogram(v, vgm(1, "Sph", 300, 1))
> plot(v, model = m)
> dockerfile_object <- dockerfile()
INFO [2017-07-05 11:20:54] Trying to determine system
requirements for the package(s) 'sp, gstat, zoo,
futile.logger, xts, lambda.r, spacetime, futile.options,
FNN, intervals, lattice' from sysreq online DB
INFO [2017-07-05 11:21:03] Adding CRAN packages: sp,
gstat, zoo, futile.logger, xts, lambda.r, spacetime,
futile.options, FNN, intervals, lattice
INFO [2017-07-05 11:21:03] Created Dockerfile-Object
based on sessionInfo
> print(dockerfile_object)
FROM rocker/r-ver:3.4.1
LABEL maintainer="daniel"
RUN ["install2.r", "-r 'https://cloud.r-project.org'", "sp",
"gstat", "zoo", "futile.logger", "xts", "lambda.r",
"spacetime", "futile.options", "FNN", "intervals",
"lattice"]
WORKDIR /payload/
CMD ["R"]
> str(dockerfile_object, max.level = 2)
Formal class 'Dockerfile' [..] with 4 slots
..@ image :Formal class 'From' [..] with 2 slots
..@ maintainer :Formal class 'Label' [..] with 2 slots
..@ instructions :List of 2
..@ cmd :Formal class 'Cmd' [..] with 2 slots
Packaging a script w/ sysreqs dependency resolving
library(rgdal); require(maptools)
nc <- rgdal::readOGR(system.file("shapes/",
package="maptools"), "sids", verbose = FALSE)
proj4string(nc) <- CRS("+proj=longlat
+datum=NAD27")
plot(nc)
summary(nc)
> scriptCmd <- CMD_Rscript("demo.R")
> dockerfile_object <- dockerfile(
from = "~/Documents/2017_useR/demo.R",
cmd = scriptCmd)
# curl https://sysreqs.r-hub.io/pkg/
rgdal,sp,lattice/linux-x86_64-debian-gcc
# ["libgdal-dev", "libproj-dev", "gdal-bin"]
> print(dockerfile_object)
FROM rocker/r-ver:3.4.0
LABEL maintainer="daniel"
RUN export DEBIAN_FRONTEND=noninteractive; apt-get
-y update 
&& apt-get install -y gdal-bin 
libgdal-dev 
libproj-dev
RUN ["install2.r", "-r
'https://cloud.r-project.org'", "rgdal", "sp",
"lattice"]
WORKDIR /payload/
COPY [".", "."]
CMD ["R", "--vanilla", "-f",
"containerit_1a977e2dcdea.R"]
18
Running the container
> write(dockerfile_object)
INFO [2017-07-06 10:10:05] Writing dockerfile to
/home/daniel/Documents/2017_useR/Dockerfile
$ docker build -t user2017demo .
Sending build context to Docker daemon 6.054MB
Step 1/7 : FROM rocker/r-ver:3.4.1
3.4.1: Pulling from rocker/r-ver
c75480ad9aaf: Pull complete
[...]
The following additional packages will be installed:
[...]
* installing *source* package ‘foreign’ ...
[...]
Successfully built e30936ac8687
Successfully tagged user2017demo:latest
19
$ docker run -it user2017demo
R version 3.4.1 (2017-06-30) -- "Single Candle"
Copyright (C) 2017 The R Foundation for Statistical
Computing
Platform: x86_64-pc-linux-gnu (64-bit)
[..]
> library(rgdal); require(maptools)
Loading required package: sp
> nc <- rgdal::readOGR(system.file("shapes/",
package="maptools"), "sids", verbose = FALSE)
[...]
> summary(nc)
Object of class SpatialPolygonsDataFrame
Coordinates:
min max
x -84.32385 -75.45698
y 33.88199 36.58965
Is projected: FALSE
[...]
CLI
Based on docopt
https://github.com/
docopt/docopt.R
20
CLI
21
...
CLI
22
More
Labels for metadata
devtools session information (install from git under dev.)
Custom base images
Docker vs. R blog
http://bit.ly/docker-r
Boettiger, Carl. 2015. “An Introduction
to Docker for Reproducible Research,
with Examples from the R
Environment.” ACM SIGOPS
Operating Systems Review 49
(January): 71–79.
doi:10.1145/2723872.2723882
23
Summary
Executable Research Compendia are fun
Docker is great
containerit makes Docker easier
(DRY, less copy&paste, best practices, automatic system dependencies)
Benefits from Rocker (MRAN by default, …), harbor, ...
Alternatives / potential for combination:
package management locally (packrat, pkgsnap, switchr/GRANBase) or
remotely (MRAN timemachine/checkpoint), or install specific versions from
CRAN or source (requireGitHub, devtools)
24
BETA!
https://github.com/o2r-project/containerit/projects/1
Outlook
Support our ERC creation service
Get feedback
Singularity
OCI/acbuild
CRAN
Docker + R paper for RJournal?
Package rplumber/jug web apps
Versioned packages and system libs (sf::sf_extSoftVersion()) 25
26
giphy.com
Thanks!
What are your questions?
27
@o2r_project
github.com/o2r-project
o2r.info

Contenu connexe

Tendances

Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDPlcplcp1
 
Object Storage with Gluster
Object Storage with GlusterObject Storage with Gluster
Object Storage with GlusterGluster.org
 
Docker Internals - Twilio talk November 14th, 2013
Docker Internals - Twilio talk November 14th, 2013Docker Internals - Twilio talk November 14th, 2013
Docker Internals - Twilio talk November 14th, 2013Guillaume Charmes
 
はじめてのGlusterFS
はじめてのGlusterFSはじめてのGlusterFS
はじめてのGlusterFSTakahiro Inoue
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetesTed Jung
 
Cantainer CI/ CD with Kubernetes
Cantainer CI/ CD with KubernetesCantainer CI/ CD with Kubernetes
Cantainer CI/ CD with Kubernetesinwin stack
 
Why should i care about stateful containers?
Why should i care about stateful containers?Why should i care about stateful containers?
Why should i care about stateful containers?ClusterHQ
 
What Have Syscalls Done for you Lately?
What Have Syscalls Done for you Lately?What Have Syscalls Done for you Lately?
What Have Syscalls Done for you Lately?Docker, Inc.
 
CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015Brandon Philips
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesJian-Kai Wang
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersKento Aoyama
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)HungWei Chiu
 
Networking and Go: An Engineer's Journey (Strangeloop 2019)
Networking and Go: An Engineer's Journey (Strangeloop 2019)Networking and Go: An Engineer's Journey (Strangeloop 2019)
Networking and Go: An Engineer's Journey (Strangeloop 2019)Sneha Inguva
 
2013 0928 programming by cuda
2013 0928 programming by cuda2013 0928 programming by cuda
2013 0928 programming by cuda小明 王
 
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedis Labs
 
Docker DANS workshop
Docker DANS workshopDocker DANS workshop
Docker DANS workshopvty
 
Containers: What are they, Really?
Containers: What are they, Really?Containers: What are they, Really?
Containers: What are they, Really?Sneha Inguva
 

Tendances (20)

Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
 
Docker Overview
Docker OverviewDocker Overview
Docker Overview
 
Object Storage with Gluster
Object Storage with GlusterObject Storage with Gluster
Object Storage with Gluster
 
Docker Internals - Twilio talk November 14th, 2013
Docker Internals - Twilio talk November 14th, 2013Docker Internals - Twilio talk November 14th, 2013
Docker Internals - Twilio talk November 14th, 2013
 
はじめてのGlusterFS
はじめてのGlusterFSはじめてのGlusterFS
はじめてのGlusterFS
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
Cantainer CI/ CD with Kubernetes
Cantainer CI/ CD with KubernetesCantainer CI/ CD with Kubernetes
Cantainer CI/ CD with Kubernetes
 
Why should i care about stateful containers?
Why should i care about stateful containers?Why should i care about stateful containers?
Why should i care about stateful containers?
 
Namespace
NamespaceNamespace
Namespace
 
What Have Syscalls Done for you Lately?
What Have Syscalls Done for you Lately?What Have Syscalls Done for you Lately?
What Have Syscalls Done for you Lately?
 
CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and Services
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux Containers
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)
 
Networking and Go: An Engineer's Journey (Strangeloop 2019)
Networking and Go: An Engineer's Journey (Strangeloop 2019)Networking and Go: An Engineer's Journey (Strangeloop 2019)
Networking and Go: An Engineer's Journey (Strangeloop 2019)
 
2013 0928 programming by cuda
2013 0928 programming by cuda2013 0928 programming by cuda
2013 0928 programming by cuda
 
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
 
Docker internals
Docker internalsDocker internals
Docker internals
 
Docker DANS workshop
Docker DANS workshopDocker DANS workshop
Docker DANS workshop
 
Containers: What are they, Really?
Containers: What are they, Really?Containers: What are they, Really?
Containers: What are they, Really?
 

Similaire à containerit at useR!2017 conference, Brussels

RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)Daniel Nüst
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Zabbix
 
Puppet at Opera Sofware - PuppetCamp Oslo 2013
Puppet at Opera Sofware - PuppetCamp Oslo 2013Puppet at Opera Sofware - PuppetCamp Oslo 2013
Puppet at Opera Sofware - PuppetCamp Oslo 2013Cosimo Streppone
 
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Patrick Chanezon
 
BDM32: AdamCloud Project - Part II
BDM32: AdamCloud Project - Part IIBDM32: AdamCloud Project - Part II
BDM32: AdamCloud Project - Part IIDavid Lauzon
 
Dockerizing a Symfony2 application
Dockerizing a Symfony2 applicationDockerizing a Symfony2 application
Dockerizing a Symfony2 applicationRoman Rodomansky
 
Containers for sensor web services, applications and research @ Sensor Web Co...
Containers for sensor web services, applications and research @ Sensor Web Co...Containers for sensor web services, applications and research @ Sensor Web Co...
Containers for sensor web services, applications and research @ Sensor Web Co...Daniel Nüst
 
Scaling Docker Containers using Kubernetes and Azure Container Service
Scaling Docker Containers using Kubernetes and Azure Container ServiceScaling Docker Containers using Kubernetes and Azure Container Service
Scaling Docker Containers using Kubernetes and Azure Container ServiceBen Hall
 
Effective Data Pipelines with Docker & Jenkins - Brian Donaldson
Effective Data Pipelines with Docker & Jenkins - Brian DonaldsonEffective Data Pipelines with Docker & Jenkins - Brian Donaldson
Effective Data Pipelines with Docker & Jenkins - Brian DonaldsonDocker, Inc.
 
Docker @ FOSS4G 2016, Bonn
Docker @ FOSS4G 2016, BonnDocker @ FOSS4G 2016, Bonn
Docker @ FOSS4G 2016, BonnDaniel Nüst
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019kanedafromparis
 
Through the firewall with miniCRAN
Through the firewall with miniCRANThrough the firewall with miniCRAN
Through the firewall with miniCRANRevolution Analytics
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Ricardo Amaro
 
Delivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesDelivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesAjeet Singh Raina
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Codemotion
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Codemotion
 
Amazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to productionAmazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to productionPaolo latella
 
Environment for training models
Environment for training modelsEnvironment for training models
Environment for training modelsFlyElephant
 

Similaire à containerit at useR!2017 conference, Brussels (20)

RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)RR & Docker @ MuensteR Meetup (Sep 2017)
RR & Docker @ MuensteR Meetup (Sep 2017)
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
Puppet at Opera Sofware - PuppetCamp Oslo 2013
Puppet at Opera Sofware - PuppetCamp Oslo 2013Puppet at Opera Sofware - PuppetCamp Oslo 2013
Puppet at Opera Sofware - PuppetCamp Oslo 2013
 
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
 
BDM32: AdamCloud Project - Part II
BDM32: AdamCloud Project - Part IIBDM32: AdamCloud Project - Part II
BDM32: AdamCloud Project - Part II
 
Dockerizing a Symfony2 application
Dockerizing a Symfony2 applicationDockerizing a Symfony2 application
Dockerizing a Symfony2 application
 
Containers for sensor web services, applications and research @ Sensor Web Co...
Containers for sensor web services, applications and research @ Sensor Web Co...Containers for sensor web services, applications and research @ Sensor Web Co...
Containers for sensor web services, applications and research @ Sensor Web Co...
 
Docker.io
Docker.ioDocker.io
Docker.io
 
Scaling Docker Containers using Kubernetes and Azure Container Service
Scaling Docker Containers using Kubernetes and Azure Container ServiceScaling Docker Containers using Kubernetes and Azure Container Service
Scaling Docker Containers using Kubernetes and Azure Container Service
 
Effective Data Pipelines with Docker & Jenkins - Brian Donaldson
Effective Data Pipelines with Docker & Jenkins - Brian DonaldsonEffective Data Pipelines with Docker & Jenkins - Brian Donaldson
Effective Data Pipelines with Docker & Jenkins - Brian Donaldson
 
Docker @ FOSS4G 2016, Bonn
Docker @ FOSS4G 2016, BonnDocker @ FOSS4G 2016, Bonn
Docker @ FOSS4G 2016, Bonn
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
 
Discovering OpenBSD on AWS
Discovering OpenBSD on AWSDiscovering OpenBSD on AWS
Discovering OpenBSD on AWS
 
Through the firewall with miniCRAN
Through the firewall with miniCRANThrough the firewall with miniCRAN
Through the firewall with miniCRAN
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
 
Delivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesDelivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devices
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
 
Amazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to productionAmazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to production
 
Environment for training models
Environment for training modelsEnvironment for training models
Environment for training models
 

Plus de Daniel Nüst

Docker @ Data Science Meetup
Docker @ Data Science MeetupDocker @ Data Science Meetup
Docker @ Data Science MeetupDaniel Nüst
 
Frameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with RFrameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with RDaniel Nüst
 
Agile 2015 a-geo-label-for-the-sensor-web
Agile 2015 a-geo-label-for-the-sensor-webAgile 2015 a-geo-label-for-the-sensor-web
Agile 2015 a-geo-label-for-the-sensor-webDaniel Nüst
 
Visualising Interpolations of Mobile Sensor Observations
Visualising Interpolations of Mobile Sensor ObservationsVisualising Interpolations of Mobile Sensor Observations
Visualising Interpolations of Mobile Sensor ObservationsDaniel Nüst
 
WPS Application Patterns
WPS Application PatternsWPS Application Patterns
WPS Application PatternsDaniel Nüst
 
JavaScript Client Libraries for the (Former) Long Tail of OGC Standards
JavaScript Client Libraries for the (Former) Long Tail of OGC StandardsJavaScript Client Libraries for the (Former) Long Tail of OGC Standards
JavaScript Client Libraries for the (Former) Long Tail of OGC StandardsDaniel Nüst
 
Open Source and GitHub for Teaching with Software Development Projects
Open Source and GitHub for Teaching with Software Development ProjectsOpen Source and GitHub for Teaching with Software Development Projects
Open Source and GitHub for Teaching with Software Development ProjectsDaniel Nüst
 
5 Star Open Geoprocessing
5 Star Open Geoprocessing5 Star Open Geoprocessing
5 Star Open GeoprocessingDaniel Nüst
 
The 52°North Web Processing Service
The 52°North Web Processing ServiceThe 52°North Web Processing Service
The 52°North Web Processing ServiceDaniel Nüst
 
Linked data and rdf
Linked  data and rdfLinked  data and rdf
Linked data and rdfDaniel Nüst
 
OGC SOS for Your Data
OGC SOS for Your DataOGC SOS for Your Data
OGC SOS for Your DataDaniel Nüst
 
sos4R - Accessing SensorWeb Data from R
sos4R - Accessing SensorWeb Data from Rsos4R - Accessing SensorWeb Data from R
sos4R - Accessing SensorWeb Data from RDaniel Nüst
 
Connecting R to the Sensor Web
Connecting R to the Sensor WebConnecting R to the Sensor Web
Connecting R to the Sensor WebDaniel Nüst
 
sos4R - 52° North Innovation Price Presentation
sos4R - 52° North Innovation Price Presentationsos4R - 52° North Innovation Price Presentation
sos4R - 52° North Innovation Price PresentationDaniel Nüst
 
Visualizing the Availability of Temporally Structured Sensor Data
Visualizing the Availability of Temporally Structured Sensor DataVisualizing the Availability of Temporally Structured Sensor Data
Visualizing the Availability of Temporally Structured Sensor DataDaniel Nüst
 

Plus de Daniel Nüst (17)

Docker @ Data Science Meetup
Docker @ Data Science MeetupDocker @ Data Science Meetup
Docker @ Data Science Meetup
 
Atlas Zukünfte
Atlas ZukünfteAtlas Zukünfte
Atlas Zukünfte
 
Frameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with RFrameworks for geoprocessing on the web with R
Frameworks for geoprocessing on the web with R
 
Agile 2015 a-geo-label-for-the-sensor-web
Agile 2015 a-geo-label-for-the-sensor-webAgile 2015 a-geo-label-for-the-sensor-web
Agile 2015 a-geo-label-for-the-sensor-web
 
Visualising Interpolations of Mobile Sensor Observations
Visualising Interpolations of Mobile Sensor ObservationsVisualising Interpolations of Mobile Sensor Observations
Visualising Interpolations of Mobile Sensor Observations
 
WPS Application Patterns
WPS Application PatternsWPS Application Patterns
WPS Application Patterns
 
JavaScript Client Libraries for the (Former) Long Tail of OGC Standards
JavaScript Client Libraries for the (Former) Long Tail of OGC StandardsJavaScript Client Libraries for the (Former) Long Tail of OGC Standards
JavaScript Client Libraries for the (Former) Long Tail of OGC Standards
 
Open Source and GitHub for Teaching with Software Development Projects
Open Source and GitHub for Teaching with Software Development ProjectsOpen Source and GitHub for Teaching with Software Development Projects
Open Source and GitHub for Teaching with Software Development Projects
 
5 Star Open Geoprocessing
5 Star Open Geoprocessing5 Star Open Geoprocessing
5 Star Open Geoprocessing
 
The 52°North Web Processing Service
The 52°North Web Processing ServiceThe 52°North Web Processing Service
The 52°North Web Processing Service
 
Linked data and rdf
Linked  data and rdfLinked  data and rdf
Linked data and rdf
 
OGC SOS for Your Data
OGC SOS for Your DataOGC SOS for Your Data
OGC SOS for Your Data
 
sos4R - Accessing SensorWeb Data from R
sos4R - Accessing SensorWeb Data from Rsos4R - Accessing SensorWeb Data from R
sos4R - Accessing SensorWeb Data from R
 
Connecting R to the Sensor Web
Connecting R to the Sensor WebConnecting R to the Sensor Web
Connecting R to the Sensor Web
 
sos4R @ OGC TC
sos4R @ OGC TCsos4R @ OGC TC
sos4R @ OGC TC
 
sos4R - 52° North Innovation Price Presentation
sos4R - 52° North Innovation Price Presentationsos4R - 52° North Innovation Price Presentation
sos4R - 52° North Innovation Price Presentation
 
Visualizing the Availability of Temporally Structured Sensor Data
Visualizing the Availability of Temporally Structured Sensor DataVisualizing the Availability of Temporally Structured Sensor Data
Visualizing the Availability of Temporally Structured Sensor Data
 

Dernier

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Dernier (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

containerit at useR!2017 conference, Brussels

  • 1. Automatically Archiving Reproducible Studies with Docker Daniel Nüst | University of Münster | @nordholmen useR!2017 Brussels, Thu 12:12 pm, room 2.02 http://sched.co/AxqM
  • 3. 3 #1: reproducibility helps to avoid disaster #2: reproducibility makes it easier to write papers #3: reproducibility helps reviewers see it your way #4: reproducibility enables continuity of your work #5: reproducibility helps to build your reputation
  • 4. 4
  • 5. ERC creation process ❏ Submit workspace to publication platform ❏ Publication platform… ❏ extracts metadata ❏ executes analysis ❏ check output vs. upload (syntax) ❏ capture runtime environment (manifest + image) ❏ User checks metadata ❏ Publish ERC persistently 5
  • 6. 6 Slide by Docker inventor & Docker, Inc. CTO Solomon Hykes, DockerCon 2014
  • 9. https://docs.docker.com/engine/docker-overview/ 9 science data science research reproducibility replication package & separate applications and their dependencies for cloud infrastructures Iconsbysynonymsof(CC-BY)
  • 11. 11 Docker for Data Science (all the Docker advantages… write once, biz ops, cloud, etc.) Reproducibility Project separation + don’t clutter dev machine Environment (re)creation, documentation Adopt good practices on the way Easy collaboration Easy transition from testing to production
  • 12. Docker for RR lesson https://github.com/ nuest/ docker-reproducible -research https://nuest.github.io/ docker-reproducible -research 12
  • 13. 13 https://hub.docker.com/r/rocker/rstudio/ Base containers (r-base, r-devel, r-ver, ..) Use case containers (r-devel-ubsan-clang, ..) Stacks (tidyverse, geospatial, ..) docker run -it -p 8787:8787 rocker/rstudio http://localhost:8787/ (rstudio/rstudio) Rocker: https://github.com/rocker-org
  • 17. Packaging interactive session 17 > library(containerit); library("gstat"); library("sp") > data(meuse) > coordinates(meuse) = ~x+y > data(meuse.grid) > gridded(meuse.grid) = ~x+y > v <- variogram(log(zinc)~1, meuse) > m <- fit.variogram(v, vgm(1, "Sph", 300, 1)) > plot(v, model = m) > dockerfile_object <- dockerfile() INFO [2017-07-05 11:20:54] Trying to determine system requirements for the package(s) 'sp, gstat, zoo, futile.logger, xts, lambda.r, spacetime, futile.options, FNN, intervals, lattice' from sysreq online DB INFO [2017-07-05 11:21:03] Adding CRAN packages: sp, gstat, zoo, futile.logger, xts, lambda.r, spacetime, futile.options, FNN, intervals, lattice INFO [2017-07-05 11:21:03] Created Dockerfile-Object based on sessionInfo > print(dockerfile_object) FROM rocker/r-ver:3.4.1 LABEL maintainer="daniel" RUN ["install2.r", "-r 'https://cloud.r-project.org'", "sp", "gstat", "zoo", "futile.logger", "xts", "lambda.r", "spacetime", "futile.options", "FNN", "intervals", "lattice"] WORKDIR /payload/ CMD ["R"] > str(dockerfile_object, max.level = 2) Formal class 'Dockerfile' [..] with 4 slots ..@ image :Formal class 'From' [..] with 2 slots ..@ maintainer :Formal class 'Label' [..] with 2 slots ..@ instructions :List of 2 ..@ cmd :Formal class 'Cmd' [..] with 2 slots
  • 18. Packaging a script w/ sysreqs dependency resolving library(rgdal); require(maptools) nc <- rgdal::readOGR(system.file("shapes/", package="maptools"), "sids", verbose = FALSE) proj4string(nc) <- CRS("+proj=longlat +datum=NAD27") plot(nc) summary(nc) > scriptCmd <- CMD_Rscript("demo.R") > dockerfile_object <- dockerfile( from = "~/Documents/2017_useR/demo.R", cmd = scriptCmd) # curl https://sysreqs.r-hub.io/pkg/ rgdal,sp,lattice/linux-x86_64-debian-gcc # ["libgdal-dev", "libproj-dev", "gdal-bin"] > print(dockerfile_object) FROM rocker/r-ver:3.4.0 LABEL maintainer="daniel" RUN export DEBIAN_FRONTEND=noninteractive; apt-get -y update && apt-get install -y gdal-bin libgdal-dev libproj-dev RUN ["install2.r", "-r 'https://cloud.r-project.org'", "rgdal", "sp", "lattice"] WORKDIR /payload/ COPY [".", "."] CMD ["R", "--vanilla", "-f", "containerit_1a977e2dcdea.R"] 18
  • 19. Running the container > write(dockerfile_object) INFO [2017-07-06 10:10:05] Writing dockerfile to /home/daniel/Documents/2017_useR/Dockerfile $ docker build -t user2017demo . Sending build context to Docker daemon 6.054MB Step 1/7 : FROM rocker/r-ver:3.4.1 3.4.1: Pulling from rocker/r-ver c75480ad9aaf: Pull complete [...] The following additional packages will be installed: [...] * installing *source* package ‘foreign’ ... [...] Successfully built e30936ac8687 Successfully tagged user2017demo:latest 19 $ docker run -it user2017demo R version 3.4.1 (2017-06-30) -- "Single Candle" Copyright (C) 2017 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) [..] > library(rgdal); require(maptools) Loading required package: sp > nc <- rgdal::readOGR(system.file("shapes/", package="maptools"), "sids", verbose = FALSE) [...] > summary(nc) Object of class SpatialPolygonsDataFrame Coordinates: min max x -84.32385 -75.45698 y 33.88199 36.58965 Is projected: FALSE [...]
  • 23. More Labels for metadata devtools session information (install from git under dev.) Custom base images Docker vs. R blog http://bit.ly/docker-r Boettiger, Carl. 2015. “An Introduction to Docker for Reproducible Research, with Examples from the R Environment.” ACM SIGOPS Operating Systems Review 49 (January): 71–79. doi:10.1145/2723872.2723882 23
  • 24. Summary Executable Research Compendia are fun Docker is great containerit makes Docker easier (DRY, less copy&paste, best practices, automatic system dependencies) Benefits from Rocker (MRAN by default, …), harbor, ... Alternatives / potential for combination: package management locally (packrat, pkgsnap, switchr/GRANBase) or remotely (MRAN timemachine/checkpoint), or install specific versions from CRAN or source (requireGitHub, devtools) 24 BETA!
  • 25. https://github.com/o2r-project/containerit/projects/1 Outlook Support our ERC creation service Get feedback Singularity OCI/acbuild CRAN Docker + R paper for RJournal? Package rplumber/jug web apps Versioned packages and system libs (sf::sf_extSoftVersion()) 25
  • 27. Thanks! What are your questions? 27 @o2r_project github.com/o2r-project o2r.info