Taking R Mainstream in Production Systems

Presented at the NYC Open Data Meetup (http://www.meetup.com/nyhackr/events/223130503/)

Published in: Data & Analytics

  1. Taking R Mainstream in Production Systems. Misha Lisovich, misha@honestbuildings.com
  2. The Question. Q: Should I use R in production? A: Yes! (In a couple of years)
  3. The Process. 1. Productize: compelling data products; innovation pipeline. 2. Ruggedize: toolchain (RStudio, devtools, GitHub, Travis CI, Docker); strong testing; production-ready architecture. 3. Assimilate: command line tools; make it into HTTP APIs; make it into Docker containers.
  4. Step 1: Productize. Internal products: ad-hoc analyses, internal dashboards, automated reports, rapid prototyping. External products: end-user data products, backend services.
  5. (1) Dashboards: Business Intelligence, Internal Tools, Data & Job Monitoring
  6. (2) Automated Reports: .Rmd -> HTML
  7. (3) Rapid Prototyping
  8. (4) Backend Services: batch data processing (ETL), R APIs
  9. (5) End-user Products
  10. Step 2: Ruggedize. 1. Create reproducible architecture. 2. Set up strong testing & CI. 3. Separate production and dev. 4. Set up monitoring & reporting.
  11. Case Study: HB Architecture. RStudio; containerized architecture; continuous integration; multiple environments; notifications/monitoring.
  12. Data Architecture: Containers + Docker Compose = [architecture diagram]. Diagram labels: ETL, Shiny Server, Elastic, ETL data, SQL, S3, Web, rAPI, RStudio Server. Compose file shown on the slide:
        elasticsearch:
          image: elasticsearch
        shiny-server:
          image: shiny
          ports:
            - "443:443"
          links:
            - elasticsearch
        etl:
          image: etl
          volumes:
            - .:/data
        etl-data:
          image: etl-data
  13. Environments. Production: ETL, Shiny Server, Elastic, data volume, SQL, S3; www.dataproduct.com, internal-dashboards.com. Staging: ETL, Shiny Server, Elastic, data volume, SQL, S3; staging-www.dataproduct.com, staging-internal-dashboards.com.
  14. Continuous Integration: GitHub commit -> Travis CI -> Success! -> latest-stable tag; Staging and Production each pull latest-stable.
  15. Docker Registry / Rolling Back. Changes deployed to prod: save a versioned image (ETL + data volume) to the Docker Registry. Danger, need to roll back: load an older image from the Docker Registry.
  16. Step 3: Assimilate! (i.e., be kind to your devs)
  17. Assimilate (contd). HTTP APIs: OpenCPU, rapier. Docker containers: Rocker. Command line tools: Rscript, littler, docopt.
  18. Thank you! misha@honestbuildings.com
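
The promotion and rollback flow on slides 14 and 15 can be sketched as a toy in-memory model. This is only an illustration, written in Python for convenience: it does not call Docker, Travis CI, or GitHub, and the names Registry, Environment, publish, deploy_latest_stable, and rollback are hypothetical, not taken from the talk. It assumes that only builds passing CI are saved as versioned images and move the latest-stable tag, and that a rollback loads the image saved just before the one currently running.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Registry:
    """Toy stand-in for a Docker registry (hypothetical, in-memory only)."""
    versions: List[str] = field(default_factory=list)  # saved images, oldest first
    latest_stable: Optional[str] = None                # the "latest-stable" tag

    def publish(self, version: str, tests_passed: bool) -> bool:
        """Save a versioned image and move latest-stable, but only if CI passed."""
        if not tests_passed:
            return False  # failed build: nothing is saved, the tag does not move
        self.versions.append(version)
        self.latest_stable = version
        return True

    def previous(self, version: str) -> str:
        """Return the image saved immediately before `version`."""
        if version not in self.versions:
            raise ValueError(f"unknown version: {version!r}")
        idx = self.versions.index(version)
        if idx == 0:
            raise ValueError("no older image to roll back to")
        return self.versions[idx - 1]


@dataclass
class Environment:
    """A deployment target such as staging or production."""
    name: str
    running: Optional[str] = None

    def deploy_latest_stable(self, registry: Registry) -> None:
        """Pull whatever image the latest-stable tag currently points at."""
        if registry.latest_stable is None:
            raise RuntimeError("no stable image has been published yet")
        self.running = registry.latest_stable

    def rollback(self, registry: Registry) -> None:
        """Load the image saved just before the one currently running."""
        if self.running is None:
            raise RuntimeError(f"{self.name} is not running any image")
        self.running = registry.previous(self.running)


if __name__ == "__main__":
    registry = Registry()
    staging, production = Environment("staging"), Environment("production")

    registry.publish("v1", tests_passed=True)
    registry.publish("v2", tests_passed=True)
    registry.publish("v3", tests_passed=False)  # CI failed: tag stays on v2

    for env in (staging, production):
        env.deploy_latest_stable(registry)
    production.rollback(registry)  # "Danger! Need to roll back!"

    print(staging.name, staging.running)
    print(production.name, production.running)
```

Keeping every passing build as a versioned image is what makes the rollback a lookup rather than a rebuild: the older image already exists, so production can switch to it without waiting on CI.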
