1. Microsoft Azure ♥ R
Data Science with Microsoft Azure and R
Dmitry Petukhov,
Microsoft Data Platform MVP, C# MCP,
Big Data Enthusiast && Coffee Addicted
2. Microsoft Azure + R. Prototype to Product Challenge
Prototyping
Flexibility
Distributed
Scalable
Fault-tolerance
Reliable
Production
Flexibility
Distributed
Scalable
Fault-tolerance
Reliable
+ Big Data Ready
+ LSML
Black Magic!
Migration
3. Microsoft Azure + R. Hello R!
Python is a COOL language!
But R…
Specialized in statistical analyze
Time-effective => ideal for…
…prototyping, competition, researching, and for fun!
Standalone computing => not bad scalable
Open source
Big bearded community
4. Storage
Resource
Management
ML Framework
Execution
Engine
Local OS
Local Disc
PythonRuntime
YetAnother
Runtime
scikit
learn
HDFS
YARN
MapReduce
Mahout
HDFS / S3
YARN /
Apache Mesos
Spark
MLlib
HDFS / S3
YARN /
Apache Mesos
Python / R
on Spark
Python/R
tools
Spark
Local PC Hybrid Model Cluster (on-premises/on-demand)
some
library
Machine Learning in Finance. Infrastructure for Data Scientist
Low HighCost of deployment/ownership
Distributed
FS
Dark
Magic…
ML as a Service
Python/R
tools
Microsoft Azure + R. Infrastructures for Data Scientists
5. Microsoft Azure + R. Microsoft ♥ R
R Server for Azure HDInsight
Data Science VM
Azure Machine Learning
Support R-scripts execution
Allow authoring custom R modules
Jupyter Notebooks with R kernel support
Azure HDInsight
Hadoop/Spark-cluster as a Service
SQL Server R Services
Power BI
Running R Scripts & excellent visualization
R Tools for Visual Studio
Microsoft
Azure
8. R Server for Azure HDInsight
Killer features list:
100% open source R implementation;
workload running inside HDInsight (Hadoop/Spark).
Microsoft Azure + R. R Server for Azure HDInsight
9. R, Python, SQL, C#
Microsoft Azure + R. Data Science VM
Microsoft R Server Developer Edition,
Anaconda Python distribution,
Jupyter notebooks for Python and R,
Visual Studio Community Edition with Python and R Tools,
Power BI desktop,
SQL Server Express edition
ML libs: CNTK, xgboost and Vowpal Wabbit
Azure SDK
Data Science VM inside:
10. R Tools in Azure Machine Learning:
Support R-scripts execution;
Allow authoring custom R modules;
Jupyter Notebooks with R kernel support.
Microsoft Azure + R. Azure Machine Learning
11. Microsoft Azure + R. Azure Machine Learning
Jupyter
Notebook
Azure ML
Studio
GitHub/
TFS in Azure
h(θ0, θn)
Commands flow
Data flow
Request/response flow
12. References
Cortana Intelligence and Machine Learning Blog
R for Azure Machine Learning. Quickstart
Machine Learning Algorithm Cheat Sheet
Machine Learning Hackathon. How to win?
Azure ML Repositories on GitHub
Microsoft Azure for all group on Facebook
Soon in Slack (invite form)
Microsoft Azure + R. References
14. Q&A
Now or later (send on d.petukhov@outlook.com)
Ping me
Habr: @codezombie
LinkedIn: @dpetukhov
Facebook: @code.zombi
Read my tech code instinct blog ( http://0xCode.in/ )
Microsoft Azure + R. Stay in Touch!
Notes de l'éditeur
Revolution Analytics
Revolution R Open и Revolution R Enterprise
Revolution R — это среда выполнения языка R (язык программирования для статистической обработки данных и работы с графикой), оптимизированная для многопоточных вычислений, а также, набор библиотек, для параллельной обработки в рамках концепции «больших данных».
R Server for Azure HDInsight is a 100% open source R implementation running the most comprehensive set of ML algorithms and statistical functions in the cloud that leverages Hadoop and Spark.
By making R Server available as a workload running inside HDInsight, we remove obstacles for users to unlock the power of R by eliminating memory and processing constraints and extending analytics from the laptop to large multi-node Hadoop and Spark clusters. This enables the ability to train and run ML models on larger datasets than previously possible to make more accurate predictions that affect the business. It also reduces the time to move ideas into production by eliminating the time-consuming installation or set up and procurement cycles for new hardware.