Apache Deep Learning 201 - Barcelona DWS March 2019
The art of using Apache NiFi with Apache Tika, Apache OpenNLP, Apache Spark, Apache MXNet, Apache NiFi MiNiFi, Apache NiFi Registry, Apache Livy, Apache HBase, Apache Phoenix, Apache Hive and Apache YARN for deep learning workloads, including Submarine.
1. @PaaSDev
Apache Deep Learning 201 v1.00
(For Data Engineers)
Timothy Spann
https://github.com/tspannhw/ApacheDeepLearning201/
2. @PaaSDev
Disclaimer
• This is my personal integration and use of Apache software, not any company's vision.
• This document may contain product features and technology directions that are under
development, may be under development in the future, or may ultimately not be
developed. These are Tim's ideas only.
• Technical feasibility, market demand, user feedback, and the Apache Software
Foundation community development process can all affect timing and final delivery.
• This document's description of these features and technology directions does not
represent a contractual commitment, promise or obligation from Hortonworks to
deliver these features in any generally available product.
• Product features and technology directions are subject to change, and must not be
included in contracts, purchase orders, or sales agreements of any kind.
• Since this document contains an outline of general product development plans,
customers should not rely upon it when making a purchase decision.
3. @PaaSDev
There are some who call him...
DZone Zone Leader and Big Data MVB;
Princeton Future of Data Meetup
https://github.com/tspannhw
https://community.hortonworks.com/users/9304/tspann.html
https://dzone.com/users/297029/bunkertor.html
https://www.meetup.com/futureofdata-princeton/
8. @PaaSDev
Deep Learning for Big Data Engineers
Multiple users, frameworks, languages, devices, data sources & clusters
BIG DATA ENGINEER
• Experience in ETL
• Coding skills in Scala, Python, Java
• Experience with Apache Hadoop
• Knowledge of database query languages such as SQL
• Knowledge of Hadoop tools such as Hive or Pig
• Expert in ETL (Eating, Ties and Laziness)
• Social Media Maven
• Deep SME in Buzzwords
• No Coding Skills
• Interest in Pig and Falcon
CAT AI
• Will Drive your Car
• Will Fix Your Code
• Will Beat You At Q-Bert
• Will Not Be Discussed Today
• Will Not Finish This Talk For Me, This Time
http://gluon.mxnet.io/chapter01_crashcourse/preface.html
11. @PaaSDev
Why Apache NiFi?
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Supports push and pull models
• Hundreds of processors
• Visual command and control
• Over 200 sources
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
• Version Control
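The backpressure item in the list above can be illustrated outside NiFi with a bounded queue. This is only a toy sketch in Python, not how NiFi implements it: the threshold of 3 and the names `connection` and `offer` are made up for illustration. When the queue is full, the producer is refused until the consumer drains something.

```python
import queue

# Hypothetical threshold. NiFi connections have configurable backpressure
# limits; 3 is just a small number for this demo.
BACKPRESSURE_THRESHOLD = 3
connection = queue.Queue(maxsize=BACKPRESSURE_THRESHOLD)


def offer(item):
    """Try to enqueue an item; return False when backpressure is applied."""
    try:
        connection.put_nowait(item)
        return True
    except queue.Full:
        return False


# The producer outruns the consumer: only the first three items fit.
accepted = [offer(n) for n in range(5)]

# The consumer takes one item, which frees a slot for the producer.
drained = connection.get_nowait()
accepted_after_drain = offer(99)
```

The producer sees a simple yes/no answer, so it can decide whether to wait, retry, or drop, which is the same choice the "Loss tolerance" bullet is about.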
12. @PaaSDev
Aggregate all the Data!
Sensors, drones, logs, geo-location devices,
photos, images, and results from running
predictions on pre-trained models.
Collect: Bring Together
13. @PaaSDev
Mediate point-to-point and
Bidirectional data flows
Delivering data reliably to and from
Apache HBase, Druid, Apache Phoenix, Apache
Hive, Impala, Kudu, HDFS, Slack and Email.
Conduct: Mediate the Data Flow
15. @PaaSDev
• Cloud ready
• Python, C++, Scala, R, Julia, Matlab, MXNet.js and Perl Support
• Experienced team (XGBoost)
• AWS, Microsoft, NVIDIA, Baidu, Intel
• Apache Incubator Project
• Run distributed on YARN and Spark
• In my early tests, faster than TensorFlow. (Try this yourself)
• Runs on Raspberry Pi, NVIDIA Jetson TX1 and other constrained devices
https://mxnet.incubator.apache.org/how_to/cloud.html
https://github.com/apache/incubator-mxnet/tree/1.3.1/example
https://gluon-cv.mxnet.io/api/model_zoo.html
16. @PaaSDev
• Great documentation
• Crash Course
• Gluon (Open API), GluonCV, GluonNLP
• Keras (One API Many Runtime Options)
• Great Python Interaction. Java and Scala APIs!
• Open Source Model Server Available
• ONNX (Open Neural Network Exchange Format) Support for AI Models
• Now in Version 1.4.0!
• Rich Model Zoo!
• Math Kernel Library and NVIDIA CUDA Optimizations
• TensorBoard compatible
http://mxnet.incubator.apache.org/
http://gluon.mxnet.io/
https://onnx.ai/
https://gluon-nlp.mxnet.io/
pip3.6 install -U keras-mxnet
pip3.6 install --upgrade mxnet
pip3.6 install gluonnlp
pip3.6 install gluoncv
pip3.6 install "mxnet-mkl>=1.3.0" --upgrade
20. @PaaSDev
Object Detection: Faster RCNN with GluonCV
net = gcv.model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)
Faster RCNN model trained on Pascal VOC dataset with
ResNet-50 backbone
https://gluon-cv.mxnet.io/api/model_zoo.html
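A rough sketch of how the one-liner above might be used end to end. The `detect` function follows the pattern of the GluonCV Faster RCNN demo and needs mxnet and gluoncv installed; `keep_confident` is a hypothetical helper (not part of GluonCV) that turns the raw outputs into plain dictionaries. The 0.5 threshold is an arbitrary assumption.

```python
def keep_confident(class_ids, scores, boxes, class_names, threshold=0.5):
    """Pair up detector outputs and keep those at or above the threshold."""
    results = []
    for cid, score, box in zip(class_ids, scores, boxes):
        # Unused output slots are typically padded with -1, so skip them.
        if cid < 0 or score < threshold:
            continue
        results.append({
            "label": class_names[int(cid)],
            "score": float(score),
            "box": [float(v) for v in box],
        })
    return results


def detect(image_path, threshold=0.5):
    """Run the pretrained Faster RCNN on one image (needs mxnet + gluoncv)."""
    # Imported lazily because these are heavy, optional dependencies.
    from gluoncv import model_zoo, data

    net = model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)
    x, _ = data.transforms.presets.rcnn.load_test(image_path)
    ids, scores, boxes = net(x)
    return keep_confident(
        ids[0].asnumpy().ravel().tolist(),
        scores[0].asnumpy().ravel().tolist(),
        boxes[0].asnumpy().tolist(),
        net.classes,
        threshold,
    )
```

Returning plain dictionaries rather than NDArrays makes the result easy to serialize as JSON for a downstream flow.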
21. @PaaSDev
Instance Segmentation: Mask RCNN with GluonCV
net = model_zoo.get_model('mask_rcnn_resnet50_v1b_coco', pretrained=True)
Mask RCNN model trained on COCO dataset with ResNet-50 backbone
https://gluon-cv.mxnet.io/build/examples_instance/demo_mask_rcnn.html
https://arxiv.org/abs/1703.06870
https://github.com/matterport/Mask_RCNN
22. @PaaSDev
Semantic Segmentation: DeepLabV3 with GluonCV
model = gluoncv.model_zoo.get_model('deeplab_resnet101_ade', pretrained=True)
GluonCV DeepLabV3 model on ADE20K dataset
https://gluon-cv.mxnet.io/build/examples_segmentation/demo_deeplab.html
run1.sh demo_deeplab_webcam.py
http://groups.csail.mit.edu/vision/datasets/ADE20K/
https://arxiv.org/abs/1706.05587
https://www.cityscapes-dataset.com/
This one is a bit slower.
23. @PaaSDev
Semantic Segmentation: Fully Convolutional Networks
model = gluoncv.model_zoo.get_model('fcn_resnet101_voc', pretrained=True)
GluonCV FCN model on PASCAL VOC dataset
https://gluon-cv.mxnet.io/build/examples_segmentation/demo_fcn.html
run1.sh demo_fcn_webcam.py
https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf
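For readers new to semantic segmentation: models like the two above produce a score per class for every pixel, and the predicted label for a pixel is the class with the highest score. The sketch below shows that step on plain nested lists so it runs without MXNet; `label_mask` and `class_pixel_counts` are hypothetical helper names, and a real pipeline would do the same thing with an array argmax.

```python
def label_mask(scores):
    """Turn scores[c][y][x] into mask[y][x], the index of the best class."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def class_pixel_counts(mask):
    """Count how many pixels were assigned to each class index."""
    counts = {}
    for row in mask:
        for label in row:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Two classes on a 2x2 image (made-up numbers).
example_scores = [
    [[0.9, 0.1], [0.2, 0.3]],   # class 0
    [[0.1, 0.8], [0.7, 0.6]],   # class 1
]
example_mask = label_mask(example_scores)
example_counts = class_pixel_counts(example_mask)
```

Per-class pixel counts are a compact summary that is cheap to ship from a device, compared with sending the full mask.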
25. @PaaSDev
Apache MXNet Model Server from Apache NiFi
https://community.hortonworks.com/articles/223916/posting-images-with-apache-nifi-17-and-a-custom-pr.html
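For testing a model server outside NiFi, the same HTTP POST can be sketched in Python. The URL is an assumption: it uses the `/predictions/<model>` path on port 8080 with a model named `squeezenet`, so adjust it to match your server version and deployed model. `build_predict_request` is a hypothetical helper, and the example only builds the request without sending it.

```python
import urllib.request


def build_predict_request(image_bytes, model_name="squeezenet",
                          base_url="http://127.0.0.1:8080"):
    """Build (but do not send) a POST carrying raw image bytes."""
    # Assumed endpoint layout; check your model server's documentation.
    url = f"{base_url}/predictions/{model_name}"
    return urllib.request.Request(
        url,
        data=image_bytes,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


# Placeholder bytes stand in for a real JPEG file's contents.
req = build_predict_request(b"\xff\xd8placeholder-image-bytes")

# To actually send it (requires a running model server):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Inside NiFi the equivalent step is an HTTP-posting processor pointed at the same URL, with the image as the flow file content.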
26. @PaaSDev
Apache MXNet Native Processor for Apache NiFi
This is a beta, community release by me using the new beta Java API for Apache MXNet.
https://github.com/tspannhw/nifi-mxnetinference-processor
https://community.hortonworks.com/articles/229215/apache-nifi-processor-for-apache-mxnet-ssd-single.html
https://www.youtube.com/watch?v=Q4dSGPvq
27. @PaaSDev
Edge Intelligence with Apache NiFi Subproject - MiNiFi
Key Features
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Recovery / recording a rolling log of fine-grained history
• Designed for extension
• Java or C++ Agent
Different from Apache NiFi
• Design and Deploy
• Warm re-deploys
29. @PaaSDev
Multiple IoT Devices with Apache NiFi and Apache MXNet
https://community.hortonworks.com/articles/203638/ingesting-multiple-iot-devices-with-apache-nifi-17.html
30. @PaaSDev
Using Apache MXNet on The Edge with Sensors and Intel Movidius
(MiNiFi)
https://community.hortonworks.com/articles/176932/apache-deep-learning-101-using-apache-mxnet-on-the.html
https://community.hortonworks.com/articles/146704/edge-analytics-with-nvidia-jetson-tx1-running-apac.html
31. @PaaSDev
Using Apache MXNet on The Edge with Sensors and Google Coral (MiNiFi)
https://www.datainmotion.dev/2019/03/using-raspberry-pi-3b-with-apache-nifi.html
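One way to hand edge results to MiNiFi is to have the device script emit one JSON record per capture, combining sensor readings with the model's top prediction. The sketch below assumes that pattern; every field name (`uuid`, `ts`, `host`, `label`, `confidence`, and the sensor keys) and the helper `build_record` are hypothetical, so rename them to suit your flow.

```python
import json
import time
import uuid


def build_record(sensors, label, confidence, host="edge-device"):
    """Serialize one capture as a single JSON line."""
    return json.dumps({
        "uuid": str(uuid.uuid4()),      # unique id for tracing the record
        "ts": int(time.time()),         # capture time, epoch seconds
        "host": host,
        "label": label,
        "confidence": round(float(confidence), 4),
        **sensors,                      # flatten sensor readings into the record
    })


# Made-up readings and prediction for illustration.
line = build_record(
    {"temperature_c": 21.5, "humidity_pct": 40.0}, "cat", 0.87654
)
```

A flat, one-line record is easy to append to a file for an agent to pick up, and easy to query later once it lands in a table.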