SlideShare une entreprise Scribd logo
1  sur  40
Télécharger pour lire hors ligne
TensorFlow-on-Hops: Hotdog or Not with Toppings  
jim_dowling
Register an account at:
http://hops.io/tf
©2018 Logical Clocks AB. All Rights Reserved
World’s Fastest/Biggest Hadoop*
16x
Throughput
Faster – 1.2M ops/sBigger
*https://www.usenix.org/conference/fast17/technical-sessions/presentation/niazi
37x
Number of files
Scale Challenge Winner (2017)
3
GPUs in
YARN
3/45
©2018 Logical Clocks AB. All Rights Reserved
Hops Data Platform
4/45
©2018 Logical Clocks AB. All Rights Reserved
Why not KubeFlow?
5
KubeFlow
©2018 Logical Clocks AB. All Rights Reserved
Hops – Projects and GPUs
6/70
DataLake
GPUs Compute
Kafka
Data EngineeringData Science
Project1 ProjectN
Elasticsearch
1.Security by Design:
Projects as a Sandbox for
self-service and teams
2.Ease of use: Data Scientists
need only code Python
3.Scale-out Deep Learning:
Parallel experiments,
Distributed Training
6
Python in the Cluster: Per-Project Conda Envs
Python libraries are usable by Spark/Tensorflow
7/45
Hops: TLS Certs for Security (not Kerberos)
•User Certificates:
- Per-project users
•Service Certificates:
- NameNode, ResourceManager, Kafka, HiveServer2, Livy, etc
•Application Certificates
•Supports Certification Revocation, Renewal, Reloading.
8
©2018 Logical Clocks AB. All Rights Reserved
GPU Resource Requests in Hops YARN
HopsYARNHopsYARN
10 GPUs on 1 host
100 GPUs on 10 hosts with ‘Infiniband’
Hops supports a Hetrogenous Mix of GPUs
4 GPUs on any host
9
©2018 Logical Clocks AB. All Rights Reserved
Data Parallelism on Hops/TensorFlow
10
Copy of the Model
on each GPU
©2018 Logical Clocks AB. All Rights Reserved
Parallel Experiments on Hops
11/45
Executor 1 Executor N
Spark Driver
Distributed FSTensorBoard
• Big Training/test datasets
• Model checkpointing
• Evaluation
• Distribute Training, Parallel Experiments
• Model Serving
Model Serving
ML in Production: Machine Learning Pipelines
©2018 Logical Clocks AB. All Rights Reserved
A Machine Learning Pipeline
13/45
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
Test
©2018 Logical Clocks AB. All Rights Reserved
Hops Small Data ML Pipeline
14/45
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
Test
Project Teams (Data Engineers/Scientists)
TfServingTensorFlow
HopsFS
Kafka
HopsYARN Kubernetes
©2018 Logical Clocks AB. All Rights Reserved
PySpark
Hops Big Data ML Pipeline
15/45
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
Test
TfServingTensorFlow
Project Teams (Data Engineers/Scientists)
HopsFS
Kafka
HopsYARN Kubernetes
Google Facets Overview
•Visualize data
distributions
•Min/max/mean/media
values for features
•Missing values in
columns
•Facets Overview
expects test/train
datasets as input
Feature Extraction
Experimentation
Training
Test + Serve
Data Ingestion
Clean/Transform
Data
Google Facets Dive
• Visualize the relationship between the data points across the
different features of a dataset.
Feature Extraction
Experimentation
Training
Test + Serve
Data Ingestion
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
18
features = ["Age", "Occupation", "Sex", …, "Country"]
h = hdfs.get_fs()
with h.open_file(hdfs.project_path() +
"/TestJob/data/census/adult.data", "r") as trainFile:
train_data =pd.read_csv(trainFile, names=features,
sep=r's*,s*', engine='python', na_values="?")
test_data = …
facets.overview(train_data, test_data)
facets.dive(test_data.to_json(orient='records'))
Data Ingestion and Google Facets
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
19/45
Small Data Preparation with tf.data API
def input_fn(batch_size):
files = tf.data.Dataset.list_files(IMAGES_DIR)
def tfrecord_dataset(filename):
return tf.data.TFRecordDataset(filename,
num_parallel_reads=32, buffer_size=8*1024*1024)
dataset = files.apply(tf.data.parallel_interleave
(tfrecord_dataset, cycle_length=32, sloppy=True)
dataset = dataset.apply(tf.data.map_and_batch(parser_fn, batch_size,
num_parallel_batches=4))
dataset = dataset.prefetch(4)
return dataset
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
Big Data Preparation with PySpark
images = spark.readImages(IMAGE_PATH, recursive = True,
numPartitions=10, sampleRatio = 0.1).cache()
tr = (ImageTransformer().setOutputCol(“transformed”)
.resize(height = 200, width = 200)
.crop(0, 0, height = 180, width = 180) )
smallImages = tr.transform(images).select(“transformed”)
# Output .tfrecords using TensorFlowOnSpark utility
dfutil.saveAsTFRecords(smallImages, OUTPUT_DIR)
20/45
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
Parallel Experiments
21/45
Time
The Outer Loop (hyperparameters):
“I have to run a hundred experiments to
find the best model,” he complained, as
he showed me his Jupyter notebooks.
“That takes time. Every experiment takes
a lot of programming, because there are
so many different parameters.
[Rants of a Data Scientist]
Hops
©2018 Logical Clocks AB. All Rights Reserved
Hyperparam Opt. with Tf/Spark on Hops
def train(learning_rate, dropout):
[TensorFlow Code here]
args_dict = {'learning_rate': [0.001, 0.005, 0.01],
'dropout': [0.5, 0.6]}
experiment.launch(spark, train, args_dict)
Launch 6 Spark Executors
22/45
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
HyperParam Opt. Visualization on TensorBoard
23/45
Hyperparam Opt Results Visualization
©2018 Logical Clocks AB. All Rights Reserved
Model Architecture Search on TensorFlow/Hops
def train_cifar10(learning_rate, dropout):
[TensorFlow Code here]
dict =
{'learning_rate': [0.005, 0.00005], 'dropout': [0.01, 0.99],
'num_layers': [1,3]}
experiment.evolutionary_search(spark, train_cifar10, dict,
direction='max',
popsize=10, generations=3, crossover=0.7, mutation=0.5)
©2018 Logical Clocks AB. All Rights Reserved
Differential Evolution in Tensorboard (1/4)
25
CIFAR-10
©2018 Logical Clocks AB. All Rights Reserved
Differential Evolution in Tensorboard (2/4)
26
CIFAR-10
©2018 Logical Clocks AB. All Rights Reserved
Differential Evolution in Tensorboard (3/4)
27
CIFAR-10
©2018 Logical Clocks AB. All Rights Reserved
Differential Evolution in Tensorboard (4/4)
28
CIFAR-10
©2018 Logical Clocks AB. All Rights Reserved
Distributed Training
29/45
WeeksWeeks
Time
The Inner Loop (training):
“ All these experiments took a lot of
computation — we used hundreds of
GPUs/TPUs for days. Much like a single
modern computer can outperform
thousands of decades-old machines, we
hope that in the future these experiments
will become household.”
[Google SoTA
ImageNet, Cifar-10, March18]
MinsMins
Hops
©2018 Logical Clocks AB. All Rights Reserved
Distributed Training: Theory and Practice
30 30/45
Image from @hardmaru on Twitter.
©2018 Logical Clocks AB. All Rights Reserved
Ring-AllReduce vs Parameter Server
GPU 0GPU 0
GPU 1GPU 1
GPU 2GPU 2
GPU 3GPU 3
send
send
send
send
recv
recv
recv
recv GPU 1GPU 1 GPU 2GPU 2 GPU 3GPU 3 GPU 4GPU 4
Param Server(s)Param Server(s)
Network Bandwidth is the Bottleneck for Distributed Training
31/45
©2018 Logical Clocks AB. All Rights Reserved
Horovod - AllReduce Inception V4 Performance
32/45
Inception V4, 20 Nodes on 2 GPU Servers, 40 Gb/s
% GPU
Utilization
% Network
Bandwidth
©2018 Logical Clocks AB. All Rights Reserved
Parameter Server – Inception V4 Performance
33
Horovod, Inception V4, 10 workers, 2 PSes, 40 Gb/s, Nvidia 1080Ti
% GPU
Utilization
% Network
Bandwidth
©2018 Logical Clocks AB. All Rights Reserved
Distributed Training with Horovod on Hops
hvd.init()
opt = hvd.DistributedOptimizer(opt)
if hvd.local_rank()==0:
[TensorFlow Code here]
…..
else:
[TensorFlow Code here]
…..
34/45
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
Hops API
•Python (also Java/Scala)
- Manage tensorboard, Load/save models in HDFS
- Horovod, TensorFlowOnSpark
- Parallel experiments
• Gridsearch
• Model Architecture Search with Genetic Algorithms
- Secure Streaming Analytics with Kafka/Spark/Flink
• SSL/TLS certs, Avro Schema, Endpoints for Kafka/Zookeeper/etc
35/45
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
©2018 Logical Clocks AB. All Rights Reserved
TensorFlow Model Serving
36/45
Feature Extraction
Experimentation
Training
Test + Serve
Data Acquisition
Clean/Transform
Data
Training-Serving Skew
•Monitor differences between performance during training
and performance during serving.
- Differences in how you process data in training vs serving.
- Differences in the training data and live data for serving.
- A feedback loop between your model and your algorithm.
•When to retrain?
- If you look at the input data and use covariant shift to see when
it deviates significantly from the data that was used to train the
model on.
37
http://hopshadoop.com:8080/hopsworks
0. Register an account
1. Create a ’tensorflow_tour’
serving/train_and_export_model.ipynb
Try out other notebooks – tensorflow/cnn/grid_search,
•2. Create a new project called ’yourname_hotdog’
• a) enable Python 3.6 for the project
b) search for the dataset ’hotdog’ and import it into ’yourname_hotdog’
c) download http://hopshadoop.com:8080/hotdog.ipynb to your laptop.
d) upload hotdog.ipynb into the Jupyter dataset in `yourname_hotdog`
e) install the conda dependencies: matplotlib, pillow, numpy
f) Start Jupyter and run hotdog.ipynb
[Credit Magnus Pedersson : https://www.youtube.com/watch?v=oxrcZ9uUblI ]
38
Summary
•Hops is a new Data Platform with first-class support for
Python / Deep Learning / ML / Data Governance / GPUs
•You can do fun stuff on Hops, like ”Hotdog or not”, as well as
serious stuff.
©2018 Logical Clocks AB. All Rights Reserved
The Team
Jim Dowling, Seif Haridi, Gautier Berthou, Salman Niazi, Mahmoud
Ismail, Theofilos Kakantousis, Ermias Gebremeskel, Antonios
Kouzoupis, Alex Ormenisan, Fabio Buso, Robin Andersson, August
Bonds.
Active:
Alumni:
Vasileios Giannokostas, Johan Svedlund Nordström,Rizvi Hasan, Paul Mälzer, Bram
Leenders, Juan Roca, Misganu Dessalegn, K “Sri” Srijeyanthan, Jude D’Souza, Alberto
Lorente, Andre Moré, Ali Gholami, Davis Jaunzems, Stig Viaene, Hooman Peiro,
Evangelos Savvidis, Steffen Grohsschmiedt, Qi Qi, Gayana Chandrasekara, Nikolaos
Stanogias, Daniel Bali, Ioannis Kerkinos, Peter Buechler, Pushparaj Motamari, Hamid
Afzali, Wasif Malik, Lalith Suresh, Mariano Valles, Ying Lieu, Fanti Machmount Al
Samisti, Braulio Grana, Adam Alpire, Zahin Azher Rashid, ArunaKumari Yedurupaka,
Tobias Johansson , Roberto Bampi, Roshan Sedar.
www.hops.io
@hopshadoop

Contenu connexe

Tendances

Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Kim Hammar
 
How Machine Learning and AI Can Support the Fight Against COVID-19
How Machine Learning and AI Can Support the Fight Against COVID-19How Machine Learning and AI Can Support the Fight Against COVID-19
How Machine Learning and AI Can Support the Fight Against COVID-19Databricks
 
StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkSri Ambati
 
GPU Accelerated Machine Learning
GPU Accelerated Machine LearningGPU Accelerated Machine Learning
GPU Accelerated Machine LearningSri Ambati
 
How to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsHow to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsDatabricks
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Christian Peel
 
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...Srivatsan Ramanujam
 
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...Spark Summit
 
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDistributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDatabricks
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineJan Wiegelmann
 
Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!DataWorks Summit
 
Introduction to machine learning with GPUs
Introduction to machine learning with GPUsIntroduction to machine learning with GPUs
Introduction to machine learning with GPUsCarol McDonald
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningDatabricks
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinPietro Michiardi
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Yu Liu
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyJim Dowling
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14Sri Ambati
 
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSAccelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSDatabricks
 

Tendances (20)

Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
 
How Machine Learning and AI Can Support the Fight Against COVID-19
How Machine Learning and AI Can Support the Fight Against COVID-19How Machine Learning and AI Can Support the Fight Against COVID-19
How Machine Learning and AI Can Support the Fight Against COVID-19
 
StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling framework
 
GPU Accelerated Machine Learning
GPU Accelerated Machine LearningGPU Accelerated Machine Learning
GPU Accelerated Machine Learning
 
How to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsHow to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML models
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015
 
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
 
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in ...
 
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDistributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / Pipeline
 
Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!
 
Introduction to machine learning with GPUs
Introduction to machine learning with GPUsIntroduction to machine learning with GPUs
Introduction to machine learning with GPUs
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine Learning
 
AI Development with H2O.ai
AI Development with H2O.aiAI Development with H2O.ai
AI Development with H2O.ai
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14
 
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSAccelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
 

Similaire à Berlin buzzwords 2018 TensorFlow on Hops

Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraJim Dowling
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesJim Dowling
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...Big Data Value Association
 
Hopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, SunnyvaleHopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, SunnyvaleJim Dowling
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Jason Dai
 
All AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIAll AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIJim Dowling
 
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Animesh Singh
 
Odsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsOdsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsJim Dowling
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of MetadataJim Dowling
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataRahul Jain
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...gdgsurrey
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingJim Dowling
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120Hyoungjun Kim
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Chris Fregly
 
“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the CoreC4Media
 

Similaire à Berlin buzzwords 2018 TensorFlow on Hops (20)

Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
 
Spark ML Pipeline serving
Spark ML Pipeline servingSpark ML Pipeline serving
Spark ML Pipeline serving
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
 
Hopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, SunnyvaleHopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, Sunnyvale
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
 
All AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIAll AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AI
 
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
 
Odsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsOdsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on Hops
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big Data
 
Apache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real TimeApache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real Time
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowling
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
 
“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core
 

Plus de Jim Dowling

ARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfJim Dowling
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfJim Dowling
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleJim Dowling
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfJim Dowling
 
_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdfJim Dowling
 
Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Jim Dowling
 
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022Jim Dowling
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupJim Dowling
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Jim Dowling
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Jim Dowling
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigmJim Dowling
 
GANs for Anti Money Laundering
GANs for Anti Money LaunderingGANs for Anti Money Laundering
GANs for Anti Money LaunderingJim Dowling
 
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityInvited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityJim Dowling
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020Jim Dowling
 
Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019 Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019 Jim Dowling
 
The Feature Store in Hopsworks
The Feature Store in HopsworksThe Feature Store in Hopsworks
The Feature Store in HopsworksJim Dowling
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceJim Dowling
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsJim Dowling
 

Plus de Jim Dowling (18)

ARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdf
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdf
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData Seattle
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
 
_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf
 
Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning
 
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science Meetup
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigm
 
GANs for Anti Money Laundering
GANs for Anti Money LaunderingGANs for Anti Money Laundering
GANs for Anti Money Laundering
 
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityInvited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020
 
Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019 Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019
 
The Feature Store in Hopsworks
The Feature Store in HopsworksThe Feature Store in Hopsworks
The Feature Store in Hopsworks
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in Finance
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
 

Dernier

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 

Dernier (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

Berlin buzzwords 2018 TensorFlow on Hops

  • 1. TensorFlow-on-Hops: Hotdog or Not with Toppings   jim_dowling
  • 2. Register an account at: http://hops.io/tf
  • 3. ©2018 Logical Clocks AB. All Rights Reserved World’s Fastest/Biggest Hadoop* 16x Throughput Faster – 1.2M ops/sBigger *https://www.usenix.org/conference/fast17/technical-sessions/presentation/niazi 37x Number of files Scale Challenge Winner (2017) 3 GPUs in YARN 3/45
  • 4. ©2018 Logical Clocks AB. All Rights Reserved Hops Data Platform 4/45
  • 5. ©2018 Logical Clocks AB. All Rights Reserved Why not KubeFlow? 5 KubeFlow
  • 6. ©2018 Logical Clocks AB. All Rights Reserved Hops – Projects and GPUs 6/70 DataLake GPUs Compute Kafka Data EngineeringData Science Project1 ProjectN Elasticsearch 1.Security by Design: Projects as a Sandbox for self-service and teams 2.Ease of use: Data Scientists need only code Python 3.Scale-out Deep Learning: Parallel experiments, Distributed Training 6
  • 7. Python in the Cluster: Per-Project Conda Envs Python libraries are usable by Spark/Tensorflow 7/45
  • 8. Hops: TLS Certs for Security (not Kerberos) •User Certificates: - Per-project users •Service Certificates: - NameNode, ResourceManager, Kafka, HiveServer2, Livy, etc •Application Certificates •Supports Certification Revocation, Renewal, Reloading. 8
  • 9. ©2018 Logical Clocks AB. All Rights Reserved GPU Resource Requests in Hops YARN HopsYARNHopsYARN 10 GPUs on 1 host 100 GPUs on 10 hosts with ‘Infiniband’ Hops supports a Hetrogenous Mix of GPUs 4 GPUs on any host 9
  • 10. ©2018 Logical Clocks AB. All Rights Reserved Data Parallelism on Hops/TensorFlow 10 Copy of the Model on each GPU
  • 11. ©2018 Logical Clocks AB. All Rights Reserved Parallel Experiments on Hops 11/45 Executor 1 Executor N Spark Driver Distributed FSTensorBoard • Big Training/test datasets • Model checkpointing • Evaluation • Distribute Training, Parallel Experiments • Model Serving Model Serving
  • 12. ML in Production: Machine Learning Pipelines
  • 13. ©2018 Logical Clocks AB. All Rights Reserved A Machine Learning Pipeline 13/45 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification Test
  • 14. ©2018 Logical Clocks AB. All Rights Reserved Hops Small Data ML Pipeline 14/45 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification Test Project Teams (Data Engineers/Scientists) TfServingTensorFlow HopsFS Kafka HopsYARN Kubernetes
  • 15. ©2018 Logical Clocks AB. All Rights Reserved PySpark Hops Big Data ML Pipeline 15/45 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification Test TfServingTensorFlow Project Teams (Data Engineers/Scientists) HopsFS Kafka HopsYARN Kubernetes
  • 16. Google Facets Overview •Visualize data distributions •Min/max/mean/media values for features •Missing values in columns •Facets Overview expects test/train datasets as input Feature Extraction Experimentation Training Test + Serve Data Ingestion Clean/Transform Data
  • 17. Google Facets Dive • Visualize the relationship between the data points across the different features of a dataset. Feature Extraction Experimentation Training Test + Serve Data Ingestion Clean/Transform Data
  • 18. ©2018 Logical Clocks AB. All Rights Reserved 18 features = ["Age", "Occupation", "Sex", …, "Country"] h = hdfs.get_fs() with h.open_file(hdfs.project_path() + "/TestJob/data/census/adult.data", "r") as trainFile: train_data =pd.read_csv(trainFile, names=features, sep=r's*,s*', engine='python', na_values="?") test_data = … facets.overview(train_data, test_data) facets.dive(test_data.to_json(orient='records')) Data Ingestion and Google Facets Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 19. ©2018 Logical Clocks AB. All Rights Reserved 19/45 Small Data Preparation with tf.data API def input_fn(batch_size): files = tf.data.Dataset.list_files(IMAGES_DIR) def tfrecord_dataset(filename): return tf.data.TFRecordDataset(filename, num_parallel_reads=32, buffer_size=8*1024*1024) dataset = files.apply(tf.data.parallel_interleave (tfrecord_dataset, cycle_length=32, sloppy=True) dataset = dataset.apply(tf.data.map_and_batch(parser_fn, batch_size, num_parallel_batches=4)) dataset = dataset.prefetch(4) return dataset Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 20. ©2018 Logical Clocks AB. All Rights Reserved Big Data Preparation with PySpark images = spark.readImages(IMAGE_PATH, recursive = True, numPartitions=10, sampleRatio = 0.1).cache() tr = (ImageTransformer().setOutputCol(“transformed”) .resize(height = 200, width = 200) .crop(0, 0, height = 180, width = 180) ) smallImages = tr.transform(images).select(“transformed”) # Output .tfrecords using TensorFlowOnSpark utility dfutil.saveAsTFRecords(smallImages, OUTPUT_DIR) 20/45 Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 21. ©2018 Logical Clocks AB. All Rights Reserved Parallel Experiments 21/45 Time The Outer Loop (hyperparameters): “I have to run a hundred experiments to find the best model,” he complained, as he showed me his Jupyter notebooks. “That takes time. Every experiment takes a lot of programming, because there are so many different parameters. [Rants of a Data Scientist] Hops
  • 22. ©2018 Logical Clocks AB. All Rights Reserved Hyperparam Opt. with Tf/Spark on Hops def train(learning_rate, dropout): [TensorFlow Code here] args_dict = {'learning_rate': [0.001, 0.005, 0.01], 'dropout': [0.5, 0.6]} experiment.launch(spark, train, args_dict) Launch 6 Spark Executors 22/45 Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 23. ©2018 Logical Clocks AB. All Rights Reserved HyperParam Opt. Visualization on TensorBoard 23/45 Hyperparam Opt Results Visualization
  • 24. ©2018 Logical Clocks AB. All Rights Reserved Model Architecture Search on TensorFlow/Hops def train_cifar10(learning_rate, dropout): [TensorFlow Code here] dict = {'learning_rate': [0.005, 0.00005], 'dropout': [0.01, 0.99], 'num_layers': [1,3]} experiment.evolutionary_search(spark, train_cifar10, dict, direction='max', popsize=10, generations=3, crossover=0.7, mutation=0.5)
  • 25. ©2018 Logical Clocks AB. All Rights Reserved Differential Evolution in Tensorboard (1/4) 25 CIFAR-10
  • 26. ©2018 Logical Clocks AB. All Rights Reserved Differential Evolution in Tensorboard (2/4) 26 CIFAR-10
  • 27. ©2018 Logical Clocks AB. All Rights Reserved Differential Evolution in Tensorboard (3/4) 27 CIFAR-10
  • 28. ©2018 Logical Clocks AB. All Rights Reserved Differential Evolution in Tensorboard (4/4) 28 CIFAR-10
  • 29. ©2018 Logical Clocks AB. All Rights Reserved Distributed Training 29/45 WeeksWeeks Time The Inner Loop (training): “ All these experiments took a lot of computation — we used hundreds of GPUs/TPUs for days. Much like a single modern computer can outperform thousands of decades-old machines, we hope that in the future these experiments will become household.” [Google SoTA ImageNet, Cifar-10, March18] MinsMins Hops
  • 30. ©2018 Logical Clocks AB. All Rights Reserved Distributed Training: Theory and Practice 30 30/45 Image from @hardmaru on Twitter.
  • 31. ©2018 Logical Clocks AB. All Rights Reserved Ring-AllReduce vs Parameter Server GPU 0GPU 0 GPU 1GPU 1 GPU 2GPU 2 GPU 3GPU 3 send send send send recv recv recv recv GPU 1GPU 1 GPU 2GPU 2 GPU 3GPU 3 GPU 4GPU 4 Param Server(s)Param Server(s) Network Bandwidth is the Bottleneck for Distributed Training 31/45
  • 32. ©2018 Logical Clocks AB. All Rights Reserved Horovod - AllReduce Inception V4 Performance 32/45 Inception V4, 20 Nodes on 2 GPU Servers, 40 Gb/s % GPU Utilization % Network Bandwidth
  • 33. ©2018 Logical Clocks AB. All Rights Reserved Parameter Server – Inception V4 Performance 33 Horovod, Inception V4, 10 workers, 2 PSes, 40 Gb/s, Nvidia 1080Ti % GPU Utilization % Network Bandwidth
  • 34. ©2018 Logical Clocks AB. All Rights Reserved Distributed Training with Horovod on Hops hvd.init() opt = hvd.DistributedOptimizer(opt) if hvd.local_rank()==0: [TensorFlow Code here] ….. else: [TensorFlow Code here] ….. 34/45 Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 35. Hops API •Python (also Java/Scala) - Manage tensorboard, Load/save models in HDFS - Horovod, TensorFlowOnSpark - Parallel experiments • Gridsearch • Model Architecture Search with Genetic Algorithms - Secure Streaming Analytics with Kafka/Spark/Flink • SSL/TLS certs, Avro Schema, Endpoints for Kafka/Zookeeper/etc 35/45 Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 36. ©2018 Logical Clocks AB. All Rights Reserved TensorFlow Model Serving 36/45 Feature Extraction Experimentation Training Test + Serve Data Acquisition Clean/Transform Data
  • 37. Training-Serving Skew •Monitor differences between performance during training and performance during serving. - Differences in how you process data in training vs serving. - Differences in the training data and live data for serving. - A feedback loop between your model and your algorithm. •When to retrain? - If you look at the input data and use covariant shift to see when it deviates significantly from the data that was used to train the model on. 37
  • 38. http://hopshadoop.com:8080/hopsworks 0. Register an account 1. Create a ’tensorflow_tour’ serving/train_and_export_model.ipynb Try out other notebooks – tensorflow/cnn/grid_search, •2. Create a new project called ’yourname_hotdog’ • a) enable Python 3.6 for the project b) search for the dataset ’hotdog’ and import it into ’yourname_hotdog’ c) download http://hopshadoop.com:8080/hotdog.ipynb to your laptop. d) upload hotdog.ipynb into the Jupyter dataset in `yourname_hotdog` e) install the conda dependencies: matplotlib, pillow, numpy f) Start Jupyter and run hotdog.ipynb [Credit Magnus Pedersson : https://www.youtube.com/watch?v=oxrcZ9uUblI ] 38
  • 39. Summary •Hops is a new Data Platform with first-class support for Python / Deep Learning / ML / Data Governance / GPUs •You can do fun stuff on Hops, like ”Hotdog or not”, as well as serious stuff.
  • 40. ©2018 Logical Clocks AB. All Rights Reserved The Team Jim Dowling, Seif Haridi, Gautier Berthou, Salman Niazi, Mahmoud Ismail, Theofilos Kakantousis, Ermias Gebremeskel, Antonios Kouzoupis, Alex Ormenisan, Fabio Buso, Robin Andersson, August Bonds. Active: Alumni: Vasileios Giannokostas, Johan Svedlund Nordström,Rizvi Hasan, Paul Mälzer, Bram Leenders, Juan Roca, Misganu Dessalegn, K “Sri” Srijeyanthan, Jude D’Souza, Alberto Lorente, Andre Moré, Ali Gholami, Davis Jaunzems, Stig Viaene, Hooman Peiro, Evangelos Savvidis, Steffen Grohsschmiedt, Qi Qi, Gayana Chandrasekara, Nikolaos Stanogias, Daniel Bali, Ioannis Kerkinos, Peter Buechler, Pushparaj Motamari, Hamid Afzali, Wasif Malik, Lalith Suresh, Mariano Valles, Ying Lieu, Fanti Machmount Al Samisti, Braulio Grana, Adam Alpire, Zahin Azher Rashid, ArunaKumari Yedurupaka, Tobias Johansson , Roberto Bampi, Roshan Sedar. www.hops.io @hopshadoop