© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Julien Simon
Principal Technical Evangelist, AI & Machine Learning,
AWS
@julsimon
From Notebook to Production
with Amazon SageMaker
Our mission
Put machine learning in the hands of every developer and data scientist
AWS ML Stack
• Application Services: API-driven services for vision, language and speech, and chatbots.
• Platform Services: deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference.
• Frameworks & Infrastructure: develop sophisticated models with any framework, create managed, auto-scaling clusters of GPUs for large-scale training, or run prediction on trained models.
The Machine Learning Process
• Business Problem
• ML problem framing
• Data Collection
• Data Integration
• Data Preparation & Cleaning
• Data Visualization & Analysis
• Feature Engineering
• Model Training & Parameter Tuning
• Model Evaluation
• Are Business Goals met?
  • Yes: Model Deployment, Predictions, Monitoring & Debugging, Re-training
  • No: Data Augmentation, Feature Augmentation
Amazon SageMaker: Build
• Pre-built notebooks for common problems
• Built-in, high-performance algorithms
• Algorithms: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, and more
• Frameworks: Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
• Git integration
• Elastic Inference
Git integration
Elastic Inference
https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/
Amazon SageMaker: Build
• New built-in algorithms
• scikit-learn environment
• Model marketplace
• Search
Search training jobs
Machine Learning Marketplace
Amazon SageMaker: Train
• Build: pre-built notebooks for common problems; built-in, high-performance algorithms
• Train: one-click training; hyperparameter optimization
• Deploy model in production
• Scale and manage the production environment
• P3dn and C5n instances
• TensorFlow on 256 GPUs
• Resume HPO tuning job
Amazon SageMaker: Deploy
• Build: pre-built notebooks for common problems; built-in, high-performance algorithms
• Train: one-click training; hyperparameter optimization
• Deploy: one-click deployment; fully managed hosting with auto-scaling
• Model compilation
• Elastic Inference
• Inference pipelines
Working with Amazon SageMaker
The Amazon SageMaker API
• Python SDK orchestrating all Amazon SageMaker activity
  • High-level objects for algorithm selection, training, deploying, automatic model tuning, etc.
• Spark SDK (Python & Scala)
• AWS CLI: 'aws sagemaker'
• AWS SDK: boto3, etc.
Architecture diagram: Model Training (on EC2) runs training code and helper code, reads training data (with Ground Truth) and writes model artifacts. Model Hosting (on EC2) runs inference code and helper code behind an inference endpoint. A client application sends an inference request to the endpoint and receives an inference response.
Model options
Training code:
• Built-in Algorithms: Factorization Machines, Linear Learner, Principal Component Analysis, K-Means Clustering, XGBoost, and more
• Bring Your Own Script
• Bring Your Own Container
Built-in algorithms
(On the slide, orange marks supervised algorithms and yellow marks unsupervised ones.)
• Linear Learner: regression, classification
• Factorization Machines: regression, classification, recommendation
• K-Nearest Neighbors: non-parametric regression and classification
• XGBoost: regression, classification, ranking (https://github.com/dmlc/xgboost)
• K-Means: clustering
• Principal Component Analysis: dimensionality reduction
• Random Cut Forest: anomaly detection
• Object2Vec: general-purpose embedding
• Image Classification: Deep Learning (ResNet)
• Object Detection (SSD): Deep Learning (VGG or ResNet)
• Semantic Segmentation: Deep Learning
• Neural Topic Model: topic modeling
• Latent Dirichlet Allocation: topic modeling (mostly)
• BlazingText: GPU-based Word2Vec, and text classification
• Sequence to Sequence: machine translation, speech to text and more
• DeepAR: time-series forecasting (RNN)
• IP Insights: usage patterns for IP addresses
BlazingText
https://dl.acm.org/citation.cfm?id=3146354
Demo: Text Classification with BlazingText
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/blazingtext_text_classification_dbpedia
Optimizing TensorFlow
https://aws.amazon.com/blogs/machine-learning/faster-training-with-optimized-tensorflow-1-6-on-amazon-ec2-c5-and-p3-instances/ (March 2018)
Training a ResNet-50 benchmark with the synthetic ImageNet dataset using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance type is 11x faster than training on the stock binaries.
https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/ (October 2018)
Demo: Elastic Inference with TensorFlow
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_iris_dnn_classifier_using_estimators/
Automatic Model Tuning
Finding the optimal set of hyperparameters
1. Manual Search (“I know what I’m doing”)
2. Grid Search (“X marks the spot”)
  • Typically training hundreds of models
  • Slow and expensive
3. Random Search (“Spray and pray”)
  • Works better and faster than Grid Search
  • But… but… but… it’s random!
4. HPO: use Machine Learning
  • Training fewer models
  • Gaussian Process Regression and Bayesian Optimization
  • You can now resume from a previous tuning job
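The difference between Grid Search and Random Search can be sketched in plain Python. This is only an illustration under stated assumptions: the objective `validation_error`, the hyperparameter names `learning_rate` and `dropout`, and their ranges are hypothetical stand-ins for a real training job. It does not use the SageMaker tuning API and does not implement the Bayesian optimization mentioned in point 4.

```python
import itertools
import random


def validation_error(learning_rate, dropout):
    # Hypothetical stand-in for "train a model and measure its error".
    # By construction the minimum is at learning_rate=0.01, dropout=0.3.
    return (learning_rate - 0.01) ** 2 * 1e4 + (dropout - 0.3) ** 2


def grid_search(lr_values, dropout_values):
    # One model per combination: the cost grows multiplicatively
    # with the number of values tried for each hyperparameter.
    trials = [(validation_error(lr, d), lr, d)
              for lr, d in itertools.product(lr_values, dropout_values)]
    return min(trials), len(trials)


def random_search(n_trials, seed=0):
    # Sample each hyperparameter independently from its range.
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)  # log-uniform in [0.0001, 0.1]
        d = rng.uniform(0.0, 0.8)
        trials.append((validation_error(lr, d), lr, d))
    return min(trials), len(trials)


grid_best, grid_count = grid_search(
    lr_values=[0.0001, 0.001, 0.01, 0.1],
    dropout_values=[0.0, 0.2, 0.4, 0.6, 0.8],
)
random_best, random_count = random_search(n_trials=20)

print("grid:  ", grid_count, "models, best (error, lr, dropout):", grid_best)
print("random:", random_count, "models, best (error, lr, dropout):", random_best)
```

Both searches train the same number of models here. The grid can only land on the values it was given, whereas random search explores the whole range but offers no guarantee on any single run.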
Demo: HPO with TensorFlow
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/hyperparameter_tuning/tensorflow_mnist
Network: INPUT → CONV1 → POOL1 → CONV2 → POOL2 → DENSE1 → DROPOUT1 → OUTPUT (class probabilities)
Model Compilation
Optimizing for the underlying hardware
https://aws.amazon.com/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/
• Train once, run anywhere
• Frameworks and algorithms
  • TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost
• Hardware architectures
  • ARM, Intel, and NVIDIA starting today
  • Cadence, Qualcomm, and Xilinx hardware coming soon
• Amazon SageMaker Neo will be released as open source, enabling hardware vendors to customize it for their processors and devices.
Demo:
Compiling ResNet-50 for the Raspberry Pi
Configure the compilation job (config.json; replace $ROLE_ARN with the ARN of your IAM role)
{
  "RoleArn": "$ROLE_ARN",
  "InputConfig": {
    "S3Uri": "s3://jsimon-neo/model.tar.gz",
    "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
    "Framework": "MXNET"
  },
  "OutputConfig": {
    "S3OutputLocation": "s3://jsimon-neo/",
    "TargetDevice": "rasp3b"
  },
  "StoppingCondition": {
    "MaxRuntimeInSeconds": 300
  }
}
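The DataInputConfig value is itself a JSON document stored as a string, so its inner quotes must be escaped. One way to avoid escaping by hand is to generate the file with Python's standard json module, as in the sketch below. The role ARN is a placeholder, not a real role; the bucket and device names are those shown on the slide.

```python
import json

# Placeholder only: substitute the ARN of a real IAM role.
role_arn = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"

config = {
    "RoleArn": role_arn,
    "InputConfig": {
        "S3Uri": "s3://jsimon-neo/model.tar.gz",
        # Nested JSON: serialize the input shape to a string first,
        # so the outer json.dumps() escapes its quotes for us.
        "DataInputConfig": json.dumps({"data": [1, 3, 224, 224]}),
        "Framework": "MXNET",
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://jsimon-neo/",
        "TargetDevice": "rasp3b",
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 300},
}

config_text = json.dumps(config, indent=2)
print(config_text)

# Round trip: the outer document parses, and so does the nested string.
parsed = json.loads(config_text)
input_shape = json.loads(parsed["InputConfig"]["DataInputConfig"])
```

Redirecting the printed output to config.json gives a file that can be passed to the CLI with --cli-input-json.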
Compile the model
$ aws sagemaker create-compilation-job \
    --cli-input-json file://config.json \
    --compilation-job-name resnet50-mxnet-pi
$ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
$ gtar tfz model-rasp3b.tar.gz
compiled.params
compiled_model.json
compiled.so
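Predict with the compiled model
from dlr import DLRModel
# input_shape, output_shape, device and input_data are not shown on the slide
# and must be defined beforehand.
model = DLRModel('resnet50', input_shape, output_shape, device)
out = model.run(input_data)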
Inference Pipelines
• Linear sequence of 2-5 containers that process inference requests
• Feature engineering with scikit-learn or SparkML (on AWS Glue or Amazon EMR)
• Predict with built-in or custom containers
• The sequence is deployed as a single model
• Useful to preprocess, predict, and post-process
• Available for real-time prediction and batch transform
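The idea of a linear sequence of stages exposed as a single model can be sketched in plain Python, with each function standing in for one container. All names here (`make_pipeline`, `preprocess`, `predict`, `postprocess`), the weights, and the input scaling are hypothetical. This is not the SageMaker API; it is only a picture of how a request flows through the stages.

```python
from typing import Callable, List, Sequence

Stage = Callable[[object], object]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    # Mirror the 2-5 container limit stated on the slide.
    if not 2 <= len(stages) <= 5:
        raise ValueError("a pipeline needs between 2 and 5 stages")

    def invoke(request):
        payload = request
        for stage in stages:  # strictly linear: each output feeds the next stage
            payload = stage(payload)
        return payload

    return invoke


def preprocess(csv_line: str) -> List[float]:
    # Feature engineering stand-in: parse CSV, then scale
    # (assuming raw values lie between 0 and 100).
    values = [float(v) for v in csv_line.split(",")]
    return [v / 100.0 for v in values]


def predict(features: List[float]) -> float:
    # Model stand-in: a fixed linear scorer with made-up weights.
    weights = [0.5, 0.25, 0.25]
    return sum(w * x for w, x in zip(weights, features))


def postprocess(score: float) -> dict:
    # Turn the raw score into a response for the client.
    label = "positive" if score >= 0.5 else "negative"
    return {"score": round(score, 3), "label": label}


# The caller sees one callable, just as a client sees one endpoint.
pipeline = make_pipeline([preprocess, predict, postprocess])
response = pipeline("80,40,60")
print(response)
```

The client only ever calls `pipeline`, which is the point of deploying the sequence as one model: preprocessing and post-processing cannot drift out of step with the predictor.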
Demo:
Inference with scikit-learn and linear learner
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_inference_pipeline/
Amazon SageMaker
• Build: pre-built notebooks for common problems; built-in, high-performance algorithms
• Train: one-click training; hyperparameter optimization
• Deploy: one-click deployment; fully managed hosting with auto-scaling
Selected Amazon SageMaker customers
Resources
http://aws.amazon.com/free
https://ml.aws
https://aws.amazon.com/sagemaker
https://github.com/aws/sagemaker-python-sdk
https://github.com/aws/sagemaker-spark
https://github.com/awslabs/amazon-sagemaker-examples
https://gitlab.com/juliensimon/ent321 (SageMaker workshop at AWS re:Invent 2018)
https://medium.com/@julsimon
https://gitlab.com/juliensimon/dlnotebooks
Thank you!
Julien Simon
Principal Technical Evangelist, AI and Machine Learning
@julsimon

Amazon SageMaker (December 2018)

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Julien Simon Principal Technical Evangelist, AI & Machine Learning, AWS @julsimon From Notebook to Production with Amazon SageMaker
  • 2. Our mission: Put Machine Learning in the hands of every developer and data scientist
  • 3. AWS ML Stack. Application Services: API-driven services (Vision, Language & Speech Services, Chatbots). Platform Services: Deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference. Frameworks & Infrastructure: Develop sophisticated models with any framework, create managed, auto-scaling clusters of GPUs for large scale training, or run prediction on trained models.
  • 5. The Machine Learning Process (flow diagram). Stages shown: Business Problem; ML problem framing; Data Collection; Data Integration; Data Preparation & Cleaning; Data Visualization & Analysis; Feature Engineering; Model Training & Parameter Tuning; Model Evaluation; Are Business Goals met? (Yes / No); Model Deployment; Monitoring & Debugging; Predictions; Re-training; Data Augmentation; Feature Augmentation.
  • 6. Amazon SageMaker: Build. Pre-built notebooks for common problems. Built-in, high-performance algorithms. ALGORITHMS: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, and more! FRAMEWORKS: Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn. Set up and manage environments for training. Train and tune model (trial and error). Deploy model in production. Scale and manage the production environment. Git integration. Elastic inference.
  • 7. Git integration
  • 8. Elastic Inference https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/
  • 9. Amazon SageMaker: Build. Pre-built notebooks for common problems. Built-in, high-performance algorithms. ALGORITHMS: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, and more! FRAMEWORKS: Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn. Set up and manage environments for training. Train and tune model (trial and error). Deploy model in production. Scale and manage the production environment. New built-in algorithms. scikit-learn environment. Model marketplace. Search.
  • 10. Search training jobs
  • 11. Machine Learning Marketplace
  • 12. Amazon SageMaker. Build: pre-built notebooks for common problems; built-in, high-performance algorithms. Train: one-click training; hyperparameter optimization. Deploy model in production. Scale and manage the production environment. P3DN, C5N. TensorFlow on 256 GPUs. Resume HPO tuning job.
  • 13. Amazon SageMaker. Build: pre-built notebooks for common problems; built-in, high-performance algorithms. Train: one-click training; hyperparameter optimization. Deploy: one-click deployment; fully managed hosting with auto-scaling. Model compilation. Elastic inference. Inference pipelines.
  • 14. Working with Amazon SageMaker
  • 15. The Amazon SageMaker API • Python SDK orchestrating all Amazon SageMaker activity • High-level objects for algorithm selection, training, deploying, automatic model tuning, etc. • Spark SDK (Python & Scala) • AWS CLI: 'aws sagemaker' • AWS SDK: boto3, etc.
  • 16. Architecture diagram. Labels shown: Model Training (on EC2); Model Hosting (on EC2); Training data; Model artifacts; Training code; Helper code; Inference code; Ground Truth; Client application; Inference request; Inference response; Inference Endpoint.
  • 17. Model options. Training code: Built-in Algorithms (Factorization Machines, Linear Learner, Principal Component Analysis, K-Means Clustering, XGBoost, and more); Bring Your Own Script; Bring Your Own Container.
  • 18. Built-in algorithms (slide color legend: orange = supervised, yellow = unsupervised). Linear Learner: regression, classification; Image Classification: Deep Learning (ResNet); Factorization Machines: regression, classification, recommendation; Object Detection (SSD): Deep Learning (VGG or ResNet); K-Nearest Neighbors: non-parametric regression and classification; Neural Topic Model: topic modeling; XGBoost (https://github.com/dmlc/xgboost): regression, classification, ranking; Latent Dirichlet Allocation: topic modeling (mostly); K-Means: clustering; BlazingText: GPU-based Word2Vec, and text classification; Principal Component Analysis: dimensionality reduction; Sequence to Sequence: machine translation, speech to text and more; Random Cut Forest: anomaly detection; DeepAR: time-series forecasting (RNN); Object2Vec: general-purpose embedding; IP Insights: usage patterns for IP addresses; Semantic Segmentation: Deep Learning.
  • 19. BlazingText https://dl.acm.org/citation.cfm?id=3146354
  • 20. Demo: Text Classification with BlazingText https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/blazingtext_text_classification_dbpedia
  • 21. Optimizing TensorFlow https://aws.amazon.com/blogs/machine-learning/faster-training-with-optimized-tensorflow-1-6-on-amazon-ec2-c5-and-p3-instances/ (March 2018) Training a ResNet-50 benchmark with the synthetic ImageNet dataset using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance type is 11x faster than training on the stock binaries. https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/ (October 2018)
  • 22. Demo: Elastic Inference with TensorFlow https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_iris_dnn_classifier_using_estimators/
  • 24. Automatic Model Tuning: finding the optimal set of hyperparameters 1. Manual Search (“I know what I’m doing”) 2. Grid Search (“X marks the spot”) • Typically training hundreds of models • Slow and expensive 3. Random Search (“Spray and pray”) • Works better and faster than Grid Search • But… but… but… it’s random! 4. HPO: use Machine Learning • Training fewer models • Gaussian Process Regression and Bayesian Optimization • You can now resume from a previous tuning job
  • 25. Demo: HPO with TensorFlow https://github.com/awslabs/amazon-sagemaker-examples/tree/master/hyperparameter_tuning/tensorflow_mnist Network: INPUT → CONV1 → POOL1 → CONV2 → POOL2 → DENSE1 → DROPOUT1 → OUTPUT (class probabilities)
  • 27. Optimizing for the underlying hardware https://aws.amazon.com/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/ • Train once, run anywhere • Frameworks and algorithms • TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost • Hardware architectures • ARM, Intel, and NVIDIA starting today • Cadence, Qualcomm, and Xilinx hardware coming soon • Amazon SageMaker Neo will be released as open source, enabling hardware vendors to customize it for their processors and devices.
  • 28. Demo: Compiling ResNet-50 for the Raspberry Pi
       Configure the compilation job (config.json):
         {
           "RoleArn": "$ROLE_ARN",
           "InputConfig": {
             "S3Uri": "s3://jsimon-neo/model.tar.gz",
             "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
             "Framework": "MXNET"
           },
           "OutputConfig": {
             "S3OutputLocation": "s3://jsimon-neo/",
             "TargetDevice": "rasp3b"
           },
           "StoppingCondition": { "MaxRuntimeInSeconds": 300 }
         }
       Compile the model:
         $ aws sagemaker create-compilation-job --cli-input-json file://config.json --compilation-job-name resnet50-mxnet-pi
         $ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
         $ gtar tfz model-rasp3b.tar.gz
         compiled.params
         compiled_model.json
         compiled.so
       Predict with the compiled model:
         from dlr import DLRModel
         model = DLRModel('resnet50', input_shape, output_shape, device)
         out = model.run(input_data)
  • 30. Inference Pipelines • Linear sequence of 2-5 containers that process inference requests • Feature engineering with scikit-learn or SparkML (on AWS Glue or Amazon EMR) • Predict with built-in or custom containers • The sequence is deployed as a single model • Useful to preprocess, predict, and post-process • Available for real-time prediction and batch transform
  • 31. Demo: Inference with scikit-learn and linear learner https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_inference_pipeline/
  • 32. Amazon SageMaker. Build: pre-built notebooks for common problems; built-in, high-performance algorithms. Train: one-click training; hyperparameter optimization. Deploy: one-click deployment; fully managed hosting with auto-scaling.
  • 33. Selected Amazon SageMaker customers
  • 35. Resources: http://aws.amazon.com/free; https://ml.aws; https://aws.amazon.com/sagemaker; https://github.com/aws/sagemaker-python-sdk; https://github.com/aws/sagemaker-spark; https://github.com/awslabs/amazon-sagemaker-examples; https://gitlab.com/juliensimon/ent321 (SageMaker workshop at AWS re:Invent 2018); https://medium.com/@julsimon; https://gitlab.com/juliensimon/dlnotebooks
  • 36. Thank you! Julien Simon Principal Technical Evangelist, AI and Machine Learning @julsimon
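
The compilation-job configuration on the Neo demo slide stores a JSON document inside a JSON string (`DataInputConfig`), which is easy to mis-quote by hand. Here is a minimal sketch, assuming Python and only the standard library, that builds the same request body and lets `json.dumps` handle the escaping. Field names and values are taken from the slide; the role ARN is a hypothetical placeholder, and `build_compilation_config` is an illustrative helper, not part of any AWS SDK.

```python
import json


def build_compilation_config(role_arn, model_s3_uri, output_s3_uri,
                             input_shape, framework="MXNET",
                             target_device="rasp3b", max_runtime=300):
    """Return a JSON body for `aws sagemaker create-compilation-job`.

    Illustrative helper (not an AWS API); field names follow the slide.
    """
    config = {
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # DataInputConfig is itself JSON stored as a string, so it is
            # serialized first and then escaped by the outer json.dumps().
            "DataInputConfig": json.dumps({"data": input_shape}),
            "Framework": framework,
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime},
    }
    return json.dumps(config, indent=2)


# Hypothetical role ARN; bucket and input shape are the ones on the slide.
config_json = build_compilation_config(
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    model_s3_uri="s3://jsimon-neo/model.tar.gz",
    output_s3_uri="s3://jsimon-neo/",
    input_shape=[1, 3, 224, 224],
)
print(config_json)
```

Writing `config_json` to `config.json` would give a file usable with the `--cli-input-json file://config.json` option shown on the slide.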

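
The Automatic Model Tuning slide lists grid search and random search before model-based HPO. The sketch below illustrates only those two baselines, assuming a toy objective in place of a real training job; it does not implement the Gaussian Process or Bayesian optimization used by the service. The objective function, parameter names, and ranges are all hypothetical.

```python
import itertools
import random


def toy_validation_loss(learning_rate, dropout):
    """Stand-in for 'train a model and return its validation loss'.

    Hypothetical objective with its minimum at learning_rate=0.01, dropout=0.3.
    """
    return (learning_rate - 0.01) ** 2 * 1e4 + (dropout - 0.3) ** 2


def grid_search(learning_rates, dropouts):
    """Evaluate every combination; the trial count is the product of list sizes."""
    trials = [((lr, d), toy_validation_loss(lr, d))
              for lr, d in itertools.product(learning_rates, dropouts)]
    return min(trials, key=lambda t: t[1]), len(trials)


def random_search(n_trials, seed=0):
    """Sample a fixed budget of random configurations."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)   # log-uniform learning rate
        d = rng.uniform(0.0, 0.6)        # uniform dropout
        trials.append(((lr, d), toy_validation_loss(lr, d)))
    return min(trials, key=lambda t: t[1]), len(trials)


grid_best, grid_count = grid_search([0.001, 0.01, 0.1], [0.1, 0.3, 0.5])
rand_best, rand_count = random_search(n_trials=20)
print("grid:", grid_best, "trials:", grid_count)
print("random:", rand_best, "trials:", rand_count)
```

The grid's cost is fixed by the number of values per parameter, whereas the random-search budget is chosen directly, which is one reason it is often preferred when there are many hyperparameters.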
Editor's notes

  1. SageMaker makes it easy to build ML models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3. You can connect directly to data in S3, or use AWS Glue to move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis in your notebook. To help you select your algorithm, Amazon SageMaker includes the 10 most common machine learning algorithms, which have been pre-installed and optimized to deliver up to 10 times the performance you’ll find running these algorithms anywhere else. Amazon SageMaker also comes pre-configured to run TensorFlow and Apache MXNet, two of the most popular open source frameworks, or you have the option of using your own framework.
  2. SageMaker makes it easy to build ML models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3. You can connect directly to data in S3, or use AWS Glue to move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis in your notebook. To help you select your algorithm, Amazon SageMaker includes the 10 most common machine learning algorithms, which have been pre-installed and optimized to deliver up to 10 times the performance you’ll find running these algorithms anywhere else. Amazon SageMaker also comes pre-configured to run TensorFlow and Apache MXNet, two of the most popular open source frameworks, or you have the option of using your own framework.
  3. You can begin training your model with a single click in the Amazon SageMaker console. The service manages all of the underlying infrastructure for you and can easily scale to train models at petabyte scale. To make the training process even faster and easier, Amazon SageMaker can automatically tune your model to achieve the highest possible accuracy.
  4. Once your model is trained and tuned, SageMaker makes it easy to deploy in production so you can start generating predictions on new data (a process called inference). Amazon SageMaker deploys your model on an auto-scaling cluster of Amazon EC2 instances that are spread across multiple availability zones to deliver both high performance and high availability. It also includes built-in A/B testing capabilities to help you test your model and experiment with different versions to achieve the best results.   For maximum versatility, we designed Amazon SageMaker in three modules – Build, Train, and Deploy – that can be used together or independently as part of any existing ML workflow you might already have in place.
  5. Seq2Seq: used by Amazon Translate and AWS Sockeye. LDA: used by Amazon Comprehend.
  6. Once your model is trained and tuned, SageMaker makes it easy to deploy in production so you can start generating predictions on new data (a process called inference). Amazon SageMaker deploys your model on an auto-scaling cluster of Amazon EC2 instances that are spread across multiple availability zones to deliver both high performance and high availability. It also includes built-in A/B testing capabilities to help you test your model and experiment with different versions to achieve the best results.   For maximum versatility, we designed Amazon SageMaker in three modules – Build, Train, and Deploy – that can be used together or independently as part of any existing ML workflow you might already have in place.
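
The notes mention built-in A/B testing between model versions. As a conceptual illustration only (this is not SageMaker's implementation or API), weighted traffic splitting between two variants can be sketched in plain Python. The variant names, the 90/10 weights, and the `pick_variant` helper are hypothetical.

```python
import random
from collections import Counter


def pick_variant(variant_weights, rng):
    """Choose one variant name with probability proportional to its weight."""
    names = list(variant_weights)
    weights = [variant_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical split: 90% of requests to the current model, 10% to the candidate.
variant_weights = {"model-a": 0.9, "model-b": 0.1}
rng = random.Random(42)  # seeded so the simulation is repeatable

# Simulate routing 10,000 inference requests and count where they land.
counts = Counter(pick_variant(variant_weights, rng) for _ in range(10_000))
print(counts)
```

Comparing a quality metric per variant over the routed requests is then what decides whether the candidate should receive more traffic.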