© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Automate your Amazon SageMaker workflows
Julien Simon
Global Evangelist, AI & Machine Learning, AWS
@julsimon
Floor 28, Tel Aviv, July 8th, 2019
Machine learning cycle
Business problem
ML problem framing
Data collection
Data integration
Data preparation and cleaning
Data visualization and analysis
Feature engineering
Model training and parameter tuning
Model evaluation
Monitoring and debugging
Model deployment
Predictions
Are business goals met? (Yes / No)
Amazon SageMaker
Build, train and deploy models using SageMaker
Business problem
ML problem framing
Data collection
Data integration
Data preparation and cleaning
Data visualization and analysis
Feature engineering
Model training and parameter tuning
Model evaluation
Monitoring and debugging
Model deployment
Predictions
Are business goals met? (Yes / No)
Data augmentation
Feature augmentation
Re-training
What about the lines between these steps?
Agenda
A recap on AWS automation
• AWS SDKs
• AWS CloudFormation
• AWS Cloud Development Kit (CDK)
Orchestrating with AWS Step Functions
Creating notebook instances
Training / retraining models
Deploying models
• Strategies: canary deployment, blue-green deployment
• Production variants
Automation tools and services
• Boto3
  • AWS SDK for Python, covering all AWS services
  • https://github.com/boto/boto3
  • Of course, you can use other AWS SDKs for C++, Go, Java, PHP, Ruby, JS, and .NET
• Amazon SageMaker SDK
  • High-level SDK focused on ML experimentation
  • https://github.com/aws/sagemaker-python-sdk
• AWS CloudFormation
  • Infrastructure as Code service: templates (JSON, YAML) and stacks
  • https://docs.aws.amazon.com/cloudformation/index.html
• AWS Cloud Development Kit (CDK)
  • The new kid on the block: write code (JS, Python, Java, C#), build a CloudFormation template
  • https://docs.aws.amazon.com/cdk/
Score card for Amazon SageMaker automation
Criterion | Boto3 | Amazon SageMaker SDK | AWS CloudFormation | AWS CDK
Maturity / stability | *** | ** | *** | * (developer preview)
Documentation / examples | *** | *** | *** | ** for JavaScript, * for other languages
Open source | Yes | Yes | No | Yes
Service coverage | 100% | 100%, but high-level | Notebook instances, models, endpoints; no training | Same as CF
Change management | Source control | Source control | Source control, change sets, drift detection | Same as CF
Cleanup | ** (Delete APIs) | ** (Delete APIs) | *** (built-in) | Same as CF
Rollback | * (difficult) | * (difficult) | *** (built-in) | Same as CF
Speed | *** (fastest) | ** | * | Same as CF
Easy to learn | ** | *** (easiest) | * | Not sure yet…
Personal opinion
• Boto3: best coverage and fastest.
• Amazon SageMaker SDK: not designed for automation, but works for most scenarios.
• AWS CloudFormation: rock solid. My preferred option to deploy and update endpoints.
• AWS CDK: if you’re allergic to CF templates or if you want to try something different. Promising, but caveat emptor.
More options
• Troposphere (2013)
  • Open source project: https://github.com/cloudtools/troposphere
  • Write Python code, generate CloudFormation templates (I like it a lot)
• Terraform (2014)
  • Open source tool by HashiCorp: https://www.terraform.io/
  • Coverage is similar to CloudFormation
• Airflow (2015)
  • Open source project: https://airflow.apache.org/
  • https://aws.amazon.com/blogs/machine-learning/build-end-to-end-machine-learning-workflows-with-amazon-sagemaker-and-apache-airflow/
• MLflow (2018)
  • Open source project by Databricks: https://mlflow.org/
  • https://www.mlflow.org/docs/latest/python_api/mlflow.sagemaker.html
AWS Step Functions
[Workflow diagram. State types shown: Task, Choice, Fail, Parallel. Other labels: Mountains, People, Snow, NotSupportedImageType.]
Step Functions uses Amazon States Language (JSON)
{
  "Comment": "Image Processing workflow",
  "StartAt": "ExtractImageMetadata",
  "States": {
    "ExtractImageMetadata": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:::function:photo-backendExtractImageMetadata-...",
      "InputPath": "$",
      "ResultPath": "$.extractedMetadata",
      "Next": "ImageTypeCheck",
      "Catch": [ {
        "ErrorEquals": [ "ImageIdentifyError" ],
        "Next": "NotSupportedImageType"
      } ],
      "Retry": [ {
        "ErrorEquals": [ "States.ALL" ],
        "IntervalSeconds": 1,
        "MaxAttempts": 2,
        "BackoffRate": 1.5
      }, ...
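The Retry block above can be read as a small schedule: the first retry waits IntervalSeconds, and each later wait is multiplied by BackoffRate. A minimal sketch of that arithmetic follows. The helper name retry_delays is hypothetical, and optional settings such as a maximum delay are ignored.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Return the wait (in seconds) before each retry attempt.

    Hypothetical helper for illustration: the first retry waits
    interval_seconds, and each later wait is multiplied by backoff_rate.
    """
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


# Values taken from the Retry block on the slide.
delays = retry_delays(interval_seconds=1, max_attempts=2, backoff_rate=1.5)
print(delays)  # [1.0, 1.5]
```

With the slide's values, the task would be retried twice, after roughly 1 second and then 1.5 seconds, before the error is handed to the Catch block.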
Customer example
https://www.youtube.com/watch?v=-PWjI6Ri2aY
Customer example (cont’d)
[Architecture diagram. Labels shown: VPC, VPC, Data Scientist, Email, Event.]
https://gitlab.com/juliensimon/sagemaker-automation/blob/master/cloudformation/notebook-instance.yml
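A template for a notebook instance can be quite short. The sketch below is not the template from the repository above. It builds a minimal one as a Python dictionary and prints it as JSON. The logical ID MyNotebook, the parameter name NotebookRoleArn, and the default instance type are assumptions made for illustration.

```python
import json


def notebook_template(instance_type="ml.t2.medium"):
    """Build a minimal CloudFormation template for one notebook instance.

    Names such as NotebookRoleArn and MyNotebook are illustrative only.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # The IAM role is passed in rather than created here.
            "NotebookRoleArn": {"Type": "String"},
        },
        "Resources": {
            "MyNotebook": {
                "Type": "AWS::SageMaker::NotebookInstance",
                "Properties": {
                    "InstanceType": instance_type,
                    "RoleArn": {"Ref": "NotebookRoleArn"},
                },
            },
        },
    }


template = notebook_template()
print(json.dumps(template, indent=2))
```

Saved to a file, output like this could be deployed as a stack, for example with the AWS CLI or with Boto3's create_stack call.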
https://gitlab.com/juliensimon/sagemaker-automation/tree/master/cdk/notebook_instances
https://gitlab.com/juliensimon/aws/tree/master/lambda_frameworks/serverless/sagemakerscheduler
AWS Step Functions can “hide” asynchronous jobs
AWS Glue
Amazon SageMaker
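Without an orchestrator, the caller has to poll until an asynchronous job reaches a terminal state. The sketch below shows that loop, to make clear what Step Functions takes off your hands. The describe callable is a stand-in for a real status call (for example Boto3's describe_training_job), and the fake job used here is purely illustrative.

```python
import time

# Terminal values of a SageMaker training job status.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}


def wait_for_job(describe, poll_seconds=30, sleep=time.sleep):
    """Poll describe() until the job reaches a terminal state.

    describe is any zero-argument callable returning the current status
    string; sleep is injectable so the loop can be tested without waiting.
    """
    while True:
        status = describe()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)


# Fake job for illustration: two polls in progress, then done.
statuses = iter(["InProgress", "InProgress", "Completed"])
waits = []
final = wait_for_job(lambda: next(statuses), poll_seconds=30, sleep=waits.append)
print(final, waits)  # Completed [30, 30]
```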
Simplify machine learning workflows
Add AWS Glue ETL jobs in your workflows
"Synchronously Run a Glue Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName.$": "$.myJobName",
    "AllocatedCapacity": 3
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}
Add Amazon SageMaker jobs in your workflows
"Synchronously Run a Training Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
  "Parameters": {
    "AlgorithmSpecification": {...},
    "HyperParameters": {...},
    "InputDataConfig": [...],
    ...
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}
"Synchronously Run a Transform Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
  "Parameters": {
    "TransformJobName.$": "$.transform",
    "ModelName.$": "$.model",
    "MaxConcurrentTransforms": 8,
    ...
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}
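The Glue and SageMaker states above share the same shape: a Task whose Resource ends in .sync, a Catch on States.TaskFailed, and a Next state on success. As a sketch, that shape can be generated from Python instead of copied by hand. The helper sync_task and the display names are hypothetical, and the Parameters are abbreviated just as they are on the slides, so these are not complete job definitions.

```python
import json


def sync_task(service_action, parameters, next_state, error_state):
    """Build a Task state that waits for the job to finish (.sync)."""
    return {
        "Type": "Task",
        "Resource": f"arn:aws:states:::{service_action}.sync",
        "Parameters": parameters,
        "Catch": [{
            "ErrorEquals": ["States.TaskFailed"],
            "ResultPath": "$.cause",
            "Next": error_state,
        }],
        "ResultPath": "$.jobInfo",
        "Next": next_state,
    }


glue = sync_task("glue:startJobRun",
                 {"JobName.$": "$.myJobName"},
                 "Report Success", "Notify on Error")
transform = sync_task("sagemaker:createTransformJob",
                      {"TransformJobName.$": "$.transform",
                       "ModelName.$": "$.model"},
                      "Report Success", "Notify on Error")
print(json.dumps({"Run Glue Job": glue, "Run Transform Job": transform}, indent=2))
```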
Define workflows in JSON
{
  "StartAt": "Download",
  "States": {
    "Download": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
      "Next": "Train"
    },
    "Train": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
      "ResultPath": "$.training_job",
      "Parameters": {
        "AlgorithmSpecification": {
          "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest",
          "TrainingInputMode": "File"
        }…
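A hand-written definition is easy to break with a misspelled state name. The sketch below is a small, hypothetical check (not part of any AWS tooling) that lists states referenced by StartAt, Next, or Catch but never defined. It only looks at top-level states, not Choice rules or Parallel branches. Since the Train state is truncated on the slide, it is closed here with "End": true as an assumption.

```python
def undefined_targets(definition):
    """Return state names that are referenced but never defined."""
    states = definition["States"]
    referenced = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            referenced.add(state["Next"])
        for catcher in state.get("Catch", []):
            referenced.add(catcher["Next"])
    return sorted(referenced - set(states))


workflow = {
    "StartAt": "Download",
    "States": {
        "Download": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
            "Next": "Train",
        },
        "Train": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "End": True,  # assumption: the slide cuts off before this point
        },
    },
}
print(undefined_targets(workflow))  # []
```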
Canary deployment 1/2
[Diagram: users, web app, endpoint, instances. Step shown: create new variant, with traffic split 75% / 25%.]
Canary deployment 2/2
[Diagram: users, web app, endpoint, instances. Steps shown: shift more traffic, move all users, remove old variant. Traffic labels: 25%, 75%, 100%.]
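On a SageMaker endpoint, each production variant receives a share of traffic equal to its weight divided by the sum of all variant weights. The sketch below works through the canary steps with that rule. It assumes the first figure on each slide belongs to the existing variant, and the variant and endpoint names are made up. The second helper only builds arguments in the shape taken by Boto3's update_endpoint_weights_and_capacities; nothing is sent to AWS.

```python
def traffic_shares(weights):
    """Convert variant weights into the fraction of traffic each receives."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def weights_request(endpoint_name, weights):
    """Build keyword arguments for a weight update (not sent anywhere here)."""
    return {
        "EndpointName": endpoint_name,
        "DesiredWeightsAndCapacities": [
            {"VariantName": name, "DesiredWeight": weight}
            for name, weight in weights.items()
        ],
    }


# Canary steps: 75/25, then 25/75, then all traffic on the new variant.
steps = [
    {"old-variant": 75, "new-variant": 25},
    {"old-variant": 25, "new-variant": 75},
    {"old-variant": 0, "new-variant": 100},
]
for weights in steps:
    print(traffic_shares(weights))

request = weights_request("my-endpoint", steps[0])
```

Once the old variant receives no traffic, it can be removed by updating the endpoint with a configuration that only lists the new variant.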
Blue-green deployment
[Diagram: users, web app, endpoint, instances. Steps shown: create new variant, switch all traffic, delete old variant. Traffic labels: 100%, 100%, 0%, 0%, 100%, 100%.]
https://gitlab.com/juliensimon/ent321
https://gitlab.com/juliensimon/sagemaker-automation/tree/master/cloudformation
Getting started
http://aws.amazon.com/free
https://aws.amazon.com/cloudformation
https://aws.amazon.com/lambda
https://aws.amazon.com/stepfunctions
https://aws.amazon.com/sagemaker
https://github.com/aws/sagemaker-python-sdk
https://github.com/awslabs/amazon-sagemaker-examples
https://github.com/aws-samples/aws-sagemaker-build
https://medium.com/@julsimon
https://gitlab.com/juliensimon/sagemaker-automation
Thank you!
Julien Simon
Global Evangelist, AI & Machine Learning, AWS
@julsimon

Contenu connexe

Tendances

Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon Braket
Amazon Web Services
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine Learning
Chris Fregly
 

Tendances (20)

Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)
 
Build, train, and deploy ML models at scale.pdf
Build, train, and deploy ML models at scale.pdfBuild, train, and deploy ML models at scale.pdf
Build, train, and deploy ML models at scale.pdf
 
Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
Optimize your ML workloads_converted.pdf
Optimize your ML workloads_converted.pdfOptimize your ML workloads_converted.pdf
Optimize your ML workloads_converted.pdf
 
Building serverless applications (April 2018)
Building serverless applications (April 2018)Building serverless applications (April 2018)
Building serverless applications (April 2018)
 
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
 
Machine Learning: From Notebook to Production with Amazon Sagemaker
Machine Learning: From Notebook to Production with Amazon SagemakerMachine Learning: From Notebook to Production with Amazon Sagemaker
Machine Learning: From Notebook to Production with Amazon Sagemaker
 
Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon Braket
 
Machine Learning with Amazon SageMaker
Machine Learning with Amazon SageMakerMachine Learning with Amazon SageMaker
Machine Learning with Amazon SageMaker
 
Intro To Serverless Application Architecture: Collision 2018
Intro To Serverless Application Architecture: Collision 2018Intro To Serverless Application Architecture: Collision 2018
Intro To Serverless Application Architecture: Collision 2018
 
re:Invent 2018: AI/ML Services
re:Invent 2018: AI/ML Servicesre:Invent 2018: AI/ML Services
re:Invent 2018: AI/ML Services
 
Amazon Sagemaker Studio를 통한 ML개발하기 - 소성운(크로키닷컴) :: AWS Community D...
Amazon Sagemaker Studio를 통한 ML개발하기 - 소성운(크로키닷컴) :: AWS Community D...Amazon Sagemaker Studio를 통한 ML개발하기 - 소성운(크로키닷컴) :: AWS Community D...
Amazon Sagemaker Studio를 통한 ML개발하기 - 소성운(크로키닷컴) :: AWS Community D...
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine Learning
 
Serverless Architectural Patterns: Collision 2018
Serverless Architectural Patterns: Collision 2018Serverless Architectural Patterns: Collision 2018
Serverless Architectural Patterns: Collision 2018
 
Simple Cloud with Amazon Lightsail
Simple Cloud with Amazon LightsailSimple Cloud with Amazon Lightsail
Simple Cloud with Amazon Lightsail
 
Build Text Analytics Solutions with Amazon Comprehend and Amazon Translate
Build Text Analytics Solutions with Amazon Comprehend and Amazon TranslateBuild Text Analytics Solutions with Amazon Comprehend and Amazon Translate
Build Text Analytics Solutions with Amazon Comprehend and Amazon Translate
 

Similaire à Automate your Amazon SageMaker Workflows (July 2019)

Similaire à Automate your Amazon SageMaker Workflows (July 2019) (20)

Design and Implement a Serverless Media-Processing Workflow
Design and Implement a Serverless Media-Processing Workflow Design and Implement a Serverless Media-Processing Workflow
Design and Implement a Serverless Media-Processing Workflow
 
Design and Implement a Serverless Media-Processing Workflow - SRV328 - Atlant...
Design and Implement a Serverless Media-Processing Workflow - SRV328 - Atlant...Design and Implement a Serverless Media-Processing Workflow - SRV328 - Atlant...
Design and Implement a Serverless Media-Processing Workflow - SRV328 - Atlant...
 
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
 
SRV328 Designing and Implementing a Serverless Media-Processing Workflow
SRV328 Designing and Implementing a Serverless Media-Processing WorkflowSRV328 Designing and Implementing a Serverless Media-Processing Workflow
SRV328 Designing and Implementing a Serverless Media-Processing Workflow
 
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
 
Immersion Day - Estratégias e melhores práticas para ingestão de dados
Immersion Day - Estratégias e melhores práticas para ingestão de dadosImmersion Day - Estratégias e melhores práticas para ingestão de dados
Immersion Day - Estratégias e melhores práticas para ingestão de dados
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
 
Serverless APIs and you
Serverless APIs and youServerless APIs and you
Serverless APIs and you
 
Bridging the Gap Between Real Time/Offline and AI/ML Capabilities in Modern S...
Bridging the Gap Between Real Time/Offline and AI/ML Capabilities in Modern S...Bridging the Gap Between Real Time/Offline and AI/ML Capabilities in Modern S...
Bridging the Gap Between Real Time/Offline and AI/ML Capabilities in Modern S...
 
Serverless for Developers
Serverless for DevelopersServerless for Developers
Serverless for Developers
 
Simplify your Web & Mobile applications with cloud-based serverless backends
Simplify your Web & Mobile applicationswith cloud-based serverless backendsSimplify your Web & Mobile applicationswith cloud-based serverless backends
Simplify your Web & Mobile applications with cloud-based serverless backends
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter Bot
 
Designing and Implementing a Serverless Media Processing Workflow Using AWS S...
Designing and Implementing a Serverless Media Processing Workflow Using AWS S...Designing and Implementing a Serverless Media Processing Workflow Using AWS S...
Designing and Implementing a Serverless Media Processing Workflow Using AWS S...
 
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
 
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
 
GraphQL backend with AWS AppSync & AWS Lambda
GraphQL backend with AWS AppSync & AWS LambdaGraphQL backend with AWS AppSync & AWS Lambda
GraphQL backend with AWS AppSync & AWS Lambda
 
Perform Machine Learning at the IoT Edge using AWS Greengrass and Amazon Sage...
Perform Machine Learning at the IoT Edge using AWS Greengrass and Amazon Sage...Perform Machine Learning at the IoT Edge using AWS Greengrass and Amazon Sage...
Perform Machine Learning at the IoT Edge using AWS Greengrass and Amazon Sage...
 
How LEGO.com Accelerates With Serverless
How LEGO.com Accelerates With ServerlessHow LEGO.com Accelerates With Serverless
How LEGO.com Accelerates With Serverless
 
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
 
Take Mobile and Web Apps to the Next Level with AWS AppSync and AWS Amplify
Take Mobile and Web Apps to the Next Level with AWS AppSync and AWS Amplify Take Mobile and Web Apps to the Next Level with AWS AppSync and AWS Amplify
Take Mobile and Web Apps to the Next Level with AWS AppSync and AWS Amplify
 

Plus de Julien SIMON

Plus de Julien SIMON (16)

An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with Transformers
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)
 
Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)
 
Solve complex business problems with Amazon Personalize and Amazon Forecast (...
Solve complex business problems with Amazon Personalize and Amazon Forecast (...Solve complex business problems with Amazon Personalize and Amazon Forecast (...
Solve complex business problems with Amazon Personalize and Amazon Forecast (...
 
Build, Train and Deploy Machine Learning Models at Scale (April 2019)
Build, Train and Deploy Machine Learning Models at Scale (April 2019)Build, Train and Deploy Machine Learning Models at Scale (April 2019)
Build, Train and Deploy Machine Learning Models at Scale (April 2019)
 
Optimize your Machine Learning workloads (April 2019)
Optimize your Machine Learning workloads (April 2019)Optimize your Machine Learning workloads (April 2019)
Optimize your Machine Learning workloads (April 2019)
 
Build Machine Learning Models with Amazon SageMaker (April 2019)
Build Machine Learning Models with Amazon SageMaker (April 2019)Build Machine Learning Models with Amazon SageMaker (April 2019)
Build Machine Learning Models with Amazon SageMaker (April 2019)
 
Deep Learning with Tensorflow and Apache MXNet on AWS (April 2019)
Deep Learning with Tensorflow and Apache MXNet on AWS (April 2019)Deep Learning with Tensorflow and Apache MXNet on AWS (April 2019)
Deep Learning with Tensorflow and Apache MXNet on AWS (April 2019)
 
Optimize your machine learning workloads on AWS (March 2019)
Optimize your machine learning workloads on AWS (March 2019)Optimize your machine learning workloads on AWS (March 2019)
Optimize your machine learning workloads on AWS (March 2019)
 
Building machine learning inference pipelines at scale (March 2019)
Building machine learning inference pipelines at scale (March 2019)Building machine learning inference pipelines at scale (March 2019)
Building machine learning inference pipelines at scale (March 2019)
 

Dernier

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Automate your Amazon SageMaker Workflows (July 2019)

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Automate yourAmazonSageMaker workflows Julien Simon Global Evangelist, AI & Machine Learning, AWS @julsimon F l o o r 2 8 , T e l A v i v , J u l y 8 t h , 2 0 1 9
  • 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Machinelearning cycle Business Problem ML problem framing Data collection Data integration Data preparation and cleaning Data visualization and analysis Feature engineering Model training and parameter tuning Model evaluation Monitoring and debugging Model deployment Predictions Are business goals met? YESNO
  • 3. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. AmazonSageMaker 1 2 3
  • 4. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Build,trainand deploy models usingSageMaker Business Problem ML problem framing Data collection Data integration Data preparation and cleaning Data visualization and analysis Feature engineering Model training and parameter tuning Model evaluation Monitoring and debugging Model deployment Predictions Are business goals met? Dataaugmentation Feature augmentation Re-training YESNO
  • 5. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Whatabout thelines betweenthesesteps? Business Problem ML problem framing Data collection Data integration Data preparation and cleaning Data visualization and analysis Feature engineering Model training and parameter tuning Model evaluation Monitoring and debugging Model deployment Predictions Are business goals met? YESNO
  • 6. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agenda A recap on AWS automation • Amazon SDKs • AWS CloudFormation • AWS Cloud Development Kit (CDK) Orchestrating with AWS Step Functions Creating notebook instances Training / retraining models Deploying models • Strategies: canary deployment, blue-green deployment • Production variants
  • 8. Automation tools and services
       • Boto3: the AWS SDK for Python, covering all AWS services. https://github.com/boto/boto3
         Of course, you can use the other AWS SDKs for C++, Go, Java, PHP, Ruby, JS, and .NET.
       • Amazon SageMaker SDK: a high-level SDK focused on ML experimentation. https://github.com/aws/sagemaker-python-sdk
       • AWS CloudFormation: Infrastructure as Code service, built on templates (JSON, YAML) and stacks. https://docs.aws.amazon.com/cloudformation/index.html
       • AWS Cloud Development Kit (CDK): the new kid on the block. Write code (JS, Python, Java, C#), build a CloudFormation template. https://docs.aws.amazon.com/cdk/
  • 9. Score card for Amazon SageMaker automation (values listed in the order Boto3 | Amazon SageMaker SDK | AWS CloudFormation | AWS CDK)
       • Maturity / stability: *** | ** | *** | * (developer preview)
       • Documentation / examples: *** | *** | *** | ** for JavaScript, * for other languages
       • Open source: Yes | Yes | No | Yes
       • Service coverage: 100% | 100%, but high-level | Notebook instances, models, endpoints; no training | Same as CF
       • Change management: Source control | Source control | Source control, change sets, drift detection | Same as CF
       • Cleanup: ** (Delete APIs) | ** (Delete APIs) | *** (built-in) | Same as CF
       • Rollback: * (difficult) | * (difficult) | *** (built-in) | Same as CF
       • Speed: *** (fastest) | ** | * | Same as CF
       • Easy to learn: ** | *** (easiest) | * | Not sure yet…
       • Personal opinion: Best coverage and fastest | Not designed for automation, but works for most scenarios | Rock solid. My preferred option to deploy and update endpoints | If you’re allergic to CF templates or if you want to try something different. Promising, but caveat emptor
  • 10. More options
       • Troposphere (2013): open source project, https://github.com/cloudtools/troposphere
         Write Python code, generate CloudFormation templates (I like it a lot).
       • Terraform (2014): open source tool by HashiCorp, https://www.terraform.io/
         Coverage is similar to CloudFormation.
       • Airflow (2015): open source project, https://airflow.apache.org/
         https://aws.amazon.com/blogs/machine-learning/build-end-to-end-machine-learning-workflows-with-amazon-sagemaker-and-apache-airflow/
       • MLflow (2018): open source project by Databricks, https://mlflow.org/
         https://www.mlflow.org/docs/latest/python_api/mlflow.sagemaker.html
  • 12. AWS Step Functions (diagram labels: Task, Choice, Fail, Parallel, Mountains, People, Snow, NotSupportedImageType)
  • 13. Step Functions uses Amazon States Language (JSON)
       {
         "Comment": "Image Processing workflow",
         "StartAt": "ExtractImageMetadata",
         "States": {
           "ExtractImageMetadata": {
             "Type": "Task",
             "Resource": "arn:aws:lambda:::function:photo-backendExtractImageMetadata-...",
             "InputPath": "$",
             "ResultPath": "$.extractedMetadata",
             "Next": "ImageTypeCheck",
             "Catch": [
               { "ErrorEquals": ["ImageIdentifyError"], "Next": "NotSupportedImageType" }
             ],
             "Retry": [
               { "ErrorEquals": ["States.ALL"], "IntervalSeconds": 1, "MaxAttempts": 2, "BackoffRate": 1.5 },
               ...
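The Retry block on slide 13 is easier to read with the numbers worked out. The sketch below assumes the documented Amazon States Language behaviour: wait IntervalSeconds before the first retry, then multiply the wait by BackoffRate for each later attempt. The function name retry_delays is an illustrative helper, not part of any AWS SDK.

```python
def retry_delays(interval_seconds=1, max_attempts=2, backoff_rate=1.5):
    """Return the wait, in seconds, before each retry attempt.

    Defaults mirror the Retry block shown on slide 13.
    """
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


# With the slide's settings the task is retried twice before the error
# is handed to the Catch block.
print(retry_delays())
```

Once both retries have failed, the matching Catch entry (if any) decides which state runs next.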
  • 14. Customer example: https://www.youtube.com/watch?v=-PWjI6Ri2aY
  • 15. Customer example (cont’d) (architecture diagram labels: VPC, VPC, Data Scientist, Email, Event)
  • 17. https://gitlab.com/juliensimon/sagemaker-automation/blob/master/cloudformation/notebook-instance.yml
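The template linked on slide 17 is not reproduced in this transcript. As a rough idea of what a notebook-instance template can contain, here is a minimal sketch that builds one in Python and prints it as JSON. The resource type AWS::SageMaker::NotebookInstance and its InstanceType and RoleArn properties come from CloudFormation itself. The logical names NotebookInstance and NotebookRoleArn and the ml.t2.medium default are assumptions made for illustration, not the contents of the linked file.

```python
import json


def notebook_instance_template(instance_type="ml.t2.medium"):
    """Build a minimal CloudFormation template for one notebook instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            # Hypothetical parameter: the IAM role the notebook will assume.
            "NotebookRoleArn": {"Type": "String"},
        },
        "Resources": {
            "NotebookInstance": {
                "Type": "AWS::SageMaker::NotebookInstance",
                "Properties": {
                    "InstanceType": instance_type,
                    "RoleArn": {"Ref": "NotebookRoleArn"},
                },
            },
        },
    }


template = notebook_instance_template()
print(json.dumps(template, indent=2))
```

Keeping the role as a parameter lets the same template be reused across accounts. Deleting the stack removes the instance, which is the built-in cleanup credited to CloudFormation on the score card.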
  • 18. https://gitlab.com/juliensimon/sagemaker-automation/tree/master/cdk/notebook_instances
  • 20. https://gitlab.com/juliensimon/aws/tree/master/lambda_frameworks/serverless/sagemakerscheduler
  • 21. AWS Step Functions can “hide” asynchronous jobs (AWS Glue, Amazon SageMaker)
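To see what is being hidden on slide 21, consider the polling loop you would otherwise write yourself. The sketch below assumes a Boto3 SageMaker client is passed in. describe_training_job and its TrainingJobStatus field are real API elements, while wait_for_training_job and its parameters are illustrative names.

```python
import time

# Statuses after which a training job will not change any more.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(client, job_name, poll_seconds=30, sleep=time.sleep):
    """Poll a SageMaker training job until it reaches a terminal status.

    `client` is expected to behave like boto3.client("sagemaker").
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    while True:
        response = client.describe_training_job(TrainingJobName=job_name)
        status = response["TrainingJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
```

With a Step Functions service integration, this loop, its timeout handling, and the compute needed to run it all move into the workflow definition.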
  • 22. Simplify machine learning workflows
  • 23. Add AWS Glue ETL jobs in your workflows
       "Synchronously Run a Glue Job": {
         "Type": "Task",
         "Resource": "arn:aws:states:::glue:startJobRun.sync",
         "Parameters": {
           "JobName.$": "$.myJobName",
           "AllocatedCapacity": 3
         },
         "Catch": [
           { "ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" }
         ],
         "ResultPath": "$.jobInfo",
         "Next": "Report Success"
       }
  • 24. Add Amazon SageMaker jobs in your workflows
       "Synchronously Run a Training Job": {
         "Type": "Task",
         "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
         "Parameters": {
           "AlgorithmSpecification": {...},
           "HyperParameters": {...},
           "InputDataConfig": [...],
           ...
         },
         "Catch": [
           { "ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" }
         ],
         "ResultPath": "$.jobInfo",
         "Next": "Report Success"
       }
       "Synchronously Run a Transform Job": {
         "Type": "Task",
         "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
         "Parameters": {
           "TransformJobName.$": "$.transform",
           "ModelName.$": "$.model",
           "MaxConcurrentTransforms": 8,
           ...
         },
         "Catch": [
           { "ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" }
         ],
         "ResultPath": "$.jobInfo",
         "Next": "Report Success"
       }
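The Glue and SageMaker task states on slides 23 and 24 share the same shape: a ".sync" resource, a Catch on States.TaskFailed, and a ResultPath. One way to avoid repeating that boilerplate is to generate the states from Python. This is only a sketch, and sync_task_state is a hypothetical helper rather than an AWS API. The parameter values are the ones shown on the slides, with the elided fields left out.

```python
import json


def sync_task_state(integration, parameters, next_state,
                    error_state="Notify on Error"):
    """Build a Task state that waits for the job to finish (".sync")."""
    return {
        "Type": "Task",
        "Resource": f"arn:aws:states:::{integration}.sync",
        "Parameters": parameters,
        "Catch": [{
            "ErrorEquals": ["States.TaskFailed"],
            "ResultPath": "$.cause",
            "Next": error_state,
        }],
        "ResultPath": "$.jobInfo",
        "Next": next_state,
    }


glue_state = sync_task_state(
    "glue:startJobRun",
    {"JobName.$": "$.myJobName", "AllocatedCapacity": 3},
    next_state="Report Success",
)
transform_state = sync_task_state(
    "sagemaker:createTransformJob",
    {"TransformJobName.$": "$.transform", "ModelName.$": "$.model",
     "MaxConcurrentTransforms": 8},
    next_state="Report Success",
)
print(json.dumps({"Synchronously Run a Glue Job": glue_state}, indent=2))
```

Generating the JSON also sidesteps hand-typed mistakes such as curly quotes, which make a state machine definition invalid.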
  • 25. Define workflows in JSON
       {
         "StartAt": "Download",
         "States": {
           "Download": {
             "Type": "Task",
             "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
             "Next": "Train"
           },
           "Train": {
             "Type": "Task",
             "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
             "ResultPath": "$.training_job",
             "Parameters": {
               "AlgorithmSpecification": {
                 "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest",
                 "TrainingInputMode": "File"
               }…
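A hand-written workflow like the one on slide 25 is easy to break with a misspelled state name. The sketch below is a small local check, not an AWS tool: it lists every StartAt, Next, or Catch target that does not match a defined state. The example definition reuses the slide's two states. Because the slide is truncated, the training Parameters are left out and the closing "End": true is an assumption, so this definition illustrates the check and is not deployable as-is.

```python
def dangling_transitions(definition):
    """Return (source, target) pairs that point at undefined states."""
    states = definition["States"]
    transitions = [("StartAt", definition["StartAt"])]
    for name, state in states.items():
        if "Next" in state:
            transitions.append((name, state["Next"]))
        for catcher in state.get("Catch", []):
            transitions.append((name, catcher["Next"]))
    return [(source, target) for source, target in transitions
            if target not in states]


workflow = {
    "StartAt": "Download",
    "States": {
        "Download": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
            "Next": "Train",
        },
        "Train": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "ResultPath": "$.training_job",
            "End": True,  # assumption: the slide cuts off before this point
        },
    },
}
print(dangling_transitions(workflow))
```

Running a check like this before creating or updating the state machine catches typos early.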
  • 27. Canary deployment 1/2 (diagram: users, web app, endpoint, instances): create new variant, with traffic split 75% / 25%
  • 28. Canary deployment 2/2 (diagram: users, web app, endpoint, instances): shift more traffic (25% / 75%), then move all users (100%) and remove old variant
  • 29. Blue-green deployment (diagram: users, web app, endpoint, instances): create new variant (100% / 0%), switch all traffic (0% / 100%), delete old variant (100%)
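The canary and blue-green strategies on slides 27 to 29 differ only in how quickly traffic moves from the old production variant to the new one. The sketch below builds the weight settings for each step and applies one step through the Boto3 call update_endpoint_weights_and_capacities, which is a real SageMaker API. The function names, the variant names, and the idea of passing the traffic shares as a tuple are assumptions for illustration. Adding the new variant and removing the old one still require a new endpoint configuration, which is not shown.

```python
def traffic_shift_steps(old_variant, new_variant, new_shares):
    """Build one DesiredWeightsAndCapacities list per traffic-shifting step."""
    return [
        [
            {"VariantName": old_variant, "DesiredWeight": 1.0 - share},
            {"VariantName": new_variant, "DesiredWeight": share},
        ]
        for share in new_shares
    ]


def apply_step(client, endpoint_name, weights):
    """Apply one step; `client` should behave like boto3.client("sagemaker")."""
    client.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=weights,
    )


# Canary: 25% of traffic to the new variant, then 75%, then all of it.
canary = traffic_shift_steps("variant-old", "variant-new", (0.25, 0.75, 1.0))
# Blue-green: the new variant starts with no traffic, then takes all of it.
blue_green = traffic_shift_steps("variant-old", "variant-new", (0.0, 1.0))
```

In practice you would check the new variant's metrics between steps and only continue if they look healthy.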
  • 30. https://gitlab.com/juliensimon/ent321
  • 31. https://gitlab.com/juliensimon/sagemaker-automation/tree/master/cloudformation
  • 32. Getting started
       • http://aws.amazon.com/free
       • https://aws.amazon.com/cloudformation
       • https://aws.amazon.com/lambda
       • https://aws.amazon.com/stepfunctions
       • https://aws.amazon.com/sagemaker
       • https://github.com/aws/sagemaker-python-sdk
       • https://github.com/awslabs/amazon-sagemaker-examples
       • https://github.com/aws-samples/aws-sagemaker-build
       • https://medium.com/@julsimon
       • https://gitlab.com/juliensimon/sagemaker-automation
  • 33. Thank you! Julien Simon, Global Evangelist, AI & Machine Learning, AWS, @julsimon

Editor's notes

  1. TODO: TensorFlow 2.0