End-to-End Machine Learning with Amazon SageMaker

© 2021 Amazon Web Services, Inc. or its affiliates. All rights reserved.
Sungmin Kim, AWS Solutions Architect

In this Talk
• What is Machine Learning?
• Machine Learning Workflow
• Build → Train → Deploy
• Build fast and collaborate
• Amazon SageMaker Studio Notebooks
• Train and tune models
• Amazon SageMaker Training Job
• Amazon SageMaker Hyperparameter Optimization
• Deploy and manage models
• Amazon SageMaker Endpoints
• Amazon SageMaker Pipelines
• Automatic ML Model Generation
• Amazon SageMaker Autopilot
• Machine Learning in the cloud

Marketing Offer On A New Product

Option 1 - Build A Rule Engine
Input:
Age  Gender  Purchase Date  Items
30   M       3/1/2017       Toy
40   M       1/3/2017       Books
…    …       …              …
Rules written by a human programmer:
Rule 1: 15 < age < 30
Rule 2: Bought Toy = Y, Last Purchase < 30 days
Rule 3: Gender = 'M', Bought Toy = 'Y'
Rule 4: …
Rule 5: …
Output:
Age  Gender  Purchase Date  Items
30   M       3/1/2017       Toy
…    …       …              …
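
The rule engine on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the slide does not say how the rules are combined, so joining them with OR is an assumption, as are the month/day reading of the dates and the "today" value.

```python
from datetime import date


def matches_rules(customer, today):
    """Return True if any hand-written rule selects this customer."""
    bought_toy = "Toy" in customer["items"]
    days_since_purchase = (today - customer["purchase_date"]).days
    rule_1 = 15 < customer["age"] < 30
    rule_2 = bought_toy and days_since_purchase < 30
    rule_3 = customer["gender"] == "M" and bought_toy
    # Assumption: a customer gets the offer if ANY rule fires.
    return rule_1 or rule_2 or rule_3


# The two example rows from the slide (dates read as month/day/year).
customers = [
    {"age": 30, "gender": "M", "purchase_date": date(2017, 3, 1), "items": ["Toy"]},
    {"age": 40, "gender": "M", "purchase_date": date(2017, 1, 3), "items": ["Books"]},
]
today = date(2017, 3, 15)  # hypothetical evaluation date
selected = [c for c in customers if matches_rules(c, today)]
```

Every new business condition means another hand-written rule, which is the maintenance burden the next option tries to avoid.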

Option 2 - Learn The Business Rules From Data
Historical Purchase Data (Training Data) → Learning Algorithm → Model
Age  Gender  Purchase Date  Items
30   M       3/1/2017       Toy
40   M       1/3/2017       Books
…    …       …              …
Input - New Unseen Data → Model → Output (Prediction)
Age  Gender  Items
35   F
39   M       Toy
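
As a rough illustration of learning a rule from data instead of writing it by hand, the sketch below fits a single age threshold to labeled history. The history rows and the one-threshold "model" are hypothetical and far simpler than any real learning algorithm.

```python
def learn_age_threshold(history):
    """Pick the age cut-off that classifies the most historical rows correctly."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({age for age, _ in history}):
        correct = sum((age <= threshold) == bought for age, bought in history)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical training data: (age, bought_toy)
history = [(22, True), (25, True), (30, True), (40, False), (52, False)]
threshold = learn_age_threshold(history)


def predict(age):
    """Apply the learned rule to new, unseen data."""
    return age <= threshold
```

Here the cut-off comes out of the data; retraining on new history updates the rule without a programmer editing it.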

We Call This Approach Machine Learning

Typical Machine Learning Process
• Collect, prepare and label training data
• Choose and optimize ML algorithm
• Train and tune ML models
• Set up and manage environments for training
• Deploy models in production
• Scale and manage the production environment

Machine Learning is iterative
• Set up and track experiments
• Choose model
• Debug, compare, and evaluate experiments
• Monitor quality, detect drift, and retrain
• Share, review, and collaborate

Common machine learning development
Laptop
Upside:
• Flexible. Personal. Easy to get started.
Downside:
• Extremely difficult to scale
• Nearly impossible to run in production
• Need virtual environments in order to experiment

Common machine learning development
Servers
Upside:
• Familiar. May seem less expensive upfront.
Downside:
• Availability is incredibly challenging to maintain
• Stuck in either over- or under- utilization
• Experimentation is risky and expensive
• New ideas have to wait for months to start
• Good luck going global!

A broad set of AI tools for every developer

AI SERVICES
Categories: VISION, SPEECH, TEXT, SEARCH, CHATBOTS, PERSONALIZATION, FORECASTING, FRAUD, CONTACT CENTERS, INDUSTRIAL AI, CODE AND DEVOPS, HEALTHCARE AI, ANOMALY DETECTION
Services: Amazon Rekognition, Amazon Polly, Amazon Transcribe (+Medical), Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Comprehend (+Medical), Amazon Textract, Amazon Kendra, Amazon CodeGuru, Amazon Fraud Detector, Amazon Translate, Amazon DevOps Guru, Voice ID for Amazon Connect, Contact Lens, Amazon Monitron, AWS Panorama + Appliance, Amazon Lookout for Vision, Amazon Lookout for Equipment, Amazon HealthLake, Amazon Lookout for Metrics, Amazon Transcribe for Medical, Amazon Comprehend for Medical

ML SERVICES: Amazon SageMaker
SAGEMAKER STUDIO IDE
• Label data: Ground Truth
• Aggregate & prepare data: Data Wrangler
• Store & share features: Feature Store
• Auto ML: Autopilot
• Spark/R: Processing
• Detect bias: Clarify
• Visualize in notebooks: Studio Notebooks
• Pick algorithm: Built-in or Bring-your-own
• Train models: Experiments, Spot Training, Distributed Training
• Tune parameters: Automatic Model Tuning
• Debug & profile: Debugger
• Deploy in production: Model Hosting, Multi-model Endpoints
• Manage & monitor: Model Monitor
• CI/CD: Pipelines
• Human review: Augmented AI
AMAZON SAGEMAKER EDGE MANAGER
AMAZON SAGEMAKER JUMPSTART

FRAMEWORKS & INFRASTRUCTURE
Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Trainium, Inferentia, FPGA, DeepGraphLibrary

Amazon SageMaker
A fully managed service for easily building, training, and deploying machine learning models
• End-to-End Machine Learning Platform
• Zero setup
• Flexible Model Training
• Pay by the second

Build fast and collaborate

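Amazon SageMaker Studio
The first fully integrated development environment (IDE) for developing and deploying machine learning models
• Collaboration at scale: share scalable notebooks without tracking code dependencies
• Easy experiment management: organize, track, and compare thousands of model experiments
• Automatic model generation: create models automatically from your data without writing code
• Higher quality ML models: automatic error debugging and real-time error alerts; model monitoring to maintain high quality
• Increased productivity: build fully automated machine learning workflows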

Amazon SageMaker Studio start screen

Notebooks can be shared with a single click

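Amazon SageMaker Notebooks
A new development environment where developers can start an ML notebook in seconds and share it with a single click
• Easy access through single sign-on (SSO): reach the development environment directly with employee credentials
• Highly secure, fully managed service: administrators can easily manage permissions and access control
• Easy collaboration: URL-based sharing with a single click
• Serverless environment, no compute resources to manage: no separate setup or startup required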

• Jupyter notebooks
• Support for Jupyter Lab
• Multiple built-in kernels
• Install external libraries and kernels
• Integrate with Git
• Sample notebooks
• VPC Integration for integrated security

Train and tune models

Amazon SageMaker Training
1. model.fit() starts the training job on EC2 instances
2. S3 Bucket sends your data to the instances
3. The algorithm image (Docker container) is downloaded from Elastic Container Registry
4. The trained model is written to S3

Training from Amazon SageMaker Notebooks

Specify Training Infrastructure

Use Algorithm and Start training
Execution Role
SageMaker Estimator

How does training happen
Example: XGBoost
• SageMaker Notebook starts a SageMaker Training Job
• ECR images: xgboost, linear-learner, PCA, DeepAR, BlazingText, Image classification, Object Detection, …
• S3 input channels: train, validation (optional), test (optional)
• ML Instance: ml.m4.xlarge (launch container for training job)
• Output: Model written to S3
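
The pieces on this slide (ECR image, S3 channels, ML instance, S3 output) map onto the fields of a SageMaker CreateTrainingJob request. The sketch below only builds that request as a plain dictionary. The job name, image URI, role ARN, S3 paths, volume size, and runtime limit are all hypothetical placeholders, not values from the talk.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri):
    """Assemble a CreateTrainingJob request body for a single 'train' channel."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # algorithm image pulled from ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,              # execution role
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},  # trained model goes here
        "ResourceConfig": {
            "InstanceType": "ml.m4.xlarge",  # instance type shown on the slide
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,            # assumed value
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed value
    }


# Placeholder arguments for illustration only.
request = build_training_job_request(
    job_name="example-xgboost-job",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<algorithm-image>",
    role_arn="arn:aws:iam::<account>:role/<execution-role>",
    train_s3_uri="s3://bucket/train/",
    output_s3_uri="s3://bucket/output/",
)
```

With real values filled in, a dictionary of this shape could be passed to `boto3.client("sagemaker").create_training_job(**request)`; the SageMaker Estimator's `fit()` builds an equivalent request for you.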

SageMaker training supports Spot Instances
EC2 Instance Spot Pricing
• Specify a maximum wait time
• SageMaker will default to giving you the lowest possible cost
• Store model checkpoints in Amazon S3 in case your job is interrupted for BYOM
• Many built-in algorithms automatically revert to a training job
• We have examples
• Save up to 90%!
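
A small sketch of the Spot settings above. The keyword names (`use_spot_instances`, `max_run`, `max_wait`, `checkpoint_s3_uri`) follow the SageMaker Python SDK Estimator arguments; the helper functions, the S3 path, and the example timings are hypothetical. The savings figure is computed from billable versus total training seconds.

```python
def spot_training_settings(max_run_seconds, max_wait_seconds, checkpoint_s3_uri):
    """Collect Estimator keyword arguments for Managed Spot Training."""
    # The maximum wait time has to cover at least the maximum run time.
    if max_wait_seconds < max_run_seconds:
        raise ValueError("max_wait must be greater than or equal to max_run")
    return {
        "use_spot_instances": True,
        "max_run": max_run_seconds,
        "max_wait": max_wait_seconds,
        "checkpoint_s3_uri": checkpoint_s3_uri,  # where checkpoints survive interruptions
    }


def spot_savings_percent(training_seconds, billable_seconds):
    """Savings relative to paying for every second of training time."""
    return (1 - billable_seconds / training_seconds) * 100


settings = spot_training_settings(3600, 7200, "s3://bucket/checkpoints/")
savings = spot_savings_percent(training_seconds=1000, billable_seconds=300)
```

In this made-up example the job trained for 1000 seconds but was billed for 300, which works out to roughly 70% savings.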

Algorithm Options
1. Built-in algorithms
2. Script mode
3. Docker container (BYOC)
4. AWS ML Marketplace

Train with a built-in algorithm
xgboost
linear-learner
PCA
DeepAR
BlazingText
Image classification
…
Object Detection
Built-in Algorithm
Images

Bring Your Own Container (BYOC)

Many ways to train a model on SageMaker
Algorithm Options
• 17 built-in algorithms (training code): Matrix factorization, Regression, Principal component analysis, K-means clustering, Gradient boosted trees, and more!
• Bring your own script (Amazon SageMaker managed container)
• Bring your own algorithm (you build the Docker container)
• Subscribe to Algorithms and Model Packages on AWS Marketplace

Hyperparameter Tuning
Amazon SageMaker Automatic Model Tuning
“Hyperparameters” (algorithm parameters that significantly affect model quality)
• Neural Networks: Number of layers, Hidden layer width, Learning rate, Embedding dimensions, Dropout, …
• Decision Trees: Tree depth, Max leaf nodes, Gamma, Eta, Lambda, Alpha, …

Setting up a hyperparameter tuning job

Hyperparameter Tuning
Automatic Model Tuning → Training Job 1, Training Job 2, …, Training Job N → Best Model Selector → Best Model
• Define Metrics
• Hyperparameter ranges/scaling
• Stop tuning job early
• Use warm start
• Bayesian or Random Search
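
To make the idea of ranges, scaling, and a best-model selector concrete, here is a minimal random-search sketch in plain Python. It is not how SageMaker implements tuning: the objective function stands in for a training job that reports a validation error, and the hyperparameter names and ranges are hypothetical.

```python
import math
import random


def sample_hyperparameters(ranges, rng):
    """Draw one value per hyperparameter, honoring linear or logarithmic scaling."""
    sample = {}
    for name, (low, high, scaling) in ranges.items():
        if scaling == "Logarithmic":
            sample[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:  # "Linear"
            sample[name] = rng.uniform(low, high)
    return sample


def random_search(objective, ranges, max_jobs, seed=0):
    """Run max_jobs independent trials and return (best_score, best_params)."""
    rng = random.Random(seed)
    trials = [sample_hyperparameters(ranges, rng) for _ in range(max_jobs)]
    scored = [(objective(params), params) for params in trials]
    return min(scored, key=lambda pair: pair[0])


def fake_validation_error(params):
    """Hypothetical stand-in for a training job's objective metric (lower is better)."""
    return (math.log10(params["eta"]) + 1) ** 2 + (params["max_depth"] - 6) ** 2 / 100


ranges = {
    "eta": (0.01, 1.0, "Logarithmic"),
    "max_depth": (2.0, 10.0, "Linear"),
}
best_score, best_params = random_search(fake_validation_error, ranges, max_jobs=20)
```

Because each random trial is independent of the others, all of them could run at the same time, whereas a Bayesian strategy uses results from earlier jobs to choose later ones.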

Hyperparameter Search Strategy
Bayesian Search
Random Search

Bayesian vs. Random Search
Bayesian Search Random Search

What if I need all my jobs tuned at the same time?
Bayesian Search Random Search

Deploy models

Amazon SageMaker Deployment
• Amazon S3: Training Data, Model artifacts
• Amazon ECR: Training Image, Inference Image
• Amazon SageMaker Hosting Services: Model artifacts + Inference Image → Endpoint

SageMaker Endpoints (Private API)
Deployment / Hosting
• Client → Endpoint → Elastic Load Balancing → Model on Amazon SageMaker ML Compute Instances
• Instances run in an Auto Scaling group across Availability Zone 1, Availability Zone 2, and Availability Zone 3
• Input Data (Request) in, Prediction (Response) out
SageMaker Endpoints (Public API)
Deployment / Hosting
• Client → Amazon API Gateway → Endpoint → Elastic Load Balancing → Model on Amazon SageMaker ML Compute Instances
• Instances run in an Auto Scaling group across Availability Zone 1, Availability Zone 2, and Availability Zone 3
• Input Data (Request) in, Prediction (Response) out

Updating Endpoints
• Blue-green deployments mean no scheduled downtime
• Deploy one or more models behind the same endpoint

A/B Testing
• 1-10 Production Variants (Model Versions)
• All models must have the same I/O schema
• Endpoint Modification w/o service disruption
Client Application → Inference request → Secure Endpoint → Inference result
Model-1: Inference Code, Helper Code, Model Artifacts, Inference code Images
Model-2: Inference Code, Helper Code, Model Artifacts, Inference code Images
ProductionVariants:
{ …, 'InitialVariantWeight': 2 }
{ …, 'InitialVariantWeight': 1 }
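
The variant weights on this slide determine how traffic is split: each variant receives its weight divided by the sum of all weights. A minimal sketch follows, assuming (the slide does not say) that the weight of 2 belongs to Model-1 and the weight of 1 to Model-2, and leaving out the other ProductionVariants fields.

```python
def traffic_shares(production_variants):
    """Return the fraction of requests each variant receives."""
    total = sum(v["InitialVariantWeight"] for v in production_variants)
    return {
        v["VariantName"]: v["InitialVariantWeight"] / total
        for v in production_variants
    }


# Assumed assignment of weights to variants; other fields omitted.
production_variants = [
    {"VariantName": "Model-1", "InitialVariantWeight": 2},
    {"VariantName": "Model-2", "InitialVariantWeight": 1},
]
shares = traffic_shares(production_variants)
```

With these weights, about two thirds of requests go to one variant and one third to the other.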

A/B Testing
Model version configuration
Deploy → Invoke

Multi-Model Endpoints
• Scalable/Cost Effective for large number of models
• Works best when models are of similar size and latency
• Automatic memory handling
Client Application → Inference request → Secure Endpoint → Inference result
Invoke Endpoint: TargetModel = Model-1
Model-1 (Prefix = SalesForecast/): Container with Inference Code and Helper Code, Model Artifacts
Model-2 (Prefix = SalesForecast/): Container with Inference Code and Helper Code, Model Artifacts

Multi-model endpoints
Significant savings for large-scale deployments
Sample scenario: ml.c5.xlarge, $0.238/hour, 2 instances running 24/7
• 10 separate endpoints (EP-1 … EP-10, one model each): $3,430/month
• 1 multi-model endpoint (Model 1 … Model 10 behind one EP): $343/month
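
The monthly figures in this scenario can be reproduced with simple arithmetic. The sketch assumes a 30-day month (720 hours), which the slide does not state explicitly.

```python
HOURLY_PRICE_USD = 0.238      # ml.c5.xlarge price from the slide
INSTANCES_PER_ENDPOINT = 2    # from the slide
HOURS_PER_MONTH = 24 * 30     # assumption: 30-day month, running 24/7


def monthly_cost(endpoint_count):
    """Monthly cost in USD for the given number of endpoints."""
    return endpoint_count * INSTANCES_PER_ENDPOINT * HOURLY_PRICE_USD * HOURS_PER_MONTH


separate_endpoints_cost = monthly_cost(10)   # one endpoint per model
multi_model_cost = monthly_cost(1)           # ten models behind one endpoint
```

This gives about $342.72 for the single multi-model endpoint and about $3,427 for ten separate endpoints; the slide's $3,430 appears to be ten times the rounded $343.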

Multi-model endpoints
Amazon SageMaker multi-model endpoint
• Mode: MultiModel
• Artifact location: s3://bucket/your-endpoint-models
• predict('nevada.tar.gz', features) → the endpoint loads nevada.tar.gz from Amazon S3, then predicts
Amazon S3 model storage: s3://bucket/your-endpoint-models/
• new_york.tar.gz
• florida.tar.gz
• texas.tar.gz
• nevada.tar.gz
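
The load-on-first-request behavior in this diagram can be imitated with a small in-memory cache. This is a simulation only: the dictionary stands in for the S3 prefix, the lambdas stand in for models, and evicting the least recently used model when the cache is full is an assumption about memory handling, not a description of SageMaker internals.

```python
from collections import OrderedDict


class MultiModelEndpointSketch:
    """Toy stand-in for a multi-model endpoint with a bounded model cache."""

    def __init__(self, model_store, max_loaded):
        self.model_store = model_store   # stands in for the S3 artifact location
        self.max_loaded = max_loaded     # how many models fit in memory
        self.loaded = OrderedDict()      # models currently in memory, oldest first
        self.load_count = 0

    def predict(self, target_model, features):
        if target_model in self.loaded:
            self.loaded.move_to_end(target_model)   # mark as recently used
        else:
            if len(self.loaded) >= self.max_loaded:
                self.loaded.popitem(last=False)     # evict least recently used
            self.loaded[target_model] = self.model_store[target_model]
            self.load_count += 1
        return self.loaded[target_model](features)


# Hypothetical "models": each just scales the sum of the features.
store = {
    "nevada.tar.gz": lambda f: sum(f) * 1.0,
    "texas.tar.gz": lambda f: sum(f) * 2.0,
    "florida.tar.gz": lambda f: sum(f) * 3.0,
}
endpoint = MultiModelEndpointSketch(store, max_loaded=2)
first = endpoint.predict("nevada.tar.gz", [1, 2])
second = endpoint.predict("nevada.tar.gz", [1, 2])   # served from memory
```

On the real service, the model is selected per request with the `TargetModel` parameter of the InvokeEndpoint call.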

Amazon SageMaker
End to End Training and Deployment
Define Estimator (object created) → fit() → deploy() (object created) → predict()

Manage Workflow for ML Lifecycle

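Challenges with creating a complete workflow for the ML lifecycle
1. Building, training, and deploying models is an iterative process
2. Taking a model from concept to production involves many steps
• Create standard code packages for each stage of the ML lifecycle
• Connect them into a structure called a workflow
• Manage dependencies between steps
• Run the workflow as an orchestrated sequence
3. Track artifacts for each step of the workflow
4. Deploy and manage the right model version across thousands of models
5. Automate and scale the entire workflow as part of MLOps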
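
Managing dependencies between steps and running them as an orchestrated sequence can be sketched with the standard library's `graphlib` (Python 3.9+). The step names and the placeholder artifacts below are hypothetical; a real pipeline service also handles retries, caching, and infrastructure.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical names).
steps = {
    "prepare_data": set(),
    "train": {"prepare_data"},
    "evaluate": {"train"},
    "register_model": {"evaluate"},
    "deploy": {"register_model"},
}


def run_workflow(steps):
    """Run steps in dependency order and record an artifact for each one."""
    order = list(TopologicalSorter(steps).static_order())
    artifacts = {}
    for name in order:
        inputs = sorted(artifacts[dep] for dep in steps[name])
        # Placeholder "work": the artifact just records the step and its inputs.
        artifacts[name] = f"{name}({', '.join(inputs)})"
    return order, artifacts


order, artifacts = run_workflow(steps)
```

Recording an artifact per step is a simple form of the lineage tracking listed in item 3.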

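Amazon SageMaker Pipelines
Build fully automated machine learning workflows at scale
• Create and manage ML workflows: build detailed workflows with an easy-to-use Python SDK and manage them visually
• Track model lineage for governance and audit: track the code, datasets, and versions for each stage of the ML lifecycle
• Replay and rerun workflows: rerun all steps on a custom schedule to keep models up to date
• Visually compare, select, and deploy models: deploy and manage models through the visual interface of SageMaker Studio
• Centralized ML model management with the Registry: use the model registry to select the best model for production deployment
• Fully managed MLOps with built-in CI/CD support: build fully automated machine learning workflows using CI/CD practices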

CI/CD Pipeline example (1)
1. Edit code & Git Add
2. Git Commit & Push
3. Automatic Pipelining

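CI/CD Pipeline example (2)
The UI makes it easy to compare performance across model versions, and changing the status enables one-click model deployment
• Compare metrics across model versions
• Approve or reject production deployment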
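
The approval step can be pictured as a filter over registered model versions. The sketch below uses the approval status values from the SageMaker model registry (`Approved`, `Rejected`, `PendingManualApproval`); the version records, the metric values, and the "deploy the latest approved version" policy are assumptions for illustration.

```python
def version_to_deploy(model_versions):
    """Return the newest approved model version, or None if nothing is approved."""
    approved = [v for v in model_versions if v["status"] == "Approved"]
    if not approved:
        return None
    return max(approved, key=lambda v: v["version"])


# Hypothetical registry contents.
model_versions = [
    {"version": 1, "status": "Approved", "auc": 0.81},
    {"version": 2, "status": "Rejected", "auc": 0.78},
    {"version": 3, "status": "PendingManualApproval", "auc": 0.84},
]
before_approval = version_to_deploy(model_versions)

# A reviewer compares metrics in the UI and flips the status of version 3.
model_versions[2]["status"] = "Approved"
after_approval = version_to_deploy(model_versions)
```

Changing the status is the only manual action; the deployment decision follows from it.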

Machine Learning Workflow

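• SageMaker Notebooks: build models and collaborate
• SageMaker Training Job: train and validate models
• SageMaker HPO: optimize models and tune across multiple algorithms
• SageMaker Endpoints: one-click deployment, model monitoring, and sustained high quality
• SageMaker Pipelines: build fully automated machine learning workflows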

If You Still Find Machine Learning Difficult…

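Amazon SageMaker Autopilot
An automatic model creation and management service built on model control and visibility, designed to overcome the shortcomings of conventional AutoML
• Quick to get started
• Automatic model creation: ML models are generated automatically through automatic model tuning
• Visibility and data control: notebook source code for each model
• Recommendations and optimization: get a leaderboard and keep improving the model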

How Amazon SageMaker Autopilot Works
https://github.com/aws/amazon-sagemaker-examples/tree/master/autopilot

Autopilot from SageMaker Studio

Use Amazon SageMaker Autopilot to automatically
train and tune the best machine learning models

Generates the code and notebooks for you
• Amazon SageMaker Autopilot Data Exploration Notebook
• Amazon SageMaker Autopilot Candidate Definition Notebook

Built-in Algorithms provided by Amazon SageMaker
• Classification: Linear Learner, XGBoost, KNN
• Regression: Linear Learner, XGBoost, KNN
• Working with Text: BlazingText (Supervised, Unsupervised*)
• Sequence Translation: Seq2Seq*
• Recommendation: Factorization Machines
• Forecasting: DeepAR
• Topic Modeling: LDA, NTM
• Clustering: KMeans
• Feature Reduction: PCA, Object2Vec
• Anomaly Detection: Random Cut Forests, IP Insights
• Computer Vision: Image Classification, Object Detection, Semantic Segmentation
https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html

Amazon SageMaker overview
SageMaker Studio: Integrated development environment (IDE) for ML
PREPARE
• SageMaker Ground Truth: Label training data for machine learning
• SageMaker Data Wrangler: Aggregate and prepare data for machine learning
• SageMaker Processing: Built-in Python, BYO R/Spark
• SageMaker Feature Store: Store, update, retrieve, and share features
• SageMaker Clarify: Detect bias and understand model predictions
BUILD
• SageMaker Studio Notebooks: Jupyter notebooks with elastic compute and sharing
• Built-in and Bring-your-own Algorithms: Dozens of optimized algorithms or bring your own
• Local Mode: Test and prototype on your local machine
• SageMaker Autopilot: Automatically create machine learning models with full visibility
• SageMaker JumpStart: Pre-built solutions for common use cases
TRAIN & TUNE
• Managed Training: Distributed infrastructure management
• SageMaker Experiments: Capture, organize, and compare every step
• Automatic Model Tuning: Hyperparameter optimization
• Distributed Training Libraries: Training for large datasets and models
• SageMaker Debugger: Debug and profile training runs
• Managed Spot Training: Reduce training cost by up to 90%
DEPLOY & MANAGE
• Managed Deployment: Fully managed, ultra low latency, high throughput
• Kubernetes & Kubeflow Integration: Simplify Kubernetes-based machine learning
• Multi-Model Endpoints: Reduce cost by hosting multiple models per instance
• SageMaker Model Monitor: Maintain accuracy of deployed models
• SageMaker Edge Manager: Manage and monitor models on edge devices
• SageMaker Pipelines: Workflow orchestration and automation

Machine learning development
Laptop Servers Cloud

Machine Learning in the cloud
SageMaker offers up to 96% lower TCO and 10x more developer productivity
• Build, Train, Deploy
• ML infrastructure
• Operations
• Security & Compliance

Source of cost-savings
Capability | Amazon SageMaker | Compared to self-managed Amazon EC2 | Compared to self-managed Kubernetes (EKS)
Provision & manage instances | Fully managed | Self-managed | Managed by AWS
Manage security & compliance | Built-in | Self-managed | Self-managed
Infrastructure performance optimization | Scales automatically | Self-managed | Self-managed
Infrastructure management for high-availability | Optimizes automatically | Self-managed | Self-managed

Getting started with
• SageMaker Immersion Day Workshop ✯✯✯
• SageMaker Examples (100+) ✯✯✯
• SageMaker Workshop (Korean)
• Amazon SageMaker Overview (2020-03-25)
• [Video] Amazon SageMaker Overview (2020-03-25)
• [Video] Amazon SageMaker Demo (2020-03-25) ✯✯✯
• AI/ML Resources - videos, presentation materials, and more

Put machine learning in the
hands of every developer
Our mission at
