9. AI Frameworks
AWS Deep Learning AMI
• Easy-to-launch tutorials
• Hassle-free setup and configuration
• Pay only for what you use
• Accelerate your model training and deployment
• Support for popular deep learning frameworks
15. Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale, from idea to production.
20. ML Hosting Service
Amazon ECR · Model Artifacts
Different versions of the same inference code live in ECR. Prod is the master container; it will serve over 50% of requests.
21. ML Hosting Service
Create a Model (ModelName: prod) from the inference image in Amazon ECR and the model artifacts.
22. ML Hosting Service
Create versions of a Model (Model Versions) from the inference images in Amazon ECR and the model artifacts.
23. ML Hosting Service
Create weighted ProductionVariants (variant weights 50 / 30 / 10 / 10), e.g.:
ProductionVariant
InstanceType: c3.4xlarge
MinInstanceCount: 5
MaxInstanceCount: 20
ModelName: prod
VariantName: prodPrimary
VariantWeight: 50
24. ML Hosting Service
Create an EndpointConfiguration from one or many ProductionVariants, e.g.:
ProductionVariant
InstanceType: c3.4xlarge
MinInstanceCount: 5
MaxInstanceCount: 20
ModelName: prod
VariantName: prodPrimary
VariantWeight: 50
25. ML Hosting Service
Create an Endpoint (the Inference Endpoint that serves prediction requests) from one EndpointConfiguration:
ProductionVariant
InstanceType: c3.4xlarge
MinInstanceCount: 5
MaxInstanceCount: 20
ModelName: prod
VariantName: prodPrimary
VariantWeight: 50
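The hosting sequence above (Create a Model, weighted ProductionVariants, an EndpointConfiguration, then an Endpoint) can be sketched as the request payloads those API calls take. This is a minimal sketch, not the deck's code: all names (prod, prodPrimary, image URIs, S3 paths) are placeholders, the instance type is illustrative, and the actual boto3 calls are shown in a comment rather than executed.

```python
# Sketch of the SageMaker hosting objects described in the slides.
# All names, URIs, and instance types are placeholder assumptions.

def make_model(name, image_uri, artifact_s3_uri):
    """Payload for creating a Model: inference image + artifacts."""
    return {
        "ModelName": name,
        "PrimaryContainer": {
            "Image": image_uri,               # inference code in ECR
            "ModelDataUrl": artifact_s3_uri,  # model artifacts in S3
        },
    }

def make_variant(model_name, variant_name, weight,
                 instance_type="ml.c4.xlarge", initial_count=5):
    """One weighted ProductionVariant."""
    return {
        "ModelName": model_name,
        "VariantName": variant_name,
        "InitialVariantWeight": weight,
        "InstanceType": instance_type,
        "InitialInstanceCount": initial_count,
    }

def traffic_share(variants, variant_name):
    """Fraction of requests a variant gets: its weight / sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    target = next(v for v in variants if v["VariantName"] == variant_name)
    return target["InitialVariantWeight"] / total

# Weighted variants matching the slide's 50 / 30 / 10 / 10 split.
variants = [
    make_variant("prod", "prodPrimary", 50),
    make_variant("candidate-a", "variantA", 30),
    make_variant("candidate-b", "variantB", 10),
    make_variant("candidate-c", "variantC", 10),
]
endpoint_config = {
    "EndpointConfigName": "prod-config",
    "ProductionVariants": variants,
}

# In real code these dicts feed boto3:
#   sm = boto3.client("sagemaker")
#   sm.create_model(**make_model(...))
#   sm.create_endpoint_config(**endpoint_config)
#   sm.create_endpoint(EndpointName="prod",
#                      EndpointConfigName="prod-config")

print(traffic_share(variants, "prodPrimary"))  # prod serves half the traffic
```

With weights 50/30/10/10 summing to 100, the prod variant's share is 50/100, i.e. the "over 50% of requests" the slide note describes (exactly 50% here).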
26. • Training algorithm / inference code is packaged in a Docker image on ECR
• SageMaker pulls the training algorithm image from ECR into the Model Training Service
• Amazon SageMaker downloads or streams the training data and runs training
• After training, Amazon SageMaker uploads the model artifacts to Amazon S3
• For inference, Amazon SageMaker pulls the model artifacts from S3 and the inference image from ECR into the Model Hosting Service
• Amazon SageMaker exposes an inference endpoint for client applications to send prediction requests
• Ground truth data collected from clients can be sent to the training bucket to retrain and update the model
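The last steps above have a client sending prediction requests to the inference endpoint. A minimal sketch of the request side, assuming a container that accepts text/csv input; the endpoint name is a placeholder and the live boto3 invocation is left as a comment since it needs AWS credentials and a deployed endpoint:

```python
import io

def to_csv_payload(rows):
    """Serialize feature rows into the text/csv request body that
    many SageMaker inference containers accept."""
    buf = io.StringIO()
    for row in rows:
        buf.write(",".join(str(x) for x in row) + "\n")
    return buf.getvalue()

# One example feature vector (values are illustrative).
payload = to_csv_payload([[5.1, 3.5, 1.4, 0.2]])

# Real invocation (requires credentials and a live endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName="prod",        # placeholder endpoint name
#     ContentType="text/csv",
#     Body=payload,
# )
# prediction = response["Body"].read()

print(payload)
```

Note that training and deployment use the `sagemaker` control-plane client, while prediction requests go through the separate `sagemaker-runtime` client, matching the split between the training/hosting services and the inference endpoint in the slides.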