1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Randall Hunt – Some Guy From Los Angeles
@WhereML: a Serverless AI-Powered Location-Guessing Twitter Bot
Built with Amazon SageMaker and AWS Lambda
Based on LocationNet work by Jaeyoung Choi and Kevin Li
2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
About Me
• Technical Evangelist at AWS
• I build some demos: https://github.com/ranman
• I write some blogs: https://aws.amazon.com/blogs/aws/author/randhunt/
• Formerly of SpaceX, NASA, MongoDB
• I like Python
• I dislike JavaScript
• I look ridiculous in my badge photo
3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Example
Try it! Tweet to @WhereML with a picture.
Hold your cell phone camera to the screen.
4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Architecture
[Diagram: Amazon API Gateway, AWS Lambda function, and an Amazon SageMaker inference endpoint built from model artifacts and inference code stored in Amazon ECR]
5. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Solving Some Of The Hardest Problems In Computer Science
• Learning
• Language
• Perception
• Problem Solving
• Reasoning
10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ML @ AWS: Our mission
Put machine learning in the hands of every developer and data scientist
11. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Customers Running ML on AWS Today
12. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Reviewing The ML Process
16. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The Machine Learning Process
[Flow diagram] Business Problem → ML problem framing → Data Collection → Data Integration → Data Preparation & Cleaning → Data Visualization & Analysis → Feature Engineering → Model Training & Parameter Tuning → Model Evaluation → Are Business Goals met?
• Yes: Model Deployment → Monitoring & Debugging → Predictions
• No: Data Augmentation, Feature Augmentation
• Re-training
17. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Discovery: The Analysts
[Flow diagram: the machine learning process from slide 16]
• Help formulate the right questions
• Domain knowledge
18. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Integration: The Data Architecture
[Flow diagram: the machine learning process from slide 16]
• Build the data platform:
  • Amazon S3
  • AWS Glue
  • Amazon Athena
  • Amazon EMR
  • Amazon Redshift Spectrum
19. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Why We Built Amazon SageMaker: The Model Training Undifferentiated Heavy Lifting
[Highlighted stages: Data Visualization & Analysis, Feature Engineering, Model Training & Parameter Tuning, Model Evaluation]
• Set up and manage notebook environments
• Set up and manage training clusters
• Write data connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithms to multiple machines
• Secure model artifacts
20. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Why We Built Amazon SageMaker: The Model Deployment Undifferentiated Heavy Lifting
[Highlighted stages: Model Deployment, Monitoring & Debugging, Predictions]
• Set up and manage model inference clusters
• Manage and scale model inference APIs
• Monitor and debug model predictions
• Model versioning and performance tracking
• Automate promotion of new model versions to production (A/B testing)
21. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker
A fully managed service that enables data scientists and developers to quickly and easily build machine-learning-based models into production smart applications.
22. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker
1. Notebook Instances
2. Algorithms
3. ML Training Service
4. ML Hosting Service
23. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
1. Notebook Instances
Zero Setup For Exploratory Data Analysis
• Authoring & Notebooks
• ETL access to AWS database services
• Access to S3 data lake
Use cases:
• Recommendations/Personalization
• Fraud Detection
• Forecasting
• Image Classification
• Churn Prediction
• Marketing Email/Campaign Targeting
• Log processing and anomaly detection
• Speech to Text
• More…
“Just add data”
24. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker: 10x better algorithms
2. Algorithms
• Streaming datasets, for cheaper training
• Train faster, in a single pass
• Greater reliability on extremely large datasets
• Choice of several ML algorithms
25. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
[Chart: Cost vs. Time. Cost axis from $ to $$$$; time axis in Minutes, Hours, Days, Weeks, Months; two series: Single Machine, and Distributed, with Strong Machines]
26. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Infinitely Scalable ML Algorithms
27. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker: 10x better algorithms
2. Algorithms
Training code options:
• Amazon-provided algorithms: Matrix Factorization, Regression, Principal Component Analysis, K-Means Clustering, Gradient Boosted Trees, and more
• Bring Your Own Script (SageMaker builds the container)
• SageMaker Estimators in Apache Spark
• Bring Your Own Algorithm (you build the container)
28. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Managed Distributed Training with Flexibility
3. ML Training Service
Training code options:
• Amazon-provided algorithms: Matrix Factorization, Regression, Principal Component Analysis, K-Means Clustering, Gradient Boosted Trees, and more
• Bring Your Own Script (SageMaker builds the container)
• SageMaker Estimators in Apache Spark
• Bring Your Own Algorithm (you build the container)
[Diagram: the fully managed, secured training service fetches training data, saves model artifacts, and saves the inference image to Amazon ECR; options shown: CPU, GPU, HPO]
29. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
[Diagram: Amazon ECR and Amazon SageMaker]
Versions of the same inference code are saved in inference containers. Prod is the primary one; 50% of the traffic must be served there!
30. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
Step: Create a Model (ModelName: prod)
[Diagram: the inference image in Amazon ECR and the model artifacts are linked into a model in Amazon SageMaker]
31. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
Step: Create versions of a Model
[Diagram: inference images in Amazon ECR and model artifacts form several model versions in Amazon SageMaker]
32. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
Step: Create weighted ProductionVariants
[Diagram: model versions (inference image in Amazon ECR plus model artifacts) become four ProductionVariants with weights 30, 50, 10, 10]
ProductionVariant:
  InstanceType: c3.4xlarge
  InitialInstanceCount: 3
  ModelName: prod
  VariantName: primary
  InitialVariantWeight: 50
33. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
Step: Create an EndpointConfiguration from one or many ProductionVariant(s)
[Diagram: four ProductionVariants with weights 30, 50, 10, 10 grouped into one EndpointConfiguration]
ProductionVariant:
  InstanceType: c3.4xlarge
  InitialInstanceCount: 3
  ModelName: prod
  VariantName: primary
  InitialVariantWeight: 50
34. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
Step: Create an Endpoint from one EndpointConfiguration
[Diagram: ProductionVariants with weights 30, 50, 10, 10 → EndpointConfiguration → Inference Endpoint]
ProductionVariant:
  InstanceType: c3.4xlarge
  InitialInstanceCount: 3
  ModelName: prod
  VariantName: primary
  InitialVariantWeight: 50
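The Model, ProductionVariant, EndpointConfiguration, Endpoint chain on these slides can be sketched as three request payloads. This is a minimal sketch, not the deck's own code: the parameter names follow the boto3 SageMaker client (`create_model`, `create_endpoint_config`, `create_endpoint`), while the image URI, S3 path, role ARN, the `-config` and `-endpoint` name suffixes, and the default instance type are hypothetical placeholders.

```python
def build_deployment_requests(model_name, image_uri, model_data_url, role_arn,
                              instance_type="ml.c4.xlarge"):
    """Build keyword arguments for the three SageMaker calls, in order."""
    variant = {
        "VariantName": "primary",
        "ModelName": model_name,
        "InstanceType": instance_type,  # placeholder; the slide shows c3.4xlarge
        "InitialInstanceCount": 3,
        "InitialVariantWeight": 50,
    }
    return {
        # A model links the inference image to the model artifacts.
        "create_model": {
            "ModelName": model_name,
            "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
            "ExecutionRoleArn": role_arn,
        },
        # One or many ProductionVariants make up an endpoint configuration.
        "create_endpoint_config": {
            "EndpointConfigName": model_name + "-config",
            "ProductionVariants": [variant],
        },
        # The endpoint is created from exactly one endpoint configuration.
        "create_endpoint": {
            "EndpointName": model_name + "-endpoint",
            "EndpointConfigName": model_name + "-config",
        },
    }


def deploy(reqs):
    """Send the requests to SageMaker. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above runs without boto3
    client = boto3.client("sagemaker")
    for call in ("create_model", "create_endpoint_config", "create_endpoint"):
        getattr(client, call)(**reqs[call])


reqs = build_deployment_requests(
    model_name="prod",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/whereml:latest",  # hypothetical
    model_data_url="s3://example-bucket/whereml/model.tar.gz",                # hypothetical
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",           # hypothetical
)
```

Keeping the payload builder separate from `deploy` lets the configuration be inspected or unit-tested without touching an AWS account.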
35. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
One-Click! (Amazon Provided Algorithms)
[Diagram: ProductionVariants with weights 30, 50, 10, 10 → EndpointConfiguration → Inference Endpoint]
ProductionVariant:
  InstanceType: c3.4xlarge
  InitialInstanceCount: 3
  ModelName: prod
  VariantName: primary
  InitialVariantWeight: 50
36. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Easy Model Deployment to Amazon SageMaker
4. ML Hosting Service
• Auto-Scaling Inference APIs
• A/B Testing (more to come)
• Low Latency & High Throughput
• Bring Your Own Model
• Python SDK
37. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Building the Model
38. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Credits
• This model, LocationNet, was built by Jaeyoung Choi of the International Computer Science Institute and Kevin Li of the University of California, Berkeley
• Supported by the AWS Cloud Credits for Research Program
• Based on work on PlaNet by Weyand et al.
Ref: arxiv.org/abs/1602.05314 : PlaNet - Photo Geolocation with Convolutional Neural Networks
39. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
LocationNet Model Approach
• Model trained and built with Apache MXNet
• Trained with 33.9 million geo-tagged images from the AWS Multimedia Commons Dataset for 12 epochs over 9 days using a single p2.16xlarge
• Uses Google’s S2 Spherical Geometry library to subdivide the earth into 15,527 multi-scale geographic cells, which serve as the classes for the data
• Built on the ResNet-101 architecture
40. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Model Architecture
41. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Example S2 Multi-scale partitioning
Ref: arxiv.org/abs/1602.05314 : PlaNet - Photo Geolocation with Convolutional Neural Networks
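To give a feel for multi-scale partitioning, here is a simplified sketch that recursively splits a flat latitude/longitude box into four until each cell holds at most a few photos. It is not the S2 library and not the LocationNet code: S2 works on a sphere projected onto cube faces, whereas this uses a plain rectangle. The coordinates, `max_points`, and `max_depth` values are hypothetical.

```python
def partition(points, bounds=(-90.0, 90.0, -180.0, 180.0), max_points=2, max_depth=8):
    """Split a lat/lng box into quadrants until each leaf holds <= max_points photos.

    Returns a list of (bounds, photo_count) leaves; empty cells are dropped.
    Boxes are half-open, so a point on the upper edge belongs to the next cell.
    """
    lat_lo, lat_hi, lng_lo, lng_hi = bounds
    inside = [(lat, lng) for lat, lng in points
              if lat_lo <= lat < lat_hi and lng_lo <= lng < lng_hi]
    if not inside:
        return []
    if len(inside) <= max_points or max_depth == 0:
        return [(bounds, len(inside))]
    lat_mid = (lat_lo + lat_hi) / 2
    lng_mid = (lng_lo + lng_hi) / 2
    leaves = []
    for child in ((lat_lo, lat_mid, lng_lo, lng_mid), (lat_lo, lat_mid, lng_mid, lng_hi),
                  (lat_mid, lat_hi, lng_lo, lng_mid), (lat_mid, lat_hi, lng_mid, lng_hi)):
        leaves.extend(partition(inside, child, max_points, max_depth - 1))
    return leaves


# Hypothetical photo locations: four in California, one in Paris, one in Sydney.
photos = [(34.05, -118.24), (32.72, -117.16), (37.77, -122.42),
          (38.58, -121.49), (48.85, 2.35), (-33.87, 151.21)]
cells = partition(photos)
for cell_bounds, count in cells:
    print(cell_bounds, count)
```

Densely photographed regions end up with small cells and sparse regions with large ones, which matches the pattern noted in the pros and cons: precise where there are many partitions, coarse where there are few.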
43. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Pros / Cons of this approach
Pros:
• Surprisingly precise in cities with larger numbers of partitions
• Fast inference (<100ms on t2.large)
• Excellent performance for unique objects / landmarks
• Small model (<300 MB) can be deployed anywhere
Cons:
• Surprisingly inaccurate in locales with fewer partitions
• Poor performance for common objects / terrain
44. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
AWS Infrastructure for WhereML
45. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Architecture
1. Twitter webhook calls the API Gateway endpoint
2. API Gateway invokes the Lambda function with the payload from Twitter
3. Lambda function calls the SageMaker inference endpoint with the URL of the image
4. Inference endpoint downloads the image and classifies it with LocationNet
5. Lambda posts the results back to Twitter
[Diagram: Amazon API Gateway → AWS Lambda function → SageMaker inference endpoint]
46. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Amazon API Gateway
47. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
AWS Lambda Function
• Proxy invocation from API Gateway sends the entire request to the Lambda function
• AWS Lambda function:
1. Parses the incoming request
2. Verifies it is from Twitter and verifies message integrity
3. Parses the tweet
4. Sends the media URL in the tweet to the SageMaker endpoint
5. Uses the Twitter API to respond to the original tweet
• Billed per GB-second; 400,000 GB-seconds per month in the perpetual free tier
• Any scale of requests, but the SageMaker endpoint is limited to 10,000 TPS
• Python 🐍
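As a worked example of GB-second billing, the sketch below converts a function's memory size and run time into GB-seconds and compares it with the 400,000 GB-second free tier quoted on the slide. The 512 MB memory size and 1-second duration are hypothetical, not measurements of the WhereML function, and the sketch ignores request charges and any rounding of billed duration.

```python
FREE_TIER_GB_SECONDS = 400_000  # figure quoted on the slide


def gb_seconds(memory_mb, duration_ms, invocations=1):
    """Compute usage: memory in GB times run time in seconds, per invocation."""
    return (memory_mb / 1024) * (duration_ms / 1000) * invocations


# Hypothetical function: 512 MB of memory, 1000 ms per invocation.
per_call = gb_seconds(memory_mb=512, duration_ms=1000)
free_calls = FREE_TIER_GB_SECONDS / per_call
print(per_call, free_calls)
```

Under those assumed numbers each invocation uses 0.5 GB-seconds, so the free tier would cover 800,000 invocations.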
48. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Lambda Function – Verify Request
def lambda_handler(event, context):
    # deal with bad requests
    if event.get('path') != WEBHOOK_PATH:
        return {'statusCode': 404, 'body': ''}
    # deal with subscription calls
    if event.get('httpMethod') == 'GET':
        # queryStringParameters can be None when the request has no query string
        crc = (event.get('queryStringParameters') or {}).get('crc_token')
        if not crc:
            return {'statusCode': 401, 'body': 'bad crc'}
        return {'statusCode': 200, 'body': sign_crc(crc)}
    # deal with bad crc
    if not verify_request(event, context):
        return {'statusCode': 400, 'body': 'bad crc'}
49. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Lambda Function – Verify Request Utilities
import hmac
import json
from base64 import b64decode, b64encode
from hashlib import sha256

def sign_crc(crc):
    # answer Twitter's challenge with an HMAC-SHA256 of the crc_token
    h = hmac.new(
        bytes(CONSUMER_SECRET, 'ascii'), bytes(crc, 'ascii'),
        digestmod=sha256)
    return json.dumps({
        "response_token": "sha256=" + b64encode(h.digest()).decode()
    })

def verify_request(event, context):
    crc = event['headers']['X-Twitter-Webhooks-Signature']
    # recompute the HMAC-SHA256 of the raw request body
    h = hmac.new(
        bytes(CONSUMER_SECRET, 'ascii'),
        bytes(event['body'], 'utf-8'),
        digestmod=sha256)
    crc = b64decode(crc[7:])  # strip out the first 7 characters ("sha256=")
    return hmac.compare_digest(h.digest(), crc)
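The signature check can be tried locally without Twitter or Lambda. The following is a minimal, self-contained sketch of the same HMAC-SHA256 scheme: it signs a body, verifies it, and shows that a tampered body fails. The secret, the helper names `sign` and `is_authentic`, and the sample payload are all hypothetical.

```python
import hmac
from base64 import b64encode
from hashlib import sha256

SECRET = "not-a-real-consumer-secret"  # hypothetical; never hard-code a real secret


def sign(secret, message):
    """Return 'sha256=' plus the base64 HMAC-SHA256 digest of the message."""
    digest = hmac.new(secret.encode("ascii"), message.encode("utf-8"), sha256).digest()
    return "sha256=" + b64encode(digest).decode()


def is_authentic(secret, body, signature_header):
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(secret, body), signature_header)


body = '{"tweet_create_events": []}'  # hypothetical webhook payload
header = sign(SECRET, body)

print(is_authentic(SECRET, body, header))
print(is_authentic(SECRET, body + " ", header))
```

Using `hmac.compare_digest` rather than `==` avoids leaking, through timing, how many leading characters of a forged signature were correct.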
50. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Lambda Function – SageMaker and Twitter
def lambda_handler(event, context):
    # (the request checks from the previous slides run first)
    # we're good! load that event up
    twitter_events = json.loads(event['body'])
    for tweet in twitter_events.get('tweet_create_events', []):
        if validate_record(tweet):
            # assumption: the slide never defines `media`; here it is taken
            # from the first media entity attached to the tweet
            media = tweet['entities']['media'][0]['media_url_https']
            body = json.dumps({'url': media, 'max_predictions': MAX_PREDICTIONS})
            # read the streaming response body first, then parse the JSON
            response = sagemaker.invoke_endpoint(EndpointName=ENDPOINT_NAME, Body=body)
            results = json.loads(response['Body'].read())
            status = build_tweet(results)
            twitter_api.PostUpdate(
                "📍 \n" + status[0],
                media=status[1],
                in_reply_to_status_id=tweet['id_str'],
                auto_populate_reply_metadata=True
            )
51. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Architecture
• Docker container stored in ECR
• Autoscaling endpoint spins up containers as needed, automatically fetches model artifacts from S3, and puts them in /opt/ml/model
• Flask app responds to /ping and /invocations
[Diagram: Amazon SageMaker inference endpoint running the inference code image from Amazon ECR together with the model artifacts]
52. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Inference Code – Load Model
from collections import namedtuple
import mxnet as mx
import numpy as np

# load the symbol and parameters saved at epoch 12
sym, arg_params, aux_params = mx.model.load_checkpoint(MODEL_NAME, 12)
mod = mx.mod.Module(symbol=sym, context=mx.cpu())
# bind for inference on a single 3x224x224 image
mod.bind([('data', (1, 3, 224, 224))], for_training=False)
mod.set_params(arg_params, aux_params, allow_missing=True)
Batch = namedtuple('Batch', ['data'])
grids = load_grids()
53. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Inference Code – Predict
def predict(img, max_predictions):
    mod.forward(Batch(img), is_train=False)
    prob = mod.get_outputs()[0].asnumpy()[0]
    # class indices sorted by descending probability
    pred = np.argsort(prob)[::-1]
    result = []
    for i in range(max_predictions):
        idx = int(pred[i])
        pred_loc = grids[idx]
        # plain float so the result can be serialized as JSON
        result.append((pred_loc, float(prob[idx])))
    return result

def download_and_predict(url, max_predictions=3):
    img = preprocess_image(download_image(url))
    return predict(img, max_predictions)
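The ranking step in `predict` can be illustrated without MXNet or NumPy. This is a small sketch of picking the top-k classes from a probability vector; the cell labels and probabilities are made up and merely stand in for the S2 grid cells and the network output.

```python
def top_k(probabilities, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(labels[i], probabilities[i]) for i in ranked[:k]]


# Hypothetical model output over four grid cells.
labels = ["cell-a", "cell-b", "cell-c", "cell-d"]
probabilities = [0.05, 0.60, 0.10, 0.25]

best = top_k(probabilities, labels, k=2)
print(best)
```

Each returned pair keeps the probability of the selected class itself, which is the value the slide's `prob[]` expression needs to index.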
54. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Inference Code – Flask App
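from flask import request, jsonify, Flask
import predict
app = Flask("WhereML")

@app.route("/ping")
def ping():
    return "", 200

@app.route("/invocations", methods=["POST"])
def invoke():
    data = request.get_json(force=True)
    # honor the max_predictions value sent by the Lambda function
    return jsonify(predict.download_and_predict(
        data['url'], data.get('max_predictions', 3)))

# assumption: the Dockerfile runs `python app.py`, so start the server here
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)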
55. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Inference Code – Docker File
FROM mxnet/python:latest
WORKDIR /app
RUN pip install -U flask scikit-image numpy reverse_geocoder boto3
COPY *.py /app/
COPY grids.txt /app/
ENTRYPOINT ["python", "app.py"]
EXPOSE 8080
56. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Working with Twitter
57. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Twitter API
• Two ways to respond to mentions: User Streams API (deprecated) and Account Activity API (beta)
• Account Activity webhooks allow a fully “serverless” approach
• User Streams API requires a running container/instance to poll for updates
• User Streams API is going away in June 2018… even though its replacement is not GA…
58. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Registering Webhook
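# assumption: OAuth1Session comes from the requests_oauthlib package
from requests_oauthlib import OAuth1Session

twitter = OAuth1Session(**keys)
base_url = "https://api.twitter.com/1.1/all/env-beta/"
params = {'url': "https://mywebsite.com/twitter/whereml"}
# register the webhook URL and keep the ID Twitter assigns to it
webhook_id = twitter.post(base_url + "webhooks.json", data=params).json()['id']
# pass webhook ID in prod, not needed in beta
twitter.post(base_url + "subscriptions.json")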
59. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
SageMaker Example End-to-End Architecture
[Diagram: stages Build, Train, Deploy. Components: SageMaker Notebooks, training algorithm, SageMaker Training, Amazon ECR, Code Commit, Code Pipeline, SageMaker Hosting, dataset, AWS Lambda, API Gateway, Amazon S3, Amazon CloudFront. Labels: static website hosted on S3, web assets on CloudFront, inference requests]
60. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Easy to get started!
Tons of tools
Machine Learning is FUN
61. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Built live on twitch.tv/aws
62. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. @jrhunt
Thank you!
randhunt@amazon.com
Editor's Notes
When did it all start? Well, that was approximately 60 years ago!
In 1957, Frank Rosenblatt built an electro-mechanical neural network, the Perceptron, which he trained to recognize images (20x20 “pixels”).
In 1975, Paul Werbos published an article describing “backpropagation”, an algorithm allowing better and faster training of neural networks.
So, if neural networks have been around for so long, what's wrong with them?
Why do we talk about them today?
Amazon Robotics was founded in 2003 on the notion that in order to meet consumer demands in eCommerce, a better approach to order fulfillment solutions was necessary. Amazon Robotics empowers a smarter, faster, more consistent customer experience through automation.
It automates fulfillment center operations using various methods of robotic technology, including autonomous mobile robots, sophisticated control software, language perception, power management, computer vision, depth sensing, machine learning, object recognition, and semantic understanding of commands.
Amazon Prime Air is a service that will deliver packages up to five pounds in 30 minutes or less using small drones and relies extensively on visual object recognition.
We have Prime Air development centers in the United States, the United Kingdom, Austria, France and Israel.
Amazon Go is a new kind of store with no checkout required. We created the world’s most advanced shopping technology so you never have to wait in line. With our Just Walk Out Shopping experience, simply use the Amazon Go app to enter the store, take the products you want, and go! No lines, no checkout. (No, seriously.)
No lines, no checkout
Our checkout-free shopping experience is made possible by the same types of technologies used in self-driving cars: computer vision, sensor fusion, and deep learning. Our Just Walk Out Technology automatically detects when products are taken from or returned to the shelves and keeps track of them in a virtual cart. When you’re done shopping, you can just leave the store. Shortly after, we’ll charge your Amazon account and send you a receipt.
We’re excited to have a broad set of customers running ML on AWS today.
[ADD: Condé Nast – using P2 and TF to detect handbags https://technology.condenast.com/story/handbag-brand-and-color-detection]
The Data platform
Highly-optimized Machine Learning Algorithms
Amazon SageMaker installs high-performance, scalable machine learning algorithms optimized for speed, scale, and accuracy, to run on petabytes of training datasets. Based on the type of learning that you are undertaking, you can choose from supervised algorithms, such as linear/logistic regression or classification; as well as unsupervised learning, such as with k-means clustering.
Linear Classification and Regression
Factorization Machines
K-Means Clustering
Principal Components Analysis (PCA)
Latent Dirichlet Allocation (Spectral LDA)
Neural Topic Modeling
Sequence2Sequence
Gradient Boosted Trees (XGBoost)
An Amazon Resource (has an ARN)
Links the model artifacts (weights) to the inference container (predictions code)
A model can link multiple inference containers and associated model artifacts
A model is used to create a ProductionVariant
One or many ProductionVariants constitute an EndpointConfiguration
An Endpoint Configuration is used to create an Endpoint
Each version has its inference container on ECR and model artifacts on S3
Versions of a model can be used to create multiple ProductionVariants with different weights
The ProductionVariants can be used to create an EndpointConfiguration for the versions of the model
The EndpointConfiguration defines the Endpoint
A ProductionVariant is analogous to an Auto-Scaling Group for a specific SageMaker model. The associated VariantWeight determines the portion of traffic handled by the ProductionVariant.
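The relationship between variant weights and traffic can be checked with a few lines of arithmetic. In this sketch each variant's share is its weight divided by the sum of all weights. The weights 30, 50, 10, 10 come from the slide diagram; only the `primary` variant is named in the deck, so the other variant names are hypothetical.

```python
def traffic_shares(variant_weights):
    """Return each variant's fraction of the endpoint's traffic."""
    total = sum(variant_weights.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in variant_weights.items()}


# Weights from the slide diagram; names other than "primary" are hypothetical.
weights = {"primary": 50, "variant-b": 30, "variant-c": 10, "variant-d": 10}
shares = traffic_shares(weights)
print(shares)
```

Because only the ratio matters, weights of 5, 3, 1, 1 would route traffic the same way.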
The Endpoint serves the inference traffic with one or multiple auto-scaling groups (production variants) of model versions. Prod gets 50% of the traffic because its weight contributes 50% of the sum of all VariantWeights in the EndpointConfiguration.
S2 cells are a hierarchical decomposition of the sphere
Compact, represented by a 64-bit integer
Similar levels represent similar sizes of area
Containment queries for arbitrary regions are really fast
Projects points and regions of the sphere onto a cube and treats each cube face as a quad-tree into which the sphere point is projected. The space is then discretized and the cells are enumerated along a Hilbert curve.
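The locality property of a Hilbert curve can be seen on a small square grid. The sketch below uses the classic iterative mapping from an (x, y) cell to its position along the curve. It illustrates the idea only and is not the S2 implementation, which applies the curve on each cube face; the function name `xy_to_hilbert` and the grid size are hypothetical choices.

```python
def xy_to_hilbert(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


n = 8
# Visit every cell in curve order and confirm that each step moves to a neighbor.
order = sorted((xy_to_hilbert(n, x, y), x, y) for x in range(n) for y in range(n))
steps = [abs(x1 - x2) + abs(y1 - y2)
         for (_, x1, y1), (_, x2, y2) in zip(order, order[1:])]
print(order[:4], set(steps))
```

Every consecutive pair of indices lands on adjacent cells, so ranges of the one-dimensional index correspond to compact areas of the grid.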
A Hilbert curve is a space-filling curve that converts multiple dimensions into one dimension while preserving locality.
Invoke takes three invocation types: Event, RequestResponse, and DryRun (to just verify that you have permissions).
We can put this in a CloudFormation custom resource.