AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Marshall Tappen and Ernesto Gonzalez
Amazon Fulfillment Technologies
November 30, 2016
MAC301
Transforming Industrial
Processes with Deep Learning

What to Expect from the Session
• Description of how Amazon Fulfillment Technologies has
used computer vision to improve our processes.
• Walk through how we combined deep learning and
traditional computer vision to automate an industrial
process.
• What challenges and opportunities do deep learning
classifiers create?

Overview of fulfillment process

One thing you have to understand about
fulfillment centers
Bins can hold anything

Misplaced inventory “disappears”
Associate rearranged inventory when picking items.

Misplaced inventory “disappears”
We call this an inventory defect.

Our solution: use computer vision to locate
inventory defects

First step: get a physical system to capture
images
[Figure labels: station, outbound frame, inbound frame, totes and conveyance]

Capture set of images as pod arrives at
the station
[Figure labels: arrival image tower, departure image tower, station]

Associate interacts with pod
[Figure labels: arrival image tower, departure image tower, station]

Photographed again as pod leaves
[Figure labels: arrival image tower, departure image tower, station]

General strategy
• We want to take advantage of deep learning.
• The cameras capture images of an entire pod, but we
need data at the bin level.
• We will have a two-step process:
1. Extracting bins from images
2. Analyzing bin images

Computer vision step 1: pod image to bin
images

No problem, use 2-D barcodes!

No problem, use 2-D barcodes!
Bands block the barcodes

Solution: if we can detect the trays

And we can detect the sides

We have a set of points to match with a recipe of the
pod’s geometry

Map the coordinate system of the database to
the face of the pod in the image

Detecting the side of a pod: downsample image
and convert to grayscale
2046 × 2046 image → 512 × 512 image
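
A minimal sketch of this preprocessing step, assuming standard luma weights for the grayscale conversion and block averaging for the downsample (the talk does not say which methods were used). The function names and the synthetic input are illustrative only; in OpenCV the same step would typically be done with `cv2.cvtColor` and `cv2.resize`.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to H x W grayscale (ITU-R BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

def downsample(gray, factor):
    """Shrink a 2-D image by averaging non-overlapping factor x factor blocks."""
    h, w = gray.shape
    # Drop edge pixels that do not fill a whole block.
    h_crop, w_crop = h - h % factor, w - w % factor
    blocks = gray[:h_crop, :w_crop].reshape(
        h_crop // factor, factor, w_crop // factor, factor
    )
    return blocks.mean(axis=(1, 3))

# Synthetic stand-in for a pod photo (assumption: real images are far larger).
rng = np.random.default_rng(0)
pod_rgb = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
small = downsample(to_grayscale(pod_rgb), factor=4)
print(small.shape)  # (16, 16)
```

Working on the smaller grayscale image keeps the template correlation that follows cheap, at the cost of some spatial precision.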

Correlate* with left rail template
Filter
* In practice, we use normalized cross-correlation
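
Normalized cross-correlation can be sketched in plain NumPy as below. This is an illustrative, unoptimized version of the zero-mean variant; the function name and the synthetic image and template are assumptions, not the production code. With OpenCV, `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score map much faster.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Slide template over image and return zero-mean NCC scores in [-1, 1]."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            # Flat patches have no contrast to correlate with; score them 0.
            scores[y, x] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores

# Synthetic check: paste a brighter, higher-contrast copy of the template.
rng = np.random.default_rng(1)
image = rng.random((20, 20))
template = rng.random((4, 6))
image[5:9, 7:13] = 2.0 * template + 10.0
scores = normalized_cross_correlation(image, template)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print(peak)  # the peak should land where the copy was pasted, at row 5, column 7
```

Because each patch is mean-subtracted and scaled by its own energy, the pasted copy still scores close to 1 even though it is brighter than the template. That invariance is the practical reason to prefer the normalized form over raw correlation when illumination varies.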

Threshold

Fit a line (similar process for the other side)
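
The threshold and line-fit steps could look like the sketch below. It assumes an ordinary least-squares fit of x as a function of y (a pod rail is close to vertical, so fitting y as a function of x would be ill-conditioned); the talk does not say which fitting method was used, and `fit_rail_line` is a hypothetical name.

```python
import numpy as np

def fit_rail_line(response, threshold):
    """Keep responses above threshold and fit x = slope * y + intercept."""
    ys, xs = np.nonzero(response > threshold)
    if ys.size < 2 or ys.max() == ys.min():
        raise ValueError("need points on at least two rows to fit a line")
    slope, intercept = np.polyfit(ys, xs, deg=1)
    return float(slope), float(intercept)

# Synthetic correlation map: a vertical rail at x = 12 plus one weak false hit.
response = np.zeros((40, 40))
response[np.arange(40), 12] = 0.9
response[3, 30] = 0.2  # below the threshold, so it is ignored
slope, intercept = fit_rail_line(response, threshold=0.5)
print(round(slope, 6), round(intercept, 6))
```

A plain least-squares fit is sensitive to outliers that survive the threshold, so a robust estimator may be a better choice on real images.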

We can detect trays in the same way

We can detect trays in the same way
Now we have locations to tie the virtual template to the image!

The transformation between image and pod
physical coordinates is called a homography
We can verify that it works by calculating the boundary of each bin in the image and coloring it in.
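
As a sketch of how such a mapping can be estimated, the code below fits a homography from four point correspondences with the direct linear transform and then projects a bin outline into the image. All coordinates are made-up values and the function names are illustrative; with OpenCV, `cv2.findHomography` and `cv2.perspectiveTransform` cover the same two steps.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, points):
    """Map N x 2 points through H using homogeneous coordinates."""
    pts = np.hstack([np.asarray(points, dtype=np.float64), np.ones((len(points), 1))])
    mapped = pts @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical pod-face coordinates and the pixels where they were detected.
pod_corners = [(0, 0), (100, 0), (100, 200), (0, 200)]
image_corners = [(50, 40), (460, 60), (440, 500), (70, 480)]
H = estimate_homography(pod_corners, image_corners)

# Project one bin's rectangle (pod coordinates) to get its outline in the image.
bin_in_pod = [(0, 0), (50, 0), (50, 40), (0, 40)]
bin_in_image = apply_homography(H, bin_in_pod)
print(np.round(bin_in_image, 1))
```

Once H is known, every bin boundary stored in pod coordinates can be drawn on the photo, which is the visual check described on the slide.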

How can we use computer vision?
• Automatic identification of every item?

• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item?

What does computer vision need to tell us?
• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item? (TOO HARD)

Instead, we can look for changes
[Figure: bin image inbound to the station; bin image outbound from the station]

Our first attempt was with hand-engineered
computer vision

It’s hard!
Must be robust to items rolling or shuffling inside
the bin, illumination changes, specularity, etc.

The big insight
• We realized our problem was just binary classification.
• Two images in, one label out.
• Why not try this deep-learning thing?

We did the simplest thing possible
• Take the first image,
convert it to grayscale,
and put it in the red
channel of a new image
• Take the second image
and put it in the blue
channel
• Now, we have a single
image to pass to the
neural network
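
The channel-packing trick can be sketched as follows. The sketch assumes RGB channel order, a grayscale second image, and a green channel left at zero; the slide does not specify the last two points, and `pack_pair` is a hypothetical name. OpenCV stores images in BGR order, so the indices would be swapped there.

```python
import numpy as np

def pack_pair(before_gray, after_gray):
    """Stack two same-sized grayscale images into one 3-channel image."""
    if before_gray.shape != after_gray.shape:
        raise ValueError("both bin images must have the same shape")
    packed = np.zeros(before_gray.shape + (3,), dtype=before_gray.dtype)
    packed[..., 0] = before_gray  # red channel: image taken on arrival
    packed[..., 2] = after_gray   # blue channel: image taken on departure
    return packed                 # green channel stays zero (assumption)

before = np.full((4, 4), 10, dtype=np.uint8)
after = np.full((4, 4), 200, dtype=np.uint8)
pair = pack_pair(before, after)
print(pair.shape)  # (4, 4, 3)
```

The appeal of this encoding is that an off-the-shelf image classifier can consume the pair without any change to its input layer.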

It worked great!
[Chart comparing three models: best hand-engineered model, CIFAR CNN, Krizhevsky’s CNN]

Processing pipeline
Pod Image → Bin Extraction → Bin Images → Defect Detection
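
One way to picture how the stages fit together is the sketch below. The bin extractor and the pair classifier are passed in as plain callables, and every name here is hypothetical; the toy stand-ins at the bottom only exercise the control flow and say nothing about the real models.

```python
def flag_bins(arrival_pod, departure_pod, extract_bins, is_flagged):
    """Return the ids of bins whose before/after images the classifier flags."""
    before = extract_bins(arrival_pod)    # expected shape: {bin_id: bin image}
    after = extract_bins(departure_pod)
    shared = sorted(before.keys() & after.keys())
    return [b for b in shared if is_flagged(before[b], after[b])]

# Toy stand-ins: a "pod image" is already a dict of bins, and a bin is
# flagged whenever its contents differ between the two captures.
arrival = {"A1": "book", "B2": "mug", "C3": "cable"}
departure = {"A1": "book", "B2": "mug+toy", "C3": "cable"}
flagged = flag_bins(arrival, departure, lambda pod: pod, lambda a, b: a != b)
print(flagged)  # ['B2']
```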

Implementation details
• Implemented in OpenCV in Python
• C++ extensions for some steps
• Neural net uses Caffe
• Trained on G2 instances
• Runs on CPU in FC server room
• Can tolerate latency in our current usage pattern

Software architecture
[Architecture diagram; labels grouped by type]
• Zones: Edge Device(s), Site Server Room, AWS
• Services on EC2: Inventory Event Correlator, VBI Service, Remote Count Website (Defect Detection), Inventory Bin Count Elimination
• Other components: Capture Event Data Router, Bin Extraction Process, Auto Count Process, Local Storage Service, Camera Controller, File Pusher, Barcode Extraction, Applications
• Operations: Put Pod Face Images, Put Bin Images, Get Pod Images, Get Bin Image, Get Bin Defect Result, Get Bin Space Available, Get Work for Remote Counting
• Transport and storage: SNS, SQS, HTTP POST, DynamoDB

• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item?

Could we just count the number of items in the
bin?
• At this point, we have lots of data.
• Some of it has errors from inventory defects, but
networks have proven resilient to this kind of thing.
• Why not just train a network to directly count the items in a bin?

Using a convolutional neural network
• We used the Caffe implementation of GoogLeNet [1]
[1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

Maps cleanly onto classification paradigm
• Treat it as a multi-class classification problem
[Figure: bin image → neural network → one score per class (0.1, 0.2, 0.4, 0.4 shown)]
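
Framing counting as classification can be sketched as below, assuming class index k stands for "k items in the bin" and that the network's final layer emits one raw score (logit) per class. Both the class layout and the example logits are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_count(logits):
    """Return the most likely item count and the full probability vector."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    return int(np.argmax(probs)), probs

# Made-up logits for classes "0 items", "1 item", "2 items", "3 items".
count, probs = predict_count([0.5, 2.0, 1.0, -1.0])
print(count)  # 1
```

The probability of the winning class also gives a rough confidence signal, which can help decide when a bin should go to a human for counting instead.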

This saved the project
• Hit the targets we needed
• Eliminated a lot of hardware (no more before/after shots
needed)
• Made the project cost effective
• Here is what we learned:
• Don’t focus on algorithms, focus on DATA

How else can we use this data?
• We want to find free space
in the bin without having to
label data.
• We can guess from
dimensions of items.
• But where is the space?
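
A guess from item dimensions might be computed as in the sketch below: subtract the summed item volumes from the bin volume. The talk does not give the formula, so the volume ratio, the function name, and the example dimensions are all assumptions. The estimate also ignores how items are actually packed.

```python
def estimated_emptiness(bin_dims, item_dims):
    """Fraction of the bin volume not accounted for by catalog item volumes."""
    width, height, depth = bin_dims
    bin_volume = width * height * depth
    if bin_volume <= 0:
        raise ValueError("bin dimensions must be positive")
    used = sum(w * h * d for w, h, d in item_dims)
    # Clamp at zero: inventory defects can make the records overfill a bin.
    return max(0.0, 1.0 - used / bin_volume)

# Hypothetical bin of 30 x 20 x 10 holding two recorded items.
print(estimated_emptiness((30, 20, 10), [(10, 10, 10), (20, 10, 10)]))  # 0.5
```

A number like this costs nothing to produce because it comes from existing inventory records, which is what makes it usable as a training target without manual labeling.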

Train model to predict emptiness from an image
[Diagram labels: GoogLeNet, Conv (3×3), Conv, Avg Pool, emptiness score]
This is a noisy, probably incorrect estimate!

But we can use layers in the network to find where the
space actually is!
[Diagram labels: GoogLeNet, Conv (3×3), 1024 channels, 3×3, Conv, Avg Pool, emptiness score]
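
The idea resembles class activation mapping. The sketch below assumes the score is a linear function of globally average-pooled feature channels; under that assumption the same weights applied before pooling give a spatial map. Shapes, names, and values are illustrative and not taken from the talk.

```python
import numpy as np

def emptiness_score_and_map(features, weights, bias=0.0):
    """features: (C, H, W) conv activations; weights: (C,) of the final linear layer."""
    pooled = features.mean(axis=(1, 2))                # global average pool -> (C,)
    score = float(pooled @ weights + bias)
    # Same weights applied at every location, before pooling -> (H, W) map.
    activation_map = np.tensordot(weights, features, axes=1)
    return score, activation_map

def binarize(activation_map, threshold):
    """Mark the locations that contribute strongly to the score."""
    return activation_map > threshold

# Toy features: channel 0 fires top-left, channel 1 fires bottom-right.
features = np.zeros((2, 3, 3))
features[0, 0, 0] = 9.0
features[1, 2, 2] = 9.0
weights = np.array([1.0, -1.0])
score, cam = emptiness_score_and_map(features, weights)
mask = binarize(cam, threshold=0.5)
print(mask.astype(int))
```

With zero bias the score equals the mean of the activation map, so the map shows which locations pushed the score up or down even though the training label was a single number per image.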

[Figure: two examples, each shown as original image, activation map, and binary map]
And it works!

Takeaways
• We have great pattern recognition machinery now.
• Focus on the data:
• How can you get lots of it?
• What can you get for free?
• How much labeling do you really need?
• Is there a proxy problem?

Remember to complete
your evaluations!
