"Optimizing SSD Object Detection for Low-power Devices," a Presentation from Allegro
2. © 2019 allegro.ai
Agenda
● Deep-learning computer vision: Towards embedded deployment
● Single-Shot-Detection: A short overview
● Prior design: a low-hanging fruit for optimization
● Data-driven prior optimization
● Results
2
3. © 2019 allegro.ai
About allegro.ai
End-to-end platform optimized for DL-based perception / CV
● Automated labeling
● Experiment management
● Dataset management
● Deep learning (Dev)Ops
● Continuous / active learning
Trusted by:
3
5. © 2019 allegro.ai
Embedded Object Detection: Living on the Edge
Model Design equation:
+ Low memory
+ Efficiency (compute OPs)
+ Accuracy
= Bill-of-Materials
5
General rule for inference -
“large model” equals:
● Accurate
● Many operations
● High memory footprint
[Figure: trade-off triangle - frame rate / accuracy / power draw: choose two]
6. © 2019 allegro.ai
Detection: Towards Embedded Applications
1. Function split: [feature extractor] + [detection heads]
2. Multiple heads for different tasks - shared feature extractor
3. Single-shot models - execution path is not dynamic
4. Use weak feature extractors - low operations count
5. Optional: model quantization - performance boost, optimized
6
☹ DLCV “state of the art” == High memory footprint & low FPS
7. © 2019 allegro.ai
Model “Zoo” - Which is Best for Me?
7
“Common” detection tasks:
Deployable architectures show only slight differences in accuracy
Larger model - higher accuracy
Source: https://github.com/amdegroot/ssd.pytorch
9. © 2019 allegro.ai
The problem with detection…
Timed on Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Non-optimized PyTorch
Simulated “embedded” CPU deployment
9
Source: https://github.com/amdegroot/ssd.pytorch
10. © 2019 allegro.ai
What is Going On?
In detection, getting the actual results means
● Refining 10-20k suggestions to detections
● Expensive algorithms (e.g. NMS), 30-50% of processing time
Reducing the feature extractor size does not help
Opportunity:
● Lower the complexity of the refinement algorithm (hurts accuracy!)
OR
● Reduce number of suggestions (preserve accuracy?!)
10
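Greedy NMS compares each surviving suggestion against every lower-scoring one, so its cost grows with the number of suggestions. A minimal NumPy sketch (illustrative only, not the presenters' benchmark code) makes that dependence visible:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression on (x1, y1, x2, y2) boxes.
    Worst case is O(n^2) in the number of suggestions, which is
    why reducing the prior count speeds up post-processing."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top box against all remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]  # drop overlapping boxes
    return keep
```

In production one would use an optimized kernel (e.g. `torchvision.ops.nms`), but the quadratic pairwise structure is the same.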
12. © 2019 allegro.ai 12
Not SSD: “many shot”
Speed/accuracy trade-offs for modern convolutional object detectors, arXiv:1611.10012
Two stages - many-shot detection NNs (not SSD / YOLO):
1. Image → feature space → object-proposal generator
2. Object proposals → resample feature space → classifier
Computationally intensive: FPS depends on the number of proposals
13. © 2019 allegro.ai
● Fast detection with minor penalty in accuracy
● Predictions of different scales from different layers
● Significantly more proposals (~24K vs ~6K of F-RCNN)
● Supports small objects
13
Single Shot Detection
SSD: Single Shot MultiBox Detector, arXiv:1512.02325
Speed/accuracy trade-offs for modern convolutional object detectors, arXiv:1611.10012
14. © 2019 allegro.ai
YOLO / SSD
14
SSD: Single Shot MultiBox Detector, arXiv:1512.02325
YOLO9000: Better, Faster, Stronger, arXiv:1612.08242
15. © 2019 allegro.ai
Origin of Suggestions in Single-Shot:
15
● Prior grid (box/proposal/anchor): a set of priors for each target “pixel” at each resolution
● Localization: mapping between priors and bounding boxes
● High-quality object classifier for every prior type
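The prior grid above can be sketched as follows; the feature-map sizes, scales, and aspect ratios are illustrative SSD-like values (assumptions), not the exact configuration used in the talk:

```python
import itertools
import math

def make_priors(fmap_sizes=(38, 19, 10, 5, 3, 1),
                scales=(0.1, 0.2, 0.375, 0.55, 0.725, 0.9),
                ratios=(1.0, 2.0, 0.5)):
    """Sketch of an SSD-style prior grid: for every cell of every
    feature map, emit one prior per aspect ratio, in normalized
    (cx, cy, w, h) image coordinates."""
    priors = []
    for fsize, scale in zip(fmap_sizes, scales):
        for i, j in itertools.product(range(fsize), repeat=2):
            cx, cy = (j + 0.5) / fsize, (i + 0.5) / fsize
            for r in ratios:
                w = scale * math.sqrt(r)
                h = scale / math.sqrt(r)
                priors.append((cx, cy, w, h))
    return priors
```

Even this stripped-down grid yields thousands of priors, which is where the 10-20k suggestions to be refined come from.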
17. © 2019 allegro.ai
Suggested Path for Optimization:
● Previous work: priors tuned for benchmark datasets, not applicable to real-world datasets
● Prior amount/shapes should be tailored to the objects (and sizes)
● Enable selection between accuracy/performance
● Independent/additive to all other optimizations
Bonus points:
● Optimization as part of the pipeline (matched with the data)
● Automatically prune model execution graph, check if priors are
not generated for specific scale (i.e. “no big objects in dataset”)
17
18. © 2019 allegro.ai 18
The size of all objects is known in advance
Tune the priors for our specific purpose
Data-Centric Approach - Toy Example
[Chart: object-size distribution along the prior size/scale axis, small to large]
source: Allegro.ai - AI research team
19. © 2019 allegro.ai 19
Data-Centric Approach - Optimization Example
[Chart: object-size distribution along the prior size/scale axis, small to large]
Remove unused prior: #10
Delete priors: #1, #2, #8
Reshape the other priors
Confirm they match all the examples in the dataset
source: Allegro.ai - AI research team
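The "confirm they match" step can be approximated by a shape-only coverage check: compare each ground-truth (w, h) against each prior (w, h) with centers aligned. A hedged sketch (function names and the 0.5 threshold are assumptions, not the talk's exact criterion):

```python
def shape_iou(wh_a, wh_b):
    """IoU of two boxes with aligned centers: compares shapes only,
    ignoring location, which is all prior design cares about."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union

def coverage(gt_shapes, prior_shapes, thresh=0.5):
    """Fraction of ground-truth (w, h) shapes matched by at least
    one prior shape at the given IoU threshold."""
    hits = sum(any(shape_iou(g, p) >= thresh for p in prior_shapes)
               for g in gt_shapes)
    return hits / len(gt_shapes)
```

If coverage drops after deleting or reshaping priors, the pruning was too aggressive for the dataset.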
21. © 2019 allegro.ai
Problem Definition
Task: “Pet detector”
Classes: “cat”, “dog”, “bird”
Examples taken from
VOC/COCO “train” sets.
Size: 24K ROIs
Unique Priors: 24.5 K (36 types)
21
[Chart: detector-prior match vs. prior size/scale]
source: Allegro.ai - AI research team
22. © 2019 allegro.ai
Optimized Prior Matching
Task: “Pet detector”
Classes: “cat”, “dog”, “bird”
Examples taken from
VOC/COCO “train” sets.
Size: 24K ROIs
Unique Priors: 16K (21 types)
22
[Chart: detector-prior match vs. prior size/scale]
source: Allegro.ai - AI research team
23. © 2019 allegro.ai
Method: (I) Collect Statistics (with Augmentations)
“Dataset as Database”
Apply data
augmentations
Collect object bounding
boxes
23
source: Allegro.ai - AI research team
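A minimal sketch of step (I), assuming a simple random-scale jitter stands in for the real augmentation pipeline (annotations are (w, h) pairs; all names and parameters here are hypothetical):

```python
import random

def collect_box_stats(annotations, n_aug=5, scale_range=(0.8, 1.25), seed=0):
    """Collect (w, h) samples from the dataset, replaying each box
    through a few random scale augmentations so the statistics match
    what the network actually sees during training."""
    rng = random.Random(seed)
    samples = []
    for (w, h) in annotations:
        samples.append((w, h))                    # original box
        for _ in range(n_aug):
            s = rng.uniform(*scale_range)         # simulated augmentation
            samples.append((w * s, h * s))
    return samples
```

In the "dataset as database" setting these samples would be queried per task (e.g. only "cat"/"dog"/"bird" ROIs) rather than recomputed.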
24. © 2019 allegro.ai
Method: (II) Partition to Detection Resolutions
Model architecture
Partition box population
Resolution/Scale
(small to large)
24
source: Allegro.ai - AI research team
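Step (II) can be sketched as bucketing each box by the square root of its area; the scale edges below are illustrative SSD-like values (assumptions), one bucket per detection resolution:

```python
import bisect
import math

def partition_by_scale(boxes_wh,
                       scale_edges=(0.1, 0.2, 0.375, 0.55, 0.725, 0.9)):
    """Assign each normalized (w, h) box to the detection resolution
    whose scale range contains sqrt(w * h): small boxes go to the
    high-resolution maps, large boxes to the coarse ones."""
    buckets = [[] for _ in scale_edges]
    for w, h in boxes_wh:
        s = math.sqrt(w * h)
        k = min(bisect.bisect_left(scale_edges, s), len(scale_edges) - 1)
        buckets[k].append((w, h))
    return buckets
```

An empty bucket is the signal for the pruning bonus mentioned earlier: if no objects land at a scale, that head's priors (and possibly the branch itself) can be dropped.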
25. © 2019 allegro.ai
Method: (III) Weighted Clustering
Clustering
using naive K-means
Data Bias aware
weighting function
Ensure “fair” priors
25
source: Allegro.ai - AI research team
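Step (III), naive K-means with per-sample weights, might look like the sketch below; the bias-aware weighting function itself is dataset-specific, so precomputed weights are taken as input (an assumption):

```python
import numpy as np

def weighted_kmeans(points, weights, k, iters=50, seed=0):
    """Naive K-means over (w, h) points with per-sample weights.
    Weighting lets underrepresented shapes pull centroids, so
    overrepresented shapes do not dominate the resulting priors."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, float)
    w = np.asarray(weights, float)
    centers = pts[rng.choice(len(pts), k, replace=False)].copy()
    labels = np.zeros(len(pts), dtype=int)
    for _ in range(iters):
        d = ((pts[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)                       # assign step
        for c in range(k):
            m = labels == c
            if m.any():                            # weighted update step
                centers[c] = np.average(pts[m], axis=0, weights=w[m])
    return centers, labels
```

The resulting centroids become the prior shapes for that detection resolution.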
26. © 2019 allegro.ai
Method: (IV) Merge Similar/Prune: “Light” Optimization
Small boxes are redundant
Negligible accuracy decrease
26
source: Allegro.ai - AI research team
27. © 2019 allegro.ai
Method: (V) Merge Similar/Prune: “Hard” Optimization (II)
Greedy merging strategy
Decrease the number of priors
Small cost in accuracy
27
source: Allegro.ai - AI research team
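One way to realize the greedy merging of step (V): repeatedly merge the two most similar prior shapes (by centered-box IoU) into a weighted average until the target count is reached. A sketch under that assumption, not the presenters' exact algorithm:

```python
def greedy_merge(priors, target_n):
    """Greedily merge the most similar pair of priors until only
    target_n remain. Each prior is (w, h, weight), where weight is
    e.g. the number of dataset boxes the prior covers."""
    priors = [list(p) for p in priors]

    def iou(a, b):  # centered-box IoU: shape similarity only
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    while len(priors) > target_n:
        # find the most similar pair (O(n^2), fine for tens of priors)
        i, j = max(((i, j) for i in range(len(priors))
                    for j in range(i + 1, len(priors))),
                   key=lambda ij: iou(priors[ij[0]], priors[ij[1]]))
        a, b = priors[i], priors[j]
        wt = a[2] + b[2]
        merged = [(a[0] * a[2] + b[0] * b[2]) / wt,
                  (a[1] * a[2] + b[1] * b[2]) / wt, wt]
        priors = [p for k, p in enumerate(priors) if k not in (i, j)]
        priors.append(merged)
    return priors
```

Lowering `target_n` trades a small accuracy cost for fewer suggestions, which is exactly the accuracy/performance dial the talk describes.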
31. © 2019 allegro.ai
Take-Home Messages
● Successful implementation of data-driven optimization
● Applicable to any SSD meta-architecture (SSD, DSSD, FPN)
● Can change the input size and get optimized priors for any input, depending on deployment
31
32. © 2019 allegro.ai
Future Work
● AutoML - Pruning towards required accuracy and model
footprint
● “Reverse optimization” - flag biased datasets that require more examples where objects are underrepresented
● Mask optimization for instance segmentation (Mask R-CNN etc.)
32
33. © 2019 allegro.ai
Resources
33
Tools used:
● allegro.ai deep learning perception platform
● Deep learning framework: Pytorch
Research papers:
● Speed/accuracy trade-offs for modern convolutional object
detectors, arXiv:1611.10012
● SSD: Single Shot MultiBox Detector, arXiv:1512.02325
36. © 2019 allegro.ai
What is Deep Learning Computer Vision?
● Computer vision: classification, detection, segmentation,
recognition,...
● Based on deep learning - “Weak AI”: important but limited tasks; data intensive, large memory footprint, expensive ops
“Accurate” inference := a model trained on ‘input data’ gives accurate predictions in deployment
36
Model inference := perform CV task on input image/video
37. © 2019 allegro.ai
Detect = Locate Object + Classify
● Astounding progress
● Data-driven models
● Deployable tech
● Dedicated hardware
OPTIONAL
37
Dragon: 86%
40. © 2019 allegro.ai
Ground Truth = GT
Intersection over Union = IoU
“Good prior”
How to choose prior-GT match for training?
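One common answer is the SSD paper's rule, sketched here with corner-format boxes (threshold and names are assumptions): match each ground truth to its best prior, and additionally match any prior whose IoU with some ground truth clears a threshold:

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) corner format."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def match_priors(priors, gts, thresh=0.5):
    """SSD-style matching for training: every GT claims its best
    prior (so no GT is unmatched), then every prior above the IoU
    threshold is also assigned to its best GT. Returns
    {prior_index: gt_index}; unmatched priors become negatives."""
    matches = {}
    for g, gt in enumerate(gts):
        best = max(range(len(priors)), key=lambda p: iou(priors[p], gt))
        matches[best] = g
    for p, prior in enumerate(priors):
        ious = [iou(prior, gt) for gt in gts]
        if max(ious) >= thresh:
            matches.setdefault(p, ious.index(max(ious)))
    return matches
```

The threshold choice is exactly the sensitivity the next slide questions: nearby values of IoU are hard to distinguish by eye, yet they change which priors get trained.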
41. © 2019 allegro.ai
IoU: not *the* ideal choice for matching with priors
The difference between thresholds is not easy to see unaided
Question: which priors here should be trained to match the dog’s bounding box?
https://www.reddit.com/r/computervision/comments/876h0f/yolo_v3_released/dwd7hpm/