8. Main Idea:
✓ Dense connectivity: creates short paths in the network and encourages feature reuse.
ResNet
GoogleNet
FractalNet
DENSENET [HUANG ET AL. 2017]
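Dense connectivity can be sketched in a few lines: each layer receives the concatenation of all earlier feature maps. This is a minimal NumPy illustration only (real DenseNet layers are BN–ReLU–Conv blocks; here each "layer" is just a random 1×1 channel-mixing matrix):

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block: each layer sees the concatenation of ALL
    earlier feature maps (dense connectivity -> short paths, reuse).
    Illustration only: a layer is a random 1x1 channel-mixing matrix."""
    features = [x]                                    # x: (C, H, W)
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=0)        # all previous maps
        w = rng.standard_normal((growth_rate, inp.shape[0]))
        features.append(np.einsum('oc,chw->ohw', w, inp))
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
out = dense_block(rng.standard_normal((8, 4, 4)),
                  num_layers=3, growth_rate=12, rng=rng)
print(out.shape)  # channels grow as 8 + 3 * 12 = 44
```

Note how the channel count grows linearly with depth (the "growth rate"), while every layer keeps a direct path to the input.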
9. REDUNDANCY IN DEEP MODELS
[Figure: Input → low-level features → mid-level features → high-level features → Classifier → Prediction]
17. Main Idea:
✓ Multi-scale feature fusion: merge signals with different frequencies.
Interlinked CNN (Zhou et al, ISNN’15) Neural Fabric (Saxena & Verbeek, NIPS’16)
MSDNet (Huang et al, ICLR’18) HRNet (Sun et al, CVPR’19)
MULTI-SCALE NETWORKS
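A minimal sketch of multi-scale fusion, assuming the common pattern of pooling the fine (high-frequency) map down to the coarse resolution before merging channels (NumPy illustration, not any specific network's exact fusion rule):

```python
import numpy as np

def downsample2x(x):
    # 2x2 average pooling on a (C, H, W) map: fine -> coarser scale
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).mean(axis=(2, 4))

def fuse(fine, coarse):
    # Merge signals from two scales: bring the high-resolution map
    # down to the coarse resolution, then concatenate channels.
    return np.concatenate([downsample2x(fine), coarse], axis=0)

fine   = np.ones((16, 32, 32))   # high-frequency / fine scale
coarse = np.ones((32, 16, 16))   # low-frequency / coarse scale
print(fuse(fine, coarse).shape)  # (48, 16, 16)
```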
18. Main Idea:
✓ Automatic architecture search using reinforcement learning, genetic/evolutionary
algorithms, or differentiable approaches.
AutoML is a very active research field, see www.automl.org
Neural Architecture Search [Zoph and Le, 2017 and many]
20. Main Idea:
✓ Split convolution into multiple groups
Standard Convolution: O(C × C) parameters
Group Convolution: O(C × C / G) parameters
Networks using Group Convolution:
✓ AlexNet (Krizhevsky et al, NIPS’12)
✓ ResNeXt (Xie et al, CVPR’17)
✓ CondenseNet (Huang et al, CVPR’18)
✓ ShuffleNet (Zhang et al, CVPR’18)
✓ …
Group Convolution
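The G-fold parameter saving above can be checked with a simple count: with G groups, each output channel connects to only C_in / G input channels. A small sketch (pure Python, no framework assumed):

```python
def conv_weight_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution. With G groups, each output
    channel only connects to c_in / groups input channels, so the cost
    drops from O(c_in * c_out) to O(c_in * c_out / groups)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * k * k * c_out

standard = conv_weight_params(256, 256, 3)            # 589_824
grouped  = conv_weight_params(256, 256, 3, groups=4)  # 147_456
print(standard // grouped)  # 4: exactly the number of groups
```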
21. Main Idea:
✓ Split convolution into multiple groups, where each group contains a single channel
Networks using DSC:
✓ Xception (Chollet, CVPR’17)
✓ MobileNet (Howard et al, 2017)
✓ MobileNet V2 (Sandler et al, 2018)
✓ ShuffleNet V2 (Ma et al, ECCV’18)
✓ NasNet (Zoph, CVPR’18)
✓ …
Depth-wise Separable Convolution (DSC)
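DSC factorizes a standard convolution into a depth-wise k×k stage (one filter per channel, i.e. groups == channels) plus a point-wise 1×1 stage that mixes channels. Its parameter saving can be verified by counting (pure Python sketch):

```python
def standard_conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    """Depth-wise separable convolution = depth-wise (one k x k filter
    per input channel) + point-wise (1x1) convolution."""
    depthwise = c_in * k * k   # groups = c_in, one channel per group
    pointwise = c_in * c_out   # 1x1 conv mixes channels
    return depthwise + pointwise

c, k = 256, 3
print(standard_conv_params(c, c, k))  # 589824
print(dsc_params(c, c, k))            # 2304 + 65536 = 67840
```

For 3×3 filters the saving is roughly k² = 9x, which is why DSC dominates mobile architectures.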
22. Main Idea:
✓ Channel-wise attention: second order operations
Squeeze and excitation network (Hu et al, CVPR’18)
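The squeeze-and-excitation idea fits in a few lines: squeeze = global average pool per channel, excitation = a small bottleneck MLP with a sigmoid gate, then channel-wise rescaling. A minimal NumPy sketch with randomly drawn weights (the real block learns w1, w2 and uses a reduction ratio r):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation sketch. x: (C, H, W); w1: (C/r, C);
    w2: (C, C/r). Squeeze -> bottleneck MLP -> sigmoid -> rescale."""
    s = x.mean(axis=(1, 2))                     # squeeze: (C,)
    e = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))   # excitation in (0, 1)
    return x * e[:, None, None]                 # channel-wise rescaling

rng = np.random.default_rng(0)
C, r = 16, 4
x = rng.standard_normal((C, 8, 8))
out = se_block(x, rng.standard_normal((C // r, C)),
               rng.standard_normal((C, C // r)))
print(out.shape)  # (16, 8, 8): same shape, channels re-weighted
```

The gate depends on global (second-order) channel statistics rather than local pixels, which is what the "channel-wise attention" bullet refers to.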
23. Main Idea:
✓ Increase receptive field via filter dilation
Dilated convolution (Yu & Koltun, ICLR’16)
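The receptive-field gain comes for free in parameters: a k×k filter with dilation d covers k + (k − 1)(d − 1) pixels per side while keeping k² weights. A one-line check:

```python
def effective_kernel(k, d):
    """Side length covered by a k x k filter with dilation d:
    k + (k - 1) * (d - 1) -- larger field, same parameter count."""
    return k + (k - 1) * (d - 1)

for d in (1, 2, 4):
    print(d, effective_kernel(3, d))  # 3 -> 3, 5, 9
```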
24. Main Idea:
✓ Learn an offset field for the sampling locations of convolutional filters
Deformable convolution (Dai et al, ICCV’17)
30. Why do we use the same
expensive model for all images?
31. Can we use small & cheap models for easy images,
and big & expensive models for hard ones?
32. A NAIVE IDEA OF ADAPTIVE EVALUATION
AlexNet
Inception
ResNet
"easy" horse
"hard" horse
33. CHALLENGE: LACK OF COARSE-LEVEL FEATURES
[Figure: Input → fine-level features → down-sampling → mid-level features → down-sampling → coarse-level features → linear classifier → output]
Classifiers only work well on coarse-scale feature maps
Nearly all computation has been done before a coarse-scale feature is reached
36. MULTI-SCALE FEATURES
[Figure: a multi-scale network maintains fine-, mid-, and coarse-level feature maps in parallel; Classifiers 1–4 are attached along the coarsest scale and evaluated on the test input]
Classifiers only operate on high-level features!
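Attaching classifiers along the network enables early-exit ("anytime") evaluation: run the cheap classifiers first and stop as soon as one is confident. A minimal sketch with hypothetical stub classifiers (the confidence-threshold rule here is one common policy, not the only one):

```python
def adaptive_inference(x, classifiers, threshold=0.9):
    """Early-exit evaluation: try classifiers from cheapest to most
    expensive; return at the first one whose top probability clears
    the threshold. `classifiers` are stubs mapping input -> probs."""
    for i, clf in enumerate(classifiers):
        probs = clf(x)
        if max(probs) >= threshold:       # confident enough -> exit
            return i, probs
    return len(classifiers) - 1, probs    # fall through to the last

# Hypothetical stubs: an unsure early exit, a confident later one.
unsure    = lambda x: [0.55, 0.45]
confident = lambda x: [0.97, 0.03]
exit_idx, probs = adaptive_inference("img", [unsure, confident])
print(exit_idx)  # 1: the cheap exit was not confident, so we went on
```

Easy inputs exit at index 0 and never pay for the deeper computation; hard inputs fall through to the expensive classifiers.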
42. MORE RESULTS AND DISCUSSIONS
Please refer to:
Multi-Scale Dense Networks for Resource Efficient Image Classification, ICLR 2018 oral
(acceptance rate 2.2%, rank 4/935)
43. ADAPTIVE INFERENCE IS A CHALLENGING PROBLEM
How to design proper network architectures?
How to effectively train dynamic networks?
How to efficiently perform dynamic evaluation?
Adaptive inference for object detection and segmentation?
Spatially adaptive and temporally adaptive inference?