EfficientNet:
Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan, et al., “EfficientNet: Rethinking Model Scaling for
Convolutional Neural Networks”, ICML 2019
9th June, 2019
PR12 Paper Review
JinWon Lee
Samsung Electronics
References
• Google AI Blog
 https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html
• Hoya012’s Research Blog
 https://hoya012.github.io/blog/EfficientNet-review/
Two Streams After ResNet
Better accuracy vs Better efficiency
Intro.
• Scaling up ConvNets is widely used to achieve better accuracy.
 ResNet can be scaled from ResNet-18 to ResNet-200 by using more layers.
 GPipe achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline
model 4 times larger.
• The most common way is to scale up ConvNets by their depth, width,
or image resolution.
 In previous work, it is common to scale only one of the three dimensions.
 Though it is possible to scale up two or three dimensions arbitrarily, arbitrary
scaling requires tedious manual tuning and still often yields sub-optimal
accuracy and efficiency.
Intro.
• The authors want to study and rethink the process of scaling up
ConvNets.
 Q: Is there a principled method to scale up ConvNets that can achieve
better accuracy and efficiency?
• Empirical study shows that it is critical to balance all dimensions of
network width/depth/resolution, and surprisingly, such balance can be
achieved by simply scaling each of them with a constant ratio.
• Based on this observation, the authors propose a compound scaling
method.
Compound Scaling
Related Work – ConvNet Accuracy
• ConvNets have become increasingly more accurate by going bigger.
 While the 2014 ImageNet winner GoogLeNet (Szegedy et al., 2015) achieves
74.8% top-1 accuracy with about 6.8M parameters, the 2017 ImageNet
winner SENet (Hu et al., 2018) achieves 82.7% top-1 accuracy with 145M
parameters.
 Recently, GPipe (Huang et al., 2018) further pushes the state-of-the-art
ImageNet top-1 validation accuracy to 84.3% using 557M parameters.
• Although higher accuracy is critical for many applications, we have
already hit the hardware memory limit, and thus further accuracy
gain needs better efficiency.
Related Work – ConvNet Efficiency
• Deep ConvNets are often over-parameterized.
 Model compression is a common way to reduce model size by trading
accuracy for efficiency.
 It is also common to handcraft efficient mobile-size ConvNets, such as
SqueezeNets, MobileNets, and ShuffleNets.
 Recently, neural architecture search has become increasingly popular for
designing efficient mobile-size ConvNets such as MnasNet.
• However, it is unclear how to apply these techniques for larger
models that have much larger design space and much more
expensive tuning cost.
Related Work – Model Scaling
• There are many ways to scale a ConvNet for different resource
constraints.
 ResNet can be scaled down (e.g., ResNet-18) or up (e.g., ResNet-200) by
adjusting network depth (#layers).
 WideResNet and MobileNets can be scaled by network width (#channels).
 It is also well-recognized that bigger input image size will help accuracy with
the overhead of more FLOPS.
• Network depth and width are both important for a ConvNet's expressive
power, but it still remains an open question how to effectively scale a
ConvNet to achieve better efficiency and accuracy.
Problem Formulation
• A ConvNet layer i can be written as Yi = Fi(Xi), where Fi is the operator
and Xi is an input tensor with spatial dimensions ⟨Hi, Wi⟩ and channel
dimension Ci.
• ConvNets are commonly partitioned into stages; within stage i, the same
layer Fi is repeated Li times.
• We can define ConvNets as:

  N = ⨀_{i=1…s} Fi^{Li}( X⟨Hi, Wi, Ci⟩ )
Problem Formulation
• Unlike regular ConvNet designs that mostly focus on finding the best
layer architecture Fi, model scaling tries to expand the network
length (Li), width (Ci), and/or resolution (Hi, Wi) without changing Fi
predefined in the baseline network.
• By fixing Fi, model scaling simplifies the design problem for new
resource constraints, but it still remains a large design space to
explore different Li, Ci, Hi, Wi for each layer.
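
To make the notation concrete, here is a toy sketch (names are illustrative,
not from the paper) of a ConvNet represented as stages, where scaling may
change Li, Ci, Hi, Wi but never Fi:

from dataclasses import dataclass

@dataclass
class Stage:
    op: str          # layer architecture Fi (fixed during scaling)
    layers: int      # Li: how many times Fi is repeated
    channels: int    # Ci: channel width
    resolution: int  # Hi = Wi: input spatial size

# Model scaling expands layers/channels/resolution, leaving each op unchanged.
baseline = [Stage("mbconv_k3", 1, 16, 112), Stage("mbconv_k5", 2, 24, 56)]
wider_deeper = [Stage(s.op, 2 * s.layers, int(1.5 * s.channels), s.resolution)
                for s in baseline]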
Problem Formulation
• In order to further reduce the design space, the authors restrict that
all layers must be scaled uniformly with a constant ratio, using
coefficients w, d, r for scaling network width, depth, and resolution:

  max_{d,w,r} Accuracy( N(d, w, r) )
  s.t. N(d, w, r) = ⨀_{i=1…s} F̂i^{d·L̂i}( X⟨r·Ĥi, r·Ŵi, w·Ĉi⟩ )
       Memory(N) ≤ target_memory
       FLOPS(N) ≤ target_flops

  (hatted terms are the predefined parameters of the baseline network)
Scaling Dimensions – Depth
• The intuition is that a deeper ConvNet can capture richer and more
complex features, and generalize well on new tasks.
• However, the accuracy gain of a very deep network diminishes.
 For example, ResNet-1000 has similar accuracy to ResNet-101 even though it
has many more layers.
Scaling Dimensions – Width
• Scaling network width is commonly used for small-size models.
• As discussed in WideResNet, wider networks tend to be able to capture more
fine-grained features and are easier to train.
• However, extremely wide but shallow networks tend to have difficulties in
capturing higher-level features.
• And the accuracy quickly saturates when networks become much wider, i.e.,
with larger w.
Scaling Dimensions – Resolution
• With higher resolution input images, ConvNets can potentially capture more
fine-grained patterns.
 Starting from 224x224 in early ConvNets, modern ConvNets tend to use 299x299 or 331x331
for better accuracy. Recently, GPipe achieves state-of-the-art ImageNet accuracy with
480x480 resolution.
• Higher resolutions improve accuracy, but the accuracy gain diminishes for very
high resolutions.
Scaling Dimensions
Observation 1
Scaling up any dimension of network width, depth, or resolution
improves accuracy, but the accuracy gain diminishes for bigger models.
Compound Scaling
• Intuitively, the compound scaling method
makes sense because if the input image is
bigger, then the network needs more layers to
increase the receptive field and more channels
to capture more fine-grained patterns on the
bigger image.
• If we only scale network width w without
changing depth (d=1.0) and resolution (r=1.0),
the accuracy saturates quickly.
• With deeper (d=2.0) and higher resolution
(r=2.0), width scaling achieves much better
accuracy under the same FLOPS cost.
Compound Scaling
Observation 2
In order to pursue better accuracy and efficiency, it is critical to
balance all dimensions of network width, depth, and resolution during
ConvNet scaling.
Compound Scaling Method
• The compound scaling method uses a compound coefficient φ to uniformly
scale network depth, width, and resolution:

  depth: d = α^φ
  width: w = β^φ
  resolution: r = γ^φ
  s.t. α · β² · γ² ≈ 2, α ≥ 1, β ≥ 1, γ ≥ 1

• α, β, γ are constants that can be determined by a small grid search.
• Intuitively, φ is a user-specified coefficient that controls how many
more resources are available for model scaling.
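
As an illustration, here is a minimal Python sketch of this scaling rule. The
rounding policy below is an assumption for readability; the released
implementation uses its own channel-rounding scheme.

import math

# Grid-searched constants for EfficientNet-B0 (from the paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution

def compound_scale(phi, base_layers, base_channels, base_resolution):
    """Scale depth/width/resolution with compound coefficient phi."""
    d = ALPHA ** phi   # depth multiplier
    w = BETA ** phi    # width (channel) multiplier
    r = GAMMA ** phi   # resolution multiplier
    return (math.ceil(d * base_layers),        # layers per stage
            int(round(w * base_channels)),     # channels (illustrative rounding)
            int(round(r * base_resolution)))   # input image size

# Example: a hypothetical stage with 3 layers, 40 channels, 224x224 input.
for phi in range(1, 4):
    print(phi, compound_scale(phi, 3, 40, 224))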
Compound Scaling Method
• Notably, the FLOPS of a regular convolution op is proportional to d, w², r².
 Doubling network depth will double FLOPS, but doubling network width or
resolution will increase FLOPS by four times. Since convolution ops usually
dominate the computation cost in ConvNets, scaling a ConvNet with the above
equation will approximately increase total FLOPS by (α · β² · γ²)^φ.
• In this paper, since α · β² · γ² ≈ 2, total FLOPS approximately increase by 2^φ.
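
A quick back-of-the-envelope check with the B0 constants confirms the factor
of two:

# FLOPS growth per unit of phi: alpha * beta^2 * gamma^2
alpha, beta, gamma = 1.2, 1.1, 1.15
print(alpha * beta**2 * gamma**2)  # ~1.92, i.e. roughly 2, so total FLOPS ~ 2^phi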
EfficientNet Architecture
• Inspired by MnasNet, the authors develop their baseline network by
leveraging a multi-objective neural architecture search that
optimizes both accuracy and FLOPS.
• Optimization goal:

  ACC(m) × [FLOPS(m) / T]^w

where m is a candidate model, T is the target FLOPS, and w = -0.07 is a
hyperparameter balancing accuracy against FLOPS.
• Latency is not included in the optimization goal since they are not
targeting any specific hardware device.
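
A minimal sketch of that objective as a scoring function (the name nas_score
is hypothetical):

def nas_score(acc, flops, target_flops, w=-0.07):
    """Multi-objective NAS reward: trades accuracy against FLOPS."""
    return acc * (flops / target_flops) ** w

# Doubling FLOPS at equal accuracy lowers the score by ~5% (2**-0.07 ~ 0.95):
print(nas_score(0.76, 400e6, 400e6))  # on-target model
print(nas_score(0.76, 800e6, 400e6))  # same accuracy at 2x FLOPS scores lower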
EfficientNet-B0 Baseline Network
EfficientNet-B1 to B7
• Step 1:
We first fix φ = 1, assuming twice more resources are available, and do a
small grid search of α, β, γ.
The best values for EfficientNet-B0 are α = 1.2, β = 1.1, γ = 1.15.
• Step 2:
We then fix α, β, γ as constants and scale up the baseline network with
different φ to obtain EfficientNet-B1 to B7.
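
Step 1 can be read as a small constrained grid search; below is a hedged
sketch under assumed candidate grids (train_and_eval is a hypothetical
stand-in for the expensive step of training each scaled B0 model):

import itertools

# Candidate grids are assumptions for illustration only.
grid_a = [1.0, 1.1, 1.2, 1.3]
grid_b = [1.0, 1.05, 1.1, 1.15]
grid_g = [1.0, 1.1, 1.15, 1.2]

candidates = [(a, b, g)
              for a, b, g in itertools.product(grid_a, grid_b, grid_g)
              if abs(a * b**2 * g**2 - 2.0) < 0.1]  # alpha * beta^2 * gamma^2 ~ 2
# best = max(candidates, key=lambda c: train_and_eval(*c))  # pick by accuracy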
Scaling Up MobileNets and ResNets
ImageNet Results for EfficientNet
ImageNet Results for EfficientNet
Inference Latency Comparison
Transfer Learning Results for EfficientNets
<Transfer Learning Datasets>
Transfer Learning Results for EfficientNets
Discussion
• Disentangling the contribution of the proposed scaling method from the
EfficientNet architecture.
Class Activation Maps