PVANet - PR033


Jinwon Lee

Presentation slides for the 33rd talk of the Tensorflow-KR paper reading group. Video link: https://youtu.be/TYDGTnxUGHQ Paper link: https://arxiv.org/abs/1611.08588


PVANet:
Lightweight Deep Neural Networks
for Real-time Object Detection
3rd September, 2017
JinWon Lee
Samsung Electronics
Sanghoon Hong, B. Roh, K. Kim, Y. Cheon, M. Park
Intel Imaging and Camera Technology

Many slides are copied from Sanghoon Hong’s slides
https://drive.google.com/drive/folders/0B8z5oUpB2DysSm1IOV9yeXRULVE

Before We Start…
• Faster R-CNN
  ▪ PR-012: presented by Jinwon Lee
  ▪ https://youtu.be/kcPAGIgBGRs
• YOLO
  ▪ PR-016: presented by Taegyun Jeon
  ▪ https://youtu.be/eTDcoeqj1_w
• YOLO9000
  ▪ PR-023: presented by Jinwon Lee
  ▪ https://youtu.be/6fdclSGgeio
• Concepts of Distance / Metric
  ▪ Terry’s deep learning talk by Terry Taewoong Um
  ▪ https://youtu.be/4KXgdf6Bmo4?list=PL0oFI08O71gKEXITQ7OG2SCCXkrtid7Fq

Recap – Faster R-CNN
• Insert a Region Proposal Network (RPN)
after the last convolutional layer
→ proposals are computed on the GPU!
• The RPN is trained to produce region
proposals directly; no need for external
region proposals
• After the RPN, use RoI pooling followed by a
downstream classifier and bbox regressor,
just like Fast R-CNN
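
To make the RPN recap concrete, here is a minimal sketch of how anchor boxes could be laid out over the last convolutional feature map. The stride of 16 and the 3 scales × 3 aspect ratios are the commonly cited Faster R-CNN defaults and are assumptions here, not PVANet's settings; `make_anchors` is a hypothetical helper name.

```python
import math

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors for every feature-map cell.

    Assumed conventions: ratio = height / width, and each anchor
    has an area of scale * scale pixels in the input image.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Centre of this feature-map cell in input-image coordinates.
            cx = (x + 0.5) * stride
            cy = (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

# A 2x3 feature map with 9 anchors per cell.
print(len(make_anchors(2, 3)))
```

The RPN then scores each anchor as object or background and regresses box offsets, so proposals come from the same feature map the detector uses.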

Motivations
• Object Detection: slow & computationally expensive
• Successes in network compression
• Can we design a less-redundant network from scratch?
Kim et al. (2016). Compression of Deep Convolutional Neural Networks for
Fast and Low Power Mobile Applications
Han et al. (2015) Learning both weights and connections for
efficient neural networks

Design Principles
• Deep but Narrow
• Modified concatenated ReLU
• Inception
• Hyper-feature concatenation

Deep but Narrow
• Reduce redundancies from excessive convolutional outputs
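
A rough way to see why "deep but narrow" pays off is to count multiply-accumulate operations (MACs) for a convolutional layer. The sketch below assumes a stride-1, same-padded convolution without bias; the 56×56 feature map and the channel counts are made-up illustrative values, not PVANet's.

```python
def conv_macs(h, w, c_in, c_out, k=3):
    """MACs for one k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k

wide = conv_macs(56, 56, 256, 256)    # one wide layer
narrow = conv_macs(56, 56, 128, 128)  # one layer with half the channels

# Cost scales with c_in * c_out, so halving the width quarters the cost;
# two narrow layers together still cost half of the single wide layer.
print(wide // narrow, (2 * narrow) / wide)
```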

Modified Concatenated ReLU (mCReLU)
• Reduce redundancies in the early convolutional layers
• Better accuracy and lower training loss than the original C.ReLU (Shang et al., 2016)
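
A minimal sketch of the idea on a single channel vector, assuming that C.ReLU concatenates the activations with their negation before the ReLU, and that the modified version adds a per-channel scale and shift after the concatenation. The function names and the plain-list representation are illustrative only.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def crelu(x):
    """Original C.ReLU: ReLU(concat(x, -x)), doubling the channel count."""
    return relu(x + [-v for v in x])

def mcrelu(x, scale, shift):
    """Modified C.ReLU sketch: scale/shift each concatenated channel, then ReLU.

    scale and shift have 2 * len(x) entries, so a channel and its negated
    twin can have different slopes and activation thresholds.
    """
    concat = x + [-v for v in x]
    return relu([s * c + b for c, s, b in zip(concat, scale, shift)])

x = [1.0, -2.0]
print(crelu(x))
print(mcrelu(x, scale=[1.0, 1.0, 1.0, 1.0], shift=[0.0, 0.0, 1.5, 0.0]))
```

Because the negated half is computed rather than learned, the convolution before it only needs half as many output channels, which is where the saving in the early layers comes from.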

Inception
• Reduce redundancies that result from handling objects of various sizes
(Szegedy et al., 2015)
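
The reason an Inception block suits objects of various sizes is that its parallel branches see different receptive fields. The sketch below assumes stride-1 convolutions; the branch layouts are illustrative and are not taken from the PVANet configuration.

```python
def receptive_field(kernels):
    """Receptive field of a stack of stride-1 convolutions."""
    rf = 1
    for k in kernels:
        rf += k - 1
    return rf

# Hypothetical branches: each starts with a 1x1 convolution.
branches = {
    "1x1": [1],
    "1x1 -> 3x3": [1, 3],
    "1x1 -> 3x3 -> 3x3": [1, 3, 3],  # covers the same area as a 5x5 kernel
}

for name, kernels in branches.items():
    print(name, receptive_field(kernels))
```

Concatenating the branch outputs lets later layers pick the scale that fits each object instead of forcing every path through large kernels.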

Main Building Blocks of PVANet
• Every convolutional layer in these building blocks is followed by its
corresponding activation layers: a BatchNorm layer and a ReLU layer

Hyper-feature Concatenation
• Low-level details bypass redundant convolutional layers
• Higher-level convolutions concentrate on contexts/abstractions
Kong et al. (2016) HyperNet: Towards Accurate Region
Proposal Generation and Joint Object Detection
[Figure: hyper-feature concatenation; labels: pooling, upscale]
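
A minimal sketch of hyper-feature concatenation on toy feature maps: the finer map is pooled down, the coarser map is upscaled, and all three are concatenated along the channel axis at the middle resolution. Nearest-neighbour upscaling is used here only for simplicity and is an assumption, as are the helper names; each feature map is a list of channels, and each channel is a 2D list.

```python
def max_pool2x2(channel):
    """2x2 max pooling with stride 2 (even height and width assumed)."""
    return [[max(channel[y][x], channel[y][x + 1],
                 channel[y + 1][x], channel[y + 1][x + 1])
             for x in range(0, len(channel[0]), 2)]
            for y in range(0, len(channel), 2)]

def upscale2x(channel):
    """Nearest-neighbour 2x upscaling."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def hyper_feature(fine, mid, coarse):
    """Concatenate pooled fine, unchanged mid, and upscaled coarse channels."""
    return ([max_pool2x2(c) for c in fine]
            + list(mid)
            + [upscale2x(c) for c in coarse])

fine = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
mid = [[[0, 1], [2, 3]]]
coarse = [[[7]]]
combined = hyper_feature(fine, mid, coarse)
print(len(combined), len(combined[0]), len(combined[0][0]))
```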

Overall Structure
• 54 convolutional + 3 fully connected layers
• Residual connections and batch normalization

Results
• ILSVRC2012 Classification (Validation)
  ▪ As accurate as GoogLeNet and as light as AlexNet

Results
• VOC2007 Detection
  ▪ The RPN can capture almost 99% of the target objects with only 200 proposals
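
For reference, proposal recall of this kind is usually measured as the fraction of ground-truth boxes that overlap at least one of the top-N proposals. The sketch below assumes an IoU threshold of 0.5 and proposals already sorted by score; neither detail is stated on the slide, and the boxes are made-up examples.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def proposal_recall(gt_boxes, proposals, top_n=200, thresh=0.5):
    """Fraction of ground-truth boxes matched by any of the top_n proposals."""
    kept = proposals[:top_n]
    hits = sum(1 for g in gt_boxes
               if any(iou(g, p) >= thresh for p in kept))
    return hits / len(gt_boxes) if gt_boxes else 0.0

gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
props = [(0, 0, 10, 9), (100, 100, 110, 110)]
print(proposal_recall(gt, props))
```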

Results
• VOC2012 Detection
  ▪ The lightest among >80% mAP models

Results
• VOC2012 Detection
  ▪ Compressed model runs in real time (30 fps) on a GPU

Summary
• PVANet: Lightweight, deep neural network for high-accuracy real-time object
detection
• Design principles for a less-redundant network
  ▪ Deep but narrow
  ▪ Modified C.ReLU
  ▪ Inception and hyper-feature concatenation
• Potential for real-time object detection in edge devices or embedded systems
• Other methodologies can be easily integrated with PVANet and further
reduce its computational cost
