arXiv:1509.07627
http://arxiv.org/abs/1509.07627
In this paper, we evaluate convolutional neural network (CNN) features using the AlexNet architecture developed by [9] and the very deep convolutional network (VGGNet) architecture developed by [16]. To date, most CNN researchers have employed the last layers before the output, which were extracted from the fully connected feature layers. However, since feature representation effectiveness is likely dependent on the problem, this study also evaluates the convolutional layers adjacent to the fully connected layers, in addition to performing simple tuning for feature concatenation (e.g., layer 3 + layer 5 + layer 7) and transformation using tools such as principal component analysis. In our experiments, we carried out detection and classification tasks using the Caltech 101 and Daimler Pedestrian Benchmark datasets.
1. Feature Evaluation of Deep Convolutional Neural
Networks for Object Recognition and Detection
Hirokatsu Kataoka, Kenji Iwata, Yutaka Satoh
National Institute of Advanced Industrial Science and Technology (AIST)
http://www.hirokatsukataoka.net/
arXiv preprint arXiv:1509.07627
http://arxiv.org/abs/1509.07627
2. Feature Evaluation
• Significant task in computer vision
– Following DeCAF [Donahue+, ICML2014], we evaluate several CNN
features + an SVM classifier
– Representative architectures: AlexNet [Krizhevsky+, NIPS2012] &
VGGNet [Simonyan+, ICLR2015]
– Basic Idea 1: Which layer gives the best features in a CNN architecture?
– Basic Idea 2: Mid- & high-level CNN features should be concatenated!
(e.g., Layer 3 + Layer 5 + Layer 7)
3. CNN Architecture & Feature Extraction
• AlexNet & VGGNet
– AlexNet: 8-layer architecture
– VGGNet: 16-layer architecture (each pooling layer and the last 2 FC layers
are used as feature vectors)
[Figure: AlexNet and VGGNet architectures. AlexNet (8 layers): Input → Conv/Pool blocks → FC → FC → Softmax. VGGNet (16 layers): Input → stacked Conv/Pool blocks → FC → FC → Softmax. Legend: image input, convolutional layer, max-pooling layer, fully-connected layer, softmax layer. Layers 1–7 are marked on both networks as feature-extraction points.]
4. Experiment
• Settings
– Layers: 3–7 (middle and deeper layers)
• Conv., pooling, and fully connected layers
– Concatenation and transformation
• Layers 3+4+5, 4+5+6, 5+6+7, 3+5+7
• Principal component analysis (PCA): 1,500 dims
– Classifier
• Support vector machine (SVM)
• Parameters follow DeCAF [Donahue+, ICML2014]
• Datasets
– Daimler pedestrian benchmark dataset (pedestrian detection) [Munder+,
TPAMI2006]
– Caltech 101 dataset (object classification) [Fei-Fei+, CVPRW2004]
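The PCA + SVM setting above can be sketched as a scikit-learn pipeline. Assumptions: the feature matrix here is synthetic stand-in data (real inputs would be CNN-layer activations), and the PCA dimensionality is reduced from the paper's 1,500 because PCA cannot return more components than training samples.

```python
# Sketch: PCA transformation + SVM classification of CNN feature vectors,
# mirroring the paper's pipeline on synthetic stand-in data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4096))      # 200 samples of 4096-dim layer features
y = rng.integers(0, 2, size=200)      # binary labels (e.g., pedestrian / not)

# PCA to 150 dims here; the paper uses 1,500 dims, which requires
# at least that many training samples.
clf = make_pipeline(PCA(n_components=150), LinearSVC())
clf.fit(X, y)

pred = clf.predict(X[:5])
print(pred.shape)  # (5,)
```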
5. Results on the Daimler dataset
• Daimler pedestrian benchmark dataset
– VGGNet Layer 5 (original vector) achieves the best rate (99.35%)
– For AlexNet, Layer 3 with PCA achieves the best rate (98.71%)
Mid-level layers tend to achieve better rates on the pedestrian detection data
6. Results on the Caltech 101 dataset
• Caltech 101 dataset
– VGGNet Layer 5 (original vector) achieves the best rate (91.80%)
– For AlexNet, Layer 5 with PCA achieves the best rate (78.37%)
The layer just before the FC layers performs well in object classification
7. Feature Concatenation
• Three-layer connection with PCA
– Layers 3+4+5, 4+5+6, 5+6+7, 3+5+7
– 4,500 dimensions (1,500 dims per layer vector)
– Left: Daimler
– Right: Caltech 101
VGGNet Layers 5+6+7 gives the most significant improvement
Pedestrian detection: mid-level features
Object classification: high-level features
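The concatenation scheme above (PCA-reduce each layer, then stack) can be sketched as follows. This is illustrative: the feature matrices are synthetic stand-ins, and the per-layer dimensionality is 50 instead of the paper's 1,500 to keep the example small.

```python
# Sketch: concatenate PCA-reduced features from three layers
# (e.g., Layer 5 + 6 + 7 -> 3 x 1,500 = 4,500 dims in the paper;
# 3 x 50 = 150 dims here on synthetic data).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
layer5 = rng.normal(size=(100, 2048))   # stand-in pool5 features
layer6 = rng.normal(size=(100, 4096))   # stand-in fc6 features
layer7 = rng.normal(size=(100, 4096))   # stand-in fc7 features

# Reduce each layer independently, then concatenate horizontally
reduced = [PCA(n_components=50).fit_transform(f)
           for f in (layer5, layer6, layer7)]
concat = np.hstack(reduced)

print(concat.shape)  # (100, 150)
```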
8. Conclusion
• Feature evaluation with AlexNet & VGGNet
– VGGNet performs better than AlexNet
– Mid-level features are good for pedestrian detection; high-level features are
good for object classification
– Concatenating VGGNet's 5th pooling layer and last 2 FC layers is the best
setting on the Daimler pedestrian benchmark and Caltech 101 datasets
– PCA is an effective transformation for CNN features