Detection:
Object Detection: Intuition
Detection ≈ Localization + Classification
Outline
•R-CNN
•SPP-Net
•Fast R-CNN
•Unified Approach
R-CNN: Pipeline Overview
Step 1. Input an image.
Step 2. Use selective search to obtain ~2k proposals.
Step 3. Warp each proposal and apply a CNN to extract its features.
Step 4. Use class-specific SVMs to score each proposal.
Step 5. Rank the proposals and use NMS to get the bboxes.
Step 6. Use class-specific regressors to refine the bboxes' positions.
Ross Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR14
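Step 5 relies on non-maximum suppression (NMS). The following is a minimal NumPy sketch of the usual greedy variant, not the authors' code: the function names, the (x1, y1, x2, y2) box layout, and the 0.3 IoU threshold are assumptions made for illustration.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]  # hypothetical threshold
    return keep

# Toy example: the second box overlaps the first heavily, the third is far away.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In a per-class setting, this would be run once per class on that class's SVM scores.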
R-CNN: Performance on PASCAL VOC07
• AlexNet (T-Net): 58.5 mAP
• VGG-Net (O-Net): 66.0 mAP
R-CNN: Limitation
• Too slow! 13s/image on a GPU or 53s/image on a CPU, and VGG-Net is 7x slower.
• Proposals need to be warped to a fixed size.
SPP-Net: Motivation
• Cropping may lose some information about the object.
• Warping may change the object's appearance.
He et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, TPAMI15
SPP-Net: Spatial Pyramid Pooling (SPP) Layer
• FC layers need a fixed-length input, while conv layers can adapt to an arbitrary input size.
• Thus we need a bridge between the conv and FC layers.
• Here comes the SPP layer.
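To make the "bridge" concrete, here is a rough NumPy sketch of spatial pyramid pooling: feature maps of different sizes are max-pooled into a fixed set of bins, so the output length depends only on the channel count. The function name `spp` and the pyramid levels (1, 2, 4) are illustrative assumptions, not the configuration from the paper.

```python
import numpy as np

def spp(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into n x n bins for each pyramid level
    and concatenate the results into one fixed-length vector."""
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        # Bin edges spread evenly over the map; floor/ceil keeps every bin non-empty.
        ys = np.linspace(0, height, n + 1)
        xs = np.linspace(0, width, n + 1)
        for i in range(n):
            y0, y1 = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
            for j in range(n):
                x0, x1 = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
                pooled.append(feature_map[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(pooled)

# Two inputs with different spatial sizes give vectors of the same length.
small = spp(np.random.rand(256, 13, 13))
large = spp(np.random.rand(256, 20, 31))
```

With these assumed levels, the output length is 256 * (1 + 4 + 16) for any input size, which is what lets the FC layers follow directly.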
SPP-Net: Training for Detection (1)
[Figure: image pyramid → conv → Conv5 feature map pyramid]
Step 1. Generate an image pyramid and extract the conv FeatMap of the whole image at each scale.
SPP-Net: Training for Detection(2)
• Step 2. For each proposal, walk the image pyramid and find the projected version whose number of pixels is closest to 224x224. (For scale invariance in training.)
• Step 3. Find the corresponding FeatMap region in Conv5 and use the SPP layer to pool it to a fixed size.
• Step 4. After getting all the proposals' features, fine-tune the FC layers only.
• Step 5. Train the class-specific SVMs.
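Step 2 can be sketched as a small search over pyramid scales. This is only an illustration under stated assumptions: the function name `pick_scale`, the convention that a scale is the length of the resized image's shorter side, and the default scale list are all choices made for this example.

```python
def pick_scale(box, image_size, scales=(480, 576, 688, 864, 1200), target=224 * 224):
    """Return the pyramid scale at which the proposal's pixel count is closest to target.

    box is (x1, y1, x2, y2) in original-image pixels; image_size is (width, height).
    A scale s means the image is resized so that its shorter side equals s (assumed).
    """
    x1, y1, x2, y2 = box
    short_side = min(image_size)
    box_area = (x2 - x1) * (y2 - y1)

    def scaled_area(s):
        factor = s / short_side  # resize factor applied to both axes
        return box_area * factor * factor

    return min(scales, key=lambda s: abs(scaled_area(s) - target))

# A 100x100 proposal in a 500x375 image.
scale = pick_scale((100, 100, 200, 200), image_size=(500, 375))
```

The proposal is then pooled from the Conv5 feature map computed at the returned scale.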
SPP-Net: Testing for Detection
• Almost the same as R-CNN, except Step 3: features come from the shared Conv5 FeatMap via the SPP layer instead of a per-proposal CNN pass.
SPP-Net: Performance
• Speed: 64x faster than R-CNN using one scale, and 24x faster using a five-scale pyramid.
• mAP: +1.2 mAP vs R-CNN
SPP-Net: Limitation
1. Training is a multi-stage pipeline.
2. Training is expensive in space and time.
[Figure: Conv layers, FC layers, SVM, regressor; store]
Fast R-CNN: Motivation
Joint training!
Ross Girshick, Fast R-CNN, arXiv tech report
Fast R-CNN: Joint Training Framework
Join the feature extractor, classifier, and regressor together in a unified framework.
Fast R-CNN: RoI pooling layer
≈ an SPP layer with a single pyramid level
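A minimal NumPy sketch of RoI max pooling, under stated assumptions: the RoI is already given in feature-map coordinates with exclusive end indices, it is non-empty, and the name `roi_max_pool` and the bin layout are illustrative rather than taken from the Fast R-CNN code.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(7, 7)):
    """Max-pool the RoI (x1, y1, x2, y2; feature-map coordinates, end-exclusive)
    of a (C, H, W) feature map into a fixed (C, out_h, out_w) grid."""
    x1, y1, x2, y2 = roi
    out_h, out_w = output_size
    region = feature_map[:, y1:y2, x1:x2]  # assumed non-empty
    _, height, width = region.shape
    ys = np.linspace(0, height, out_h + 1)
    xs = np.linspace(0, width, out_w + 1)
    out = np.empty((feature_map.shape[0], out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        r0, r1 = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
        for j in range(out_w):
            c0, c1 = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
            out[:, i, j] = region[:, r0:r1, c0:c1].max(axis=(1, 2))
    return out

# Toy feature map whose value at (row, col) of channel 0 is row * 8 + col.
fmap = np.arange(2 * 8 * 8, dtype=float).reshape(2, 8, 8)
pooled = roi_max_pool(fmap, roi=(0, 0, 4, 4), output_size=(2, 2))
```

This is the same binning as one level of a spatial pyramid, applied to a single region of the shared feature map.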
Fast R-CNN: Regression Loss
A smooth L1 loss, which is less sensitive to outliers than the L2 loss.
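For reference, the smooth L1 function is quadratic near zero and linear elsewhere. The short NumPy sketch below (variable names are illustrative) compares it with a squared loss on a few sample errors.

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1: 0.5 * x^2 where |x| < 1, and |x| - 0.5 elsewhere."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    return np.where(abs_x < 1.0, 0.5 * x * x, abs_x - 0.5)

errors = np.array([0.0, 0.5, 1.0, 10.0])
smooth = smooth_l1(errors)      # grows linearly for large errors
squared = 0.5 * errors ** 2     # L2-style loss grows quadratically
```

For the outlier error of 10, the smooth L1 value is 9.5 while the squared loss is 50, so a single badly localized box contributes far less to the gradient.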
Fast R-CNN: Scale Invariance
[Figure: image pyramids (multi-scale) vs. brute force (single scale); image → conv → Conv5 feature map]
• In practice, a single scale is good enough. (This is the main reason it can be 10x faster than SPP-Net.)
Fast R-CNN: Other tricks
• SVD on FC layers: 30% speed-up at test time with a small performance drop.
• Which layers to fine-tune? Fixing the shallow conv layers can reduce training time with a small performance drop.
• Data augmentation: using VOC12 as additional training data can boost mAP by ~3%.
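The SVD trick can be sketched as replacing one FC weight matrix with two thinner ones. This is a toy NumPy illustration: the matrix sizes, the rank, and the name `truncated_svd_fc` are assumptions, and the weight is built to be exactly low-rank so that the approximation error is negligible.

```python
import numpy as np

def truncated_svd_fc(weight, rank):
    """Factor an (out, in) FC weight matrix W into two thinner layers.

    Returns (first, second) with second @ first approximating W, where first is
    (rank, in) and second is (out, rank).
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    first = s[:rank, None] * vt[:rank]   # (rank, in): diag(s) @ V^T
    second = u[:, :rank]                 # (out, rank)
    return first, second

rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 128))  # rank-8 matrix
first, second = truncated_svd_fc(weight, rank=8)
x = rng.standard_normal(128)
max_error = np.max(np.abs(weight @ x - second @ (first @ x)))
params_before = weight.size
params_after = first.size + second.size
```

Here the parameter count drops from 64 * 128 to 8 * (64 + 128). On real FC weights the truncation is lossy, which is consistent with the small performance drop noted above.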
Fast R-CNN: Performance
• Without data augmentation, mAP improves by only +0.9 on VOC07.
• But training and testing have been greatly sped up (training 9x, testing 213x vs R-CNN).
• Without data augmentation, mAP improves by +2.3 on VOC12.
Fast R-CNN: Discussion about the Number of Proposals
Are more proposals always better? No!
Unified Approach: Motivation
No need for regions
Unified Approach: Framework
• Move away from a classification network.
• Use a deep network like GoogLeNet.
• Divide the image into a 7 by 7 grid.
• Each grid cell is responsible for predicting the object whose center falls in that cell.
• Predict the class probabilities and coordinates for the object.
• Testing time is reduced significantly, as no regions are required.
• The loss function is a combination of class probability error and bounding box regression error, as in Fast R-CNN.
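The grid assignment described above can be sketched in a few lines. The function name `responsible_cell`, the pixel box format, and the 448x448 image size in the example are assumptions for illustration.

```python
def responsible_cell(box, image_size, grid=7):
    """Return (row, col) of the grid cell containing the center of box.

    box is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Clamp so a center lying exactly on the right/bottom edge stays in the grid.
    col = min(int(cx / width * grid), grid - 1)
    row = min(int(cy / height * grid), grid - 1)
    return row, col

# Box centered at (200, 300) in a 448x448 image (hypothetical size).
cell = responsible_cell((100, 200, 300, 400), image_size=(448, 448))
```

Only this one cell receives the object's class and coordinate targets during training.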
Unified Approach: Training
• Most grid parameters will tend towards zero, as one object contributes to only one grid cell.
• Introduce an extra probability for background vs foreground.
• The class probability error loss is activated only for foreground cells.
• Optimize for Pr(Class|ob) rather than Pr(Class).
• Final probabilities are calculated as Pr(ob)*Pr(Class|ob).
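A small NumPy sketch of the final-probability computation, using a toy 2x2 grid with two classes instead of the 7x7 grid; array names and shapes are illustrative assumptions.

```python
import numpy as np

def final_class_scores(p_object, p_class_given_object):
    """Pr(Class) per cell = Pr(ob) * Pr(Class|ob).

    p_object: (S, S) objectness per grid cell.
    p_class_given_object: (S, S, num_classes) conditional class probabilities.
    """
    return p_object[..., None] * p_class_given_object

p_object = np.array([[0.9, 0.1], [0.0, 0.5]])   # toy 2x2 grid instead of 7x7
p_class = np.full((2, 2, 2), 0.5)               # two classes, uniform
scores = final_class_scores(p_object, p_class)
```

Cells with Pr(ob) near zero yield near-zero scores for every class, whatever their conditional class probabilities are.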
Unified Approach: Training
• Run the initial iterations by minimizing the Pr(ob) and Pr(Class|ob) errors separately.
• Joint minimization can be run in later stages.
• The network predicts the bounding box using convolutions over the whole image.
• This introduces localization error.
• Penalize predictions with lower IoU by rescaling the target probabilities to the IoU instead of 1.
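One reading of the last bullet, sketched in plain Python: the objectness target for a foreground cell is the IoU between its predicted box and the ground truth rather than a constant 1. The function names and the box format are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def objectness_target(predicted_box, truth_box):
    """Target for a foreground cell: the IoU of its predicted box, not a constant 1."""
    return iou(predicted_box, truth_box)

perfect = objectness_target((0, 0, 10, 10), (0, 0, 10, 10))
shifted = objectness_target((5, 0, 15, 10), (0, 0, 10, 10))
```

A perfectly localized box keeps a target of 1, while the half-shifted box gets a target of one third.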
Unified Approach: Detection layer
[Figures: detection layer, three slides]
Unified Approach: Network
A variant of GoogLeNet with pooling layers replaced by convolutional layers, which helps in localizing objects.
Leaky ReLU layers with 1.1x (x > 0) + 0.1x (x < 0) increase mAP.
A logistic layer at the end constrains predictions to the range 0 to 1.
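The two activations mentioned above can be sketched with NumPy. The slopes 1.1 and 0.1 are taken from the slide and read here as multipliers of x; that reading, and the function names, are assumptions.

```python
import numpy as np

def leaky_relu(x, positive_slope=1.1, negative_slope=0.1):
    """Leaky rectifier with the slopes quoted on the slide (read here as multipliers of x)."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, positive_slope * x, negative_slope * x)

def logistic(x):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

hidden = leaky_relu([-2.0, 0.0, 3.0])
outputs = logistic([-5.0, 0.0, 5.0])
```

The logistic output is what allows the final layer's values to be treated as probabilities and normalized coordinates.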
Unified Approach: Saliency
• Predicts images at 45 fps.
• Competitive performance with Fast R-CNN using CaffeNet (mAP score = 58.8) on VOC 2007.
• Almost 95 times faster than Fast R-CNN.
More details to be published in an upcoming paper.
THANKS

Detection

  • 2. Object Detection: Intuition Detection ≈ Localization + Classification 第 2页 | 共 25 页
  • 5. R-CNN: Pipeline Overview Step1. Input an image Step2. Use selective search to obtain ~2k proposals Step3. Warp each proposal and apply CNN to extract its features Step4. Adopt class-specified SVM to score each proposal Step5. Rank the proposals and use NMS to get the bboxes. Step6. Use class-specified regressors to refine the bboxes’ positions.Ross Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR14 第 5页 | 共 25 页
  • 6. R-CNN: Performance in PASCAL VOC07 • AlexNet(T-Net): 58.5 mAP • VGG-Net(O-Net): 66.0 mAP 第 6页 | 共 25 页
  • 7. R-CNN: Limitation • TOO SLOWWWW !!! (13s/image on a GPU or 53s/image on a CPU, and VGG-Net 7x slower) • Proposals need to be warped to a fixed size. 第 7页 | 共 25 页
  • 9. SPP-Net: Motivation • Cropping may loss some information about the object • Warpping may change the object’s appearance He et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, TPAMI15 第 9页 | 共 25 页
  • 10. SPP-Net: Spatial Pyramid Pooling (SPP) Layer • FC layer need a fixed-length input while conv layer can be adapted to arbitrary input size. • Thus we need a bridge between the conv and FC layer. • Here comes the SPP layer. 第 10页 | 共 25 页
  • 11. SPP-Net: Training for Detection (1) [Figure: image pyramid, conv, pyramid of Conv5 feature maps] Step 1. Generate an image pyramid and extract the conv FeatMap of the whole image
  • 12. SPP-Net: Training for Detection (2) • Step 2. For each proposal, walk the image pyramid and find the projected version whose number of pixels is closest to 224x224 (for scale invariance in training). • Step 3. Find the corresponding FeatMap in Conv5 and use the SPP layer to pool it to a fixed size. • Step 4. After collecting all the proposals' features, fine-tune the FC layers only. • Step 5. Train the class-specific SVMs.
  • 13. SPP-Net: Testing for Detection • Almost the same as R-CNN, except Step 3.
  • 14. SPP-Net: Performance • Speed: 64x faster than R-CNN using one scale, and 24x faster using a five-scale pyramid. • mAP: +1.2 mAP vs R-CNN
  • 15. SPP-Net: Limitation 1. Training is a multi-stage pipeline. 2. Training is expensive in space and time. [Figure labels: Conv layers, FC layers, SVM, regressor, store]
  • 17. Fast R-CNN: Motivation JOINT TRAINING!! Ross Girshick, Fast R-CNN, arXiv tech report
  • 18. Fast R-CNN: Joint Training Framework Join the feature extractor, classifier, and regressor together in a unified framework
  • 19. Fast R-CNN: RoI pooling layer ≈ one-scale SPP layer
  • 20. Fast R-CNN: Regression Loss A smooth L1 loss, which is less sensitive to outliers than the L2 loss
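The outlier behaviour can be seen by comparing the two losses on one residual. The sketch below uses the usual piecewise form of smooth L1 (quadratic for |x| < 1, linear beyond); the half-squared form of the L2 loss and the function names are assumptions chosen so the two curves are directly comparable.

```python
def smooth_l1(x):
    """Smooth L1: 0.5 * x^2 when |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def l2_loss(x):
    """Half squared error, used here only as a comparison baseline."""
    return 0.5 * x * x


for residual in (0.5, 3.0, 10.0):
    # For small residuals the two agree; for large ones smooth L1 grows
    # linearly while the L2 loss grows quadratically.
    print(residual, smooth_l1(residual), l2_loss(residual))
```

Because the gradient of the linear part is bounded, a single badly regressed box contributes far less to the update than it would under the L2 loss.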
  • 21. Fast R-CNN: Scale Invariance [Figure: image pyramids (multi-scale) vs. brute force (single scale), conv, Conv5 feature map] • In practice, single scale is good enough. (The main reason why it can be 10x faster than SPP-Net)
  • 22. Fast R-CNN: Other tricks • SVD on FC layers: 30% speed-up at test time with a small performance drop. • Which layers to fine-tune? Fixing the shallow conv layers can reduce training time with a small performance drop. • Data augmentation: using VOC12 as an additional training set can boost mAP by ~3%
  • 23. Fast R-CNN: Performance • Without data augmentation, mAP is only +0.9 on VOC07 • But training and testing have been greatly sped up (training 9x, testing 213x vs R-CNN) • Without data augmentation, mAP is +2.3 on VOC12
  • 24. Fast R-CNN: Discussion about #proposals Are more proposals always better? NO!
  • 26. Unified Approach: Motivation No Need For Regions
  • 27. Unified Approach: Framework • Move away from classification networks • Use a deep network like GoogLeNet • Divide the image into a 7 by 7 grid • Each grid cell is responsible for predicting an object whose center falls in that cell • Predict the class probabilities and coordinates for the object • Testing time reduces significantly as no regions are required • The loss function is a combination of class probability error and bounding box regression error, as in Fast R-CNN
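The grid assignment described on this slide can be sketched as a small lookup from an object's center to the cell responsible for it. The function name `responsible_cell`, the pixel-coordinate convention, and the 448x448 image size in the example are hypothetical; only the 7 by 7 grid comes from the slide.

```python
def responsible_cell(cx, cy, img_w, img_h, grid=7):
    """Return (row, col) of the grid cell containing the object center (cx, cy).

    Coordinates are in pixels with the origin at the top-left corner.
    """
    # Scale the center into grid units and clamp so that a center lying
    # exactly on the right or bottom edge still maps to the last cell.
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col


# An object centered in the middle of a 448 x 448 image (size is an assumption).
cell = responsible_cell(224, 224, 448, 448)
```

Only that one cell receives the object's class and box targets, which is why most cells end up predicting background.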
  • 28. Unified Approach: Training • Most grid parameters will tend towards zero, as one object contributes to only one grid cell • Introduce an extra probability for background vs foreground • The class probability error loss is activated only for foreground • Optimize for Pr(Class|ob) rather than Pr(Class) • Final probabilities are calculated as Pr(ob)*Pr(Class|ob)
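The final-probability rule on this slide is a per-cell product of the foreground probability and the conditional class probabilities. A minimal sketch follows, with a hypothetical function name and made-up numbers for one grid cell.

```python
def final_class_scores(p_object, p_class_given_object):
    """Combine Pr(ob) with the conditional class probabilities for one cell."""
    return [p_object * p for p in p_class_given_object]


# Illustrative values only: a cell that is 50% likely to contain an object,
# with conditional probabilities over two classes.
scores = final_class_scores(0.5, [0.2, 0.8])
best_class = max(range(len(scores)), key=lambda k: scores[k])
```

A cell with a low foreground probability therefore yields low scores for every class, regardless of how confident its conditional class prediction is.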
  • 29. Unified Approach: Training • Run the initial iterations by minimizing Pr(ob) and Pr(Class|ob) separately • Joint minimization can be run in later stages • The network predicts the bounding box using convolutions over the whole image • This introduces localization error • Penalize predictions with lower IoU by rescaling probabilities to the IoU instead of 1
  • 30. Unified Approach: Detection layer
  • 31. Unified Approach: Detection layer
  • 32. Unified Approach: Detection layer
  • 33. Unified Approach: Network • Variant of GoogLeNet with pooling layers replaced by convolutional layers, which helps in localizing objects • Leaky ReLU layer with 1.1(x>0) + 0.1(x<0) increases mAP • Logistic layer at the end to enforce predictions within 0 to 1
  • 34. Unified Approach: Saliency • Predicts on images at 45 fps • Competitive performance with Fast R-CNN using CaffeNet (mAP = 58.8) on VOC 2007 • Almost 95 times faster than Fast R-CNN • More details to be published in an upcoming paper
  • 35. THANKS