Locating Objects Without Bounding Boxes - paper review
1. Locating Objects Without Bounding Boxes
2021.01.10 Deep Learning Paper Reading Group - Image (Monday Team)
☃️ 이윤아 허다운 김선옥 신정호 ☃️
CVPR (Computer Vision and Pattern Recognition) 2019 - Oral, Best Paper Finalist (Top 1%)
2. 0. Motivation
▪ Bounding-box annotation is tedious, time-consuming, and expensive
▪ Many tiny objects → overlapping bounding boxes → low network performance
Small objects seem easier to locate with points than with bounding boxes
3. 1. Introduction
▪ What does "object localization" mean in this paper?
"... we define the object localization task as obtaining a single 2D coordinate corresponding to the location of each object."
4. 2. Related Work
▪ Generic object detectors
▪ Fast RCNN
▪ Faster RCNN
▪ SSD (Single shot multibox detector)
▪ YOLO
▪ …
These detectors either require ground-truth bounding boxes to train the CNNs or
require setting the maximum number of objects in the image being analyzed
5. 3. CNN Architecture
Estimated per-pixel probability (objectness)
Ground-truth object coordinates
Ground-truth number of objects
Estimated number of objects
Fully convolutional network (U-Net in this case)
7. Hausdorff Distance (HD), Average Hausdorff Distance (AHD)
Sensitive to outliers; not differentiable
4. The Average Hausdorff Distance
* The suprema in the HD are replaced by averages
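The difference between the two distances can be sketched numerically. The HD takes the worst nearest-neighbour distance in either direction, while the AHD averages the nearest-neighbour distances in each direction and adds the two averages. The sketch below uses NumPy; the point sets and the function names are hypothetical, chosen only to show how a single outlier dominates the HD but is diluted in the AHD.

```python
import numpy as np

def pairwise_distances(X, Y):
    """Euclidean distance between every point in X (n, 2) and Y (m, 2)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)

def hausdorff(X, Y):
    """HD: worst-case nearest-neighbour distance, taken in both directions."""
    D = pairwise_distances(X, Y)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def average_hausdorff(X, Y):
    """AHD: the two maxima of the HD are replaced by averages."""
    D = pairwise_distances(X, Y)
    return D.min(axis=1).mean() + D.min(axis=0).mean()

# Hypothetical toy data: three estimates match the ground truth, one is an outlier.
Y = [(0, 0), (10, 0), (0, 10)]
X = [(0, 0), (10, 0), (0, 10), (100, 100)]

hd = hausdorff(X, Y)           # driven entirely by the outlier
ahd = average_hausdorff(X, Y)  # the outlier's distance is averaged over |X| = 4
```

With these toy points the HD equals the outlier's distance to its nearest ground-truth point, and the AHD is one quarter of that value.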
8. 5. The Weighted Hausdorff Distance
First term
1. If the second term is removed, then the trivial solution is p_x = 0 ∀x ∈ Ω.
2. We multiply by p_x to penalize high activations in areas of the image where there is no ground-truth point y nearby. In other words, the loss function penalizes estimated points that should not be there.
0.8 0.1 0.1 0.1
0.1 0.2 0.3 0.1
0.1 0.1 0.9 0.2
0.1 0.1 0.4 0.2
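As a rough illustration of the first term, the sketch below evaluates it on the 4×4 probability map shown above. It assumes the first term has the form (1 / (Σ p_x + ε)) · Σ p_x · min_y d(x, y), and it assumes ground-truth points at the two peaks of the map; the slide does not state where the ground truth lies, so `Y` and the function name `whd_first_term` are illustrative only.

```python
import numpy as np

# Probability map from the slide (rows by columns).
p = np.array([
    [0.8, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.3, 0.1],
    [0.1, 0.1, 0.9, 0.2],
    [0.1, 0.1, 0.4, 0.2],
])

# ASSUMPTION: ground-truth points sit at the two peaks, given as (row, col).
Y = np.array([[0, 0], [2, 2]], dtype=float)

def whd_first_term(p, Y, eps=1e-6):
    """Probability-weighted average distance from each pixel to its nearest y."""
    rows, cols = np.indices(p.shape)
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)  # (N, 2)
    dist = np.linalg.norm(coords[:, None, :] - Y[None, :, :], axis=-1)     # (N, |Y|)
    nearest = dist.min(axis=1)      # distance from each pixel to its closest y
    probs = p.ravel()
    return (probs * nearest).sum() / (probs.sum() + eps)

term = whd_first_term(p, Y)                        # positive: stray activations are penalized
trivial = whd_first_term(np.zeros_like(p), Y)      # all-zero map gives zero penalty
```

The all-zero map drives this term to zero, which is why the first term alone admits the trivial solution p_x = 0.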
9. 5. The Weighted Hausdorff Distance
Second term
1. If the first term is removed, then the trivial solution is p_x = 1 ∀x ∈ Ω.
2. If p_x0 ≈ 1, then f(·) ≈ d(x0, y). This means the point x0 will contribute to the loss as in the AHD.
3. If p_x0 ≈ 0 and x0 ≠ y, then f(·) ≈ d_max. Then, if α = −∞, the point x0 will not contribute to the loss because the "minimum" M_{x∈Ω}[ · ] will ignore x0.
4. If another point x1 closer to y with p_x1 > 0 exists, x1 will be "selected" instead by M[ · ].
5. Low activations around ground-truth points will be penalized.
** Smooth approximation of the minimum function
(the α parameter controls the smoothness)
0.8 0 0.1 0.1
0.1 0.2 0.3 0.2
0.1 0.1 0.9 0.2
0.1 0.2 0.4 0.2
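The behaviour of the second term can be checked on the map above. The sketch assumes f(·) = p_x · d(x, y) + (1 − p_x) · d_max and a generalized mean M_α as the smooth minimum, averaged over the ground-truth points. The ground-truth positions, the choice α = −9, the use of the grid diagonal as d_max, and the name `whd_second_term` are all assumptions made for illustration.

```python
import numpy as np

# Probability map from the slide (rows by columns).
p = np.array([
    [0.8, 0.0, 0.1, 0.1],
    [0.1, 0.2, 0.3, 0.2],
    [0.1, 0.1, 0.9, 0.2],
    [0.1, 0.2, 0.4, 0.2],
])

# ASSUMPTION: ground-truth points sit at the two peaks, given as (row, col).
Y = [(0, 0), (2, 2)]

def whd_second_term(p, Y, alpha=-9.0, eps=1e-6):
    """Average over y of a smooth minimum over pixels of f = p*d + (1 - p)*d_max."""
    Y = np.asarray(Y, dtype=float)
    rows, cols = np.indices(p.shape)
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    dist = np.linalg.norm(coords[:, None, :] - Y[None, :, :], axis=-1)  # (N, |Y|)
    # ASSUMPTION: d_max is the largest pixel-to-pixel distance in the grid.
    d_max = np.hypot(p.shape[0] - 1, p.shape[1] - 1)
    probs = p.ravel()[:, None]
    f = probs * dist + (1.0 - probs) * d_max
    # Generalized mean with negative alpha; eps keeps the base strictly positive.
    soft_min = np.mean((f + eps) ** alpha, axis=0) ** (1.0 / alpha)     # one value per y
    return soft_min.mean()

second = whd_second_term(p, Y)
all_ones = whd_second_term(np.ones_like(p), Y)    # near zero: the trivial solution p_x = 1
all_zeros = whd_second_term(np.zeros_like(p), Y)  # near d_max: no activation near any y
```

An all-ones map makes this term vanish, matching item 1, while an all-zero map pushes it to d_max, matching the penalty on low activations around ground-truth points.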
10. 5. The Weighted Hausdorff Distance - Additional Explanation
Smooth approximation of minimum function
Proof of item 3.
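A small numerical check can stand in for the argument. Assuming the smooth minimum is the generalized mean M_α[v] = (mean(v^α))^(1/α) over positive values, the result approaches min(v) as α → −∞, and a large entry such as d_max stops influencing it. The values below are hypothetical.

```python
def generalized_mean(values, alpha):
    """M_alpha[values] = (mean(v ** alpha)) ** (1 / alpha), for positive values."""
    n = len(values)
    return (sum(v ** alpha for v in values) / n) ** (1.0 / alpha)

# Hypothetical f(.) values; the last entry plays the role of a pixel with p_x ≈ 0,
# for which f(.) ≈ d_max.
values = [2.0, 5.0, 9.0]

for alpha in (-1, -5, -20, -100):
    # The printed value shrinks toward min(values) = 2.0 as alpha decreases.
    print(alpha, generalized_mean(values, alpha))

# Making the "d_max" entry much larger barely changes the result for alpha = -20,
# which is the sense in which the smooth minimum ignores that pixel.
small_dmax = generalized_mean([2.0, 5.0, 9.0], -20)
large_dmax = generalized_mean([2.0, 5.0, 900.0], -20)
```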
12. 5. CNN Architecture (re-visited)
Estimated per-pixel probability
Ground-truth object coordinates
Ground-truth number of objects
Estimated number of objects
Fully convolutional network (U-Net in this case)
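The four quantities listed above suggest a training loss with two parts: a localization term that compares the probability map with the ground-truth coordinates, and a regression term that compares the estimated and ground-truth object counts. Below is a minimal sketch, assuming a smooth L1 penalty for the count term. The names `whd_value`, `smooth_l1`, and `total_loss` are hypothetical, and the numbers are made up.

```python
def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear for larger errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def total_loss(whd_value, true_count, estimated_count):
    """Localization term plus a penalty on the object-count error."""
    return whd_value + smooth_l1(true_count - estimated_count)

# Hypothetical values: a WHD of 2.0, three true objects, 3.4 estimated objects.
loss = total_loss(whd_value=2.0, true_count=3, estimated_count=3.4)
```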
13. 5. CNN Architecture and Location Estimation
(the estimated object locations) (the confidence that there is an object at pixel coordinate x)
Gaussian Mixture Model
Ex) |T| = 10 Ex) = 3
Setting the Threshold
How can we set the threshold? GMM?
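The location-estimation step can be sketched as follows: keep the pixels whose probability exceeds a threshold, then fit a mixture with one component per estimated object and read the component means as object locations. This is a simplified illustration, not the authors' code. It uses a hand-written EM loop for a spherical-covariance Gaussian mixture, a fixed threshold `tau` chosen by hand, and a made-up 8×8 map built so that 10 pixels survive the threshold and 3 objects are expected.

```python
import numpy as np

def estimate_locations(p, tau, n_objects, n_iter=50):
    """Threshold p at tau, then fit a spherical GMM to the surviving pixels via EM."""
    T = np.argwhere(p > tau).astype(float)        # (|T|, 2) coordinates as (row, col)

    # Deterministic farthest-point initialization of the component means.
    means = [T[0]]
    for _ in range(n_objects - 1):
        d = np.min([np.linalg.norm(T - m, axis=1) for m in means], axis=0)
        means.append(T[np.argmax(d)])
    means = np.array(means)
    var = np.full(n_objects, 1.0)
    weights = np.full(n_objects, 1.0 / n_objects)

    for _ in range(n_iter):
        # E-step: responsibilities of each component for each pixel (log domain).
        sq = ((T[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)   # (|T|, K)
        log_r = np.log(weights) - sq / (2.0 * var) - np.log(var)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: update weights, means, and per-component variances.
        nk = r.sum(axis=0)
        means = (r.T @ T) / nk[:, None]
        sq = ((T[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
        var = (r * sq).sum(axis=0) / (2.0 * nk) + 1e-6   # small floor for stability
        weights = nk / len(T)
    return T, means

# Hypothetical probability map with three blobs (4 + 3 + 3 = 10 bright pixels).
p = np.full((8, 8), 0.05)
p[0:2, 0:2] = 0.9      # blob around (0.5, 0.5)
p[5, 1:4] = 0.9        # blob around (5, 2)
p[1:4, 6] = 0.9        # blob around (2, 6)

T, means = estimate_locations(p, tau=0.5, n_objects=3)
```

The threshold is fixed by hand here, so the question of how to choose it remains open in this sketch.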