(Research Note) Delving deeper into convolutional neural networks for camera relocalization
1. Euler angle and gimbal lock
2017/12/13 Delving deeper into convolutional neural networks for camera relocalization 1
2. Euler angle and gimbal lock
Loss of a degree of freedom with Euler angles
When 𝛽 = 𝜋/2, cos(𝜋/2) = 0 and sin(𝜋/2) = 1
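The lost degree of freedom can be checked numerically. The sketch below assumes the composition R = Rz(𝛼) Ry(𝛽) Rx(𝛾); the slide does not state its Euler convention, so the helper names and the axis order are illustrative only. With 𝛽 = 𝜋/2, shifting 𝛼 and 𝛾 by the same amount leaves the rotation matrix unchanged, so two of the three angles collapse into one.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def euler_to_matrix(alpha, beta, gamma):
    # Assumed convention (not given on the slide): R = Rz(alpha) Ry(beta) Rx(gamma)
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))

def close(a, b, tol=1e-9):
    """Element-wise comparison of two 3x3 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))

# At beta = pi/2, adding 0.5 rad to both alpha and gamma changes nothing.
locked = close(euler_to_matrix(0.3, math.pi / 2, 0.1),
               euler_to_matrix(0.8, math.pi / 2, 0.6))

# At a generic beta, the same shift produces a different rotation.
generic = close(euler_to_matrix(0.3, 0.4, 0.1),
                euler_to_matrix(0.8, 0.4, 0.6))

print(locked, generic)
```

Under this convention the matrix at 𝛽 = 𝜋/2 depends only on 𝛾 − 𝛼, which is why the first comparison reports equality and the second does not.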
3. Euler angle and gimbal lock
4. Euler angle and gimbal lock
Loss of a degree of freedom with Euler angles
5. Resolve gimbal lock (loss of a degree of freedom)
1. Change 𝛽
2. Use a different orientation representation => quaternion
Rotations don't commute: 𝑅𝑥𝑅𝑦 ≠ 𝑅𝑦𝑅𝑥
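A small numerical check of the non-commutativity claim, using 90-degree rotations about the x and y axes. The helper names are illustrative, and the matrices follow the usual right-handed convention, which is an assumption since the slide does not specify one.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two 3x3 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))

quarter_turn = math.pi / 2
xy = matmul(rot_x(quarter_turn), rot_y(quarter_turn))  # Rx * Ry
yx = matmul(rot_y(quarter_turn), rot_x(quarter_turn))  # Ry * Rx

# The two products differ, so the order of rotations matters.
commutes = close(xy, yx)
print(commutes)
```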
6. Quaternion
The history
Complex number
https://www.youtube.com/watch?v=mHVwd8gYLnI&t=2s
Extending complex numbers
What is 𝑏𝑐 𝑖𝑗?
How should 𝑖𝑗 be defined?
7. Quaternion
Forget about 𝑖𝑗; how about defining another unit, 𝑘?
[Figure: diagram of the units 𝑖, 𝑗, 𝑘]
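Introducing 𝑘 leads to Hamilton's standard multiplication rules, 𝑖² = 𝑗² = 𝑘² = 𝑖𝑗𝑘 = −1, with 𝑖𝑗 = 𝑘 and 𝑗𝑖 = −𝑘. The slide only hints at these rules, so the sketch below is an illustration of the standard quaternion product, with quaternions stored as (w, x, y, z) tuples by assumption.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

ij = qmul(i, j)            # equals k
ji = qmul(j, i)            # equals -k, so the product is not commutative
ijk = qmul(qmul(i, j), k)  # equals -1

print(ij, ji, ijk)
```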
8. Double cover of quaternion
There are two distinct quaternions for each distinct orientation frame in 3D space.
The belt trick reflects this double-valued relationship, distinguishing a one-circuit 360-degree rotation
from the equivalent two-circuit 720-degree rotation.*
When applying regression to similar images, we may get distinct quaternions.
* Andrew J. Hanson (6 February 2006). Visualizing Quaternions. Elsevier. pp. 114–. ISBN 978-0-08-047477-9.
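The double cover can be verified directly: a unit quaternion q and its negation −q map to the same rotation matrix. This is a minimal sketch assuming the (w, x, y, z) component order and the standard quaternion-to-matrix formula; the function name is illustrative.

```python
def quat_to_matrix(q):
    """Rotation matrix of a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

q = (0.5, 0.5, 0.5, 0.5)          # unit quaternion
neg_q = tuple(-c for c in q)      # its antipode on the unit sphere

# Every entry of the matrix is quadratic in q, so the sign cancels.
same_rotation = quat_to_matrix(q) == quat_to_matrix(neg_q)
print(same_rotation)
```

Because both quaternions are valid labels for one orientation, a regressor trained on such labels can be pulled toward either of them.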
9. National Chung Cheng University, Taiwan
Robot Vision Laboratory
2017/12/03
Jacky Liu
(Research Note)
Delving deeper into convolutional neural
networks for camera relocalization
10. About this work
Delving deeper into convolutional neural networks
for camera relocalization
Wu, Jian (1); Ma, Liwei (2); Hu, Xiaolin (1)
ICRA 2017 - IEEE International Conference on Robotics and Automation
1. Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
2. Intel Labs China, Intel Corporation, 100090, Beijing, China liwei.ma@intel.com
11. Contributions
1. A good rotation representation that solves the double cover problem of quaternions (which are used by PoseNet)
=> Euler6
2. Camera poses in the training set are always very sparse in the whole pose space.
=> pose synthesis
3. Regressing orientation and translation together might not be optimal
=> BranchNet
12. Related work
Camera relocalization
Keypoints: SIFT, ORB, SCoRe
Keyframes: G. Klein (2008), A. P. Gee (2012)
However, these methods provide only a coarse estimate of the camera pose because of the sparsity of poses in the training set.
Camera relocalization Multi-task CNNs
13. Related work
Camera relocalization - CNN
PoseNet (keyframe-based approach)
• Encodes the keyframes of the training set into the model parameters.
SE3-Net
• Point cloud data limits this algorithm to RGB-D
• The number of predicted objects must be specified in training
14. Related work
Multi-task CNNs
TCDCN
1. Facial landmark detection
2. Appearance attribute and expression
HyperFace
1. Face detection
2. Localizing landmarks
3. Head pose
4. Gender
• Sharing lower layers for low-level common knowledge
• Separate higher layers for specific predictions
R-CNN
1. Human pose estimation
2. Action detection
MCNNs
1. Attribute relationships
2. Attribute classifiers
15. Related work
• Sharing lower layers for low-level common knowledge
• Separate higher layers for specific predictions
[Figure: network diagram with one Input and two outputs, Task1 and Task2]
16. Method
Summary
A. Orientation Representation
B. Pose Synthesis
C. Multi-task CNN for Camera Relocalization
17. Method - Orientation Representation
Predicted: Q = [0, 1, 0, 0]
Ground truth: Q’ = [0, -1, 0, 0]
[Figure: loss with translation and orientation terms]
Even if we get the right orientation, we still have a large error.
Quaternion, Euler6
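The example above can be worked through numerically. Q and Q’ describe the same orientation, yet their Euclidean distance is 2. The sketch below also shows why a sin/cos embedding avoids this. It assumes that Euler6 is the vector [sin 𝜙, cos 𝜙, sin 𝜃, cos 𝜃, sin 𝜓, cos 𝜓]; the slides do not spell out the definition, so treat `euler6` and `euler6_to_angles` as hypothetical helpers.

```python
import math

def l2(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euler6(phi, theta, psi):
    # Assumed definition: sine and cosine of each Euler angle.
    return [math.sin(phi), math.cos(phi),
            math.sin(theta), math.cos(theta),
            math.sin(psi), math.cos(psi)]

def euler6_to_angles(v):
    """Recover the three angles from an Euler6 vector with atan2."""
    return (math.atan2(v[0], v[1]),
            math.atan2(v[2], v[3]),
            math.atan2(v[4], v[5]))

predicted = [0, 1, 0, 0]
ground_truth = [0, -1, 0, 0]
quat_error = l2(predicted, ground_truth)  # large, although the rotation is identical

# The same half turn written as +pi and -pi maps to (almost) one Euler6 vector.
euler6_error = l2(euler6(math.pi, 0.0, 0.0), euler6(-math.pi, 0.0, 0.0))

print(quat_error, euler6_error)
```

The embedding is periodic in each angle, so angles that differ by a full turn yield the same regression target, and `euler6_to_angles` maps a prediction back to angles.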
18. Pose Synthesis
Overfitting on sparse trajectory
19. How to resolve overfitting?
(Hint: 2 methods)
20. Pose Synthesis
Overfitting on sparse trajectory
21. Method
Multi-task CNN for Camera Relocalization
To quantitatively understand the relationship between orientation and translation
6DoF pose = [𝑋, 𝑌, 𝑍, 𝜙, 𝜃, 𝜓], where 𝑋, 𝑌, 𝑍 are the translation and 𝜙, 𝜃, 𝜓 the rotation
Intra-group correlations (self-correlations are excluded)
• Orientation: 0.391
• Translation: 0.293
Inter-group correlation
• 0.256
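One way such group statistics could be computed is sketched below. The slide does not say how the coefficients were aggregated, so averaging the absolute Pearson correlation over all pairs of pose components is an assumption, and the random poses are placeholder data, not the 7Scenes poses.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mean_abs_corr(cols_a, cols_b, same_group):
    """Average |r| over component pairs; skips self and mirrored pairs within a group."""
    values = []
    for i, a in enumerate(cols_a):
        for j, b in enumerate(cols_b):
            if same_group and j <= i:
                continue
            values.append(abs(pearson(a, b)))
    return sum(values) / len(values)

def group_correlations(poses):
    """poses: rows of [X, Y, Z, phi, theta, psi]."""
    cols = list(zip(*poses))
    translation, orientation = cols[:3], cols[3:]
    return {
        "intra_translation": mean_abs_corr(translation, translation, True),
        "intra_orientation": mean_abs_corr(orientation, orientation, True),
        "inter": mean_abs_corr(translation, orientation, False),
    }

# Placeholder poses; real use would load the training-set camera poses.
rng = random.Random(0)
poses = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
stats = group_correlations(poses)
print(stats)
```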
22. Method
Multi-task CNN for Camera Relocalization
Learning from the statistics
• In the extreme case, regressing orientation and translation separately with two individual networks may also give better results.
High computation cost of individual networks
• But regressing orientation and translation individually significantly increases the computing cost.
Balance: branching
[Figure: network diagrams with translation and orientation outputs]
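The branching idea can be sketched as a shared trunk followed by two task-specific heads. This is not the BranchNet architecture from the paper: the layer sizes, the single shared layer, and the 6-value orientation output (assuming an Euler6 target) are all illustrative assumptions, and plain Python lists stand in for a real deep learning framework.

```python
import random

def dense(x, weights, biases):
    """Fully connected layer; weights holds one row of input weights per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, biases)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def branch_forward(features, params):
    # Lower layers are shared and computed once for both tasks.
    shared = relu(dense(features, *params["shared"]))
    # Higher layers are separate, one branch per task.
    t_hidden = relu(dense(shared, *params["t_hidden"]))
    o_hidden = relu(dense(shared, *params["o_hidden"]))
    translation = dense(t_hidden, *params["t_out"])
    orientation = dense(o_hidden, *params["o_out"])
    return translation, orientation

rng = random.Random(0)
params = {
    "shared": init_layer(rng, 16, 32),
    "t_hidden": init_layer(rng, 32, 8),
    "o_hidden": init_layer(rng, 32, 8),
    "t_out": init_layer(rng, 8, 3),   # X, Y, Z
    "o_out": init_layer(rng, 8, 6),   # assumed Euler6 output
}
features = [rng.uniform(-1, 1) for _ in range(16)]
translation, orientation = branch_forward(features, params)
print(len(translation), len(orientation))
```

Only the branch layers are duplicated, so the extra cost stays well below that of running two full networks.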
23. Method
Summary
A. Orientation Representation
B. Pose Synthesis
C. Multi-task CNN for Camera Relocalization
24. Experiment
Dataset: 7Scenes
• Each sequence (seq-XX.zip) consists of 500-1000 frames
• RGBD: 640x480 => 343x256
• Initial learning rate 10^-5 (dropped by 90% every 10,000 iterations)
• Training ends at iteration 45,000
Hardware
• 2 NVIDIA Titan X GPUs
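The learning rate schedule can be written as a step decay. The sketch assumes the drop is applied as a multiplication by 0.1 at every multiple of 10,000 iterations; the function and parameter names are illustrative, not taken from the training configuration.

```python
def learning_rate(iteration, base_lr=1e-5, factor=0.1, step=10_000):
    """Step decay: keep 10% of the rate after every `step` iterations."""
    return base_lr * factor ** (iteration // step)

# Rates at the start, after the first drop, and at the final iteration.
schedule = {it: learning_rate(it) for it in (0, 10_000, 45_000)}
print(schedule)
```

Under this reading, four drops have happened by iteration 45,000, so the final rate is about 10^-9.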
34. Pretrain
Surprisingly, pretraining on ImageNet increases the error.
35. Does FCN help?
36. Efficiency of the BranchNet
Storing the weights of BranchNet-Euler6 took 46 MB.
Branching increased the forward time from 5 ms to 6 ms per frame on an NVIDIA Titan X GPU.
BranchNet-Euler6 was also run on the GPU of an Intel NUC mobile platform (Intel Core i5-6260U) with clCaffe [24] and reached a speed of 43 fps, which meets the real-time requirement of many robotic applications.
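To compare the timings above on one scale, per-frame time and frame rate can be converted into each other. This is plain arithmetic on the quoted numbers, with illustrative function names.

```python
def ms_to_fps(ms_per_frame):
    """Frames per second for a given per-frame time in milliseconds."""
    return 1000.0 / ms_per_frame

def fps_to_ms(fps):
    """Per-frame time in milliseconds for a given frame rate."""
    return 1000.0 / fps

titan_before = ms_to_fps(5)   # forward pass without branching
titan_after = ms_to_fps(6)    # forward pass with branching
nuc_ms = fps_to_ms(43)        # Intel NUC GPU at 43 fps

print(titan_before, round(titan_after, 1), round(nuc_ms, 1))
```

So the 1 ms overhead of branching lowers the Titan X rate from 200 fps to roughly 167 fps, and 43 fps on the NUC corresponds to about 23 ms per frame.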
37. Conclusion
CNN-based camera relocalization
1. A new orientation representation Euler6.
2. The pose synthesis for data augmentation.
3. The BranchNet for multi-task regression.
Experiments showed that all of the above techniques improved the relocalization accuracy, and together they reduced the error of previous methods by a significant margin.
38. Conclusion
• Works well on monocular images; with RGB-D input, SCoRe Forests [2] still perform better.
• They attempted to utilize the depth information by simply adding the depth image as a fourth channel to the original RGB input, but did not obtain much better results than the current results.
• How to utilize the depth information to improve the performance of the CNN remains an open problem.
39. Recap
1. Euler => Quaternion => Euler6
2. Correlation analysis => important for multi-task CNN
3. Separate network / Branching => efficiency
4. Data augmentation (pose synthesis)
5. Do we need FC (or other) layers?
6. Does pretraining on another dataset always help?