Softassign and EM-ICP on GPU

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda
19th Nov. 2010

Contribution of this talk
• Fast GPU implementations of registration algorithms for 3D point sets:
  • Softassign [Gold et al., 1998]
  • EM-ICP [Granger et al., 2002]
  • (Weighted) Horn’s method [Horn, 1987]
• So, what is “registration”?

What is “Registration” or “Alignment”?
A set of images
Image registration

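Registration of 3D point sets
• A statue
• Range data from one view
• Range data from another view
• Aligned (registered) 3D point cloud
• An example of a rendered CG image of the statue
Figure source: Takeshi Oishi, Tomohito Masuda, Ryo Kurazume, Katsushi Ikeuchi, "Digital restoration of the original Great Buddha of Nara and the Great Buddha Hall" (in Japanese), Transactions of the Virtual Reality Society of Japan, Vol. 10, No. 3, pp. 429-436, October 2005.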

3D registration algorithm
• Input
  • Two point sets: 𝑋 and 𝑌
• Output
  • Rotation matrix 𝑅
  • Translation vector 𝒕
[Figure: point sets 𝑋 and 𝑌, related by 𝑅 and 𝒕]

Algorithms for registration
Horn’s method
• Corresponding point sets are given.
• Estimates 𝑅 and 𝒕.
ICP (Iterative Closest Point)
• Correspondence is unknown.
• Fast, standard.
• Fails easily because of local minima.
• Many variants have followed.
Softassign
• Robust.
• Very slow because of iterations.
EM-ICP
• Robust.
• Very slow because of iterations.

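Horn’s method: correspondence is known.
• Unknown correspondence: it is not known which point of 𝑌 matches each point of 𝑋.
• Known correspondence: the i-th rows of 𝑋 and 𝑌 form a matched pair.
$X = (\mathbf{x}_1, \mathbf{x}_2, \ldots)^T$, $Y = (\mathbf{y}_1, \mathbf{y}_2, \ldots)^T$, where $\mathbf{x}_1 = (x_{1x}, x_{1y}, x_{1z})^T$.
1. Compute the centers $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$ of $X$ and $Y$.
2. Centering: $X - \bar{\mathbf{x}}$ and $Y - \bar{\mathbf{y}}$.
3. Compute the $3 \times 3$ matrix $S = X^T Y$ from the centered sets, and build the $4 \times 4$ matrix $K$ from $S$.
4. Compute the first eigenvector $\mathbf{q}$ of $K$, which is a quaternion, and convert $\mathbf{q}$ to $R$.
5. $\mathbf{t} = \bar{\mathbf{x}} - R\bar{\mathbf{y}}$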
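The five steps can be sketched in plain Python. This is a minimal illustration, not the authors' GPU code. The function name `horn`, the `iters` parameter and the starting vector are assumptions made for the example. The entries of the 4 x 4 matrix follow Horn's 1987 paper, and a shifted power iteration stands in for a proper eigen-solver.

```python
import math

def horn(X, Y, iters=500):
    """Estimate R (3x3) and t so that x_i is close to R y_i + t for paired rows."""
    n = len(X)
    # Steps 1 and 2: centers of both sets, then centered copies.
    xc = [sum(p[a] for p in X) / n for a in range(3)]
    yc = [sum(p[a] for p in Y) / n for a in range(3)]
    Xc = [[p[a] - xc[a] for a in range(3)] for p in X]
    Yc = [[p[a] - yc[a] for a in range(3)] for p in Y]
    # Step 3: S = Xc^T Yc (3x3). H is its transpose, H[a][b] = sum_i y_ia * x_ib,
    # which is the form Horn's formulas expect for a rotation taking Y to X.
    S = [[sum(Xc[i][a] * Yc[i][b] for i in range(n)) for b in range(3)]
         for a in range(3)]
    H = [[S[b][a] for b in range(3)] for a in range(3)]
    K = [
        [H[0][0] + H[1][1] + H[2][2], H[1][2] - H[2][1], H[2][0] - H[0][2], H[0][1] - H[1][0]],
        [H[1][2] - H[2][1], H[0][0] - H[1][1] - H[2][2], H[0][1] + H[1][0], H[2][0] + H[0][2]],
        [H[2][0] - H[0][2], H[0][1] + H[1][0], -H[0][0] + H[1][1] - H[2][2], H[1][2] + H[2][1]],
        [H[0][1] - H[1][0], H[2][0] + H[0][2], H[1][2] + H[2][1], -H[0][0] - H[1][1] + H[2][2]],
    ]
    # Step 4: eigenvector of the largest eigenvalue by power iteration.
    # Adding shift * I makes every eigenvalue non-negative (assumed stand-in solver).
    shift = max(sum(abs(v) for v in row) for row in K)
    q = [1.0, 0.9, 0.8, 0.7]
    for _ in range(iters):
        q = [sum(K[r][c] * q[c] for c in range(4)) + shift * q[r] for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    # Convert the unit quaternion q = (w, x, y, z) to a rotation matrix.
    w, x, y, z = q
    R = [
        [w*w + x*x - y*y - z*z, 2*(x*y - w*z), 2*(x*z + w*y)],
        [2*(y*x + w*z), w*w - x*x + y*y - z*z, 2*(y*z - w*x)],
        [2*(z*x - w*y), 2*(z*y + w*x), w*w - x*x - y*y + z*z],
    ]
    # Step 5: t = center of X minus R times center of Y.
    t = [xc[a] - sum(R[a][b] * yc[b] for b in range(3)) for a in range(3)]
    return R, t
```

Called as `R, t = horn(X, Y)` with two lists of paired 3D points. The sketch does not handle degenerate input such as all points coinciding.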

ICP: correspondence is unknown.
• For each point $\mathbf{x}_i$ in $X$, find the closest (nearest) point $\mathbf{y}_j$ in $Y$ and put it into $Y^*$.
• Apply Horn’s method to $X$ and $Y^*$ to estimate $R$ and $\mathbf{t}$.
• Replace $Y$ by $RY + \mathbf{t}$ and repeat.
• Fast, but fails easily because of the hard correspondence.
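The correspondence step of ICP, which builds the set of closest points before Horn's method is applied, can be sketched as a brute-force search. The helper name `closest_points` is hypothetical, and practical implementations usually use a spatial index instead of comparing every pair.

```python
def closest_points(X, Y):
    """For each x_i in X, return its nearest point of Y (one row of Y*)."""
    def sq_dist(p, q):
        # Squared Euclidean distance; the square root is not needed for ranking.
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [min(Y, key=lambda y: sq_dist(x, y)) for x in X]
```

Several points of X may pick the same point of Y. Each point gets exactly one partner, which is the hard correspondence that makes ICP sensitive to the initial pose.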

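Softassign: soft correspondence.
• Every pair $(\mathbf{x}_i, \mathbf{y}_j)$ gets a weight $m_{ij}$, collected in the matrix $M$.
• $m_{ij}$ is computed from the distance $\|\mathbf{x}_i - (R\mathbf{y}_j + \mathbf{t})\|$.
• Each row and column of $M$ should be normalized to sum to 1 by Sinkhorn iterations.
• Weighted Horn’s method with $X$ and $Y$.
• Repeat.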

Sinkhorn iterations
• Each row and each column of $M$ (entries $m_{ij}$) should sum up to 1.
• Repeat row and column normalization until convergence.
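The alternating normalization can be sketched in a few lines of plain Python. The function name `sinkhorn` and the fixed iteration count `iters` are assumptions for illustration, and all entries are assumed to be positive.

```python
def sinkhorn(M, iters=100):
    """Alternately normalize the rows and columns of a positive matrix."""
    M = [row[:] for row in M]  # work on a copy, leave the input untouched
    n_rows, n_cols = len(M), len(M[0])
    for _ in range(iters):
        # Row normalization: divide each row by its sum.
        for i in range(n_rows):
            s = sum(M[i])
            M[i] = [v / s for v in M[i]]
        # Column normalization: divide each column by its sum.
        for j in range(n_cols):
            s = sum(M[i][j] for i in range(n_rows))
            for i in range(n_rows):
                M[i][j] /= s
    return M
```

A fixed number of sweeps avoids comparing the matrix with its previous state after every sweep.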

Sinkhorn.GPU (row normalization)
• Row sums: the row-sum vector $\boldsymbol{R} = M\mathbf{1}$, where $\mathbf{1} = (1, 1, \ldots, 1)^T$, using sgemv of CUBLAS.
• Row-wise division of $M$ by $\boldsymbol{R}$, using a CUDA kernel.
• Column normalization is done in the same way.

Weighted Horn’s method
• Normal version: $S = X^T Y$ ($3 \times 3$).
• Weighted version: $S = X^T M Y$ ($3 \times 3$).
• Using CUBLAS sgemv twice.

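Centering.GPU (weighted version)
• Row sums: $\boldsymbol{R} = M\mathbf{1}$.
• Weighted sum: multiply each row of $X$ by the matching entry of $\boldsymbol{R}$ (CUDA kernel), then sum up (CUBLAS sasum).
• Sum of weights: sum up the entries of $\boldsymbol{R}$ (CUBLAS sasum).
• Weighted center $\bar{\mathbf{x}}$: the weighted sum divided by the sum of weights.
• The same is done for $\bar{\mathbf{y}}$.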
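As a CPU-side illustration of the weighted center, the sketch below weights each point of X by the matching row sum of M. The function name `weighted_center` is hypothetical, and plain Python sums replace the CUBLAS and CUDA calls named on the slide.

```python
def weighted_center(X, M):
    """Weighted center of X, where the weight of x_i is the i-th row sum of M."""
    row_sums = [sum(row) for row in M]   # row-sum vector, M times a vector of ones
    total = sum(row_sums)                # sum of all weights
    # Weighted sum of the points, one coordinate at a time.
    weighted_sum = [sum(r * p[a] for r, p in zip(row_sums, X)) for a in range(3)]
    return [s / total for s in weighted_sum]
```

For the other point set, the same function can be called with the transposed matrix, so that column sums act as the weights.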

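Pipeline of Softassign.GPU
• Input: $X$ and $Y$.
• Compute $M$ with a CUDA kernel.
• Sinkhorn.GPU.
• Centering.GPU: $\bar{\mathbf{x}}$, $\bar{\mathbf{y}}$.
• Weighted Horn’s method: $S = X^T M Y$, then $K$.
• Solve the eigenvalue problem to obtain $R$ and $\mathbf{t}$.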

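EM-ICP: soft correspondence.
• The matrix $A$ is computed from the distances $d_{ij} = \|\mathbf{x}_j - (R\mathbf{y}_i + \mathbf{t})\|$ between every $\mathbf{y}_i$ in $Y$ and every $\mathbf{x}_j$ in $X$.
• Each row of $A$ is normalized once.
• Pseudo correspondence $X'$, with one point $\mathbf{x}'_i$ for each $\mathbf{y}_i$.
• Weighted Horn’s method with $X'$ and $Y$.
• Repeat.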
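A minimal sketch of how the pseudo correspondence could be formed on the CPU is given below. The Gaussian weight with a hypothetical scale `sigma`, the function name `pseudo_correspondence` and the argument `Y_moved` (the points of Y after applying the current rotation and translation) are assumptions for illustration. The sketch has no outlier handling.

```python
import math

def pseudo_correspondence(X, Y_moved, sigma=1.0):
    """Return X', one weighted average of the points of X per point of Y_moved."""
    X_prime = []
    for y in Y_moved:
        # One row of A: a weight for every x_j, decreasing with the distance to y.
        # The Gaussian form is an assumed weight function.
        a = [math.exp(-sum((xa - ya) ** 2 for xa, ya in zip(x, y)) / sigma ** 2)
             for x in X]
        c = sum(a)
        a = [v / c for v in a]  # the row is normalized once
        # Pseudo point: combination of all points of X with the normalized weights.
        X_prime.append([sum(w * x[k] for w, x in zip(a, X)) for k in range(3)])
    return X_prime
```

Each pseudo point is then paired with its point of Y, so a method for known correspondences can be applied to the two sets.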

Row normalization on GPU
• Row sums: $\boldsymbol{C} = A\mathbf{1}$, where $\mathbf{1} = (1, 1, \ldots, 1)^T$. $A$ is not normalized yet.
• Row-wise division + sqrt, using a CUDA kernel. $A$ is now normalized.

Computing weights
• $\boldsymbol{\lambda} = A\mathbf{1}$, with the normalized $A$.

Pseudo correspondence
• $X' = AX$, with the normalized $A$, using CUBLAS sgemv.
• Centering: same as in Softassign.GPU.

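Weighted Horn’s method
• Weighted version (not efficient): $S = X'^T \mathrm{diag}(\lambda_1, \lambda_2, \ldots)\, Y$ ($3 \times 3$).
• Weighted version (2 steps): scale each row of $X'$ by the matching entry of $\boldsymbol{\lambda}$ with a CUDA kernel, then compute $S = (\boldsymbol{\lambda} * X')^T Y$ ($3 \times 3$) with CUBLAS sgemm.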

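Pipeline of EM-ICP.GPU
• Input: $X$ and $Y$.
• Compute $A$ with a CUDA kernel.
• Row normalization on GPU.
• Centering.GPU: $\bar{\mathbf{x}}$, $\bar{\mathbf{y}}$.
• 2-step weighted Horn’s method: $S = (\boldsymbol{\lambda} * X')^T Y$, then $K$.
• Solve the eigenvalue problem to obtain $R$ and $\mathbf{t}$.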

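Computing time over different numbers of points
[Plot: computing time against the number of points.]
• Plot annotation: successfully aligned 5000 points in less than 7 seconds.
• Plot annotation: slightly faster, but failed.
GPU: GeForce 8800GT. CPU: Intel Core2 Quad + OpenMP (4 cores).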

Summary
• 3D registration algorithms implemented on a GPU:
  • Softassign,
  • EM-ICP,
  • Weighted Horn’s method.
• EM-ICP.GPU is
  • able to align 5000 points within 7 seconds,
  • 60 times faster than EM-ICP.CPU,
  • more robust than ICP.CPU.
• Code, binary, and movies are available at:
  http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/

Limitations
• Number of points
  • Should be less than 8000 for a GeForce 8800GT with 512 MB of memory.
  • More memory, more points.
• Stopping condition
  • Requires storing the whole matrix 𝑀 or 𝐴 and comparing it with the previous one, which is inefficient.
  • Hence, currently, the number of iterations is fixed.
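A rough estimate shows why the point count is tied to memory. The sketch assumes both point sets have the same size, that the correspondence matrix is dense, and that values are single precision (4 bytes, as the "s" prefix of the CUBLAS routines suggests). The helper name `matrix_megabytes` is hypothetical.

```python
def matrix_megabytes(n_points, bytes_per_value=4):
    """Size in MB (10**6 bytes) of a dense n_points x n_points matrix."""
    return n_points * n_points * bytes_per_value / 1e6

# One dense matrix for 8000 points, and the same matrix plus a stored previous copy.
one_copy = matrix_megabytes(8000)
two_copies = 2 * one_copy
```

Under these assumptions one matrix for 8000 points takes 256 MB, and keeping a previous copy for a convergence test would double that to about the whole 512 MB of the card.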
