COSC 426 Lecture 5 on Mathematical Principles Behind AR Registration. Given by Adrian Clark from the HIT Lab NZ at the University of Canterbury, August 8, 2012
2. Registration
• We wish to calculate the transformation from the camera to the object (the extrinsic parameters). In order to do this, we must find the transformation from the camera to the image plane (the camera intrinsics), and combine that with the transformation from known points on the object to their locations in the image plane.
3. Object to Image Plane
• The point on the image plane (px, py) is given by the ray passing from the object point (Px, Py, Pz) through the camera focal point and intersecting the image plane at focal length f, such that:

  px = f · Px / Pz,   py = f · Py / Pz
4. Object to Image Plane
• The previous formulas can be represented in matrix form as:

  s · [px, py, 1]ᵀ = [[f, 0, 0], [0, f, 0], [0, 0, 1]] · [Px, Py, Pz]ᵀ

  (the equation is non-linear – s is a scale factor)
• The previous equations assume a perfect pinhole aperture. Instead we have a lens, which has a principal point (up, vp) – the transformation from the camera origin to the image plane origin – and scale factors (sx, sy) converting pixel distance to real-world units (mm).
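The full chain above – ray through the focal point, then principal point and pixel scaling – can be sketched in a few lines. This is a minimal pure-Python illustration; the focal length, principal point and scale values below are made-up numbers, not real calibration data:

```python
# Pinhole projection with intrinsics: project a camera-space point
# (Px, Py, Pz) to pixel coordinates (u, v) using focal length f (mm),
# principal point (up, vp) (pixels) and pixel sizes (sx, sy) (mm/pixel).

def project(P, f=8.0, principal=(320.0, 240.0), scale=(0.01, 0.01)):
    Px, Py, Pz = P
    up, vp = principal
    sx, sy = scale
    # Ray through the focal point intersects the image plane at focal length f.
    px = f * Px / Pz          # image-plane x in mm
    py = f * Py / Pz          # image-plane y in mm
    # Convert mm to pixels and shift by the principal point.
    u = px / sx + up
    v = py / sy + vp
    return (u, v)
```

A point on the optical axis lands exactly on the principal point, which is a quick sanity check for any implementation of this model.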
6. Camera Calibration
• Knowing the camera intrinsics, we can calculate the transformation from an object point P to a pixel (u, v).
• During the process of calibration we calculate the intrinsics.
• This is done by taking multiple images of a planar chessboard where each square is a known size.
7. Camera Calibration
• If we assume the z value of each point on the chessboard to be 0, then the transformation is found as:

  s · [u, v, 1]ᵀ = K · [r1 r2 t] · [X, Y, 1]ᵀ

  where K is the intrinsic matrix and r1, r2 are the first two columns of the rotation matrix. For each image there is a homography H = K · [r1 r2 t] mapping P to (u, v).
8. Camera Calibration
• Through some derivation and substitution, we find a linear system: with the homography represented as a 9-element vector h, each point correspondence contributes rows to a matrix A, which multiplies with the h vector to give A · h = 0.
9. Camera Calibration
• With at least four pairs of point correspondences, we can solve A · h = 0 using Singular Value Decomposition for total least squares minimization. From the homography of these four points, the values of (up, vp), s and (sx, sy) can be estimated with a bit more maths.
(Zhang, Z.: 2000, A flexible new technique for camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 1330–1334.)
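The four-correspondence solve can be sketched without SVD: fixing h33 = 1 turns the same constraints into an 8×8 linear system, solvable by plain Gaussian elimination. This is a toy stand-in for the total-least-squares SVD solve the slides describe, usable when exactly four well-spread correspondences are available:

```python
# Estimate the homography H (h33 = 1) from four point correspondences by
# solving the 8x8 linear system arising from
#   u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1)   (and similarly for v).

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def find_homography(src, dst):
    """Return 3x3 H mapping each src (x, y) to the matching dst (u, v)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, pt):
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

With more than four points, or noisy ones, the over-determined A · h = 0 system and SVD minimization from the slide are the right tool; the linear solve above is exact only for four clean correspondences.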
10. Camera Calibration
• Once we have the camera calibration, we can go ahead and compute the extrinsic parameters (the transformation) as:

  [r1 r2 t] = λ · K⁻¹ · [h1 h2 h3],   r3 = r1 × r2

  Now that we know the complete transformation, we can optimise our intrinsic parameters using the Levenberg-Marquardt algorithm on the reprojection error. We can also calculate radial distortions of the lens and remove them if we feel so inclined, and optimise further.
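The radial distortion just mentioned is commonly modelled as a polynomial in the distance from the principal point. A minimal sketch, assuming the usual two-coefficient model in normalized coordinates (the k1, k2 values are arbitrary illustration values, not from any real lens):

```python
# Radial distortion model x_d = x * (1 + k1*r^2 + k2*r^4) and its removal.
# The inverse has no closed form, so undistortion uses fixed-point iteration.

def distort(pt, k1=-0.2, k2=0.05):
    x, y = pt
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * factor, y * factor)

def undistort(pt, k1=-0.2, k2=0.05, iterations=20):
    """Guess the ideal point, re-distort, refine; converges for mild k1, k2."""
    xd, yd = pt
    x, y = xd, yd                      # initial guess: distorted == ideal
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return (x, y)
```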
12. Camera Calibration
• Camera parameters:
  1. Perspective projection matrix
  2. Image distortion parameters
• Two camera calibration methods:
  1. Accurate 2-step method
  2. Easy 1-step method
13. Easy 1-step method: 'calib_camera2.exe'
• Finds all camera parameters, including distortion and the perspective projection matrix.
• Doesn't require careful setup.
• Accuracy is good enough for image overlay. (Not good enough for 3D measurement.)
14. Using 'calib_dist2.exe'
• Select the dots with the mouse; distortion parameters are obtained by automatic line-fitting.
• Take pattern pictures as large as possible.
• Slant in various directions at a large angle.
• Take 4 pictures or more.
15. Accurate 2-step method
• Uses a dot pattern and a grid pattern.
• 2-step method:
  1) Getting distortion parameters – calib_dist.exe
  2) Getting perspective projection parameters
22. Registration
• We now have a reliable model of the camera's intrinsic parameters, and have removed any radial distortion. Now it's just a matter of learning some points in a marker, and then searching for them in each frame, calculating the extrinsic parameters as shown previously.
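Recovering the pose from a marker homography, given the calibrated intrinsic matrix K, follows the decomposition [r1 r2 t] = λ · K⁻¹ · [h1 h2 h3] with r3 = r1 × r2. A pure-Python sketch (the K, rotation and translation in the test are synthetic values):

```python
# Decompose a homography into rotation columns and translation, given K.

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def inverse_3x3(M):
    a, b, c = M[0]; d, e, f = M[1]; g, h, i = M[2]
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    return [[(e*i - f*h)/det, (c*h - b*i)/det, (b*f - c*e)/det],
            [(f*g - d*i)/det, (a*i - c*g)/det, (c*d - a*f)/det],
            [(d*h - e*g)/det, (b*g - a*h)/det, (a*e - b*d)/det]]

def extrinsics_from_homography(K, H):
    Kinv = inverse_3x3(K)
    h1 = mat_vec(Kinv, [H[i][0] for i in range(3)])
    h2 = mat_vec(Kinv, [H[i][1] for i in range(3)])
    h3 = mat_vec(Kinv, [H[i][2] for i in range(3)])
    lam = 1.0 / sum(x * x for x in h1) ** 0.5    # scale so |r1| = 1
    r1 = [lam * x for x in h1]
    r2 = [lam * x for x in h2]
    r3 = [r1[1]*r2[2] - r1[2]*r2[1],             # r3 = r1 x r2
          r1[2]*r2[0] - r1[0]*r2[2],
          r1[0]*r2[1] - r1[1]*r2[0]]
    t = [lam * x for x in h3]
    return r1, r2, r3, t
```

With noisy homographies, r1 and r2 come out only approximately orthonormal, which is one reason the iterative optimization covered in the following slides is needed.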
25. Question: Getting TCM
• Known parameters
  – Camera parameter: C
  – Image distortion parameters: x0, y0, f, s
  – Coordinates of the 4 vertices in the marker coordinate frame
• Parameters obtained by image processing
  – Coordinates of the 4 vertices in observed screen coordinates
• Goal
  – Getting the transformation matrix from marker to camera
29. Estimation of Transformation Matrix
• 1st step: Geometrical calculation – rotation & translation
• 2nd step: Optimization – iterative processing
  – Optimization of the rotation component
  – Optimization of the translation component
30. Optimization of Rotation Component
• Observed positions of the 4 vertices.
• Calculated positions of the 4 vertices: positions in marker coordinates → (estimated transformation matrix & perspective matrix) → ideal screen coordinates → (distortion function) → positions in observed screen coordinates.
• Minimize the distance between the observed and calculated positions by changing the rotation component of the estimated transformation matrix.
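The quantity being minimized is a reprojection error: transform each marker vertex by the current pose estimate, project it to the screen, and sum the squared distances to the observed positions. A sketch with a bare pinhole `project()` standing in for the full perspective-plus-distortion chain on the slide:

```python
# Reprojection error of a pose estimate (R, t) over the 4 marker vertices.

def project(point, f=800.0):
    x, y, z = point
    return (f * x / z, f * y / z)

def transform(R, t, p):
    """Apply rotation matrix R and translation t to marker point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def reprojection_error(R, t, marker_points, observed):
    """Sum of squared screen-space distances between predicted and observed."""
    err = 0.0
    for p, (u_obs, v_obs) in zip(marker_points, observed):
        u, v = project(transform(R, t, p))
        err += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return err
```

The iterative optimization then repeatedly perturbs the rotation (and, in the next step, the translation) to drive this value down.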
31. Search Tcm by Minimizing Error
• Optimization – an iterative process
32. (2) Use of estimation accuracy
• arGetTransMat() minimizes the 'err' value and returns this minimized 'err'. If 'err' is still large, the cause is either a mis-detected marker, or camera parameters from a bad calibration.
33. How to set the initial condition for the Optimization Process
• Geometrical calculation based on the coordinates of the 4 vertices
  – Independent in each image frame: good feature.
  – Unstable result (jitter occurs): bad feature.
• Use of information from the previous image frame
  – Needs previous frame information.
  – Cannot be used for the first frame.
  – Stable results. (This does not mean accurate results.)
• ARToolKit supports both.
34. Two types of initial condition
1. Geometrical calculation based on the 4 vertices in screen coordinates:
   double arGetTransMat( ARMarkerInfo *marker_info,
                         double center[2], double width,
                         double conv[3][4] );
2. Use of information from the previous image frame:
   double arGetTransMatCont( ARMarkerInfo *marker_info,
                             double prev_conv[3][4],
                             double center[2], double width,
                             double conv[3][4] );
35. Use of the Inside Pattern
• Why?
  – A square is symmetric under 90-degree rotations, so 4 templates are needed for each pattern.
  – Enables the use of multiple markers.
• How?
  – Template matching: normalize the shape of the inside pattern, then use normalized correlation.
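Normalized correlation compares the stored template and the candidate region after normalizing both to zero mean and unit variance, which makes the score robust to brightness and contrast changes. A minimal sketch with patterns as plain 2D lists of grey values:

```python
# Normalized correlation between two equally sized grey-value patterns.
# Returns a score in [-1, 1]: 1 = identical up to gain/offset, -1 = inverted.

def normalized_correlation(a, b):
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    n = len(flat_a)
    mean_a = sum(flat_a) / n
    mean_b = sum(flat_b) / n
    da = [v - mean_a for v in flat_a]          # zero-mean
    db = [v - mean_b for v in flat_b]
    norm = (sum(v * v for v in da) * sum(v * v for v in db)) ** 0.5
    return sum(x * y for x, y in zip(da, db)) / norm
```

ARToolKit runs this against all four rotated templates and picks the best-scoring one, which also resolves the marker's orientation.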
36. Accuracy vs. Speed in Pattern Identification
• Pattern normalization takes much time.
• This is a problem when using many markers.
• Normalization process: normalization followed by resolution conversion.
39. In 'config.h':
  #define AR_PATT_SAMPLE_NUM 64
  #define AR_PATT_SIZE_X 16
  #define AR_PATT_SIZE_Y 16

  Identification  Accuracy  Speed
  Large size      Good      Slow
  Small size      Bad       Fast
41. Natural Feature Registration
• There are three steps to natural feature registration: find reliable points, describe the points uniquely, and match the points.
• There are heaps of existing natural feature registration algorithms (SIFT, SURF, GLOH, Ferns…), each with their own intricacies, so we will just look at a high-level approach.
42. How NFR Works
1. Find feature points in the image.
2. In order to differentiate each feature point, create a descriptor of a local window using a function.
3. Repeat 1 and 2 for both the source, or "marker", image, as well as the current frame.
4. Compare all features in the marker to all features in the current frame to find the closest matches.
5. Use the matches to calculate the transformation.
43. Feature Detection
• Feature detection involves finding areas of an image which are unique amongst their surroundings, and can easily be identified regardless of changes in viewpoint.
• Good feature candidates are corners and points.
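Corner-ness can be scored the way the Harris detector does it: build the local structure tensor from image gradients and measure whether the window changes in every direction. A tiny pure-Python sketch (a real detector would also smooth the tensor and keep only local maxima):

```python
# Harris-style corner response at pixel (x, y):
# det(M) - k * trace(M)^2 over the local gradient structure tensor M.
# Positive = corner, negative = edge, zero = flat region.

def harris_response(img, x, y, window=1, k=0.04):
    sxx = sxy = syy = 0.0
    for j in range(y - window, y + window + 1):
        for i in range(x - window, x + window + 1):
            ix = (img[j][i + 1] - img[j][i - 1]) / 2.0   # central differences
            iy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace
```

On a synthetic image containing a bright square, the response is positive at the square's corner, negative along its straight edge, and zero on flat regions, which is exactly why corners make good features and edges do not.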
45. Feature Description
• A feature point has 0 dimensions, and as such there is no way of telling points apart.
• To resolve this, a window surrounding the point is transformed into a 1-dimensional array.
• The window is examined at the scale the point was found at, and the transformation needs to allow for distortion/deformation, but must still be able to distinguish between every feature.
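The simplest version of the "window to 1-dimensional array" step: cut a fixed window around the feature point, flatten it, and normalize it to zero mean and unit length so the descriptor tolerates brightness and contrast changes. Real descriptors (SIFT, SURF, …) are far more elaborate, but this sketch shows the idea:

```python
# Build a flat, normalized descriptor from the window around a feature point.

def describe(img, x, y, window=2):
    values = [float(img[j][i])
              for j in range(y - window, y + window + 1)
              for i in range(x - window, x + window + 1)]
    mean = sum(values) / len(values)
    centred = [v - mean for v in values]              # brightness invariance
    norm = sum(v * v for v in centred) ** 0.5 or 1.0  # contrast invariance
    return [v / norm for v in centred]

def distance(d1, d2):
    """Euclidean distance between two descriptors (small = similar)."""
    return sum((a - b) ** 2 for a, b in zip(d1, d2)) ** 0.5
```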
47. Feature Matching
• A marker is trained when all of the features and descriptors present have been found.
• During runtime, this process is performed for each frame of video.
• The descriptors of each feature are compared between the marker and the current frame. If the descriptors of two features are similar within a threshold, they are assumed to be a match.
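The comparison described above can be sketched as a brute-force nearest-neighbour search: for each marker descriptor, find the closest frame descriptor, and accept the pair only if the distance falls within the threshold. Descriptors here are plain lists of floats; the threshold value is illustrative:

```python
# Brute-force descriptor matching between marker and current frame.
# Returns (marker_index, frame_index) pairs whose distance beats the threshold.

def match_features(marker_descs, frame_descs, threshold=0.5):
    matches = []
    for i, md in enumerate(marker_descs):
        best_j, best_dist = None, float("inf")
        for j, fd in enumerate(frame_descs):
            dist = sum((a - b) ** 2 for a, b in zip(md, fd)) ** 0.5
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_dist < threshold:        # similar within a threshold
            matches.append((i, best_j))
    return matches
```

This all-pairs comparison is O(n·m), which is one reason production systems use approximate nearest-neighbour structures instead.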
49. Registration
• From here, we can optionally run RANSAC over the homography calculation:
  1. Pick 4 random point matches and find the homography.
  2. Test the homography by evaluating the other points.
  3. If |p − H·P| < e for enough points, recompute the homography with all inliers; else go to 1.
• From there we just take the homography, combine it with the camera intrinsics, and get the transformation matrix.
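The RANSAC loop itself is model-agnostic. To keep the example short, here it is sketched with a 2-point line model standing in for the 4-point homography; the structure (sample a minimal set, fit, count points with |p − model(P)| < e as inliers, refit on all inliers) is identical:

```python
import random

def fit_line(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1

def ransac_line(points, e=0.5, iterations=100, seed=0):
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        p1, p2 = rng.sample(points, 2)        # 1. pick a random minimal sample
        if p1[0] == p2[0]:
            continue                          #    degenerate sample, retry
        m, c = fit_line(p1, p2)
        inliers = [(x, y) for x, y in points  # 2. evaluate the other points
                   if abs(y - (m * x + c)) < e]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # 3. recompute the model with all inliers (least-squares refit)
    n = len(best_inliers)
    sx = sum(x for x, _ in best_inliers)
    sy = sum(y for _, y in best_inliers)
    sxx = sum(x * x for x, _ in best_inliers)
    sxy = sum(x * y for x, y in best_inliers)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n, best_inliers
```

For the homography case, step 1 samples 4 matches and fits H, and step 3 refits H over all inlier matches, but the control flow is exactly this.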
51. NFR Applications
• Any application using marker-based registration can also be achieved using NFR, but there are a number of additional possibilities.
• As NFR does not require special markings, any existing media can be used without modification, e.g. paintings in museums, print media advertisements, etc.
52. NFR Applications
• NFR is especially suited to applications where there is another "layer" of data relevant to an existing surface, e.g. three-dimensional overlays of map data, "MagicBooks", proposed building sites, manufacturing blueprints, etc.
57. Mobile NFR
• Mobile Augmented Reality is becoming extremely popular due to the ubiquitous nature of devices with cameras and displays. The processing capabilities of these devices are improving, and natural feature registration is becoming increasingly feasible with the design of NFR algorithms for high performance.
(Wagner, D.; Reitmayr, G.; Mulloni, A.; Drummond, T.; Schmalstieg, D.: "Pose tracking from natural features on mobile phones," Mixed and Augmented Reality, 2008. ISMAR 2008. 7th IEEE/ACM International Symposium on, pp. 125–134, 15–18 Sept. 2008.)
58. Non-Rigid NFR
• Using deformation models, non-rigid planar surfaces can be registered and their shape recovered. Not only does this improve registration robustness, but it also allows for more realistic rendering of augmented content.
(J. Pilet, V. Lepetit, and P. Fua: Fast Non-Rigid Surface Detection, Registration and Realistic Augmentation, International Journal of Computer Vision, Vol. 76, Nr. 2, February 2008.)
(M. Salzmann, J. Pilet, S. Ilic, P. Fua: Surface Deformation Models for Non-Rigid 3-D Shape Recovery, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, Nr. 8, pp. 1481–1487, August 2007.)
59. Model Based Tracking
• Using a known three-dimensional model in conjunction with edge/texture information, three-dimensional objects can be tracked regardless of viewpoint. Model-based tracking also improves robustness to self-occlusion.
(Reitmayr, G.; Drummond, T.W.: "Going out: robust model-based tracking for outdoor augmented reality," Mixed and Augmented Reality, 2006. ISMAR 2006. IEEE/ACM International Symposium on, pp. 109–118, 22–25 Oct. 2006.)
(L. Vacchetti, V. Lepetit and P. Fua: Stable Real-Time 3D Tracking Using Online and Offline Information, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, Nr. 10, pp. 1385–1391, 2004.)
61. What makes good NFR?
• In order for a natural feature registration algorithm to work well, it must be robust to common image transformations and distortions.
62. Feature Descriptor Robustness
• Feature descriptors are vulnerable to transformations and distortions, with the exception of translation and scale, which are handled by modifying the descriptor window to match the scale and position the feature was detected at.
64. OPIRA
• The Optical-flow Perspective Invariant Registration Augmentation (OPIRA) is an algorithm which adds perspective invariance to existing registration algorithms by tracking the object over multiple frames using optical flow, and using perspective correction to eliminate the effect of perspective distortions.
(Clark, A., Green, R. and Grant, R.: 2008, Perspective correction for improved visual registration using natural features. Image and Vision Computing New Zealand, 2008. IVCNZ 2008. 23rd International Conference, pp. 1–6.)
66. OPIRA Process
• Once an initial frame of registration occurs, all correct points used for registration are tracked from frame t−1 to frame t using sparse optical flow.
• The transformation is calculated for frame t based on the tracked points and their marker positions as matched in frame t−1.
67. OPIRA Process (Cont.)
• Using the inverse of the transformation computed using optical flow, frame t is warped to match the position and orientation of the marker.
• The registration algorithm is performed on the newly aligned frame. Matches are found, and the transformation is multiplied by the optical flow transform to realign the transformation with the original image.
69. Additional Benefits
• OPIRA is able to add some degree of scale and rotation invariance to existing algorithms, by transforming the object to match its marker representation.
• Using the undistorted image, we can perform background subtraction to isolate occluding objects for pixel-scale occlusion in Augmented Reality.