Lecture 1 - Learning Dynamical Systems from Demonstrations
Nadia2013 research
1. Nadia Barbara Figueroa Fernandez
3D Computer Vision and Applications in Robotics and Multimedia
Reconstruct your world
Reconstruct yourself
2. AGENDA
• BACKGROUND
• 3D COMPUTER VISION
• APPLICATIONS IN ROBOTICS
– Research Projects at TU Dortmund
– Master’s Thesis at DLR
• APPLICATIONS IN MULTIMEDIA
– Research Projects at NYU Abu Dhabi
[Figure: DLR’s rollin’ Justin Humanoid]
4. 3D COMPUTER VISION
Fundamentals
1 General Definition: “Ability of powered devices to acquire a real-time picture of the world in three dimensions.” - Wikipedia
2 My Definition: “Generate 3D representations of the world from the viewpoint of a sensor, generally in the form of 3D point clouds.”
3 What is a point cloud? “A point cloud is a set of points $p \in P$ where $p = (x, y, z, r, g, b)$.”
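The point cloud definition above maps naturally onto an N x 6 array. The following is a minimal sketch assuming NumPy; the four sample points, the metre units and the 0-255 colour range are made up for illustration and are not taken from the slides.

```python
import numpy as np

# Hypothetical toy cloud: each row is one point p = (x, y, z, r, g, b).
# Coordinates in metres and colours in 0-255 are assumptions for this sketch.
cloud = np.array([
    [0.0, 0.0, 1.0, 255,   0,   0],
    [0.1, 0.0, 1.2,   0, 255,   0],
    [0.0, 0.2, 0.9,   0,   0, 255],
    [0.3, 0.1, 1.5, 128, 128, 128],
], dtype=np.float64)

xyz = cloud[:, :3]   # geometric part of every point
rgb = cloud[:, 3:]   # colour part of every point

# Two typical queries on the set P: its centroid and the point closest to the sensor origin.
centroid = xyz.mean(axis=0)
nearest_idx = int(np.argmin(np.linalg.norm(xyz, axis=1)))
```

Keeping geometry and colour in one array keeps every point's attributes aligned when the cloud is filtered or downsampled.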
5. Sensing Devices
1 Time-Of-Flight Sensors
• LIDAR (Light Detection and Ranging)
• Radar
• Sonar
• TOF Cameras
• PMD (Photonic Mixing Device)
2 Triangulation-based Systems
• Stereo Systems
• Multi-Camera Stereo
3 Light Coding – Structured Light
• Primesense 3D sensor
• Microsoft Kinect
6. APPLICATIONS IN ROBOTICS
Calibration and Verification
Mapping and Navigation
Object Recognition and Mobile Manipulation
7. OBJECT RECOGNITION FOR A MOBILE MANIPULATION PLATFORM
Nadia Figueroa and Jittu Kurian
GOAL: Detect and estimate the pose of a wanted object in a table-top scenario.
PROPOSED APPROACH: Use CCD and PMD cameras.
PRE-REQUISITES:
1. Calibration of PMD-CCD Camera Rig
2. Object Database
8. Pre-Requisite 1: Calibration of PMD-CCD rig
Calibration and camera set-up (CCD-PMD)
• Binocular camera setup of PMD and CCD Camera.
• Stereo System Calibration Method.
– Mathematically align the 2 cameras in 1 viewing plane.
– Using epipolar geometry, calculate essential and fundamental matrices.
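The relation between the essential and fundamental matrices mentioned above can be illustrated numerically. This is a minimal sketch assuming NumPy and the textbook convention X2 = R X1 + t, E = [t]x R, F = K2^-T E K1^-1. The rig geometry and the two intrinsic matrices are hypothetical values, not the calibration of the actual PMD-CCD rig.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_and_fundamental(R, t, K1, K2):
    """E = [t]_x R and F = K2^{-T} E K1^{-1}, for the convention X2 = R @ X1 + t."""
    E = skew(t) @ R
    F = np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
    return E, F

# Hypothetical rig: camera 2 is offset 10 cm along x and rotated 5 degrees about y.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([-0.10, 0.0, 0.0])
# Hypothetical intrinsics for a high-resolution and a low-resolution camera.
K1 = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[200.0, 0.0, 100.0], [0.0, 200.0, 100.0], [0.0, 0.0, 1.0]])

E, F = essential_and_fundamental(R, t, K1, K2)

# Project one 3D point into both cameras and evaluate the epipolar constraint x2^T F x1.
X1 = np.array([0.2, -0.1, 2.0])
X2 = R @ X1 + t
x1 = K1 @ X1 / X1[2]
x2 = K2 @ X2 / X2[2]
residual = float(x2 @ F @ x1)
```

For noise-free projections the residual is zero up to floating-point error, which makes it a convenient sanity check on an estimated calibration.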
9. Pre-Requisite 2: Object Database
Object model generation
• Each object is matched with 20 training images.
• The keypoints (SURF) that are repeatedly matched are selected as the “best” keypoints.
• After training each object, we get 100 keypoints per object.
[Figure panels: Object 1, Object 2, Object 3]
11. PMD Data Flattening and Variance Segmentation Algorithm
[Figure panels: Original PMD, Segmented PMD, Flattened PMD]
13. DLR’S ROLLIN’ JUSTIN
Built of light-weight structures and joints with mechanical compliances and flexibilities.
(+) Compliant behavior of the arm
(-) Low positioning accuracy at the TCP (Tool-Center-Point) end pose.
Designed to interact with humans and unknown environments.
How is this low position accuracy compensated in this lightweight design? Using the torque sensors.
(+) An approximation of a joint’s deflection is obtained by:
$\Theta_i = \theta_i + \frac{\tau_i}{K_i}$
$\tau$: measured torque
$K$: stiffness coefficient of the gear
(-) This approximation is insufficient. It cannot measure the remaining mechanical flexibilities.
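The torque-based correction above is an element-wise operation per joint. Below is a minimal sketch assuming NumPy, with angles in radians, torques in Nm and stiffness in Nm/rad. The function name and all numeric values are hypothetical and do not describe Justin's actual joints.

```python
import numpy as np

def corrected_joint_angles(theta, tau, stiffness):
    """Theta_i = theta_i + tau_i / K_i, evaluated for every joint at once."""
    theta = np.asarray(theta, dtype=float)
    tau = np.asarray(tau, dtype=float)
    stiffness = np.asarray(stiffness, dtype=float)
    return theta + tau / stiffness

# Hypothetical readings for a three-joint chain.
theta = [0.50, -1.20, 0.30]          # measured joint positions [rad]
tau = [10.0, -5.0, 2.0]              # measured joint torques [Nm]
K = [10000.0, 10000.0, 5000.0]       # gear stiffness coefficients [Nm/rad]

Theta = corrected_joint_angles(theta, tau, K)
deflection = Theta - np.asarray(theta)
```

In this toy case the deflections are on the order of a milliradian. The correction only covers gear elasticity, which is the limitation the slide points out.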
15. MASTER THESIS MOTIVATION
Problem: The feasibility of motion planning is highly dependent on the positioning accuracy.
Goal: Create a verification routine to identify the maximum bounds of the TCP positioning errors of humanoid Justin’s upper kinematic chains.
Requirements:
1. Avoid using any external sensory system.
2. Avoid any human intervention.
16. 3D REGISTRATION FOR VERIFICATION OF HUMANOID JUSTIN’S UPPER BODY KINEMATICS
Supervisors: Florian Schmidt and Haider Ali
TCP measured by forward kinematics: $TCP = T_w^h \, T_h^a \, T_a^{tcp}$
TCP measured by stereo vision system: $TCP = T_w^h \, T_h^s \, T_s^{tcp}$
[Figure: kinematic chain with the transforms $T_w^h$, $T_h^a$, $T_a^{tcp}$, $T_h^s$, $T_s^{tcp}$ and the TCP]
TCP End-Pose Error: the discrepancy between these two TCP estimates.
Proposed Approach: Use the on-board stereo vision system to estimate the TCP end-pose.
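One way to quantify the end-pose error between the forward-kinematics chain and the stereo chain is to compare the two resulting 4x4 poses. This is a minimal sketch assuming NumPy. The error metric (Euclidean translation distance plus relative rotation angle) is an assumption, and every transform below is a hypothetical placeholder rather than one of Justin's frames.

```python
import numpy as np

def pose_error(T_a, T_b):
    """Translation distance and relative rotation angle [rad] between two 4x4 poses."""
    e_t = float(np.linalg.norm(T_a[:3, 3] - T_b[:3, 3]))
    R_rel = T_a[:3, :3].T @ T_b[:3, :3]
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return e_t, float(np.arccos(cos_angle))

def rot_z(angle):
    """Homogeneous transform for a pure rotation about z."""
    c, s = np.cos(angle), np.sin(angle)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def translate(x, y, z):
    """Homogeneous transform for a pure translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

# Hypothetical chains: both share world->head, then go through arm or stereo.
T_w_h = translate(0.0, 0.0, 1.5)
T_h_a, T_a_tcp = rot_z(0.30), translate(0.6, 0.0, 0.0)
T_h_s, T_s_tcp = rot_z(0.31), translate(0.6, 0.0, 0.002)

tcp_fk = T_w_h @ T_h_a @ T_a_tcp        # TCP from forward kinematics
tcp_stereo = T_w_h @ T_h_s @ T_s_tcp    # TCP from the stereo vision system

e_t, e_theta = pose_error(tcp_fk, tcp_stereo)
```

Here the two chains disagree by 0.01 rad in orientation and roughly 6 mm in position.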
17. 3D REGISTRATION FOR VERIFICATION OF HUMANOID JUSTIN’S UPPER BODY KINEMATICS
Data Acquisition: 3D point clouds of the hand from the stereo cameras.
Model Generation: Model generated from an extended metaview registration method from a selected subset of views generated by analyzing the distribution of max/min depth values.
Pose Estimation: Estimate TCP by using registration between a point cloud of the hand and a model.
Registration method evaluation:
1. Keypoint extraction (SIFT) & point-to-point correspondence.
2. Local descriptor (FPFH/SHOT/CSHOT) matching using RANSAC-based correspondence search.
18. Data Acquisition: Dense 3D point cloud generated from Stereo
19. Point Cloud Processing
Pass-through filter (remove background).
Statistical Outlier Removal (remove outliers).
Voxel Grid Filter (downsample).
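The three processing steps above can be sketched in plain NumPy. This is an illustrative re-implementation under simple assumptions (brute-force neighbour search, centroid-per-voxel downsampling), not the code used in the thesis. The function names, parameters and the synthetic scene are all hypothetical.

```python
import numpy as np

def pass_through(points, axis, lower, upper):
    """Keep points whose coordinate along `axis` lies in [lower, upper]."""
    keep = (points[:, axis] >= lower) & (points[:, axis] <= upper)
    return points[keep]

def statistical_outlier_removal(points, k=4, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is unusually large."""
    # Brute-force pairwise distances: acceptable for a few hundred points only.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    knn = np.sort(dists, axis=1)[:, 1:k + 1]       # column 0 is the point itself
    mean_knn = knn.mean(axis=1)
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    return points[mean_knn <= threshold]

def voxel_grid(points, leaf):
    """Replace all points that fall into the same cubic voxel by their centroid."""
    voxel_idx = np.floor(points / leaf).astype(np.int64)
    _, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)                   # shape differs across NumPy versions
    n_voxels = int(inverse.max()) + 1
    sums = np.zeros((n_voxels, points.shape[1]))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=n_voxels)
    return sums / counts[:, None]

# Synthetic scene: a dense "hand" blob, a far background and one stray point.
rng = np.random.default_rng(0)
hand = rng.uniform(0.0, 0.1, size=(200, 3)) + np.array([0.0, 0.0, 0.5])
background = rng.uniform(0.0, 1.0, size=(50, 3)) + np.array([0.0, 0.0, 2.0])
stray = np.array([[0.05, 0.05, 0.9]])
scene = np.vstack([hand, background, stray])

foreground = pass_through(scene, axis=2, lower=0.3, upper=1.0)   # removes the background
cleaned = statistical_outlier_removal(foreground)                # removes the stray point
downsampled = voxel_grid(cleaned, leaf=0.02)                     # one point per 2 cm voxel
```

The order matters: removing the background first keeps the neighbour statistics of the outlier filter focused on the object of interest.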
20. 3D Registration Methods
21. Model Generation
22. Model Generation
Extended Metaview Registration Method
Consists of 3 steps:
Global Thresholding Process: Reject the views that lie in unstable areas.
Next Best View Ordering Algorithm: Find an order for incrementally registering the subset of point clouds.
Metaview Registration: The resulting subset of views is registered and merged.
23. Verification Routine
$e_k = \langle e_t, e_\theta \rangle$
$f_k = 3dRMS$
$E = (e_1, \ldots, e_N)$
$F = (f_1, \ldots, f_N)$
$F^* = \mathrm{RANSAC}(F)$
$e_b = \langle \max(e_t \in E^*), \max(e_\theta \in E^*) \rangle$
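The bound computation above can be sketched as follows. The slides do not say how RANSAC is applied to the fitness scores, so this example substitutes a simple RANSAC-style consensus with a constant model on the scalar scores. That substitution, the tolerance, the function names and all numbers are assumptions for illustration only.

```python
import numpy as np

def consensus_inliers(fitness, tol, n_iter=100, seed=0):
    """RANSAC-style consensus on scalar scores: the model is one sampled score,
    inliers are scores within `tol` of it, and the largest inlier set wins."""
    fitness = np.asarray(fitness, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(fitness.shape, dtype=bool)
    for _ in range(n_iter):
        candidate = fitness[rng.integers(len(fitness))]
        inliers = np.abs(fitness - candidate) <= tol
        if inliers.sum() > best.sum():
            best = inliers
    return best

def error_bounds(e_t, e_theta, fitness, tol):
    """e_b = <max e_t, max e_theta> over the trials kept by the consensus step."""
    keep = consensus_inliers(fitness, tol)
    return float(np.max(np.asarray(e_t)[keep])), float(np.max(np.asarray(e_theta)[keep]))

# Hypothetical results of N = 6 verification poses; the last registration failed.
e_t = [2.0, 2.5, 1.8, 2.2, 2.1, 9.0]          # translational errors
e_theta = [0.5, 0.6, 0.4, 0.55, 0.45, 3.0]    # rotational errors
fitness = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0]    # registration fitness f_k (3D RMS)

bound_t, bound_theta = error_bounds(e_t, e_theta, fitness, tol=0.3)
```

Filtering on the registration fitness first keeps a failed registration from inflating the reported error bound.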
24. Verification Routine
25. Method Evaluation (Ground Truth)
Pose Estimation using IR ART tracking system (Ground Truth)
ART System Set-up
– Multi-camera setup that estimates the 6DOF pose of the tracking targets.
– Mean accuracy of 0.04 pixels.
– Speed of 100 fps.
26. Method Evaluation (Ground Truth)
Implicit loop closure with tracking system (Ground Truth)
– By expressing $TCP_{fk}, TCP_{reg}$ in ART coordinate system a double loop closure is generated.
$TCP_{fk} = T_{art}^{heT} \, T_{heT}^{h} \, T_h^a \, T_a^{tcp}$
$TCP_{reg} = T_{art}^{heT} \, T_{heT}^{h} \, T_h^s \, T_s^{tcp}$
$TCP_{art} = (T_{art}^{heT} \, T_{heT}^{h})^{-1} \, T_{art}^{haT} \, T_{haT}^{tcp}$
§ Error Identification
[Figure: loop-closure diagram with ART, TCP and the transforms $T_{art}^{heT}$, $T_{art}^{haT}$, $T_{haT}^{tcp}$, $T_{heT}^{h}$, $T_h^a$, $T_a^{tcp}$, $T_h^s$, $T_s^{tcp}$]
27. Calibration of Tracking targets to Justin
– The estimation of $TCP_{art}$ relies on the identification of $T_{heT}^{h}$ and $T_{haT}^{tcp}$.
Two step calibration:
I. Center of Rotation Estimation: Non-rigid geometrically constrained sphere-fitting
$\min \; u^T D^T D u$ subject to $u^T C u = 1$
$u$: spherical fit
$D$: measurements
$C$: spherical constraint
II. Axis of Rotations Estimation: Combined plane/circle fitting for each axis.
$\min \; f = \sum_{k=1}^{N} (\delta_k^2 + \epsilon_k^2)$, with $\epsilon_k = \|v_k - m\|^2 - r^2$
$\delta_k$: planar
$\epsilon_k$: radial
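To give a feel for the center-of-rotation step, here is a minimal sketch of a plain algebraic least-squares sphere fit with NumPy. It is a simpler, unconstrained stand-in and not the geometrically constrained formulation named above. It uses the identity $\|v\|^2 = 2 v \cdot m + (r^2 - \|m\|^2)$, and the marker trajectory is synthetic.

```python
import numpy as np

def fit_sphere(points):
    """Linear least-squares sphere fit: solve [2v, 1] @ [m, c] = ||v||^2 with c = r^2 - ||m||^2."""
    points = np.asarray(points, dtype=float)
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = np.sum(points ** 2, axis=1)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = x[:3]
    radius = float(np.sqrt(x[3] + center @ center))
    return center, radius

# Synthetic marker trajectory: points on a spherical cap around a hypothetical joint centre.
rng = np.random.default_rng(1)
true_center = np.array([0.1, -0.2, 0.5])
true_radius = 0.3
azimuth = rng.uniform(0.0, 2.0 * np.pi, 50)
polar = rng.uniform(0.2, 1.0, 50)
directions = np.column_stack([np.sin(polar) * np.cos(azimuth),
                              np.sin(polar) * np.sin(azimuth),
                              np.cos(polar)])
markers = true_center + true_radius * directions

center, radius = fit_sphere(markers)
```

With noise-free data the fit recovers the centre and radius exactly. With noisy tracker data this algebraic fit tends to be biased on small caps, which is one reason to prefer a constrained formulation.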
28. Calibration of Tracking targets to Justin (cont’d)
– Create spherical trajectories around $head$ and $TCP$.
– CoR is the position of the joint: $t = [m_x, m_y, m_z]^T$
– AoRs are the rotations: $R = [AoR_x, AoR_y, AoR_z]$
– Mounting frames: $T_{haT}^{tcp} = TCP(R,t)^{-1} \, T_{art}^{haT}$ and $T_{heT}^{h} = head(R,t)^{-1} \, T_{art}^{heT}$
[Figure panels: $T_{haT}^{tcp}$ deviations throughout 10 calibrations; $T_{heT}^{h}$ deviations throughout 10 calibrations]
29. Method Evaluation (Ground Truth)
30. Method Evaluation (Ground Truth)
33. SEGMENTATION AND POSE ESTIMATION OF PLANAR METALLIC OBJECTS
Nadia Figueroa and Haider Ali (DLR)
PROBLEM: Pose estimation of planar metallic objects in a pile.
PROPOSED APPROACH:
(i) Segmentation using Euclidean clustering
(ii) Pose Estimation using Registration
34. SEGMENTATION AND POSE ESTIMATION OF PLANAR METALLIC OBJECTS
Data Acquisition: 3D point clouds of the pile from a range sensor.
Euclidean Clustering: We extract n clusters C from pile P that represent the planar objects by analyzing the angle deviations between the surface normal vectors.
Cluster Registration
[Figure labels: Model, Positive aligned clusters]
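The clustering idea above (group nearby points whose surface normals agree) can be sketched as a region-growing procedure in NumPy. This is an illustrative version that assumes the normals are already estimated. The function name, the thresholds and the two synthetic plates are hypothetical.

```python
import numpy as np
from collections import deque

def cluster_by_distance_and_normal(points, normals, radius, max_angle_deg):
    """Region growing: a neighbour joins a cluster if it is within `radius` of a
    cluster point and its unit normal deviates by at most `max_angle_deg`."""
    cos_min = np.cos(np.deg2rad(max_angle_deg))
    labels = np.full(len(points), -1, dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            near = np.linalg.norm(points - points[i], axis=1) <= radius
            aligned = np.abs(normals @ normals[i]) >= cos_min   # sign-agnostic normals
            for j in np.flatnonzero(near & aligned & (labels == -1)):
                labels[j] = current
                queue.append(j)
        current += 1
    return labels

# Two synthetic 5 x 5 plates (1 cm spacing) that nearly touch but are tilted 45 degrees apart.
xs = np.arange(5) * 0.01
gx, gy = np.meshgrid(xs, xs)
gx, gy = gx.ravel(), gy.ravel()
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
plate_a = np.column_stack([gx, gy, np.zeros_like(gx)])
plate_b = np.column_stack([gx, gy * c, gy * s + 0.005])
points = np.vstack([plate_a, plate_b])
normals = np.vstack([np.tile([0.0, 0.0, 1.0], (25, 1)),
                     np.tile([0.0, -s, c], (25, 1))])

labels = cluster_by_distance_and_normal(points, normals, radius=0.015, max_angle_deg=10.0)
```

A purely distance-based clustering would merge the two plates because they are only 5 mm apart. The normal-angle test is what separates them.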
35. CONTEXTUAL OBJECT CATEGORY RECOGNITION IN RGB-D SCENES
PROBLEM: Object category recognition in RGB-D Data
PROPOSED APPROACH:
(i) Novel combination of depth and color features.
(ii) Scene segmentation based on table detection and Euclidean clustering.
(iii) Classification results augmented by a context model learnt from social media.
37. RGB-D Object Features and Classifier
Multi-object Classification
We use a linear SVM to classify 6 object categories. The accuracy of our classification framework (63.91%) is nearly four times the minimum baseline generated by a random guess (16.67%).
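The random-guess baseline quoted above follows directly from the number of categories. A short worked check in Python, using only the two figures given on the slide:

```python
n_categories = 6
chance = 100.0 / n_categories   # accuracy of a uniform random guess, in percent
reported = 63.91                # classification accuracy quoted on the slide, in percent
ratio = reported / chance       # how many times better than chance
```

The ratio comes out slightly below four, at roughly 3.8 times the chance level.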
39. Kinect Fusion
Uses a Truncated Signed Distance Function (TSDF) to represent the 3D data.
What is a TSDF?
A TSDF cloud is a point cloud that makes use of how the data is stored within the GPU at KinFu runtime.
Each element in the grid represents a voxel, and the value inside it represents the TSDF value.
The TSDF value is the distance to the nearest isosurface.
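The TSDF description above can be illustrated for the voxels along a single camera ray. This is a minimal sketch assuming NumPy and a common sign convention (positive in front of the surface, negative behind it, normalised by the truncation distance). The function name and the numbers are hypothetical and not taken from the KinFu implementation.

```python
import numpy as np

def tsdf_along_ray(voxel_depths, surface_depth, truncation):
    """Signed distance from each voxel to the measured surface along one ray,
    divided by the truncation distance and clipped to [-1, 1]."""
    sdf = surface_depth - np.asarray(voxel_depths, dtype=float)
    return np.clip(sdf / truncation, -1.0, 1.0)

# Voxel centres every 5 cm along the ray; the sensor measured a surface at 1.0 m.
voxel_depths = np.linspace(0.8, 1.2, 9)
tsdf = tsdf_along_ray(voxel_depths, surface_depth=1.0, truncation=0.1)

# The surface (isosurface) lies where the TSDF changes sign.
zero_crossing = float(voxel_depths[np.argmin(np.abs(tsdf))])
```

Truncation means only voxels within a narrow band around the surface carry distance information, while everything farther away saturates at plus or minus one.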
40. RGB-D KINECT FUSION FOR CONSISTENT RECONSTRUCTIONS OF INDOOR SPACES
Nadia Figueroa, Haiwei Dong and Abdulmotaleb El Saddik
PROBLEM: Generating geometric models of environments for interior design, architecture, and the repair or remodeling of indoor spaces.
PROPOSED APPROACH: RGB-D Kinect Fusion, a combined approach towards consistent reconstructions of indoor spaces based on Kinect Fusion and 6D RGB-D Odometry based on efficient feature matching.
41. 6D RGB-D ODOMETRY
42. FROM SENSE TO PRINT
Nadia Figueroa, Haiwei Dong and Abdulmotaleb El Saddik
43. Segmentation based on Camera Pose Semantics
Object on Table Top Segmentation
Human Bust Segmentation