Institut für Prozessrechentechnik, Automation und Robotik (IPR)


Triangulation Methods

Seminar paper
of
Zlatka Mihaylova
SS 2009


Supervisor: M.Phys. Matteo Ciucci
Contents

1  Introduction

2  Basics
   2.1  Epipolar geometry
   2.2  Fundamental matrix
   2.3  Camera matrices
   2.4  Essential matrix

3  Reconstructing 3D points from an image pair
   3.1  General approach
   3.2  Computation of the fundamental matrix
        3.2.1  Normalized eight point algorithm
        3.2.2  Algebraic minimization algorithm
        3.2.3  Gold standard algorithm
        3.2.4  Automatic computation of the fundamental matrix
   3.3  Image rectification

4  Triangulation methods
   4.1  Linear triangulation methods
   4.2  Minimization of geometric error
   4.3  Sampson approximation
   4.4  The optimal solution

5  Practical Examples of Triangulation
   5.1  Triangulation with structured light
        5.1.1  Light spot technique
        5.1.2  Stripe projection
        5.1.3  Projection of a static line pattern
        5.1.4  Projection of encoded patterns
        5.1.5  Light spot stereo analysis
   5.2  Triangulation from stereo vision

6  Conclusion


1    Introduction
The ability of our brain to perceive the relative distances between objects, and the distance to a particular object, is not the result of measuring exact lengths. We calculate distances from the information collected through our eyes. This approach is widely used in robotics, because the decisions a robot takes are based on knowing the spatial relationships between objects, and these can almost never be measured directly.
    One method for computing the location of a point in three-dimensional space is to compare two or more images taken from different points of view. In the case of two images, this method is called triangulation. The human visual perception system is based on this principle: it produces two slightly displaced images, which encode information about positions in space.
    The next chapter introduces the basic terms of triangulation. For a better understanding, the reader should be familiar with the geometry behind this concept and with the mathematical description of the problem.

2      Basics
In the following section we explain the basic concepts on which triangulation is based. This branch of mathematics is often referred to as epipolar geometry. Several special matrices are involved in describing the relationship between a point in space and its two image projections, which we examine in the next few sections. They precisely describe the main principles of stereo vision.

2.1     Epipolar geometry
The epipolar geometry is a subset of projective geometry which helps for searching correspond-
ing points between two images. This kind of geometry is independent of the scene structure,
but only of the used cameras, their internal parameters and relative positions.
    The line (also referred to as baseline) connecting the camera centers and intersecting the
image planes in point e , respectively e defines the epipoles. Another important term is the
epipolar plane π. Three points are needed to define this plane - an external object point C and
its projections on the images. This statement can be reformulated by saying that the plane is
also defined by the object point and the camera centers, because the projection of the 3D point
lies on the line CA, respectively CB. From Figure 1 it is obvious, that π is not the same for
all three dimensional points, but all epipolar planes contain the baseline. If we know c - the
projection of C on the first image and additionally the plane π , then the projection on the other
plane is still ambiguous. That means c is not fixed. However, instead of searching the whole
second image plane, we can reduce the computational time by searching the projected point on
only one line - the epipolar line e c .


Figure 1: The 3D point C is projected onto both images, in c and c′. The points A and B indicate the centers of the two pinhole cameras. By definition they are located on the epipolar plane π.

2.2     Fundamental matrix
The fundamental matrix F is a 3 × 3 matrix that represents the connection between a point in the first image and a line in the second. This kind of mapping is very important because, once the fundamental matrix has been computed, the remaining correspondences are easy to constrain. As shown in Figure 1, any possible match of the projected point c belongs to the epipolar line in the other image. There are different ways of computing F, depending on the information we have. If both cameras are calibrated, which means their intrinsic parameters are given, and furthermore information about the transfer plane π is available, then the correlation between the points in the image planes is already defined. This correlation is mathematically described by F.
    The most important property of the fundamental matrix F is that for all pairs of corresponding points c and c′ the following equation must hold:

$$c'^{T} F c = 0 \qquad (1)$$
    Known as the correspondence condition, this equation implies that there is another way of computing the correspondences between two sets of points. In contrast to the possibility discussed above, we now see that knowledge of the camera matrices P and P′ is not necessary. The fundamental matrix can be estimated if we have the coordinates of at least seven corresponding point pairs [HZ03].
    From the correspondence condition other interesting properties of the fundamental matrix F can be derived, for example the definition of the epipolar line:

$$l' = F c \qquad (2)$$

where l′ is the epipolar line in the second image. We can write the analogous equation for the other epipolar line, because this relation is transposable: there is no prioritization of image or camera, and both pictures are treated equally.
    F is a homogeneous matrix, so it should have eight degrees of freedom. It actually has only seven degrees of freedom, because its determinant is zero by construction. This observation is directly connected to the number of point correspondences needed for the computation of F.
    Finally, we have to mention that assigning a line to a point in this way is unidirectional: trying to find the corresponding point of a line is meaningless and not possible.
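
The following small sketch illustrates relations (1) and (2) numerically; the matrix F here is an arbitrary illustrative rank-2 matrix, not one estimated from real images:

```python
import numpy as np

# A small sketch of relations (1) and (2). F below is an arbitrary
# illustrative rank-2 matrix, not one estimated from real images.
F = np.array([[ 0.0, -0.1,  0.2],
              [ 0.1,  0.0, -0.3],
              [-0.2,  0.3,  0.0]])   # skew-symmetric, hence singular

c = np.array([10.0, 20.0, 1.0])      # a point in the first image (homogeneous)
l_prime = F @ c                      # its epipolar line l' = F c, equation (2)

# Any point on l' is a valid candidate match and satisfies c'^T F c = 0.
# Intersect l' with the line x = 0 (the cross product of two lines is a point):
c_prime = np.cross(l_prime, np.array([1.0, 0.0, 0.0]))
print(np.isclose(c_prime @ F @ c, 0.0))   # True
```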

2.3     Camera matrices
The camera matrices P and P′ describe the projective properties of the cameras. One important issue, discussed in [HZ03], is how these matrices relate to the fundamental matrix F. In one direction, P and P′ determine a unique fundamental matrix F; in the other direction, the camera matrices can be determined from the fundamental matrix only up to a projective transformation of 3D space. The resulting ambiguity can be resolved by adding an additional constraint to the product P′ᵀF P, namely that this matrix must be skew-symmetric. Skew-symmetric matrices (also known as antisymmetric matrices [Wol]) have the form:

$$\begin{pmatrix} 0 & a_{12} & a_{13} \\ -a_{12} & 0 & a_{23} \\ -a_{13} & -a_{23} & 0 \end{pmatrix} \qquad (3)$$

or, in other words, they satisfy the condition A = −Aᵀ.

    As we stated in the previous section, c′ᵀF c = 0 (equation (1)). Using the relations c = P C and c′ = P′C, this becomes CᵀP′ᵀF P C = 0 for every world point C, from which one can prove that the matrix P′ᵀF P must be antisymmetric.

2.4     Essential matrix
The matrix already discussed in section 2.2 is a generalization of another matrix, called the essential matrix E. Both matrices represent the epipolar constraint defined in that section, but in the case of E we have information about the intrinsic camera parameters; the cameras are said to be calibrated. The intrinsic camera parameters are, for example, the focal length of the camera, the image format, the principal point and the radial distortion coefficient of the lens. This additional information reduces the degrees of freedom of the essential matrix to five: three degrees of freedom of the rotation matrix R and two degrees of freedom of the vector t, where t is the coordinate vector of the translation AB separating the two cameras' coordinate systems (more information is available in [FP03]).
    The epipolar constraint is also satisfied by the essential matrix:

$$c'^{T} E c = 0 \qquad (4)$$

    The relation between F and E can be expressed with the following equation:

$$F = P'^{-T} E P^{-1} \qquad (5)$$

where P and P′ here denote the calibration matrices of the two cameras.
    If the intrinsic camera parameters are given, then we need to know only five, and not seven, point correspondences. However, the most difficult part of the triangulation approach is exactly this: finding the corresponding points in the two images.
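
As a small sketch, equation (5) can be rearranged to recover E from F and the calibration matrices; the numerical values below are placeholders, not real calibration data:

```python
import numpy as np

# Equation (5) rearranged: F = P'^-T E P^-1 implies E = P'^T F P, where
# P and P' denote the intrinsic calibration matrices. Placeholder values.
P = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
P_prime = P.copy()                         # assume two identical cameras

F = np.array([[ 0.0, -0.2,  0.1],
              [ 0.3,  0.0, -0.4],
              [-0.1,  0.4,  0.0]])         # stand-in fundamental matrix
E = P_prime.T @ F @ P                      # essential matrix from F

# Sanity check in the original direction of equation (5):
F_back = np.linalg.inv(P_prime).T @ E @ np.linalg.inv(P)
print(np.allclose(F, F_back))              # True
```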

3      Reconstructing 3D points from an image pair
3.1     General approach
One simple algorithm for the reconstruction of a 3D point from an image pair is proposed in [BB82]. The technique involves taking two images of a scene separated by a baseline, identifying the correspondences, and applying triangulation rules to define the two lines on which the world point lies. The intersection of these lines gives us the values of the world coordinates of the 3D point.
    Unfortunately, finding the corresponding point pairs is not a trivial task. This usually happens via pattern matching. The main idea is to find a correlation between the pixels of both images. For this purpose, pixel areas from the first image are compared to pixel areas from the second, and if a pattern is found, we compute the disparity (displacement) between the positions of these patterns in the two images.
    The correlation of two images is a very expensive operation, requiring a large amount of computational power (the complexity is O(n²m²) for an m × m patch and an n × n pixel image). But the biggest disadvantage of correlation is that some parts of the 3D scene cannot be matched properly, for example when a point exists in the first view but lies hidden behind some object in the second. The greater the distance between the camera centers, the higher the possibility of such an error. Alternatively, we can place the two cameras significantly closer together, but in this case the accuracy of the depth computation decreases as well.
    Supposing enough point correspondences are found, the algorithm for determining the world point proposed in [HZ03] involves the following steps:
    • Computing the fundamental matrix F from the point pairs. At least eight corresponding point pairs are necessary for building a linear system with unknown F. The solution of this linear system yields the coefficients of the fundamental matrix.
    • Using F to determine the camera matrices P and P′. When both cameras have the same intrinsic parameters, we can use the constraint from section 2.3, namely that P′ᵀF P must be skew-symmetric. In practice we usually deal with calibrated cameras, which is to say, we have computed the essential matrix E.
    • Reconstructing the three-dimensional point C for every pair of corresponding points c and c′ with the help of the two equations c = P C and c′ = P′C given in section 2.3. The special case of a world point C lying on the baseline cannot be handled, because all points on the baseline are projected onto the epipoles and are thus not uniquely defined.
    If the intrinsic camera parameters are given, then instead of computing the fundamental matrix it is of course better to find the essential matrix. This makes the second step unnecessary, because the essential matrix E already incorporates the camera calibration parameters.
    The described method gives a solution only for the idealized version of the problem. In a real situation, where the images are distorted by different kinds of noise, the general approach will not be robust to errors. Therefore further methods with better practical results have been proposed, for example in section 3.2.3.
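
To make the general approach concrete, here is a compact sketch of the calibrated pipeline using OpenCV's Python API. The names K, pts1 and pts2 are assumptions: a shared 3 × 3 intrinsic matrix and two arrays of matched image points obtained beforehand.

```python
import cv2
import numpy as np

# Sketch of the calibrated reconstruction pipeline. Assumptions: K is the
# shared 3x3 intrinsic matrix, pts1 and pts2 are (N, 2) float32 arrays of
# corresponding image points.
E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)    # relative camera motion

# Camera matrices: first camera at the origin, second displaced by (R, t).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])

# Triangulate all pairs: homogeneous 4xN world points, then dehomogenize.
C_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
C = (C_h[:3] / C_h[3]).T                          # (N, 3) world coordinates
```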

3.2     Computation of the fundamental matrix
The importance of estimating the fundamental matrix F is clear from the previous sections. Having this matrix computed gives us the possibility to find not only the 3D points of the scene but also the camera calibrations. Therefore various computational methods have been devised for its determination.

3.2.1   Normalized eight point algorithm
We begin with the simplest method, whose fundamentals were described in section 2.2. Equation (1) holds for every point pair c and c′, which means that, in theory, any eight such pairs define F uniquely up to scale (ignoring the rank constraint, the homogeneous fundamental matrix has eight degrees of freedom). We assume that the homogeneous coordinates of the points c and c′ are (x, y, 1) and (x′, y′, 1), respectively ([HZ03]). Then every point pair defines one equation in the nine coefficients of the fundamental matrix:

$$x'x f_{11} + x'y f_{12} + x' f_{13} + y'x f_{21} + y'y f_{22} + y' f_{23} + x f_{31} + y f_{32} + f_{33} = 0 \qquad (6)$$

    But in section 2.2 we mentioned that we actually need only seven point correspondences. In fact, there is no contradiction: we really can compute the fundamental matrix from seven known point pairs, but in that case the method is less stable and needs more computational time.
    Another important issue is the singularity property of F, i.e. the additional information that det(F) = 0. If the estimated F turns out not to be singular, we replace it with the closest singular matrix F̂, namely the one minimizing the Frobenius norm ‖F − F̂‖. Forcing the singularity of F is necessary because otherwise there would be discrepancies between the epipolar lines: they would not all meet in the epipole.
    The normalized eight point algorithm was first proposed in [HH81]. It is nothing more than an improvement of the approach based on eight point correspondences described above. The important part of the normalized eight point algorithm is the cleverer construction of the linear equations (6). As pointed out in [HZ03], the normalization consists of translating and scaling the images, in order to arrange the reference points around the origin of the coordinate system before solving the linear equations. The following normalization, suggested for example in [PCF06], is a good solution to the problem: $\tilde{c}_i = K_N^{-1} c_i$, where

$$K_N = \begin{pmatrix} \frac{w+h}{2} & 0 & \frac{w}{2} \\ 0 & \frac{w+h}{2} & \frac{h}{2} \\ 0 & 0 & 1 \end{pmatrix} \qquad (7)$$

and h is the height and w is the width of the image. This transformation gives the normalized eight point algorithm better performance and a more stable result.
    Unfortunately, in reality this idealized situation is very rare, and most often we have to deal with noisy measurements. For this reason other, statistically more stable algorithms have been devised.
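
A minimal NumPy sketch of the whole procedure follows. It assumes pts1 and pts2 are (N, 2) arrays of matched pixel coordinates with N ≥ 8 and uses the normalization matrix of equation (7):

```python
import numpy as np

def normalized_eight_point(pts1, pts2, w, h):
    """Sketch of the normalized eight-point algorithm. Assumptions:
    pts1, pts2 are (N, 2) arrays of matched pixel coordinates, N >= 8;
    the normalization matrix K_N is the one from equation (7)."""
    K = np.array([[(w + h) / 2, 0, w / 2],
                  [0, (w + h) / 2, h / 2],
                  [0, 0, 1.0]])
    Kinv = np.linalg.inv(K)

    def normalize(pts):
        hom = np.column_stack([pts, np.ones(len(pts))])
        return (Kinv @ hom.T).T

    p1, p2 = normalize(pts1), normalize(pts2)
    x, y = p1[:, 0], p1[:, 1]
    xp, yp = p2[:, 0], p2[:, 1]

    # One row per correspondence, following equation (6).
    A = np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                         x, y, np.ones(len(x))])

    # Least-squares solution of A f = 0: the right singular vector
    # belonging to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)

    # Enforce singularity (rank 2) by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt

    # Undo the normalization so that F applies to pixel coordinates.
    return Kinv.T @ F @ Kinv
```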

3.2.2   Algebraic minimization algorithm
The algebraic minimization algorithm builds on the simple eight point algorithm for estimating the fundamental matrix. The difference between the two approaches is the following: after finding F with the eight point algorithm, we try to minimize the algebraic error. The linear system built from equation (6) for every point pair can be written in the form:
The linear system build from the equation (6) for every point pair can be written in the form:

$$A f = 0 \qquad (8)$$

where A is a matrix derived from the coordinates of the corresponding points and f is the vector containing the coefficients of F. The fundamental matrix F can be written as the product of a non-singular matrix M and a skew-symmetric matrix formed from e, the homogeneous coordinates of the epipole in one of the images. This decomposition allows us to write f = E m, where E is a 9 × 9 matrix computed iteratively from e and m contains the coefficients of M, and to state the minimization problem as follows: minimize ‖A E m‖ subject to ‖E m‖ = 1.
    Although iterative, this algorithm, proposed in [HZ03], is effective and simple to implement.

3.2.3   Gold standard algorithm
The gold standard algorithm belongs to the group of algorithms that try to minimize the geometric image distance. It uses some of the previous methods as a basis and brings the most important improvement: it performs very well in real situations. The most common type of noise appearing in real measurements is Gaussian. Therefore we should take advantage of statistical models rather than pursue exact results. There are two assumptions we have to make. The first is that we are dealing with erroneous measurements, which in fact describes the real situation. The second is that the noise in our images has a Gaussian distribution. Under these assumptions, our model reduces to a minimization problem. That is to say, we can calculate the fundamental matrix by minimizing the cost function corresponding to the maximum likelihood estimate:

$$\sum_i d(c_i, \hat{c}_i)^2 + d(c'_i, \hat{c}'_i)^2 \qquad (9)$$

    The terms d(cᵢ, ĉᵢ) and d(c′ᵢ, ĉ′ᵢ) measure the distances between the observed points cᵢ and c′ᵢ and the estimated exact (correct) corresponding points ĉᵢ and ĉ′ᵢ; under the Gaussian noise assumption, minimizing this sum maximizes the likelihood of the observations.
    The gold standard algorithm provides the best results of all the discussed methods in terms of stability in systems distorted by Gaussian noise. Since this is the case for almost every real-world model, one can be confident that the gold standard algorithm will give back the most accurate results.

3.2.4   Automatic computation of the fundamental matrix
If we want to use triangulation methods in robotics, there is one very important step we must not miss. We already have some very useful algorithms for computing the fundamental matrix, but this is only one part of the whole measurement process. Robot vision works on the following principle: given two input images as sensor data, the robot must somehow acquire knowledge of exact object positions. The missing part of this process is the answer to the question: how can a robot detect the point correspondences it needs in order to compute the fundamental matrix? An algorithm able to detect point correspondences automatically is required.
    Fortunately, many algorithms for extracting key features from images are available. For example, the Harris detector can be used to find corners in an image. It is a simple approach, but its biggest disadvantage is that it is scale dependent. However, adapting the Harris detector to be invariant to affine transformations is not an impossible task; a very successful combination of the Harris and Laplacian detectors is presented in [MS04]. There are, of course, a great number of algorithms detecting so-called "points of interest". For example, the Laplacian and Difference of Gaussian (DoG) detectors work on the principle of finding areas with rapidly changing intensity values. They are scale invariant, because they filter the image with a Gaussian kernel and in this way define regions containing structures of interest.
    Another very interesting approach to detecting key structures in a picture is presented in [KZB01] and is called the salient region detector. The main idea of this method is to use local complexity as a measure of saliency. An area is marked as salient only if the local attributes in this area show unpredictability over a certain set of scales. The procedure consists of three steps. First, the Shannon entropy H(s) is calculated for different scales; in the second step, the optimal scales are selected as the scales with the highest entropy. In the next step, the magnitude of the change of the probability density function W(s) as a function of scale at each peak is calculated, and finally the result is formed as the product of both H(s) and W(s) for each circular window with radius s. This method can be further extended to become affine invariant.
    Specifically for the needs of stereo analysis, an algorithm calculating so-called maximally stable extremal regions (MSER) was developed and suggested in [MCUP02]. These maximally stable extremal regions are detected on the basis of local binarization of the image, and an examination of their properties shows some very positive characteristics: they are invariant to affine transformations, they are stable, and they allow multi-scale detection, which means that fine structures are detected as well as very large ones. The informal explanation of the MSER concept is the following: all pixels of an image are divided into two groups according to a varying threshold. Shifting the threshold from one end of the intensity scale to the other makes the binary image change. In this way we can define regions of maximum intensity, and inverting the image gives us, correspondingly, the minimum regions. The authors of the paper propose an algorithm running with complexity O(n log log n), which guarantees fast, nearly linear performance with increasing pixel count.
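
As an illustration of such an automatic pipeline, the sketch below uses OpenCV's ORB detector as a stand-in for the interest point detectors discussed above and estimates F robustly with RANSAC; the image file names are placeholders:

```python
import cv2
import numpy as np

# Sketch of automatic fundamental-matrix estimation. ORB stands in for the
# interest-point detectors discussed above; RANSAC rejects the inevitable
# false matches before F is estimated. File names are placeholders.
img1 = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming matching with cross-check for tentative matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Robust estimation: RANSAC discards outlier matches while fitting F.
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
print(F, int(inlier_mask.sum()), "inliers")
```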

3.3     Image rectification
Image rectification is a frequently used method in computer vision that simplifies the search for matching points between the images. The simple idea behind image rectification is to reproject both images so that they share a common plane. The benefits of this transformation are significant: if we want to find the match c′ of a point c, we do not need to search the whole image plane, but only one line of it, and this line is moreover parallel to the x-axis. The idea is implemented by projecting both images onto another plane such that their epipolar lines become scanlines of the new images, parallel to the baseline [FP03].
    An important point to mention is that image rectification algorithms are based on the already discussed methods for finding corresponding points; it is an advantage when the underlying point correspondence detector works automatically. The next steps involve mapping the epipole of one image to a point at infinity and then computing a matching transformation for the other image, so that corresponding epipolar lines align. This algorithm is explained in [HZ03].
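
A minimal sketch with OpenCV follows, assuming pts1, pts2 and F come from a matching step like the one sketched in section 3.2.4 and img1, img2 are the two input images:

```python
import cv2

# Uncalibrated rectification sketch. Assumptions: pts1, pts2 and F come
# from a prior matching step; img1, img2 are the two input images.
h, w = img1.shape[:2]
ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))
if ok:
    rect1 = cv2.warpPerspective(img1, H1, (w, h))
    rect2 = cv2.warpPerspective(img2, H2, (w, h))
    # Corresponding points now lie on (approximately) the same scanline.
```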

4      Triangulation methods
In this chapter we state the problems arising in triangulation and their solutions. Assuming that the fundamental matrix F and both camera matrices P and P′ are given and we can rely on their correctness, the first idea that immediately comes to mind is to back-project the rays through the corresponding image points c and c′. The point in 3D space where these rays intersect is exactly what we seek. At first this idea seems to work, but in practice we can never be sure that the images contain perfect measurements. For a noise-distorted image pair the idea fails, because the back-projected rays will not intersect in a 3D point at all.
    One possible solution to this problem, already discussed in section 3.2.3, is to estimate the fundamental matrix and the world point C simultaneously, using the gold standard algorithm. The second possibility is to obtain an optimal maximum likelihood estimate of the point. In the following sections we discuss the second possibility; for the first one, please refer to section 3.2.3.

4.1    Linear triangulation methods
The fact that the two rays calculated from the image points do not cross at a world point can be expressed geometrically by the statement: c = P C and c′ = P′C are not both satisfied by any C. We can remodel and combine these two equations into one equation of the form AC = 0, where A is a matrix derived from the homogeneous coordinates c = (x, y, 1)ᵀ and c′ = (x′, y′, 1)ᵀ as well as the rows p¹ᵀ, p²ᵀ, p³ᵀ and p′¹ᵀ, p′²ᵀ, p′³ᵀ of the camera matrices. As suggested in [HZ03]:

$$A = \begin{pmatrix} x\,p^{3T} - p^{1T} \\ y\,p^{3T} - p^{2T} \\ x'\,p'^{3T} - p'^{1T} \\ y'\,p'^{3T} - p'^{2T} \end{pmatrix} \qquad (10)$$
    This gives a linear system of four equations for the four homogeneous coordinates (X, Y, Z, 1)ᵀ of the world point C.
    There are two linear methods for finding the best solution for C. The homogeneous method takes as solution the unit singular vector corresponding to the smallest singular value of A. The alternative inhomogeneous method turns the set of equations into an inhomogeneous set of linear equations by fixing the last coordinate of C to 1.
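
A minimal NumPy sketch of the homogeneous method, assuming 3 × 4 camera matrices P1, P2 and a correspondence (x, y) ↔ (xp, yp):

```python
import numpy as np

# Homogeneous (DLT) linear triangulation, following equation (10).
# Assumptions: P1, P2 are 3x4 camera matrices; (x, y) and (xp, yp)
# are the matched image coordinates in the two views.
def triangulate_dlt(P1, P2, x, y, xp, yp):
    A = np.vstack([
        x * P1[2] - P1[0],    # x  * p3T  - p1T
        y * P1[2] - P1[1],    # y  * p3T  - p2T
        xp * P2[2] - P2[0],   # x' * p'3T - p'1T
        yp * P2[2] - P2[1],   # y' * p'3T - p'2T
    ])
    # The unit singular vector of the smallest singular value solves A C = 0.
    _, _, Vt = np.linalg.svd(A)
    C = Vt[-1]
    return C / C[3]           # dehomogenize to (X, Y, Z, 1)
```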
    All linear methods share the same disadvantage: they are not projective invariant, which means that the result does not transform consistently when c, c′, P and P′ are transformed under the laws of projective geometry. In other words, there is in general no transformation H for which τ(c, c′, P, P′) = H⁻¹ τ(c, c′, P H⁻¹, P′H⁻¹), where τ(·) denotes the triangulation function. Thus there are more suitable methods for solving the same problem, discussed in the following sections.

4.2    Minimization of geometric error
As we assumed in the previous section, the measured image points c and c′ do not satisfy the epipolar constraint, because they are distorted by noise. If we denote by ĉ and ĉ′ the corresponding points that do satisfy the epipolar constraint, then we can turn the problem into a minimization problem:

$$\min\; d(c, \hat{c})^2 + d(c', \hat{c}')^2 \qquad (11)$$
where d(a, b) stands for the Euclidean distance and the constraint ĉ′ᵀF ĉ = 0 holds. Once we find the points ĉ and ĉ′, the solution for C is easy and can be calculated by any triangulation method.

4.3    Sampson approximation
An alternative to minimizing the geometric error is the so-called Sampson approximation. Without examining it in detail, we give an overview of the method. The Sampson correction δ_c is applied to the measurement vector (x, y, x′, y′)ᵀ, where (x, y)ᵀ and (x′, y′)ᵀ are the coordinates of the points c and c′, respectively. Logically, the corrected estimate can be presented as the quantity calculated from the faulty measurements plus the Sampson correction δ_c. After some transformations (for details, please refer to [HZ03]), the end result looks like:
                                                                                  
$$\begin{pmatrix} \hat{x} \\ \hat{y} \\ \hat{x}' \\ \hat{y}' \end{pmatrix} = \begin{pmatrix} x \\ y \\ x' \\ y' \end{pmatrix} - \frac{c'^{T} F c}{(Fc)_1^2 + (Fc)_2^2 + (F^{T}c')_1^2 + (F^{T}c')_2^2} \begin{pmatrix} (F^{T}c')_1 \\ (F^{T}c')_2 \\ (Fc)_1 \\ (Fc)_2 \end{pmatrix} \qquad (12)$$
where, for example, the expression (Fᵀc′)₁ stands for the polynomial f₁₁x′ + f₂₁y′ + f₃₁. The Sampson approximation is accurate only when the needed correction is very small. Otherwise there is a more stable algorithm, presented in the next section, whose results satisfy the epipolar constraint exactly.
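
A direct NumPy transcription of equation (12), assuming F and a noisy correspondence are given:

```python
import numpy as np

# First-order Sampson correction from equation (12). Assumptions: F is a
# fundamental matrix, (x, y) and (xp, yp) a noisy point correspondence.
def sampson_correct(F, x, y, xp, yp):
    c = np.array([x, y, 1.0])
    cp = np.array([xp, yp, 1.0])
    Fc = F @ c                       # epipolar line in the second image
    Ftcp = F.T @ cp                  # epipolar line in the first image
    err = cp @ F @ c                 # algebraic epipolar error c'^T F c
    denom = Fc[0]**2 + Fc[1]**2 + Ftcp[0]**2 + Ftcp[1]**2
    delta = (err / denom) * np.array([Ftcp[0], Ftcp[1], Fc[0], Fc[1]])
    xh, yh, xph, yph = np.array([x, y, xp, yp]) - delta
    return (xh, yh), (xph, yph)      # corrected points, approximately
                                     # satisfying the epipolar constraint
```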

4.4    The optimal solution
The optimal algorithm tries to return an accurate result by finding the global minimum of a cost function similar to the likelihood cost (9) presented in the previous chapter. Using the knowledge that a corresponding point always lies on the corresponding epipolar line, we define the cost function as:

$$d(c, l)^2 + d(c', l')^2 \qquad (13)$$

where l and l′ are the epipolar lines corresponding to the points c′ and c, respectively. With a proper parameterization of the epipolar pencils in the images, the solution of this minimization problem is optimal.

5       Practical Examples of Triangulation
5.1     Triangulation with structured light
In all triangulation methods with structured light, one of the two cameras is replaced by a light source. Therefore these techniques are often referred to as active triangulation. The following sections present the most basic principles of triangulation via structured light. There are, of course, many variations and improvements, but the basic idea always remains the same.

5.1.1    Light spot technique
The light spot technique is based on a simple construction with a laser ray, an objective lens and a detector, which can be either a charge-coupled device (CCD) or a position-sensing detector (PSD). As shown in Figure 2, the laser ray points at the object's surface, and the lens projects the illuminated spot onto the detector, where it produces differences in the electric current on the light-sensitive area of the PSD. On the basis of these differences we can measure the exact position of the spot on the sensor and, correspondingly, calculate the position of its source point on the object. A surface is scanned via sample points.
Figure 2: Visualization of how depth information is gained via the light spot technique.


    This technique has many advantages: the result is fast, accurate and, additionally, independent of the surface color. But there is one constraint for this method, namely that the surface must not be an ideal mirror, because part of the light has to be reflected in the direction of the objective. There are also some problems to be solved. For example, if part of the surface is hidden by another structure of the same surface, it is impossible for the laser ray to reach the hidden part.

5.1.2   Stripe projection
The main idea of stripe projection is to exploit how the object's surface modulates the input signal. In this particular case the input signal is a single laser line. Where the line intersects an object, we see in the image taken by the camera a displacement of the light stripe proportional to the distance of the object. For this purpose we need to know beforehand where the line would be projected if no object were placed in front of the camera. Having this information and knowing how the measured object deflects the light line, we can easily estimate the positions of almost all 3D points lying on the object. Figure 3 illustrates the geometry of this approach:




Figure 3: (a) The camera registers the point displacement, so we can calculate its position in 3D space. (b) Demonstration of the method's implementation.


    We know exactly where our point would be projected on the reference surface, and the distance r to the reference surface is also known in advance. This means that if we manage to find h, the distance to the object is simply the difference between r and h. Finding h is a simple task given the displacement d and the angle θ at which the laser ray is inclined:

$$h = \frac{d}{\tan \theta} \qquad (14)$$
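
A toy numeric check of equation (14) follows; the displacement, laser angle and reference distance are assumed values, not measurements:

```python
import math

# Toy numeric check of equation (14). The values are assumptions: a stripe
# displaced by d = 12 mm on the reference plane, a laser inclined at
# theta = 30 degrees, and a known reference distance r = 500 mm.
d = 12.0                       # measured displacement [mm]
theta = math.radians(30.0)     # laser inclination angle
r = 500.0                      # known distance to the reference surface [mm]

h = d / math.tan(theta)        # height of the surface point, equation (14)
distance_to_object = r - h
print(f"h = {h:.1f} mm, object distance = {distance_to_object:.1f} mm")
```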

5.1.3   Projection of a static line pattern
One obvious disadvantage of the stripe projection method is that objects are scanned line by line, which means slow performance: one image is required per line. To make the approach faster, we can project several lines simultaneously. The end result is a static line pattern deformed by the object's surface. Although this improvement shows better results, it has an additional disadvantage: on surfaces with rapidly changing shape, the course of each individual line is very difficult to follow. Therefore the projection of a static line pattern should be further extended to encode every single point of the surface uniquely. Many methods have been developed to accomplish this task; some of them are discussed in the following section.

5.1.4   Projection of encoded patterns
In order to encode each surface point uniquely, the static stripe pattern has to be extended. This can be done either by adding more colors to the projected pattern, or by taking several pictures of the scene illuminated by a slightly changing pattern.
    Figure 4 illustrates one example of a projected pattern. Stripe patterns with different wavelengths are projected successively, building a unique code for every point on the surface. The same procedure can also be repeated with horizontal lines. This approach, called binary coding via a sequence of fringe patterns, is very successful, but it fails when the stripe pattern needs to be very fine. If high-resolution position information is required, a better approach is the projection of phase-shifted patterns with the same wavelength. This, however, is suitable only for really fine structures; therefore hybrid methods are used most often, i.e. a mixture of both methods, which precisely encodes rough as well as fine object structures.




Figure 4: Active triangulation by projecting stripe patterns onto the object's surface. The different wavelengths of the patterns encode every point uniquely. The images are taken from [Jäh05].


    The second possibility for encoding each point uniquely is via a colored pattern. Certain variations of the pattern are possible; for example, the stripes can be differentiated not only by color, but also by width and by the pattern itself. Because the projected pattern is known beforehand, the observed result and the expected pattern are finally compared. In this way, any occlusions that are present can be detected very easily.

5.1.5   Light spot stereo analysis
The light spot stereo analysis is a method inspired by human binocular vision. Two cameras take pictures of the scene. A laser spot is projected onto the object's surface and registered by the camera pair. The disparity between the laser points in the two images serves as the basis for computing the distance to that point. This approach is a mixture of the active triangulation methods and triangulation from stereo vision.

5.2     Triangulation from stereo vision
Similar to the human vision system, triangulation from stereo vision works with two cameras. The distance between the plane in which both cameras are positioned and a 3D point in space can be measured easily. As shown in Figure 5, the distance vector between the cameras, called the stereoscopic basis, is denoted by b. Assuming that the distance X₃ is at least two times greater than the focal length d′, we can express the relationship between these three quantities as:

$$p = b\,\frac{d'}{X_3} \qquad (15)$$
The newly introduced quantity p is referred to as parallax or disparity; its geometric meaning is the offset between the two projections of the world point on the image planes defined by the parallel optical axes of the camera pair. For more details and a derivation of equation (15), please refer to [Jäh05].




Figure 5: The view angles of both cameras, visualizing geometrically how the disparity depends on the focal length d′ of the cameras, the stereoscopic basis b and the distance to the object X₃.


    There are some interesting consequences of equation (15). Firstly, one can conclude that the disparity p is proportional to the stereoscopic basis b. Secondly, the disparity is inversely proportional to the distance X₃ to the measured object. Summarizing these observations: a greater distance to the measured object means a loss of accuracy in estimating the depth information, while a bigger stereoscopic basis, on the contrary, yields higher precision.
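
A small numeric illustration of equation (15) follows; the focal length, stereoscopic basis and disparity are assumed values, not measurements:

```python
# Numeric illustration of equation (15). The values are assumptions:
# focal length d' = 8 mm, stereoscopic basis b = 120 mm, and a disparity
# p of 0.6 mm measured on the sensor.
d_prime = 8.0          # focal length [mm]
b = 120.0              # stereoscopic basis [mm]
p = 0.6                # measured disparity [mm]

X3 = b * d_prime / p   # distance to the object, solving p = b * d' / X3
print(f"X3 = {X3:.0f} mm")   # 1600 mm

# Halving p doubles X3: accuracy degrades quickly for distant objects,
# while a larger basis b increases the disparity and thus the precision.
```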

6    Conclusion
One very interesting issue is the use of triangulation methods in medicine. Therefore, instead of summarizing everything written so far, we would like to conclude by mentioning some real examples taken from the medical field. They illustrate perfectly the importance of such methods in science as well as in our everyday life.
    Triangulation principles are used most often in optical tracking systems. The most popular and widely used system is Polaris®, produced by Northern Digital Inc. (also known as NDI). The members of the Polaris® family (presented in Figure 6) offer passive, active and hybrid tracking; the points needed for the triangulation itself are implemented as markers fixed on the surgical instruments.




Figure 6: The picture, taken from the NDI webpage [pol], shows two members of the Polaris® family.


    Another example of optical tracking systems, used rather for research purposes, are the ART® systems produced by Advanced Realtime Tracking GmbH (ART GmbH). The example system smARTtrack®, presented in Figure 7, consists of two cameras fixed on a rigid bar, so that no calibration is needed. Different configurations are possible, depending on parameters like focal length, baseline length and the angle between the cameras. The ART® trademark is very popular because it allows the building of systems with multiple cameras, for example with three, four or five cameras.




Figure 7: The picture from the ART webpage [art] illustrates the smARTtrack® stereo vision system.


    3D vision systems can also be implemented in endoscopic instruments. For this purpose two very small cameras are embedded in a tube, with a relatively small stereoscopic basis; this is no problem, because the distances measured in endoscopy are also very limited. Figure 8 (a) shows what such a device looks like. The whole surgical system presented in Figure 8 is called Da Vinci® and consists of a high-resolution 3D endoscope coupled with two 3-chip cameras and a console that helps with visualizing the camera recordings and with repositioning the surgical camera inside the patient. For more technical details about this system, please refer to the producer's webpage [daV].





Figure 8: (a) Da Vinci® 3D endoscope with two cameras. (b) Da Vinci® console, which helps the surgeon position the instruments in the patient's body and visualizes the camera recordings.

References

[art]      ART Systems homepage. http://www.ar-tracking.de/smARTtrack.49.0.html

[BB82]     Ballard, Dana H.; Brown, Christopher M.: Computer Vision. 2nd edition. Prentice Hall, 1982

[daV]      Da Vinci Surgical System homepage. http://www.intuitivesurgical.com/products/davinci_surgicalsystem/3d.aspx

[FP03]     Forsyth, David A.; Ponce, Jean: Computer Vision: A Modern Approach. Prentice Hall, 2003

[HH81]     Longuet-Higgins, H.C.: A Computer Algorithm for Reconstructing a Scene from Two Projections. Nature, 1981

[HZ03]     Hartley, Richard; Zisserman, Andrew: Multiple View Geometry in Computer Vision. 2nd edition. Cambridge University Press, 2003

[Jäh05]    Jähne, Bernd: Digitale Bildverarbeitung. 6th edition. Springer Verlag, 2005

[KZB01]    Kadir, Timor; Zisserman, Andrew; Brady, Michael: An affine invariant salient region detector. In: Department of Engineering Science, University of Oxford (2001)

[MCUP02]   Matas, J.; Chum, O.; Urban, M.; Pajdla, T.: Robust Wide Baseline Stereo from Maximally Stable Extremal Regions. In: Center for Machine Perception, Dept. of Cybernetics, CTU Prague (2002)

[MS04]     Mikolajczyk, Krystian; Schmid, Cordelia: Scale and Affine Invariant Interest Point Detectors. In: International Journal of Computer Vision 60(1) (2004), January, pp. 63–86

[PCF06]    Paragios, Nikos; Chen, Yunmei; Faugeras, Olivier: Handbook of Mathematical Models in Computer Vision. Springer Verlag, 2006

[pol]      NDI homepage. http://www.ndigital.com/medical/polarisfamily.php

[Wol]      Wolfram Research: Wolfram MathWorld. http://mathworld.wolfram.com/AntisymmetricMatrix.html

Contenu connexe

Tendances

Lecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationLecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationErnst Schrama
 
Eui math-course-2012-09--19
Eui math-course-2012-09--19Eui math-course-2012-09--19
Eui math-course-2012-09--19Leo Vivas
 
Ieml semantic topology
Ieml semantic topologyIeml semantic topology
Ieml semantic topologyAntonio Medina
 
Morton john canty image analysis and pattern recognition for remote sensing...
Morton john canty   image analysis and pattern recognition for remote sensing...Morton john canty   image analysis and pattern recognition for remote sensing...
Morton john canty image analysis and pattern recognition for remote sensing...Kevin Peña Ramos
 
Real-Time Non-Photorealistic Shadow Rendering
Real-Time Non-Photorealistic Shadow RenderingReal-Time Non-Photorealistic Shadow Rendering
Real-Time Non-Photorealistic Shadow RenderingTamás Martinec
 
Fuzzy and Neural Approaches in Engineering MATLAB
Fuzzy and Neural Approaches in Engineering MATLABFuzzy and Neural Approaches in Engineering MATLAB
Fuzzy and Neural Approaches in Engineering MATLABESCOM
 
Applied Stochastic Processes
Applied Stochastic ProcessesApplied Stochastic Processes
Applied Stochastic Processeshuutung96
 
A Matlab Implementation Of Nn
A Matlab Implementation Of NnA Matlab Implementation Of Nn
A Matlab Implementation Of NnESCOM
 
Emotions prediction for augmented EEG signals using VAE and Convolutional Neu...
Emotions prediction for augmented EEG signals using VAE and Convolutional Neu...Emotions prediction for augmented EEG signals using VAE and Convolutional Neu...
Emotions prediction for augmented EEG signals using VAE and Convolutional Neu...BouzidiAmir
 
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zinggFundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zinggRohit Bapat
 
Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...stainvai
 

Tendances (19)

phd thesis
phd thesisphd thesis
phd thesis
 
Lecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationLecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determination
 
Eui math-course-2012-09--19
Eui math-course-2012-09--19Eui math-course-2012-09--19
Eui math-course-2012-09--19
 
Ieml semantic topology
Ieml semantic topologyIeml semantic topology
Ieml semantic topology
 
Cg notes
Cg notesCg notes
Cg notes
 
t
tt
t
 
Morton john canty image analysis and pattern recognition for remote sensing...
Morton john canty   image analysis and pattern recognition for remote sensing...Morton john canty   image analysis and pattern recognition for remote sensing...
Morton john canty image analysis and pattern recognition for remote sensing...
 
Real-Time Non-Photorealistic Shadow Rendering
Real-Time Non-Photorealistic Shadow RenderingReal-Time Non-Photorealistic Shadow Rendering
Real-Time Non-Photorealistic Shadow Rendering
 
thesis
thesisthesis
thesis
 
1 Introduction

The ability of our brain to perceive the relative distance between objects, and to a particular object, is not a consequence of measuring exact lengths; we estimate it from the information collected through our eyes. This approach is widely used in robotics, because the decisions to be taken are based on knowing the spatial relationships between objects, and these can almost never be measured directly.

One method for computing the location of a point in three-dimensional space is to compare two or more images taken from different points of view. In the case of two images, this method is called triangulation. The human visual perception system is based on this principle: it produces two slightly displaced images which encode information about positions in space.

The next chapter introduces the basic terms of triangulation. For a better understanding, the reader should be familiar with the geometry behind this concept and with the mathematical description of the problem.
2 Basics

In the following section we explain the basic concepts on which triangulation is based. The underlying mathematics is often referred to as epipolar geometry. Several special matrices are involved in the description of the relationships between points in three-dimensional space and their images, and we examine them in the next few sections. They precisely describe the main principles of stereo vision.

2.1 Epipolar geometry

Epipolar geometry is a subset of projective geometry which helps in searching for corresponding points between two images. This geometry is independent of the scene structure; it depends only on the cameras used, their internal parameters and their relative positions. The line connecting the camera centers (also referred to as the baseline) intersects the image planes in the points e and e', which are called the epipoles. Another important term is the epipolar plane π. Three points are needed to define this plane: an external object point C and its two projections on the images. Equivalently, the plane is defined by the object point and the two camera centers, because the projection of the 3D point lies on the line CA, respectively CB. From Figure 1 it is obvious that π is not the same for all three-dimensional points, but all epipolar planes contain the baseline.

If we know c, the projection of C on the first image, and additionally the plane π, the projection on the other image plane is still ambiguous; c' is not fixed. However, instead of searching the whole second image plane, we can reduce the computational time by searching for the projected point along only one line, the epipolar line e'c'.

Figure 1: The 3D point C is projected on both images in c and c'. The points A and B indicate the centers of two pinhole cameras; by definition they are located on the epipolar plane π.
2.2 Fundamental matrix

The fundamental matrix F is a 3 × 3 matrix representing the connection between a point in the first image and a line in the second. This kind of mapping is very important, because once the fundamental matrix is computed, we can easily estimate the remaining correspondences. As shown in Figure 1, any possible corresponding point of the projection c belongs to the epipolar line of the other image.

There are different ways of computing F, depending on the information we have. If both cameras are calibrated, which means their intrinsic parameters are given, and information about the plane π is additionally available, then the correlation between the points in the two image planes is already defined. This correlation is described mathematically by F.

The most important property of the fundamental matrix F is that for all pairs of corresponding points c and c' the following equation must hold:

    c'^T F c = 0    (1)

Well known as the correspondence condition, this equation implies that there is another way of computing the correspondences between two sets of points. In contrast to the possibility discussed above, knowledge of the camera matrices P and P' is not necessary here: the fundamental matrix can be estimated from the coordinates of at least seven corresponding point pairs [HZ03].

From the correspondence condition further interesting properties of the fundamental matrix can be derived, for example the definition of the epipolar line:

    l' = F c    (2)

where l' is the epipolar line in the second image. The analogous equation holds for the epipolar line in the first image, because the relation is transposable; there is no prioritization of image or camera, and both pictures are treated equally.

F is a homogeneous matrix, so it should have eight degrees of freedom. It actually has seven, because its determinant is zero by construction. This observation is directly connected to the number of point correspondences needed for the computation of F. Finally, we have to mention that assigning a line to a point is unidirectional: trying to find a single corresponding point to a line is meaningless and not possible.

2.3 Camera matrices

The camera matrices P and P' describe the projective properties of the cameras. One important issue, discussed in [HZ03], is how these matrices relate to the fundamental matrix F. In one direction, P and P' determine a unique fundamental matrix F; in the other direction, the camera matrices can be determined from F only up to a projective transformation of 3D space. The resulting ambiguity can be resolved by adding a constraint: the product P'^T F P must be skew-symmetric. Skew-symmetric matrices (also known as antisymmetric matrices [Wol]) have the form

    [    0     a12   a13 ]
    [ −a12      0    a23 ]    (3)
    [ −a13   −a23     0  ]

or, in other words, they satisfy the condition A = −A^T.

As stated in equation (1), c'^T F c = 0. Knowing the relations c = P C and c' = P' C, we see that the matrix P'^T F P, which appears in C^T P'^T F P C = 0, must indeed be antisymmetric.
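To make the correspondence condition (1) and the epipolar-line mapping (2) concrete, the following minimal numpy sketch computes the line l' = F c and verifies c'^T F c ≈ 0 for a point constructed on that line; the matrix and coordinate values are made up purely for illustration.

```python
import numpy as np

# A rank-deficient 3x3 matrix standing in for F (illustrative values only,
# not estimated from a real camera pair).
F = np.array([[ 0.0, -1.0,  2.0],
              [ 1.0,  0.0, -3.0],
              [-2.0,  3.0,  0.0]])

c = np.array([10.0, 20.0, 1.0])       # point in the first image (homogeneous)

l_prime = F @ c                        # epipolar line l' = F c, eq. (2)

# Pick a second-image point lying on l', so the pair is "corresponding":
x_p = 1.0
y_p = -(l_prime[0] * x_p + l_prime[2]) / l_prime[1]
c_prime = np.array([x_p, y_p, 1.0])

# Correspondence condition (1): c'^T F c = 0
residual = c_prime @ F @ c
print(f"epipolar line l' = {l_prime}, residual c'^T F c = {residual:.2e}")
```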
2.4 Essential matrix

The matrix discussed in section 2.2 is a generalization of another matrix, called the essential matrix E. Both matrices represent the epipolar constraint defined in the same section, but in the case of E we additionally have information about the intrinsic camera parameters; the cameras are said to be calibrated. The intrinsic camera parameters are, for example, the focal length of the camera, the image format, the principal point and the radial distortion coefficients of the lens. This additional information reduces the degrees of freedom of the essential matrix to five: three degrees of freedom of the rotation matrix R and two degrees of freedom of the vector t, where t is the coordinate vector of the translation from A to B separating the two cameras' coordinate systems (more information is available in [FP03]).

The epipolar constraint is also satisfied by the essential matrix:

    c'^T E c = 0    (4)

The relation between F and E can be expressed with the following equation:

    F = P'^{-T} E P^{-1}    (5)

where P and P' are the calibration matrices already discussed in section 2.3. If the intrinsic camera parameters are given, we need to know only five, not seven, point correspondences. However, the most difficult part of the triangulation approach is exactly finding the corresponding points in the two images.
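Equation (5) can be inverted to recover the essential matrix once the calibration matrices are known, E = P'^T F P. A minimal numpy sketch, with made-up intrinsics (focal lengths and principal points are illustrative values, not from a real calibration):

```python
import numpy as np

# Hypothetical intrinsic calibration matrices of the two cameras.
P = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
P_prime = P.copy()                      # assume two identical cameras

def essential_from_fundamental(F, P, P_prime):
    """Inverted equation (5): E = P'^T F P."""
    return P_prime.T @ F @ P

def fundamental_from_essential(E, P, P_prime):
    """Equation (5): F = P'^{-T} E P^{-1}."""
    return np.linalg.inv(P_prime).T @ E @ np.linalg.inv(P)
```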
3 Reconstructing 3D points from an image pair

3.1 General approach

One simple algorithm for the reconstruction of a 3D point from an image pair is proposed in [BB82]. The technique involves taking two images of a scene separated by a baseline, identifying the correspondences and applying triangulation rules to define the two rays on which the world point lies. The intersection of these rays gives us the world coordinates of the 3D point.

Unfortunately, finding the corresponding point pairs is not trivial. This usually happens via pattern matching. The main idea is to find a correlation between the pixels of the two images: pixel areas from the first image are compared to pixel areas from the second, and when a pattern has been found, we compute the disparity (displacement) between the positions of these patterns in the two images. The correlation of two images is a very expensive operation, requiring a large amount of computational power (the complexity is O(n² m²) for an m × m patch and an n × n pixel image). But the biggest disadvantage of correlation is that some parts of the 3D scene cannot be matched properly, for example when a point visible in the first view lies hidden behind some object in the second view. The larger the distance between the camera centers, the higher the probability of such an error. We can instead place the two cameras much closer together, but in that case the accuracy of the depth computation decreases as well.

Supposing enough point correspondences are found, the algorithm for determining the world point proposed by [HZ03] involves the following steps:

• Computing the fundamental matrix F from the point pairs. At least eight corresponding point pairs are necessary to build a linear system with unknown F; its solution yields the coefficients of the fundamental matrix.

• Using F to determine the camera matrices P and P'. In the case where both cameras have the same intrinsic parameters, we simply use the skew-symmetry constraint on P'^T F P introduced in section 2.3. In practice we usually deal with calibrated cameras, which is to say we have computed the essential matrix E.

• Reconstructing the three-dimensional point C for every pair of corresponding points c and c' with the help of the two equations c = P C and c' = P' C given in section 2.4. The special case of a world point C lying on the baseline cannot be handled, because all points on the baseline are projected onto the epipoles and are thus not uniquely defined.

If the intrinsic camera parameters are given, then instead of computing the fundamental matrix it is of course better to find the essential matrix. This makes the second step unnecessary, because the essential matrix E already contains the camera calibration parameters.

The described method gives a solution only for the idealized case of the problem. In a real situation, where the images are distorted by different kinds of noise, the general approach is not error resistant. Therefore further methods with better practical results have been proposed, for example in section 3.2.3.

3.2 Computation of the fundamental matrix

The importance of estimating the fundamental matrix F is clear from the previous sections. Having this matrix gives us the possibility to find not only the 3D points of the scene but also the camera calibrations; various computational methods have therefore been developed for its determination.
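As a practical aside (not part of the original text), OpenCV bundles several such estimators behind a single call; a minimal sketch, assuming matched pixel coordinates are already available and using placeholder values in their place:

```python
import numpy as np
import cv2

# pts1, pts2: N x 2 arrays of matched pixel coordinates in the first and
# second image; the values below are placeholders for real matches.
rng = np.random.default_rng(0)
pts1 = (rng.random((20, 2)) * [640, 480]).astype(np.float32)
pts2 = pts1 + np.float32([30, 0]) + rng.random((20, 2)).astype(np.float32)

# Eight-point estimate of F; mask flags the correspondences actually used.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)
print("estimated F:\n", F)
```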
3.2.1 Normalized eight point algorithm

We begin with the simplest method, whose fundamentals were described in section 2.2. Equation (1) holds for every point pair c and c', which means that in theory any eight such pairs define F uniquely up to scale (as a homogeneous matrix, the fundamental matrix has eight degrees of freedom). Assume that the homogeneous coordinates of the points c and c' are (x, y, 1) and (x', y', 1) respectively [HZ03]. Then every point pair defines one equation in the nine coefficients of the fundamental matrix:

    x'x f11 + x'y f12 + x' f13 + y'x f21 + y'y f22 + y' f23 + x f31 + y f32 + f33 = 0    (6)

In section 2.2 we mentioned that only seven point correspondences are actually needed; there is no contradiction. The fundamental matrix can indeed be computed from seven known point pairs, but in that case the method is less stable and needs more computational time.

Another important issue is the singularity property of F, i.e. the additional information that det(F) = 0. If the estimated F turns out not to be singular, we replace it with the closest singular matrix F', in the sense of minimizing the Frobenius norm ||F − F'||. Forcing the singularity of F is necessary, because otherwise there would be discrepancies between the epipolar lines: they would not all meet in the epipole.

The normalized eight point algorithm was first proposed in [HH81]. It is an improvement of the approach just described, based on eight point correspondences. The important part of the normalized eight point algorithm is a cleverer construction of the linear equations (6). As pointed out by [HZ03], the normalization consists in translating and scaling the images, in order to arrange the reference points around the origin of the coordinate system before solving the linear equations. The following normalization, suggested in [PCF06], is for example a good solution: c_i → K_N^{-1} c_i, where

    K_N = [ (w+h)/2      0      w/2 ]
          [     0     (w+h)/2   h/2 ]    (7)
          [     0         0      1  ]

and h is the height and w the width of the image. This transformation gives the normalized eight point algorithm better performance and a more stable result. Unfortunately, this idealized noise-free situation is very rare in reality, and we most often have to deal with noisy measurements. For this reason other, statistically more stable algorithms have been invented.
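Before turning to those, the following Python sketch (an illustration, not the paper's reference implementation) assembles the pieces introduced so far: the normalization of equation (7), the linear system of equation (6), and the rank-2 (singularity) enforcement:

```python
import numpy as np

def normalized_eight_point(c1, c2, w, h):
    """Estimate F from N >= 8 point pairs (N x 2 arrays of pixel coords),
    using the normalization K_N of eq. (7), the linear system of eq. (6)
    and rank-2 enforcement. Minimal sketch, no outlier handling."""
    KN = np.array([[(w + h) / 2, 0, w / 2],
                   [0, (w + h) / 2, h / 2],
                   [0, 0, 1.0]])
    KN_inv = np.linalg.inv(KN)

    def normalize(pts):
        hom = np.column_stack([pts, np.ones(len(pts))])
        return (KN_inv @ hom.T).T

    n1, n2 = normalize(c1), normalize(c2)
    x, y = n1[:, 0], n1[:, 1]
    xp, yp = n2[:, 0], n2[:, 1]

    # One row per correspondence, following eq. (6):
    A = np.column_stack([xp * x, xp * y, xp,
                         yp * x, yp * y, yp,
                         x, y, np.ones(len(x))])

    # Solve A f = 0: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)

    # Enforce det(F) = 0: closest rank-2 matrix in the Frobenius norm.
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt2

    # Undo the normalization: F_pixel = K_N^{-T} F_normalized K_N^{-1}
    F = KN_inv.T @ F @ KN_inv
    return F / np.linalg.norm(F)
```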
3.2.2 Algebraic minimization algorithm

The algebraic minimization algorithm is based on the simple eight point algorithm for estimating the fundamental matrix. The difference between the two approaches is the following: after finding F by the eight point algorithm, we try to minimize the algebraic error. The linear system built from equation (6) for every point pair can be written in the form

    A f = 0    (8)

where A is the matrix derived from the coordinates of the corresponding points and f is the vector containing the nine coefficients of F. The fundamental matrix F can be written as the product of the skew-symmetric matrix of the epipole e' in one of the images and an arbitrary non-singular matrix M. Decomposing F in this way, f = Ê m, allows the minimization problem to be stated as: minimize ||A Ê m|| subject to ||Ê m|| = 1, where Ê is a 9 × 9 matrix computed iteratively from the epipole e', and m contains the coefficients of M. Although iterative, this algorithm, proposed in [HZ03], is effective and simple to implement.

3.2.3 Gold standard algorithm

The gold standard algorithm belongs to the group of algorithms that minimize a geometric image distance. It builds on some of the previous methods and brings the most important improvement of performing very well in real situations. Usually the most common type of noise appearing in real measurements is Gaussian; we should therefore exploit a statistical model rather than pursue exact results. Two things have to be assumed. The first assumption is that we are dealing with erroneous measurements, which in fact describes the real situation. Secondly, we suppose that the noise in our images has a Gaussian distribution. Under these assumptions the model reduces to a minimization problem, that is, we can calculate the fundamental matrix by minimizing the likelihood-based cost function:

    Σ_i [ d(c_i, ĉ_i)² + d(c'_i, ĉ'_i)² ]    (9)

The terms d(c_i, ĉ_i) and d(c'_i, ĉ'_i) measure how far the observations c_i and c'_i lie from the exact (correct) corresponding points ĉ_i and ĉ'_i; under Gaussian noise, minimizing these distances maximizes the probability of the observations.

The gold standard algorithm provides the best results of all the discussed methods in terms of stability in systems distorted by Gaussian noise. Since this is the case for almost every reality-based model, one can be sure that using the gold standard algorithm will return the most accurate results.

3.2.4 Automatic computation of the fundamental matrix

If we want to use triangulation methods in robotics, there is one very important step we must not miss. We already have some very useful algorithms for computing the fundamental matrix, but this is only one part of the whole measurement process. Robot vision works on the following principle: two input images are given as sensor data, and the robot must somehow acquire the knowledge of the exact object position. The missing part of this process is the answer to the question: how can a robot detect the point correspondences it needs in order to compute the fundamental matrix? An algorithm able to automatically detect point correspondences is required.

Meanwhile, many algorithms for extracting key features from images are available. For example, the Harris detector can be used to find corners in an image. It is a simple approach, with the big disadvantage of being scale dependent. However, adapting the Harris detector to be invariant to affine transformations is not an impossible task; a very successful combination of the Harris and Laplacian detectors is presented in [MS04]. There are, of course, a great number of algorithms detecting so-called "points of interest". For example, the Laplacian and Difference of Gaussian (DoG) detectors work on the principle of finding areas with rapidly changing intensity values. They are scale invariant, because they filter the image with Gaussian kernels and in this way define regions with structures of interest.
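As a small practical aside (not from the paper), interest points of the kinds just described can be extracted with standard OpenCV routines; the file name and parameter values here are hypothetical, not tuned recommendations:

```python
import cv2

img = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

# Corner detection (Shi-Tomasi variant of the Harris criterion):
corners = cv2.goodFeaturesToTrack(img, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)

# DoG-based scale-invariant keypoints with descriptors (SIFT):
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
print(len(corners), "corners,", len(keypoints), "SIFT keypoints")
```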
Another very interesting approach for detecting key structures in a picture, the so-called salient region detector, is presented in [KZB01].
The main idea of this method is to use the local complexity as a measure of saliency: an area is marked as salient only if the local attributes in this area show unpredictability over a certain set of scales. The procedure consists of three steps. First, the Shannon entropy H(s) is calculated for different scales; second, the optimal scales are selected as the scales with the highest entropy; finally, the magnitude of change of the probability density function W(s) as a function of scale is calculated at each peak, and the saliency is formed as the product of H(s) and W(s) for each circular window with radius s. This method can be further extended to become affine invariant.

Specifically for the needs of stereo problem analysis, an algorithm calculating so-called maximally stable extremal regions (MSER) was developed and suggested in [MCUP02]. On the basis of local binarization of the image, these maximally stable extremal regions are detected, and an exploration of their properties shows some very positive characteristics: they are invariant to affine transformations, stable, and allow multi-scale detection, which means that fine structures are detected as well as very large ones. The informal explanation of the MSER concept is the following: all pixels of an image are divided into two groups according to some varying threshold. Shifting the threshold from one end of the intensity scale to the other makes the binary images change. In this way we can define the regions of maximum intensity, and inverting the image gives us, respectively, the minimum regions. The authors of the paper propose an algorithm with complexity O(n log log n), which guarantees nearly linear performance as the number of pixels grows.

3.3 Image rectification

Image rectification is an often used method in computer vision that simplifies the search for matching points between the images. The simple idea behind image rectification is to project both images onto another plane, so that they are forced to share a common plane. The benefits of this transformation are significant: if we want to find the matching point c' of c, we do not need to search the whole image plane but only one line of it, and this line is parallel to the x-axis. The implementation of this idea is done by projecting both images onto a new plane, so that their epipolar lines become scanlines of the new images and are parallel to the baseline [FP03].

An important point to mention is that image rectification algorithms build on the already discussed methods for finding corresponding points; it is an advantage when the underlying point correspondence detector runs automatically. The subsequent steps involve mapping the epipole to a point at infinity and then applying the matching transformation to the other image, so that corresponding epipolar lines align. This algorithm is explained in [HZ03].
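For uncalibrated pairs, OpenCV offers a direct route to such rectifying transformations; the sketch below is an illustration with stand-in data, assuming that in practice the images, matches and F come from the steps of section 3.2:

```python
import numpy as np
import cv2

# Stand-in data; in practice img1/img2, pts1/pts2 and F come from the
# matching and estimation steps of section 3.2.
rng = np.random.default_rng(0)
img1 = np.zeros((240, 320), dtype=np.uint8)
img2 = np.zeros((240, 320), dtype=np.uint8)
pts1 = (rng.random((20, 2)) * [320, 240]).astype(np.float32)
pts2 = pts1 + np.float32([15, 0]) + rng.random((20, 2)).astype(np.float32)
F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)

if F is not None:
    h, w = img1.shape
    # Homographies H1, H2 map the epipoles to infinity and align the
    # epipolar lines with the image scanlines.
    ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))
    if ok:
        rect1 = cv2.warpPerspective(img1, H1, (w, h))
        rect2 = cv2.warpPerspective(img2, H2, (w, h))
```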
4 Triangulation methods

In this chapter we state the problems of triangulation and their solutions. Assuming that the fundamental matrix F and both camera matrices P and P' are given and we can rely on their correctness, the first idea that comes to mind immediately is to back-project the rays through the corresponding image points c and c'. The point in 3D space where these rays intersect is exactly what we are looking for. At first this idea seems to work, but in practice we can never be sure that the images contain perfect measurements. In the case of a noise-distorted image pair the idea fails, because the back-projected rays will not intersect in a 3D point at all.

One possible solution to this problem, already discussed in section 3.2.3, is to estimate the fundamental matrix and the world point C simultaneously, using the gold standard algorithm. The second possibility is to obtain an optimal maximum likelihood estimate of the point. In the following sections we discuss the second possibility; for the first one, please refer to section 3.2.3.

4.1 Linear triangulation methods

The fact that the two rays calculated from the image points do not cross in a world point can be expressed geometrically by the statement that c = P C and c' = P' C are not satisfied exactly for any C. We can remodel and combine these two equations into one equation of the form A C = 0, where A is a matrix derived from the homogeneous coordinates of the points c and c' as well as the rows p1^T, p2^T, p3^T and p'1^T, p'2^T, p'3^T of the camera matrices. As suggested in [HZ03]:

    A = [ x p3^T − p1^T    ]
        [ y p3^T − p2^T    ]    (10)
        [ x' p'3^T − p'1^T ]
        [ y' p'3^T − p'2^T ]

This gives a linear system of four equations for the four homogeneous coordinates (X, Y, Z, 1)^T of the world point C. There are two linear methods for finding the best solution for C. The homogeneous method takes as solution the unit singular vector corresponding to the smallest singular value of A; the alternative inhomogeneous method turns the set of equations into an inhomogeneous set of linear equations.

All linear methods have the same disadvantage: they are not projective invariant, which means that the reconstruction does not commute with projective transformations of the objects c, c', P and P'. In other words, there is in general no transformation H for which τ(c, c', P H^{-1}, P' H^{-1}) = H τ(c, c', P, P'), where τ(·) denotes the triangulation function. Thus there are more suitable methods for solving the same problem, discussed in the following sections.
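A minimal numpy sketch of the homogeneous method, building A from equation (10) and taking the smallest right singular vector; the camera matrices and the test point are illustrative values:

```python
import numpy as np

def triangulate_dlt(P, P_prime, c, c_prime):
    """Homogeneous linear triangulation: build A from eq. (10) and return
    the 3D point as the smallest right singular vector of A."""
    x, y = c
    xp, yp = c_prime
    A = np.vstack([x * P[2] - P[0],
                   y * P[2] - P[1],
                   xp * P_prime[2] - P_prime[0],
                   yp * P_prime[2] - P_prime[1]])
    _, _, Vt = np.linalg.svd(A)
    C = Vt[-1]
    return C[:3] / C[3]                     # dehomogenize to (X, Y, Z)

# Toy example: two cameras displaced along the x-axis (illustrative values).
P  = np.hstack([np.eye(3), np.zeros((3, 1))])               # [I | 0]
Pp = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # [I | t]
C_true = np.array([0.5, 0.2, 4.0, 1.0])
c  = (P  @ C_true)[:2] / (P  @ C_true)[2]
cp = (Pp @ C_true)[:2] / (Pp @ C_true)[2]
print(triangulate_dlt(P, Pp, c, cp))        # ~ [0.5, 0.2, 4.0]
```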
4.2 Minimization of geometric error

As assumed above, the measured image points c and c' do not satisfy the epipolar constraint, because they are distorted by noise. If we denote the corresponding points which do satisfy the epipolar constraint by ĉ and ĉ', we can turn the task into the minimization problem

    min d(c, ĉ)² + d(c', ĉ')²    (11)

where d(a, b) stands for the Euclidean distance and the constraint ĉ'^T F ĉ = 0 holds. Once we have found the points ĉ and ĉ', the solution for C is easy and can be calculated by any triangulation method.

4.3 Sampson approximation

An alternative to the minimization of the geometric error is the so-called Sampson approximation. Without examining it in detail, we give an overview of the method. The Sampson correction δ_c is expressed in the coordinates (x, y, x', y'), where (x, y)^T and (x', y')^T are the coordinates of the points c and c' respectively. Logically, the corrected pair can be presented as the faulty measurement plus the Sampson correction δ_c. After some transformations (for details, please refer to [HZ03]), the end result looks like:

    (x̂, ŷ, x̂', ŷ')^T = (x, y, x', y')^T − [ c'^T F c / ((Fc)_1² + (Fc)_2² + (F^T c')_1² + (F^T c')_2²) ] · ((F^T c')_1, (F^T c')_2, (Fc)_1, (Fc)_2)^T    (12)

where, for example, the expression (F^T c')_1 stands for the polynomial f11 x' + f21 y' + f31. The Sampson approximation is accurate only if the needed correction is very small. Otherwise, there is a more stable algorithm, presented in the next section, whose results satisfy the epipolar constraint exactly.

4.4 The optimal solution

The optimal algorithm tries to return an accurate result by finding the global minimum of a cost function similar to the likelihood function (9) presented in the previous chapter. Using the knowledge that a corresponding point always lies on the corresponding epipolar line, we define the cost function as

    d(c, l)² + d(c', l')²    (13)

where l and l' are the epipolar lines corresponding to the points c' and c respectively. With a proper parameterization of the epipolar pencils in the two images, the solution of this minimization problem is optimal.
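A direct transcription of the Sampson correction (12) into numpy; a minimal sketch assuming F and a measured point pair are given, with no claim about behaviour when the required correction is large:

```python
import numpy as np

def sampson_correct(F, c, c_prime):
    """Apply the first-order Sampson correction of eq. (12) to a measured
    point pair; c and c_prime are (x, y) pixel coordinates."""
    ch = np.array([c[0], c[1], 1.0])
    cph = np.array([c_prime[0], c_prime[1], 1.0])
    Fc = F @ ch                    # epipolar line in the second image
    Ftc = F.T @ cph                # epipolar line in the first image
    denom = Fc[0]**2 + Fc[1]**2 + Ftc[0]**2 + Ftc[1]**2
    lam = (cph @ F @ ch) / denom
    correction = lam * np.array([Ftc[0], Ftc[1], Fc[0], Fc[1]])
    x, y, xp, yp = np.array([c[0], c[1], c_prime[0], c_prime[1]]) - correction
    return (x, y), (xp, yp)
```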
5 Practical Examples of Triangulation

5.1 Triangulation with structured light

In all triangulation methods with structured light, one of the two cameras is replaced by a light source; these techniques are therefore often referred to as active triangulation. In the following sections the most basic principles of triangulation via structured light are presented. There are, of course, many variations and improvements, but the basic idea always remains the same.

5.1.1 Light spot technique

The light spot technique is based on a simple construction with a laser ray, an objective lens and a detector, which can be either a charge-coupled device (CCD) or a position-sensing detector (PSD). As shown in Figure 2, the laser ray points at the object's surface, and the lens projects this spot onto the light-sensitive area of the PSD, where it produces differences in the electric current. On the basis of this difference we can measure the exact position of the spot on the sensor, and respectively calculate the position of its image point on the object. Scanning of a surface proceeds via sample points.

Figure 2: How depth information is gained via the light spot technique.

This technique has many advantages: the result is fast, accurate and additionally independent of the surface color. But there is one constraint for this method: the surface must not be an ideal mirror, because part of the light has to be reflected in the direction of the objective. There are also some problems to be solved. For example, if part of the surface is hidden by another structure of the same surface, it is impossible for the laser ray to reach the hidden part.
5.1.2 Stripe projection

The main idea of stripe projection is to exploit how the object's surface modulates an input signal, in this particular case a single laser line. Where the line intersects an object, the image taken by the camera shows displacements in the light stripe proportional to the distance of the object. For this purpose we need to know beforehand where the line would be projected if no object were placed in front of the camera. Having this information and knowing how the measured object displaces the light line, we can easily estimate the position of almost all 3D points lying on the object. Figure 3 illustrates the geometry of this approach.

Figure 3: (a) The camera registers the point displacement, from which the point's position in 3D space can be calculated. (b) Demonstration of the method's implementation.

We know exactly where our point should be projected on the reference surface, and the distance r to the reference surface is also known in advance. This means that if we manage to find h, the distance to the object is simply the difference between r and h. Finding h is a simple task given the displacement d and the angle θ at which the laser ray is inclined:

    h = d / tan θ    (14)
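A small numeric illustration of equation (14), with made-up values for the displacement, the laser angle and the reference distance:

```python
import math

def height_from_displacement(d, theta_deg):
    """Stripe projection, eq. (14): height h from the observed stripe
    displacement d and the laser inclination angle theta."""
    return d / math.tan(math.radians(theta_deg))

# Illustrative numbers: 5 mm displacement, laser inclined at 30 degrees,
# reference surface at 100 mm.
r = 100.0
h = height_from_displacement(5.0, 30.0)     # ~ 8.66 mm
print("distance to object:", r - h, "mm")
```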
5.1.3 Projection of a static line pattern

One obvious disadvantage of the stripe projection method is that all objects are scanned line by line, which means slow performance, requiring one image per single line. To make the approach faster, we can project several lines simultaneously; the end result is a static line pattern deformed by the object's surface. Although this improvement shows better results, it has an additional disadvantage: on surfaces with rapidly changing shape, the course of every single line is very difficult to follow. Therefore the projection of a static line pattern has to be further extended, in order to encode every single point of the surface uniquely. Many methods have been developed to accomplish this task; some of them are discussed in the following section.

5.1.4 Projection of encoded patterns

In order to encode each surface point uniquely, the static stripe pattern has to be extended. This can happen either by adding more colors to the projected pattern, or by taking several pictures of the scene illuminated by slightly changing patterns. Figure 4 illustrates one example of such a projected pattern: stripe patterns with different wavelengths are projected successively, building a unique code for every point on the surface. The same procedure can also be repeated with horizontal lines. This approach, called binary coding via a sequence of fringe patterns, is very successful, but it fails when the stripe pattern needs to be very fine. If high-resolution position information is required, a better approach is the projection of phase-shifted patterns with the same wavelength. This, however, is suitable only for really fine structures; therefore hybrid methods are most often used, a mixture of both approaches that brings precisely encoded coarse as well as fine object structures.

Figure 4: Active triangulation by projecting stripe patterns on the object's surface. The different wavelengths of the patterns encode every point uniquely. The images are taken from [Jäh05].

The second possibility of encoding a point uniquely is via a colored pattern. Certain variations of the pattern are possible; for example, the stripes can be differentiated not only by color but also by width and by the pattern itself. Because the projected pattern is known in advance, the result and the expected pattern are finally compared; in this way, any occlusions that are present can be detected very easily.
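As a small illustration of the binary stripe coding described above (not taken from the paper), the following sketch generates a sequence of stripe images whose per-pixel bit sequence uniquely encodes each projector column; a Gray code is used here, a common refinement in which neighboring stripes differ in only one bit:

```python
import numpy as np

def gray_code_patterns(width, height, n_bits):
    """Return n_bits binary stripe images; stacking the per-pixel bits
    yields a unique Gray-code word for every projector column."""
    cols = np.arange(width)
    gray = cols ^ (cols >> 1)               # binary-reflected Gray code
    patterns = []
    for bit in range(n_bits - 1, -1, -1):   # coarsest stripes first
        row = ((gray >> bit) & 1).astype(np.uint8) * 255
        patterns.append(np.tile(row, (height, 1)))
    return patterns

# Ten patterns suffice to encode 2**10 = 1024 projector columns uniquely.
patterns = gray_code_patterns(1024, 768, 10)
# In practice each pattern is projected, captured and thresholded; the
# recovered bit sequence identifies the projector column at every pixel.
```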
5.1.5 Light spot stereo analysis

The light spot stereo analysis is a method inspired by human binocular vision. Two cameras take pictures of the scene; a laser ray is swept over the object's surface and registered by the camera pair. The disparity between the laser spots in the two images serves as the basis for computing the distance to that point. This approach is a mixture of the active triangulation method and triangulation from stereo vision.

5.2 Triangulation from stereo vision

Similar to the human vision system, triangulation from stereo vision works with two cameras. The distance between the plane on which both cameras are positioned and a 3D point in space can then be measured easily. As shown in Figure 5, the distance vector between the cameras, called the stereoscopic basis, is denoted by b. Assuming that the distance X3 is at least two times greater than the focal length d', we can express the relationship between these three quantities as:

    p = b · d' / X3    (15)

The newly introduced quantity p is referred to as parallax or disparity; its geometrical meaning is the offset between the two projections of the world point on the image planes defined by the parallel optical axes of the camera pair. For more details and the derivation of equation (15), please refer to [Jäh05].

Figure 5: The view angles of both cameras, visualizing geometrically how the disparity depends on the focal length d' of the cameras, the stereoscopic basis b and the distance to the object X3.

There are some interesting consequences of equation (15). Firstly, one can conclude that the disparity p is proportional to the stereoscopic basis b. Secondly, the disparity is inversely proportional to the distance X3 to the measured object. Summarizing this observation: a greater distance to the measured object means a loss of accuracy in estimating the depth information, while a bigger stereoscopic basis yields, on the contrary, higher precision.
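Solving equation (15) for the depth gives X3 = b · d' / p; a minimal sketch with illustrative camera values:

```python
def depth_from_disparity(b, d_prime, p):
    """Rearranged eq. (15): distance X3 = b * d' / p from the
    stereoscopic basis b, focal length d' and disparity p."""
    return b * d_prime / p

# Illustrative values: 0.12 m basis, 8 mm focal length, 0.4 mm disparity.
X3 = depth_from_disparity(0.12, 0.008, 0.0004)   # 2.4 m
print(f"estimated depth: {X3:.2f} m")
```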
6 Conclusion

One very interesting issue is the use of triangulation methods in medicine. Therefore, instead of summarizing everything written so far, we would like to conclude by mentioning some real examples taken from the medical area. They perfectly illustrate the importance of such methods in science as well as in our everyday life.

Triangulation principles are used most often in optical tracking systems. The most popular and widely used system is Polaris®, produced by Northern Digital Inc. (also known as NDI). The members of the Polaris® family (presented in Figure 6) offer passive, active and hybrid tracking; the points needed for the triangulation itself are implemented as markers fixed on the surgical instruments.

Figure 6: The picture taken from the NDI webpage [pol] shows two members of the Polaris family.

Another example of optical tracking systems, used rather for research purposes, are the ART® systems produced by Advanced Realtime Tracking GmbH (ART GmbH). The example system smARTtrack® presented in Figure 7 consists of two cameras fixed on a rigid bar, so that no calibration is needed. Different configurations are possible, depending on parameters like focal length, baseline length and the angle between the cameras. The ART® trademark is very popular because it allows the building of systems with multiple cameras, for example with three, four or five cameras.

Figure 7: The picture from the ART webpage [art] illustrates the smARTtrack® stereo vision system.

3D vision systems can also be implemented in endoscopic instruments. For this purpose two very small cameras are embedded in a tube with a relatively small stereoscopic basis, which is no problem, because the distances measured in endoscopy are also very limited.
Figure 8(a) shows what such a device looks like. The whole surgical system presented in Figure 8 is called Da Vinci® and consists of a high-resolution 3D endoscope coupled with two 3-chip cameras and a console that helps by visualizing the camera records and by repositioning the surgical camera inside the patient. For more technical details about this system, please refer to the producer's webpage [daV].

Figure 8: (a) Da Vinci® 3D endoscope with two cameras. (b) Da Vinci® console helping the surgeon position the instruments in the patient's body and visualizing the camera records.
References

[art] ART Systems homepage. http://www.ar-tracking.de/smARTtrack.49.0.html

[BB82] Ballard, Dana H.; Brown, Christopher M.: Computer Vision. 2nd edition. Prentice Hall, 1982

[daV] Da Vinci Surgical System homepage. http://www.intuitivesurgical.com/products/davinci_surgicalsystem/3d.aspx

[FP03] Forsyth, David A.; Ponce, Jean: Computer Vision: A Modern Approach. Prentice Hall, 2003

[HH81] Longuet-Higgins, H.C.: A Computer Algorithm for Reconstructing a Scene from Two Projections. Nature, 1981

[HZ03] Hartley, Richard; Zisserman, Andrew: Multiple View Geometry in Computer Vision. 2nd edition. Cambridge University Press, 2003

[Jäh05] Jähne, Bernd: Digitale Bildverarbeitung. 6th edition. Springer Verlag, 2005

[KZB01] Kadir, Timor; Zisserman, Andrew; Brady, Michael: An affine invariant salient region detector. Department of Engineering Science, University of Oxford, 2001

[MCUP02] Matas, J.; Chum, O.; Urban, M.; Pajdla, T.: Robust Wide Baseline Stereo from Maximally Stable Extremal Regions. Center for Machine Perception, Dept. of Cybernetics, CTU Prague, 2002

[MS04] Mikolajczyk, Krystian; Schmid, Cordelia: Scale and Affine Invariant Interest Point Detectors. International Journal of Computer Vision 60(1), January 2004, pp. 63-86

[PCF06] Paragios, Nikos; Chen, Yunmei; Faugeras, Olivier: Handbook of Mathematical Models in Computer Vision. Springer Verlag, 2006

[pol] NDI homepage. http://www.ndigital.com/medical/polarisfamily.php

[Wol] Wolfram Research: Wolfram MathWorld. http://mathworld.wolfram.com/AntisymmetricMatrix.html