Depth & Space
Becca Kennedy
Perception in Real & Virtual Environments
9/19/12

Overview
• Edges, lines, and texture elements must be interpreted in
terms of 3D structure to understand the world
• Observer must determine:
▫ Depth – distance of the surface from the observer
▫ Surface orientation – slant and tilt
• Depth and surface orientation are recovered together
▫ 3D orientation determines distances of object parts
from the observer, and distance of parts determines
3D orientation

Overview
• Slant – size of the angle between the observer’s line of sight
and the surface normal
▫ Surface normal – virtual line sticking out of the surface
perpendicularly at that point
• Tilt – the direction of the depth gradient relative to the
frontal plane
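
As a rough illustration of the two definitions above, slant and tilt can be read off a surface normal. This is a minimal sketch, not taken from the lecture: the function name `slant_and_tilt` and the axis convention (x right, y up, z pointing from the surface back toward the observer) are assumptions made for the example.

```python
import math

def slant_and_tilt(normal):
    """Return (slant, tilt) in degrees for a surface normal (nx, ny, nz).

    Assumed viewer-centered axes: x to the right, y up, and z pointing from
    the surface back toward the observer along the line of sight.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0:
        raise ValueError("normal must be non-zero")
    # Slant: angle between the line of sight (z axis) and the surface normal.
    cos_slant = max(-1.0, min(1.0, nz / length))
    slant = math.degrees(math.acos(cos_slant))
    # Tilt: image-plane direction in which depth increases, which is the
    # direction of the normal's projection onto the frontal (x, y) plane.
    tilt = math.degrees(math.atan2(ny, nx))
    return slant, tilt

print(slant_and_tilt((0.0, 0.0, 1.0)))  # frontoparallel surface
print(slant_and_tilt((1.0, 0.0, 1.0)))  # surface rotated about a vertical axis
```

Under this convention a frontoparallel surface has zero slant, and tilt only becomes meaningful once the slant is non-zero.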

The Problem of Depth Perception
• Depth perception from a 2-D retinal image is ambiguous

The Problem of Depth Perception –
Heuristic Assumptions
• Visual system implicitly makes heuristic assumptions
about the nature of the world
• Our visual system is fooled by 3-D movies
▫ Visual system implicitly assumes that both eyes are
looking at the same scene
▫ The different image presented to each eye is
interpreted as depth
▫ But usually this heuristic is correct

The Problem of Depth Perception –
Marr’s 2.5-D Sketch
• There are many independent processing modules
computing depth information from separate sources
▫ Each module processes different kinds of information
• The final common depth interpretation is expressed as a
2.5-D sketch

Sources of Depth Information
• Ocular information vs. optical information
▫ Ocular information arises from factors that depend on the state of
the eyes themselves
▫ Optical information arises from the structure of the light entering
the eyes
• Binocular information vs. monocular information
• Static information vs. dynamic information

Ocular Information
• Accommodation – the process through which the ciliary
muscles in the eye control the optical focus of the lens by
temporarily changing its shape
▫ Monocular cue
▫ Thick lens for close objects, thin lens for far objects
▫ Weak source of depth information, but used at close distances
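
One way to see why accommodation is mainly useful at close range is the thin-lens equation. The sketch below is an illustration under stated assumptions, not part of the lecture: it treats the eye as a single thin lens, and the 0.017 m lens-to-retina distance and the name `required_lens_power` are hypothetical choices.

```python
import math

def required_lens_power(object_distance_m, image_distance_m=0.017):
    """Thin-lens power (diopters) needed to focus an object on the retina.

    image_distance_m = 0.017 is an assumed lens-to-retina distance for a
    simplified eye, not a measured value.
    """
    return 1.0 / object_distance_m + 1.0 / image_distance_m

far_power = required_lens_power(math.inf)  # relaxed lens, very distant object
for d in (0.25, 0.5, 1.0, 2.0, 6.0):
    extra = required_lens_power(d) - far_power
    print(f"{d:5.2f} m: {extra:.2f} D above the relaxed (far) setting")
```

In this model the extra power needed is simply 1/distance, so it changes a lot between 0.25 m and 1 m and hardly at all beyond a few meters.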

Ocular Information
• Convergence – the extent to which the two eyes are
turned inward to fixate an object
▫ Binocular cue
▫ Fixating on a close object results in a large convergence angle
▫ Fixating on a far object results in a small convergence angle
▫ Visual system uses the angle of eye convergence to determine
distance to the fixated point
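
The relation between convergence angle and distance can be sketched with simple trigonometry. This is an illustrative example, not the lecture's own: it assumes the fixated point lies straight ahead on the midline, and the 0.063 m interocular distance and both function names are assumptions.

```python
import math

def convergence_angle(distance_m, interocular_m=0.063):
    """Angle (radians) between the two lines of sight for a midline point."""
    return 2.0 * math.atan((interocular_m / 2.0) / distance_m)

def distance_from_convergence(angle_rad, interocular_m=0.063):
    """Invert the geometry: fixation distance implied by a convergence angle."""
    return (interocular_m / 2.0) / math.tan(angle_rad / 2.0)

for d in (0.3, 1.0, 3.0, 10.0):
    angle = convergence_angle(d)
    print(f"{d:5.1f} m -> {math.degrees(angle):.2f} deg")
```

The angle shrinks quickly with distance, so equal changes in distance produce ever smaller changes in angle for far fixation points.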

Stereoscopic Information
• Stereopsis is the process of perceiving the relative
distance to objects based on their lateral displacement in
the two retinal images
▫ This relative lateral displacement is binocular disparity
 Direction of binocular disparity provides info about which points are closer
and which are farther than the fixated point
 Magnitude of binocular disparity provides information about how much
closer or farther they are
• Specifies ratios of distances to objects rather than simply
which is farther and which is closer

Corresponding Retinal Positions
• Corresponding positions on the two retinae are positions
that would coincide if the two foveae were superimposed
by simple lateral displacement
▫ Binocular disparity occurs when a given point in the external world
doesn’t project to corresponding positions
 Crossed disparity indicates that a point is closer than the fixated point
 Uncrossed disparity indicates that a point is farther away than the fixated point
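
The direction and magnitude of disparity can be illustrated for points on the midline. This is a hedged sketch: it takes disparity as the difference between the convergence angles of the point and of the fixated point, and the sign convention (positive = crossed), the 0.063 m interocular distance, and the function names are all assumptions made for the example.

```python
import math

def vergence_angle(distance_m, interocular_m=0.063):
    """Angle (radians) subtended at a midline point by the two eyes."""
    return 2.0 * math.atan((interocular_m / 2.0) / distance_m)

def binocular_disparity(point_m, fixation_m, interocular_m=0.063):
    """Disparity (radians) of a midline point relative to the fixated point.

    Assumed sign convention: positive means crossed, negative means uncrossed.
    """
    return (vergence_angle(point_m, interocular_m)
            - vergence_angle(fixation_m, interocular_m))

def classify(disparity_rad, tolerance=1e-12):
    if abs(disparity_rad) <= tolerance:
        return "zero"
    return "crossed" if disparity_rad > 0 else "uncrossed"

fixation = 1.0
for point in (0.5, 0.9, 1.0, 1.1, 2.0):
    d = binocular_disparity(point, fixation)
    print(f"point at {point} m: {math.degrees(d):+.3f} deg, {classify(d)}")
```

The sign tells which side of fixation the point lies on, and the magnitude grows as the point moves farther from the fixation distance.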

Corresponding Retinal Positions
• The horopter is the set of environmental points that
stimulate corresponding points on the two retinae
▫ Theoretical horopter – defined as the locus of points that make
the same angle at the eyes
▫ Empirical horopter – defined by singleness of vision; larger than
the theoretical horopter
• Panum’s fusional area is the area around the horopter
within which disparate images are perceptually fused, so
we don’t see double images
▫ Disparities within Panum’s area are fused and experienced as depth;
points that lie outside it produce double images

The Correspondence Problem
• How does the visual system determine which features in
one retinal image correspond to which features in the
other?
• For many years, theorists assumed that this problem was
solved by a shape analysis for each left and right image
that occurred before stereopsis

The Correspondence Problem
• The alternative possibility is that stereopsis occurs first
▫ Random dot stereograms
 When each image is viewed alone, the dots look random
 Shape-first theory predicts that stereoscopic depth perception from
random-dot images would be impossible
 Random dot stereograms show that stereoscopic depth can be
perceived without monocular shape information
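
A random dot stereogram of the kind described above can be built in a few lines. This is a minimal sketch under assumptions: the function name, image size, square size, and shift are hypothetical, and shifting the central square leftward in the right image is just one common construction.

```python
import random

def make_random_dot_stereogram(size=40, square=16, shift=3, seed=0):
    """Return (left, right) binary dot images as lists of rows.

    The right image is a copy of the left one, except that a central square
    is shifted `shift` dots to the left and the strip it uncovers is refilled
    with fresh random dots.
    """
    start = (size - square) // 2
    end = start + square
    if not 0 < shift <= square or start - shift < 0:
        raise ValueError("shift does not fit inside the image")
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    for y in range(start, end):
        for x in range(start, end):
            right[y][x - shift] = left[y][x]   # displaced copy of the square
        for x in range(end - shift, end):
            right[y][x] = rng.randint(0, 1)    # uncovered strip: new dots
    return left, right

left, right = make_random_dot_stereogram()
row = 20
print("".join("#" if v else "." for v in left[row]))
print("".join("#" if v else "." for v in right[row]))
```

Neither image contains a visible square on its own; the square exists only as a region of dots that is displaced between the two images.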

Computational Theories
• Most dots in the left image have a corresponding dot in the
right image
▫ The visual system needs to figure out which pairs of dots go together
• The first Marr-Poggio algorithm (1976, 1977)
▫ Individual pixels in the left and right images are matched according to
location and color
 Among these matches, there are the correct ones that correspond to the visible
portions of the actual surfaces in the real world
▫ Two heuristic constraints help provide the correct solution
 Surface opacity – only the nearest surface can be seen
 Surface continuity – correct solution will tend to be one in which matches are close
together in depth

Edge-Based Algorithms
• Marr and Poggio suggested a second algorithm in 1979
• Differed from the first in the following ways:
▫ Edge-based matching – matches edges in the left and right
images rather than pixels
▫ Multiple scales – visual system first looks for corresponding
edges at a large spatial scale, followed by more detailed matching
at finer-grained levels
▫ Single-pass operation – noniterative; finds best edge-based
correspondence in a single pass through a multistage operation

Multi-Orientation, Multi-Scale (MOMS) filters
• Jones and Malik (1990)
▫ A process of matching the vector representing a given position in
one eye to each of the vectors representing laterally displaced
positions in the other eye
 Specifies the most likely correspondence
▫ Better and more robust matches because MOMS vectors carry a
lot of spatial information
 Compared to outputs of single receptors or edge detectors

Physiology of Stereoscopic Vision
• Binocular depth cells
▫ Hubel and Wiesel (1962)
 Discovered cells in V1 of the visual cortex that were sensitive to binocular stimulation
▫ Barlow, Blakemore, and Pettigrew (1967)
 Reported that some binocular cells in area V1 responded optimally to stimulation in
disparate locations of the two retinae
• To show that these cells are involved in depth perception, need to
also demonstrate a connection between disparity and behavior
▫ Blake and Hirsch (1975)
 Reared cats so that their vision was alternated between left and right eyes for 6 months
 These cats had few binocular neurons and they were not able to use binocular disparity to
perceive depth
▫ Recent brain imaging experiments have shown that many different areas
are activated by stimuli that create binocular disparity
 Depth perception involves many stages of processing, beginning in primary visual cortex
and extending to other areas

Dynamic Information
• Motion parallax
▫ The differential motion of pairs of points due to their different depths
relative to the fixation point
 Nearby objects move quickly; far-off objects appear nearly stationary
• Optic flow caused by a moving observer
▫ Relative to the fixation point…
 Points closer than the fixation point flow in the direction opposite the observer’s motion
 Points farther than the fixation point flow in the same direction as the observer’s motion
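
The flow directions listed above follow from a small-angle approximation. The sketch below is an illustration, not the lecture's derivation: it assumes an observer translating sideways while tracking the fixation point, considers only points near the line of sight, and uses a hypothetical function name and sign convention (positive = same direction as the observer's motion).

```python
import math

def retinal_angular_velocity(point_m, fixation_m, observer_speed_mps):
    """Approximate angular velocity (rad/s) of a point on the retina.

    Positive values mean flow in the same direction as the observer's
    motion; negative values mean flow in the opposite direction.
    """
    return observer_speed_mps * (1.0 / fixation_m - 1.0 / point_m)

speed = 1.0      # m/s, sideways
fixation = 5.0   # m
for point in (1.0, 2.5, 5.0, 10.0, 100.0):
    w = retinal_angular_velocity(point, fixation, speed)
    print(f"point at {point:6.1f} m: {w:+.3f} rad/s")

# Fixating at a very large distance: near points sweep by quickly,
# while distant points barely move.
print(retinal_angular_velocity(1.0, math.inf, speed))
print(retinal_angular_velocity(1000.0, math.inf, speed))
```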

Dynamic Information
• Another pattern of optic flow is optic expansion or looming
▫ Fixated point is stationary on the retina
▫ Other points flow outward, faster with more distance from fixation
point

Dynamic Information
• Optic flow caused by moving objects
▫ Kinetic depth effect (KDE; Wallach & O’Connell, 1953) – ability to
perceive depth from object motion
▫ Visual system uses a rigidity heuristic
 Biased toward perceiving rigid motions rather than plastic motions
• Accretion/Deletion of Texture
▫ Appearance and disappearance of texture behind a moving edge

Pictorial Information
• Convergence of parallel lines
• Position relative to the horizon

• Relative size
• Familiar size
▫ In a VE, if not enough depth cues are present, the observer begins to
depend on retinal size (Kenyon, Sandin, Smith, Pawlicki, & Defanti, 2007)
• Texture gradients

Pictorial Information – Edge Interpretation
• Edge and contour interpretations
▫ E.g., occlusion or interposition – blocking of light from a farther object by a
nearer opaque object
▫ Edges provide relative rather than absolute depth information
▫ Available from virtually unlimited distances within visible range
• Vertex (edge intersection) classification
▫ Guzman’s (1968, 1969) program SEE attempted to interpret line
drawings of simple configurations of blocks
 He developed a classification scheme for edge intersections (vertices): Ts, Ys, Ks, Xs,
Ls, etc.
▫ Huffman and Clowes (1971) developed a complete catalog of the vertex
types that arise in viewing simple trihedral angles from all possible
viewpoints

• Four types of edges:
1. Orientation edges – places where there are discontinuities in
surface orientation; when two different orientations meet along
an edge
2. Depth edges – places where there is a spatial discontinuity in
depth between surfaces; places where one surface occludes
another that extends behind it, with space between
3. Illumination edges – places where there is a difference in the
amount of light falling on a homogeneous surface; edge of a
shadow, highlight, or spotlight
4. Reflectance edges – places where there is a change in the
light-reflecting properties of the surface material; e.g., designs
painted on a surface

• Edge labels
▫ Two kinds of orientation edges
 Convex orientation edges are labeled with a +
 Concave orientation edges are labeled with a -
▫ Depth edges are labeled with arrows; the closer (occluding) surface is on the
right when facing in the arrow’s direction

• Physical constraints
▫ Not all logically possible labelings are physically possible

• Extensions and Generalizations
▫ Waltz (1975) extended the Huffman-Clowes analysis to include 11
types of edges, including shadows and “cracks” (orientation edges
at 180 degree angles)
 Adding shadows makes interpretation more accurate because they
provide further constraints
▫ Malik (1987) extended analysis of edge labeling to curved
objects
 New depth edge type, the extremal edge or limb (double arrow), occurs
when a surface curves smoothly around to partly occlude itself

• Extensions and Generalizations
▫ Barrow and Tenenbaum’s (1978) analysis contained additional
constraints:
 The smoothness assumption – if an occluding edge in the image is
smooth, then so is the contour of the surface that produced it
 The general viewpoint assumption – small changes in viewpoint will
not cause qualitative differences in the image

• Shading information
▫ Shading – variations in the amount of light reflected from the surface as
a result of variations in the orientation of the surface relative to a light
source
▫ Horn’s (1975, 1977) Computational Analysis
 Showed that percentage changes in image luminance are directly proportional to
percentage changes in the orientation of the surface
▫ Humans are able to interpret surfaces with significantly specular
characteristics, like glossy surfaces that reflect light more coherently
than matte surfaces do
 How?
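
The definition of shading above can be illustrated with a matte (Lambertian) surface, where reflected light depends only on the angle between the surface normal and the light direction. This is a simplified sketch, not Horn's analysis: the Lambertian model, the function name, and the parameters `albedo` and `light_intensity` are assumptions, and the model does not cover the glossy surfaces mentioned above.

```python
import math

def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("vector must be non-zero")
    return tuple(c / length for c in v)

def lambertian_luminance(normal, light_dir, albedo=1.0, light_intensity=1.0):
    """Luminance of a matte patch: albedo * intensity * cos(incidence angle).

    `light_dir` points from the surface toward the light source. Patches
    facing away from the light receive no direct illumination.
    """
    n = _normalize(normal)
    l = _normalize(light_dir)
    cos_incidence = sum(a * b for a, b in zip(n, l))
    return albedo * light_intensity * max(0.0, cos_incidence)

light = (0.0, 0.0, 1.0)
for degrees in (0, 30, 60, 90):
    a = math.radians(degrees)
    normal = (math.sin(a), 0.0, math.cos(a))  # surface turned away by `degrees`
    print(degrees, round(lambertian_luminance(normal, light), 3))
```

Luminance falls off smoothly as the surface turns away from the light, which is the variation a shape-from-shading analysis tries to invert.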

• Shading information
▫ Cast shadows
 Shadows of one object that fall on the surface of another object provide more
depth information
 Distance between object and its shadow cast on the surface gives height of its
bottom above the surface

• Aerial perspective
▫ Refers to certain systematic differences in the contrast and color of
objects that occur when they are viewed from great distances
 Contrast is reduced by additional atmosphere through which they are viewed, which
contains particles of dust, water, or pollutants that scatter light
 Mountains that are far away appear bluer because the atmosphere scatters shorter
wavelengths of light more than longer wavelengths
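
The loss of contrast with distance is often approximated as an exponential decay. The sketch below assumes that form (commonly attributed to Koschmieder); the extinction coefficient and the function name are hypothetical values chosen for illustration, not figures from the lecture.

```python
import math

def apparent_contrast(inherent_contrast, distance_m, extinction_per_m=1e-4):
    """Contrast remaining after viewing through `distance_m` of atmosphere.

    extinction_per_m is an assumed scattering coefficient; hazier air
    corresponds to a larger value.
    """
    return inherent_contrast * math.exp(-extinction_per_m * distance_m)

for d in (0, 1_000, 5_000, 20_000):
    print(f"{d:6d} m: contrast {apparent_contrast(0.8, d):.3f}")
```

Because the decay depends on distance, a lower-contrast object tends to be seen as farther away, other things being equal.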

Integrating Information Sources
• Depth cues are often highly correlated, making them easy to integrate
• What happens when cues are in conflict with one another? 3 possibilities:
1. One source dominates a conflicting source
 E.g., in the Ames room, perspective information dominates familiar size
2. A compromise is achieved between two conflicting sources
 Visual system makes independent estimates of depth from each source alone, then
integrates them according to a mathematical rule
 Bruno and Cutting (1988) found that information integration was
additive: the independent effects of the sources are summed
3. The two sources interact to arrive at an optimal solution
 E.g., convergence specifies absolute depth, binocular disparity specifies ratios of
distances; together they can provide a complete depth map
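
The first two possibilities above can be sketched as a weighted combination of independent depth estimates. This is an illustrative sketch only: the normalized weighted sum, the function name, and the cue names and values are assumptions, not the rule reported by Bruno and Cutting.

```python
def combine_depth_cues(estimates, weights=None):
    """Combine per-cue depth estimates (meters) into a single estimate.

    `estimates` maps cue name -> depth estimate. `weights` maps cue name ->
    non-negative weight; equal weights are assumed when omitted.
    """
    if weights is None:
        weights = {cue: 1.0 for cue in estimates}
    total = sum(weights[cue] for cue in estimates)
    if total <= 0:
        raise ValueError("at least one cue needs a positive weight")
    return sum(weights[cue] * estimates[cue] for cue in estimates) / total

cues = {"perspective": 2.0, "familiar_size": 4.0}
print(combine_depth_cues(cues))  # compromise between the two cues
print(combine_depth_cues(cues, {"perspective": 1.0, "familiar_size": 0.0}))  # dominance
```

Dominance falls out as the special case where one cue carries all the weight, while a compromise corresponds to intermediate weights.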

Depth Perception and VEs
• Our visual system is really good at depth perception in real
environments, but this is hard to replicate in virtual scenes
▫ Ocular depth information (accommodation, convergence) is less useful
▫ Stereoscopic depth information may not be available
▫ Motion cues may not be faithfully represented
▫ Depth cues may be conflicting
▫ Etc.!
• But augmented reality can also improve real-world depth
perception
