



Visual Parsing After Recovery from Blindness
         Ostrovsky et al. Psychological Science 2009.


                               Daniel O’Shea
                           djoshea@stanford.edu


                            Brain CS Lunch Talk

                                 11 March 2010








Overview
   Project Prakash
   Learning to Recognize Objects
   Subjects

Tests of Visual Parsing
   Static Visual Parsing
   Dynamic Visual Parsing
   Recovery of Static Object Recognition

Conclusion
   Motion Information as a Scaffold
   Experimental and Theoretical Evidence for Temporal Learning
   Future Directions




Project Prakash




      “The goal of Project Prakash is to bring light into the
      lives of curably blind children and, in so doing, illuminate
      some of the most fundamental scientific questions about
      how the brain develops and learns to see.”
   http://web.mit.edu/bcs/sinha/prakash.html




Project Prakash

    Launched in 2003 by Pawan Sinha at MIT BCS
    Based in New Delhi; India is home to 30% of the world's blind population
    Humanitarian aim: screen children for treatable blindness and provide modern treatment





Scientific Opportunity

    How does the visual system learn to perceive objects?
    Track object learning in longitudinal studies
    Complementary to infant studies, which have more plastic visual processing but are limited in experimental complexity
    Separation of visual learning from sensory development



Learning to Recognize

      The visual system must learn to "parse" the visual world into meaningful entities
      Segmentation relies on heuristics: alignment of contours, similarity of texture representations

      Fig. 1. Example illustrating how a natural image (a) is typically a collection of many regions of different hues and luminances (b). The human visual system has to accomplish the task of integrating subsets of these regions into coherent objects, illustrated in (c).



Central questions

      Is there a critical period for learning visual parsing?
      There is evidence that the mature visual system uses these heuristics for object recognition, but are they useful in organizing visual information during development?








Recovery Subjects

    S.K.: 29 y.o., congenital bilateral aphakia (absence of the lens); corrected with eyeglasses!
    P.B.: 7 y.o., treated with bilateral cataract surgery and intraocular lens implantation
    J.A.: 13 y.o., treated with bilateral cataract surgery and intraocular lens implantation

Control Subjects

    Visual input scaled to simulate the information loss due to imperfect acuity in the treated group (20/80 to 20/120); an illustrative degradation sketch follows


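The slides do not say how the control stimuli were degraded; purely as a hedged illustration, one way to "scale visual input" to approximate 20/80 to 20/120 acuity is to resample an image at a fraction of its resolution. The function name, the Snellen-to-resolution mapping, and the resampling choice below are my assumptions, not the authors' procedure.

```python
# Hedged sketch (not the authors' method): approximate reduced acuity by resampling
# an image at a fraction of its original resolution, discarding fine spatial detail.
import numpy as np
from scipy.ndimage import zoom

def simulate_low_acuity(image, snellen_denom=100, baseline=20):
    """Assumed mapping: usable spatial detail scales as baseline / snellen_denom,
    e.g. 0.2 of full resolution for 20/100 vision."""
    factor = baseline / snellen_denom
    coarse = zoom(image, factor, order=1)        # downsample: fine detail is lost
    return zoom(coarse, 1.0 / factor, order=1)   # resize back to (roughly) the original size

degraded = simulate_low_acuity(np.random.rand(240, 240), snellen_denom=100)
print(degraded.shape)
```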
Tests of Static Visual Parsing

      A: Non-overlapping shapes: no difficulty
      B: Overlapping shapes as line drawings: overfragmentation; all closed loops perceived as distinct
      C: Overlapping shapes as filled translucent surfaces: overfragmentation; all regions of constant luminance perceived as distinct
      D: Opaque overlapping shapes: could correctly indicate the number of objects
      E: Opaque overlapping shapes: could not indicate depth ordering (chance performance)
      F: Field of oriented line segments: difficulty finding the path of coherent segments
      G: 3D shapes with luminance variation from lighting/shadows: percept of multiple objects, one for each facet
      Conclusion: the tendency to overfragment is driven by low-level cues (regions of uniform hue and luminance), together with an inability to use cues of contour continuation, junction structure, and figural symmetry

      Fig. 2. Subjects' parsing of static images. Seven tasks (a) were used to assess the recently treated subjects' ability to perform simple image segmentation and shape analysis. The graph (b) shows the performance of these subjects relative to the control subjects on these tasks. "NA" indicates that data are not available for a subject. S.K.'s tracing of a pattern drawn by one of the authors (c) illustrates the fragmented percepts of the recently treated subjects. In the upper row of (d), the outlines indicate the regions of real-world images that S.K. saw as distinct objects. He was unable to recognize any of these images. For comparison, the lower row of (d) shows the segmentation of the same images according to a simple algorithm that agglomerated spatially adjacent regions that satisfied a threshold criterion of similarity in their hue and luminance attributes.
Object Tracing

      Overfragmented percept is evident in object tracing (Fig. 2c)
      S.K.'s segmentation of real-world images (Fig. 2d) is consistent with an algorithm based on a hue/luminance similarity metric; a minimal sketch of that style of algorithm follows
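The Fig. 2 caption describes the comparison algorithm only as one that "agglomerated spatially adjacent regions that satisfied a threshold criterion of similarity in their hue and luminance attributes." Below is a minimal sketch of that style of algorithm, not the authors' code: it grows regions over 4-connected pixels whose hue and luminance differ from the pixel they are grown from by less than a threshold. The threshold values and the pixel-level (rather than region-level) merging are my simplifications.

```python
# Minimal sketch of hue/luminance agglomeration (my simplification, not the authors' code).
import numpy as np
from collections import deque

def segment_by_hue_luminance(hue, lum, hue_thresh=0.05, lum_thresh=0.10):
    """hue, lum: 2-D float arrays of equal shape. Returns an integer label map,
    one label per agglomerated region ("perceived object")."""
    h, w = hue.shape
    labels = -np.ones((h, w), dtype=int)
    region = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            labels[sy, sx] = region          # start a new region at this unlabeled pixel
            queue = deque([(sy, sx)])
            while queue:                     # flood fill over similar 4-connected neighbors
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1
                            and abs(hue[ny, nx] - hue[y, x]) < hue_thresh
                            and abs(lum[ny, nx] - lum[y, x]) < lum_thresh):
                        labels[ny, nx] = region
                        queue.append((ny, nx))
            region += 1
    return labels
```

Applied to textured or shaded real-world images, a rule like this fragments each object into many uniform patches, which is the kind of overfragmentation the recently treated subjects reported.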


Dynamic Visual Input

      Natural visual experience is dynamic and includes motion
      Show individual shapes undergoing independent smooth translational motion and study the percept in a dynamic context




Dynamic Visual Input

      Task was the same as in the static parsing tests: indicate the number of objects shown
      Individual shapes underwent independent smooth translational motion; for overlapping figures, the overlap was maintained at all times
      Motion brought a dramatic change: the recently treated subjects responded correctly on a majority of trials, and motion also allowed S.K. to better perceive shapes embedded in noise (Fig. 4, Tests C and D)
      Motion facilitates object binding (linking together the parts) and segregation from the background in the presence of noise; an illustrative stimulus sketch follows

      Fig. 4. Recently treated and control subjects' performance on tasks designed to assess the role of dynamic information in object segregation. "NA" indicates that data are not available for a subject. In the illustrations of the four tasks, the arrows indicate direction of movement.
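For concreteness, here is a hedged sketch of the kind of dynamic stimulus described above: each shape translates smoothly with its own constant velocity. The disc shapes, sizes, velocities, and frame count are placeholders of my own, and the sketch does not enforce the constraint that overlapping figures stay overlapped.

```python
# Illustrative moving-shape stimulus (placeholder parameters, not the actual stimuli).
import numpy as np

def render_moving_discs(n_frames=60, size=128,
                        discs=((20.0, 20.0, 12.0), (60.0, 80.0, 16.0)),
                        velocities=((1.0, 0.5), (-0.7, 0.6))):
    """discs: (cx, cy, radius) tuples; velocities: (dx, dy) in pixels per frame."""
    frames = np.zeros((n_frames, size, size))
    yy, xx = np.mgrid[0:size, 0:size]
    centers = np.array([d[:2] for d in discs], dtype=float)
    radii = np.array([d[2] for d in discs], dtype=float)
    for t in range(n_frames):
        for (cx, cy), r in zip(centers, radii):
            frames[t][(xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2] = 1.0
        centers += np.asarray(velocities)    # independent smooth translation per shape
    return frames

frames = render_moving_discs()
print(frames.shape)  # (60, 128, 128)
```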


Static Recognition and Object Motility

      More "motile" objects were more likely to be recognized: recognition responses showed significant congruence with independently derived motility ratings (p < .01 in a chi-square test for each of the 3 subjects; an illustrative test setup follows)
      Object motion (alongside low-level cues) binds regions into a cohesive whole; learning of this cohesive structure facilitates recognition in a static context

      Fig. 3. Motility ratings of the 50 images used to test object recognition and the recently treated subjects' ability to recognize these images. Motility ratings were obtained from 5 normally sighted respondents who were naive as to the purpose of the experiment; the height of the black bar below each object indicates that object's average rating on a scale from 1 (very unlikely to be seen in motion) to 5 (very likely to be seen in motion). The circles indicate correct recognition responses.
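The congruence was assessed with a chi-square test (p < .01 for each subject). The snippet below only illustrates how such a test could be set up: the ratings and responses are fabricated placeholders, and the high/low split of the 1-5 motility scale is my assumption, not the authors' analysis.

```python
# Hedged illustration of a recognition-vs-motility chi-square test (placeholder data).
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
motility = rng.integers(1, 6, size=50)           # placeholder 1-5 ratings for 50 images
recognized = rng.random(50) < motility / 6.0     # placeholder: motile objects recognized more often

high = motility >= 3                             # assumed high/low split of the rating scale
table = np.array([[np.sum(recognized & high),  np.sum(recognized & ~high)],
                  [np.sum(~recognized & high), np.sum(~recognized & ~high)]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```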


Follow-up Test of Static Visual Parsing

      S.K. (18 mo), P.B. (10 mo), and J.A. (12 mo after treatment) registered significant improvement in static visual parsing
      A critical period for visual learning should not be strictly applied in predicting therapeutic gains

      Fig. 5. The recently treated subjects' performance on four tasks with static displays soon after treatment and at follow-up testing after the passage of several months (indicated in the key).





Motion Information Available Early

    Motion sensitivity develops early in the primate brain
    Segmentation from motion emerges 2 months prior to segmentation from static cues in infants
    Infants link separated parts of a partially occluded object through common motion




Motion Information as a Scaffold
    Motion allows early grouping of parts
    Correlation between motion-based grouping and static cues (e.g., aligned contours) facilitates grouping via static cues; a toy illustration follows


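Here is a toy numerical illustration of the scaffold idea, entirely my own construction rather than a model from the paper or the talk: motion-defined "same object / different object" groupings act as a teaching signal for calibrating a purely static grouping cue, which can then be used on its own. The cue distributions and noise levels are invented.

```python
# Toy scaffold sketch (my construction): motion-based grouping teaches a static cue.
import numpy as np

rng = np.random.default_rng(1)
n_pairs = 500
same_object = rng.random(n_pairs) < 0.5          # ground truth for pairs of image parts

# Invented static cue (e.g. a contour-alignment score), higher when parts truly belong together.
static_cue = np.where(same_object,
                      rng.normal(0.7, 0.15, n_pairs),
                      rng.normal(0.3, 0.15, n_pairs))

# Motion-based grouping: available early, mostly correct; it serves as the teaching signal.
motion_says_same = same_object ^ (rng.random(n_pairs) < 0.1)

# "Learn" the static cue by picking the threshold that best reproduces the motion grouping.
thresholds = np.linspace(0.0, 1.0, 101)
agreement = [np.mean((static_cue > t) == motion_says_same) for t in thresholds]
best = thresholds[int(np.argmax(agreement))]
print(f"threshold learned from motion: {best:.2f}")
print(f"static-only grouping accuracy: {np.mean((static_cue > best) == same_object):.2f}")
```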
Temporal Contiguity and IT cortex

      Disruption of the temporal contiguity of visual experience alters the object specificity of IT neurons
      Monkeys freely viewed an "altered" visual world in which two objects consistently swapped identity across saccade-induced, temporally contiguous changes in retinal position
      Prediction: if the visual system builds position tolerance from temporal contiguity, swap exposure should incorrectly group the two objects, decreasing object selectivity at the swap position with increasing exposure (in the limit, reversing object preference), with little or no change at the non-swap position
      Recordings from 101 IT neurons during exposure to this altered visual world showed that, at the swap position, IT neurons on average decreased their initial object selectivity (a toy illustration follows)

      Fig. 1 (Li & DiCarlo). Experimental design and predictions: test phases alternated with exposure phases, each consisting of 100 normal exposures (50 P→P, 50 N→N) and 100 swap exposures (50 P→N, 50 N→P); stimulus size was 1.5°.

      Li and DiCarlo. Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 2008.
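To make the predicted effect concrete, here is a toy caricature of temporal-contiguity learning (mine, not Li and DiCarlo's analysis or model): a single "neuron" responds to objects P and N at two peripheral positions, and each saccade pulls the peripheral response toward the foveal response that follows it. At the swap position the following foveal image is always the other object, so selectivity there erodes and eventually reverses, while the non-swap position is unchanged. All numbers are invented.

```python
# Toy temporal-contiguity caricature (not the authors' model): swap exposure reverses
# object selectivity at the swap position only.
import numpy as np

positions = ["swap", "non_swap"]
resp = {pos: {"P": 1.0, "N": 0.0} for pos in positions}   # initial selectivity: P preferred
center = {"P": 1.0, "N": 0.0}                             # foveal responses, assumed fixed
lr = 0.05

for exposure in range(100):
    for pos in positions:
        for obj in ("P", "N"):
            # At the swap position, the saccade always lands on the *other* object's image.
            foveal_obj = {"P": "N", "N": "P"}[obj] if pos == "swap" else obj
            # Associate temporally contiguous retinal images: pull the peripheral
            # response toward the foveal response that follows the saccade.
            resp[pos][obj] += lr * (center[foveal_obj] - resp[pos][obj])

for pos in positions:
    print(f"{pos}: P - N selectivity = {resp[pos]['P'] - resp[pos]['N']:+.2f}")
# swap position ends near -1 (preference reversed); non_swap stays at +1.
```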


Invariant Object Recognition with Slow Feature Analysis

           The slowness principle can be used to learn independent representations of object position, angle, and identity from smoothly moving stimuli (a minimal linear sketch follows)

           Fig. 2 (Franzius et al.). Model architecture and stimuli: an input image is fed into the hierarchical network; the circles in each layer denote the overlapping receptive fields of the SFA nodes and converge towards the top layer, with the same set of steps applied on each layer.

    Franzius, Wilbert, Wiskott. ICANN 2008.
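For readers unfamiliar with SFA, below is a minimal linear sketch of the slowness principle, my illustration rather than the hierarchical nonlinear network of Franzius et al.: whiten the input, then keep the directions along which the whitened signal changes most slowly between consecutive time steps. The toy data mixing a slow latent signal with fast noise are invented.

```python
# Minimal linear Slow Feature Analysis sketch (my illustration, not the paper's model).
import numpy as np

def linear_sfa(x, n_components=1):
    """x: (time, channels). Returns the linear features whose outputs vary most slowly."""
    x = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    whitener = eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T   # whiten the input
    z = x @ whitener
    # Slow directions = eigenvectors of the time-difference covariance with the
    # smallest eigenvalues (np.linalg.eigh sorts eigenvalues in ascending order).
    dcov = np.cov(np.diff(z, axis=0), rowvar=False)
    _, dvec = np.linalg.eigh(dcov)
    return z @ dvec[:, :n_components]

# Toy data: one slowly varying latent signal mixed into 5 channels plus fast white noise.
rng = np.random.default_rng(0)
slow = np.sin(np.linspace(0, np.pi, 4000))
x = np.outer(slow, rng.uniform(0.5, 1.5, 5)) + 0.2 * rng.standard_normal((4000, 5))
recovered = linear_sfa(x, n_components=1)
print(abs(np.corrcoef(recovered[:, 0], slow)[0, 1]))   # near 1: slow latent recovered (up to sign)
```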
Overview    Motion Information as a Scaffold
                     Tests of Visual Parsing   Experimental and Theoretical Evidence for Temporal Learning
                                 Conclusion    Future Directions


Learning Static Segmentation from Motion

          Segmentation According to Natural Examples (SANE)

          Fig. 11 (from the paper): the examples for which color, multiresolution SANE
          most outperformed the cgtg Martin detector (feature radius 5, match distance 3)
          and vice versa; panels are labeled "SANE better" and "Martin better".

    Ross and Kaelbling. Segmentation According to Natural Examples: Learning Static
    Segmentation from Motion Segmentation. IEEE Transactions on Pattern Analysis and
    Machine Intelligence 2008.

        Daniel O’Shea (djoshea@stanford.edu)    Visual Parsing After Recovery from Blindness   17 / 19
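The training-signal idea behind SANE — let motion segmentation provide free labels
for a model that will later segment single static images — can be sketched in a few
lines of Python. This is a hedged toy, not SANE's boundary-fragment model: frame
differencing stands in for the motion segmentation, the per-pixel features and the
change threshold are invented for illustration, and a plain logistic regression
stands in for the learned segmenter.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def motion_labels(frame_t, frame_t1, thresh=15.0):
        """Pseudo ground truth from motion: pixels that change noticeably between
        consecutive grayscale frames are labeled object (1), the rest background (0)."""
        diff = np.abs(frame_t1.astype(float) - frame_t.astype(float))
        return (diff > thresh).astype(int)

    def static_features(frame):
        """Per-pixel static features: intensity and a crude local-contrast measure."""
        f = frame.astype(float)
        contrast = np.abs(f - np.roll(f, 1, axis=1))
        return np.stack([f.ravel(), contrast.ravel()], axis=1)

    def train_static_segmenter(frame_pairs):
        """Fit a static segmenter on labels that motion provided for free."""
        feats = [static_features(f0) for f0, _ in frame_pairs]
        labels = [motion_labels(f0, f1).ravel() for f0, f1 in frame_pairs]
        clf = LogisticRegression(max_iter=1000)
        clf.fit(np.vstack(feats), np.concatenate(labels))
        return clf

    # Toy usage: a bright square shifts one pixel between frames; at test time the
    # segmenter needs only a single static image.
    f0 = np.zeros((64, 64), dtype=np.uint8); f0[20:40, 20:40] = 200
    f1 = np.zeros((64, 64), dtype=np.uint8); f1[20:40, 21:41] = 200
    clf = train_static_segmenter([(f0, f1)])
    mask = clf.predict(static_features(f0)).reshape(f0.shape)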
Overview    Motion Information as a Scaffold
                     Tests of Visual Parsing   Experimental and Theoretical Evidence for Temporal Learning
                                 Conclusion    Future Directions


Structure from Motion

          Recover 3D structure and camera motion from multiple
          images without correspondence

          Excerpt from the paper (annealed EM over feature-to-measurement assignments):
            1. Generate an initial structure and motion estimate.
            2. Given the current estimate and the data, run a Metropolis sampler in each
               image to obtain approximate assignment weights.
            3. Calculate the virtual measurements from these weights.
            4. Find the new structure and motion estimate using the virtual measurements
               as the data, with any SFM method compatible with the assumed projection model.
            5. If not converged, return to step 2.
          Annealing avoids local minima: the noise parameter is artificially increased for
          the early iterations and decreased exponentially from 25 pixels to 1 pixel; each
          EM iteration runs the sampler for 10,000 steps per image, and an entire run takes
          about a minute of CPU time on a standard PC.

          Fig. 4 (from the paper): three of 11 cube images; although originally taken as a
          sequence in time, the ordering of the images is irrelevant to the method.
          Fig. 5 (from the paper): the structure estimate as initialized and at successive
          iterations of the algorithm.

    Dellaert, Seitz, Thorpe, and Thrun. Structure from Motion without Correspondence.
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2000.

        Daniel O’Shea (djoshea@stanford.edu)    Visual Parsing After Recovery from Blindness   18 / 19
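A hedged toy analogue of the annealed-EM loop excerpted above, in Python. It keeps
the structure of the algorithm — soft data association under an annealed noise
parameter in the E-step, re-estimation from the weighted ("virtual") measurements in
the M-step — but simplifies both halves: exact soft assignment weights replace the
paper's Metropolis sampler over one-to-one assignments, and a weighted mean replaces
the full SFM solve, so the toy recovers 2-D point positions rather than 3-D structure
and camera motion.

    import numpy as np

    def annealed_em_points(measurements, n_points, n_iters=30,
                           sigma_start=25.0, sigma_end=1.0, seed=0):
        """Recover n_points 2-D 'feature' positions from measurements whose
        correspondence to features is unknown, via annealed EM.
        measurements: array of shape (N, 2)."""
        rng = np.random.default_rng(seed)
        # Step 1: initial estimate (random measurements as starting positions).
        points = measurements[rng.choice(len(measurements), n_points, replace=False)]
        # Annealing: noise parameter decreases exponentially (cf. 25 px -> 1 px).
        for sigma in np.geomspace(sigma_start, sigma_end, n_iters):
            # E-step: soft assignment weights of measurements to features under
            # the current (inflated) noise level.
            d2 = ((measurements[:, None, :] - points[None, :, :]) ** 2).sum(-1)
            logw = -0.5 * d2 / sigma ** 2
            w = np.exp(logw - logw.max(axis=1, keepdims=True))
            w /= w.sum(axis=1, keepdims=True)                 # shape (N, n_points)
            # M-step: re-estimate each feature from its weighted measurements.
            points = (w.T @ measurements) / w.sum(axis=0)[:, None]
        return points

    # Toy usage: 3 true points, 60 noisy measurements with unknown correspondence.
    rng = np.random.default_rng(1)
    true_pts = np.array([[0.0, 0.0], [40.0, 10.0], [20.0, 50.0]])
    obs = true_pts[rng.integers(0, 3, 60)] + rng.normal(0.0, 2.0, (60, 2))
    estimated_pts = annealed_em_points(obs, n_points=3)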
Overview    Motion Information as a Scaffold
                        Tests of Visual Parsing   Experimental and Theoretical Evidence for Temporal Learning
                                    Conclusion    Future Directions


Future Directions (Suggestions for Presentations...)



       Evidence for motion-based learning of object binding?
       Evidence from experiments that alter motion cues or
       temporal contiguity?
       Neuroscience and SFA?
       Bootstrapped learning of novel objects?




          Daniel O’Shea (djoshea@stanford.edu)    Visual Parsing After Recovery from Blindness        19 / 19
