SlideShare une entreprise Scribd logo
1  sur  44
Télécharger pour lire hors ligne
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                 O’Reilly Open Source Convention 2008




          Ruby Track: Session 2471

 Real-time Computer Vision With Ruby

                J. Wedekind

         Wednesday, July 23rd 2008


       Nanorobotics EPSRC Basic Technology Grant

       Microsystems and Machine Vision Laboratory
       Modelling Research Centre




1/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                           Introduction
                            In This Talk




Brain.eval <<REQUIRED
  require ’RMagick’
  require ’Qt4’
  require ’complex’
  require ’matrix’
  require ’narray’
REQUIRED

 2/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                        Introduction
                  EU Esprit MINIMAN Project




3/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                    EU IST MiCRoN Project




4/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                       Introduction
                 EPSRC Nanorobotics Project




                                    Electron Microscopy
                                   • telemanipulation
                                   • drift-compensation
                                   • closed-loop control
                                     Computer Vision
                                   • real-time software
                                   • system integration
                                   • theoretical insights



5/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                      Industrial Robotics


                             Default Situation
                   • proprietary operating system
                   • proprietary robot software
                   • proprietary process simulation software
                   • proprietary mathematics software
                   • proprietary machine vision software
                   • proprietary manufacturing software
                        Total Cost of Lock-in (TCL)
                   • duplication of work
                   • integration problems
                   • lack of progress
                   • handicapped developers


6/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                Innovation Happens Elsewhere



                 Ron Goldman & Richard P. Gabriel

           “The market need is greatest for platform
           products because of the importance of a
           reliable promise that vendor lock-in will not
           endanger the survival of products built or
           modified on the software stack above that
           platform.”


           “It is important to remove as many barriers to
           collaboration as possible: social, political,
           and technical.”




7/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                   Design Considerations
            HornetsEye’s Distinguishing Features




       • GPL
       • Ruby
       • Real-Time




8/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                 Design Considerations
                                                        GPLv3

Four Freedoms (Richard Stallman)
 1. The freedom to run the program, for any purpose.
 2. The freedom to study how the program works, and adapt it to your
    needs.
 3. The freedom to redistribute copies so you can help your neighbor.
 4. The freedom to improve the program, and release your
    improvements to the public, so that the whole community benefits.
Respect The Freedom Of Downstream Users (Richard Stallman)
GPL requires derived works to be available under the same license.
Covenant Not To Assert Patent Claims (Eben Moglen)
GPLv3 deters users of the program from instituting patent ligitation by
the threat of withdrawing further rights to use the program.
Other (Eben Moglen)
GPLv3 has regulations against DMCA restrictions and tivoization.


                   9/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          Design Considerations
                                                                  Ruby


# ---------------------------------------------------------------------------------------------------------
img = Magick::Image.read( "circle.png" )[ 0 ]
str = img.export_pixels_to_str( 0, 0, img.columns, img.rows, "I", Magick::CharPixel )
arr = NArray.to_na( str, NArray::BYTE, img.columns, img.rows )
puts ( arr / 128 ).inspect
# NArray.byte(20,20):
# [ [ 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1 ],
# [ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 ],
# [ 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1 ],
# [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ],
# [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ],
# [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ],
# [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ],
# [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ],
# [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ],
# ...
# ---------------------------------------------------------------------------------------------------------

                  No high-level code in C++!

                  10/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Design Considerations
                               Real-Time



        Real-time Object Recognition




11/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                              HornetsEye Core
                                                              Compact Storage


# ------------------------------------------------------------------------------------------------------
class Sequence
 Type = Struct.new( :name, :type, :size, :default, :pack, :unpack ); @@types = []
 def Sequence.register_type( sym, type, size, default, pack, unpack )
   eval "#{sym.to_s} = Type.new( sym.to_s, type, size, default, pack, unpack )"
 end
 register_type( :OBJECT, Object, 1, nil, proc { |o| [o] }, proc { |s| s[0] } )
 register_type( :UBYTE, Fixnum, 1, 0, proc { |o| [o].pack("C") },
    proc { |s| s.unpack("C")[0] } )
 def initialize( type = OBJECT, n = 0, value = nil )
   @type, @data = type, type.pack.call( value == nil ? type.default : value ) * n
   @size = n
 end
 def []( i )
   p = i * @type.size; @type.unpack.call( @data[ p...( p + @type.size ) ] )
 end
 def []=( i, o )
   p = i * @type.size; @data[ p...( p + @type.size ) ] = @type.pack.call( o ); o
 end
end
# ------------------------------------------------------------------------------------------------------


               12/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                            HornetsEye Core
                                                          N-Dimensional Arrays



# ------------------------------------------------------------------------------------------------------
class MultiArray
 UBYTE = Sequence::UBYTE
 OBJECT = Sequence::OBJECT
 def initialize( type = OBJECT, *shape )
   @shape = shape
   stride = 1
   @strides = shape.collect { |s| old = stride; stride *= s; old }
   @data = Sequence.new( type, shape.inject( 1 ) { |r,d| r*d } )
 end
 def []( *indices )
   @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ]
 end
 def []=( *indices )
   value = indices.pop
   @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] = value
 end
end
# ------------------------------------------------------------------------------------------------------




               13/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          HornetsEye Core
                                                       Element-wise Operations


# ------------------------------------------------------------------------------------------------------
class Sequence
 attr_reader :type, :data, :size
 def collect( type = @type )
   retval = Sequence.new( type, @size )
   ( 0...@size ).each { |i| retval[i] = yield self[i] }
   retval
 end
end
class MultiArray
 attr_accessor :shape, :strides, :data
 def MultiArray.import( type, data, *shape )
   retval = MultiArray.new( type )
   stride = 1; retval.strides = shape.collect { |s| old = stride; stride *= s; old }
   retval.shape, retval.data = shape, data; retval
 end
 def collect( type = @data.type, &action )
   MultiArray.import( type, @data.collect( type, &action ), *@shape )
 end
end
# ------------------------------------------------------------------------------------------------------


               14/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                           HornetsEye Core
                                                         Return-type Coercions



# ------------------------------------------------------------------------------------------------------
class Sequence
 @@coercions = Hash.new
 @@coercions.default = OBJECT
 def Sequence.register_coercion( result, type1, type2 )
   @@coercions[ [ type1, type1 ] ] = type1
   @@coercions[ [ type2, type2 ] ] = type2
   @@coercions[ [ type1, type2 ] ] = result
   @@coercions[ [ type2, type1 ] ] = result
 end
 register_coercion( OBJECT, OBJECT, UBYTE )
 def +( other )
   retval = Sequence.new( @@coercions[ [ @type, other.type ] ], @size )
   ( 0...@size ).each { |i| retval[i] = self[i] + other[i] }
   retval
 end
end
# ------------------------------------------------------------------------------------------------------




               15/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                             HornetsEye Core
                                            Fast Malloc Objects



VALUE Malloc::wrapMid( VALUE rbSelf, VALUE rbOffset, VALUE rbLength )
{
  char *self; Data_Get_Struct( rbSelf, char, self );
  return rb_str_new( self + NUM2INT( rbOffset ), NUM2INT( rbLength ) );
}



m=Malloc.new(1000)
m.mid(10,4)
# "000000000000"
m.assign(10,"test")
m.mid(10,4)
# "test"




                  16/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                       HornetsEye Core
                                                                       Native Operations

g,h ∈ {0, 1,  . .  w − 1} × {0, 1, . . . , h − 1} → R
             . ,
   x1 
   ) = g( x1 )/2
                                                                             MultiArray.dfloat( 320, 240 ):
   
h( 
   
               
               
               
                                                       MultiArray.dfloat        [ [ 245.0, 244.0, 197.0, ... ],
   
  x          
              x 
    2              2                                     Fixnum                    [ 245.0, 247.0, 197.0, ... ],
                                                                                   [ 247.0, 248.0, 187.0, ... ]
Ruby                                              h=g/2                          ...

                                                                                           [3.141]
     MultiArray.respond to?( ”binary div lint dfloat” )
                                      no




                                                                                                        String.unpack(”D”)
                                                                                      Array.pack(”D”)
     h = g.collect { |x| x / 2 }                         yes



C++                                        for ( int i=0; i<n; i++ )
                                              *r++ = *p++ / q;
     MultiArray.binary   div   byte   byte
     MultiArray.binary   div   byte   bytergb                          ”x54xE3xA5x9BxC4x20x09x40”
     MultiArray.binary   div   byte   dcomplex
     MultiArray.binary   div   byte   dfloat
     MultiArray.binary   div   byte   dfloatrgb
     ...



                                      17/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                HornetsEye I/O
                           Colourspace Conversions




                                         
 Y   0.299
  
                   0.587     0.114 
                                      
                                         R  0 
                                             
                                             
  
                                   
                                            
                                             
 =
  
  
Cb  −0.168736 −0.331264
                                      
                                          + 
                                             
                                             
                                0.500    G 128
                                   
                                            
  
  
                                   
                                      
                                            
                                             
                                             
  
                                   
                                            
                                             
C   0.500
   r               −0.418688 −0.081312    B  128
           also see: http://fourcc.org/

   18/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                             HornetsEye I/O
                                 BMP, GIF, JPEG, PPM, PNG, PNM, TIFF, ...




mgk = Magick::Image.read( "circle.png" )[0] # code is simplified
str = magick.export_pixels_to_str( 0, 0, mgk.columns, mgk.rows, "RGB",
                                   Magick::CharPixel )
arr = MultiArray.import( MultiArray::UBYTERGB, str, mgk.columns, mgk.rows )

                                      ⇓

arr = MultiArray.load_rgb24( "circle.png" )
mgk = Magick::Image.new( *arr.shape ) { |x| x.depth = 8 }
mgk.import_pixels( 0, 0, arr.shape[0], arr.shape[1], "RGB", arr.to_s,
                   Magick::CharPixel )
Magick::ImageList.new.push( mgk ).write( "circle.png" )

                                      ⇓

arr = MultiArray.save_rgb24( "circle.png" )

                  19/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                              Video Decoding




xine_t *m_xine = xine_new(); // code is simplified
xine_config_load( m_xine, "/home/myusername/.xine/config" );
xine_init( m_xine );
xine_video_port_t *m_videoPort = xine_new_framegrab_video_port( m_xine );
xine_stream_t *m_stream = xine_stream_new( m_xine, NULL, m_videoPort );
xine_open( m_stream, "test.avi" );
xine_video_frame_t *m_frame;
xine_get_next_video_frame( m_videoPort, &m_frame );
xine_free_video_frame( m_videoPort, &m_frame );
                                      ⇓
xine = XineInput.new( "test.avi" )
img = xine.read
                  20/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                              Video Encoding




// "const unsigned char *data" points to I420-data of 320x240 frame
FILE *m_control = popen( "mencoder - -o test.avi"   // code is simplified
                         " -ovc lavc -lavcopts vcodec=ffv1", "w" );
fprintf( m_control, "YUV4MPEG2 W320 H240 F25000000:1000000 Ip A0:0n" );
fprintf( m_control, "FRAMEn" );
fwrite( data, 320 * 240 * 3 / 2, 1, m_control );

                                      ⇓

# "img" is of type "HornetsEye::Image" or "HornetsEye::MultiArray"
mencoder = MEncoderOutput.new( "test.avi", 25 )
mencoder.write( img )

                  21/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                HornetsEye I/O
                                          Video For Linux (V4L/V4L2)




int m_fd = open( "/dev/video0", O_RDWR, 0 );    // code is incomplete
ioctl( VIDIOC_S_FMT, &m_format );
ioctl( VIDIOC_REQBUFS, &m_req );
ioctl( VIDIOC_QUERYBUF, &m_buf[0] );
ioctl( VIDIOC_QBUF, &m_buf[0] );
ioctl( VIDIOC_STREAMON, &type );
ioctl( VIDIOC_DQBUF, &buf )
// ...
                                      ⇓
v4l2 = V4L2Input.new
img = v4l2.read
                  22/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                          IIDC/DCAM (libdc1394)




raw1394handle_t m_handle = dc1394_create_handle( 0 ); // code is incomplete
dc1394_cameracapture m_camera; int numCameras;
nodeid_t *m_cameraNode = dc1394_get_camera_nodes( m_handle, &numCameras, 0 );
dc1394_camera_on( m_handle, 0 );
dc1394_dma_setup_capture( m_handle, m_cameraNode[ 0 ], 0,
                          FORMAT_VGA_NONCOMPRESSED, MODE_640x480_YUV422,
                          FRAMERATE_15, 4, 1, NULL, &m_camera );
dc1394_start_iso_transmission( m_handle, m_camera.node );
// ...

                                      ⇓

firewire = DC1394Input.new
img = firewire.read

                  23/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                        HornetsEye I/O
                                                    XPutImage/glDrawPixels




# ----------------------------------------------------------
img = MultiArray.load_rgb24( "howden.jpg" )
display = X11Display.new
output = XImageOutput.new
# output = OpenGLOutput.new
window = X11Window.new( display, output,
                                     320, 240 )
window.title = "Test"
output.write( img )
window.show
display.eventLoop
# ----------------------------------------------------------

                    24/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          HornetsEye I/O
                                                           XvPutImage

# -----------------------------------------------------
xine = XineInput.new( "dvd://1" ); sleep 2
display = X11Display.new
output = XVideoOutput.new
window = X11Window.new( display, output,
                                 768, 576 * 9 / 16 )
window.title = "Test"
window.show
delay = xine.frame_duration.to_f / 90000.0
time = Time.now
while xine.status? and output.status?
  output.write( xine.read )
  time_left = delay - ( Time.now.to_f -
                  time.to_f )
  display.eventLoop( time_left * 1000 )
  time += delay
end
# -----------------------------------------------------

                        25/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                         HornetsEye I/O
                                   High Dynamic Range Images

Exposure Series           Alignment (Hugin)        Tonemapping (QtPfsGui)



                                               ⇒



                                                     Loading And Saving
                     ⇒




                                                   img = MultiArray.
                                                     load_rgbf("test.exr")
                                                   img.
                                                     save_rgbf("test.exr")


             26/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                        HornetsEye I/O
                   Qt4-QtRuby: G++ Dataflow




27/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         HornetsEye I/O
                   Qt4-QtRuby: Ruby Dataflow




28/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                                       HornetsEye I/O
                                                                                Qt4-QtRuby: XVideo Integration

# ---------------------------------------------------------------------------
class VideoPlayer < Qt::Widget
 def initialize
   super
   @xvideo = Hornetseye::XvWidget.new( self )
   layout = Qt::VBoxLayout.new( self )
   layout.addWidget( @xvideo )
   @xine = Hornetseye::XineInput.new( "test.avi", false )
   @timer = startTimer( @xine.frame_duration * 1000 / 90000 )
   resize( 640, 400 )
 end
 def timerEvent( e )
   begin
     if @xine
       img = @xine.read
       @xvideo.write( img )
     end
   rescue
     @xine = nil
     killTimer( @timer )
     @xvideo.clear
     @timer = 0
   end
 end
end
app = Qt::Application.new( ARGV )
VideoPlayer.new.show
app.exec
# ---------------------------------------------------------------------------


                                  29/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                 HornetsEye I/O
                               Microsoft Windows


            /

        V4LInput           VFWInput
        V4L2Input          DShowInput
        DC1394Input        —
        XineInput          —
        MPlayerInput       MPlayerInput
        MEncoderOutput     MEncoderOutput
        X11Display         W32Display
        X11Window          W32Window
        XImageOutput       GDIOutput
        OpenGLOutput       —
        XVideoOutput       —


30/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                            Introduction
                           From Here On




Brain.eval <<REQUIRED
  require ’hornetseye’
  include Hornetseye
REQUIRED




 31/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                     Computer Vision With Ruby
                                                                       Look-Up-Tables (LUTs)

g ∈ {0, 1, . . . , w} × {0, 1, . . . , h} → {0, 1, . . . , 255}
m ∈ {0, 1, . . . , 255} → {0, 1, . . . , 255}3
                   
   x1 
   ) = m g( x1 )
   
                   
                     
h( 
   
   
  x 
                     
                     
                     
                     
                    x 
      2               2


# -----------------------------------------------------------------------
img = MultiArray.load_grey8( "test.jpg" )
class Numeric
  def clip( range )
   [ [ self, range.begin ].max, range.end ].min
 end
end
colours = {}
for i in 0...256
 hue = 240 - i * 240.0 / 256.0
 colours[i] =
  RGB( ( ( hue - 180 ).abs - 60 ).clip( 0...60 ) * 255 / 60.0,
      ( 120 - ( hue - 120 ).abs ).clip( 0...60 ) * 255 / 60.0,
      ( 120 - ( hue - 240 ).abs ).clip( 0...60 ) * 255 / 60.0 )
end
img.map( colours, MultiArray::UBYTERGB, 256 ).display
# -----------------------------------------------------------------------

                                32/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                               Computer Vision With Ruby
                                                                                  Masks And Ramps

# -----------------------------------------------------------------------
class MultiArray
 def MultiArray.ramp1( *shape )
   retval = MultiArray.new( MultiArray::LINT, *shape )
   for x in 0...shape[0]
    retval[ x, 0...shape[1] ] = x                                                        Compute Bounding Box
   end
   retval
 end
 # def MultiArray.ramp2 ...
end
input = V4LInput.new
x, y =
 MultiArray.ramp1( input.width, input.height ),
 MultiArray.ramp2( input.width, input.height )
display = X11Display.new
output = XVideoOutput.new
window = X11Window.new( display, output, 640, 480 )                              0   1   2   3   4   5
window.title = "Thresholding"                                                    0   1   2   3   4   5
window.show                                                                      0   1   2   3   4   5
while input.status? and output.status?                                           0   1   2   3   4   5
 img = input.read_grey8                                                          0   1   2   3   4   5
 mask = img.binarise_lt( 48 )                                                    0   1   2   3   4   5
 result = ( img / 4 ) * ( mask + 1 )
 if mask.sum > 0                                                             2 3 4 2 3 1 2 3 2 3
   bbox = [ x.mask( mask ).range, y.mask( mask ).range ]
   result[ *bbox ] *= 2                                                              1≤x≤4
 end
 output.write( result )
 display.processEvents
end
# -----------------------------------------------------------------------


                                         33/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                           Computer Vision With Ruby
                                                                           Warps (Use Image As LUT)
                                                                                     
                                                                   g W( x1 )
                                                                                        x1 
g ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3
                                                                      
                                                                                  if W( ) ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1}
                                                                  
                                                                                     
                                                                                        
                                                            x1        
                                                                                      
h ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3      ) =
                                                            
                                                                                   
                                                         h( 
                                                                      
                                                                        x             
                                                                                       x 
                                                                        2              2
W ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → Z2
                                                           x    
                                                                  
                                                              2   
                                                                  
                                                                   0
                                                                  
                                                                                  otherwise
# ----------------------------------------------------------------------
class MultiArray
 # def MultiArray.ramp1 ...
 def MultiArray.ramp2( *shape )
   retval = MultiArray.new( MultiArray::LINT, *shape )
   for y in 0...shape[1]
    retval[ 0...shape[0], y ] = y
   end
   retval
 end
end
img = MultiArray.load_rgb24( "test.jpg" )
w, h = *img.shape; c = 0.5 * h
x, y = MultiArray.ramp1( h, h ), MultiArray.ramp2( h, h )
warp = MultiArray.new( MultiArray::LINT, h, h, 2 )
warp[ 0...h, 0...h, 0 ], warp[ 0...h, 0...h, 1 ] =
 ( ( ( x - c ).atan2( y - c ) / Math::PI + 1 ) * w / 2 - 0.5 ),
 ( ( x - c ) ** 2 + ( y - c ) ** 2 ).sqrt
img.warp_clipped( warp ).display
# ----------------------------------------------------------------------

                                34/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                  Computer Vision With Ruby
                                                                 Affine Transform using Warps
                                                                                                     
# --------------------------------------------------------------------        x1  cos(α) − sin(α)    x1 
                                                                              ) = 
                                                                              
                                                                              
                                                                         Wα ( 
                                                                                                   
                                                                                                        
                                                                                                         
class MultiArray                                                                  
                                                                                    
                                                                                                   
                                                                                                        
                                                                                                         
                                                                              
                                                                             x     sin(α) cos(α) 
                                                                                                   
                                                                                                        
                                                                                                         
                                                                                                        x 
 def MultiArray.ramp1( *shape )                                                 2                          2
  retval = MultiArray.new( MultiArray::LINT, *shape )
  for x in 0...shape[0]
    retval[ x, 0...shape[1] ] = x
  end
  retval
 end
 # def MultiArray.ramp2 ...
end
img = MultiArray.load_rgb24( "test.jpg" )
w, h = *img.shape
v = Vector[ MultiArray.ramp1( w, h ) - w / 2,
         MultiArray.ramp2( w, h ) - h / 2 ]
angle = 30.0 * Math::PI / 180.0
m = Matrix[ [ Math::cos( angle ), -Math::sin( angle ) ],
         [ Math::sin( angle ), Math::cos( angle ) ] ]
warp = MultiArray.new( MultiArray::LINT, w, h, 2 )
warp[ 0...w, 0...h, 0 ], warp[ 0...w, 0...h, 1 ] =
  ( m * v )[0] + w / 2, ( m * v )[1] + h / 2
img.warp_clipped( warp ).display
# --------------------------------------------------------------------

                           35/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                  Computer Vision With Ruby
          Center Of Gravity And Principal Components




36/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                    Computer Vision With Ruby
                                    Linear Shift-Invariant Filters

   Input Image                Sharpen                   Gaussian Blur




Gauss-Gradient (X)       Gauss-Gradient (Y)




                 37/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                   Computer Vision With Ruby
                                    Edge- And Corner-Images

  Input Image                  Sobel                 Gauss-Gradient




Harris-Stephens         Kanade-Lucas-Tomasi




                38/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                         Computer Vision With Ruby
                                                   (Inverse Compositional) Lucas-Kanade

given: template T , image I , previous pose p
sought: pose-change ∆p
argmin           ||T (x) − I(W p (W∆p (x)))||2 dx = (∗)
                               −1  −1
                                                                    −1   −1
  ∆p       x∈T                                                  I(W p (W∆p (x)))
(1) T (x) − I(W p (W∆p (x))) = T (W∆p (x)) − I(W p (x))
                −1  −1                           −1                 −1
                                                                I(W p (x))
                          δT          T     δW p
(2) T (W∆p (x)) ≈ T (x) +    (x)          ·      (x) · ∆p
                          δx                 δp
   (1,2)
(∗) = argmin(||H p + b||2 ) = (H T H)−1 H T b
            p
             
             h1,1 h1,2 · · ·
                                          
                                          b1                          T (x)
             
             
                              
                               
                                          
                                           
                                           
                                         
where H =  2,1 h2,2 · · ·     and b = b 
             
             
             h                
                                          
                                           
             
                              
                                          2
                                           
                                           
              .    .                     .
                          .. 
                                         
              .    .
                                         .
                             .
                                         
                .   .                        .
                                         

        δT      T  δW p
hi, j =    (xi ) ·      (xi ) , bi = T (xi ) − I(W p (xi ))
                                                   −1
        δx         δp j

           S. Baker and I. Matthew: “Lucas-Kanade 20 years on: a unifying framework”
                       http://www.ri.cmu.edu/projects/project 515.html

                            39/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                  Computer Vision With Ruby
                              3 Degrees-Of-Freedom Lucas-Kanade

                            Initialisation
p = Vector[ xshift, yshift, rotation ]
w, h, sigma = tpl.shape[0], tpl.shape[1], 5.0
x, y = xramp( w, h ), yramp( w, h )
gx = tpl.gauss_gradient_x( sigma )
gy = tpl.gauss_gradient_y( sigma )
c = Matrix[ [ 1, 0 ], [ 0, 1 ], [ -y, x ] ] * Vector[ gx, gy ]
hs = ( c * c.covector ).collect { |e| e.sum }
                              Tracking
field = MultiArray.new( MultiArray::SFLOAT, w,   h, 2 )
field[ 0...w, 0...h, 0 ] = x * cos( p[2] ) - y   * sin( p[2] ) + p[0]
field[ 0...w, 0...h, 1 ] = x * sin( p[2] ) + y   * cos( p[2] ) + p[1]
diff = img.warp_clipped_interpolate( field ) -   tpl
s = c.collect { |e| ( e * diff ).sum }
d = hs.inverse * s
p += Matrix[ [ cos(p[2]), -sin(p[2]), 0 ],
             [ sin(p[2]), cos(p[2]), 0 ],
             [          0,          0, 1 ] ] *   d


           40/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                   Computer Vision With Ruby
                Interactive Presentation Software




41/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                           Current/Future Work


• feature extraction
  – multiresolution Lucas-Kanade
  – wavelet-based features
• feature descriptors
  – appearance templates
• feature based object recognition
  – geometric hashing
  – RANSAC
• feature based tracking
  – bounded hough transform
• parallel processing

No high-level code in C++!
                  42/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                    Appeal




Computer vision only will happen if we ...
• break with business as usual
• remove all barriers to collaboration
• allow users and developers to innovate
• need fully hackable hardware
• fight for a free software stack

       43/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                           Conclusion
        http://rubyforge.org/projects/hornetseye/

            Let’s do it!




44/44

Contenu connexe

Similaire à Real-time Computer Vision With Ruby - OSCON 2008

PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsGraham Dumpleton
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsSerge Stinckwich
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerunidsecconf
 
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...Intel® Software
 
Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Alexandre (Shura) Iline
 
ARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopSaumil Shah
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationRuo Ando
 
Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security LLC
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindHow to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindAndreas Czakaj
 
Swift profiling middleware and tools
Swift profiling middleware and toolsSwift profiling middleware and tools
Swift profiling middleware and toolszhang hua
 
Fine line between performance and security
Fine line between performance and securityFine line between performance and security
Fine line between performance and securityAlmudena Vivanco
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on EmulatorsDVClub
 
Voice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionVoice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionIRJET Journal
 
Labs_BT_20221017.pptx
Labs_BT_20221017.pptxLabs_BT_20221017.pptx
Labs_BT_20221017.pptxssuserb4d806
 
4_image_detection.pdf
4_image_detection.pdf4_image_detection.pdf
4_image_detection.pdfFEG
 
Improving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesImproving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesHassan Abid
 
Demystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampDemystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampAndré Baptista
 
Python_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesPython_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesRussell Darling
 
01 foundations
01 foundations01 foundations
01 foundationsankit_ppt
 

Similaire à Real-time Computer Vision With Ruby - OSCON 2008 (20)

PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systems
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerun
 
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
 
Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.
 
ARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation Workshop
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumeration
 
Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindHow to write clean & testable code without losing your mind
How to write clean & testable code without losing your mind
 
Swift profiling middleware and tools
Swift profiling middleware and toolsSwift profiling middleware and tools
Swift profiling middleware and tools
 
Fine line between performance and security
Fine line between performance and securityFine line between performance and security
Fine line between performance and security
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on Emulators
 
Voice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionVoice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object Detection
 
Labs_BT_20221017.pptx
Labs_BT_20221017.pptxLabs_BT_20221017.pptx
Labs_BT_20221017.pptx
 
4_image_detection.pdf
4_image_detection.pdf4_image_detection.pdf
4_image_detection.pdf
 
Improving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesImproving app performance with Kotlin Coroutines
Improving app performance with Kotlin Coroutines
 
Demystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampDemystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels Camp
 
Python_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesPython_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_Pipelines
 
01 foundations
01 foundations01 foundations
01 foundations
 
Vhdl new
Vhdl newVhdl new
Vhdl new
 

Plus de Jan Wedekind

The SOLID Principles for Software Design
The SOLID Principles for Software DesignThe SOLID Principles for Software Design
The SOLID Principles for Software DesignJan Wedekind
 
Fundamentals of Computing
Fundamentals of ComputingFundamentals of Computing
Fundamentals of ComputingJan Wedekind
 
Using Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridUsing Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridJan Wedekind
 
Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Jan Wedekind
 
The MiCRoN Project
The MiCRoN ProjectThe MiCRoN Project
The MiCRoN ProjectJan Wedekind
 
Computer vision for microscopes
Computer vision for microscopesComputer vision for microscopes
Computer vision for microscopesJan Wedekind
 
Focus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsFocus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsJan Wedekind
 
Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Jan Wedekind
 
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Jan Wedekind
 
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Jan Wedekind
 
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Jan Wedekind
 
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Jan Wedekind
 
Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Jan Wedekind
 
Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Jan Wedekind
 
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Jan Wedekind
 
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Jan Wedekind
 
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Jan Wedekind
 

Plus de Jan Wedekind (17)

The SOLID Principles for Software Design
The SOLID Principles for Software DesignThe SOLID Principles for Software Design
The SOLID Principles for Software Design
 
Fundamentals of Computing
Fundamentals of ComputingFundamentals of Computing
Fundamentals of Computing
 
Using Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridUsing Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration Grid
 
Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...
 
The MiCRoN Project
The MiCRoN ProjectThe MiCRoN Project
The MiCRoN Project
 
Computer vision for microscopes
Computer vision for microscopesComputer vision for microscopes
Computer vision for microscopes
 
Focus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsFocus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objects
 
Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)
 
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
 
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
 
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
 
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
 
Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010
 
Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009
 
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
 
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
 
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
 

Dernier

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 

Dernier (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Real-time Computer Vision With Ruby - OSCON 2008

  • 1. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf O’Reilly Open Source Convention 2008 Ruby Track: Session 2471 Real-time Computer Vision With Ruby J. Wedekind Wednesday, July 23rd 2008 Nanorobotics EPSRC Basic Technology Grant Microsystems and Machine Vision Laboratory Modelling Research Centre 1/44
  • 2. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction In This Talk Brain.eval <<REQUIRED require ’RMagick’ require ’Qt4’ require ’complex’ require ’matrix’ require ’narray’ REQUIRED 2/44
  • 3. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EU Esprit MINIMAN Project 3/44
  • 4. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EU IST MiCRoN Project 4/44
  • 5. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EPSRC Nanorobotics Project Electron Microscopy • telemanipulation • drift-compensation • closed-loop control Computer Vision • real-time software • system integration • theoretical insights 5/44
  • 6. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction Industrial Robotics Default Situation • proprietary operating system • proprietary robot software • proprietary process simulation software • proprietary mathematics software • proprietary machine vision software • proprietary manufacturing software Total Cost of Lock-in (TCL) • duplication of work • integration problems • lack of progress • handicapped developers 6/44
  • 7. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction Innovation Happens Elsewhere Ron Goldman & Richard P. Gabriel “The market need is greatest for platform products because of the importance of a reliable promise that vendor lock-in will not endanger the survival of products built or modified on the software stack above that platform.” “It is important to remove as many barriers to collaboration as possible: social, political, and technical.” 7/44
  • 8. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations HornetsEye’s Distinguishing Features • GPL • Ruby • Real-Time 8/44
  • 9. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations GPLv3 Four Freedoms (Richard Stallman) 1. The freedom to run the program, for any purpose. 2. The freedom to study how the program works, and adapt it to your needs. 3. The freedom to redistribute copies so you can help your neighbor. 4. The freedom to improve the program, and release your improvements to the public, so that the whole community benefits. Respect The Freedom Of Downstream Users (Richard Stallman) GPL requires derived works to be available under the same license. Covenant Not To Assert Patent Claims (Eben Moglen) GPLv3 deters users of the program from instituting patent ligitation by the threat of withdrawing further rights to use the program. Other (Eben Moglen) GPLv3 has regulations against DMCA restrictions and tivoization. 9/44
  • 10. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations Ruby # --------------------------------------------------------------------------------------------------------- img = Magick::Image.read( "circle.png" )[ 0 ] str = img.export_pixels_to_str( 0, 0, img.columns, img.rows, "I", Magick::CharPixel ) arr = NArray.to_na( str, NArray::BYTE, img.columns, img.rows ) puts ( arr / 128 ).inspect # NArray.byte(20,20): # [ [ 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1 ], # [ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 ], # [ 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1 ], # [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ], # [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ], # [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ], # [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ], # [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], # [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], # ... # --------------------------------------------------------------------------------------------------------- No high-level code in C++! 10/44
  • 11. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations Real-Time Real-time Object Recognition 11/44
  • 12. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Compact Storage # ------------------------------------------------------------------------------------------------------ class Sequence Type = Struct.new( :name, :type, :size, :default, :pack, :unpack ); @@types = [] def Sequence.register_type( sym, type, size, default, pack, unpack ) eval "#{sym.to_s} = Type.new( sym.to_s, type, size, default, pack, unpack )" end register_type( :OBJECT, Object, 1, nil, proc { |o| [o] }, proc { |s| s[0] } ) register_type( :UBYTE, Fixnum, 1, 0, proc { |o| [o].pack("C") }, proc { |s| s.unpack("C")[0] } ) def initialize( type = OBJECT, n = 0, value = nil ) @type, @data = type, type.pack.call( value == nil ? type.default : value ) * n @size = n end def []( i ) p = i * @type.size; @type.unpack.call( @data[ p...( p + @type.size ) ] ) end def []=( i, o ) p = i * @type.size; @data[ p...( p + @type.size ) ] = @type.pack.call( o ); o end end # ------------------------------------------------------------------------------------------------------ 12/44
  • 13. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core N-Dimensional Arrays # ------------------------------------------------------------------------------------------------------ class MultiArray UBYTE = Sequence::UBYTE OBJECT = Sequence::OBJECT def initialize( type = OBJECT, *shape ) @shape = shape stride = 1 @strides = shape.collect { |s| old = stride; stride *= s; old } @data = Sequence.new( type, shape.inject( 1 ) { |r,d| r*d } ) end def []( *indices ) @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] end def []=( *indices ) value = indices.pop @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] = value end end # ------------------------------------------------------------------------------------------------------ 13/44
  • 14. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Element-wise Operations # ------------------------------------------------------------------------------------------------------ class Sequence attr_reader :type, :data, :size def collect( type = @type ) retval = Sequence.new( type, @size ) ( 0...@size ).each { |i| retval[i] = yield self[i] } retval end end class MultiArray attr_accessor :shape, :strides, :data def MultiArray.import( type, data, *shape ) retval = MultiArray.new( type ) stride = 1; retval.strides = shape.collect { |s| old = stride; stride *= s; old } retval.shape, retval.data = shape, data; retval end def collect( type = @data.type, &action ) MultiArray.import( type, @data.collect( type, &action ), *@shape ) end end # ------------------------------------------------------------------------------------------------------ 14/44
  • 15. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Return-type Coercions # ------------------------------------------------------------------------------------------------------ class Sequence @@coercions = Hash.new @@coercions.default = OBJECT def Sequence.register_coercion( result, type1, type2 ) @@coercions[ [ type1, type1 ] ] = type1 @@coercions[ [ type2, type2 ] ] = type2 @@coercions[ [ type1, type2 ] ] = result @@coercions[ [ type2, type1 ] ] = result end register_coercion( OBJECT, OBJECT, UBYTE ) def +( other ) retval = Sequence.new( @@coercions[ [ @type, other.type ] ], @size ) ( 0...@size ).each { |i| retval[i] = self[i] + other[i] } retval end end # ------------------------------------------------------------------------------------------------------ 15/44
  • 16. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Fast Malloc Objects VALUE Malloc::wrapMid( VALUE rbSelf, VALUE rbOffset, VALUE rbLength ) { char *self; Data_Get_Struct( rbSelf, char, self ); return rb_str_new( self + NUM2INT( rbOffset ), NUM2INT( rbLength ) ); } m=Malloc.new(1000) m.mid(10,4) # "000000000000" m.assign(10,"test") m.mid(10,4) # "test" 16/44
  • 17. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Native Operations g,h ∈ {0, 1,  . .  w − 1} × {0, 1, . . . , h − 1} → R  . ,  x1   ) = g( x1 )/2     MultiArray.dfloat( 320, 240 ):   h(            MultiArray.dfloat [ [ 245.0, 244.0, 197.0, ... ],   x    x  2 2 Fixnum [ 245.0, 247.0, 197.0, ... ], [ 247.0, 248.0, 187.0, ... ] Ruby h=g/2 ... [3.141] MultiArray.respond to?( ”binary div lint dfloat” ) no String.unpack(”D”) Array.pack(”D”) h = g.collect { |x| x / 2 } yes C++ for ( int i=0; i<n; i++ ) *r++ = *p++ / q; MultiArray.binary div byte byte MultiArray.binary div byte bytergb ”x54xE3xA5x9BxC4x20x09x40” MultiArray.binary div byte dcomplex MultiArray.binary div byte dfloat MultiArray.binary div byte dfloatrgb ... 17/44
  • 18. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Colourspace Conversions          Y   0.299       0.587 0.114    R  0                           =       Cb  −0.168736 −0.331264    +          0.500  G 128                                                  C   0.500 r −0.418688 −0.081312  B  128 also see: http://fourcc.org/ 18/44
  • 19. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O BMP, GIF, JPEG, PPM, PNG, PNM, TIFF, ... mgk = Magick::Image.read( "circle.png" )[0] # code is simplified str = magick.export_pixels_to_str( 0, 0, mgk.columns, mgk.rows, "RGB", Magick::CharPixel ) arr = MultiArray.import( MultiArray::UBYTERGB, str, mgk.columns, mgk.rows ) ⇓ arr = MultiArray.load_rgb24( "circle.png" ) mgk = Magick::Image.new( *arr.shape ) { |x| x.depth = 8 } mgk.import_pixels( 0, 0, arr.shape[0], arr.shape[1], "RGB", arr.to_s, Magick::CharPixel ) Magick::ImageList.new.push( mgk ).write( "circle.png" ) ⇓ arr = MultiArray.save_rgb24( "circle.png" ) 19/44
  • 20. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video Decoding xine_t *m_xine = xine_new(); // code is simplified xine_config_load( m_xine, "/home/myusername/.xine/config" ); xine_init( m_xine ); xine_video_port_t *m_videoPort = xine_new_framegrab_video_port( m_xine ); xine_stream_t *m_stream = xine_stream_new( m_xine, NULL, m_videoPort ); xine_open( m_stream, "test.avi" ); xine_video_frame_t *m_frame; xine_get_next_video_frame( m_videoPort, &m_frame ); xine_free_video_frame( m_videoPort, &m_frame ); ⇓ xine = XineInput.new( "test.avi" ) img = xine.read 20/44
  • 21. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video Encoding // "const unsigned char *data" points to I420-data of 320x240 frame FILE *m_control = popen( "mencoder - -o test.avi" // code is simplified " -ovc lavc -lavcopts vcodec=ffv1", "w" ); fprintf( m_control, "YUV4MPEG2 W320 H240 F25000000:1000000 Ip A0:0n" ); fprintf( m_control, "FRAMEn" ); fwrite( data, 320 * 240 * 3 / 2, 1, m_control ); ⇓ # "img" is of type "HornetsEye::Image" or "HornetsEye::MultiArray" mencoder = MEncoderOutput.new( "test.avi", 25 ) mencoder.write( img ) 21/44
  • 22. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video For Linux (V4L/V4L2) int m_fd = open( "/dev/video0", O_RDWR, 0 ); // code is incomplete ioctl( VIDIOC_S_FMT, &m_format ); ioctl( VIDIOC_REQBUFS, &m_req ); ioctl( VIDIOC_QUERYBUF, &m_buf[0] ); ioctl( VIDIOC_QBUF, &m_buf[0] ); ioctl( VIDIOC_STREAMON, &type ); ioctl( VIDIOC_DQBUF, &buf ) // ... ⇓ v4l2 = V4L2Input.new img = v4l2.read 22/44
  • 23. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O IIDC/DCAM (libdc1394) raw1394handle_t m_handle = dc1394_create_handle( 0 ); // code is incomplete dc1394_cameracapture m_camera; int numCameras; nodeid_t *m_cameraNode = dc1394_get_camera_nodes( m_handle, &numCameras, 0 ); dc1394_camera_on( m_handle, 0 ); dc1394_dma_setup_capture( m_handle, m_cameraNode[ 0 ], 0, FORMAT_VGA_NONCOMPRESSED, MODE_640x480_YUV422, FRAMERATE_15, 4, 1, NULL, &m_camera ); dc1394_start_iso_transmission( m_handle, m_camera.node ); // ... ⇓ firewire = DC1394Input.new img = firewire.read 23/44
  • 24. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O XPutImage/glDrawPixels # ---------------------------------------------------------- img = MultiArray.load_rgb24( "howden.jpg" ) display = X11Display.new output = XImageOutput.new # output = OpenGLOutput.new window = X11Window.new( display, output, 320, 240 ) window.title = "Test" output.write( img ) window.show display.eventLoop # ---------------------------------------------------------- 24/44
  • 25. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O XvPutImage # ----------------------------------------------------- xine = XineInput.new( "dvd://1" ); sleep 2 display = X11Display.new output = XVideoOutput.new window = X11Window.new( display, output, 768, 576 * 9 / 16 ) window.title = "Test" window.show delay = xine.frame_duration.to_f / 90000.0 time = Time.now while xine.status? and output.status? output.write( xine.read ) time_left = delay - ( Time.now.to_f - time.to_f ) display.eventLoop( time_left * 1000 ) time += delay end # ----------------------------------------------------- 25/44
  • 26. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O High Dynamic Range Images Exposure Series Alignment (Hugin) Tonemapping (QtPfsGui) ⇒ Loading And Saving ⇒ img = MultiArray. load_rgbf("test.exr") img. save_rgbf("test.exr") 26/44
  • 27. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: G++ Dataflow 27/44
  • 28. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: Ruby Dataflow 28/44
  • 29. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: XVideo Integration # --------------------------------------------------------------------------- class VideoPlayer < Qt::Widget def initialize super @xvideo = Hornetseye::XvWidget.new( self ) layout = Qt::VBoxLayout.new( self ) layout.addWidget( @xvideo ) @xine = Hornetseye::XineInput.new( "test.avi", false ) @timer = startTimer( @xine.frame_duration * 1000 / 90000 ) resize( 640, 400 ) end def timerEvent( e ) begin if @xine img = @xine.read @xvideo.write( img ) end rescue @xine = nil killTimer( @timer ) @xvideo.clear @timer = 0 end end end app = Qt::Application.new( ARGV ) VideoPlayer.new.show app.exec # --------------------------------------------------------------------------- 29/44
  • 30. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Microsoft Windows / V4LInput VFWInput V4L2Input DShowInput DC1394Input — XineInput — MPlayerInput MPlayerInput MEncoderOutput MEncoderOutput X11Display W32Display X11Window W32Window XImageOutput GDIOutput OpenGLOutput — XVideoOutput — 30/44
  • 31. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction From Here On Brain.eval <<REQUIRED require ’hornetseye’ include Hornetseye REQUIRED 31/44
  • 32. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Look-Up-Tables (LUTs) g ∈ {0, 1, . . . , w} × {0, 1, . . . , h} → {0, 1, . . . , 255} m ∈ {0, 1, . . . , 255} → {0, 1, . . . , 255}3      x1   ) = m g( x1 )         h(      x          x  2 2 # ----------------------------------------------------------------------- img = MultiArray.load_grey8( "test.jpg" ) class Numeric def clip( range ) [ [ self, range.begin ].max, range.end ].min end end colours = {} for i in 0...256 hue = 240 - i * 240.0 / 256.0 colours[i] = RGB( ( ( hue - 180 ).abs - 60 ).clip( 0...60 ) * 255 / 60.0, ( 120 - ( hue - 120 ).abs ).clip( 0...60 ) * 255 / 60.0, ( 120 - ( hue - 240 ).abs ).clip( 0...60 ) * 255 / 60.0 ) end img.map( colours, MultiArray::UBYTERGB, 256 ).display # ----------------------------------------------------------------------- 32/44
  • 33. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Masks And Ramps # ----------------------------------------------------------------------- class MultiArray def MultiArray.ramp1( *shape ) retval = MultiArray.new( MultiArray::LINT, *shape ) for x in 0...shape[0] retval[ x, 0...shape[1] ] = x Compute Bounding Box end retval end # def MultiArray.ramp2 ... end input = V4LInput.new x, y = MultiArray.ramp1( input.width, input.height ), MultiArray.ramp2( input.width, input.height ) display = X11Display.new output = XVideoOutput.new window = X11Window.new( display, output, 640, 480 ) 0 1 2 3 4 5 window.title = "Thresholding" 0 1 2 3 4 5 window.show 0 1 2 3 4 5 while input.status? and output.status? 0 1 2 3 4 5 img = input.read_grey8 0 1 2 3 4 5 mask = img.binarise_lt( 48 ) 0 1 2 3 4 5 result = ( img / 4 ) * ( mask + 1 ) if mask.sum > 0 2 3 4 2 3 1 2 3 2 3 bbox = [ x.mask( mask ).range, y.mask( mask ).range ] result[ *bbox ] *= 2 1≤x≤4 end output.write( result ) display.processEvents end # ----------------------------------------------------------------------- 33/44
  • 34. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Warps (Use Image As LUT)       g W( x1 )  x1  g ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3      if W( ) ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1}          x1         h ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3  ) =          h(       x    x     2 2 W ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → Z2 x    2    0  otherwise # ---------------------------------------------------------------------- class MultiArray # def MultiArray.ramp1 ... def MultiArray.ramp2( *shape ) retval = MultiArray.new( MultiArray::LINT, *shape ) for y in 0...shape[1] retval[ 0...shape[0], y ] = y end retval end end img = MultiArray.load_rgb24( "test.jpg" ) w, h = *img.shape; c = 0.5 * h x, y = MultiArray.ramp1( h, h ), MultiArray.ramp2( h, h ) warp = MultiArray.new( MultiArray::LINT, h, h, 2 ) warp[ 0...h, 0...h, 0 ], warp[ 0...h, 0...h, 1 ] = ( ( ( x - c ).atan2( y - c ) / Math::PI + 1 ) * w / 2 - 0.5 ), ( ( x - c ) ** 2 + ( y - c ) ** 2 ).sqrt img.warp_clipped( warp ).display # ---------------------------------------------------------------------- 34/44
  • 35. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Affine Transform using Warps       # --------------------------------------------------------------------  x1  cos(α) − sin(α)  x1   ) =      Wα (         class MultiArray              x   sin(α) cos(α)         x  def MultiArray.ramp1( *shape ) 2 2 retval = MultiArray.new( MultiArray::LINT, *shape ) for x in 0...shape[0] retval[ x, 0...shape[1] ] = x end retval end # def MultiArray.ramp2 ... end img = MultiArray.load_rgb24( "test.jpg" ) w, h = *img.shape v = Vector[ MultiArray.ramp1( w, h ) - w / 2, MultiArray.ramp2( w, h ) - h / 2 ] angle = 30.0 * Math::PI / 180.0 m = Matrix[ [ Math::cos( angle ), -Math::sin( angle ) ], [ Math::sin( angle ), Math::cos( angle ) ] ] warp = MultiArray.new( MultiArray::LINT, w, h, 2 ) warp[ 0...w, 0...h, 0 ], warp[ 0...w, 0...h, 1 ] = ( m * v )[0] + w / 2, ( m * v )[1] + h / 2 img.warp_clipped( warp ).display # -------------------------------------------------------------------- 35/44
  • 36. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Center Of Gravity And Principal Components 36/44
  • 37. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Linear Shift-Invariant Filters Input Image Sharpen Gaussian Blur Gauss-Gradient (X) Gauss-Gradient (Y) 37/44
  • 38. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Edge- And Corner-Images Input Image Sobel Gauss-Gradient Harris-Stephens Kanade-Lucas-Tomasi 38/44
  • 39. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby (Inverse Compositional) Lucas-Kanade given: template T , image I , previous pose p sought: pose-change ∆p argmin ||T (x) − I(W p (W∆p (x)))||2 dx = (∗) −1 −1 −1 −1 ∆p x∈T I(W p (W∆p (x))) (1) T (x) − I(W p (W∆p (x))) = T (W∆p (x)) − I(W p (x)) −1 −1 −1 −1 I(W p (x)) δT T δW p (2) T (W∆p (x)) ≈ T (x) + (x) · (x) · ∆p δx δp (1,2) (∗) = argmin(||H p + b||2 ) = (H T H)−1 H T b p  h1,1 h1,2 · · ·    b1  T (x)                 where H =  2,1 h2,2 · · ·  and b = b    h            2      . . . ..       . .  . .     . . .     δT T δW p hi, j = (xi ) · (xi ) , bi = T (xi ) − I(W p (xi )) −1 δx δp j S. Baker and I. Matthew: “Lucas-Kanade 20 years on: a unifying framework” http://www.ri.cmu.edu/projects/project 515.html 39/44
  • 40. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby 3 Degrees-Of-Freedom Lucas-Kanade Initialisation p = Vector[ xshift, yshift, rotation ] w, h, sigma = tpl.shape[0], tpl.shape[1], 5.0 x, y = xramp( w, h ), yramp( w, h ) gx = tpl.gauss_gradient_x( sigma ) gy = tpl.gauss_gradient_y( sigma ) c = Matrix[ [ 1, 0 ], [ 0, 1 ], [ -y, x ] ] * Vector[ gx, gy ] hs = ( c * c.covector ).collect { |e| e.sum } Tracking field = MultiArray.new( MultiArray::SFLOAT, w, h, 2 ) field[ 0...w, 0...h, 0 ] = x * cos( p[2] ) - y * sin( p[2] ) + p[0] field[ 0...w, 0...h, 1 ] = x * sin( p[2] ) + y * cos( p[2] ) + p[1] diff = img.warp_clipped_interpolate( field ) - tpl s = c.collect { |e| ( e * diff ).sum } d = hs.inverse * s p += Matrix[ [ cos(p[2]), -sin(p[2]), 0 ], [ sin(p[2]), cos(p[2]), 0 ], [ 0, 0, 1 ] ] * d 40/44
  • 41. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Interactive Presentation Software 41/44
  • 42. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Current/Future Work • feature extraction – multiresolution Lucas-Kanade – wavelet-based features • feature descriptors – appearance templates • feature based object recognition – geometric hashing – RANSAC • feature based tracking – bounded hough transform • parallel processing No high-level code in C++! 42/44
  • 43. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Appeal Computer vision only will happen if we ... • break with business as usual • remove all barriers to collaboration • allow users and developers to innovate • need fully hackable hardware • fight for a free software stack 43/44
  • 44. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Conclusion http://rubyforge.org/projects/hornetseye/ Let’s do it! 44/44