Computer Vision for Fun and Profit
Image Processing: The Lowdown
Steven Mitchell, Ph.D.
Componica, LLC
Copyright 2011 - Componica, LLC (http://www.componica.com/)
What does Componica have to offer?
Decades of Experience and Expertise 

A large library of code generated internally 

A community of innovative Computer Engineers and Programmers

Access to academic papers and a history of effectively applying archived
research

libSeal - A long-term project to take previously written code and turn it into a
single library:

SEAL - Steve's Evil (or Eclectic) Algorithms Library
Image Processing
Any sort of signal processing done to an image

The Acquisition of an image

Compression and storage of an image

Enhancement and restoration

Registration of an image to another.

Measurement of data from the image (height of
building, speed of car)

Interpretation and Recognition of objects.
Getting the Images in the First Place
Standard image files:

GIF, PNG, JPG: Standard. I tend to always use PNGs.

TIFF, PDF: More challenging; these require an external library.

Video files:

FFmpeg: Without it, reading video files would be a nightmare. But if it can't read a video
file, I can't read that video file.

Web Cam: OpenCV has a simple way of doing it. Outside OpenCV, there are scattered
libraries

Mobile Devices

Scanner

MRI and CT scanners: Use a file format called DICOM
A Menu of Tools:
Image Enhancement - Computer...uncrop and enhance!
Image Segmentation - These pixels belong to this, those pixels belong to
that.
Image Registration - Line this image up with that image.
Object Recognition - This is an image of a frog and that's an image of a
cheeseburger.
Image Compression - Crush this image and make sure the process is undoable.
(Not covered...there are plenty of free libraries to do this.)
Image Enhancement
Simple Pixel Stuff:
Normalizing brightness and contrast.

Gamma correction (the non-linear
effects of TV)

Histogram equalization - maximize the
global contrast. 

Color Correction...color temperature
and tint. White balancing.
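The pixel-level operations above fit in a few lines each. Here is a minimal sketch for 8-bit grayscale values held in a flat list. The function names are made up for illustration, and the gamma convention (exponent 1/gamma, so gamma > 1 brightens midtones) is an assumption, since conventions differ between tools.

```python
def gamma_correct(pixels, gamma):
    """Apply out = 255 * (in / 255) ** (1 / gamma) to 8-bit gray values."""
    return [round(255 * (p / 255) ** (1 / gamma)) for p in pixels]


def equalize_histogram(pixels):
    """Spread gray levels so the cumulative histogram becomes roughly linear."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: how many pixels are at or below each level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return list(pixels)  # flat image, nothing to stretch

    # Lookup table mapping the darkest used level to 0 and the brightest to 255.
    lut = [max(0, round(255 * (c - cdf_min) / (n - cdf_min))) for c in cdf]
    return [lut[p] for p in pixels]
```

A low-contrast input such as `[100, 100, 110, 120]` comes out stretched over the full 0 to 255 range.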
Geometric Stuff:
Interpolation. Make the image bigger or smaller. This is not as easy as it sounds if you
want non-aliased results. Trips up web developers whenever they try to roll their own
thumbnail generator.

Warping an image from one geometry to another.

Simple rotation, scale, and translation. You need two or three landmark points.

Perspective (aka projective or homography) transforms. You need four landmark
points.

Lens distortions. Images pinch or barrel out in a camera lens. You can calibrate and
correct for that with enough landmark points.

General warping. Typically used for image morphing in special effects.
Image Enhancement - Geometry
Image Enhancement - Interpolation
[Figure: originals with upsampling and downsampling results, AMATEUR vs. PRO]
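Why do homemade thumbnail generators alias? A rough sketch of the difference, assuming a grayscale image stored as a list of rows and an integer shrink factor. Both function names are hypothetical, and a box filter is only the simplest anti-aliasing choice, not necessarily what produced the "pro" result on the slide.

```python
def downsample_naive(img, factor):
    """Keep every factor-th pixel. Fast, but fine detail aliases."""
    return [row[::factor] for row in img[::factor]]


def downsample_box(img, factor):
    """Average each factor x factor block (box filter) before shrinking."""
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for by in range(h):
        row = []
        for bx in range(w):
            block = [img[by * factor + dy][bx * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

On an image of one-pixel-wide black and white stripes, the naive version returns solid black while the box filter returns the mid-gray you would expect to see from a distance.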
Image Enhancement - Geometry
ORIGINAL IMAGE PERSPECTIVE CORRECTED
The original image is warped to a perspective corrected version.

The four black dots indicate the landmark points used to normalize the image
to this artificial view.

The black regions are areas that fall outside the original image. Unknown data.
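A perspective correction from four landmarks boils down to solving eight linear equations for the eight unknowns of a 3x3 matrix (the ninth entry is fixed at 1). This is a minimal pure-Python sketch of that step, not a claim about how the image above was produced. The names `solve_linear`, `homography_from_points` and `apply_homography` are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("landmark points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """3x3 matrix mapping four (x, y) source landmarks onto four targets."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, x, y):
    """Map one point, dividing by the third homogeneous coordinate."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

To warp a whole image you would normally compute the transform from output to input and sample the original at each output pixel. Output pixels that land outside the original are the black "unknown data" regions.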
Noise removal. Median filter, average filter, etc.

De-blurring. You estimate the cause of the blurring
and then undo it using deconvolution.

Motion blur - Estimate the camera movements
and undo them.

Focus blur - Estimate the lens blur and undo it.
Image Enhancement
http://cg.postech.ac.kr/research/fast_motion_deblurring/
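Of the noise filters mentioned, the median filter is the easiest to show. A small sketch for a grayscale image stored as a list of rows; the function name and the border handling (windows are simply clipped at the image edge) are assumptions made for illustration.

```python
def median_filter(img, radius=1):
    """Replace each pixel by the median of its (2 * radius + 1)^2 neighborhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            )
            # Middle element (upper median when the clipped window is even).
            row.append(window[len(window) // 2])
        out.append(row)
    return out
```

Unlike an average filter, a single hot pixel does not smear into its neighbors. It just disappears.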
Divide an image into known parts:

It's not quite object recognition because images are typically interpreted
on a pixel basis using edge detection or colors.

Sometimes it's good enough because you just care about the borders,
not content

Border of tumor vs. healthy tissue.

The bright red ball in the color picture.

Is this pixel part of a letter 'q' or paper?

Often the first step to a bigger solution.
Image Segmentation
Make a decision based on a single pixel:

Simple thresholding - is this pixel darker than 160?
Slightly better - is this pixel red?
Better still - is this pixel statistically more likely
to be paper or letter based on its RGB value?
The works - I've modeled the unevenness of the
lighting on this paper and made a statistical model of
RGB: is this damn pixel ink or paper?
The downfall is that you're looking at single pixels
without understanding how their neighbors relate to them, missing
the whole picture. Sometimes it works. Fast and easy to
do.
Image Segmentation - Pixel Classes
[Figure panels: ORIGINAL, SIMPLE THRESHOLD, ADAPTIVE BINARIZATION]
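To make the simple-versus-adaptive contrast concrete, here is a hedged sketch. The adaptive rule used (a pixel is ink if it is darker than its local mean minus an offset) is one common choice and only an assumption about what the slide's "adaptive binarization" does. The function names and default values are illustrative.

```python
def simple_threshold(img, level=160):
    """1 = ink if the pixel is darker than a single global level."""
    return [[1 if p < level else 0 for p in row] for row in img]


def adaptive_threshold(img, radius=2, offset=20):
    """1 = ink if the pixel is clearly darker than its local neighborhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - radius), min(h, y + radius + 1))
                      for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            row.append(1 if img[y][x] < local_mean - offset else 0)
        out.append(row)
    return out
```

Usage on one scan line where the right half of the page is in shadow:

```python
line = [[200, 200, 120, 200, 200, 150, 150, 70, 150, 150]]
simple_threshold(line)    # marks the whole shadowed half as ink
adaptive_threshold(line)  # marks only the two dark strokes
```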
Minimizing or maximizing a path which borders between
elements. This one is a common technique, with many variants:

Assign a cost to all the pixels. For example, edge detection
- it will cost me a lot to cross an edge. Or color transition, it
will cost me to cross to a different color.

Use an optimization technique from classic data structures
(typically used in graph theory if you still remember) to
compute the cheapest path from one location of the image
to a different location.

Dynamic Programming - Strange name but all it means is
compute the cheapest path from one side of an image to
another. Works best for paths that tend to be linear.

Minimum Graph Cut - Find the cheapest way to split the
image in two regions. This works well for circular paths and
3D. Tends to be much more complicated than dynamic
programming and slower.

Check out this: http://www.youtube.com/watch?v=6NcIJXTlugc
Image Segmentation - Least Cost Path
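The dynamic programming variant is short enough to sketch. This version assumes a per-pixel cost grid (list of rows) and finds the cheapest left-to-right path that visits one pixel per column and moves at most one row up or down per step. That movement rule and the function name are assumptions chosen for illustration.

```python
def cheapest_path(cost):
    """Return (total_cost, rows) of the cheapest left-to-right path."""
    h, w = len(cost), len(cost[0])
    # total[y][x] = cheapest cost of any path ending at row y, column x.
    total = [[cost[y][0]] + [0] * (w - 1) for y in range(h)]
    back = [[0] * w for _ in range(h)]

    for x in range(1, w):
        for y in range(h):
            candidates = range(max(0, y - 1), min(h, y + 2))
            prev = min(candidates, key=lambda r: total[r][x - 1])
            total[y][x] = cost[y][x] + total[prev][x - 1]
            back[y][x] = prev

    # Pick the best end point, then walk the back-pointers to the left edge.
    y = min(range(h), key=lambda r: total[r][w - 1])
    best = total[y][w - 1]
    path = [y]
    for x in range(w - 1, 0, -1):
        y = back[y][x]
        path.append(y)
    path.reverse()
    return best, path
```

Each pixel is visited once, so the cost is linear in the number of pixels. That is why it works best for roughly linear borders: a path that doubles back on itself cannot be expressed as one row per column.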
Random Decision Forests applied to pixels
Play twenty questions with a pixel and its
surrounding neighbors to create decision trees
that guess what this pixel belongs to.

Have a number of these decision trees (a forest)
and aggregate the results. Strangely, it tends to
work in many cases.
Image Segmentation - Bleeding Edge
I have a source image that I can transform (move, resize,
rotate, or bend). Make it best fit a target image.

Translation Only - Shift the source image until it best
fits the target.

Similarity Transform - Adjust the scale, rotation, and
translation until the source image overlaps the target.

Perspective Transform - Move the source image's four
corners until it matches the target in a perspective
manner.

Non-Rigid Warping - The source image is on a rubber
sheet. Warp it onto the target image.

Obvious Applications:

Augmented Reality

Image Stitching

Object Detection
Image Registration
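The translation-only case can be brute-forced, which makes it a good first illustration of "make it best fit". This sketch tries every integer shift within a small range and keeps the one with the lowest mean squared difference over the overlapping pixels. The function name, the error measure and the sign convention (target[y][x] is compared with source[y - dy][x - dx]) are assumptions.

```python
def best_translation(source, target, max_shift):
    """Return the integer (dx, dy) that best lines source up with target."""
    h, w = len(target), len(target[0])
    sh, sw = len(source), len(source[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, n = 0, 0
            for y in range(h):
                for x in range(w):
                    sy, sx = y - dy, x - dx
                    if 0 <= sy < sh and 0 <= sx < sw:
                        err += (source[sy][sx] - target[y][x]) ** 2
                        n += 1
            # Mean error over the overlap, so large shifts are not favored
            # merely for comparing fewer pixels.
            if n and (best is None or err / n < best[0]):
                best = (err / n, dx, dy)
    return best[1], best[2]
```

The search space grows quickly once scale, rotation and perspective are added, which is why the feature-point methods that follow exist.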
The most common way to register an image is the following:

Find the most interesting points on the two images (usually
blobs and corners).

Scale-invariant feature transform - SIFT (Patented and
slow)

Speeded Up Robust Features - SURF (Recently Patented
and fast)

Features from Accelerated Segment Test - FAST (Very
fast but noisy)

With SURF and SIFT you get a position, an angle, a scale
factor and a descriptor vector to compare (64 elements for
SURF, 128 for standard SIFT). With FAST
you get a position and maybe a rotation.

Compare all the interesting points from one image to the other
forming matching pairs of points between images. Naive
implementations are SLOW.

Use RANSAC to find a consensus (next page).
Image Registration - Interesting Points
SURF
FAST
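The naive, slow matching step looks roughly like this: compare every descriptor in one image against every descriptor in the other. The ratio test used to drop ambiguous matches is an addition not mentioned on the slide (it comes from the SIFT literature), and the function name and the 0.8 default are illustrative assumptions.

```python
def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Brute-force matching: returns (index_in_a, index_in_b) pairs.

    desc_b must hold at least two descriptors so the ratio test can run.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Squared Euclidean distance from descriptor a to every one in b.
        dists = sorted(
            (sum((p - q) ** 2 for p, q in zip(a, b)), j)
            for j, b in enumerate(desc_b)
        )
        (best, j), (second, _) = dists[0], dists[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best < (ratio ** 2) * second:
            matches.append((i, j))
    return matches
```

This is O(N x M) distance computations, which is exactly why naive implementations are slow. Real systems usually swap the inner loop for an approximate nearest-neighbor index.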
Matching points locally doesn't yield global matches.

RANdom Sample Consensus - RANSAC - prunes away the
mismatches and computes the transform that converts the
source image to the target.

It works as follows:

Most transforms can be described by a minimal number of
points. For similarity transforms it's two points, for
perspective transforms it's four points.

Pick two (or four) matched pairs of points at random.

Compute a transform from those two (or four) sets of points.

Transform all the source points using this transform and see
how many points are close to the target.

Repeat this and keep the best transform that matches the
most points.
Image Registration - Interesting Points
Here the green lines indicate pairs of matched points
that fit the transform (looks like a similarity transform)
and the red lines are matched pairs that failed to fit this
transform and were therefore rejected.
RANSAC may seem ad hoc, but it works surprisingly
well.
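The RANSAC loop is easier to trust once you see how little there is to it. This sketch handles the similarity-transform case and represents each 2D point as a complex number x + yj, so the whole transform is q = a * p + b with complex a (scale and rotation) and b (translation). That representation, the function names, the iteration count and the pixel tolerance are all assumptions made to keep the example short.

```python
import random


def similarity_from_pairs(p0, q0, p1, q1):
    """Complex (a, b) such that q = a * p + b holds for both matched pairs."""
    a = (q1 - q0) / (p1 - p0)
    return a, q0 - a * p0


def ransac_similarity(src, dst, iterations=200, tolerance=2.0, seed=0):
    """Return ((a, b), inlier_indices) for matched point lists src -> dst."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: two matched pairs define a similarity transform.
        i, j = rng.sample(range(len(src)), 2)
        if src[i] == src[j]:
            continue  # degenerate sample, cannot define a transform
        a, b = similarity_from_pairs(src[i], dst[i], src[j], dst[j])
        # Count how many matches this candidate transform explains.
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

In practice you would refit the transform to all inliers at the end (least squares) rather than keep the two-point estimate.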
A very accurate way to line up a template onto an
image is to use derivatives and linear algebra.

Works very well, but only when the images are very close
to each other in the first place.

It's usually the polishing step after using an
Interesting Points / RANSAC method.

Known as Lucas-Kanade Tracker -or- Baker-Matthews
Tracker.

It's the secret sauce to how the "Predator" algorithm
works.
Image Registration - The Mathy Way
http://www.wedesoft.demon.co.uk/lucas-kanade-tracker.html
Making computers recognize objects in images

Many different algorithms, and each algorithm has appropriate uses
depending on the objects being detected. For example:

AdaBoost - Awesome at detecting and locating faces, sucks at
recognizing whose face it is.

Deep Neural Networks - Work great for highly distorted text, but too
slow for generalized OCR.

Active Appearance Models - Work great for both recognition and
segmentation, but only on generally fixed shapes like faces and hearts.
Suck for livers and cancer.

Most recognizers depend on training the algorithm on known objects offline
and then testing. Which brings up the topic of data...
Object Recognition
The data can be the most valuable part of a
trainable system as many algorithms will
generally function with somewhat similar hit rates.

Often ideas fail to take into account where the
data comes from. It's the killer of many ideas.

The basics of training something:

Data is normally split into a training and
testing set. You train a thingy with the
training set and test how well it works with
the testing set.

Why? Most trainable thingies are prone to
overfitting. Splitting the data into two sets
guards against this, because you use the
test set to know when to stop training.

Disadvantage: you effectively need twice as
much data. Sucks.
Object Recognition - It's all about the Data
It's obvious this data is best represented as a line, but
if the model over-fits the data, it may compute a
relationship that's nonsensical.
As your algorithm learns from a training set (blue line),
the error decreases for that set, but in the testing set,
it will hit a point where overfitting is happening and
will increase the overall error in the real world. You stop
training at the point the testing set starts getting worse.
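The split-and-stop recipe can be sketched without committing to any particular learner. The two helpers below are hypothetical. The 50/50 default mirrors the "twice as much data" remark, and the `patience` parameter (how many non-improving epochs to tolerate before stopping) is an added assumption, since real error curves are noisy.

```python
import random


def split_train_test(samples, test_fraction=0.5, seed=0):
    """Shuffle a copy of the samples and cut it into (train, test)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def early_stopping_epoch(test_errors, patience=2):
    """Epoch whose model to keep, given the test error after each epoch.

    Training is considered stopped once the test error has failed to
    improve for `patience` epochs in a row.
    """
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(test_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            break  # the test set keeps getting worse: stop here
    return best_epoch
```

With test errors of `[0.9, 0.6, 0.4, 0.35, 0.37, 0.41]` the function picks epoch 3, the bottom of the curve, even though the training error would keep falling after that.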
Steve's crappy breakdown of object recognition algorithms:

A ship of fools approach - Armies of stupid algorithms that together become smart. Kinda like
democracy sort of...not really.

Let's create a brain, Igor - So what happens if you simulate brain tissue? It's grown quite a bit
since the neural network hype in the late 80s / early 90s. Fun Fact: This will eventually kill us all in a
bloody uprising.

If the shoe fits... - Well, if this template Starbucks logo fits somewhere on my image using image
registration as described above, then I'm guessing this Starbucks logo is present in the image.
Duh.

Find Features to Filter and Fit in a Feed-forward Fashion - You see this pattern all the time and
it lacks creativity. Find interesting features in the image, and feed them into a trainable function
like a neural network, a non-linear regression, or a support vector machine. Boring.

I've stared at a wall for 20 minutes now and I think everything out there is either one or a combination of
these general ideas. Hmm...That's it? Disappointed.
Object Recognition - The Menu
End of Part One...

More Related Content

Viewers also liked

Information management
Information managementInformation management
Information managementLorie Lynne
Ā 
Step by-step compsressor Selection and sizing
Step by-step compsressor Selection and sizingStep by-step compsressor Selection and sizing
Step by-step compsressor Selection and sizingtantoy13
Ā 
Beep...Destroy All Humans!
Beep...Destroy All Humans!Beep...Destroy All Humans!
Beep...Destroy All Humans!Componica LLC
Ā 
Binary Features for Object Detection and Landmarking
Binary Features for Object Detection and LandmarkingBinary Features for Object Detection and Landmarking
Binary Features for Object Detection and LandmarkingComponica LLC
Ā 
General knowledge
General knowledgeGeneral knowledge
General knowledgeBelindaB83
Ā 
Introduction to Computer Vision
Introduction to Computer VisionIntroduction to Computer Vision
Introduction to Computer VisionComponica LLC
Ā 

Viewers also liked (7)

ŁˆŲ­ŲÆŲ© Ų§Ł„ŁŁ‚Ł‡ Ų§Ł„Ų§Ų³Ł„Ų§Ł…ŁŠ Ł„Ł„ŲµŁ Ų§Ł„ŲŖŲ§Ų³Ų¹
ŁˆŲ­ŲÆŲ© Ų§Ł„ŁŁ‚Ł‡ Ų§Ł„Ų§Ų³Ł„Ų§Ł…ŁŠ Ł„Ł„ŲµŁ Ų§Ł„ŲŖŲ§Ų³Ų¹ŁˆŲ­ŲÆŲ© Ų§Ł„ŁŁ‚Ł‡ Ų§Ł„Ų§Ų³Ł„Ų§Ł…ŁŠ Ł„Ł„ŲµŁ Ų§Ł„ŲŖŲ§Ų³Ų¹
ŁˆŲ­ŲÆŲ© Ų§Ł„ŁŁ‚Ł‡ Ų§Ł„Ų§Ų³Ł„Ų§Ł…ŁŠ Ł„Ł„ŲµŁ Ų§Ł„ŲŖŲ§Ų³Ų¹
Ā 
Information management
Information managementInformation management
Information management
Ā 
Step by-step compsressor Selection and sizing
Step by-step compsressor Selection and sizingStep by-step compsressor Selection and sizing
Step by-step compsressor Selection and sizing
Ā 
Beep...Destroy All Humans!
Beep...Destroy All Humans!Beep...Destroy All Humans!
Beep...Destroy All Humans!
Ā 
Binary Features for Object Detection and Landmarking
Binary Features for Object Detection and LandmarkingBinary Features for Object Detection and Landmarking
Binary Features for Object Detection and Landmarking
Ā 
General knowledge
General knowledgeGeneral knowledge
General knowledge
Ā 
Introduction to Computer Vision
Introduction to Computer VisionIntroduction to Computer Vision
Introduction to Computer Vision
Ā 

Recently uploaded

Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot ModelDeepika Singh
Ā 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
Ā 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
Ā 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
Ā 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vƔzquez
Ā 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
Ā 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
Ā 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
Ā 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
Ā 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
Ā 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
Ā 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
Ā 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
Ā 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
Ā 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
Ā 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
Ā 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
Ā 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
Ā 

Recently uploaded (20)

Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls šŸ„° 8617370543 Service Offer VIP Hot Model
Ā 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
Ā 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Ā 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
Ā 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
Ā 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Ā 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Ā 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Ā 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
Ā 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
Ā 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Ā 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Ā 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
Ā 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
Ā 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
Ā 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Ā 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
Ā 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
Ā 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
Ā 

Computer Vision For Fun and Profit

  • 1. Computer Vision for Fun and Proļ¬t Image Processing: The Lowdown ! ! Steven Mitchell, Ph.D. Componica, LLC
  • 2. Copyright 2011 - Componica, LLC (http://www.componica.com/) What does Componica have to offer? Decades of Experience and Expertise A large library of code generated internally A community of innovative Computer Engineers and Programmers Access to academic papers and a history of eļ¬€ectively applying archived research libSeal - A long term project to take previously written code and turn it into a single library: SEAL - Steve's Evil (or Eclectic) Algorithms Library
  • 3. Copyright 2011 - Componica, LLC (http://www.componica.com/) Image Processing Any sort of signal processing done to an image The Acquisition of an image Compression and storage of an image Enhancement and restoration Registration of an image to another. Measurement of data from the image (height of building, speed of car) Interpretation and Recognition of objects.
  • 4. Copyright 2011 - Componica, LLC (http://www.componica.com/) Getting the Images in the First Place Standard image ļ¬les: GIF, PNG, JPG: Standard. I tend to always use PNGs TIFF, PDF: More challenging, requires an external library Video ļ¬les: FFMpeg: Without it, it would be a nightmare to read video ļ¬les but, if it can't read a video ļ¬le, I can't read that video ļ¬le. Web Cam: OpenCV has a simple way of doing it. Outside OpenCV, there are scattered libraries Mobile Devices Scanner MRI and CT scanners: Use a ļ¬le format called DICOM
  • 5. Copyright 2011 - Componica, LLC (http://www.componica.com/) A Menu of Tools: Image Enhancement - Computer...uncrop and enhance! Image Segmentation - These pixels belong to this, those pixels belong to that. Image Registration - Line this image up with that image. Object Recognition - This is a image of a frog and that's an image of a cheeseburger. Image Compression - Crush this image and make sure the process is undo- able. (Not Covered...there are plenty of free libraries to do this.)
  • 6. Copyright 2011 - Componica, LLC (http://www.componica.com/) Image Enhancement Simple Pixel Stuļ¬€: Normalizing brightness and contrast. Gamma correction (the non-linear eļ¬€ects of TV) Histogram equalization - maximize the global contrast. Color Correction...color temperature and tint. White balancing.
  • 7. Copyright 2011 - Componica, LLC (http://www.componica.com/) Geometric Stuļ¬€: Interpolation. Make the image bigger or smaller. This is not as easy as it sounds if you want non-aliased results. Trips up web developers whenever they try to roll their own thumbnail generator. Warping an image from one geometry to another. Simple rotation, scale, and translation. You need two or three landmark points. Perspective (aka Projection or Homogeneous) transforms. You need four landmark points. Lens distortions. Images pinch or barrel out in camera lens. You can calibrate and correct for that with enough landmark points. General warping. Typically used for image morphing in special eļ¬€ects. Image Enhancement - Geometry
  • 8. Copyright 2011 - Componica, LLC (http://www.componica.com/) Image Enhancement - Interpolation AMATEUR PRO UPSAMPLING DOWNSAMPLINGORIGINALS
  • 9. Copyright 2011 - Componica, LLC (http://www.componica.com/) Image Enhancement - Geometry ORIGINAL IMAGE PERSPECTIVE CORRECTED The original image is warped to a perspective corrected version. The four black dots indicated the landmark points used to normalize the image to this artiļ¬cial view. The black regions are areas that falls outside the original image. Unknown data.
  • 10. Copyright 2011 - Componica, LLC (http://www.componica.com/) Noise removing. Median ļ¬lter, average ļ¬lter, etc. De-blurring. You estimate the cause of the blurring and then undo it using deconvolution. Motion blur - Estimate the camera movements and undo it. Focus blur - Estimate the lens blur and undo it. Image Enhancement http://cg.postech.ac.kr/research/fast_motion_deblurring/
  • 11. Copyright 2011 - Componica, LLC (http://www.componica.com/) Divide an image into known parts: It's not quite object recognition because images are typically interpreted on a pixel basis using edge detection or colors. Sometimes it's good enough because you just care about the borders, not content Border of tumor vs. healthy tissue. The bright red ball in the color picture. Is this pixel part of a letter ā€˜qā€™ or paper? Often the ļ¬rst step to a bigger solution. Image Segmentation
  • 12. Copyright 2011 - Componica, LLC (http://www.componica.com/) Make a decision based on a single pixel: Simple thresholding - is this pixel darker than 160? Slightly better - is this pixel red? Even more better - is this pixel statistically more likely to be paper or letter based on it's RGB value. The works - I'm modeling the uneven-ness of the lighting on this paper and made a statistical model of RGB, is this damn pixel ink or paper? The downfall of this is you're looking at single pixels without understand how it's neighbors relate to it, missing the whole picture. Sometimes it works. Fast and easy to do. Image Segmentation - Pixel Classes ORIGINAL SIMPLE THRESHOLD ADAPTIVE BINARIZATION
  • 13. Copyright 2011 - Componica, LLC (http://www.componica.com/) Minimizing or maximizing a path which borders between elements. This one is a common technique, with many variants: Assign a cost to all the pixels. For example, edge detection - it will cost me a lot to cross an edge. Or color transition, it will cost me to cross to a diļ¬€erent color. Use an optimization technique from classic data structures (typically used in graph theory if you still remember) to compute the cheapest path from one location of the image to a diļ¬€erent location. Dynamic Programming - Strange name but all it means is compute the cheapest path from one side of an image to another. Works best for paths that tend to be linear. Minimum Graph Cut - Find the cheapest way to split the image in two regions. This works well for circular paths and 3D. Tends to be much more complicated than dynamic programming and slower. Check out this: http://www.youtube.com/watch?v=6NcIJXTlugc Image Segmentation - Least Cost Path
  • 14. Copyright 2011 - Componica, LLC (http://www.componica.com/) Random Decision Forests applied to pixels Apply twenty questions to pixel and it's surrounding neighbors to create decision trees to guess what this pixel is over. Have a number of these decision trees (forest) and aggregate the results. Strangely it tends to works in many cases. Image Segmentation - Bleeding Edge
  • 15. Copyright 2011 - Componica, LLC (http://www.componica.com/) I have a source image that I can transform (move, resize, rotate, or bend). Make it best ļ¬t a target image. Translation Only - Shift the source image until it best ļ¬ts the target. Similarity Transform - Adjust the scale, rotation, and translation until the source image overlaps the target. Perspective Transform - Move the source imageā€™s four corners until it matches the target in a perspective manner. Non-Rigid Warping - The source image is on a rubber sheet. Warp it onto the target image. Obvious Applications: Augmented Reality Image Stitching Object Detection Image Registration
  • 16. Copyright 2011 - Componica, LLC (http://www.componica.com/) The most common way to register as image is the following: Find the most interesting points on the two images (usually blobs and corners). Scale-invariant feature transform - SIFT (Patented and slow) Speeded Up Robust Features - SURF (Recently Patented and fast) Features from Accelerated Segment Test - FAST (Very fast but noisy) With SURF and SIFT you get a position, an angle, scale factor and a 64 element vector to compare. With FAST you get a position and maybe a rotation. Compare all the interesting points from one image to the other forming matching pairs of points between images. Naive implementations are SLOW. Use RANSAC to ļ¬nd a consensus (next page). Image Registration - Interesting Points SURF FAST
  • 17. Matching points locally doesn't yield global matches. RANdom SAmple Consensus - RANSAC - prunes away the mismatches and computes the transform that converts the source image to the target. It works as follows: Most transforms can be described by a minimal number of points. For similarity transforms it's two points, for perspective transforms it's four points. Pick two (or four) matched pairs of points at random. Compute a transform from those two (or four) pairs of points. Transform all the source points using this transform and see how many land close to their target points. Repeat this and keep the transform that matches the most points. Image Registration - Interesting Points Here the green lines indicate pairs of matched points that fit the transform (looks like a similarity transform) and the red lines are matched pairs that failed to fit this transform and were therefore rejected. RANSAC may seem ad hoc, but it works surprisingly well.
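The RANSAC loop described on this slide can be sketched for the similarity-transform case. The sketch leans on a convenient trick that is an assumption of this example, not something from the talk: a 2D point is stored as a complex number, so a similarity transform becomes `z -> a*z + b`, and two matched pairs are enough to solve for `a` and `b`. The function names, the iteration count, the tolerance, and the synthetic points are all illustrative.

```python
import cmath
import math
import random


def fit_similarity(src_pair, dst_pair):
    """Solve z -> a*z + b from two matched pairs of complex points."""
    (s0, s1), (d0, d1) = src_pair, dst_pair
    if s0 == s1:
        return None  # degenerate sample, no unique transform
    a = (d1 - d0) / (s1 - s0)
    b = d0 - a * s0
    return a, b


def ransac_similarity(src, dst, iterations=200, tolerance=1.0, seed=0):
    """Keep the two-point transform that agrees with the most matches."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        i, j = rng.sample(range(len(src)), 2)
        model = fit_similarity((src[i], src[j]), (dst[i], dst[j]))
        if model is None:
            continue
        a, b = model
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic matches: six correct pairs and two deliberate mismatches.
true_a = cmath.rect(2.0, math.radians(30))  # scale 2, rotate 30 degrees
true_b = complex(10, 5)                     # translation
src = [complex(x, y) for x, y in
       [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2), (3, 8), (7, 7), (1, 4)]]
dst = [true_a * z + true_b for z in src]
dst[6] += complex(40, -25)  # mismatch
dst[7] += complex(-30, 35)  # mismatch

(found_a, found_b), inliers = ransac_similarity(src, dst)
print(inliers)
```

A perspective transform would follow the same loop with four pairs per sample and a homography solver in place of `fit_similarity`.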
  • 18. A very accurate way to line up a template onto an image is to use derivatives and linear algebra. It works well only when the images are already very close to each other. It's usually the polishing step after using an Interesting Points / RANSAC method. Known as the Lucas-Kanade Tracker -or- the Baker-Matthews Tracker. It's the secret sauce behind how the "Predator" algorithm works. Image Registration - The Mathy Way http://www.wedesoft.demon.co.uk/lucas-kanade-tracker.html
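A stripped-down sketch of the "derivatives and linear algebra" idea, restricted to pure translation. Several things here are assumptions made to keep the example short: the images are smooth analytic blobs (so no pixel interpolation is needed), the gradients come from central differences, and the update is the plain forward-additive Gauss-Newton step rather than the inverse compositional variant associated with Baker and Matthews.

```python
import math


def blob(x, y, cx, cy, sigma=6.0):
    """A smooth synthetic image: a Gaussian bump centered at (cx, cy)."""
    return math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))


def estimate_shift(template, image, size, iterations=20, eps=0.5):
    """Find (dx, dy) such that image(x + dx, y + dy) matches template(x, y)."""
    dx = dy = 0.0
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations H * step = g over the window.
        h_xx = h_xy = h_yy = g_x = g_y = 0.0
        for y in range(size):
            for x in range(size):
                wx, wy = x + dx, y + dy
                # Image derivatives at the currently shifted position.
                ix = (image(wx + eps, wy) - image(wx - eps, wy)) / (2 * eps)
                iy = (image(wx, wy + eps) - image(wx, wy - eps)) / (2 * eps)
                err = template(x, y) - image(wx, wy)
                h_xx += ix * ix
                h_xy += ix * iy
                h_yy += iy * iy
                g_x += ix * err
                g_y += iy * err
        det = h_xx * h_yy - h_xy * h_xy
        if abs(det) < 1e-12:
            break  # no usable gradient information
        step_x = (h_yy * g_x - h_xy * g_y) / det
        step_y = (h_xx * g_y - h_xy * g_x) / det
        dx += step_x
        dy += step_y
        if abs(step_x) + abs(step_y) < 1e-6:
            break
    return dx, dy


def template(x, y):
    return blob(x, y, 16.0, 16.0)


def image(x, y):
    # Same blob, moved by (1.5, -1.0) relative to the template.
    return blob(x, y, 17.5, 15.0)


dx, dy = estimate_shift(template, image, size=32)
print(round(dx, 3), round(dy, 3))
```

The linear approximation only holds for small offsets, which is why the slide treats this as a polishing step after a coarser method.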
  • 19. Making computers recognize objects in images. There are many different algorithms, and each has appropriate uses depending on the objects being detected. For example: AdaBoost - Awesome at detecting and locating faces, sucks at recognizing whose face it is. Deep Neural Networks - Work great for highly distorted text, but too slow for generalized OCR. Active Appearance Models - Work great for both recognition and segmentation, but only on generally fixed shapes like faces and hearts. Suck for livers and cancer. Most recognizers depend on training the algorithm on known objects offline and then testing. Which brings up the topic of data... Object Recognition
  • 20. The data can be the most valuable part of a trainable system, as many algorithms will generally function with somewhat similar hit rates. Often ideas fail to take into account where the data comes from. It's the killer of many ideas. The basics of training something: Data is normally split into a training set and a testing set. You train a thingy with the training set and test how well it works with the testing set. Why? Most trainable thingies are prone to overfitting. Splitting the data into two sets guards against this problem because you use the test set to know when to stop training. The disadvantage: you effectively need twice as much data. Sucks. Object Recognition - It's all about the Data It's obvious this data is best represented as a line, but if the model overfits the data, it may compute a relationship that's nonsensical. As your algorithm learns from a training set (blue line), the error decreases for that set, but in the testing set, it will hit a point where overfitting is happening and the overall error in the real world will increase. You stop training at the point the testing set starts getting worse.
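The split-then-stop recipe on this slide can be sketched with a deliberately over-flexible model: a degree-9 polynomial fitted by gradient descent to noisy data that really lies on a line. Everything specific here is an assumption for illustration: the data generator, the learning rate, the `patience` of 50 epochs, and the function names. Many texts call a held-out set used this way a validation set.

```python
import random


def features(x, degree=9):
    """Polynomial features 1, x, x^2, ..., x^degree."""
    return [x ** k for k in range(degree + 1)]


def predict(weights, x):
    return sum(w * f for w, f in zip(weights, features(x)))


def mean_squared_error(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, test, max_epochs=2000, rate=0.05,
                              patience=50):
    """Gradient descent on `train`; remember the weights that did best on `test`."""
    weights = [0.0] * len(features(0.0))
    best_weights, best_error, best_epoch = list(weights), float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        # One full-batch gradient step using the training set only.
        grad = [0.0] * len(weights)
        for x, y in train:
            residual = predict(weights, x) - y
            for k, f in enumerate(features(x)):
                grad[k] += 2.0 * residual * f / len(train)
        weights = [w - rate * g for w, g in zip(weights, grad)]
        # Check the held-out set to decide whether to keep going.
        error = mean_squared_error(weights, test)
        if error < best_error:
            best_weights, best_error, best_epoch = list(weights), error, epoch
        elif epoch - best_epoch >= patience:
            break  # the test error stopped improving
    return best_weights, best_error, best_epoch


# Hypothetical data: a line y = 2x + 1 plus noise, split in half.
rng = random.Random(0)
samples = []
for _ in range(40):
    x = rng.uniform(-1.0, 1.0)
    samples.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.3)))
rng.shuffle(samples)
train, test = samples[:20], samples[20:]

best_weights, best_error, best_epoch = train_with_early_stopping(train, test)
print(best_epoch, round(best_error, 4))
```

The training error alone would keep falling as the polynomial bends toward the noise. Only the held-out error shows when that bending stops helping.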
  • 21. Steve's crappy breakdown of object recognition algorithms: A ship of fools approach - Armies of stupid algorithms that together become smart. Kinda like democracy, sort of...not really. Let's create a brain, Igor - So what happens if you simulate brain tissue? It's grown quite a bit since the neural network hype in the late 80s / early 90s. Fun Fact: This will eventually kill us all in a bloody uprising. If the shoe fits... - Well, if this template Starbucks logo fits somewhere on my image using image registration as described above, then I'm guessing this Starbucks logo is present in the image. Duh. Find Features to Filter and Fit in a Feed-forward Fashion - You see this pattern all the time and it lacks creativity. Find interesting features in the image, and feed them into a trainable function like a neural network, a non-linear regression, or a support vector machine. Boring. I've stared at a wall for 20 minutes now and I think everything out there is either one or a combination of these general ideas. Hmm...That's it? Disappointed. Object Recognition - The Menu
  • 22. End of Part One...