1. Computer Vision
Sanjay S
Computer Science Department, RMKEC
Abstract
Computer vision is the process of using machines to understand and analyze imagery (both photos and videos). While these kinds of algorithms have been around in various forms since the 1960s, recent advances in Machine Learning, along with leaps forward in data storage, computing capability, and cheap, high-quality input devices, have driven major improvements in how well our software can explore this kind of content.
How computer vision works
One of the major open questions in both Neuroscience and Machine Learning is: how exactly do our brains work, and how can we approximate that with our own algorithms? The reality is that there are very few working and comprehensive theories of brain computation; so despite the fact that Neural Nets are supposed to “mimic the way the brain works,” nobody can say for certain how true that is.
The same paradox holds true for computer vision – since we have not settled how the brain and eyes process images, it is difficult to say how well the algorithms used in production approximate our own internal mental processes. For example, studies have shown that some functions we thought happen in the brains of frogs actually take place in their eyes. We are a far cry from amphibians, but similar uncertainty exists in human cognition.
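The “mimicry” in question can be made concrete: an artificial neuron is just a weighted sum passed through a nonlinearity, which is a drastic simplification of anything biological. A minimal sketch (all input values and weights here are illustrative, not from the text):

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum plus a sigmoid nonlinearity."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes the result into (0, 1)

# Three hypothetical "pixel" inputs with hand-picked weights.
activation = neuron([0.2, 0.8, 0.5], [0.4, -0.6, 1.0], bias=0.1)
print(round(activation, 3))
```

A real network stacks thousands of such units in layers and learns the weights from data; whether that resembles what neurons in a brain actually do is exactly the open question above.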
Consider, too, what a digital image actually is: a grid of pixels, each storing one or more colour values. That’s a lot of memory to require for one image, and a lot of pixels for an algorithm to iterate over. But to train a model with meaningful accuracy – especially when you’re talking about Deep Learning – you’d usually need tens of thousands of images, and the more the merrier. Even if you were to use Transfer Learning to reuse the insights of an already trained model, you’d still need a few thousand images to train yours on.
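The storage arithmetic behind these numbers is easy to reproduce. A minimal sketch (the Full HD resolution and the dataset size of 50,000 images are illustrative assumptions, not figures from the text):

```python
def image_bytes(width, height, channels=3, bytes_per_channel=1):
    """Raw (uncompressed) size of one image in bytes."""
    return width * height * channels * bytes_per_channel

one_image = image_bytes(1920, 1080)              # a single Full HD RGB frame
dataset = 50_000 * one_image                     # a modest training set
print(f"one image: {one_image / 1e6:.1f} MB")
print(f"dataset:   {dataset / 1e9:.1f} GB")
```

One uncompressed frame is already over six megabytes, and a training set of tens of thousands of such frames runs to hundreds of gigabytes, which is why the leaps in cheap storage and compute mentioned above mattered so much.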
2. Computer Vision in the Medical Field
Research in computer vision, image processing, and pattern recognition has made substantial progress during the past several decades. Medical imaging has also attracted increasing attention in recent years, as it is a vital component of healthcare applications. Investigators have published a wealth of basic science and data documenting progress in medical imaging and its healthcare applications. Since development in these research fields has enabled clinicians to advance from the bench to the bedside, the Journal of Healthcare Engineering set out to publish a special issue devoted to advanced computer vision methods for healthcare engineering, along with review articles intended to stimulate continuing efforts to understand the problems usually encountered in this field. The result is a collection of fifteen articles submitted by investigators.
Uses of computer vision in the medical field
X-Rays
The role of X-rays is to identify whether there are any abnormalities or damage in a human organ or body part. Computer vision can be trained to classify scan results just as a radiologist would and pinpoint all potential problems in a single pass. This is a healthier approach, with radiation exposure limited as much as possible, especially in the case of children and the elderly.
CT scans
This method is used to detect tumours, internal
bleeding, and other life-threatening conditions.
The advantage of using computer vision here is
that the entire process can be automated with
increased precision, since the machine could
identify even those details that are invisible to
the human eye. A recent study at the University of Central Florida reported that while trained physicians detected lung cancer with only 65% accuracy, the machine was right in 95% of the cases.
This result becomes even more critical when we are talking about brain damage, strokes, or internal bleeding, where every second can make a difference.
3. Object Detection
Object recognition is a general term for a collection of related computer vision tasks that involve identifying objects in digital photographs. Image classification involves predicting the class of one object in an image. Object localization refers to identifying the location of one or more objects in an image and drawing a bounding box around their extent. Object detection is more challenging: it combines these two tasks, drawing a bounding box around each object of interest in the image and assigning it a class label.
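Bounding boxes are usually represented as corner coordinates, and a detector’s localization quality is commonly scored by how much a predicted box overlaps the ground-truth box, via intersection over union (IoU) – a metric the text does not name but which is standard in object detection. A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Corners of the overlap rectangle (empty if the boxes don't intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # partially overlapping boxes
```

An IoU of 1.0 means a perfect match, 0.0 means no overlap; detection benchmarks typically count a prediction as correct only above some IoU threshold such as 0.5.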
Retail and Retail Security
Amazon recently opened the Amazon Go store to the public, where shoppers need not wait in line at the checkout counter to pay for their purchases. Located in Seattle, Washington, the Go store is fitted with cameras specialized for computer vision. It initially admitted only Amazon employees, but welcomed the public beginning in early 2018.
The technology that runs the Go store is called Just Walk Out. Shoppers activate the iOS or Android mobile phone app before entering the gates of the store.
StopLift
StopLift’s ScanItAll computer vision technology works with a grocery store’s existing ceiling-mounted video cameras and point-of-sale (POS) systems. Through the cameras, the software “watches” the cashier scan products at the checkout counter. Any product that is not scanned at the POS is labeled a “loss” by the software. After being notified of a loss, the company says, it is up to management to approach the staff and take measures to prevent similar incidents in the future. StopLift claims that ScanItAll’s algorithms can identify “sweethearting” behaviors such as covering the barcode, stacking items on top of one another, and skipping the scanner to bag merchandise directly.
Automotive - Tesla
Tesla is another company developing self-driving vehicles; it claims that all three of its Autopilot car models are equipped for full self-driving capability.
Each vehicle, the website reports, is fitted with eight cameras providing 360-degree visibility around the car at a range of up to 250 meters. Twelve ultrasonic sensors enable the car to detect both hard and soft objects. The company claims that a forward-facing radar enables the car to see through heavy rain, fog, dust, and even the car ahead.
Its camera system, called Tesla Vision, works with vision processing tools which the company claims are built on a deep neural network and able to deconstruct the environment to enable the car to navigate complex roads.
Agriculture - Cainthus
Animal facial recognition is one feature that
Dublin-based Cainthus claims to offer. Cainthus
uses predictive imaging analysis to monitor the
health and well-being of crops and livestock.
Cainthus also claims to provide features such as all-weather crop analysis covering rates of growth, general plant health, stressor identification, fruit ripeness, and crop maturity, among others.
Cargill, a producer and distributor of agricultural
products such as sugar, refined oil, cotton,
chocolate and salt, recently partnered with
Cainthus to bring facial recognition technology
to dairy farms worldwide. The deal includes a
minority equity investment from Cargill although
terms were not disclosed.
Benefits and challenges
Using computer vision systems can translate into saving months before learning about life-threatening conditions. In the case of some forms of cancer, this could easily be the difference between saving and losing a patient. The benefit of these systems is that they can be trained to spot even the slightest abnormalities.
Money is also an important issue here. Every wrong diagnosis translates into additional costs: more tests, more hospital days, and improper treatment, not to mention the psychological impact of potentially bad news that turns out to be false.
The great news about using computer vision is that the same algorithms can be reused for other patients, and data from other medical centres can easily be transferred to the system for further algorithm training, steadily improving accuracy.
The real challenge of training such a system is finding the right sets of relevant images for training, including rare cases. To achieve excellent accuracy, such training sets should have proper tagging and enough variation to avoid over-fitting to simple cases.
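One routine way to add variation to a limited training set – standard practice, though not spelled out in the text – is data augmentation: generating plausible transformed variants of each labelled image. A minimal sketch using only flips and rotations on an image represented as a list of pixel rows (the tiny 2x2 “image” is purely illustrative):

```python
def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(img):
    """Produce simple variants of one image: the original, its mirror,
    and every 90-degree rotation with and without mirroring."""
    variants = [img, hflip(img)]
    rotated = img
    for _ in range(3):
        rotated = rot90(rotated)
        variants.append(rotated)
        variants.append(hflip(rotated))
    return variants

tiny = [[1, 2],
        [3, 4]]
print(len(augment(tiny)))  # 8 variants from one labelled image
```

Each labelled scan thus yields several training examples for free; real pipelines add crops, brightness shifts, and small rotations as well, subject to the constraint that the transformation must not change the correct label.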
Concerns about data privacy and personal
security are also on the priority list, but with
proper data anonymisation techniques, one
patient’s data can save the lives of many others.
The futuristic dream of completely automated diagnosis still faces countless technical and ethical barriers, but consistent advances have been made in recent years. AI can be used at various stages of the hospital-patient relationship, from easier admission via chatbots to personalised treatment based on DNA analysis. Medical image analysis is already becoming a field where AI delivers groundbreaking results.