Computer vision enables machines to understand and interpret visual content like images in the same way humans can. It involves techniques from fields like artificial intelligence and machine learning to allow computers to identify and process objects, scenes, and activities in digital images and videos. Some key challenges in computer vision include processing image data, requiring large datasets to train models, and making real-time decisions from images like detecting unsafe situations. Computer vision has applications across many domains like forestry, healthcare, autonomous vehicles, and more.
2. Human vision vs computer vision
We have seen how complex the human vision system is.
Achieving the same vision capabilities in a machine is equally challenging.
Challenges in computer vision:
● Images are stored in digital form as arrays of numbers. Deep learning techniques are typically required to
extract insights from this data
● A very large dataset is required to train the system to identify objects at various
angles and under various environmental conditions
● Time-based decision making. Example: a surveillance robot must generate an alert when
someone crosses a railway line while a train is approaching; otherwise, the crossing should be considered
normal
3. ● In the case of living objects, the ability to differentiate
between the living object, a statue of the object, and a
life-size poster or photo of the object
● Understanding an object in its context. Example:
as humans we can explain the emotion
in a picture, but it is challenging for a machine to
understand the relations between different objects
in an image
4. What is computer vision?
❏ Computer vision enables machines to read visual content, for example to see a photo
of a blue dress, learn that it is a blue dress, and then apply that knowledge to other images of blue dresses
without needing to rely on a person to tag all those images first
❏ Computer vision tasks include methods for acquiring, processing, analysing and understanding
digital images, and for extracting high-dimensional data from the real world in order to produce
numerical or symbolic information
Definition: Computer vision is a field of artificial intelligence that enables
computers to capture and interpret information from image and video data.
5. ❖ Image understanding can be seen as the disentangling of symbolic information from
image data using models constructed with the aid of geometry, physics, statistics,
and learning theory, which makes computer vision an interdisciplinary scientific
field
Computer vision is an interdisciplinary field
6. Purpose of Computer vision
Object classification: What broad category of object is in the photograph?
Object identification: Which type of a given object is in the photograph?
Object verification: Is the object in the photograph?
Object detection: Where are the objects in the photograph?
Object landmark detection: What are the key points for the object in the photograph?
Object segmentation: What pixels belong to the object in the image?
Object recognition: What objects are in the photograph and where are they?
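To make the difference between these questions concrete, here is a minimal Python sketch that answers three of them (verification, detection, segmentation) for a tiny hand-made binary mask. The mask, the `BoundingBox` type and the function names are all hypothetical and chosen for illustration; a real system would first have to produce the mask from an actual photograph.

```python
from dataclasses import dataclass

# Hypothetical 5x6 binary mask: 1 marks pixels belonging to the object.
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


@dataclass
class BoundingBox:
    top: int
    left: int
    bottom: int
    right: int


def object_pixels(mask):
    """Segmentation answer: which (row, col) pixels belong to the object?"""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def is_present(mask):
    """Verification answer: is the object in the image at all?"""
    return len(object_pixels(mask)) > 0


def bounding_box(mask):
    """Detection answer: where is the object? Returns None if it is absent."""
    pixels = object_pixels(mask)
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return BoundingBox(min(rows), min(cols), max(rows), max(cols))
```

Each task returns a different kind of answer: a yes/no value, a box, or a set of pixels.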
7. Key terms in computer vision
Artificial neural network: An ANN is a network of interconnected
layers of processing elements that work together to power computer vision. ANNs are
loosely modelled on the neural networks of the human brain, allowing computers
to process images and videos and learn what is in them. Modern computer vision is
largely rooted in ANNs.
Machine learning: Machine learning refers to algorithms that learn
patterns from the data the computer has been given, called inputs, and use these patterns to make
predictions, called outputs, on new data.
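The learn-then-predict idea can be sketched in a few lines of Python. Note that this is not a neural network: it is a much simpler nearest-centroid classifier, swapped in only to show how patterns learned from inputs are reused on new data. The training patches, labels and function names are assumptions made for illustration.

```python
def mean_brightness(patch):
    """Reduce a patch (list of pixel intensities, 0-255) to one feature."""
    return sum(patch) / len(patch)


def train(examples):
    """Learn one average feature value (centroid) per label.

    `examples` is a list of (patch, label) pairs: the inputs.
    """
    totals, counts = {}, {}
    for patch, label in examples:
        totals[label] = totals.get(label, 0.0) + mean_brightness(patch)
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def predict(model, patch):
    """Return the label whose learned centroid is closest to the patch."""
    feature = mean_brightness(patch)
    return min(model, key=lambda label: abs(model[label] - feature))


# Hypothetical training data: 2x2 patches flattened to four intensities.
TRAINING = [
    ([10, 20, 15, 5], "dark"),
    ([30, 25, 40, 35], "dark"),
    ([220, 240, 230, 250], "bright"),
    ([200, 210, 190, 205], "bright"),
]

model = train(TRAINING)
```

Calling `predict(model, patch)` on a patch the model has never seen returns the label it judges closest.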
8. How does a machine look at an image?
Images are stored in the computer as arrays of integers. Each integer represents a
pixel value. Pixels are the building blocks of an image. In a grayscale image,
every value in the integer array represents the intensity at the given
coordinates of the image.
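As a small illustration of this representation, the sketch below stores a grayscale image as a list of rows in plain Python. The pixel values and helper names are made up for the example; in practice an array library would usually hold the data.

```python
# Hypothetical 3x4 grayscale image: 0 is black, 255 is white.
GRAY = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
]


def shape(image):
    """Return (height, width) of a row-major image."""
    return len(image), len(image[0])


def pixel(image, row, col):
    """Return the intensity stored at the given coordinates."""
    return image[row][col]


def brightest(image):
    """Return ((row, col), value) of the brightest pixel."""
    return max(
        (((r, c), v) for r, line in enumerate(image) for c, v in enumerate(line)),
        key=lambda item: item[1],
    )
```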
9. Similarly, for a colour image we need three arrays
representing the intensities of red, green and blue. The
range for every channel (red, green, blue) is 0 to 255: (0, 0, 0)
represents black, and (255, 255, 255) represents white.
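The three-channel layout can be sketched in the same way. The example below splits a tiny colour image into its red, green and blue arrays and collapses it to grayscale with the ITU-R BT.601 luma weights. The image and function names are assumptions for illustration, and other weightings are also in use.

```python
# Hypothetical 2x2 colour image: each pixel is a (red, green, blue) triple.
RGB = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]


def split_channels(image):
    """Return three 2-D arrays, one per colour channel (red, green, blue)."""
    return tuple(
        [[px[ch] for px in row] for row in image] for ch in range(3)
    )


def to_grayscale(image):
    """Weighted sum of the channels (ITU-R BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in image
    ]
```

Pure red and pure blue map to different gray levels because the weights reflect how bright each colour appears to the eye.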
10. Concepts and Techniques in Computer Vision
Concepts in computer vision:
● Pattern Recognition
● Image Processing
● Artificial Intelligence
● Mathematics
● Physics
Techniques in computer vision:
1. Image Processing
2. Image Segmentation
3. Feature Detection and Matching
4. Image Recognition
5. Image Detection
11. 1. Image Processing: Image processing is a method of performing
operations on an image in order to get an enhanced image or to extract useful
information from it.
Image processing basically includes the following three steps:
a. Importing the image via acquisition tools
b. Analysing and manipulating the image
c. Output, in which the result can be an altered image or a report based on image analysis
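The three steps above can be sketched as a tiny pipeline. This is only an illustration: the "acquisition" step returns a hard-coded image, the manipulation is a simple linear contrast stretch chosen as one example operation, and all function names are hypothetical.

```python
def acquire():
    """Step a: stand-in for importing an image from an acquisition tool.

    Returns a hypothetical low-contrast 3x3 grayscale image.
    """
    return [
        [100, 110, 120],
        [110, 120, 130],
        [120, 130, 140],
    ]


def stretch_contrast(image):
    """Step b: rescale intensities linearly so they span 0 to 255."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in image]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in image]


def report(image):
    """Step c: summarise the result as a small report."""
    flat = [v for row in image for v in row]
    return {"min": min(flat), "max": max(flat), "mean": sum(flat) / len(flat)}


enhanced = stretch_contrast(acquire())
summary = report(enhanced)
```

Here the output takes both forms mentioned in step c: `enhanced` is the altered image and `summary` is a report derived from it.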
The purposes of image processing fall into 5 groups:
Visualisation: The purpose is to observe objects that are not visible in an image.
Image sharpening and restoration: The purpose is to create a better image.
Image retrieval: The purpose is to seek the image of interest.
Measurement of pattern: The purpose is to measure various objects in an image.
Image recognition: The purpose is to distinguish the objects in an image.
12. 2. Image Segmentation: Segmentation is the process of extracting pixels in an
image that are related. Segmentation algorithms usually take an image and produce a group of
contours or a mask in which each set of related pixels is assigned a unique colour value to identify
it.
The main purpose of image segmentation is to partition an image into a collection of sets of
pixels that correspond to
● meaningful regions (coherent objects)
● linear structures (lines, curves, ...)
● shapes (circles, ellipses, ...)
Figure: binary segmentation and semantic segmentation
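A minimal sketch of the mask idea, under simplifying assumptions: pixels are first split into foreground and background with a fixed threshold (binary segmentation), and each group of connected foreground pixels then receives its own label value. The image, the cutoff and the function names are hypothetical; semantic segmentation in practice relies on trained models rather than a threshold.

```python
from collections import deque


def threshold(image, cutoff):
    """Binary segmentation: 1 where the pixel is brighter than cutoff."""
    return [[1 if v > cutoff else 0 for v in row] for row in image]


def label_regions(mask):
    """Give every 4-connected group of foreground pixels its own label.

    Returns (labels, count); background stays 0, regions are 1..count.
    """
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] == 0 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical grayscale image with two bright blobs on a dark background.
IMAGE = [
    [200, 210, 10, 10, 10],
    [205, 10, 10, 10, 180],
    [10, 10, 10, 190, 185],
]

mask = threshold(IMAGE, cutoff=128)
labels, count = label_regions(mask)
```

The label value plays the role of the unique colour: every pixel in the same region carries the same number.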
13. 3. Feature Detection and Matching:
A feature is a piece of information that is relevant for solving the computational task related to a certain
application. Features may be specific structures in the image such as points, edges or objects. Features may
also be the result of a general neighbourhood operation or feature detection applied to the image.
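One common neighbourhood operation is a weighted sum over each pixel's 3x3 surroundings. The sketch below uses the Sobel kernel for left-to-right intensity changes, so large responses mark vertical edges. The test image and function names are assumptions for illustration, and border pixels are simply left at zero.

```python
# Sobel kernel that responds to left-to-right intensity changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def neighbourhood_response(image, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col)."""
    return sum(
        kernel[dy + 1][dx + 1] * image[row + dy][col + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def horizontal_gradient(image):
    """Apply SOBEL_X at every interior pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            out[r][c] = neighbourhood_response(image, SOBEL_X, r, c)
    return out


# Hypothetical image: dark left half, bright right half (a vertical edge).
IMAGE = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]

gradient = horizontal_gradient(IMAGE)
```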
● Identify the interest points in the image. Features that lie at specific locations in the image, such
as mountain peaks, building corners or doorways, are localised features often called
keypoint features.
● Features that can be matched based on their orientation and local appearance (edge profiles) are called
edges; they can also be good indicators of object boundaries.
● The local appearance around each feature point is described in some way that is (ideally) invariant
under changes in illumination, translation, scale, and in-plane rotation. We typically end up with a
descriptor vector for each feature point.
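Once every feature point has a descriptor vector, matching can be sketched as a nearest-neighbour search. The example below pairs descriptors from two images by Euclidean distance and drops pairs that are too far apart. The descriptors, the distance cutoff and the function names are hypothetical, and the sketch assumes the second image has at least one descriptor.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_features(descriptors_a, descriptors_b, max_distance):
    """Pair each descriptor in A with its nearest neighbour in B.

    Returns (index_a, index_b) pairs; matches farther apart than
    max_distance are dropped as unreliable.
    """
    matches = []
    for i, da in enumerate(descriptors_a):
        j = min(range(len(descriptors_b)),
                key=lambda k: distance(da, descriptors_b[k]))
        if distance(da, descriptors_b[j]) <= max_distance:
            matches.append((i, j))
    return matches


# Hypothetical 3-element descriptors for keypoints in two images.
IMAGE_A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [5.0, 5.0, 5.0]]
IMAGE_B = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0]]

matches = match_features(IMAGE_A, IMAGE_B, max_distance=0.5)
```

The third descriptor in `IMAGE_A` has no close counterpart in `IMAGE_B`, so it is left unmatched.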
14. 4. Image Recognition:
Recognition is one of the toughest challenges in computer vision. For the human eye,
recognising an object, feature or attribute is very easy. However, this does not apply to a machine.
It is very hard for a machine to recognise or detect objects, because these objects vary in appearance.
Object Recognition:
● Object recognition refers to identifying what is present in the image, while object detection refers
to locating where it is present in the image.
● Object recognition through deep learning can be achieved by training models. To train models
from scratch, the first thing you need to do is collect a large dataset. Then you need to
design the architecture that will be used to create the model.
● The output of object recognition includes the identified object category along with the probability
that the identification is correct.
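The "category plus probability" output can be sketched as follows. The example assumes a trained model has already produced one raw score per category, and turns those scores into probabilities with the softmax function, a common choice. The categories, scores and function names are made up for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recognise(scores, categories):
    """Return (category, probability) for the highest-scoring category."""
    probabilities = softmax(scores)
    best = max(range(len(categories)), key=lambda i: probabilities[i])
    return categories[best], probabilities[best]


# Hypothetical raw scores from a trained model for one image.
CATEGORIES = ["cat", "dog", "car"]
SCORES = [2.0, 1.0, 0.1]

label, probability = recognise(SCORES, CATEGORIES)
```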
15. 5. Image Detection:
● Object detection is a technique that processes an image and detects the objects in it.
● Object recognition identifies what is in an image, while object detection answers where
an object is located in the image
● Object detection uses an object's features to determine its class.
● When applying deep learning to image detection, developers often use Python along with
open-source libraries such as OpenCV, OpenDetection, ImageAI and others. These
libraries simplify the learning process and offer a ready-to-use environment
● Commonly used techniques for object detection include
* Haar cascades
* the Viola-Jones algorithm (the framework in which Haar cascade classifiers were introduced)
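Both techniques build on Haar-like features, which compare the pixel sums of adjacent rectangles, and on the integral image, which makes any rectangle sum cost four lookups. The sketch below shows only these two building blocks, not a full detector. The image and function names are hypothetical.

```python
def integral_image(image):
    """ii[r][c] holds the sum of all pixels strictly above and left of (r, c).

    The table has one extra row and column of zeros to avoid edge cases.
    """
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        for c in range(width):
            ii[r + 1][c + 1] = (image[r][c] + ii[r][c + 1]
                                + ii[r + 1][c] - ii[r][c])
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of a rectangle of pixels using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like edge feature: right half minus left half of a window."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum


# Hypothetical image: dark left half, bright right half.
IMAGE = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

ii = integral_image(IMAGE)
feature = two_rect_feature(ii, top=0, left=0, height=2, width=4)
```

A large feature value signals a strong dark-to-bright transition inside the window. A cascade combines many such features to accept or reject candidate windows.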
17. ● In Forrester's evaluation of the emerging market for computer vision platforms, the 11 most
significant providers in the category, including Amazon Web Services, Chooch AI, Clarifai, Deepomatic,
Google, Hive, IBM, Microsoft, Neurala and SAS, were evaluated. The report details the findings
about how well each vendor scored against 10 criteria and where they stand in relation to each
other. Business professionals can use this review to select the right partner for their computer
vision needs.