In Content-Based Image Retrieval (CBIR) systems, the visual contents of the
images in the database are extracted and represented by multi-dimensional feature
vectors. A well-known class of CBIR systems retrieves images by an unsupervised method and is
known as cluster-based image retrieval. To enhance the performance and retrieval rate
of a CBIR system, we fuse the visual contents of an image. Recently, we developed two
cluster-based CBIR systems by fusing the scores of two visual contents of an image. In this
paper, we analyze the performance of the two recommended CBIR systems at different
levels of precision using images of varying sizes and resolutions. We also compare the
performance of the recommended systems with that of two other existing CBIR systems,
namely UFM and CLUE. Experimentally, we find that the recommended systems
outperform the two existing systems, and one recommended system also performs
comparatively better at every image resolution.
A typical CBIR system has three main stages: (i) image segmentation, (ii) feature extraction, and (iii) computation of similarity measures.
Image segmentation is the method of splitting an image into sections such that each section is homogeneous with
respect to some feature; an explanation is given in [3]. A number of reviews attempt to classify the
different segmentation techniques. According to Ref. [4], segmentation techniques can be classified into the
following groups: histogram thresholding, feature space clustering, region-based approaches, edge
detection approaches, fuzzy approaches, neural network approaches, and graph-theoretical approaches.
In this work, we fuse visual contents (shape-color and texture-color contents) with a marginal (threshold) value to
compute the similarity with the query. We also analyze the retrieval rate at different precision levels and
compare the results with existing systems at every resolution of the image databases [1] and [8].
This paper is organized as follows. In the next section, we discuss a few existing CBIR systems. In Section III,
we discuss the unsupervised approach to image retrieval. In Section IV, we present some experimental results
and comparisons. Finally, we conclude and give future directions in Section V.
Figure 1: Structure of a CBIR system
II. EXISTING CBIR SYSTEMS
Existing CBIR systems can be grouped into two categories: full-image retrieval systems and region-based
image retrieval systems. Some existing CBIR systems may also belong to both categories [9].
In full-image retrieval systems, features are extracted from the entire image without segmenting it into regions;
such systems use the global features of images. In a hybrid variant, the images in the database are
segmented but the query image is not, so global features are used for the query image while
local features are used for the images in the database.
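The distinction between a single global feature and per-region local features can be sketched as follows. This is a minimal illustration in Python with NumPy; the grid "segmentation", the bin count, and the gray-level range are arbitrary assumptions for demonstration, not the actual features used by any of the systems discussed here.

```python
import numpy as np

def global_histogram(img, bins=8):
    # global feature: one gray-level histogram for the whole image, normalized
    h, _ = np.histogram(img, bins=bins, range=(0, 256))
    return h / h.sum()

def region_histograms(img, grid=2, bins=8):
    # local features: one histogram per cell of a crude grid "segmentation"
    rows = np.array_split(img, grid, axis=0)
    return [global_histogram(cell, bins)
            for row in rows for cell in np.array_split(row, grid, axis=1)]
```

A full-image system would compare only the single global vector per image, whereas a region-based system would compare the lists of per-region vectors.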
In region-based systems, the image is segmented into regions prior to feature extraction, and the
features are then extracted for each region. Here, local features are used for both the query image and the images
in the database. Region-based systems can be further divided into three types. In the first type, the query
image is not segmented but the images in the database are segmented, and the system looks for images that
contain the query image as a part; this is called sub-image retrieval. In the second type, both the query and
the images in the database are segmented, but only one part of the query image is used for the searching. In
the third type, both the query image and the images in the database are segmented, and all the regions of the query
image are used for the comparison [16].
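To make the third type concrete, here is a hedged sketch in which every query region is scored by its best-matching target region and the scores are averaged. The Euclidean distance over plain feature vectors is an assumption for illustration; it is not the actual region-matching measure of any particular system above.

```python
import numpy as np

def region_match_distance(query_regions, target_regions):
    # third type of region-based matching: all query regions participate;
    # each query region is scored by its closest target region,
    # and a lower overall value means the images are more similar
    best = []
    for q in query_regions:
        dists = [np.linalg.norm(np.asarray(q) - np.asarray(t))
                 for t in target_regions]
        best.append(min(dists))
    return float(np.mean(best))
```

An image identical to the query (region for region) scores 0, and the score grows as the regions' feature vectors diverge.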
Most existing CBIR systems are region-based, because region-based systems are more robust than
full-image retrieval systems. Region-based systems use different segmentation techniques to divide images
into subparts. Some of the existing systems are: (i) Blobworld, developed by the UC Berkeley
Computer Vision Group [9]. It segments the image into blobs (regions) using an Expectation-Maximization
(EM) algorithm based on the color and texture features of the pixels. It is a region-based image retrieval
system [15]. (ii) The Earth Mover's Distance, Multi-Dimensional Scaling, and Color-Based Image Retrieval:
in this retrieval system, images are treated as points in a metric space in which they are moved so as
to locate image neighborhoods of interest, based on color information. The distance function is called the
Earth Mover's Distance (EMD) [10]. The system also uses multi-dimensional scaling (MDS) techniques to
embed a group of images as points in a two- or three-dimensional Euclidean space so that their distances are
preserved as much as possible. It is a full-image retrieval system. (iii) NaTra, developed by the
Department of Electrical and Computer Engineering, University of California at Santa Barbara [11]. Images
are automatically segmented into about six to twelve non-overlapping homogeneous regions. The segmentation is
based on an edge-flow algorithm, which uses 'edges' in color and texture features to identify homogeneous regions.
It is a region-based image retrieval system. (iv) PicSOM, developed by the Laboratory of Computer and
Information Science, Helsinki University of Technology. The image is divided into five regions, and
color and texture properties are used for each region. In addition, edge and shape properties are used as features. Features are
stored in a tree structure that uses self-organizing maps (SOM). It is a region-based image retrieval system
[13]. (v) SIMPLIcity (Semantics-sensitive Integrated Matching for Picture LIbraries), developed by James
Ze Wang, Gio Wiederhold, Jia Li, and others at Stanford University [14]. It segments the image into 4 x 4
pixel blocks, extracts a feature vector for each block, and uses the k-means clustering approach to segment the
image into regions. It is a region-based image retrieval system. (vi) UFM (Unified Feature Matching),
developed by Chen and Wang [12]. The UFM scheme describes the similarity between images by incorporating the
properties of all regions in the images. The similarity of two images is defined as the overall similarity
between two families of fuzzy features and quantified by a similarity measure, the UFM measure, which
incorporates the properties of all the regions in the images. It is a region-based image retrieval system. (vii)
CLUE (CLUster-based rEtrieval of images), developed by Chen et al. [2] and [5]. It performs cluster-based
retrieval of images by unsupervised learning, improving user interaction with image retrieval
systems by exploiting the similarity information. CLUE retrieves image clusters by applying a graph-
theoretic clustering algorithm to a collection of images in the neighborhood of the query. In particular, the
clusters created depend on which images are retrieved in response to the query.
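For intuition about the EMD used by system (ii) above, here is a hedged sketch of the special case of two normalized 1-D histograms with unit-spaced bins, where the EMD reduces to the L1 distance between cumulative sums. The actual system computes EMD over color signatures in a metric space, which requires a general transportation solver; this closed form holds only in the 1-D case assumed here.

```python
import numpy as np

def emd_1d(h1, h2):
    # 1-D Earth Mover's Distance between two histograms: normalize both to
    # equal total mass, then sum the absolute differences of the cumulative
    # distributions (the minimum "work" to morph one pile into the other)
    p = np.asarray(h1, dtype=float)
    q = np.asarray(h2, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.abs(np.cumsum(p - q)).sum())
```

Moving all mass from bin 0 to bin 2, for example, costs 2 units of work, so `emd_1d([1, 0, 0], [0, 0, 1])` evaluates to 2.0.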
III. UNSUPERVISED METHOD OF CBIR
Unsupervised learning involves learning patterns in the input when no definite output values are provided. It
is distinguished from supervised learning in that the learner is provided only unlabeled patterns.
Unsupervised learning is applied to the class of problems where one seeks to determine how the data are
organized. In a standard CBIR system, target images (i.e., images in the database) are ranked by feature
similarity with respect to the query. The CLUE technique is used to improve user interaction with image
retrieval systems by fully exploiting the similarity information. CLUE retrieves image clusters by applying a
graph-theoretic clustering algorithm to a collection of images in the locality of the query [2] and [5].
The two recently recommended approaches to CBIR, namely the color-shape and color-texture systems, are also
based on graph clustering algorithms and improve on the performance of the CLUE algorithm (as shown in Figure
2) [1] and [8]. In the first approach, we sum the values of the color and shape visual contents to assign
weights to the different images; on the basis of these weights, the relevant images are extracted from the image
database. In the other approach, we sum the values of the color and texture features in the same way to assign weights to
the different images and extract the relevant images from the image database on the basis of these weights.
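The fusion step can be sketched as follows. This is a minimal illustration with assumed dictionary inputs and the 0.6 threshold shown in Figure 2; the recommended systems' actual score normalization is not reproduced here.

```python
def fuse_and_rank(sim_a, sim_b, threshold=0.6):
    # sum the two visual-content similarity values per database image,
    # keep only images whose fused weight clears the threshold,
    # and rank the survivors in descending order of fused weight
    fused = {img: sim_a[img] + sim_b[img] for img in sim_a}
    kept = [(w, img) for img, w in fused.items() if w >= threshold]
    return sorted(kept, reverse=True)
```

Given color scores {'a': 0.5, 'b': 0.25, 'c': 0.5} and shape scores {'a': 0.5, 'b': 0.25, 'c': 0.25}, image 'b' (fused weight 0.5) falls below the threshold and is dropped, while 'a' and 'c' are returned in ranked order.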
Figure 2: Structure of recommended CBIR system
The first step in implementing the CBIR system is to segment the image. To find the segments of an image,
we detect the edges of the input image; the steps for detecting the edges are: convert the image
to gray-scale form and apply an edge detection technique [6] and [16]. Once the image is segmented,
the second step is to extract the features. For extracting the features, the segmented image is divided into a
number of regions (clusters), and the features of each region are extracted. Features are mainly extracted in three ways:
(i) color, (ii) shape, and (iii) texture. Color features are among the most important and most extensively used low-level
features in image databases. As given in Ref. [4], there are three central equations for the color distribution
on the image for each region.

[Figure 2 components: Image Database → Image Segmentation → Feature Extraction (histogram, color layout, regions, etc.) → Computation of Similarity (Euclidean distance, shape comparison, region matching, etc.); the shape-color and texture-color features are fused after applying a threshold of 0.6; the output is the set of values of similar images, followed by visualisation.]

We now discuss the extraction of the relevant images from the large image
database in an unsupervised way. First, note that the number of clusters is not fixed; it depends on certain
conditions. In our case, we set one condition: if a cluster has fewer than 100 images, that cluster is not
further clustered. As an example, Figure 3 shows an image database of 100 nodes (images) partitioned into
a tree structure [2].
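The pre-segmentation steps just described (gray-scale conversion followed by edge detection) can be sketched as below. This is an illustrative Sobel-based detector with an arbitrary relative threshold, not necessarily the edge detection technique used in [6] and [16].

```python
import numpy as np

def to_gray(rgb):
    # luminance conversion using ITU-R BT.601 weights (an assumed choice)
    return rgb @ np.array([0.299, 0.587, 0.114])

def sobel_edges(gray, thresh=0.5):
    # convolve with the two Sobel kernels and threshold the gradient magnitude
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    H, W = gray.shape
    mag = np.zeros((H, W))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = gray[i - 1:i + 2, j - 1:j + 2]
            gx, gy = (patch * kx).sum(), (patch * ky).sum()
            mag[i, j] = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros((H, W), dtype=bool)
    return mag > thresh * mag.max()
```

Applied to an image with a sharp vertical intensity step, the mask is True along the columns adjacent to the step and False elsewhere.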
Figure 3: An image database of N = 100 nodes (images) partitioned into a tree structure
The following steps describe the retrieval of relevant images in an unsupervised way [16]:
Step 1: Assume the database has a total of 100 images, as shown in Figure 3 (N = 100); these are the sorted, pre-
processed target images with respect to the query image.
Step 2: First, find the collection of neighboring target images with respect to the query image using the Nearest
Neighbor method (discussed in [5]).
Step 3: Construct a weighted undirected graph that contains the query image and its neighboring target
images, using the equations discussed earlier to compute the nonnegative weight of each edge.
Step 4: Select a representative node: among the 100 target nodes, the node with the maximum sum value of its
features in each cluster is selected as the representative node. Then apply the normalized-cut (Ncut) equation
to the graph to partition it into two subgraphs. Target nodes whose similarity measure is less
than that of the representative node are confined to the left cluster C1 (containing 40 images), and the others
to C2, which contains 60 images (for the given example of 100 target images) [6].
Step 5: Criterion for cluster partitioning: in this method, we select for partitioning the cluster that has the maximum
number of nodes (images). In the above example, the second cluster (C2) has the maximum number of nodes, so C2 is
further divided into two sub-clusters C3 and C4. Steps 4 and 5 are repeated until a stopping criterion is satisfied.
Step 6: Stopping criterion: in our method, if a cluster has more than 100 images, that cluster is further
divided into two sub-clusters. (For the above example, purely for illustration, a cluster with more than 25 nodes
(images) is divided into two new clusters; otherwise it is left as a leaf node.)
Step 7: Retrieval of relevant images: retrieve the leaf clusters from left to right (in inorder-traversal
order). In our method, once we obtain the first 100 images, we stop this procedure and re-initiate the program for the
next query, and so on. For the given example, assume we want 50 relevant images. First, we retrieve the left-most cluster
C8 and obtain 25 images (indexed 1 to 25 according to the similarity measure); then we retrieve
cluster C7 (moving left to right; it has 15 images, indexed 26 to 40), for a total of 40 images retrieved; then we retrieve
C5 (25 images), which makes the total exceed the limit, so no further
cluster retrieval is possible and the top 50 relevant images are displayed.
Step 8: Save the collection of images in a file for all queries (all iterations) and manually count the
precision at different levels.
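Steps 1 to 7 can be sketched end-to-end as follows. This is a simplified NumPy illustration: the affinity function, the leaf-size limit of 25 from the example, the splitting of every oversized cluster (rather than only the largest one), and the spectral solution of the normalized-cut problem via the symmetric Laplacian are all assumptions standing in for the systems' actual graph construction and Ncut equations.

```python
import numpy as np

def ncut_split(W):
    # Shi-Malik normalized cut: threshold at zero the second-smallest
    # generalized eigenvector of (D - W) y = lambda * D y
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = (np.diag(d) - W) * np.outer(d_inv_sqrt, d_inv_sqrt)
    _, vecs = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]
    return y >= 0

def cluster_tree(indices, W, max_leaf=25):
    # recursively bipartition oversized clusters (Steps 4-6);
    # leaves come out in left-to-right (inorder) order
    if len(indices) <= max_leaf:
        return [indices]
    mask = ncut_split(W[np.ix_(indices, indices)])
    if mask.all() or not mask.any():     # degenerate cut: stop splitting
        return [indices]
    return (cluster_tree(indices[mask], W, max_leaf)
            + cluster_tree(indices[~mask], W, max_leaf))

def retrieve(leaves, k):
    # Step 7: collect leaf clusters left to right until k images are gathered
    out = []
    for leaf in leaves:
        out.extend(leaf.tolist())
        if len(out) >= k:
            break
    return out[:k]
```

On a toy affinity matrix built from two well-separated groups of feature values, the first normalized cut separates the groups, after which both clusters fall below the leaf limit and retrieval walks the leaves in order.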
IV. RESULT AND DISCUSSION
In this section, we present our results and compare the performance of the recommended systems with the two
other existing systems. Our systems use the same feature extraction techniques as given in Refs. [2], [5], and
[12]. We used the Euclidean distance as the similarity measure between the query and the target images in the
database. We used four COREL image databases; each database has its own size (number of images) and
resolution, which we mention in the corresponding table's caption. We worked on approximately
80,000 images. Details of the databases are given in [8].
The performance of the systems is measured using the precision statistic. In the context of image
(information) retrieval, precision is the ratio of the number of retrieved images that are relevant to the total number of
retrieved images [16]. Precision takes all retrieved images into account. In this work, we evaluated precision
at a given cut-off rank (precision at k, where k = 10, 20, 30, ..., up to 100):
Precision = |Relevant Images at top k| / |Total Images Retrieved at top k|
The calculation of the precision is illustrated mathematically as follows. Let P be the overall precision and
P1, P2, P3, ..., P100 be the precisions for image queries 1, 2, 3, ..., 100 for one particular category
(each category has a total of 100 images). After computing these precisions, we take their average and report
it in the corresponding tables, as in the following equation:

P (at k = 10) = (P1 + P2 + ... + P100) / 100
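The precision definitions above can be sketched directly in code (assumed list/set inputs; this mirrors the manual counting of Step 8):

```python
def precision_at_k(retrieved, relevant, k):
    # fraction of the top-k retrieved images that are relevant
    top = retrieved[:k]
    return sum(1 for img in top if img in relevant) / len(top)

def average_precision_at_k(runs, k):
    # runs: one (retrieved_list, relevant_set) pair per query;
    # average the per-query precisions at the same cut-off k
    return sum(precision_at_k(r, rel, k) for r, rel in runs) / len(runs)
```

For example, with 3 relevant images among the top 10 retrieved, the precision at k = 10 is 0.3.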
We tested all four CBIR systems on all four databases: two existing systems, UFM and
CLUE, and the two recently recommended systems, shape-color and texture-color.
Database 1 contains 10,000 images of resolution 185 x 84. We randomly picked 100 general-purpose images
from this database and tested our systems. The averaged results are shown in Table I and graphically
represented in Figure 4.
TABLE I. PRECISION RESULT OF DATABASE 1

                           Existing           Recommended
CBIR Systems               UFM      CLUE      Shape-Color   Texture-Color
Average Relevant Images    3.09     5.76      7.68          9.12
Average Retrieved Images   12       12        12            12
Average Precision          0.258    0.48      0.64          0.76
Database 2 also contains 10,000 images, of resolution 185 x 96. We chose 100 randomly generated
image queries and retrieved the relevant images from this database for our systems. The averaged results
are shown in Table II and graphically represented in Figure 5.
Database 3 contains 60,000 general-purpose images of resolution 185 x 85. We randomly picked 100 images
as queries and retrieved the relevant images from this database for our systems. The averaged results are
shown in Table III and graphically represented in Figure 6.
Figure 4: Average Precision of Database 1
6. 333
TABLE II. PRECISION RESULT OF DATABASE 2

                           Existing           Recommended
CBIR Systems               UFM      CLUE      Shape-Color   Texture-Color
Average Relevant Images    13.32    14.245    10.625        11.00
Average Retrieved Images   37       39        25            25
Average Precision          0.36     0.365     0.425         0.44
Figure 5: Average Precision of Database 2
TABLE III. PRECISION RESULT OF DATABASE 3

                           Existing           Recommended
CBIR Systems               UFM      CLUE      Color-Shape   Color-Texture
Average Relevant Images    12.09    13.10     12.00         13.025
Average Retrieved Images   31       31        25            25
Average Precision          0.39     0.42      0.48          0.521
Figure 6: Average Precision of Database 3
Database 4 is a benchmark database (described in Table IV) containing 10 different categories of images; each
category has 100 images of resolution 256 x 384. In this database, we used each image as a query and
retrieved the relevant images at top k. Figure 7 and Figure 8 show one result for
each recommended system (Shape-Color and Texture-Color) for a sample query of a bus image. The average
precision values of the shape-color and texture-color systems for each category at different precision levels (k =
10, 20, ..., 100) are shown in Table V and Table VI, respectively. We took 100 random image queries
from each of the ten categories of this database (Database 4), i.e., a total of 1000 random queries, and
retrieved the relevant images from this database. Finally, we computed the average precision. Figure 9 is the
graphical representation of Table V, and Figure 10 is the graphical representation of Table VI.
TABLE IV. DESCRIPTION OF COREL DATABASE 4
Category No Category Name Category No Category Name
1 African People 6 Elephants
2 Beach 7 Flowers
3 Buildings 8 Horses
4 Buses 9 Glaciers
5 Dinosaurs 10 Food
Figure 7: Result of Shape-Color system, first image is the query image: 16 Matches out of 25
Figure 8: Result of Texture-Color system, first image is the query image: 19 Matches out of 25
TABLE V. PERFORMANCE AT DIFFERENT PRECISION (K) FOR SHAPE-COLOR SYSTEM OF DATABASE 4
ID Name 10 20 30 40 50 60 70 80 90 100
1 People 0.70 0.685 0.667 0.636 0.615 0.575 0.568 0.555 0.542 0.53
2 Beach 0.68 0.645 0.617 0.586 0.554 0.535 0.505 0.475 0.446 0.42
3 Buildings 0.60 0.565 0.537 0.506 0.485 0.454 0.433 0.415 0.395 0.37
4 Buses 0.84 0.775 0.767 0.746 0.725 0.708 0.675 0.634 0.628 0.62
5 Dinosaurs 1.00 0.995 0.987 0.980 0.978 0.975 0.973 0.970 0.965 0.95
6 Elephants 0.58 0.535 0.487 0.436 0.395 0.367 0.335 0.315 0.325 0.30
7 Flowers 0.86 0.835 0.807 0.793 0.785 0.775 0.765 0.760 0.752 0.74
8 Horses 0.84 0.825 0.812 0.798 0.793 0.792 0.789 0.785 0.776 0.77
9 Mountains 0.54 0.500 0.497 0.436 0.395 0.372 0.345 0.332 0.315 0.30
10 Food 0.78 0.745 0.712 0.707 0.693 0.685 0.665 0.660 0.654 0.64
Avg All Categories 0.74 0.710 0.689 0.662 0.642 0.624 0.605 0.590 0.579 0.564
Figure 9: At different precision k for Shape-Color system of Database 4 for each category
TABLE VI. PERFORMANCE AT DIFFERENT PRECISION (K) FOR TEXTURE-COLOR SYSTEM OF DATABASE 4
ID Name 10 20 30 40 50 60 70 80 90 100
1 People 0.70 0.685 0.676 0.646 0.625 0.595 0.558 0.545 0.538 0.53
2 Beach 0.68 0.645 0.628 0.605 0.575 0.555 0.525 0.485 0.456 0.43
3 Buildings 0.64 0.595 0.567 0.546 0.525 0.494 0.463 0.435 0.415 0.40
4 Buses 0.88 0.845 0.817 0.786 0.755 0.728 0.705 0.674 0.668 0.65
5 Dinosaurs 1.00 0.995 0.979 0.967 0.957 0.952 0.947 0.935 0.929 0.92
6 Elephants 0.58 0.535 0.476 0.446 0.415 0.377 0.345 0.335 0.325 0.32
7 Flowers 0.86 0.845 0.837 0.813 0.795 0.785 0.780 0.778 0.774 0.77
8 Horses 0.84 0.825 0.812 0.798 0.793 0.792 0.789 0.785 0.780 0.78
9 Mountains 0.55 0.500 0.497 0.436 0.395 0.372 0.345 0.332 0.315 0.30
10 Food 0.78 0.745 0.727 0.707 0.693 0.684 0.665 0.660 0.654 0.65
Avg All Categories 0.75 0.722 0.702 0.675 0.653 0.633 0.612 0.596 0.604 0.575
Figure 10: At different precision k for Texture-Color system of Database 4 for each category
V. CONCLUSIONS
In this paper, we analyzed the results of four content-based image retrieval systems (UFM, CLUE, Shape-Color,
and Texture-Color). We used four different COREL databases of varying sizes and varying
resolutions. We used Database 4 for detailed analysis because most researchers use it as a benchmark.
We analyzed the results at different precision levels (k) and compared the average performance. We found
that the retrieval rates of the fused CBIR systems are better. We also observed that the Texture-Color system gives
better results in all categories of images. We used the two features texture and shape in combination
with the third feature, color; in the future, we may combine these two features with each other. The quality of the clusters
depends on the choice of the partitioning algorithm, so in the future, other graph-theoretic clustering techniques can
also be evaluated for possible performance improvements. Our systems work only on a database consisting of
images of the same resolution; in the future, we may devise a technique to enable them to also work on a database
consisting of images of varying resolutions.
ACKNOWLEDGMENT
We wish to express our most sincere gratitude to Dr. Rashid Ali. We would also like to extend our profound
gratitude to Prof. Nesar Ahmad, Chairman of the Department of Computer Engineering, A.M.U., Aligarh,
India, for providing various facilities during the study and experimental work. We also express our deep
regards and thanks to all faculty members of the department.
REFERENCES
[1] S. M. Zakariya, R. Ali and N. Ahmad, “Combining visual features of an image at different precision value of
unsupervised content based image retrieval”, 2010 IEEE ICCIC, pp. 110 – 113, December 2010.
[2] Y. Chen, J. Z. Wang, and R. Krovetz, “CLUE: Cluster-Based Retrieval of Images by Unsupervised learning”, IEEE
Transaction on Image Processing, vol. 14, no. 8, pp. 1187-1201, August 2005.
N. R. Pal and S. K. Pal, “A review on image segmentation techniques”, Pattern Recognition, vol. 26, no. 9,
pp. 1277-1294, 1993.
[4] H. D. Cheng, X. H. Jiang, Y. Sun, and Jing Li Wang, “Color image segmentation: Advances & prospects”, Elsevier
Science, Pattern Recognition, vol. 34, no. 12, pp. 2259–2281, December 2001.
[5] Yixin Chen, James Z. Wang and Robert Krovetz, “Content-Based Image Retrieval by Clustering”, Proceedings of
the 5th ACM SIGMM International Workshop on Multi-media Information Retrieval, pp. 193-200, 2003.
[6] J. Shi and J. Malik, “Normalized cuts and image segmentation”, IEEE Transaction Pattern Analysis Machine
Intelligence, vol. 22, no. 8, pp. 888–905, August 2000.
[7] A. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early
years”, IEEE Transaction Pattern Analysis Machine Intelligence, vol. 22, No. 12, pp. 1349–1380, December 2000.
[8] S. M. Zakariya, R. Ali, and N. Ahmad, “Unsupervised Content Based Image Retrieval by Combining Visual
Features of an Image With A Threshold”, IJCCT, vol. 2, no. 4, pp. 204-209, December 2010.
[9] C. Carson, M. Thomas, S. Belongie, J.M. Hellerstein, and J. Malik, “Blobworld: A System for Region Based Image
Indexing and Retrieval,” Proceeding Visual Information Systems, pp. 509-516, June 1999.
[10] Y. Rubner, L. J. Guibas, and C. Tomasi, “The Earth Mover’s Distance Multi-Dimensional Scaling, and Color-Based
Image Retrieval”, Proceeding DARPA Image Understanding workshop, pp. 661-668, May 1997.
[11] W. Y. Ma and B. Manjunath, “NaTra: A Toolbox for Navigating Large Image Databases,” Proceedings of the IEEE
International Conference on Image Processing, pp. 568-571, vol. 8, no. 20, February 1997.
[12] Y. Chen and J. Z. Wang, “A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval”,
IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1252-1267, September 2002.
[13] J. Laaksonen, M. Koskela, S. Laakso, and E. Oja, “PicSOM - Content-based image retrieval with
Self-Organizing Maps”, Pattern Recognition Letters, vol. 21, pp. 1199-1207, June 2000.
[14] J. Z. Wang, J. Li, and G. Wiederhold, “SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries”,
IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, September 2001.
[15] C. Carson, S. Belongie, H. Greenspan, and J. Malik, “Blobworld: Image Segmentation Using Expectation
Maximization and its application to Image Querying”, IEEE Transaction On Pattern Analysis and Machine
Intelligence, vol. 24, no. 8, pp. 924-937, 2002.
[16] S. M. Zakariya, Nesar. Ahmad and Rashid Ali, “Unsupervised Learning Method for Content Based Image
Retrieval”, LAP Lambert Academic Publishing, Germany, June 2013.