Content-based Image Retrieval

Content-based image retrieval in large image databases
Lukasz Miroslaw, Ph.D., Wojciech Tarnawski, Ph.D.
Institute of Informatics

Wroclaw University of Technology, Poland

Contents

• Introduction. Past projects.

• Content-based image retrieval.

• Collaboration opportunities.

Division of Artificial Intelligence
Competences:
• Optimization techniques: swarm optimization, tabu search, simulated annealing, evolutionary algorithms.
• Machine learning.
• Data mining.
• Computer vision.

• Applications: stock market, CBIR, biology.

How complex is the universe?

Fig. Credit: Wikipedia. Hubble Ultra Deep Field image of a region
of the observable universe (equivalent sky area size shown in
bottom left corner), near the constellation Fornax. Each spot is a
galaxy, consisting of billions of stars. The light from the smallest,
most redshifted galaxies is thought to have originated roughly 13
billion years ago.

Why Do We Need Metaheuristics?

Definition: A metaheuristic designates a computational
method that optimizes a problem by iteratively trying
to improve a candidate solution with regard to a
given measure of quality (Wikipedia).

• Stochastic methods: simulated annealing,
evolutionary algorithms, ant colony optimization.

• Deterministic methods: gradient methods, tabu
search, dynamic programming, etc.
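
To make the "iteratively improve a candidate solution" idea concrete, here is a minimal sketch of one of the stochastic methods listed above, simulated annealing, applied to a one-dimensional cost function. The function `bumpy`, the cooling schedule and all parameter values are illustrative assumptions and do not come from the slides.

```python
import math
import random


def simulated_annealing(cost, start, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.995, seed=0):
    """Minimise `cost` over the real line with a basic annealing loop."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        # Propose a random neighbour of the current candidate.
        candidate = current + rng.gauss(0.0, step)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, best_cost


def bumpy(x):
    """Hypothetical test function with several local minima."""
    return x * x + 3.0 * math.sin(5.0 * x)


if __name__ == "__main__":
    print(simulated_annealing(bumpy, start=4.0))
```

Accepting occasional worse moves is what lets the search leave a local minimum, which a pure gradient method started at the same point cannot do.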

Evolutionary Algorithm

Fig. Scheme of EA.

Is selection of the best the best?
Prof. Roman Galar, Dr Artur Chorazyczewski, Prof. Iwona Karcz-Duleba

Content Based Image Retrieval
Project funded by the Polish-Singapore Programme.

Duration: 3 years until March 2011.

Objective: Content-based Image Retrieval in large
databases.

Problem: What does “similar” mean? How to
define a similarity measure between two images?

The brain’s visual circuits do error
correction on the fly
Predictive Coding.

Hierarchical Layers detect objects in the bottom-up
manner.

Final Objects

Prediction error More Complex Shapes / Textures / Detailed Geometry

Horizontal / Vertical Lines / Basic Geometrical Shapes

Credit: Physorg.org

Egner T., Monti J.M., Summerfield C., Expectation and Surprise Determine Neural Population Responses in the
Ventral Visual Stream, J Neuroscience 2010, 30(49):16601-16608.

Multi-Scale Approach :
Anisotropic diffusion
Here, a series of images is generated by
convolving the original image with a
Gaussian kernel of variance t (the scale):

The set of derived images I(x,y,t) is
equivalent to solutions of the heat transport
(diffusion) problem on the plane, with the scale t
playing the role of time.
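
The formula on this slide did not survive extraction. As a hedged illustration of the idea, the sketch below builds a small scale space for a one-dimensional signal by convolving it with sampled Gaussian kernels whose weights are proportional to exp(-k²/(2t)). The function names, the 3-sigma truncation and the border handling are assumptions; for a 2-D image the same smoothing could be applied along rows and then along columns, because the Gaussian kernel is separable.

```python
import math


def gaussian_kernel(t):
    """Sampled, normalised 1-D Gaussian with variance t (truncated at 3 sigma)."""
    radius = max(1, int(math.ceil(3 * math.sqrt(t))))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, t):
    """Convolve `signal` with a Gaussian of variance t, replicating the borders."""
    kernel = gaussian_kernel(t)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out


def scale_space(signal, scales):
    """Return {t: smoothed signal} for every scale t."""
    return {t: smooth(signal, t) for t in scales}


if __name__ == "__main__":
    step_edge = [0.0] * 10 + [1.0] * 10
    for t, values in scale_space(step_edge, [1, 4, 16]).items():
        print(t, [round(v, 2) for v in values])
```

Running the example on a step edge shows the edge becoming progressively flatter as t grows, which is the boundary-smoothing effect that motivates an edge-preserving alternative.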

Since the convolution operation smooths
region boundaries, we decided to use
edge-preserving isotropic diffusion:
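
The diffusion equation used by the authors is not preserved in this transcript. The sketch below therefore assumes a Perona-Malik-style scheme, a common edge-preserving choice in which a scalar conductance drops towards zero where the local difference is large. It works on a 1-D signal; the parameter names `kappa` and `step` and their values are illustrative assumptions.

```python
import math


def perona_malik_1d(signal, iterations=20, kappa=0.2, step=0.2):
    """Edge-preserving diffusion of a 1-D signal (Perona-Malik-style sketch)."""
    u = list(signal)
    n = len(u)
    for _ in range(iterations):
        new = u[:]
        for i in range(n):
            # Differences to the two neighbours (zero flux at the borders).
            left = u[i - 1] - u[i] if i > 0 else 0.0
            right = u[i + 1] - u[i] if i < n - 1 else 0.0
            # Conductance is close to 1 for small differences (noise) and
            # close to 0 for large differences (edges).
            c_left = math.exp(-(left / kappa) ** 2)
            c_right = math.exp(-(right / kappa) ** 2)
            new[i] = u[i] + step * (c_left * left + c_right * right)
        u = new
    return u


if __name__ == "__main__":
    noisy_step = [0.0, 0.05, -0.05, 0.03, -0.02, 0.0,
                  1.0, 1.04, 0.97, 1.02, 0.99, 1.0]
    print([round(v, 3) for v in perona_malik_1d(noisy_step)])
```

In the example the small fluctuations on each side of the step are averaged out, while the step itself is left almost untouched.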

Mean-Shift Segmentation
The “mean-shift”-based image segmentation algorithm (3) is a non-parametric
clustering method in a 5D image space (3D color space + 2D planar space). The
method does not require prior knowledge of the number or shape of the clusters.
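
A minimal sketch of the mean-shift idea, under stated assumptions: every point is moved repeatedly to the kernel-weighted mean of the data until it settles at a density mode, and points whose modes coincide receive the same label. The Gaussian kernel, the `bandwidth` and `merge_radius` parameters and the 2-D toy data are illustrative choices; for image segmentation each point would be a 5-D vector of three color values plus two pixel coordinates.

```python
import math


def mean_shift_modes(points, bandwidth, iterations=50, tol=1e-6):
    """Shift a copy of every point uphill to a density mode (Gaussian kernel)."""
    modes = []
    for start in points:
        x = list(start)
        for _ in range(iterations):
            weights = [
                math.exp(-sum((a - b) ** 2 for a, b in zip(x, p))
                         / (2.0 * bandwidth ** 2))
                for p in points
            ]
            total = sum(weights)
            # New position: kernel-weighted mean of all data points.
            shifted = [sum(w * p[d] for w, p in zip(weights, points)) / total
                       for d in range(len(x))]
            moved = math.dist(x, shifted)
            x = shifted
            if moved < tol:
                break
        modes.append(tuple(x))
    return modes


def label_by_mode(modes, merge_radius):
    """Give points the same label when their modes lie within merge_radius."""
    centres, labels = [], []
    for mode in modes:
        for index, centre in enumerate(centres):
            if math.dist(mode, centre) < merge_radius:
                labels.append(index)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(label_by_mode(mean_shift_modes(data, bandwidth=1.0), merge_radius=0.5))
```

Note that the number of clusters is never passed in; it emerges from the number of distinct modes, which matches the property stated on the slide.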

(3) D. Comaniciu and P. Meer, “Mean shift: A robust approach toward
feature space analysis” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24,
no. 5, pp. 603–619, 2002.

The Framework
The aim of the proposed segmentation is to partition the image into non-overlapping, most
meaningful image regions to obtain a very generalized (coarse) view of natural images.
1) Input image
2) Isotropic diffusion step
3) Mean-shift-based segmentation
4) Accumulating the clustering results over all scales
5) Mode-based clustering of values collected in the accumulator space
6) “Visualization” of the clusters detected in the previous step: mapping input pixels into a set of cluster labels
7) Region merging, taking into account region size and planar co-occurrence
8) Output image

Gestalt Theory in Segmentation
“Visualization” of clusters detected in the accumulator space: Grouping laws
The most meaningful (merged) regions: Collaboration laws

Image Retrieval System

Database path: Image database → GEN-SEG image segmentation → Calculation of MPEG-7-based features for segmented regions
Query path: Query image → GEN-SEG image segmentation → Calculation of MPEG-7-based features for segmented regions
Both paths → Image-to-image similarity calculation → List of the most similar images

Descriptors Used
The Moving Picture Experts Group (MPEG) defined several visual descriptors and
the corresponding distance measures (MPEG-7 standard):

Scalable Color Descriptor – color histogram in the HSV color space, encoded
by a Haar transform.

Color Layout Descriptor – represents the spatial distribution of the color of
visual signals.

Edge Histogram – represents the spatial distribution of five types of edges:
four directional edges and one non-directional edge.

Texture Browsing – 5D descriptor related to the perceptual characterization of
texture in terms of regularity, coarseness and directionality.

Region Shape – describes the shape of an object in an image. The descriptor is
robust to noise that may be introduced in the process of segmentation.

Image Retrieval Concept
We have defined the measure of similarity between images as a composition
of the distance metrics defined in the MPEG-7 standard. The procedure consists of the
following steps:

1. Calculate region-to-region distances between the query image and the images in
the database for a chosen set of MPEG-7 descriptors.

2. Sort the list of distances for all regions of the query image.

Democracy: each region belongs to a certain image in the database and
votes for it.

3. Accumulate the votes and pick the best candidates.
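
The three steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: regions are plain feature tuples, the L1 distance stands in for the MPEG-7 distance measures, and the number of votes each query region casts (`votes_per_region`) is an assumed parameter.

```python
from collections import Counter


def l1(a, b):
    """Stand-in distance between two region feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query_regions, database_regions, distance=l1,
             votes_per_region=3, top_k=5):
    """Rank database images by the votes cast for their regions.

    query_regions: feature vectors of the regions of the query image.
    database_regions: (image_id, feature_vector) pairs for all database regions.
    """
    votes = Counter()
    for query in query_regions:
        # Steps 1 and 2: region-to-region distances, sorted in ascending order.
        ranked = sorted(database_regions, key=lambda item: distance(query, item[1]))
        # "Democracy": each of the closest regions votes for its own image.
        for image_id, _ in ranked[:votes_per_region]:
            votes[image_id] += 1
    # Step 3: accumulate the votes and pick the best candidates.
    return votes.most_common(top_k)


if __name__ == "__main__":
    database = [("a", (0, 0)), ("a", (1, 1)), ("b", (10, 10)),
                ("b", (11, 11)), ("c", (0.5, 0.5))]
    print(retrieve([(0, 0), (1, 1)], database))
```

Voting makes the ranking depend on how many regions of an image match the query rather than on a single closest region.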

Preliminary results: the method outperforms image retrieval based on grid
segmentation (min. F-score = 0.34 on the MGV database).
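
For reference, the F-score combines precision and recall into one number. The helper below is a generic sketch with hypothetical counts; it does not reproduce the evaluation protocol behind the figure quoted above.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from retrieval counts (beta=1 weights precision and recall equally)."""
    retrieved = true_positives + false_positives
    relevant = true_positives + false_negatives
    if true_positives == 0 or retrieved == 0 or relevant == 0:
        return 0.0
    precision = true_positives / retrieved
    recall = true_positives / relevant
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical query: 3 relevant images returned, 1 irrelevant, 2 missed.
    print(f_score(3, 1, 2))
```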

Image-based Genome-scale RNAi
Profiling

Objective: Identification of genes involved in the cell division
process using the esiRNA gene silencing method.

• Initial object identification based on template
matching.

• Post-processing of candidate solutions: Evolutionary
Algorithm (soft selection, Gaussian mutation).
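
A minimal sketch of an evolutionary algorithm with the two operators named above. It assumes that "soft selection" means fitness-proportional selection, so that weaker individuals keep a chance to reproduce, and that mutation adds zero-mean Gaussian noise to every gene. The fitness function `peak` and all parameter values are hypothetical and unrelated to the cell-detection task.

```python
import math
import random


def evolve(fitness, dim, population_size=30, generations=100, sigma=0.1, seed=0):
    """Maximise a strictly positive `fitness` using proportional selection and Gaussian mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(individual) for individual in population]
        # Soft selection: parents are drawn with probability proportional to
        # fitness, so even weak individuals can occasionally reproduce.
        parents = rng.choices(population, weights=scores, k=population_size)
        # Gaussian mutation: perturb every gene of every offspring.
        population = [[gene + rng.gauss(0.0, sigma) for gene in parent]
                      for parent in parents]
        champion = max(population, key=fitness)
        if fitness(champion) > fitness(best):
            best = champion
    return best


def peak(x):
    """Hypothetical fitness landscape with a single maximum of 1 at the origin."""
    return math.exp(-sum(v * v for v in x))


if __name__ == "__main__":
    solution = evolve(peak, dim=2)
    print(solution, peak(solution))
```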

Genome-wide High-Content Screening
Image-based analysis of > 1300 genes’ phenotypes
Identification of > 300 novel genes’ functions.

Fig. GUI Interface of esiImage.

Fig. Detection of mitotic cells in Phase-Contrast Microscopy. HeLa Cell Line.

esiImage: Java-based toolkit for automatic detection of
phenotypes. Author: Lukasz Miroslaw (2006)

Credit: Dr Artur Chorazyczewski

Genome-wide High-Content Screening

Fig. Mitotic Index for CDC16 and control.

Kittler R, Pelletier L, Heninger AK, Slabicki M, Theis M, Miroslaw L, Poser I, Lawo S, Grabner
H, Kozak K, Wagner J, Surendranath V, Richter C, Bowen W, Jackson AL, Habermann B, Hyman
AA, Buchholz F. Genome-scale RNAi profiling of cell division in human tissue culture cells.
Nature Cell Biol. 2007 Dec;9(12):1401-12.

Rapid Cervical Cancer Diagnosis

Fig. Image partition.
Fig. Phase-Contrast Image with endothelial cells of interest.

80 Statistical Geometrical Features.
Learning and Testing Phase.
Sequential Forward-Backward Selection.
kNN, Linear Fisher Discriminant.
Post-processing.

Dresden Technical University, Prof. Fuchs Image Processing Group

T. Schilling, L. Mirosław, G. Głąb, M. Smereka.
Towards rapid cervical cancer diagnosis: automated
detection and classification of pathological cells in
phase contrast images. Int. J. Gynec Cancer 2007;
17(1):118-26.

Project proposals #1

Let’s look inside PubMed

Consortium:
• BIOTEC (Prof. Schroeder): www.gopubmed.com
• Wroclaw University of Technology (CBIR)
• ETH Zurich ? (Computer Vision)

Objective: A semantic search engine for life
science images.

Molecular Retrieval System

Prof. Kim Baldridge Lab

Computer Vision meets AI

Wroclaw University of Technology
Wyb. Wyspianskiego 27
Wroclaw, Poland

info@ai.pwr.wroc.pl
www.ai.pwr.wroc.pl

PATSI: Photo Annotation through Similar
Images with Annotation Length Optimization

• Automatic image annotation describes a previously unseen image
with a set of keywords from the semantic dictionary.

• Hypothesis: Similar images should share similar annotations.

• Model:
  • An image is described by a set of visual features.
  • Each visual feature is composed of low-level attributes
(shape, color, edges, texture).

[1] M. Stanek, G. Paradowski, H. Kwasnicka, PATSI — Photo Annotation through Finding Similar Images with Multivariate Gaussian Models,
Lecture Notes in Computer Science, 2010, Vol. 6375, pp. 284-291.

[2] B. Broda, M. Stanek, G. Paradowski, H. Kwasnicka, ImageCLEF 2008 (the method was ranked 5 among 18 participants in one of the

Model
Image model: Multivariate Gaussian Distribution

Jensen-Shannon divergence between images A and B is defined as:

Kullback - Leibler divergence:
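
The formulas on this slide were lost in extraction, so the following is only a hedged sketch. The Kullback-Leibler divergence between two Gaussians has a well-known closed form; the version below assumes diagonal covariances to stay dependency-free. The Jensen-Shannon divergence, defined through the mixture M = (A + B)/2 as ½·KL(A‖M) + ½·KL(B‖M), has no closed form for Gaussians, so the second function substitutes the average of the two directed KL terms. That substitution is an assumption made for illustration and is not confirmed by the source.

```python
import math


def kl_diag_gaussians(mean_a, var_a, mean_b, var_b):
    """KL(A || B) for Gaussians with diagonal covariances (per-dimension variances)."""
    return 0.5 * sum(
        va / vb + (mb - ma) ** 2 / vb - 1.0 + math.log(vb / va)
        for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b)
    )


def symmetric_divergence(mean_a, var_a, mean_b, var_b):
    """Average of KL(A || B) and KL(B || A): a symmetrised stand-in, not the true JS divergence."""
    return 0.5 * (kl_diag_gaussians(mean_a, var_a, mean_b, var_b)
                  + kl_diag_gaussians(mean_b, var_b, mean_a, var_a))


if __name__ == "__main__":
    # Two hypothetical image models over a 2-D feature space.
    image_a = ([0.0, 1.0], [1.0, 0.5])
    image_b = ([0.5, 1.5], [2.0, 0.5])
    print(symmetric_divergence(*image_a, *image_b))
```

A symmetric measure is convenient for retrieval because the dissimilarity of A to B then equals that of B to A.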

Content Based Image Retrieval

1. Similaris : a web-based system for gathering similarity measures for a
given image database.

2. Visible : a web-based system for evaluation of the CBIR system.

Author: Bartek Dzienkowski, M.Sc.

Results

Grzegorz Paradowski, Marlena Ochocinska

Model-free Detection of Near-Duplicate Fragments
• Problem definition:

Given two random images (i.e. without any a priori knowledge
about their content), identify pairs of visually similar,
near-duplicate image fragments that are related by affine
transformations. The fragments are formed by sets of matched
keypoint pairs satisfying the same affine transformation.

Features analysed: Hessian-Laplace, Harris-Affine, MSER, SURF and SIFT descriptors.
• Calculate and decompose the affine transformation for each pair of matched triangles.
• Build the histogram of parameters for the decomposed affine transformations.
• Detect and post-process the high-density areas (peaks) in the histogram to build the
near-duplicate fragments in both images.
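
The first step above can be sketched as follows: three matched keypoint pairs (a pair of triangles) determine one affine map exactly, and its linear part can be split into rotation, two scales and a shear. The particular decomposition (rotation times an upper-triangular matrix) and the function names are assumptions for illustration; the cited papers may use a different parametrisation. Parameters computed this way for many triangle pairs would then be collected in the histogram.

```python
import math


def affine_from_triangles(src, dst):
    """Return (A, t) such that dst_i = A * src_i + t for three point pairs."""
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, v0), (u1, v1), (u2, v2) = dst
    # Edge vectors of both triangles, stored as matrix columns P and Q.
    p11, p12, p21, p22 = x1 - x0, x2 - x0, y1 - y0, y2 - y0
    q11, q12, q21, q22 = u1 - u0, u2 - u0, v1 - v0, v2 - v0
    det = p11 * p22 - p12 * p21
    if abs(det) < 1e-12:
        raise ValueError("source triangle is degenerate")
    # A = Q * P^-1
    a = (q11 * p22 - q12 * p21) / det
    b = (q12 * p11 - q11 * p12) / det
    c = (q21 * p22 - q22 * p21) / det
    d = (q22 * p11 - q21 * p12) / det
    t = (u0 - (a * x0 + b * y0), v0 - (c * x0 + d * y0))
    return ((a, b), (c, d)), t


def decompose(A):
    """Split A into R(rotation) * [[scale_x, shear], [0, scale_y]]."""
    (a, b), (c, d) = A
    theta = math.atan2(c, a)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return {
        "rotation": theta,
        "scale_x": math.hypot(a, c),
        "scale_y": -b * sin_t + d * cos_t,
        "shear": b * cos_t + d * sin_t,
    }


if __name__ == "__main__":
    # Hypothetical match: rotate by 90 degrees, scale by 2, shift by (3, 4).
    A, t = affine_from_triangles([(0, 0), (1, 0), (0, 1)], [(3, 4), (3, 6), (1, 4)])
    print(A, t, decompose(A))
```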

[1] Mariusz Paradowski, Andrzej Sluzek, Matching Planar Objects of Images using Histograms of Decomposed Affine Transforms. Submitted to Pattern
Recognition.
[2] Mariusz Paradowski, Andrzej Sluzek, Detection of Image Fragments Related by Affine Transforms: Matching Triangles and Ellipses, ICISA 2010, Korea.

Results

Fig. Example results: road sign; beer cans; box and a bottle; different side of a tower; landscape.

The topological method – topological graph

1) Key points are detected and paired
2) Spatial neighbors are found
3) Topological constraints are verified
4) Nodes and edges are removed

Results – topological matching

Fig. Example results: two objects; same text; deformed book; same scene, some differences; different camera position.

Fuzzy Image Retrieval
• Automatic image annotation describes a previously unseen image
with a set of keywords from the semantic dictionary.

• Hypothesis: Similar images should share similar annotations.

• Model:
  • An image is described by a set of visual features.
  • Each visual feature is composed of low-level attributes
(shape, color, edges, texture).

[1] PATSI — Photo Annotation through Finding Similar Images with Multivariate Gaussian Models, Lecture Notes in Computer Science,
2010, Vol. 6375, pp. 284-291.

One of the leading Polish IT institutes.

Main activity: artificial intelligence, machine
learning, computer vision, data mining.

• Budget: 4.5M EUR (national grants, EU funds)

Collaboration
• LMC at ETH Zurich (Image Processing)
• Industry: Microsoft, IBM, Volvo, Google.
• Research: TU Munchen, TU Dresden, xxx?

Team
• Prof. Halina Kwaśnicka: Assoc. Prof. at WUT, Poland. Deputy Director of the Institute of Informatics. Head of the AI Division. Project Leader.
• Prof. Urszula Markowska-Kaczmar: Assoc. Prof. at WUT. Expertise in Computational Intelligence, Neural Networks.
• Elzbieta Hudyma: Assistant Prof. at WUT, Poland. Expertise in Computer Graphics, Machine Learning.
• Mariusz Paradowski: PhD, Post-doc at Nanyang Technological University, Singapore. Assistant Professor at WUT, Poland. Expertise in Machine Learning and Computer Vision.
• Michał Stanek: Pre-doc at WUT, Poland. Expertise in Machine Learning, CBIR systems and Computer Vision.
• Michalak Krzysztof, Ph.D.: Expertise in Machine Learning.
• Paweł Myszkowski: Assistant Prof. at WUT, Poland. Expertise in Data Mining and Evolutionary Algorithms.
• Martin Tabakow: Assistant Prof. at WUT, Poland. Expertise in Machine Learning and Analysis of biomedical images.
• Bartłomiej Broda: Pre-doc at WUT, Poland. Expertise in Machine Learning, CBIR systems and image annotations.
• Wojciech Tarnawski: PhD, Post-doc in LTNT at ETH Zurich. Expertise in Machine Learning and Computer Vision.
• Lukasz Miroslaw: Expertise in Machine Learning, Computer Vision, Bioinformatics.

Master Students: Grzegorz Terlikowski, Sylwester Plamowski, Bartek Dzienkowski, Marlena
Ochocinska, Agnieszka Glebala, Lukasz Jercinski

Transfer Annotator

Tests performed on MGV 2006 database.

Tests performed on ICPR 2004 database.

Tests performed on IAPR TC 12 database.

Results

Grzegorz Terlikowski, Marlena Ochocinska, Wojciech Tarnawski and Lukasz Miroslaw

Model-free Detection of Near-Duplicate Fragments

Results

• High precision, lower recall.
• Real-time analysis is possible.
• The method was tested on 3 databases:
  • in-house database with 100 outdoor/indoor images containing
objects acquired in different conditions (4950 image pairs)
  • Faces category of Caltech101 (180K image pairs)
  • Oxford5K (270K image pairs)

Results

Quality measure      HarAff SIFT   HesLap SIFT   MSER SIFT   SURF
Precision [object]   0.96          0.95          0.95        0.97
Recall [object]      0.82          0.70          0.74        0.62
Precision [area]     0.95          0.93          0.94        0.90
Recall [area]        0.65          0.51          0.50        0.50

Tested on XXX

Results

Coverage face area [%]   10     20     30     40     50
Precision [object]       0.48   0.71   0.89   0.95   0.96
Recall [object]          0.87   0.84   0.81   0.78   0.72

Tab. Face identification. Tested on 188790 image pairs (Caltech101 face category)

Results

                     Without ROI   With ROI
Precision [object]   0.88          0.94
Recall [object]      0.56          0.54

Tab. Image Retrieval. Tested on 276970 image pairs (Oxford5K)

Michal Stanek, Oskar Maier, Halina Kwasnicka, "Wroclaw University of Technology Participation at ImageCLEF 2010 Photo Annotation Track", Conf. on Multilingual and Multimodal Information Access
Evaluation, 20-23 Sep 2010, Padua, Italy.

H. Kwaśnicka, M. Paradowski, M. Stanek, M. Spytkowski, A. Śluzek, Intelligent approaches to searching similar images on the basis of visual content, ICI 2010, 10th Int. Conf. on Information at Delta University
for Science and Technology.

Test it yourself!
www.ai.pwr.wroc.pl/similaris



Automated Sign Language Recognition (ASLR)

• Consortium:
  • xxx
  • Wroclaw University of Technology (Computer Vision, AI, Machine
Learning)
  • ETH Zurich (Computer Vision, Gesture Analysis)

• Objective: search for the publications and

Objective: To develop a translation system that
makes it possible to communicate with deaf
people.

Problems: sign language is not international;
there are three types of gestures: finger spelling,
word-level sign vocabulary, and tongue, body and
mouth position.
