ISSN: 2277 – 9043
International Journal of Advanced Research in Computer Science and Electronics Engineering (IJARCSEE)
Volume 1, Issue 6, August 2012
CLASSIFICATION OF REMOTELY SENSED IMAGE USING RELEVANCE VECTOR MACHINE

1A. Kalarani, 2G. Viji, 3S. Ramprakash

1Assistant Professor, P.S.R.Rengasamy College of Engg for Women, Sivakasi.
2Assistant Professor, P.S.R.Rengasamy College of Engg for Women, Sivakasi.
3Lecturer, M.Kumarasamy College of Engg, Karur.
Abstract— This paper introduces a remotely sensed image classification method based on relevance vector machines (RVMs). The features of the remotely sensed image are extracted, and the classification is performed with the help of those features [4]. It is shown that good classification accuracy is obtained using RVM-based classification, with a significantly smaller relevance vector rate and, therefore, a much faster testing time. This makes the RVM-based classification approach more suitable for applications that require low complexity and, possibly, real-time classification.

Index Terms—Classification, remotely sensed image, Bayesian learning, relevance vector machines (RVMs).
I. INTRODUCTION

In recent years, relevance vector machines (RVMs) have been successfully used in many application domains. In particular, the RVM constitutes a Bayesian approximation for solving generalized linear classification and regression models [1]. This method not only provides accurate predictions but also enforces sparsity (simplicity) of the model, and it can produce confidence intervals for its predictions. Good trade-offs between accuracy and sparseness of the solution have been observed in many application domains. In the field of remote sensing, the RVM has recently been introduced for the prediction of biophysical parameters. Being a kernel-based method, the key point in obtaining good RVM classifiers is the definition of a suitable kernel function that can properly represent the relations (similarities) among samples (pixels).

The advantages of the RVM are probabilistic predictions, automatic estimation of parameters, and the possibility of choosing arbitrary kernel functions. Most importantly, because RVM classification results in fewer relevance vectors (RVs) [9], classification can be carried out much faster with the RVM. For example, the RVM has been used for the detection of microcalcification clusters in digital mammograms, where the RVM classifier was shown to be much more suitable for real-time processing, reducing the computational complexity while maintaining similar detection accuracy. It is proposed in this letter to utilize the RVM for the classification of remotely sensed images, which makes the RVM-based classification approach suitable for applications that require low complexity and, possibly, real-time classification.

II. PROPOSED METHODOLOGY

Fig. 1. Proposed method: remotely sensed image → wavelet transform → feature extraction → classification (RVM) → performance measures.

The proposed methodology classifies the remotely sensed image with the RVM algorithm. In the first stage, the remotely sensed image is transformed using the discrete wavelet transform (DWT), and the approximation subimage is chosen. The features of the approximated image are then extracted and grouped into
i) statistical features
ii) textural features
The statistical features are i) mean, ii) variance, and iii) standard deviation. The textural features are i) energy, ii) entropy, iii) contrast, and iv) homogeneity. The extracted features are taken as training and testing samples, which are classified with the RVM algorithm, and the performance is measured [12]; a sketch of this feature-extraction stage follows.
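As a concrete illustration of this stage, the following Python sketch (our own, not the authors' code) computes the DWT approximation subimage and the seven listed features. It assumes the PyWavelets and scikit-image packages; the Haar wavelet and the GLCM settings are arbitrary choices, since the paper does not specify them.

import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops

def extract_features(image):
    """Sketch: DWT approximation + statistical and textural features."""
    # Level-1 2-D DWT; keep only the approximation subimage.
    # The paper does not name the wavelet, so 'haar' is an assumption.
    approx, _details = pywt.dwt2(np.asarray(image, dtype=float), 'haar')

    # Statistical features: mean, variance, standard deviation.
    stats = [approx.mean(), approx.var(), approx.std()]

    # Quantize to 8 bits and build a gray-level co-occurrence matrix
    # (GLCM) for the textural features.
    span = np.ptp(approx) + 1e-12
    quantized = np.uint8(255 * (approx - approx.min()) / span)
    glcm = graycomatrix(quantized, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    energy = graycoprops(glcm, 'energy')[0, 0]
    contrast = graycoprops(glcm, 'contrast')[0, 0]
    homogeneity = graycoprops(glcm, 'homogeneity')[0, 0]
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))  # GLCM entropy

    return np.array(stats + [energy, entropy, contrast, homogeneity])

The returned seven-element vector per image (or per image patch) would then serve as a training or testing sample for the RVM.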
III. RVM CLASSIFICATION

Supervised learning techniques make use of a training set consisting of sample input vectors $\{x_n\}_{n=1}^{N}$ together with the corresponding targets $\{t_n\}_{n=1}^{N}$. The targets are real values in regression tasks or class labels in classification problems. It is typically desired to learn a model of the dependency of the targets on the inputs from the training set, so that accurate predictions of $t$ can be made
for previously unseen values of $x$ [8]. Commonly, these predictions can be based on some function $y(x)$ defined over the input space in the form of

$y(x; \mathbf{w}) = \sum_{i=1}^{M} w_i \phi_i(x) = \mathbf{w}^T \boldsymbol{\phi}(x)$    (1)

i.e., a linearly weighted sum of $M$ (generally nonlinear and fixed) basis functions $\boldsymbol{\phi}(x) = (\phi_1(x), \phi_2(x), \ldots, \phi_M(x))^T$. Although this model is linear in the parameters (weights) $\mathbf{w} = (w_1, w_2, \ldots, w_M)^T$, it can still be highly flexible, as the size $M$ of the basis set can be effectively large. Learning is basically the process of inferring the function or, equivalently, the parameters of the function $y(x)$. In this context, it is desired to estimate reasonable values for the weights. Given a set of $N$ corresponding training pairs $\{x_n, t_n\}_{n=1}^{N}$, the objective is to find values for the weights $\mathbf{w}$ such that $y(x)$ generalizes well to new data, yet only a few elements of $\mathbf{w}$ are nonzero [5]. Having only a few nonzero weights yields a sparse representation with the advantage of a fast implementation. A minimal sketch of this kernel-basis model follows.
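The following numpy sketch builds the kernel design matrix and evaluates (1); the function names are ours, and the bias-column convention matches the definition of the design matrix given later in this section.

import numpy as np

def design_matrix(X, kernel):
    """Kernel design matrix Phi with a bias column.

    Phi[n, 0] = 1 and Phi[n, m+1] = K(x_n, x_m), where 'kernel' is
    any kernel function K(x_i, x_j), e.g., the RBF defined in Sec. IV.
    """
    N = len(X)
    Phi = np.ones((N, N + 1))
    for n in range(N):
        for m in range(N):
            Phi[n, m + 1] = kernel(X[n], X[m])
    return Phi

def y_model(Phi, w):
    """Eq. (1): y(x_n; w) = w^T phi(x_n), evaluated for all samples."""
    return Phi @ w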
The RVM introduces a prior over the model weights, governed by a set of hyperparameters $\boldsymbol{\alpha}$, in a probabilistic framework. One hyperparameter is associated with each weight, and the most probable values are iteratively estimated from the training data [1]. The most compelling feature of the RVM is that it typically utilizes significantly fewer kernel functions while providing good performance. For two-class classification, any target can be assigned to one of two classes such that $t_n \in \{0, 1\}$. A Bernoulli distribution can be adopted for $p(t|x)$ in the probabilistic framework because only two values (0 and 1) are possible. The logistic sigmoid link function $\sigma(y) = 1/(1 + e^{-y})$ is applied to $y(x)$ to link the random and systematic components and generalize the linear model. Following the definition of the Bernoulli distribution, the likelihood is written as

$p(\mathbf{t}|\mathbf{w}) = \prod_{n=1}^{N} \sigma\{y(x_n; \mathbf{w})\}^{t_n} \left[1 - \sigma\{y(x_n; \mathbf{w})\}\right]^{1 - t_n}$    (2)

for the targets $t_n \in \{0, 1\}$. The likelihood is complemented by a prior over the parameters (weights) of the form

$p(\mathbf{w}|\boldsymbol{\alpha}) = \prod_{i=1}^{N} \left(\frac{\alpha_i}{2\pi}\right)^{1/2} \exp\left(-\frac{\alpha_i w_i^2}{2}\right)$    (3)

where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_N)^T$ collects the hyperparameters introduced to control the strength of the prior over each associated weight [3]. Hence, the prior is Gaussian, but conditioned on $\boldsymbol{\alpha}$. A direct transcription of (2) and (3) is sketched below.
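An illustrative numpy transcription of the likelihood and prior (the eps guard against log(0) is our addition, not part of the paper's formulation):

import numpy as np

def sigmoid(y):
    """Logistic sigmoid link, sigma(y) = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + np.exp(-y))

def log_likelihood(t, Phi, w, eps=1e-12):
    """Log of the Bernoulli likelihood, Eq. (2)."""
    y = sigmoid(Phi @ w)
    return np.sum(t * np.log(y + eps) + (1.0 - t) * np.log(1.0 - y + eps))

def log_prior(w, alpha):
    """Log of the zero-mean Gaussian prior, Eq. (3), up to constants."""
    return 0.5 * np.sum(np.log(alpha) - alpha * w ** 2)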
For a certain $\boldsymbol{\alpha}$ value, the posterior weight distribution conditioned on the data can be obtained using Bayes' rule, i.e.,

$p(\mathbf{w}|\mathbf{t}, \boldsymbol{\alpha}) = \frac{p(\mathbf{t}|\mathbf{w})\, p(\mathbf{w}|\boldsymbol{\alpha})}{p(\mathbf{t}|\boldsymbol{\alpha})}$    (4)

where $p(\mathbf{t}|\mathbf{w})$ is the likelihood, $p(\mathbf{w}|\boldsymbol{\alpha})$ is the prior, and $p(\mathbf{t}|\boldsymbol{\alpha})$ is referred to as the evidence. The weights cannot be obtained analytically, and therefore, a Laplacian approximation procedure is used. 1) Since $p(\mathbf{w}|\mathbf{t}, \boldsymbol{\alpha})$ is proportional to $p(\mathbf{t}|\mathbf{w}) \times p(\mathbf{w}|\boldsymbol{\alpha})$, one aims to find the maximum of

$\log\{p(\mathbf{t}|\mathbf{w})\, p(\mathbf{w}|\boldsymbol{\alpha})\} = \sum_{n=1}^{N} \left[t_n \log y_n + (1 - t_n) \log(1 - y_n)\right] - \frac{1}{2} \mathbf{w}^T A \mathbf{w}$    (5)

for the most probable weights $\mathbf{w}_{MP}$, with $y_n = \sigma\{y(x_n; \mathbf{w})\}$ and $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$ composed of the current values of $\boldsymbol{\alpha}$. This is a penalized logistic log-likelihood function and requires iterative maximization. The iteratively reweighted least-squares algorithm can be used to find $\mathbf{w}_{MP}$ [6]. The logistic log-likelihood function can be differentiated twice to obtain the Hessian in the form of

$\nabla_{\mathbf{w}} \nabla_{\mathbf{w}} \log p(\mathbf{w}|\mathbf{t}, \boldsymbol{\alpha})\big|_{\mathbf{w}_{MP}} = -(\Phi^T B \Phi + A)$    (6)

where $B = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_N)$ is a diagonal matrix with $\beta_n = \sigma\{y(x_n; \mathbf{w}_{MP})\}[1 - \sigma\{y(x_n; \mathbf{w}_{MP})\}]$, and $\Phi$ is the 'design' matrix with $\Phi_{nm} = K(x_n, x_{m-1})$ and $\Phi_{n1} = 1$. This result is then negated and inverted to give the covariance $\Sigma$ of a Gaussian approximation to the posterior over the weights, centered at $\mathbf{w}_{MP}$, as follows [12]:

$\Sigma = (\Phi^T B \Phi + A)^{-1}.$    (7)

In this way, the classification problem is locally linearized around $\mathbf{w}_{MP}$ in an effective way, with

$\mathbf{w}_{MP} = \Sigma \Phi^T B \hat{\mathbf{t}}$    (8)

$\hat{\mathbf{t}} = \Phi \mathbf{w}_{MP} + B^{-1}(\mathbf{t} - \mathbf{y}).$    (9)

These equations are basically equivalent to the solution of a generalized least-squares problem; one such update step is sketched below.
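A numpy sketch of one update step, transcribing (6)-(9); the guard on beta is our addition to keep the inverse of B numerically well defined.

import numpy as np

def laplace_step(Phi, t, w, alpha):
    """One Laplace/IRLS update toward w_MP, following Eqs. (6)-(9)."""
    A = np.diag(alpha)
    y = 1.0 / (1.0 + np.exp(-(Phi @ w)))          # y_n = sigma{y(x_n; w)}
    beta = np.maximum(y * (1.0 - y), 1e-12)       # beta_n, guarded
    B = np.diag(beta)
    Sigma = np.linalg.inv(Phi.T @ B @ Phi + A)    # Eq. (7)
    t_hat = Phi @ w + (t - y) / beta              # Eq. (9)
    w_mp = Sigma @ Phi.T @ B @ t_hat              # Eq. (8)
    return w_mp, Sigma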
After obtaining $\mathbf{w}_{MP}$, the hyperparameters $\alpha_i$ are updated using $\alpha_i^{new} = \gamma_i / w_i^2$, where $w_i$ is the $i$th posterior mean weight and $\gamma_i$ is defined as $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$, with $\Sigma_{ii}$ the $i$th diagonal element of the covariance; $\gamma_i$ can be regarded as a measure of how well each parameter $w_i$ is determined by the data [15]. During the optimization process, many $\alpha_i$ take large values, and thus the corresponding model weights are pruned out, realizing sparsity. The optimization process typically continues until the maximum change in the $\alpha_i$ values falls below a certain threshold or the maximum number of iterations is reached; the overall loop is sketched below.
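The complete loop can be sketched as follows, reusing laplace_step from the sketch above. The iteration cap, tolerance, and pruning threshold are illustrative assumptions, not values from the paper.

import numpy as np

def rvm_train(Phi, t, max_iter=500, tol=1e-3, alpha_max=1e6):
    """Sketch of the overall RVM optimization loop."""
    M = Phi.shape[1]
    keep = np.arange(M)            # indices of surviving basis functions
    w = np.zeros(M)
    alpha = np.ones(M)
    for _ in range(max_iter):
        w, Sigma = laplace_step(Phi[:, keep], t, w, alpha)
        # gamma_i = 1 - alpha_i * Sigma_ii, clipped for numerical safety.
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 1e-12, 1.0)
        alpha_new = gamma / np.maximum(w ** 2, 1e-12)  # alpha_i^new
        delta = np.max(np.abs(alpha_new - alpha))
        alpha = alpha_new
        # Prune weights whose alpha_i has grown very large.
        mask = alpha < alpha_max
        keep, w, alpha = keep[mask], w[mask], alpha[mask]
        if delta < tol:            # stop on small change in the alphas
            break
    return keep, w, alpha          # 'keep' indexes the relevance vectors

The indices returned in keep identify the relevance vectors; their count corresponds to the RV column reported in Table 1 below.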
IV. EXPERIMENTAL RESULTS

In this section, the proposed RVM classifier is tested on an urban image of the area of Pavia, Italy.

Fig. (a) RGB composition of the Pavia image, and (b) ground truth.

This image was acquired by the DAIS 7915 airborne imaging spectrometer of DLR. This is a challenging urban classification problem, dominated by directional features and relatively high spatial resolution. Different values of the kernel width were tried on an exponential grid.

The most popular kernels used with the RVM are the linear, polynomial, and radial basis function (RBF) kernels. The RBF kernel typically shows a good performance and is therefore employed in the provided results. Note that the kernel parameter serves as an inner-product coefficient for the polynomial kernel, whereas it determines the width in the case of the RBF kernel.

Linear kernel: $K(x_i, x_j) = x_i \cdot x_j$

Polynomial kernel: $K(x_i, x_j) = (x_i \cdot x_j)^d$

RBF kernel: $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$
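Transcribed directly into Python (the degree d and width gamma defaults are placeholders; as noted above, the width was tuned on an exponential grid):

import numpy as np

def linear_kernel(xi, xj):
    return float(np.dot(xi, xj))                     # K = x_i . x_j

def polynomial_kernel(xi, xj, d=2):
    return float(np.dot(xi, xj)) ** d                # K = (x_i . x_j)^d

def rbf_kernel(xi, xj, gamma=1.0):
    # K = exp(-gamma * ||x_i - x_j||^2)
    return float(np.exp(-gamma * np.sum((xi - xj) ** 2)))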
The accuracy (AC) and the number of relevance vectors (RV) obtained for the extracted features homogeneity and contrast are tabulated in Table 1.

Table 1. Classification results for the extracted features:

MODEL   FEATURE       AC (%)   RV
RVM     homogeneity   96       5
RVM     contrast      97       11

The RV plots for the two-class problem {0,1} for the features homogeneity and contrast are shown in Figs. 2 and 3, respectively.

[Fig. 2: RVM classification scatter plot for the feature homogeneity, showing Class 1, Class 2, the decision boundary, the p = 0.25/0.75 contours, and the RVs.]

Fig. 2. Classification map obtained for a two-class problem for the feature homogeneity. Red and blue dots indicate the classes {0,1}, red dots point out the relevance vectors (RVs), the red line represents the classification boundary, and the grey lines are the confidence intervals at p = 0.25 and p = 0.75.
[Fig. 3: RVM classification scatter plot for the feature contrast, showing Class 1, Class 2, the decision boundary, the p = 0.25/0.75 contours, and the RVs.]

Fig. 3. Classification map obtained for a two-class problem for the feature contrast. Red and blue dots indicate the classes {0,1}, red dots point out the relevance vectors (RVs), the red line represents the classification boundary, and the grey lines are the confidence intervals at p = 0.25 and p = 0.75.

V. CONCLUSION

RVM-based image classification provides good classification accuracy with a significantly smaller RV rate and, therefore, a much faster testing time. Its most evident and compelling results are its accuracy and sparseness. The RVM-based classification approach is thus well suited to applications that require low complexity and, possibly, real-time classification.

REFERENCES

[1] P. Samui, V. R. Mandla, A. Krishna, and T. Teja, "Prediction of rainfall using support vector machine and relevance vector machine," Earth Science India, vol. 4(IV), pp. 188–200, Oct. 2011.

[2] A. Talei, L. H. C. Chua, and C. Quek, "A novel application of a neuro-fuzzy computational technique in event-based rainfall-runoff modeling," Expert Systems with Applications, vol. 37, no. 12, pp. 7456–7468, 2010.

[3] M. E. Tipping, "The relevance vector machine," in Advances in Neural Information Processing Systems, vol. 12, S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. Cambridge, MA: MIT Press, 2000.

[4] M. E. Tipping, "Sparse Bayesian learning and the relevance vector machine," J. Mach. Learn. Res., vol. 1, pp. 211–244, 2001.

[5] L. Wei, Y. Yang, R. M. Nishikawa, M. N. Wernick, and A. Edwards, "Relevance vector machine for automatic detection of clustered microcalcifications," IEEE Trans. Med. Imag., vol. 24, no. 10, pp. 1278–1285, Oct. 2005.

[6] D. J. C. MacKay, "The evidence framework applied to classification networks," Neural Comput., vol. 4, no. 5, pp. 720–736, 1992.

[7] I. T. Nabney, "Efficient training of RBF networks for classification," in Proc. 9th ICANN, 1999, vol. 1, pp. 210–215.

[8] R. Johansson and P. Nugues, "Sparse Bayesian classification of predicate arguments," in Proc. 9th Conf. Comput. Natural Language Learn., 43rd Annu. Meeting Assoc. Comput. Linguistics, Ann Arbor, MI, 2005, pp. 177–200.

[9] G. Camps-Valls, L. Gomez-Chova, J. Vila-Francés, J. Amorós-López, J. Muñoz-Marí, and J. Calpe-Maravilla, "Retrieval of oceanic chlorophyll concentration with relevance vector machines," Remote Sensing of Environment, vol. 105, no. 1, pp. 23–33, Nov. 2006.

[10] B. E. Boser, I. M. Guyon, and V. Vapnik, "A training algorithm for optimal margin classifiers," in Proc. 5th Annu. ACM Workshop Comput. Learn. Theory, 1992, pp. 144–152.

[11] C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, U. Fayyad, Ed., 1998, pp. 1–43.

[12] F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.

[13] G. Camps-Valls, L. Gomez-Chova, J. Vila-Francés, J. Amorós-López, J. Muñoz-Marí, and J. Calpe-Maravilla, "Relevance vector machines for sparse learning of biophysical parameters," in Proc. SPIE Int. Symp. Remote Sensing XI, Bruges, Belgium, Sep. 2005, vol. 5982.

[14] G. Camps-Valls and L. Bruzzone, "Kernel-based methods for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 6, June 2005.

[15] G. Camps-Valls, L. Gómez-Chova, J. Muñoz-Marí, J. Vila-Francés, and J. Calpe-Maravilla, "Composite kernels for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 93–97, Jan. 2006.

[16] M. Seeger, "Gaussian processes for machine learning," Int. J. Neural Syst., vol. 14, no. 2, pp. 69–106, 2004.

[17] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA: The MIT Press, 2006.

[18] N. Nikolaev and P. Tino, "Sequential relevance vector machine learning from time series," in Proc. Int. Joint Conf. Neural Networks, Montreal, Canada, Aug. 2005, pp. 468–473.
[19] J. Quiñonero-Candela, Learning with Uncertainty – Gaussian Processes and Relevance Vector Machines, Ph.D. thesis, Technical University of Denmark, Informatics and Mathematical Modelling, Kongens Lyngby, Denmark, Nov. 2004.

[20] G. Camps-Valls, M. Martínez-Ramón, J. L. Rojo-Álvarez, and J. Muñoz-Marí, "Nonlinear system identification with composite relevance vector machines," IEEE Signal Process. Lett., vol. 14, no. 4, pp. 279–282, Apr. 2007.

Kalarani Athilingam completed her B.Engg. degree in Electronics and Communication Engineering at Anna University, Chennai, in 2008 and the Master of Engg. degree at Anna University, Tirunelveli, in 2010. Since June 2010, she has been working in P.S.R.Rengasamy College of Engg for Women, Sivakasi. Her research areas include digital electronics, digital image processing, antennas, and communication theory. She has attended several workshops and conferences at various engineering colleges.

Viji Gurusamy received the B.Engg. degree in Electronics and Communication Engineering from Anna University, Chennai, in 2008 and the Master of Engg. degree from Anna University, Tirunelveli, in 2010. From June 2010 to May 2012, she worked in M.Kumarasamy College of Engg, Karur. She is currently working in P.S.R.Rengasamy College of Engg for Women, Sivakasi. She has attended four international conferences and one national conference at various colleges. Her research areas include digital signal processing, digital image processing, and digital communication.

Ramprakash Subburam received the B.Engg. degree in Electronics and Instrumentation Engineering from Anna University, Chennai, in 2009 and is pursuing the Master of Engg. degree at Anna University, Coimbatore. From May 2009, he worked as an Instrumentation Site Engineer with Micotec Engineers and Contractors (a subcontractor to Yokogawa India Ltd) at the Empee Cogen power plant, Edaikal, Tirunelveli District. Since June 2010, he has been working in M.Kumarasamy College of Engg, Karur. His research areas include wireless communication, biomedical instrumentation, process control, and digital image processing.