1. Combination of Supervised and Unsupervised Classification Using Belief Function Theory
Fatma Karem(1)
Mounir Dhibi(1)
Research Unit PMI 09/UR/13-0, University campus Zarouk Gafsa 2112, Tunisia
Arnaud Martin(2)
Rennes 1 University, UMR 6074 IRISA, Street Edouard Branly BP 30219, 22302 Lannion Cedex, France
Combination of Clustering and Classification
10/5/2012 1
2. Plan
Problem statement
Fusion of information
Belief Function Theory
Proposed Approach
Results
Conclusion and Perspectives
3. Classification Problems
Many methods exist: how to choose between them?
Dependence of the results on the initially chosen parameters
Uncertain data, sometimes missing
How to choose the best parameters?
Fusion of clustering and classification
4. Objectives of fusion
Exploit the complementarity of the two methods
Limit the problems caused by the choice of parameters and by training
Reduce the uncertainty of the data and of the results
How to combine?
Recommendations:
Choose an approach that handles uncertainty and imprecision
Limit conflicts between clustering and classification
5. Information Fusion
Goal of fusion:
Combine information coming from different sources
Reduce the uncertainty and imprecision of the sources
Find a compromise between the sources in order to reduce the conflict between them
Theories treating uncertainty
Examples: probability theory (Bayesian approach), possibility theory, belief function theory (Dempster-Shafer theory)
6. Belief Function Theory (1/2)
Let $\Theta$ be a finite, non-empty set of elementary events for a given problem, called the frame of discernment: $\Theta = \{\theta_i, i = 1, \dots, n\}$, where the $\theta_i$ are the hypotheses about the problem domain.
The set of all subsets of $\Theta$ is the power set of $\Theta$, denoted $2^\Theta$.
The impact of a piece of evidence on the different subsets of the frame of discernment $\Theta$ is represented by the basic belief assignment (bba), denoted m:
$m : 2^\Theta \to [0, 1]$ such that $\sum_{A \subseteq \Theta} m(A) = 1$ (1)
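As a small illustration of definition (1), a bba can be stored as a mapping from subsets of the frame to masses. The sketch below is only one possible representation: Python frozensets stand for subsets, and the helper names `power_set` and `is_valid_bba`, the frame `{q1, q2, q3}` and the mass values are hypothetical choices made for the example.

```python
from itertools import chain, combinations
import math


def power_set(frame):
    """Return every subset of the frame of discernment as a frozenset."""
    items = list(frame)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_valid_bba(m, frame):
    """Check that m maps subsets of the frame to [0, 1] and sums to 1."""
    frame = frozenset(frame)
    in_range = all(a <= frame and 0.0 <= v <= 1.0 for a, v in m.items())
    return in_range and math.isclose(sum(m.values()), 1.0)


# Hypothetical frame and masses, chosen only for illustration.
frame = {"q1", "q2", "q3"}
m = {
    frozenset({"q1"}): 0.6,        # evidence pointing to q1 alone
    frozenset({"q1", "q2"}): 0.3,  # evidence that cannot separate q1 from q2
    frozenset(frame): 0.1,         # mass left on the whole frame (ignorance)
}
```

Putting mass on non-singleton subsets is what lets a bba express imprecision, which a plain probability distribution over the hypotheses cannot do.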
7. Belief Function Theory (2/2)
Belief function
Plausibility function
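For reference, the standard Dempster-Shafer definitions are $Bel(A) = \sum_{B \subseteq A, B \neq \emptyset} m(B)$ and $Pl(A) = \sum_{B \cap A \neq \emptyset} m(B)$. A minimal sketch of both follows, assuming a bba stored as a dict from frozensets to masses; the function names `bel` and `pl` and the example masses are illustrative, not taken from the slides.

```python
def bel(m, a):
    """Belief of A: total mass of the non-empty focal sets included in A."""
    a = frozenset(a)
    return sum(v for b, v in m.items() if b and b <= a)


def pl(m, a):
    """Plausibility of A: total mass of the focal sets that intersect A."""
    a = frozenset(a)
    return sum(v for b, v in m.items() if b & a)


# Hypothetical bba on the frame {q1, q2, q3}.
m = {
    frozenset({"q1"}): 0.6,
    frozenset({"q1", "q2"}): 0.3,
    frozenset({"q1", "q2", "q3"}): 0.1,
}

belief_q1q2 = bel(m, {"q1", "q2"})   # masses of {q1} and {q1, q2}
plausibility_q2 = pl(m, {"q2"})      # masses of {q1, q2} and the whole frame
```

For any subset A, Bel(A) never exceeds Pl(A); the gap between the two reflects how much of the evidence is compatible with A without committing to it.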
8. New Approach (1/3)
Learning database -> Clustering / Classification
How to reach a compromise between the two?
Decision making
9. New Approach (2/3)
Source 1: Clustering -> clusters -> mNS
Source 2: Classification -> classes -> mS
Combination of mNS and mS
Decision making: to which class does each object belong?
Final decision
10. New Approach (3/3)
How to measure our belief in the classes given by the supervised classification?
Step 1:
Unsupervised classification: computation of the similarity (overlap, "recovery") between clusters and classes
Supervised classification: probabilistic model of Appriou
Step 2: conjunctive combination
Step 3: decision making
Criterion: pignistic probability
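Steps 2 and 3 can be sketched with the standard definitions: the unnormalized conjunctive rule $m(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C)$ and the pignistic probability $BetP(x) = \sum_{A \ni x} \frac{m(A)}{|A|\,(1 - m(\emptyset))}$. This is a minimal sketch under assumptions: bbas are dicts from frozensets to masses, and the function names and the toy values of `m_ns` and `m_s` are invented for illustration.

```python
from collections import defaultdict


def conjunctive(m1, m2):
    """Unnormalized conjunctive rule: mass of B ∩ C gets m1(B) * m2(C)."""
    out = defaultdict(float)
    for b, vb in m1.items():
        for c, vc in m2.items():
            out[b & c] += vb * vc  # the empty set collects the conflict
    return dict(out)


def pignistic(m):
    """Spread each focal set's mass evenly over its elements."""
    conflict = m.get(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: pignistic probability undefined")
    betp = defaultdict(float)
    for a, v in m.items():
        for x in a:  # the empty set contributes nothing here
            betp[x] += v / (len(a) * (1.0 - conflict))
    return dict(betp)


def decide(m):
    """Return the hypothesis with the highest pignistic probability."""
    betp = pignistic(m)
    return max(betp, key=betp.get)


# Toy masses for one object (hypothetical values).
theta = frozenset({"q1", "q2", "q3"})
m_ns = {frozenset({"q1"}): 0.5, frozenset({"q1", "q2"}): 0.3, theta: 0.2}
m_s = {frozenset({"q1"}): 0.6, frozenset({"q2"}): 0.3, theta: 0.1}

m = conjunctive(m_ns, m_s)
label = decide(m)
```

The mass that the combination assigns to the empty set measures the conflict between the two sources; the pignistic transform renormalizes it away before the decision is taken.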
11. Masses computation
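Computation of the mass functions for the unsupervised and the supervised sources
[Figure: mass computation for the clustering. Two partitions of the same data, one produced by the clustering and one by the classification (regions labelled C1 to C6), and their overlap ("recovery").]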
12. Computation of mass function for the unsupervised source (1/2)
Computation of the overlap ("recovery") between clusters and classes
[Figure: regions labelled C1, C2, C3, C4 and C6.]
Let $Q = \{q_i, i = 1, \dots, M\}$ be the set of classes found by the supervised classification
and $C = \{C_i, i = 1, \dots, n\}$ the set of clusters found by the unsupervised classification
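The overlap between a cluster and a class can be illustrated as follows. The measure used here, the share $|C_i \cap q_j| / |C_i|$ of a cluster's objects that the classifier labels $q_j$, is an assumption made for this sketch and is not stated on the slide; the function name `overlap` and the toy labels are likewise hypothetical.

```python
from collections import Counter, defaultdict


def overlap(cluster_labels, class_labels):
    """Return {cluster: {class: share of the cluster's objects in that class}}.

    Assumed measure: |C_i ∩ q_j| / |C_i| (an illustration, not the slides' formula).
    """
    counts = defaultdict(Counter)
    for c, q in zip(cluster_labels, class_labels):
        counts[c][q] += 1
    return {
        c: {q: k / sum(cnt.values()) for q, k in cnt.items()}
        for c, cnt in counts.items()
    }


# Toy labels for six objects: cluster from the clustering, class from the classifier.
clusters = ["C1", "C1", "C1", "C2", "C2", "C3"]
classes = ["q1", "q1", "q2", "q2", "q2", "q1"]

r = overlap(clusters, classes)
```

With this measure each cluster's shares sum to 1, so they can be read directly as a degree of support of the cluster for each class.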
13. Computation of mass function for the supervised source (2/2)
With:
the class assigned by the supervised classifier to an observation x
qi: the true class
the reliability coefficient of the supervised classification for that class
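One form of Appriou's probabilistic model that is commonly given in the belief-function literature builds, for each class $q_j$, a bba with three focal sets: $m(\{q_j\}) = \frac{\alpha R p}{1 + R p}$, $m(\Theta \setminus \{q_j\}) = \frac{\alpha}{1 + R p}$ and $m(\Theta) = 1 - \alpha$, where $\alpha$ is the reliability coefficient, $p$ a probability or likelihood value attached to $q_j$, and $R \geq 0$ a normalization factor. Whether the slide uses exactly this variant is an assumption; the sketch below, including the name `appriou_mass` and its parameters, is only illustrative.

```python
def appriou_mass(p, alpha, r, q_j, frame):
    """Build a bba focused on class q_j (assumed variant of Appriou's model).

    p     : probability or likelihood value attached to q_j
    alpha : reliability coefficient in [0, 1]
    r     : non-negative normalization factor
    frame : set of all classes, assumed to contain at least two classes
    """
    frame = frozenset(frame)
    focus = frozenset({q_j})
    denom = 1.0 + r * p
    return {
        focus: alpha * r * p / denom,   # support for q_j
        frame - focus: alpha / denom,   # support against q_j
        frame: 1.0 - alpha,             # unreliable part, left on the whole frame
    }


# Hypothetical values chosen for illustration.
m_example = appriou_mass(p=0.8, alpha=0.9, r=1.0, q_j="q1", frame={"q1", "q2", "q3"})
```

The three masses always sum to 1, and a lower reliability coefficient moves mass from the two committed focal sets to the whole frame.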
14. Experimental Results (1/4)
Classification performance (%)
Data            Before fusion   After fusion
Iris            97.33           100
Abalone         53.67           76.35
Breast-cancer   64.52           80
Haberman        75.17           100
Results obtained for KNN + FCM
15. Experimental Results (2/4)
Classification performance (%)
Data            Before fusion   After fusion
Iris            96              100
Abalone         52              79.80
Breast-cancer   96              100
Haberman        73.83           77.74
Results obtained for Bayes + FCM
16. Experimental Results (3/4)
Classification performance (%)
Data            Before fusion   After fusion
Iris            97.33           100
Abalone         53.10           78.69
Breast-cancer   64.52           80
Haberman        75.17           99.34
Results obtained for KNN + mixture model
17. Experimental Results (4/4)
Classification performance (%)
Data            Before fusion   After fusion
Iris            96              100
Abalone         52              82.45
Breast-cancer   96              100
Haberman        73.83           77.74
Results obtained for Bayes + mixture model
18. Conclusion and Perspectives
• Conclusions
A new approach that handles uncertainty and resolves conflict between the sources
The new approach gives good results on generic data
• Perspectives
Release of the database
Missing data
Real images: sonar and medical images
Improvement of the fusion mechanism