ERCIM'11 - Handling imprecision in graphical models
Classification with decision trees from a nonparametric predictive inference perspective
Joaquín Abellán†, Rebecca M. Baker§, Frank P.A. Coolen§,
Richard J. Crossman¶, Andrés R. Masegosa†
† Department of Computer Science and Artificial Intelligence, University of
Granada, Granada, Spain
§ Department of Mathematical Sciences, Durham University, Durham, UK
¶ Warwick Medical School, University of Warwick, Coventry, UK
London, December 2011
Outline
1 Introduction
2 Imprecise Models for Multinomial Data
3 Uncertainty Measures and Decision Trees
4 Experimental Evaluation
5 Conclusions & Future Work
Part I
Introduction
Imprecise Probabilities
Imprecise Probabilities is a term that subsumes several
theories for reasoning under uncertainty:
Belief Functions
Reachable Probability Intervals
Capacities of various orders
Upper and lower probabilities
Credal Sets
...
Uncertainty Measures
Extensions of classical information theory have been developed for these imprecise models.
Several extensions of the Shannon entropy measure have been proposed:
The maximum entropy measure, proposed by (Abellán and Moral, 2003) as a total uncertainty measure for general credal sets.
The use of this measure with the Imprecise Dirichlet Model (IDM) has led to several successful applications in the field of data mining, especially to decision trees for supervised classification.
The Imprecise Dirichlet Model (IDM)
The IDM was proposed by Walley in 1996 for statistical inference from multinomial data; it was developed to correct shortcomings of earlier alternative objective models.
It satisfies a set of principles which Walley claims to be desirable for inference:
the Representation Invariance Principle (RIP).
It can be expressed via a set of reachable probability intervals and a belief function (Abellán, 2006).
Non-parametric Predictive Inference
The Non-parametric Predictive Inference model for multinomial data (NPI-M) was presented by (Coolen and Augustin, 2005) as an alternative to the IDM.
It learns from data in the absence of prior knowledge and with only a few modeling assumptions.
It combines a post-data exchangeability-like assumption with a latent variable representation of the data.
[Figure: probability wheel whose slices carry the observed B and P outcomes.]
The NPI-M does not satisfy the Representation Invariance Principle (RIP).
This is not a shortcoming for Coolen and Augustin; a weaker principle is proposed instead.
Classification with Decision Trees using the NPI-M
In this work, we employ the NPI-M to build decision trees, using the maximum entropy (ME) measure as the split criterion:
We experimentally compare against the decision trees built with the IDM for different values of s.
Classification accuracy is slightly improved.
The resulting decision trees are notably smaller.
The method is parameter-free.
Part II
Imprecise Models for Multinomial Data
Imprecise Dirichlet Model
The imprecise Dirichlet model (IDM) was introduced by Walley for inference about the probability distribution of a categorical variable.
Let us assume that X is a variable taking values on a finite set $\mathcal{X} = \{x_1, \ldots, x_K\}$ and that we have a sample of N independent and identically distributed outcomes of X.
The goal is to estimate the probabilities $\theta_x = p(x)$ with which X takes its values.
Common Bayesian procedure:
Assume a Dirichlet prior for the parameter vector $(\theta_x)_{x \in \mathcal{X}}$:
$$f((\theta_x)_{x \in \mathcal{X}}) = \frac{\Gamma(s)}{\prod_{x \in \mathcal{X}} \Gamma(s \cdot t_x)} \prod_{x \in \mathcal{X}} \theta_x^{s \cdot t_x - 1}$$
The parameter s > 0, and $t = (t_x)_{x \in \mathcal{X}}$ is a vector of positive real numbers satisfying $\sum_{x \in \mathcal{X}} t_x = 1$.
Take the posterior expectation of the parameters given the sample.
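As a point of reference before moving to the imprecise case, here is a minimal Python sketch of this Bayesian point estimate, E[θ_x | D] = (n(x) + s·t_x) / (N + s); the function name and the dictionary representation of t are our own illustration, not from the paper:

```python
from collections import Counter

def dirichlet_posterior_mean(sample, s, t):
    """Posterior expectation E[theta_x | D] = (n(x) + s*t_x) / (N + s)
    under a Dirichlet prior with parameters s*t_x; `t` maps each
    category to t_x > 0 with sum(t_x) = 1."""
    counts = Counter(sample)           # n(x) for each observed category
    N = len(sample)
    return {x: (counts[x] + s * t_x) / (N + s) for x, t_x in t.items()}

# e.g. uniform t over {B, P, R, Y, G} with s = 1
t = dict.fromkeys("BPRYG", 1 / 5)
print(dirichlet_posterior_mean(list("BBBBPPPPP"), s=1.0, t=t))
```

The IDM, presented next, replaces the single choice of t with all possible choices at once.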
Imprecise Dirichlet Model
Posterior probability:
$$f((\theta_x)_{x \in \mathcal{X}} \mid D) = \frac{\Gamma(N + s)}{\prod_{x \in \mathcal{X}} \Gamma(n(x) + s \cdot t_x)} \prod_{x \in \mathcal{X}} \theta_x^{n(x) + s \cdot t_x - 1}$$
where n(x) is the count frequency of x in the data.
The IDM depends only on the parameter s and considers all possible values of the vector t.
This defines a closed and convex set of prior distributions.
The resulting credal set for a particular variable X can be represented by a system of probability intervals:
$$p(x) \in \left[ \frac{n(x)}{N + s}, \; \frac{n(x) + s}{N + s} \right]$$
The parameter s determines how quickly the lower and upper probabilities converge as more data become available.
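These intervals are immediate to compute from counts. A minimal sketch (the function name is ours; categories must be passed explicitly so that unobserved ones get the interval [0, s/(N+s)]):

```python
from collections import Counter

def idm_intervals(sample, categories, s=1.0):
    """IDM probability intervals [n(x)/(N+s), (n(x)+s)/(N+s)]."""
    counts = Counter(sample)           # Counter returns 0 for unseen x
    N = len(sample)
    return {x: (counts[x] / (N + s), (counts[x] + s) / (N + s))
            for x in categories}

print(idm_intervals(list("BBBBPPPPP"), "BPRYG"))
# with s = 1: B: (0.4, 0.5), P: (0.5, 0.6), R/Y/G: (0.0, 0.1)
```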
The NPI-M model
The NPI model for multinomial data (NPI-M), developed by Coolen and Augustin, is based on a variation of Hill's assumption A(n), which relates to predictive inference involving real-valued data observations.
It combines a post-data exchangeability-like assumption with a latent variable representation of the data.
Suppose there are 5 different categories {B, P, R, Y, G} and the observations are 4 outcomes of B and 5 outcomes of P.
[Figure: probability wheel with nine slices, four labelled B and five labelled P.]
The NPI-M model
Theorem (Coolen and Augustin, 2009). The lower and upper probabilities for the event x are
$$\underline{P}(x) = \max\left(0, \frac{n(x) - 1}{N}\right), \qquad \overline{P}(x) = \min\left(\frac{n(x) + 1}{N}, 1\right)$$
For the previous case {B, P, R, Y, G}, with 4 outcomes of B and 5 of P (N = 9):
B: [3/9, 5/9]; P: [4/9, 6/9]; R: [0, 1/9]; Y: [0, 1/9]; G: [0, 1/9]
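A minimal sketch of these bounds (naming is ours), reproducing the example above with exact fractions:

```python
from collections import Counter
from fractions import Fraction

def npi_intervals(sample, categories):
    """NPI-M singleton bounds: lower = max(0, (n(x)-1)/N),
    upper = min((n(x)+1)/N, 1)."""
    counts = Counter(sample)
    N = Fraction(len(sample))
    return {x: (max(Fraction(0), (counts[x] - 1) / N),
                min((counts[x] + 1) / N, Fraction(1)))
            for x in categories}

print(npi_intervals(list("BBBBPPPPP"), "BPRYG"))
# B: [1/3, 5/9], P: [4/9, 2/3], R/Y/G: [0, 1/9]  (fractions in lowest terms)
```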
The A-NPI-M model
The set of lower and upper probabilities, L, associated with the singleton events x expresses a reachable set of probability intervals, i.e. a credal set:
$$L = \left\{ \left[ \max\left(0, \frac{n(x) - 1}{N}\right), \; \min\left(\frac{n(x) + 1}{N}, 1\right) \right] : x \in \{x_1, \ldots, x_K\} \right\}$$
The set of probability distributions obtained from the NPI-M is not a credal set (see a counterexample in the paper):
p = (3/9, 4/9, 2/27, 2/27, 2/27) belongs to the credal set B: [3/9, 5/9]; P: [4/9, 6/9]; R: [0, 1/9]; Y: [0, 1/9]; G: [0, 1/9], but it is not possible to find a configuration of the probability wheel that corresponds to p.
The A-NPI-M is a simplified, approximate model derived from the NPI-M by considering all probability distributions compatible with the set of lower and upper probabilities, L, obtained from the NPI-M.
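The counterexample is easy to verify numerically: p lies inside the interval credal set, and hence belongs to the A-NPI-M, even though no probability-wheel configuration under the exact NPI-M realizes it. A small self-contained check (names are ours):

```python
from fractions import Fraction as F

# NPI-M singleton bounds L for the example (N = 9)
L = {"B": (F(3, 9), F(5, 9)), "P": (F(4, 9), F(6, 9)),
     "R": (F(0), F(1, 9)), "Y": (F(0), F(1, 9)), "G": (F(0), F(1, 9))}

def in_interval_credal_set(p, intervals):
    """Does p respect every [lower, upper] bound and sum to one?"""
    return (sum(p.values()) == 1
            and all(lo <= p[x] <= up for x, (lo, up) in intervals.items()))

p = {"B": F(3, 9), "P": F(4, 9), "R": F(2, 27), "Y": F(2, 27), "G": F(2, 27)}
print(in_interval_credal_set(p, L))   # True: p is in the A-NPI-M credal set,
                                      # yet no wheel configuration yields it
```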
Part III
Uncertainty Measures and Decision Trees
Decision Trees
A decision tree, also called a classification tree, is a simple structure that can be used as a classifier:
Each node represents an attribute variable.
Each branch represents one of the states of this variable.
Each leaf specifies an expected value of the class variable.
Example: a decision tree for three attribute variables Ai (i = 1, 2, 3), each with two possible values (0, 1), and a class variable C with states c1, c2, c3.
[Figure: example decision tree.]
Building Decision Trees from Data
Basic rule: associate with each node the most informative attribute variable about the class C, among those
not already selected in the path from the root to this node, and
providing more information than if no variable had been included.
We are given a data set D.
Each node of the decision tree defines a set of probabilities for the class variable C, P^σ.
σ is the configuration of a node, i.e. the attribute assignments along the path from the root; e.g. σ = (A1 = 1, A3 = 0) is associated with the node labelled c3 in the example tree.
P^σ is obtained with the IDM, NPI-M or A-NPI-M from the data subset D[σ].
Imprecise Information Gain
$$\mathrm{ImpInfGain}(\sigma, A_i) = TU(\mathcal{P}^{\sigma}) - \sum_{a_i \in A_i} r^{\sigma}_{a_i} \, TU\left(\mathcal{P}^{\sigma \cup (A_i = a_i)}\right)$$
TU is a total uncertainty measure, normally defined on credal sets.
r^σ_{a_i} is the relative frequency with which Ai takes the value ai in D[σ].
σ ∪ (Ai = ai) is the result of adding the assignment Ai = ai to the configuration σ.
The Imprecise Information Gain compares the total uncertainty of configuration σ with the expected total uncertainty of σ ∪ (Ai = ai).
We add a new node, i.e. we split, only if this difference is positive (see the sketch below).
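A minimal sketch of the criterion as a function. The names `rows`, `attr_idx`, `class_idx` and the callable `tu` are ours; `tu` stands for a total uncertainty measure, such as the maximum entropy of the credal set built from the class counts, which is defined on the next slide:

```python
from collections import defaultdict

def imprecise_info_gain(rows, attr_idx, class_idx, tu):
    """ImpInfGain(sigma, A_i): `rows` is the data subset D[sigma];
    `tu` maps a list of class labels to the total uncertainty of the
    credal set built from their counts."""
    base = tu([row[class_idx] for row in rows])           # TU(P^sigma)
    parts = defaultdict(list)                             # split D[sigma] on A_i
    for row in rows:
        parts[row[attr_idx]].append(row[class_idx])
    expected = sum(len(labels) / len(rows) * tu(labels)   # r^sigma_{a_i} weights
                   for labels in parts.values())
    return base - expected                                # split only if > 0
```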
Maximum Entropy
Maximum entropy is an aggregate uncertainty measure that aims to generalize the classic measures (the Hartley measure and the Shannon entropy).
It satisfies all the required properties (additivity, subadditivity, monotonicity, proper range, etc.).
It is defined as a functional S*:
$$S^{*}(\mathcal{P}^{\sigma}) = \max_{p \in \mathcal{P}^{\sigma}} \left\{ -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) \right\}$$
Efficient algorithms for different imprecise models have been proposed (see the sketch below):
credal sets (Abellán and Moral, 2003); the IDM (Abellán and Moral, 2006); the NPI-M and A-NPI-M (Abellán et al., 2010).
This method for building classification trees can be extended in the same way to other imprecise models.
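For credal sets given by reachable probability intervals (as with the IDM and the A-NPI-M), S* can be computed by a water-filling scheme: start every p(x) at its lower bound and repeatedly raise the current smallest values in parallel, respecting the upper bounds, until the remaining mass is spent. The sketch below is our own rendering of that idea, not the authors' published algorithm, and it assumes reachable intervals (sum of lowers ≤ 1 ≤ sum of uppers):

```python
import math

def max_entropy(lower, upper, tol=1e-12):
    """Maximum entropy over {p : lower[i] <= p[i] <= upper[i], sum p = 1},
    for reachable probability intervals, via water-filling."""
    p = list(lower)
    mass = 1.0 - sum(p)                     # probability left to distribute
    while mass > tol:
        free = [i for i, v in enumerate(p) if v < upper[i] - tol]
        low = min(p[i] for i in free)
        group = [i for i in free if p[i] <= low + tol]    # current minima
        # raise the group to the next level: another value, a cap, or done
        levels = [p[i] for i in free if p[i] > low + tol]
        levels += [upper[i] for i in group]
        step = min(min(levels) - low, mass / len(group))
        for i in group:
            p[i] += step
        mass -= step * len(group)
    return -sum(v * math.log2(v) for v in p if v > 0)

# A-NPI-M intervals for the {B, P, R, Y, G} example (N = 9)
print(max_entropy([3/9, 4/9, 0, 0, 0], [5/9, 6/9, 1/9, 1/9, 1/9]))
```

For this example the maximizer is exactly the distribution p = (3/9, 4/9, 2/27, 2/27, 2/27) from the A-NPI-M counterexample slide; note that the sketch computes S* over the interval credal set (A-NPI-M), whereas the exact NPI-M requires the dedicated algorithm of (Abellán et al., 2010).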
Part IV
Experimental Evaluation
Experimental Set-up
The aim is to compare the use of the IDM with the use of the NPI-M and the A-NPI-M for building decision trees.
Decision trees for supervised classification using the imprecise models IDM, NPI-M and A-NPI-M:
maximum entropy as the base uncertainty measure.
The classic Shannon entropy measure is also employed: information gain (IG) and information gain ratio (IGR).
Experiments on 40 UCI data sets using a 10-fold cross-validation scheme.
Statistical tests for the comparison:
corrected paired t-test at the 5% significance level;
Friedman test at the 5% significance level.
Evaluating Classification Accuracy

               IDM     NPI-M   A-NPI-M  IG      IGR
Av. Accuracy   76.73   76.77   76.77    74.33   75.66
Friedman Rank  2.95    2.94    2.76     3.06    3.29

The Friedman test accepts the hypothesis that all algorithms perform equally well.
Evaluating the Size of the Decision Trees
The size of a decision tree is relevant:
A decision tree is composed of a set of classification rules.
Smaller trees imply more representative classification rules (lower risk of over-fitted predictions).

               IDM      NPI-M   A-NPI-M  IG       IGR
Av. N. Nodes   610.16   589.52  589.81   1119.46  1288.56
Friedman Rank  2.96     1.4     1.8      4.18     4.67

The imprecise models provide much more compact representations than the precise methods.
The NPI-M and A-NPI-M build smaller decision trees than the IDM with s = 1 (the default value).
Evaluating the IDM with Different s Values
The IDM depends on the parameter s (which affects the width of the intervals):
s = 0 yields the information gain criterion.
Higher s values yield wider intervals and decrease the size of the trees.

               NPI-M   IDM-S1  IDM-S1.5  IDM-S2.0
Av. N. Nodes   589.5   610.2   480.3     383.8
Friedman Rank  2.4     3.85    2.43      1.28

The IDM with s = 1.5 produces decision trees of a size similar to the NPI-M's.
The IDM with s = 2.0 produces smaller decision trees than the NPI-M.
Evaluating the IDM with Different s Values

               NPI-M   IDM-S1  IDM-S1.5  IDM-S2.0
Av. Accuracy   76.77   76.73   76.23     75.75
Friedman Rank  2.25    2.26    2.46      3.03

The IDM with s > 1 performs worse than the NPI-M and than the IDM with s = 1.
The NPI-M models achieve the best classification accuracy with the smallest decision trees.
Part V
Conclusions and Future Work
Conclusions and Future Work
We have presented an application of the NPI model for multinomial data to classification.
A known scheme for building decision trees is used, via two NPI-based models: an exact model (NPI-M) and an approximate model (A-NPI-M).
Models based on the NPI model achieve accuracy similar to that of the best IDM-based model (obtained by varying its parameter), but with smaller trees.
Future work:
Extend these methods to credal classification.
Thanks for your attention!
Questions?
Contenu connexe

Tendances

Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learningUniversity of Groningen
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practicetuxette
 
Differential analyses of structures in HiC data
Differential analyses of structures in HiC dataDifferential analyses of structures in HiC data
Differential analyses of structures in HiC datatuxette
 
An introduction to deep learning
An introduction to deep learningAn introduction to deep learning
An introduction to deep learningVan Thanh
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biologytuxette
 
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco ScutariBayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco ScutariBayes Nets meetup London
 
Learning from (dis)similarity data
Learning from (dis)similarity dataLearning from (dis)similarity data
Learning from (dis)similarity datatuxette
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNNtuxette
 
17 Machine Learning Radial Basis Functions
17 Machine Learning Radial Basis Functions17 Machine Learning Radial Basis Functions
17 Machine Learning Radial Basis FunctionsAndres Mendez-Vazquez
 
Citython presentation
Citython presentationCitython presentation
Citython presentationAnkit Tewari
 
Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Syed Atif Naseem
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...Anirbit Mukherjee
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...butest
 
A preliminary study of diversity in ELM ensembles (HAIS 2018)
A preliminary study of diversity in ELM ensembles (HAIS 2018)A preliminary study of diversity in ELM ensembles (HAIS 2018)
A preliminary study of diversity in ELM ensembles (HAIS 2018)Carlos Perales
 
Cryptography Baby Step Giant Step
Cryptography Baby Step Giant StepCryptography Baby Step Giant Step
Cryptography Baby Step Giant StepSAUVIK BISWAS
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"tuxette
 
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...Shao-Chuan Wang
 
Lecture13 xing fei-fei
Lecture13 xing fei-feiLecture13 xing fei-fei
Lecture13 xing fei-feiTianlu Wang
 

Tendances (20)

Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practice
 
Differential analyses of structures in HiC data
Differential analyses of structures in HiC dataDifferential analyses of structures in HiC data
Differential analyses of structures in HiC data
 
An introduction to deep learning
An introduction to deep learningAn introduction to deep learning
An introduction to deep learning
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco ScutariBayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
 
Estimating Space-Time Covariance from Finite Sample Sets
Estimating Space-Time Covariance from Finite Sample SetsEstimating Space-Time Covariance from Finite Sample Sets
Estimating Space-Time Covariance from Finite Sample Sets
 
Learning from (dis)similarity data
Learning from (dis)similarity dataLearning from (dis)similarity data
Learning from (dis)similarity data
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
 
17 Machine Learning Radial Basis Functions
17 Machine Learning Radial Basis Functions17 Machine Learning Radial Basis Functions
17 Machine Learning Radial Basis Functions
 
Citython presentation
Citython presentationCitython presentation
Citython presentation
 
Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Statistical Pattern recognition(1)
Statistical Pattern recognition(1)
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
A preliminary study of diversity in ELM ensembles (HAIS 2018)
A preliminary study of diversity in ELM ensembles (HAIS 2018)A preliminary study of diversity in ELM ensembles (HAIS 2018)
A preliminary study of diversity in ELM ensembles (HAIS 2018)
 
Cryptography Baby Step Giant Step
Cryptography Baby Step Giant StepCryptography Baby Step Giant Step
Cryptography Baby Step Giant Step
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"
 
Polynomial Matrix Decompositions
Polynomial Matrix DecompositionsPolynomial Matrix Decompositions
Polynomial Matrix Decompositions
 
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...
Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and ...
 
Lecture13 xing fei-fei
Lecture13 xing fei-feiLecture13 xing fei-fei
Lecture13 xing fei-fei
 

En vedette

Lviv Outsourcing Forum Олександра Альхімович “Drivers of change at the begin...
Lviv Outsourcing Forum Олександра Альхімович  “Drivers of change at the begin...Lviv Outsourcing Forum Олександра Альхімович  “Drivers of change at the begin...
Lviv Outsourcing Forum Олександра Альхімович “Drivers of change at the begin...Lviv Startup Club
 
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen The Hyper Connected Era: Mobile First, Cloud First and Multi Screen
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen Jose Papo, MSc
 
The Evolution of Offshoring
The Evolution of OffshoringThe Evolution of Offshoring
The Evolution of Offshoringswiss IT bridge
 
Industry 4.0 and the road towards it
Industry 4.0 and the road towards itIndustry 4.0 and the road towards it
Industry 4.0 and the road towards itRick Bouter
 
Routing protocol for delay tolerant network a survey and comparison
Routing protocol for delay tolerant network   a survey and comparisonRouting protocol for delay tolerant network   a survey and comparison
Routing protocol for delay tolerant network a survey and comparisonPhearin Sok
 
Origins of Public Relations
Origins of Public RelationsOrigins of Public Relations
Origins of Public RelationsKelli Burns
 
The Fourth Industrial Revolution
The Fourth Industrial RevolutionThe Fourth Industrial Revolution
The Fourth Industrial RevolutionLuca Lamera
 
Outsourcing & Offshoring
Outsourcing & OffshoringOutsourcing & Offshoring
Outsourcing & OffshoringArun Khedwal
 
4th Industrial Revolution by Sam Williams
4th Industrial Revolution by Sam Williams4th Industrial Revolution by Sam Williams
4th Industrial Revolution by Sam WilliamsCertus Solutions
 
Huawei parameter strategy v1.4 1st dec
Huawei parameter strategy v1.4  1st decHuawei parameter strategy v1.4  1st dec
Huawei parameter strategy v1.4 1st decKetut Widya
 

En vedette (15)

Lviv Outsourcing Forum Олександра Альхімович “Drivers of change at the begin...
Lviv Outsourcing Forum Олександра Альхімович  “Drivers of change at the begin...Lviv Outsourcing Forum Олександра Альхімович  “Drivers of change at the begin...
Lviv Outsourcing Forum Олександра Альхімович “Drivers of change at the begin...
 
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen The Hyper Connected Era: Mobile First, Cloud First and Multi Screen
The Hyper Connected Era: Mobile First, Cloud First and Multi Screen
 
The fourth industrial revolution
The fourth industrial revolutionThe fourth industrial revolution
The fourth industrial revolution
 
The Future of Work - The Fourth Industrial Revolution
The Future of Work - The Fourth Industrial RevolutionThe Future of Work - The Fourth Industrial Revolution
The Future of Work - The Fourth Industrial Revolution
 
The Evolution of Offshoring
The Evolution of OffshoringThe Evolution of Offshoring
The Evolution of Offshoring
 
Industry 4.0 and the road towards it
Industry 4.0 and the road towards itIndustry 4.0 and the road towards it
Industry 4.0 and the road towards it
 
BlitzTrader_PPT
BlitzTrader_PPTBlitzTrader_PPT
BlitzTrader_PPT
 
Routing protocol for delay tolerant network a survey and comparison
Routing protocol for delay tolerant network   a survey and comparisonRouting protocol for delay tolerant network   a survey and comparison
Routing protocol for delay tolerant network a survey and comparison
 
Origins of Public Relations
Origins of Public RelationsOrigins of Public Relations
Origins of Public Relations
 
Analysis of Business Environment
Analysis of  Business EnvironmentAnalysis of  Business Environment
Analysis of Business Environment
 
The Fourth Industrial Revolution
The Fourth Industrial RevolutionThe Fourth Industrial Revolution
The Fourth Industrial Revolution
 
Outsourcing & Offshoring
Outsourcing & OffshoringOutsourcing & Offshoring
Outsourcing & Offshoring
 
4th Industrial Revolution by Sam Williams
4th Industrial Revolution by Sam Williams4th Industrial Revolution by Sam Williams
4th Industrial Revolution by Sam Williams
 
Huawei parameter strategy v1.4 1st dec
Huawei parameter strategy v1.4  1st decHuawei parameter strategy v1.4  1st dec
Huawei parameter strategy v1.4 1st dec
 
10 facts about jobs in the future
10 facts about jobs in the future10 facts about jobs in the future
10 facts about jobs in the future
 

Similaire à lassification with decision trees from a nonparametric predictive inference perspective

Chapter 19 decision-making under risk
Chapter 19   decision-making under riskChapter 19   decision-making under risk
Chapter 19 decision-making under riskBich Lien Pham
 
Chapter 19 decision-making under risk
Chapter 19   decision-making under riskChapter 19   decision-making under risk
Chapter 19 decision-making under riskBich Lien Pham
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
Minghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation TalkMinghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation TalkWei Wang
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationFeynman Liang
 
Gaining Confidence in Signalling and Regulatory Networks
Gaining Confidence in Signalling and Regulatory NetworksGaining Confidence in Signalling and Regulatory Networks
Gaining Confidence in Signalling and Regulatory NetworksMichael Stumpf
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-feiTianlu Wang
 
Intro to Model Selection
Intro to Model SelectionIntro to Model Selection
Intro to Model Selectionchenhm
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18Matt Yang
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesMohammad
 
Representation Learning & Generative Modeling with Variational Autoencoder(VA...
Representation Learning & Generative Modeling with Variational Autoencoder(VA...Representation Learning & Generative Modeling with Variational Autoencoder(VA...
Representation Learning & Generative Modeling with Variational Autoencoder(VA...changedaeoh
 
Document ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspaceDocument ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspacePrakash Dubey
 
A Logical Language with a Prototypical Semantics
A Logical Language with a Prototypical SemanticsA Logical Language with a Prototypical Semantics
A Logical Language with a Prototypical SemanticsL. Thorne McCarty
 
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...The University of Western Australia
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet AllocationMarco Righini
 
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...L. Thorne McCarty
 

Similaire à lassification with decision trees from a nonparametric predictive inference perspective (20)

Chapter 19 decision-making under risk
Chapter 19   decision-making under riskChapter 19   decision-making under risk
Chapter 19 decision-making under risk
 
Chapter 19 decision-making under risk
Chapter 19   decision-making under riskChapter 19   decision-making under risk
Chapter 19 decision-making under risk
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
Minghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation TalkMinghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation Talk
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 
Gaining Confidence in Signalling and Regulatory Networks
Gaining Confidence in Signalling and Regulatory NetworksGaining Confidence in Signalling and Regulatory Networks
Gaining Confidence in Signalling and Regulatory Networks
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
 
Linear models2
Linear models2Linear models2
Linear models2
 
Intro to Model Selection
Intro to Model SelectionIntro to Model Selection
Intro to Model Selection
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
 
AbdoSummerANS_mod3
AbdoSummerANS_mod3AbdoSummerANS_mod3
AbdoSummerANS_mod3
 
Representation Learning & Generative Modeling with Variational Autoencoder(VA...
Representation Learning & Generative Modeling with Variational Autoencoder(VA...Representation Learning & Generative Modeling with Variational Autoencoder(VA...
Representation Learning & Generative Modeling with Variational Autoencoder(VA...
 
Document ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspaceDocument ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspace
 
Jz3617271733
Jz3617271733Jz3617271733
Jz3617271733
 
A Logical Language with a Prototypical Semantics
A Logical Language with a Prototypical SemanticsA Logical Language with a Prototypical Semantics
A Logical Language with a Prototypical Semantics
 
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...
  Information Theory and the Analysis of Uncertainties in a Spatial Geologi...
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...
How to Ground A Language for Legal Discourse In a Prototypical Perceptual Sem...
 

Plus de NTNU

Varying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilitiesVarying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilitiesNTNU
 
Bagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification NoiseBagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification NoiseNTNU
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsNTNU
 
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...NTNU
 
An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...NTNU
 
Learning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait lociLearning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait lociNTNU
 
Split Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision TreesSplit Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision TreesNTNU
 
A Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of CasesA Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of CasesNTNU
 
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...NTNU
 
Interactive Learning of Bayesian Networks
Interactive Learning of Bayesian NetworksInteractive Learning of Bayesian Networks
Interactive Learning of Bayesian NetworksNTNU
 
A Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification treesA Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification treesNTNU
 
A Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification TreesA Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification TreesNTNU
 
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...NTNU
 
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...NTNU
 
Evaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy predictionEvaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy predictionNTNU
 
Effects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy PredictionEffects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy PredictionNTNU
 
Conference poster 6
Conference poster 6Conference poster 6
Conference poster 6NTNU
 

Plus de NTNU (17)

Varying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilitiesVarying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilities
 
Bagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification NoiseBagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification Noise
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet Metrics
 
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
 
An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...
 
Learning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait lociLearning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait loci
 
Split Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision TreesSplit Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision Trees
 
A Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of CasesA Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of Cases
 
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
 
Interactive Learning of Bayesian Networks
Interactive Learning of Bayesian NetworksInteractive Learning of Bayesian Networks
Interactive Learning of Bayesian Networks
 
A Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification treesA Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification trees
 
A Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification TreesA Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification Trees
 
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
 
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
 
Evaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy predictionEvaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy prediction
 
Effects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy PredictionEffects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy Prediction
 
Conference poster 6
Conference poster 6Conference poster 6
Conference poster 6
 

Dernier

❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfTukamushabaBismark
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to VirusesAreesha Ahmad
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Youngkajalvid75
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
GBSN - Microbiology (Unit 3)
Classification with decision trees from a nonparametric predictive inference perspective

  • 10. Introduction Non-parametric Predictive Inference NPI-M does not satisfy the Representation Invariance Principle (RIP); this is not a shortcoming according to Coolen and Augustin, who propose a weaker principle instead. ERCIM’11 London (UK) 7/28
  • 11. Introduction Classification with Decision Trees using the NPI-M model In this work, we employ the NPI-M model to build decision trees using the maximum entropy (ME) measure as a split criterion: ERCIM’11 London (UK) 8/28
  • 12. Introduction Classification with Decision Trees using the NPI-M model In this work, we employ the NPI-M model to build decision trees using the maximum entropy (ME) measure as a split criterion: We experimentally compare against decision trees built with the IDM for different s values. Classification accuracy is slightly improved. The decision trees are notably smaller. It is a parameter-free method. ERCIM’11 London (UK) 8/28
  • 13. Imprecise Models for Multinomial Data Part II Imprecise Models for Multinomial Data ERCIM’11 London (UK) 9/28
  • 14. Imprecise Models for Multinomial Data Imprecise Dirichlet Model The imprecise Dirichlet model (IDM) was introduced by Walley for inference about the probability distribution of a categorical variable. ERCIM’11 London (UK) 10/28
  • 15. Imprecise Models for Multinomial Data Imprecise Dirichlet Model The imprecise Dirichlet model (IDM) was introduced by Walley for inference about the probability distribution of a categorical variable. Let us assume that X is a variable taking values on a finite set X = {x_1, ..., x_K} and we have a sample of N independent and identically distributed outcomes of X. ERCIM’11 London (UK) 10/28
  • 16. Imprecise Models for Multinomial Data Imprecise Dirichlet Model The imprecise Dirichlet model (IDM) was introduced by Walley for inference about the probability distribution of a categorical variable. Let us assume that X is a variable taking values on a finite set X = {x_1, ..., x_K} and we have a sample of N independent and identically distributed outcomes of X. The goal is to estimate the probabilities θ_x = p(x) with which X takes its values. ERCIM’11 London (UK) 10/28
  • 17. Imprecise Models for Multinomial Data Imprecise Dirichlet Model The imprecise Dirichlet model (IDM) was introduced by Walley for inference about the probability distribution of a categorical variable. Let us assume that X is a variable taking values on a finite set X = {x_1, ..., x_K} and we have a sample of N independent and identically distributed outcomes of X. The goal is to estimate the probabilities θ_x = p(x) with which X takes its values. Common Bayesian procedure: assume a Dirichlet prior for the parameter vector (θ_x)_{x∈X},
  f\big((\theta_x)_{x\in\mathcal{X}}\big) = \frac{\Gamma(s)}{\prod_{x\in\mathcal{X}}\Gamma(s\,t_x)} \prod_{x\in\mathcal{X}} \theta_x^{s\,t_x-1},
  where the parameter s > 0 and t = (t_x)_{x∈X} is a vector of positive real numbers satisfying \sum_{x\in\mathcal{X}} t_x = 1; then take the posterior expectation of the parameters given the sample. ERCIM’11 London (UK) 10/28
  • 18. Imprecise Models for Multinomial Data Imprecise Dirichlet Model Posterior Probability
  f\big((\theta_x)_{x\in\mathcal{X}} \mid D\big) = \frac{\Gamma(N+s)}{\prod_{x\in\mathcal{X}}\Gamma(n(x)+s\,t_x)} \prod_{x\in\mathcal{X}} \theta_x^{\,n(x)+s\,t_x-1},
  where n(x) is the frequency count of x in the data and N = \sum_{x} n(x). ERCIM’11 London (UK) 11/28
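To make the update concrete, here is a minimal Python sketch (not the paper's code; the function name and the particular choice of t are ours). For fixed s and t, the posterior expectation is E[θ_x | D] = (n(x) + s·t_x) / (N + s):

```python
# Minimal sketch: Bayesian point estimate under a Dirichlet(s * t) prior.
# For fixed s and t, E[theta_x | D] = (n(x) + s * t_x) / (N + s).
def dirichlet_posterior_mean(counts, s, t):
    """counts: dict {category: n(x)}; t: positive weights summing to 1."""
    N = sum(counts.values())
    return {x: (n + s * t[x]) / (N + s) for x, n in counts.items()}

counts = {"B": 4, "P": 5, "R": 0, "Y": 0, "G": 0}
t_uniform = {x: 1 / 5 for x in counts}   # one particular choice of t
print(dirichlet_posterior_mean(counts, s=1, t=t_uniform))
```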
  • 19. Imprecise Models for Multinomial Data Imprecise Dirichlet Model Posterior Probability
  f\big((\theta_x)_{x\in\mathcal{X}} \mid D\big) = \frac{\Gamma(N+s)}{\prod_{x\in\mathcal{X}}\Gamma(n(x)+s\,t_x)} \prod_{x\in\mathcal{X}} \theta_x^{\,n(x)+s\,t_x-1},
  where n(x) is the frequency count of x in the data. The IDM only depends on the parameter s and considers all possible values of the vector t. This defines a closed and convex set of prior distributions. ERCIM’11 London (UK) 11/28
  • 20. Imprecise Models for Multinomial Data Imprecise Dirichlet Model Posterior Probability
  f\big((\theta_x)_{x\in\mathcal{X}} \mid D\big) = \frac{\Gamma(N+s)}{\prod_{x\in\mathcal{X}}\Gamma(n(x)+s\,t_x)} \prod_{x\in\mathcal{X}} \theta_x^{\,n(x)+s\,t_x-1},
  where n(x) is the frequency count of x in the data. The IDM only depends on the parameter s and considers all possible values of the vector t. This defines a closed and convex set of prior distributions. This credal set for a particular variable X can be represented by a system of probability intervals:
  p(x) \in \left[ \frac{n(x)}{N+s}, \frac{n(x)+s}{N+s} \right].
  The parameter s determines how quickly the lower and upper probabilities converge as more data become available. ERCIM’11 London (UK) 11/28
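These bounds follow directly from the counts. A minimal sketch (the function name is ours), using for illustration the sample with 4 outcomes of B and 5 of P introduced below:

```python
# Minimal sketch: IDM lower/upper probabilities for each singleton event,
# obtained by letting t range over all valid vectors for a fixed s.
def idm_intervals(counts, s):
    N = sum(counts.values())
    return {x: (n / (N + s), (n + s) / (N + s)) for x, n in counts.items()}

counts = {"B": 4, "P": 5, "R": 0, "Y": 0, "G": 0}
print(idm_intervals(counts, s=1))
# e.g. B -> (4/10, 5/10) and R -> (0, 1/10) for s = 1
```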
  • 21. Imprecise Models for Multinomial Data The NPI-M model The NPI model for multinomial data (NPI-M), developed by Coolen and Augustin, is based on a variation of Hill's assumption A(n), which relates to predictive inference involving real-valued data observations. It combines a post-data exchangeability-like assumption with a latent variable representation of the data. ERCIM’11 London (UK) 12/28
  • 22. Imprecise Models for Multinomial Data The NPI-M model The NPI model for multinomial data (NPI-M), developed by Coolen and Augustin, is based on a variation of Hill's assumption A(n), which relates to predictive inference involving real-valued data observations. It combines a post-data exchangeability-like assumption with a latent variable representation of the data. Suppose there are 5 different categories {B, P, R, Y, G} and the observations are 4 outcomes of B and 5 outcomes of P: B P P B P B B P P ERCIM’11 London (UK) 12/28
  • 23. Imprecise Models for Multinomial Data The NPI-M model The NPI model for multinomial data (NPI-M), developed by Coolen and Augustin, is based on a variation of Hill's assumption A(n), which relates to predictive inference involving real-valued data observations. It combines a post-data exchangeability-like assumption with a latent variable representation of the data. Suppose there are 5 different categories {B, P, R, Y, G} and the observations are 4 outcomes of B and 5 outcomes of P: B P P B P B B P P, which in the latent representation is arranged as B B B B P P P P P. ERCIM’11 London (UK) 12/28
  • 24. Imprecise Models for Multinomial Data The NPI-M model Theorem (Coolen and Augustin 2009) The lower and upper probabilities for the event x are
  \underline{P}(x) = \max\left\{0, \frac{n(x)-1}{N}\right\}, \qquad \overline{P}(x) = \min\left\{\frac{n(x)+1}{N}, 1\right\}.
  ERCIM’11 London (UK) 13/28
  • 25. Imprecise Models for Multinomial Data The NPI-M model Theorem (Coolen and Augustin 2009) The lower and upper probabilities for the event x are
  \underline{P}(x) = \max\left\{0, \frac{n(x)-1}{N}\right\}, \qquad \overline{P}(x) = \min\left\{\frac{n(x)+1}{N}, 1\right\}.
  For the previous case {B, P, R, Y, G} with latent arrangement B B B B P P P P P, the intervals are [3/9, 5/9]; [4/9, 6/9]; [0, 1/9]; [0, 1/9]; [0, 1/9]. ERCIM’11 London (UK) 13/28
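The theorem translates directly into code. A minimal sketch (the function name is ours) that reproduces the intervals above:

```python
# Minimal sketch of the NPI-M singleton bounds from the theorem above:
#   lower(x) = max(0, (n(x) - 1) / N),  upper(x) = min((n(x) + 1) / N, 1).
def npi_intervals(counts):
    N = sum(counts.values())
    return {x: (max(0, (n - 1) / N), min((n + 1) / N, 1))
            for x, n in counts.items()}

counts = {"B": 4, "P": 5, "R": 0, "Y": 0, "G": 0}
print(npi_intervals(counts))
# B -> (3/9, 5/9), P -> (4/9, 6/9), R/Y/G -> (0, 1/9), matching the slide
```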
  • 26. Imprecise Models for Multinomial Data The A-NPI-M model This set of lower and upper probabilities, L, associated with the singleton events x expresses a reachable set of probability intervals, i.e. a credal set:
  \mathcal{L} = \left\{ \left[ \max\left\{0, \frac{n(x)-1}{N}\right\}, \min\left\{\frac{n(x)+1}{N}, 1\right\} \right] : x \in \{x_1, \ldots, x_K\} \right\}
  ERCIM’11 London (UK) 14/28
  • 27. Imprecise Models for Multinomial Data The A-NPI-M model This set of lower and upper probabilities, L, associated with the singleton events x expresses a reachable set of probability intervals, i.e. a credal set:
  \mathcal{L} = \left\{ \left[ \max\left\{0, \frac{n(x)-1}{N}\right\}, \min\left\{\frac{n(x)+1}{N}, 1\right\} \right] : x \in \{x_1, \ldots, x_K\} \right\}
  The set of probability distributions obtained from NPI-M is not a credal set (see a counterexample in the paper): p = (3/9, 4/9, 2/27, 2/27, 2/27) belongs to the credal set defined by [3/9, 5/9]; [4/9, 6/9]; [0, 1/9]; [0, 1/9]; [0, 1/9], yet it is not possible to find a configuration of the probability wheel that corresponds to p. ERCIM’11 London (UK) 14/28
  • 28. Imprecise Models for Multinomial Data The A-NPI-M model This set of lower and upper probabilities, L, associated with the singleton events x expresses a reachable set of probability intervals, i.e. a credal set:
  \mathcal{L} = \left\{ \left[ \max\left\{0, \frac{n(x)-1}{N}\right\}, \min\left\{\frac{n(x)+1}{N}, 1\right\} \right] : x \in \{x_1, \ldots, x_K\} \right\}
  The set of probability distributions obtained from NPI-M is not a credal set (see a counterexample in the paper): p = (3/9, 4/9, 2/27, 2/27, 2/27) belongs to the credal set defined by [3/9, 5/9]; [4/9, 6/9]; [0, 1/9]; [0, 1/9]; [0, 1/9], yet it is not possible to find a configuration of the probability wheel that corresponds to p. The A-NPI-M is a simplified approximate model derived from NPI-M by considering all probability distributions compatible with the set of lower and upper probabilities L obtained from NPI-M. ERCIM’11 London (UK) 14/28
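As a quick sanity check of the counterexample (a sketch of ours, not from the paper), the following verifies that p satisfies every interval in L and sums to one, so it lies in the A-NPI-M credal set even though the exact NPI-M excludes it:

```python
# Check the counterexample: p is inside the interval set L (and hence in
# the A-NPI-M credal set), although no probability-wheel configuration of
# the exact NPI-M realizes it.
intervals = {"B": (3/9, 5/9), "P": (4/9, 6/9),
             "R": (0, 1/9), "Y": (0, 1/9), "G": (0, 1/9)}
p = {"B": 3/9, "P": 4/9, "R": 2/27, "Y": 2/27, "G": 2/27}

in_L = (abs(sum(p.values()) - 1) < 1e-12
        and all(lo <= p[x] <= hi for x, (lo, hi) in intervals.items()))
print(in_L)  # True
```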
  • 29. Uncertainty Measures and Decision Trees Part III Uncertainty Measures and Decision Trees ERCIM’11 London (UK) 15/28
  • 30. Uncertainty Measures and Decision Trees Decision Trees A decision tree, also called a classification tree, is a simple structure that can be used as a classifier. Each node represents an attribute variable; each branch represents one of the states of this variable; each tree leaf specifies an expected value of the class variable. Example of a decision tree for three attribute variables Ai (i = 1, 2, 3), each with two possible values (0, 1), and a class variable C with cases or states c1, c2, c3: ERCIM’11 London (UK) 16/28
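As a rough illustration of this structure, here is a minimal sketch (class and field names are ours, not the authors' implementation):

```python
# Minimal sketch of the tree structure described above (names are ours):
# an internal node tests an attribute and branches on its values; a leaf
# stores the expected state of the class variable C.
class Node:
    def __init__(self, attribute=None, branches=None, leaf_class=None):
        self.attribute = attribute      # attribute tested here (None at a leaf)
        self.branches = branches or {}  # attribute value -> child Node
        self.leaf_class = leaf_class    # e.g. "c1", "c2" or "c3" at a leaf

# One possible tree over A1, A2, A3: the path A1 = 1, A3 = 0 leads to c3,
# matching the configuration sigma = (A1 = 1, A3 = 0) used below.
tree = Node("A1", {1: Node("A3", {0: Node(leaf_class="c3")}),
                   0: Node(leaf_class="c1")})
```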
  • 31. Uncertainty Measures and Decision Trees Building Decision Trees from data Basic Rule Associate with each node the most informative variable about the class C, provided that it has not already been selected in the path from the root to this node, and that it provides more information than if it had not been included. ERCIM’11 London (UK) 17/28
  • 32. Uncertainty Measures and Decision Trees Building Decision Trees from data Basic Rule Associate with each node the most informative variable about the class C, provided that it has not already been selected in the path from the root to this node, and that it provides more information than if it had not been included. We are given a data set D. Each node of the decision tree defines a set of probabilities P^σ for the class variable C, where σ is the configuration of the node, e.g. σ = (A1 = 1, A3 = 0) for the node labelled c3. ERCIM’11 London (UK) 17/28
  • 33. Uncertainty Measures and Decision Trees Building Decision Trees from data Basic Rule Associate with each node the most informative variable about the class C, provided that it has not already been selected in the path from the root to this node, and that it provides more information than if it had not been included. We are given a data set D. Each node of the decision tree defines a set of probabilities P^σ for the class variable C, where σ is the configuration of the node, e.g. σ = (A1 = 1, A3 = 0) for the node labelled c3. P^σ is obtained with the IDM, NPI-M or A-NPI-M from the data set D[σ]. ERCIM’11 London (UK) 17/28
  • 34. Uncertainty Measures and Decision Trees Imprecise Information Gain
  \mathrm{ImpInfGain} = TU(\mathcal{P}^{\sigma}) - \sum_{a_i \in A_i} r^{\sigma}_{a_i}\, TU\big(\mathcal{P}^{\sigma \cup (A_i = a_i)}\big)
  ERCIM’11 London (UK) 18/28
  • 35. Uncertainty Measures and Decision Trees Imprecise Information Gain
  \mathrm{ImpInfGain} = TU(\mathcal{P}^{\sigma}) - \sum_{a_i \in A_i} r^{\sigma}_{a_i}\, TU\big(\mathcal{P}^{\sigma \cup (A_i = a_i)}\big)
  TU is a total uncertainty measure, normally defined on credal sets; r^σ_{a_i} is the relative frequency with which A_i takes the value a_i in D[σ]; σ ∪ (A_i = a_i) is the result of adding the value A_i = a_i to the configuration σ. ERCIM’11 London (UK) 18/28
  • 36. Uncertainty Measures and Decision Trees Imprecise Information Gain
  \mathrm{ImpInfGain} = TU(\mathcal{P}^{\sigma}) - \sum_{a_i \in A_i} r^{\sigma}_{a_i}\, TU\big(\mathcal{P}^{\sigma \cup (A_i = a_i)}\big)
  TU is a total uncertainty measure, normally defined on credal sets; r^σ_{a_i} is the relative frequency with which A_i takes the value a_i in D[σ]; σ ∪ (A_i = a_i) is the result of adding the value A_i = a_i to the configuration σ. The Imprecise Information Gain thus compares the total uncertainty for configuration σ with the expected total uncertainty of σ ∪ (A_i = a_i); we add a new node (i.e. split) only if this difference is positive. ERCIM’11 London (UK) 18/28
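A minimal sketch of this split criterion (helper names are ours; total_uncertainty stands for whatever TU is chosen, e.g. the maximum entropy of the credal set built by the IDM, NPI-M or A-NPI-M from the class counts in D[σ]):

```python
# Minimal sketch of the Imprecise Information Gain split criterion.
# `data` is a list of dicts (one per instance); `total_uncertainty` maps a
# dict of class counts to TU of the corresponding credal set.
def class_counts(data, class_var):
    counts = {}
    for row in data:
        counts[row[class_var]] = counts.get(row[class_var], 0) + 1
    return counts

def imprecise_info_gain(data, attribute, class_var, total_uncertainty):
    tu_parent = total_uncertainty(class_counts(data, class_var))
    expected = 0.0
    for value in {row[attribute] for row in data}:
        subset = [row for row in data if row[attribute] == value]
        # r^sigma_{a_i}: relative frequency of this value in D[sigma]
        expected += (len(subset) / len(data)) * total_uncertainty(
            class_counts(subset, class_var))
    return tu_parent - expected  # split on `attribute` only if positive
```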
  • 37. Uncertainty Measures and Decision Trees Maximum Entropy Maximum Entropy is an aggregate uncertainty measure which aims to generalize the classic measures (the Hartley measure and the Shannon entropy). It satisfies all the required properties (additivity, subadditivity, monotonicity, proper range, etc.). It is defined as a functional S*:
  S^{*}(\mathcal{P}^{\sigma}) = \max_{p \in \mathcal{P}^{\sigma}} \left\{ -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) \right\}
  ERCIM’11 London (UK) 19/28
  • 38. Uncertainty Measures and Decision Trees Maximum Entropy Maximum Entropy is an aggregate uncertainty measure which aims to generalize the classic measures (the Hartley measure and the Shannon entropy). It satisfies all the required properties (additivity, subadditivity, monotonicity, proper range, etc.). It is defined as a functional S*:
  S^{*}(\mathcal{P}^{\sigma}) = \max_{p \in \mathcal{P}^{\sigma}} \left\{ -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) \right\}
  Efficient algorithms for different imprecise models have been proposed: credal sets (Abellán and Moral, 2003); IDM (Abellán and Moral, 2006); NPI-M and A-NPI-M (Abellán et al., 2010). ERCIM’11 London (UK) 19/28
  • 39. Uncertainty Measures and Decision Trees Maximum Entropy Maximum Entropy is an aggregate uncertainty measure which aims to generalize the classic measures (the Hartley measure and the Shannon entropy). It satisfies all the required properties (additivity, subadditivity, monotonicity, proper range, etc.). It is defined as a functional S*:
  S^{*}(\mathcal{P}^{\sigma}) = \max_{p \in \mathcal{P}^{\sigma}} \left\{ -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) \right\}
  Efficient algorithms for different imprecise models have been proposed: credal sets (Abellán and Moral, 2003); IDM (Abellán and Moral, 2006); NPI-M and A-NPI-M (Abellán et al., 2010). This method for building decision trees for classification can be similarly extended to other imprecise models. ERCIM’11 London (UK) 19/28
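The cited algorithms are model-specific and efficient; as a generic illustration only, the sketch below maximizes entropy over a credal set given by singleton probability intervals with an off-the-shelf optimizer (scipy; it assumes proper, non-degenerate intervals with sum of lower bounds at most 1):

```python
# Generic illustration only (not the efficient algorithms cited above):
# maximize Shannon entropy over a credal set defined by singleton
# probability intervals, using a generic constrained optimizer.
import numpy as np
from scipy.optimize import minimize

def max_entropy(intervals):
    """intervals: list of (lower, upper) bounds, one pair per category."""
    lo = np.array([l for l, u in intervals])
    hi = np.array([u for l, u in intervals])
    # Feasible start: spread the remaining mass 1 - sum(lo) over the
    # intervals proportionally to their widths (assumes some width > 0).
    x0 = lo + (1 - lo.sum()) * (hi - lo) / (hi - lo).sum()
    neg_entropy = lambda p: np.sum(p * np.log2(np.clip(p, 1e-12, None)))
    res = minimize(neg_entropy, x0, method="SLSQP",
                   bounds=list(zip(lo, hi)),
                   constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1}])
    return -res.fun

# NPI-M intervals from the {B, P, R, Y, G} example:
print(max_entropy([(3/9, 5/9), (4/9, 6/9), (0, 1/9), (0, 1/9), (0, 1/9)]))
```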
  • 40. Experimental Evaluation Part IV Experimental Evaluation ERCIM’11 London (UK) 20/28
  • 41. Experimental Evaluation Experimental Set-up The aim is to compare the use of the IDM with the use of the NPI-M and the A-NPI-M for building decision trees. ERCIM’11 London (UK) 21/28
  • 42. Experimental Evaluation Experimental Set-up The aim is to compare the use of the IDM with the use of the NPI-M and the A-NPI-M for building decision trees. Decision trees for supervised classification are built with the imprecise models IDM, NPI-M and A-NPI-M, using maximum entropy as the base uncertainty measure. The classic Shannon entropy measure is also employed: information gain and information gain ratio. Experiments are run on 40 UCI data sets using a 10-fold cross-validation scheme. Statistical tests for the comparison: corrected paired t-test at the 5% significance level; Friedman test at the 5% significance level. ERCIM’11 London (UK) 21/28
  • 43. Experimental Evaluation Evaluating Classification Accuracy
                 IDM    NPI-M  A-NPI-M  IG     IGR
  Av. Accuracy   76.73  76.77  76.77    74.33  75.66
  Friedman Rank  2.95   2.94   2.76     3.06   3.29
  The Friedman test accepts the hypothesis that all algorithms perform equally well. ERCIM’11 London (UK) 22/28
  • 44. Experimental Evaluation Evaluating the size of the decision trees The size of a decision tree is relevant: a decision tree is composed of a set of classification rules, and smaller trees imply more representative classification rules (lower risk of over-fitted predictions). ERCIM’11 London (UK) 23/28
  • 45. Experimental Evaluation Evaluating the size of the decision trees The size of a decision tree is relevant: a decision tree is composed of a set of classification rules, and smaller trees imply more representative classification rules (lower risk of over-fitted predictions).
                 IDM     NPI-M   A-NPI-M  IG       IGR
  Av. N. Nodes   610.16  589.52  589.81   1119.46  1288.56
  Friedman Rank  2.96    1.4     1.8      4.18     4.67
  The imprecise models provide much more compact representations than the precise methods. NPI-M and A-NPI-M build smaller decision trees than the IDM with s = 1 (its default value). ERCIM’11 London (UK) 23/28
  • 46. Experimental Evaluation Evaluating IDM with different s values The IDM depends on the parameter s, which affects the width of the intervals: s = 0 reduces to the information gain criterion, while higher s values give wider intervals and decrease the size of the trees. ERCIM’11 London (UK) 24/28
  • 47. Experimental Evaluation Evaluating IDM with different s values The IDM depends on the parameter s, which affects the width of the intervals: s = 0 reduces to the information gain criterion, while higher s values give wider intervals and decrease the size of the trees.
                 NPI-M  IDM-S1  IDM-S1.5  IDM-S2.0
  Av. N. Nodes   589.5  610.2   480.3     383.8
  Friedman Rank  2.4    3.85    2.43      1.28
  The IDM with s = 1.5 obtains decision trees of a size similar to NPI-M; the IDM with s = 2.0 obtains decision trees smaller than those of NPI-M. ERCIM’11 London (UK) 24/28
  • 48. Experimental Evaluation Evaluating IDM with different s values
                 NPI-M  IDM-S1  IDM-S1.5  IDM-S2.0
  Av. Accuracy   76.77  76.73   76.23     75.75
  Friedman Rank  2.25   2.26    2.46      3.03
  The IDM with s > 1 performs worse than NPI-M and than the IDM with s = 1. The NPI-M models obtain the best classification accuracy with the smallest decision trees. ERCIM’11 London (UK) 25/28
  • 49. Conclusions and Future Works Part V Conclusions and Future Works ERCIM’11 London (UK) 26/28
  • 50. Conclusions and Future Works Conclusions and Future Works We have presented an application of the NPI model for multinomial data to classification. A known scheme for building decision trees is used with two NPI-based models: an exact model (NPI-M) and an approximate model (A-NPI-M). Models based on the NPI model achieve classification accuracy similar to that of the best IDM-based model (obtained by varying its parameter s), but with smaller trees. ERCIM’11 London (UK) 27/28
  • 51. Conclusions and Future Works Conclusions and Future Works We have presented an application of the NPI model for multinomial data to classification. A known scheme for building decision trees is used with two NPI-based models: an exact model (NPI-M) and an approximate model (A-NPI-M). Models based on the NPI model achieve classification accuracy similar to that of the best IDM-based model (obtained by varying its parameter s), but with smaller trees. Future work: extend these methods to credal classification. ERCIM’11 London (UK) 27/28
  • 52. Conclusions and Future Works Thanks for your attention! Questions? ERCIM’11 London (UK) 28/28