2. Lazy learners
Lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system. In eager learning, by contrast, the system tries to generalize the training data before receiving queries. Lazy learners do less work when the training data is given and more work when a test tuple must be classified.
3. The classification methods discussed so far in this chapter (decision tree induction, Bayesian classification, rule-based classification, classification by backpropagation, support vector machines, and classification based on association rule mining) are all examples of eager learners.
A lazy learner simply stores the training data and begins generalization only when it sees a test tuple, classifying the tuple based on its similarity to the stored training tuples.
4. Classification involves two steps: building a model from a given set of training data, and applying the model to a given set of test data.
Eager learners such as Bayesian classification, rule-based classification, and support vector machines construct a classification model from a given set of training tuples before receiving any new tuple to classify.
5. k-Nearest-Neighbor
Classifiers
The k-nearest-neighbor method was first
described in the early 1950s.
Nearest-neighbor classifiers are based on
learning by analogy, that is, by comparing a
given test tuple with training tuples that are
similar to it.
The training tuples are described
by n attributes. Each tuple represents a point
in an n-dimensional space.
6. In this way, all of the training tuples are
stored in an n-dimensional pattern space.
When given a test tuple, a k-nearest-neighbor classifier searches the pattern space for the k training tuples that are closest to the test tuple. These k training tuples are the k “nearest neighbors” of the test tuple.
Closeness is defined in terms of a distance metric, such as Euclidean distance. The Euclidean distance between two points or tuples, say X1 = (x11, x12, ..., x1n) and X2 = (x21, x22, ..., x2n), is:
dist(X1, X2) = sqrt( (x11 - x21)^2 + (x12 - x22)^2 + ... + (x1n - x2n)^2 )
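The search described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the sample training tuples and the choice k = 3 below are made up for the example.

```python
import math
from collections import Counter

def euclidean(x1, x2):
    # dist(X1, X2) = sqrt(sum of (x1i - x2i)^2 over all n attributes)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def knn_classify(training, test_tuple, k=3):
    # Lazy learner: no model is built up front; all work happens at query time.
    # training is simply a stored list of (point, label) pairs.
    neighbors = sorted(training, key=lambda tl: euclidean(tl[0], test_tuple))[:k]
    # Classify by majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

For example, with training tuples clustered around (1, 1) labeled "A" and around (5, 5) labeled "B", a test tuple near (1.5, 1.5) is assigned "A".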
8. Case-Based Reasoning
Case-based reasoning is the process of solving new problems based on the solutions of similar past problems. These classifiers use a database of problem solutions to solve new problems. The case-based reasoner tries to combine the solutions of the neighboring training cases in order to propose a solution for the new case.
9. Case-based reasoning (CBR) classifiers use
a database of problem solutions to solve new
problems.
Unlike nearest-neighbor classifiers, which
store training tuples as points in Euclidean
space, CBR stores the tuples (or “cases”) for problem solving as complex symbolic descriptions.
Business applications of CBR include
problem resolution for customer service help
desks, where cases describe product-related
diagnostic problems.
10. CBR has also been applied to areas such as
engineering and law, where cases are either
technical designs or legal rulings, respectively.
Medical education is another area for CBR,
where patient case histories and treatments are
used to help diagnose and treat new patients.
The case-based reasoner may employ
background knowledge and problem-solving
strategies in order to propose a feasible
combined solution.
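The retrieval step alone can be sketched as follows. This is a hypothetical Python fragment: the help-desk case base and the deliberately simple attribute-match similarity are invented for illustration, and the adaptation and background-knowledge stages described above are omitted.

```python
def similarity(case_a, case_b):
    # Fraction of attributes on which the two symbolic descriptions agree.
    keys = set(case_a) | set(case_b)
    return sum(case_a.get(k) == case_b.get(k) for k in keys) / len(keys)

def retrieve_solution(case_base, new_problem):
    # case_base: list of (problem_description, solution) pairs.
    # Return the solution of the most similar stored case.
    best_problem, best_solution = max(
        case_base, key=lambda case: similarity(case[0], new_problem)
    )
    return best_solution
```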
11. Other classification methods
Data mining involves six common classes of tasks: anomaly detection, association rule learning, clustering, classification, regression, and summarization. Classification is a major technique in data mining and is widely used in various fields.
Classification is a technique in which we categorize data into a given number of classes.
12. Binary classification: a classification task with two possible outcomes. E.g., gender classification (male / female).
Multi-class classification: classification with more than two classes. In multi-class classification, each sample is assigned to one and only one target label. E.g., an animal can be a cat or a dog, but not both at the same time.
Multi-label classification: a classification task where each sample is mapped to a set of target labels (more than one class). E.g., a news article can be about sports, a person, and a location at the same time.
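The difference between the three settings shows up in how the target labels are stored. The label values in this small Python sketch are invented for illustration:

```python
# Binary: each sample gets one of exactly two labels.
y_binary = ["male", "female", "male"]

# Multi-class: each sample gets exactly one of several labels.
y_multiclass = ["cat", "dog", "cat"]

# Multi-label: each sample gets a set of labels, possibly several at once.
y_multilabel = [{"sports", "person", "location"}, {"location"}, {"sports"}]

def is_multilabel(y):
    # A task is multi-label when some sample carries more than one target label.
    return any(isinstance(labels, set) and len(labels) > 1 for labels in y)
```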
13. Naïve Bayes
The Naive Bayes algorithm is based on Bayes’ theorem, with the assumption of independence between every pair of features. Naive Bayes classifiers work well in many real-world situations, such as document classification and spam filtering.
This algorithm requires a small amount of
training data to estimate the necessary
parameters. Naive Bayes classifiers are
extremely fast compared to more
sophisticated methods.
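As a rough sketch of how such a classifier can be built for document classification: the Python code below assumes a bag-of-words representation and Laplace smoothing, neither of which is specified in the text, and the tiny training set in the usage example is invented.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    # docs: list of (word_list, label) pairs.
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)   # per-class word frequencies
    vocab = set()
    for words, label in docs:
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def classify_nb(model, words):
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_logp = None, float("-inf")
    for label, n_docs in class_counts.items():
        # log P(class) + sum of log P(word | class), with Laplace smoothing;
        # summing the per-word terms reflects the naive independence assumption.
        logp = math.log(n_docs / total)
        n_words = sum(word_counts[label].values())
        for w in words:
            logp += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label
```

Trained on a handful of labeled word lists (e.g., "spam" vs. "ham" messages), the model classifies a new document by picking the class with the highest log-probability.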
14. Fuzzy Set Approaches
Fuzzy set theory is also called possibility theory. It was proposed by Lotfi Zadeh in 1965 as an alternative to two-valued logic and probability theory.
The theory allows us to work at a high level of abstraction. It also provides us with a means of dealing with imprecise measurement of data.
In the fuzzy set approach, an important consideration is the treatment of data from a linguistic viewpoint. From this has developed an approach that uses linguistically quantified propositions to summarize the content of a database by providing a general characterization of the analyzed data.
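A membership function for a linguistic term shows how fuzzy sets replace two-valued logic with degrees of membership. The term "tall" and the 160 cm / 190 cm ramp endpoints below are assumptions chosen only to illustrate the idea:

```python
def tall_membership(height_cm):
    # Degree to which a height belongs to the fuzzy set "tall":
    # 0 below 160 cm, 1 above 190 cm, and a linear ramp in between
    # (the thresholds are illustrative, not from the text).
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30
```

A height of 175 cm is then "somewhat tall" with membership 0.5, rather than being forced into a tall / not-tall decision.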