2. When to consider Linear Regression?
When the outcome, or class, is numeric, and all the attributes are numeric.
The idea is to express the class as a linear combination of the attributes, with predetermined weights:
x = w0 + w1a1 + w2a2 + … + wkak
where x is the class; a1, a2, …, ak are the attribute values; and w0, w1, …, wk are the weights.
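To make the formula concrete, here is a small stand-alone Java sketch that evaluates this weighted sum for one instance. The weights and attribute values are invented for illustration; they are not output from any real model:

    public class LinearModel {
        // Hypothetical weights; W[0] is the intercept w0, W[i] is wi.
        static final double[] W = {2.0, 0.5, -1.25, 3.0};

        // Computes x = w0 + w1*a1 + ... + wk*ak for one instance.
        static double predict(double[] a) {
            double x = W[0];
            for (int i = 0; i < a.length; i++) {
                x += W[i + 1] * a[i];
            }
            return x;
        }

        public static void main(String[] args) {
            double[] attributes = {1.0, 2.0, 0.5}; // a1, a2, a3
            System.out.println("Predicted class: " + predict(attributes));
        }
    }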
5. Linear Regression in Weka
Options specific to weka.classifiers.functions.LinearRegression:
-D: Produce debugging output (default: disabled).
-S <selection method>: Set the attribute selection method to use: 1 = None, 2 = Greedy (default: 0 = M5' method).
-C: Do not try to eliminate collinear attributes.
-R <double>: Set the ridge parameter (default: 1.0e-8).
Each of these flags can also be set through the Weka API, as sketched below.
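For instance, the same flags can be passed programmatically through setOptions. The following is a minimal sketch assuming Weka 3.x on the classpath; the file name houses.arff is a placeholder for any ARFF dataset with a numeric class:

    import weka.classifiers.functions.LinearRegression;
    import weka.core.Instances;
    import weka.core.Utils;
    import weka.core.converters.ConverterUtils.DataSource;

    public class LinearRegressionOptionsDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("houses.arff").getDataSet();
            data.setClassIndex(data.numAttributes() - 1); // class = last attribute

            LinearRegression lr = new LinearRegression();
            // Same flags as on the command line: no attribute selection,
            // keep collinear attributes, explicit ridge value.
            lr.setOptions(Utils.splitOptions("-S 1 -C -R 1.0e-8"));
            lr.buildClassifier(data);

            System.out.println(lr); // prints the fitted weights w0, w1, ..., wk
        }
    }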
6. Linear Regression in Weka
-S <selection method>: Set the method used to select the attributes for use in the linear regression:
0 = M5' method. Builds trees whose leaves are associated with multivariate linear models; each node of the tree splits on the attribute that maximizes the expected error reduction, as measured by the Akaike information criterion (a measure of the relative goodness of fit of a statistical model).
7. Linear Regression in Weka
1 = None. Needs no explanation: no attribute selection is performed.
2 = Greedy. "For example, a greedy strategy for the traveling salesman problem (which is of a high computational complexity) is the following heuristic: 'At each stage visit an unvisited city nearest to the current city.' This heuristic need not find a best solution but terminates in a reasonable number of steps; finding an optimal solution typically requires unreasonably many steps" (from Wikipedia).
An API sketch for choosing among these three methods follows below.
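The selection method can also be chosen through the API instead of the -S flag. A minimal sketch, assuming the same placeholder dataset as before; the SELECTION_* constants and TAGS_SELECTION are taken from the Weka API documentation cited in the references:

    import weka.classifiers.functions.LinearRegression;
    import weka.core.Instances;
    import weka.core.SelectedTag;
    import weka.core.converters.ConverterUtils.DataSource;

    public class SelectionMethodDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("houses.arff").getDataSet(); // placeholder file
            data.setClassIndex(data.numAttributes() - 1);

            LinearRegression lr = new LinearRegression();
            // Equivalent to -S 2; the constants map to 0 = M5', 1 = None, 2 = Greedy.
            lr.setAttributeSelectionMethod(new SelectedTag(
                LinearRegression.SELECTION_GREEDY, LinearRegression.TAGS_SELECTION));
            lr.buildClassifier(data);
            System.out.println(lr);
        }
    }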
8. Linear Regression in Weka
-C: Do not try to eliminate collinear attributes.
Possible examples of collinear (strongly correlated) attributes: performance and price, as in high-performance, expensive German cars versus low-performance, cheap American cars. The API equivalent of the flag is sketched below.
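A minimal sketch of the API equivalent of the -C flag; the setter name follows the Weka API documentation cited in the references (note its one-'l' spelling, "Colinear"):

    import weka.classifiers.functions.LinearRegression;

    public class CollinearityFlagDemo {
        public static void main(String[] args) {
            LinearRegression lr = new LinearRegression();
            // Equivalent to -C: keep collinear attributes rather than eliminating them.
            lr.setEliminateColinearAttributes(false);
            System.out.println("Eliminate collinear attributes? "
                + lr.getEliminateColinearAttributes());
        }
    }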
9. Linear Regression in Weka
-R <double>: Set the ridge parameter (default: 1.0e-8).
Its value is assigned by the analyst and determines how far Ridge Regression departs from Least Squares Regression; the goal of the ridge term is to circumvent the problem of predictor collinearity.
If the value is too small, Ridge Regression cannot fight collinearity effectively.
If it is too large, the bias of the parameters becomes too large, and so do the mean square errors of the parameters and the predictions.
It therefore has to be estimated by trial and error, usually by resorting to cross-validation, as in the sketch below.
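One plausible way to run such a cross-validated search with the Weka API; the grid of candidate ridge values and the placeholder dataset are assumptions made for illustration:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.LinearRegression;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RidgeSearchDemo {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("houses.arff").getDataSet(); // placeholder file
            data.setClassIndex(data.numAttributes() - 1);

            double bestRidge = Double.NaN;
            double bestRmse = Double.MAX_VALUE;
            // Arbitrary grid spanning several orders of magnitude.
            for (double ridge : new double[]{1e-8, 1e-4, 1e-2, 1.0, 100.0}) {
                LinearRegression lr = new LinearRegression();
                lr.setRidge(ridge);
                Evaluation eval = new Evaluation(data);
                // 10-fold cross-validation with a fixed seed for reproducibility.
                eval.crossValidateModel(lr, data, 10, new Random(1));
                double rmse = eval.rootMeanSquaredError();
                System.out.printf("ridge=%g  RMSE=%.4f%n", ridge, rmse);
                if (rmse < bestRmse) {
                    bestRmse = rmse;
                    bestRidge = ridge;
                }
            }
            System.out.println("Best ridge value: " + bestRidge);
        }
    }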
10. References
I. Witten, E. Frank and M. Hall. Data Mining: Practical Machine Learning Tools and Techniques (Third Edition). Elsevier, MA, USA, 2011.
Weka API, class LinearRegression. Extracted on October 16, 2012 from http://weka.sourceforge.net/doc/weka/classifiers/functions/LinearRegre
D. Rodríguez, J.J. Cuadrado, M.A. Sicilia and R. Ruiz. Segmentation of Software Engineering Datasets Using the M5 Algorithm. Extracted on October 14, 2012 from http://www.cc.uah.es/drg/c/ICCS06.pdf
AI Access. Ridge Regression. Extracted on October 16, 2012 from http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_ridge.htm