Prediction Accuracy Estimation:
Applications to Linear Regression,
Regression Trees &
Support Vector Regression

Hildegard Erasmus & Morné Lamont
University of Stellenbosch
Department of Statistics & Actuarial Science
South African Statistical Association Conference 2013
 Introduction
 Classical linear regression
 Alternative techniques
- Regression Trees
- Support Vector Regression
 Advantages & Disadvantages
 Simulation study
 Conclusion
 Regression as a scientific method first appeared around
1885 (Izenman, 2008)
 Since then: regression has evolved into a variety of forms,
including linear, non-linear, parametric &
nonparametric
 Objective of study:
- theoretical considerations regarding different regression
techniques
- highlight advantages & disadvantages
- assess performance of different techniques
- identify conditions for good predictive performance
 Method of Least Squares (LS):
- Originated: Astronomy (1805)
- Legendre: developed LS method to determine the
orbits of planets
- Gauss & Laplace: Gaussian curve to describe error
component, crucial to success of the LS method
- Gauss: claimed to have used the method of estimating
coefficients of a set of linear equations by
minimizing the error sum of squares since 1809
- Galton: developed ideas of regression & correlation
: failed to link LS with regression
- Yule: replaced the Gaussian error curve assumption with
an assumption of linearly related variables (1897)
: proved that LS could be applied in regression
 Classical Linear Regression:
- Model: $y = X\beta + \varepsilon$
with $y$ : (n x 1) dependent or response variable
     $\beta$ : unknown parameters
     $X$ : design matrix with jth row $(x_{j1}, x_{j2}, \ldots, x_{jp})$
     $\varepsilon$ : (n x 1) error term
- Assumptions (error term):
1. $E(\varepsilon) = 0$
2. $E(\varepsilon \varepsilon') = \sigma^2 I$
3. Normally distributed
- Parameter estimation:
- Regression coefficients ($\beta$) and error variance ($\sigma^2$)
- Method of Least Squares
- Estimation from data:
$\hat{\beta} = (X'X)^{-1}X'y$ and $\hat{\sigma}^2 = (y - X\hat{\beta})'(y - X\hat{\beta})/(n - p)$
 Application in R:
- Function: lm, predict
[Figure: "Linear regression": scatter plot of y against x with the fitted least-squares line.]
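
A minimal sketch of this workflow in R (simulated data, not the study's data sets); the closed-form LS estimate $\hat{\beta} = (X'X)^{-1}X'y$ is checked against lm:

    set.seed(1)
    x <- runif(50, 150, 350)                  # predictor
    y <- 2 + 0.08 * x + rnorm(50, sd = 2)     # response with Gaussian error
    fit <- lm(y ~ x)                          # least-squares fit
    coef(fit)                                 # beta-hat from lm

    X <- cbind(1, x)                          # design matrix with intercept column
    solve(t(X) %*% X, t(X) %*% y)             # (X'X)^{-1} X'y: the same estimate

    predict(fit, newdata = data.frame(x = c(200, 300)))   # predictions at new inputs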
 Alternative techniques:
1. Regression Trees
- Bagging
- Boosting
- Random Forest
2. Support Vector Regression
Regression Trees
 Main idea:
- Nonparametric method to predict response variable y
from known input variables (Izenman, 2008)
- Classification and Regression Trees (CART) algorithm:
use recursive partitioning of input space into non-
overlapping rectangular (r=2) or cubic (r>2) regions &
fit simple prediction model within each partition
- Constant value assigned to each region as prediction
- Tree: graphical representation of partitioning
 Setup:
- Learning data: $\{(\mathbf{x}_i, y_i),\ i = 1, 2, \ldots, n\}$, with inputs $\mathbf{x}_i \in \mathbb{R}^r$
[Figure: "Partitioning of Housing Data": a regression tree splitting on ZN (e.g. ZN < 16.57, ZN < 5.75) and CRIM (CRIM < 48.75), with terminal-node predictions between -3.54 and 2.26, alongside the corresponding rectangular partition of the (CRIM, ZN) input space.]
 Aspects to consider:
- Choose splitting conditions at each node
- Decision rule for when a node should be terminal
(node that does not split into two daughter nodes)
- Rule for assigning a predicted response value to every
terminal node
 Application in R:
- Package: rpart
- Function: rpart, predict
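
A minimal sketch of the rpart workflow (simulated data; the Housing data of the figure are not reproduced here):

    library(rpart)
    set.seed(1)
    dat <- data.frame(x1 = runif(200), x2 = runif(200))
    dat$y <- ifelse(dat$x1 < 0.5, 1, 3) + rnorm(200, sd = 0.3)   # step-shaped signal
    tree <- rpart(y ~ x1 + x2, data = dat, method = "anova")     # regression tree
    printcp(tree)                          # splits chosen by recursive partitioning
    predict(tree, newdata = dat[1:5, ])    # constant prediction per terminal node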
 Bagging (Bootstrap aggregating):
- Procedure combines an ensemble of learning algorithms
to improve performance over a single algorithm
(Breiman, 1996)
- Designed to reduce variance & improve stability
- Independently construct trees on bootstrap samples; predictions
combined by majority vote (classification) or averaging (regression)
(see the sketch after this list)
 Boosting:
- Reduces the high bias of predictors that underfit the data
- Enhances accuracy of a “weak” (slightly >50% accuracy)
binary classification learning algorithm
- Successive trees give extra weight to incorrectly
predicted points; weighted vote taken for prediction
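
A minimal hand-rolled sketch of bagged regression trees (simulated data; packages such as ipred for bagging and gbm for boosting provide ready-made implementations):

    library(rpart)
    set.seed(1)
    dat <- data.frame(x = runif(200))
    dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

    B <- 100
    preds <- replicate(B, {
      idx <- sample(nrow(dat), replace = TRUE)   # bootstrap sample
      fit <- rpart(y ~ x, data = dat[idx, ])     # tree grown on the resample
      predict(fit, newdata = dat)                # predictions on all points
    })
    bagged <- rowMeans(preds)    # averaging over trees: the bagged prediction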
 Random Forest:
- Add additional layer of randomness to bagging
(Breiman, 2001)
- Construct each tree using a different bootstrap sample
- Different tree construction:
split each node using the best among a randomly
chosen set of predictors (Liaw & Wiener, 2002)
- Robust against overfitting
- Very user-friendly: only two parameters
(mtry: number of variables in the random subset;
ntree: number of trees in the forest)
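
A minimal sketch with the randomForest package (Liaw & Wiener, 2002), on simulated data:

    library(randomForest)
    set.seed(1)
    dat <- data.frame(x1 = runif(300), x2 = runif(300), x3 = runif(300))
    dat$y <- 2 * dat$x1 - dat$x2^2 + rnorm(300, sd = 0.2)

    rf <- randomForest(y ~ ., data = dat, mtry = 2, ntree = 500)   # the two parameters
    tail(rf$mse, 1)                      # out-of-bag MSE after all 500 trees
    predict(rf, newdata = dat[1:5, ])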
Support Vector Regression (SVR)
 Main idea:
- “...computation of a linear regression function in a high
dimensional feature space where the input data are
mapped via a nonlinear function.”
(Basak, Pal, & Patranabis, 2007)
- Involves minimizing a convex loss function, or equivalently
solving a quadratic optimization problem under
given constraints
 Setup:
- Training data: $\{(x_1, y_1), \ldots, (x_\ell, y_\ell)\} \subset \mathcal{X} \times \mathbb{R}$
where $\mathcal{X}$ : space of input patterns, say $\mathbb{R}^d$
      $x_i$ : consists of predictors $x_{i1}, \ldots, x_{id}$, each of dimension 1
      $d$ : number of variables
 Goal:
- Find a function that deviates from the actual responses by
at most $\varepsilon$ while ensuring small coefficient values
- An $\varepsilon$-tube is formed around the true regression function that
contains most of the data points
- Points falling outside of the tube: described by
introducing slack variables $\xi_i, \xi_i^*$ (Smola & Schölkopf, 2003)
- Approximate training data with a linear function:
$f(x) = \langle w, x \rangle + b$
- Find the function that will
minimize $\frac{1}{2}\|w\|^2 + C \sum_{i=1}^{\ell} (\xi_i + \xi_i^*)$
subject to
$y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i$
$\langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*$
$\xi_i, \xi_i^* \ge 0$
- SV expansion in input space ($\mathcal{X}$):
$f(x) = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^*) \langle x_i, x \rangle + b$
$\alpha_i, \alpha_i^*$ : obtained via quadratic optimization
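
The quadratic program referred to here is the standard dual of the primal problem above (Smola & Schölkopf, 2003):

maximize $-\frac{1}{2}\sum_{i,j=1}^{\ell}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\langle x_i, x_j \rangle - \varepsilon\sum_{i=1}^{\ell}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{\ell} y_i(\alpha_i - \alpha_i^*)$

subject to $\sum_{i=1}^{\ell}(\alpha_i - \alpha_i^*) = 0$ and $\alpha_i, \alpha_i^* \in [0, C]$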
 Kernel functions:
- Kernel: a function $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ such that for all $x, x' \in \mathcal{X}$,
$K(x, x') = \langle \Phi(x), \Phi(x') \rangle$
- Used to compute inner products of the form $\langle \Phi(x), \Phi(x') \rangle$
in feature space via a nonlinear kernel evaluated in input space
- High dimensionality of the feature space makes computing $\Phi$ explicitly
computationally expensive or impossible
- SV expansion in feature space $\mathcal{F}$:
$f(x) = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^*) K(x_i, x) + b$
- Examples of kernel functions:

  Name       | Kernel function                                                         | Dim of feature space
  Polynomial | $K(x, x') = (\langle x, x' \rangle + 1)^p$, $p \in \mathbb{N}$          | $\frac{(d+p)!}{d!\,p!}$
  Gaussian   | $K(x, x') = \exp(-\|x - x'\|^2 / (2\sigma^2))$, $\sigma > 0$            | $\infty$
  ANOVA      | $K(x, x') = \left(\sum_{k=1}^{d} \exp(-\sigma (x_k - x'_k)^2)\right)^p$ | $\infty$
 Application in R:
- Package: kernlab
- Function: ksvm, predict
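
A minimal sketch with kernlab (simulated data): the Gram matrix illustrates the kernel trick, and ksvm with type "eps-svr" fits the $\varepsilon$-regression above. Note that kernlab parametrizes the Gaussian kernel as $\exp(-\sigma\|x - x'\|^2)$:

    library(kernlab)
    set.seed(1)

    # Kernel trick: feature-space inner products without an explicit feature map
    z <- matrix(rnorm(20), ncol = 2)    # 10 points in R^2
    rbf <- rbfdot(sigma = 1)            # Gaussian kernel object
    K <- kernelMatrix(rbf, z)           # 10 x 10 Gram matrix
    all.equal(K[1, 2], exp(-sum((z[1, ] - z[2, ])^2)))   # TRUE

    # eps-SVR fit; C and epsilon correspond to the primal problem above
    x <- matrix(runif(200), ncol = 1)
    y <- sin(2 * pi * x[, 1]) + rnorm(200, sd = 0.2)
    fit <- ksvm(x, y, type = "eps-svr",
                kernel = "rbfdot", kpar = list(sigma = 2),
                C = 10, epsilon = 0.1)
    nSV(fit)                              # number of support vectors (sparsity)
    predict(fit, x[1:5, , drop = FALSE])  # fitted values at the first five inputs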
  LINEAR REGRESSION                    | REGRESSION TREES                           | SVR

Advantages:
  Easy to estimate parameters          | Conceptually simple                        | Good generalization performance
  Computationally easy                 | Highly interpretable                       | Wide, real-world application
                                       | Handle missing values well                 | Sparsity of support vectors
                                       | Resistant to outliers                      | Perform well when p >> n
                                       | No normality assumptions                   | No normality assumptions

Disadvantages:
  Assumptions                          | Trees: unstable, high variance             | How to estimate parameters
  Outliers/influential values          | Lack of smoothness                         | Parameter estimation is computationally intensive
  Multicollinearity                    | Step function: values not always accurate  | Which kernels to choose
  Variable selection needed if p >> n  |                                            |
[Figure: "Fitted models" on simulated data. Left panel: linear regression, regression tree, and random forest fits; right panel: SVR fits with polynomial, Gaussian, and ANOVA kernels.]
 Software: R
 Data sets: (Johnson & Wichern, 2007)
 Parameter estimation: R functions optimize and optim
 Prediction accuracy measures:
$MSE = \frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2$ and $MAPE = \frac{100}{m}\sum_{i=1}^{m}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$
 Cross-validation: 100 simulations
: 70% training data, 30% test data
 Bootstrap: 1000 bootstrap samples
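
A minimal sketch of the cross-validation scheme (hypothetical data frame dat with response y; any of the fitted techniques can replace lm):

    set.seed(1)
    mse  <- function(y, yhat) mean((y - yhat)^2)
    mape <- function(y, yhat) 100 * mean(abs((y - yhat) / y))

    results <- replicate(100, {                        # 100 random splits
      idx   <- sample(nrow(dat), size = floor(0.7 * nrow(dat)))
      train <- dat[idx, ]                              # 70% training data
      test  <- dat[-idx, ]                             # 30% test data
      fit   <- lm(y ~ ., data = train)
      yhat  <- predict(fit, newdata = test)
      c(MSE = mse(test$y, yhat), MAPE = mape(test$y, yhat))
    })
    rowMeans(results)                                  # Mean columns of the tables
    apply(results, 1, sd)                              # Sd columns of the tables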
 Results (Natural gas data):

  Technique          |    | MSE Mean  | MSE Sd   | MAPE Mean | MAPE Sd
  Linear Regression  | CV | 398.2812  | 92.7482  | 5.2777    | 0.6665
                     | BS | 335.9571  | 19.6197  | 4.8414    | 0.1375
  Regression Tree    | CV | 1006.6497 | 431.5992 | 8.7364    | 1.9983
                     | BS | 628.2681  | 122.0066 | 6.7289    | 0.6826
  Random Forest      | CV | 441.2309  | 124.8267 | 5.5900    | 0.9815
                     | BS | 223.4301  | 51.7075  | 3.3958    | 0.3513
  SVR: Polynomial    | CV | 692.9640  | 487.3593 | 6.2902    | 1.6811
                     | BS | 456.7031  | 333.1714 | 3.6937    | 0.7804
  SVR: Gaussian      | CV | 378.6259  | 102.3776 | 5.3426    | 0.8103
                     | BS | 297.2053  | 32.3990  | 4.6845    | 0.2609
  SVR: ANOVA         | CV | 4363.4710 | 860.2588 | 20.3320   | 3.3114
                     | BS | 4044.8280 | 766.1202 | 19.2571   | 1.0407

  CV: Cross-validation; BS: Bootstrap
 Results (Pulp & Paper data):

  Technique          |    | MSE Mean | MSE Sd | MAPE Mean | MAPE Sd
  Linear Regression  | CV | 2.8228   | 1.3047 | 5.7085    | 1.2441
                     | BS | 2.3586   | 0.4082 | 5.2373    | 0.4720
  Regression Tree    | CV | 2.9881   | 0.9126 | 6.1803    | 1.1149
                     | BS | 1.7547   | 0.2600 | 4.6158    | 0.3316
  Random Forest      | CV | 1.4502   | 0.5845 | 4.3711    | 0.9242
                     | BS | 0.7125   | 0.2230 | 2.5465    | 0.2797
  SVR: Polynomial    | CV | 1.7355   | 1.0912 | 4.7764    | 1.2274
                     | BS | 2.3586   | 0.4082 | 5.2373    | 0.4720
  SVR: Gaussian      | CV | 1.4788   | 0.6458 | 4.4090    | 0.9694
                     | BS | 1.9256   | 6.4087 | 3.8799    | 1.1653
  SVR: ANOVA         | CV | 5.6161   | 1.0900 | 9.1277    | 1.2696
                     | BS | 0.7727   | 0.2732 | 2.7881    | 0.3638
 Conclusion:
- SVR with Gaussian kernel performed the best for the
Natural gas data set
- Random Forest technique performed the best for the
Pulp and Paper data set
- SVR with ANOVA kernel performed the worst for both
data sets
- Linear regression expected to perform well when linear
relations exist; alternative techniques expected to
outperform when more complex relationships are present
- Cross-validation and the Bootstrap produced corresponding results
in the calculation of the MSE and MAPE measures
- Could add Out-of-Bag procedure for additional validation
 References:
- Basak, D., Pal, S., & Patranabis, D. C. (2007). Support Vector Regression.
Neural Information Processing, 203-218.
- Breiman, L. (1996). Bagging predictors. Machine Learning, 123-140.
- Breiman, L. (2001). Random Forests. Machine Learning, 5-32.
- Izenman, A. J. (2008). Modern Multivariate Statistical Techniques. New York:
Springer Science+Business Media.
- Johnson, R. A., & Wichern, D. W. (2007). Applied Multivariate Statistical
Analysis. NJ: Pearson Education, Inc.
- Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest. R
News.
- Smola, A. J., & Schölkopf, B. (2003). A Tutorial on Support Vector Regression.
Statistics and Computing, 199-222.