CLASSIFICATION OF VOLTAGE
DISTURBANCES USING SVMs AND
BOOSTING CLASSIFIERS
September 29, 2015
Submitted by
MOHAN KASHYAP.P (SC13M055)
M-Tech in MACHINE LEARNING AND COMPUTING
Department of Mathematics
List of Figures

2.1 Simulink model for generation of different kinds of electrical faults and their analysis
3.1 Confusion matrix for the Gradient Boosting classifier with 100 estimators
3.2 Confusion matrix for the AdaBoost classifier with 100 estimators
3.3 Confusion matrix for the Random Forest classifier with 100 estimators
Contents

1 Introduction
2 Simulink model and Algorithmic implementation
  2.1 Simulink modeling
    2.1.1 Classification of faults
    2.1.2 Extraction of different features
  2.2 Algorithm Implementation
    2.2.1 Generation of data points
    2.2.2 Implementation of LIBSVM
    2.2.3 Implementation of Gradient Boosting algorithm
    2.2.4 AdaBoost Classifier
    2.2.5 Random Forest Classifiers
3 Results
  3.1 Results for Gradient Boosting Classifier
  3.2 Results for AdaBoost Classifier
  3.3 Results for Random-Forest Classifier
    3.3.1 Results for SVM
4 Conclusion and Future work
  4.1 Conclusion
  4.2 Future work
Abstract
This report briefly describes the classification of voltage disturbances using support vector machines and boosting classifiers. The process starts with building a MATLAB/Simulink model for the simulation of various electrical faults, followed by feature extraction to generate the data set, and finally classification of the obtained data using boosting and SVM. The results obtained were found to be promising.
Chapter 1
Introduction
Over the past two decades utilities worldwide have gone through radical changes. One big change is the deregulation of the energy market that has taken place in a number of countries. Another change is that today's customers are more demanding than customers in the past. These changes have forced utilities to become even more customer-oriented, where high network reliability and good power quality are increasingly important to keep customers satisfied. Therefore, recording and analyzing voltage disturbances (also referred to as voltage events) and power quality abnormalities have become vital to better understanding the behavior of the power network. Disturbance data and power quality data have become important information both for statistical purposes and as decision-making input in mitigation projects. Reliable disturbance and power quality information also opens up a proactive maintenance approach focused on increasing power network reliability. Today most disturbance data is analyzed manually by specialists. However, a lot of time could be saved if unimportant or minor disturbances could be removed or classified automatically, so that the specialists could focus on solving more sophisticated disturbance problems. This requires the development of robust automatic classification systems. During the last few years yet another classification method, the support vector machine (SVM), has become increasingly popular due to its attractive theoretical and practical characteristics. The SVM, which is based on statistical learning theory, is a general classification method. In recent years boosting algorithms, which are based on the ensemble learning approach, have also slowly gained popularity.
Chapter 2
Simulink model and Algorithmic implementation
2.1 Simulink modeling
The basic idea is as follows: a Simulink model was used to simulate different kinds of faults and to extract features, which in turn were used to generate the data points. On the generated data points, the multiclass SVM, AdaBoost, Gradient Boosting and Random Forest algorithms were implemented. The Simulink model used, the faults, and the various features extracted are described below and shown in the figure that follows.
2.1.1 Classification of faults
Four different kinds of faults are classified, as described below:

• single line to ground fault (SLG): one line is faulty and connected to ground; the other two lines function normally.

• double line to ground fault (DLG): two lines are faulty and connected to ground; one line functions normally.

• three phase to ground fault: all three lines are faulty and connected to ground.

• unequally distributed faults: the voltages on the three lines are improperly distributed.
Figure 2.1: Simulink model for generation of different kinds of electrical faults
and their analysis
2.1.2 Extraction of different features
1. RMS (root mean square) value: when a fault occurs, one of the phase voltages will go to zero or the voltages will be unevenly distributed. The RMS value of the voltage is therefore a distinguishing factor and one of the features used for classification. The formula for the RMS value is
f_{rms} = \lim_{T \to \infty} \sqrt{\frac{1}{T} \int_0^T [f(t)]^2 \, dt}

where f_{rms} = V_{rms} is the RMS value of the voltage, f(t) = v(t) = v_m \sin(\omega t) is the voltage magnitude at the various time instants, and T is the time period of the sinusoidal wave.
2. Total harmonic distortion (THD): when the input is a sine wave, this measurement is most commonly defined as the ratio of the RMS amplitude of a set of higher harmonic frequencies to the RMS amplitude of the first harmonic, or fundamental, frequency. Since the voltage magnitude differs across the phases during a fault, the RMS values differ, and therefore the THD values also differ across the three phases; this makes THD useful for classifying voltage disturbances. The equation for THD is given below:
THD_F = \frac{\sqrt{V_2^2 + V_3^2 + V_4^2 + \cdots + V_n^2}}{V_1}

where V_i is the RMS voltage of the i-th harmonic, i = 1, 2, 3, ..., n, and i = 1 is the fundamental frequency.
3. Magnitude and phase sequence analyzer: the magnitude sequence analyzer gives the changes in voltage magnitude at various time instants, and the phase sequence analyzer analyzes the phase change with respect to time. Since an AC voltage is represented in polar form, and the magnitudes of the three phases differ for different kinds of faults, these quantities can be used for fault classification. The polar representation is

V = R∠θ

where R is the magnitude of the voltage and θ is its phase.
4. Voltage at various selected intervals.

These were the features extracted in order to differentiate the different kinds of faults. A sketch of how the RMS and THD features could be computed from sampled waveforms is given below.
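As an illustration, here is a minimal Python sketch, not the project's actual code, of how the RMS and THD features could be computed from a sampled phase voltage; the sampling rate, window length and FFT-based harmonic estimation are assumptions.

```python
# Minimal sketch (not the report's code) of the RMS and THD features,
# assuming a sampled phase-voltage waveform v at sampling rate fs.
import numpy as np

def rms(v):
    """Discrete RMS estimate: square root of the mean of squared samples."""
    return np.sqrt(np.mean(v ** 2))

def thd(v, fs, f0=50.0, n_harmonics=10):
    """RMS of harmonics 2..n divided by the RMS of the fundamental."""
    spectrum = np.abs(np.fft.rfft(v)) / len(v)
    freqs = np.fft.rfftfreq(len(v), d=1.0 / fs)
    # Take the spectral magnitude closest to each harmonic of f0.
    mags = [spectrum[np.argmin(np.abs(freqs - k * f0))]
            for k in range(1, n_harmonics + 1)]
    return np.sqrt(np.sum(np.square(mags[1:]))) / mags[0]

# Example: a 50 Hz sine with a small 3rd-harmonic component (assumed values).
fs = 10_000                       # samples per second
t = np.arange(0, 0.2, 1.0 / fs)   # a 0.2 s analysis window
v = np.sin(2 * np.pi * 50 * t) + 0.05 * np.sin(2 * np.pi * 150 * t)
print(rms(v), thd(v, fs))         # RMS ~ 1/sqrt(2), THD ~ 0.05
```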
2.2 Algorithm Implementation
The classification of the generated data was performed using support vector machines and the Gradient Boosting, AdaBoost and Random Forest classifiers. The multiclass classification was performed using the one-versus-all method.
2.2.1 Generation of data points
The next step was to generate the data points. This was done by feature extraction: all the data points were generated using the workspace block of Simulink, and the orientation of the data points, their class labels and the normalization of the data were handled in Excel.
There were 809 data points belonging to 4 different classes. The number of features was reduced to 9 by data preprocessing.
2.2.2 Implementation of LIBSVM
After generation of the data points, an SVM implementation is needed to classify the data. Multiclass SVM was implemented using LIBSVM, which is an open-source library. The important parameter to be tuned was the SVM kernel. From 5-fold cross-validation we found that the RBF kernel fits our data best. A sketch of this step is given below.
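As an illustration, the following minimal sketch reproduces this step with scikit-learn's SVC, which wraps LIBSVM, together with an explicit one-versus-all wrapper to match the multiclass scheme described above; the data file names and the 80/20 split are assumptions, not the project's actual setup.

```python
# Minimal sketch of the SVM step, assuming the 809 x 9 normalized data set
# is available as NumPy arrays (file names here are hypothetical).
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = np.load("features.npy"), np.load("labels.npy")  # hypothetical files
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = OneVsRestClassifier(SVC(kernel="rbf"))  # RBF kernel chosen by CV
print("5-fold CV accuracies:", cross_val_score(clf, X_train, y_train, cv=5))
print("test accuracy:", clf.fit(X_train, y_train).score(X_test, y_test))
```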
2.2.3 Implementation of Gradient Boosting algorithm
Boosting algorithms are a set of machine learning algorithms which build a strong classifier from a set of weak classifiers, typically decision trees. Gradient boosting is one such algorithm: it builds the model in a stage-wise fashion and generalizes it by allowing optimization of an arbitrary differentiable loss function. The differentiable loss function in our case is the binomial deviance loss function. The algorithm is implemented as described in (Friedman et al., 2001):
Input: a training set (X_i, y_i), i = 1, ..., n, with X_i ∈ H ⊆ R^n and y_i ∈ {−1, 1}; a differentiable loss function L(y, F(X)), which in our case is the binomial deviance loss log(1 + exp(−2yF(X))); and M, the number of iterations.

1. Initialize the model with a constant value:
   F_0(X) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma).

2. For m = 1 to M:

   (a) Compute the pseudo-responses:
       r_{im} = -\left[ \frac{\partial L(y_i, F(X_i))}{\partial F(X_i)} \right]_{F(X) = F_{m-1}(X)}, for i = 1, ..., n.

   (b) Fit a base learner h_m(X) to the pseudo-responses, using the training set \{(X_i, r_{im})\}_{i=1}^{n}.

   (c) Compute the multiplier \gamma_m by solving the optimization problem:
       \gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, F_{m-1}(X_i) + \gamma h_m(X_i)).

   (d) Update the model: F_m(X) = F_{m-1}(X) + \gamma_m h_m(X).

3. Output F_M(X) = \sum_{m=1}^{M} \gamma_m h_m(X).

The value of the weight \gamma_m is found by an approximate Newton-Raphson step:
\gamma_m = \frac{\sum_{X_i \in h_m} r_{im}}{\sum_{X_i \in h_m} |r_{im}| (2 - |r_{im}|)}.
The parameters to be tuned were the number of classifiers in the ensemble and the base classifier. With 5-fold cross-validation, the optimal number of base estimators was found to be 100, with a decision tree as the base classifier. A sketch of this step follows.
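A minimal sketch of this step, reusing the assumed data split from the SVM sketch above; scikit-learn's GradientBoostingClassifier implements Friedman's stage-wise algorithm with a deviance (log-loss) objective and regression-tree base learners.

```python
# Minimal sketch, assuming X_train/X_test, y_train/y_test from the split
# defined in the SVM sketch above.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

gbc = GradientBoostingClassifier(n_estimators=100)  # 100 chosen by 5-fold CV
print("5-fold CV accuracies:", cross_val_score(gbc, X_train, y_train, cv=5))
print("test accuracy:", gbc.fit(X_train, y_train).score(X_test, y_test))
```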
2.2.4 AdaBoost Classifier
In AdaBoost we assign (non-negative) weights to the points in the data set; the weights are normalized so that they form a distribution. In each iteration, we generate a training set by sampling from the data using the weights, i.e. data point (X_i, y_i) is chosen with probability w_i, where w_i is the current weight of that data point. We generate the training set by such repeated independent sampling. After learning the current classifier, we increase the (relative) weights of the data points misclassified by it, generate a fresh training set using the modified weights, and so on. The final classifier is essentially a weighted majority vote over all the classifiers. The algorithm as described in (Freund et al., 1995) is given below:
Input: n examples (X_1, y_1), ..., (X_n, y_n), with X_i ∈ H ⊆ R^n and y_i ∈ {−1, 1}.

1. Initialize: w_i(1) = 1/n for all i. Each data point starts with equal weight, so when data points are sampled from this probability distribution, every point is equally likely to enter the training set.

2. Assume there are M classifiers in the ensemble. For m = 1 to M:

   (a) Generate a training set by sampling with w_i(m).

   (b) Learn classifier h_m using this training set.

   (c) Let \xi_m = \sum_{i=1}^{n} w_i(m) \, I_{[y_i \neq h_m(X_i)]}, where I_A is the indicator function of A, defined as
       I_A = 1 if y_i \neq h_m(X_i), and I_A = 0 otherwise,
       so \xi_m is the weighted error of the m-th classifier.

   (d) Set \alpha_m = \log\left(\frac{1 - \xi_m}{\xi_m}\right), the hypothesis weight; \alpha_m > 0 because of the assumption that \xi_m < 0.5.

   (e) Update the weight distribution over the training set as
       w_i(m+1) = w_i(m) \exp(\alpha_m I_{[y_i \neq h_m(X_i)]}),
       then normalize the updated weights so that w_i(m+1) is a distribution:
       w_i(m+1) \leftarrow \frac{w_i(m+1)}{\sum_i w_i(m+1)}.

3. Output the final vote h(X) = \mathrm{sgn}\left(\sum_{m=1}^{M} \alpha_m h_m(X)\right), the weighted combination of all classifiers in the ensemble.
In the AdaBoost algorithm, M is a parameter: because of the sampling with weights, the procedure can be continued for an arbitrary number of iterations. The loss function used in AdaBoost is the exponential loss, defined for a particular data point as exp(−y_i f(X_i)). The parameters to be tuned were the number of classifiers in the ensemble and the base classifier. With 5-fold cross-validation, the optimal number of base estimators was found to be 100, with a decision tree as the base classifier. A sketch of this step follows.
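A minimal sketch of this step under the same assumed data split; note that scikit-learn's AdaBoostClassifier reweights training points rather than resampling them, a standard variant of the sampling formulation described above.

```python
# Minimal sketch, assuming the same X/y split as in the SVM sketch.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                         n_estimators=100)  # 100 trees, chosen by 5-fold CV
print("5-fold CV accuracies:", cross_val_score(ada, X_train, y_train, cv=5))
print("test accuracy:", ada.fit(X_train, y_train).score(X_test, y_test))
```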
2.2.5 Random Forest Classifiers
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently, with the same distribution for all trees in the forest. The main difference between standard decision trees and random forests is that in a decision tree each node is split using the best split among all variables, whereas in a random forest each node is split using the best split among a subset of predictors randomly chosen at that node. In the random forest classifier, ntree bootstrap samples are drawn from the original data, and for each bootstrap sample an unpruned classification decision tree is grown with the following modification: at each node, rather than choosing the best split among all predictors, mtry of the predictors are randomly sampled and the best split is chosen from among those variables. New data is predicted by aggregating the predictions of the ntree trees (i.e., majority vote for classification). The algorithm is described as in (Breiman, 2001):
Input: n examples (X_1, y_1), ..., (X_n, y_n) = D, with X_i ∈ R^n, where D is the whole dataset.

For i = 1, ..., B:

1. Choose a bootstrap sample D_i from D.

2. Construct a decision tree T_i from the bootstrap sample D_i such that at each node a random subset of m features is chosen and only splits on those features are considered.

Finally, given test data X_t, take the majority vote over the trees for classification. Here B is the number of bootstrap data sets generated from the original data set D.
The parameters to be tuned were the number of classifiers in the ensemble and the base classifier. With 5-fold cross-validation, the optimal number of base estimators was found to be 100, with a decision tree as the base classifier. A sketch of this step follows.
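A minimal sketch of this step under the same assumed data split; max_features controls mtry, the size of the random feature subset considered at each split.

```python
# Minimal sketch, assuming the same X/y split as in the SVM sketch.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rf = RandomForestClassifier(n_estimators=100,     # B = 100 trees (by 5-fold CV)
                            max_features="sqrt")  # mtry = sqrt(#features)
print("5-fold CV accuracies:", cross_val_score(rf, X_train, y_train, cv=5))
print("test accuracy:", rf.fit(X_train, y_train).score(X_test, y_test))
```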
Chapter 3
Results
The generated data points were few in number, and the electrical faults are normally easy to distinguish given the extracted features. Due to this, the classification accuracies obtained were very high. The ensemble (boosting) classifiers were implemented in Python; the SVM classification was implemented using LIBSVM. A sketch of how the test scores and confusion matrices below could be produced is given next.
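As an illustration, the test scores and confusion matrices reported in this chapter could be produced along the following lines for any of the classifiers of Chapter 2 (a sketch under the same assumed data split, not the project's actual code):

```python
# Minimal sketch: evaluate a fitted classifier (here the random forest rf
# from Chapter 2) on the held-out test set.
from sklearn.metrics import accuracy_score, confusion_matrix

y_pred = rf.fit(X_train, y_train).predict(X_test)
print("test accuracy:", accuracy_score(y_test, y_pred))
# Rows are true classes, columns are predicted classes; the diagonal holds
# the correctly classified counts for each of the 4 fault classes.
print(confusion_matrix(y_test, y_pred))
```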
3.1 Results for Gradient Boosting Classifier
This section shows the training accuracies using 5-fold cross-validation, the testing accuracy and the confusion matrix.

Table 3.1: Training accuracies for the Gradient Boosting classifier with the number of estimators as 100, over the 5 folds

Fold    Accuracy
1       0.9922
2       0.9844
3       0.9972
4       0.9982
5       0.9957

Table 3.2: Test score for the Gradient Boosting classifier with the number of estimators as 100

Sl. no.    Accuracy
1          99.83%
Figure 3.1: Confusion matrix for the Gradient Boosting classifier with 100 estimators
3.2 Results for AdaBoost Classifier
This section shows the training accuracies using 5-fold cross-validation, the testing accuracy and the confusion matrix.

Table 3.3: Training accuracies for the AdaBoost classifier with the number of estimators as 100, over the 5 folds

Fold    Accuracy
1       0.9922
2       0.9922
3       0.9972
4       0.9983
5       0.9945

Table 3.4: Test score for the AdaBoost classifier with the number of estimators as 100

Sl. no.    Accuracy
1          98.75%
Figure 3.2: Confusion matrix for the AdaBoost classifier with 100 estimators
3.3 Results for Random-Forest Classifier
This section shows the training accuracies using 5-fold cross-validation, the testing accuracy and the confusion matrix.

Table 3.5: Training accuracies for the Random Forest classifier with the number of estimators as 100, over the 5 folds

Fold    Accuracy
1       0.9934
2       0.9958
3       0.9972
4       0.9987
5       0.9954

Table 3.6: Test score for the Random Forest classifier with the number of estimators as 100

Sl. no.    Accuracy
1          99.37%
Figure 3.3: Confusion matrix for the Random Forest classifier with 100 estimators
3.3.1 Results for SVM
This section shows the training accuracies using 5-fold cross-validation and the testing accuracy.

Table 3.7: Training accuracies for the SVM classifier with the RBF kernel, over the 5 folds

Fold    Accuracy
1       0.9922
2       0.9844
3       1.0000
4       0.9991
5       0.9973

Table 3.8: Test score for the SVM classifier with the RBF kernel

Sl. no.    Accuracy
1          99.37%
Chapter 4
Conclusion and Future work
4.1 Conclusion
The classification of voltage disturbances using SVMs and boosting algorithms was performed, and the results obtained were promising. The main motivation behind the project was to understand the various algorithms (i.e., SVMs and boosting algorithms) and their implementation. The implementation integrates two domains, Electrical Engineering and Machine Learning, to solve real-world engineering problems. Since it used synthetic data for both training and testing, the accuracies were found to be very high.
4.2 Future work
The above analysis covered only the faults that occur most frequently in a power system network. There are other faults, such as the L-L (line-to-line) fault and arcing faults, that could be analyzed as well. The concept of voltage sag could also be incorporated into the Simulink model so that fault occurrences are modeled more realistically and dynamically.
Bibliography
[1] Peter G. V. Axelberg and Irene Yu-Hua Gu. Support Vector Machine for Classification of Voltage Disturbances. IEEE Transactions on Power Delivery, vol. 22, no. 3, July 2007.
[2] Leo Breiman. Random Forests. Machine Learning, 45(1), pp. 5-32, 2001.
[3] Jerome H. Friedman. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, pp. 1189-1232, 2001.