Abstract
This report briefly describes the classification of voltage disturbances
using support vector machines and boosting classifiers. The process starts
with building a MATLAB/Simulink model to simulate various electrical
faults, followed by feature extraction to generate the dataset, and finally
classification of the obtained data using boosting and SVM. The results
obtained were found to be promising.
Chapter 1
Introduction
Over the past two decades, utilities worldwide have gone through radical
changes. One big change is the deregulation of the energy market that has
taken place in a number of countries worldwide. Another change is that
today's customers are more demanding than customers in the past. These
changes have forced the utilities to become even more customer-oriented,
where high network reliability and good power quality have become increasingly
important to keep customers satisfied. Therefore, recording and analyzing
voltage disturbances (also referred to as voltage events) and power quality
abnormalities have become vital in order to better understand the
behavior of the power network. Disturbance data and power quality data
have become important information both for statistical purposes and as
decision-making input in mitigation projects. Reliable disturbance and
power quality information also opens the door to a proactive maintenance
approach focused on increasing power network reliability. Today most
disturbance data is analyzed manually by specialists. However, a lot of time
could be saved if unimportant or minor disturbances could be removed or
classified automatically, letting the specialists focus on solving more
sophisticated disturbance problems. This requires the development of robust
automatic classification systems. During the last few years yet another
classification method, the support vector machine (SVM), has become
increasingly popular due to its interesting theoretical and practical
characteristics. The SVM, which is based on statistical learning theory, is a
general classification method. In recent years boosting algorithms, which are
based on the ensemble learning approach, have also slowly gained popularity.
Chapter 2
Simulink model and algorithmic implementation
2.1 Simulink modeling
The basic idea is as follows: a Simulink model was used to simulate different
kinds of faults, and features were extracted from the simulations to generate
the data points. On the generated data points, the multiclass SVM, AdaBoost,
gradient boosting, and random forest algorithms were then applied. The
Simulink model used to generate the faults and the various features extracted
are shown in the figure below:
2.1.1 Classification of faults
The four different kinds of faults are described below:
• Single line to ground fault (SLG): one line is faulty and connected to
ground; the other two lines function properly.
• Double line to ground fault (DLG): two lines are faulty and connected
to ground; one line functions properly.
• Three phase to ground fault: all three lines are faulty and connected to
ground.
• Unequally distributed faults: the voltages in the three lines are
improperly distributed.
Figure 2.1: Simulink model for the generation of different kinds of electrical
faults and their analysis
2.1.2 Extraction of different features
1. RMS (root mean square) value: when a fault occurs, one of the phase
voltages goes to zero or the voltages become unevenly distributed. The
RMS value of the voltage is therefore a determining factor and one of
the features used for classification. The formula for RMS is given by
$$f_{\mathrm{rms}} = \lim_{T\to\infty}\sqrt{\frac{1}{T}\int_{0}^{T}[f(t)]^{2}\,dt}$$

where the equivalent symbols are:
$f_{\mathrm{rms}} = V_{\mathrm{rms}}$, the RMS value of the voltage;
$f(t) = v(t) = v_m\sin(2\pi t/T)$, the voltage magnitude at each time instant;
$T$, the time period of the sinusoidal wave.
2. Total harmonic distortion (THD): when the input is a sine wave, this
measurement is most commonly defined as the ratio of the RMS amplitude
of a set of higher harmonic frequencies to the RMS amplitude of the first
harmonic, or fundamental, frequency. Since the RMS value is already used
as one of the features, and THD has the RMS imbibed within it, THD is also
useful for classifying voltage disturbances: during a fault the voltage
magnitude differs between phases, so the RMS values differ, and therefore
the THDs of the three phases also differ. The equation for THD is given
below:
$$\mathrm{THD}_F = \frac{\sqrt{V_2^2 + V_3^2 + V_4^2 + \cdots + V_n^2}}{V_1}$$

where $V_i$ is the RMS voltage of the $i$-th harmonic, $i = 1, 2, 3, \ldots, n$,
and $i = 1$ is the fundamental frequency.
3. Magnitude and phase sequence analyzer: the magnitude sequence analyzer
gives the changes in voltage magnitude at various time instants, and the
phase sequence analyzer analyzes the phase change with respect to time.
Since an AC voltage is represented in polar form, and the magnitudes of the
three phases differ for different kinds of faults, this representation can
be used for classifying faults. We can write it as
$$V = R\angle\theta$$
where $R$ is the magnitude of the voltage and $\theta$ is its phase.
4. Voltage at various selected intervals.

These were the features extracted in order to differentiate the different
kinds of faults.
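To make the first two features concrete, the following is a minimal Python
sketch of how the RMS and THD of a sampled phase voltage could be computed
with NumPy; the sampling rate, the test waveform, and the function names are
illustrative assumptions, not taken from the Simulink model.

import numpy as np

def rms(v):
    """Root-mean-square value of a sampled voltage waveform."""
    return np.sqrt(np.mean(v ** 2))

def thd(v, fs, f0):
    """THD_F: RMS of harmonics 2..n relative to the fundamental.

    v:  sampled voltage waveform, fs: sampling rate (Hz),
    f0: fundamental frequency (Hz), e.g. 50.
    """
    amps = np.abs(np.fft.rfft(v)) / len(v)        # one-sided spectrum
    freqs = np.fft.rfftfreq(len(v), d=1.0 / fs)
    # amplitude at each harmonic k*f0, taking the nearest FFT bin
    harm = np.array([amps[np.argmin(np.abs(freqs - k * f0))]
                     for k in range(1, int(freqs[-1] // f0) + 1)])
    return np.sqrt(np.sum(harm[1:] ** 2)) / harm[0]

# Hypothetical 50 Hz phase voltage with a small 3rd-harmonic component.
fs, f0 = 10_000, 50
t = np.arange(0, 0.2, 1.0 / fs)
v = 230 * np.sqrt(2) * np.sin(2 * np.pi * f0 * t) \
    + 10 * np.sin(2 * np.pi * 3 * f0 * t)
print(rms(v), thd(v, fs, f0))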
2.2 Algorithm Implementation
The classification of the generated data was performed using support vector
machines, the gradient boosting classifier, AdaBoost, and the random forest
classifier. The multiclass classification was performed using the
one-versus-all method.
2.2.1 Generation of data points
The next step was to generate the data points. This was done by feature
extraction: all the data points were generated using the workspace block of
Simulink. The arrangement of the data points, the different class labels,
and the normalization of the data were handled in Excel.
The number of data points was 809, spread over 4 different classes, and the
number of features was reduced to 9 by data preprocessing.
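Although the report carried out this step in Excel, an equivalent
normalization step in Python might look like the following sketch; the
min-max scaling and the array names are illustrative assumptions.

import numpy as np

# Hypothetical stand-in for the 809 x 9 feature matrix assembled
# from the Simulink workspace output.
X = np.random.rand(809, 9)

# Min-max normalization: rescale each feature column to [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))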
2.2.2 Implementation of LIBSVM
After generating the data points, an SVM implementation was needed to
classify the data. The multiclass SVM was implemented using LIBSVM, which is
an open-source library. The important parameter to be tuned was the kernel
of the SVM. From 5-fold cross-validation we found that the RBF kernel fits
our data best.
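A minimal sketch of this step, assuming the LIBSVM Python bindings and a
hypothetical data file in LIBSVM's sparse text format, is given below. Note
that LIBSVM itself handles multiclass problems with a one-versus-one scheme,
so the one-versus-all strategy described above would be built on top of
binary models.

# A minimal sketch, assuming the LIBSVM Python bindings and a data
# file in LIBSVM's sparse format; the file name is hypothetical.
from libsvm.svmutil import svm_read_problem, svm_train, svm_predict

y, x = svm_read_problem('fault_features.libsvm')

# '-t 2' selects the RBF kernel; '-v 5' runs 5-fold cross-validation
# and returns the cross-validation accuracy instead of a model.
cv_acc = svm_train(y, x, '-t 2 -c 1 -v 5')

# Train the final model with the chosen kernel and report accuracy.
model = svm_train(y, x, '-t 2 -c 1')
labels, acc, vals = svm_predict(y, x, model)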
2.2.3 Implementation of Gradient Boosting algorithm
Boosting algorithms are a family of machine learning algorithms that build a
strong classifier from a set of weak classifiers, typically decision trees.
Gradient boosting is one such algorithm: it builds the model in a stage-wise
fashion and generalizes it by allowing the optimization of an arbitrary
differentiable loss function. The differentiable loss function in our case
is the binomial deviance loss function. The algorithm is implemented as
described in (Friedman, 2001).
Input: a training set $(X_i, y_i)$, $i = 1, \ldots, n$, with
$X_i \in H \subseteq \mathbb{R}^n$ and $y_i \in \{-1, 1\}$; a differentiable
loss function $L(y, F(X))$, in our case the binomial deviance loss function
$\log(1 + \exp(-2yF(X)))$; and $M$, the number of iterations.
1. Initialize the model with a constant value:
$$F_0(X) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma).$$

2. For $m = 1$ to $M$:

(a) Compute the pseudo-responses:
$$r_{im} = -\left[\frac{\partial L(y_i, F(X_i))}{\partial F(X_i)}\right]_{F(X) = F_{m-1}(X)} \quad \text{for } i = 1, \ldots, n.$$

(b) Fit a base learner $h_m(X)$ to the pseudo-responses, i.e., train it
using the training set $\{(X_i, r_{im})\}_{i=1}^{n}$.

(c) Compute the multiplier $\gamma_m$ by solving the optimization problem:
$$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L\big(y_i, F_{m-1}(X_i) + \gamma h_m(X_i)\big).$$

(d) Update the model: $F_m(X) = F_{m-1}(X) + \gamma_m h_m(X)$.

3. Output $F_M(X) = \sum_{m=1}^{M} \gamma_m h_m(X)$.

The value of the weight $\gamma_m$ is found by an approximated
Newton-Raphson solution, given as
$$\gamma_m = \frac{\sum_{X_i \in h_m} r_{im}}{\sum_{X_i \in h_m} |r_{im}|\,(2 - |r_{im}|)}.$$
The parameters to be tuned were the number of classifiers within the
ensemble and the base classifier. With 5-fold cross-validation, the optimal
number of base estimators was found to be 100, with a decision tree as the
base classifier.
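A minimal scikit-learn sketch of this setup is given below; synthetic data
from make_classification stands in for the report's 809-point, 9-feature,
4-class dataset, and all hyperparameters other than the ensemble size are
illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the 809-point, 9-feature, 4-class dataset.
X, y = make_classification(n_samples=809, n_features=9,
                           n_informative=6, n_classes=4, random_state=0)

# The default loss is the (multinomial) deviance; 100 stages of
# depth-3 regression trees, the ensemble size chosen by 5-fold CV.
clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
print(cross_val_score(clf, X, y, cv=5))   # per-fold accuracies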
2.2.4 AdaBoost Classifier
In AdaBoost we assign (non-negative) weights to the points in the dataset,
normalized so that they form a distribution. In each iteration, we generate
a training set by sampling from the data using these weights, i.e., the data
point $(X_i, y_i)$ is chosen with probability $w_i$, where $w_i$ is the
current weight for that data point. We generate the training set by such
repeated independent sampling. After learning the current classifier, we
increase the (relative) weights of the data points that it misclassifies.
We then generate a fresh training set using the modified weights, and so on.
The final classifier is essentially a weighted majority vote over all the
classifiers. The description of the algorithm as in (Freund et al., 1995)
is given below:
Input: $n$ examples $(X_1, y_1), \ldots, (X_n, y_n)$, with
$X_i \in H \subseteq \mathbb{R}^n$ and $y_i \in \{-1, 1\}$.

1. Initialize: $w_i(1) = \frac{1}{n}$ for all $i$. Each data point is
initialized with equal weight, so when data points are sampled from this
probability distribution, each point is equally likely to appear in the
training set.

2. We assume that there are $M$ classifiers within the ensemble.
For $m = 1$ to $M$:

(a) Generate a training set by sampling with $w_i(m)$.

(b) Learn classifier $h_m$ using this training set.

(c) Let $\xi_m = \sum_{i=1}^{n} w_i(m)\, I_{[y_i \neq h_m(X_i)]}$, where
$I_A$ is the indicator function of $A$, defined as
$$I_A = \begin{cases} 1 & \text{if } y_i \neq h_m(X_i), \\ 0 & \text{if } y_i = h_m(X_i), \end{cases}$$
so $\xi_m$ is the weighted error of the $m$-th classifier.

(d) Set $\alpha_m = \log\!\left(\frac{1-\xi_m}{\xi_m}\right)$, the computed
hypothesis weight, such that $\alpha_m > 0$ because of the assumption that
$\xi_m < 0.5$.

(e) Update the weight distribution over the training set as
$$w_i(m+1) = w_i(m)\exp\!\left(\alpha_m I_{[y_i \neq h_m(X_i)]}\right),$$
then normalize the updated weights so that $w_i(m+1)$ is a distribution:
$$w_i(m+1) = \frac{w_i(m+1)}{\sum_i w_i(m+1)}.$$

end for

3. Output: the final vote $h(X) = \mathrm{sgn}\!\left(\sum_{m=1}^{M} \alpha_m h_m(X)\right)$
is the weighted sum of all classifiers in the ensemble.
In the AdaBoost algorithm, $M$ is a parameter; because of the sampling with
weights, the procedure can be continued for an arbitrary number of
iterations. The loss function used in AdaBoost is the exponential loss,
defined for a particular data point as $\exp(-y_i f(X_i))$. The parameters
to be tuned were the number of classifiers within the ensemble and the base
classifier. With 5-fold cross-validation, the optimal number of base
estimators was found to be 100, with a decision tree as the base classifier.
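A minimal scikit-learn sketch of this setup is given below; as before, the
synthetic dataset is an illustrative stand-in for the report's data.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the 809-point, 9-feature, 4-class dataset.
X, y = make_classification(n_samples=809, n_features=9,
                           n_informative=6, n_classes=4, random_state=0)

# The default base learner is a depth-1 decision tree (a stump);
# 100 estimators, the ensemble size chosen by 5-fold CV.
clf = AdaBoostClassifier(n_estimators=100)
print(cross_val_score(clf, X, y, cv=5))   # per-fold accuracies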
2.2.5 Random Forest Classifiers
Random forests are a combination of tree predictors such that each tree
depends on the values of a random vector sampled independently, with the
same distribution for all trees in the forest. The main difference between
standard decision trees and random forests is that in a decision tree each
node is split using the best split among all variables, whereas in a random
forest each node is split using the best among a subset of predictors
randomly chosen at that node. In the random forest classifier, ntree
bootstrap samples are drawn from the original data, and for each bootstrap
sample an unpruned classification decision tree is grown, with the following
modification: at each node, rather than choosing the best split among all
predictors, randomly sample mtry of the predictors and choose the best split
from among those variables. New data is predicted by aggregating the
predictions of the ntree trees (i.e., by majority vote for classification).
The algorithm is described as follows, as in (Breiman, 2001):
Input: $n$ examples $(X_1, y_1), \ldots, (X_n, y_n) = D$, with
$X_i \in \mathbb{R}^n$, where $D$ is the whole dataset.

For $i = 1, \ldots, B$:

1. Choose a bootstrap sample $D_i$ from $D$.

2. Construct a decision tree $T_i$ from the bootstrap sample $D_i$ such that
at each node a random subset of $m$ features is chosen and only splits on
those features are considered.

Finally, given the test data $X_t$, take the majority vote for
classification. Here $B$ is the number of bootstrap datasets generated from
the original dataset $D$.
The parameters to be tuned were the number of classifiers within the
ensemble and the base classifier. With 5-fold cross-validation, the optimal
number of base estimators was found to be 100, with a decision tree as the
base classifier.
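A minimal scikit-learn sketch of this setup is given below; the synthetic
dataset is again an illustrative stand-in for the report's data, and
max_features='sqrt' is an assumed choice for the per-split feature subset.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the 809-point, 9-feature, 4-class dataset.
X, y = make_classification(n_samples=809, n_features=9,
                           n_informative=6, n_classes=4, random_state=0)

# ntree = 100 bootstrapped trees; max_features='sqrt' plays the role
# of mtry, the random subset of predictors considered at each split.
clf = RandomForestClassifier(n_estimators=100, max_features='sqrt')
print(cross_val_score(clf, X, y, cv=5))   # per-fold accuracies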
Chapter 3
Results
The generated data points were few in number, and the extracted features
make the electrical faults readily distinguishable. As a result, the
classification accuracies obtained were very high. The ensemble classifiers
(i.e., the boosting classifiers) were implemented in Python; the SVM
classification was implemented using LIBSVM.
3.1 Results for Gradient Boosting Classifier
This section shows the training accuracies using 5-fold cross-validation,
the testing accuracy, and the confusion matrix.
Table 3.1: Training accuracies for the Gradient Boosting classifier with the
number of estimators set to 100 (5-fold cross-validation)

Fold  Accuracy
1     0.9922
2     0.9844
3     0.9972
4     0.9982
5     0.9957

Table 3.2: Test accuracy for the Gradient Boosting classifier with the
number of estimators set to 100

Sl. no.  Accuracy
1        99.83%
Figure 3.1: Confusion matrix for the Gradient Boosting classifier with the
number of estimators set to 100
3.2 Results for AdaBoost Classifier
This section shows the training accuracies using 5-fold cross-validation,
the testing accuracy, and the confusion matrix.
Table 3.3: Training accuracies for the AdaBoost classifier with the number
of estimators set to 100 (5-fold cross-validation)

Fold  Accuracy
1     0.9922
2     0.9922
3     0.9972
4     0.9983
5     0.9945

Table 3.4: Test accuracy for the AdaBoost classifier with the number of
estimators set to 100

Sl. no.  Accuracy
1        98.75%
Figure 3.2: Confusion matrix for the AdaBoost classifier with the number of
estimators set to 100
3.3 Results for Random Forest Classifier
This section shows the training accuracies using 5-fold cross-validation,
the testing accuracy, and the confusion matrix.
Table 3.5: Training accuracies for the Random Forest classifier with the
number of estimators set to 100 (5-fold cross-validation)

Fold  Accuracy
1     0.9934
2     0.9958
3     0.9972
4     0.9987
5     0.9954

Table 3.6: Test accuracy for the Random Forest classifier with the number of
estimators set to 100

Sl. no.  Accuracy
1        99.37%
Figure 3.3: Confusion matrix for the Random Forest classifier with the
number of estimators set to 100
3.4 Results for SVM
This section shows the training accuracies using 5-fold cross-validation and
the testing accuracy.
Table 3.7: Training accuracies for the SVM classifier with the RBF kernel
(5-fold cross-validation)

Fold  Accuracy
1     0.9922
2     0.9844
3     1.0000
4     0.9991
5     0.9973

Table 3.8: Test accuracy for the SVM classifier with the RBF kernel

Sl. no.  Accuracy
1        99.37%
Chapter 4
Conclusion and Future work
4.1 Conclusion
The classification of voltage disturbances using SVMs and boosting
algorithms was performed, and the results obtained were promising. The main
motivation behind the project was to understand various algorithms (i.e.,
SVMs and boosting algorithms) and their implementation. The implementation
integrates two domains, electrical engineering and machine learning, to
solve real-time engineering problems. The implementation used synthetic data
for both training and testing, so the accuracies were found to be very high.
4.2 Future work
The above analysis covered only certain faults that occur very frequently in
power system networks. There are other faults, such as the L-L (line-to-line)
fault and arcing faults. In the Simulink modeling, the concept of voltage
sag could also be incorporated so that the occurrence of faults can be
modeled more realistically and dynamically.
Bibliography

[1] Peter G. V. Axelberg and Irene Yu-Hua Gu. Support Vector Machine for
Classification of Voltage Disturbances. IEEE Transactions on Power Delivery,
vol. 22, no. 3, July 2007.

[2] Leo Breiman. Random Forests. Machine Learning, vol. 45, no. 1,
pp. 5-32, 2001.

[3] Jerome H. Friedman. Greedy Function Approximation: A Gradient Boosting
Machine. Annals of Statistics, vol. 29, no. 5, pp. 1189-1232, 2001.