INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
ISSN 0976 – 6367 (Print)
ISSN 0976 – 6375 (Online)
Volume 4, Issue 2, March – April (2013), pp. 01-09
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com
KNOWLEDGE EXTRACTION FROM NUMERICAL DATA: AN ABC
BASED APPROACH
Lalit Kumar (1), Dr. Dheerendra Singh (2)
(1) M.Tech Scholar, Department of CSE, SUSCET, Tangori, Mohali, India
(2) Professor and Head, Department of CSE, SUSCET, Tangori, Mohali, India
ABSTRACT
Fuzzy rule-based systems provide a framework for representing and processing
information in a way that resembles human communication and reasoning. Two
approaches to rule base generation can be found in the literature. In
knowledge-driven models, the requisite rule base is provided by domain experts and
knowledge engineers. In data-driven models, the rule base is generated from available
numerical data. Because domain experts are difficult to find, and extracting knowledge
from them is itself a difficult task, data-driven modeling assumes significance. One has
to apply soft computing based methodologies to generate the rule base from data; neural
networks, genetic algorithms and particle swarm optimization are some of the approaches [1].
The basic Artificial Bee Colony (ABC) algorithm has the advantages of strong robustness,
fast convergence, high flexibility and few setting parameters, but it suffers from
premature convergence in the later search period, and the accuracy of the optimal value
sometimes cannot meet the requirements [4].
KEY-WORDS: Artificial Bee Colony Algorithm, Fuzzy Membership Function, Sugeno
System, Rule Based Generation.
1. INTRODUCTION
Fuzzy systems are used to model highly complex and highly nonlinear systems, and
under these circumstances the rule base extraction problem becomes NP-hard. When
the problem is very complex, classical methods turn out to be computationally very
expensive. ABC is an example of how a natural process can be modeled to solve
optimization problems [3]. The concept of a mathematical model is fundamental to system
analysis and design, which requires representing a system as a functional dependence
between interacting input and output variables. Conventionally, a mathematical model is
constructed by analyzing input-output data from the system.
LITERATURE SURVEY
In March 2012, Singh D. studied the solution of real optimization problems using a
Genetic Algorithm with Employed Bee (GAEB). A multimodal function has two or
more local optima. A function of several variables is separable if it can be rewritten as
a sum of functions of just one variable. The search process for a multimodal function is
difficult if the local optima are randomly distributed.
In 2011, Mustafa M. Noaman proposed the Artificial Bee Colony (ABC) algorithm as a
solver for the Shortest Common Supersequence Problem (SCSP) and compared the results
obtained with the ABC algorithm [7] against those of other approaches proposed for
solving the SCSP. The ABC algorithm provided a scalable solution and promising results.
In 2012, Manish Gupta applied real-coded mutation and crossover operators to ABC
after the employed bee phase and the onlooker bee phase of the algorithm. A food source
selected according to some probabilistic criterion is altered by the mutation operator.
The experiments were performed on a job scheduling problem available in the literature.
No specific value of the mutation probability was found to give the best results for the
job scheduling experiments. As future work, the author intends to apply other types of
simulation operators and crossover operators in the ABC algorithm.
In 2011, Malek Alzaqebah compared the performance of the ABC algorithm under
different selection strategies and concluded that ABC with a disruptive selection
strategy produces better results than the other selection strategies tested in that
work. The author suggests that the performance of the ABC algorithm can be enhanced by
applying a suitable mechanism for choosing the neighborhood structure based on the
current solution in hand.
In 2010, Ivona B. presented the ABC algorithm for the capacitated vehicle routing
problem (CVRP). Twelve benchmark instances of small-scale problems were tested, and the
results were compared with the best known results. Although global optimality cannot be
guaranteed, the performance of the algorithm is good and robust. It was noticed that the
algorithm can be trapped in a local minimum for some benchmark instances. As future work,
the algorithm needs to be explored and tested on larger instances of the CVRP. The
proposed approach is also suitable for other combinatorial problems.
Since 2005, D. Karaboga and his research group have been studying the ABC
algorithm and its applications to real-world problems. Karaboga and Basturk
investigated the performance of the ABC algorithm on unconstrained numerical optimization
problems and of its extended version on constrained optimization problems, and Karaboga
et al. applied the ABC algorithm to neural network training. In 2010, Hadidi et al.
employed an ABC-based approach for structural optimization. In 2011, Zhang et al.
employed ABC for optimal multi-level thresholding, MR brain image classification,
cluster analysis, face pose estimation, and 2D protein folding.
2. FUZZY SYSTEM
The word fuzzy means “vagueness”. Fuzziness occurs when the boundary of a piece
of information is not clear-cut. Fuzzy set theory is an extension of classical set theory.
Classical set theory assesses the membership of elements in a set in binary terms,
according to a bivalent condition: an element either belongs or does not belong to the
set. Fuzzy set theory, by contrast, permits the gradual
assessment of the membership of elements in a set, described with the aid of a membership
function valued in the real unit interval [0, 1].
A typical fuzzy-based intelligent system has the following modules: fuzzification module,
inference engine, knowledge base and defuzzification module [1]. Of these modules, the
knowledge base (rule base) is one of the most important parts of a fuzzy system, as it
provides the necessary intelligence to the system.
2.1 Fuzzy Logic Based System
Fuzzy systems are a class of knowledge-based systems. In this class of systems, the
knowledge is represented in the form of a rule base. A fuzzy system can be represented
with the help of a block diagram (Figure 1). Any fuzzy system consists of four major
modules, namely the fuzzification module, inference engine, knowledge base and
defuzzification module [1].
2.2 Fuzzification Module
Fuzzification is the process of transforming the crisp input values to the corresponding
values in fuzzy domain (fuzzy values) [1].
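As an illustration of fuzzification, the following minimal Python sketch maps a crisp input to a degree of membership in the interval [0, 1]. The triangular shape, the function name `triangular` and the parameter values are assumptions chosen for the example; they are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangular fuzzy set.

    a and c are the feet of the triangle, b is the peak (a < b < c).
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# Hypothetical "medium" set on a 0-50 universe of discourse.
crisp_input = 20.0
mu_medium = triangular(crisp_input, a=0.0, b=25.0, c=50.0)
print(mu_medium)  # 0.8
```

The crisp value 20 belongs to the hypothetical "medium" set to degree 0.8 rather than simply "in" or "out", which is the gradual assessment of membership described in Section 2.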
Figure 1: Block Diagram of fuzzy logic based system
KNOWLEDGE BASE
This module contains the knowledge of the application domain and the procedural
knowledge. It consists of a data base and linguistic control rule base [1].
INFERENCE ENGINE
This module simulates the decision-making capabilities of the human brain. Based on
the input from the fuzzifier, the domain knowledge and the set of control rules, the output
decisions or the necessary control actions are evaluated in the fuzzy domain [1]. It
involves three steps: rule composition, implication and aggregation.
Depending on the type of composition and implication operators, the inference process
is of three types:
• Mamdani Style Inference.
• Larsen Style Inference.
• Sugeno Style Inference.
2.3 Defuzzification
Defuzzification performs the reverse operation of fuzzification, that is, it
converts the fuzzified output of the inference engine into the corresponding crisp
values [1]. A number of defuzzification methods are available, e.g.:
• Centre of Gravity/Centre of Area/Centroid Method
• Centre of Sums.
• Weighted Average.
• Centre of Largest Area.
The process of design for fuzzy systems involves the following steps:
1. Identify the input and output variables.
2. For these variables, generate membership functions and decide their shapes, such as
triangular, Z-type, S-type etc.
3. Generate the rule base for the system.
4. Select the type of inference.
5. Select the type of aggregation.
6. Decide on the defuzzification technique and generate a crisp control action
(defuzzification).
For a system of small complexity, Step 1 can be performed by experts by including
all the available inputs. For systems of higher complexity it is not possible to take into
account all the inputs, and one may be constrained to select only those inputs that
contribute significantly to the overall output of the system. Some of the procedures
suggested in the literature are the forward selection procedure, the backward elimination
procedure, the best subset method and a few other statistical selection procedures [17].
Step 2 can be performed with the help of domain expert(s) if they are available, from
common sense, or from the available numerical data. When numerical data is available for
these variables, the membership functions can be generated using techniques such as FCM,
neural networks, GA etc.
Step 3 involves the development of the rule base. In knowledge-based system
development, Step 3 is performed by an expert, whereas in data-driven system
development certain computerized techniques are used to develop the rule base.
As far as Step 4 is concerned, one can have hundreds of combinations of composition,
implication and aggregation operators.
For Step 6, a large number of defuzzification techniques are available in the literature.
Some of them are: centre of gravity (COG), centre of sums, first/last of maxima, and
mean of maxima (MOM) [18-21].
2.4 Problem Formulation
The figure represents a Sugeno-type fuzzy system. From the figure it is clear that such a
system consists of four major modules, i.e. the fuzzifier, the rule composition module
(fuzzy MIN operators), the implication module and the defuzzification module [21].
The overall computed output, in the case of a Sugeno-type system, can be written as:
$$O_{\text{computed}} = \frac{\sum_{i} W_i C_i}{\sum_{i} W_i} \qquad (1)$$
In order to proceed with system design, we first divide the input universe of discourse,
as evidenced by the data, into a number of membership functions. For a two-input system
like the one given in the figure, the total number of rules in the rule base will be
3 x 2 = 6. In general, if there are A inputs with B membership functions each, then the
number of rules R can be written as $R = B^{A}$. These rules, however, arise only from
combinations of the membership functions of the various inputs and are incomplete: only
the antecedent parts are known, and the consequents are as yet unknown.
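The antecedent combinations can be enumerated mechanically. The sketch below is illustrative only: the label names are hypothetical, and `rule_count` simply evaluates B raised to the power A for the case where every input has the same number of membership functions.

```python
from itertools import product

# Hypothetical linguistic labels for a two-input system with 3 and 2 fuzzy sets.
temperature_sets = ["low", "medium", "high"]
gradient_sets = ["low", "high"]

# Every combination of one set per input gives one rule antecedent.
antecedents = list(product(temperature_sets, gradient_sets))
for i, (t, g) in enumerate(antecedents, start=1):
    # The consequent C_i is still unknown at this stage.
    print(f"Rule {i}: IF temperature is {t} AND gradient is {g} THEN output is C{i}")

def rule_count(num_inputs, sets_per_input):
    """Number of rules when each of num_inputs inputs has sets_per_input sets."""
    return sets_per_input ** num_inputs
```

With unequal partitions, as in the 3 x 2 example, the count is the product of the partition sizes, which is what `len(antecedents)` returns here.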
Because for any set of inputs the Wi are easily computed by the fuzzifier and rule
composition modules, the right-hand side of output expression (1) can be evaluated if we
choose proper values for the Ci. The problem is therefore: for a given data set of a
system the Wi are known; find the values of Ci such that the difference between the
computed output and the actual output given in the data is minimum.
$$O_{\text{computed}} = \frac{W_1 C_1 + W_2 C_2 + \dots + W_n C_n}{W_1 + W_2 + \dots + W_n}$$
We compare this computed output with the actual output given in the data set and find the
error. Let the error be defined as:
Error E = Actual output (as given in the data set) - Computed output (as given by
equation (1))
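The weighted-average output and the error can be sketched in a few lines of Python. The firing strengths, consequent values and target output below are hypothetical numbers chosen for illustration, and `sugeno_output` is a name introduced here, not one used by the authors.

```python
def sugeno_output(weights, consequents):
    """Weighted average of rule consequents, following expression (1)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no rule fired for this input")
    return sum(w * c for w, c in zip(weights, consequents)) / total

# Hypothetical firing strengths W_i and crisp consequents C_i for six rules.
W = [0.2, 0.0, 0.8, 0.0, 0.0, 0.0]
C = [4.0, 3.0, 2.0, 1.0, 0.1, 0.1]
actual_output = 2.5  # hypothetical value from a data set

computed_output = sugeno_output(W, C)
error = actual_output - computed_output
print(computed_output, error)
```

Only the rules with non-zero Wi contribute, so changing the consequent assigned to a rule that never fires has no effect on the error for that data point.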
Now the whole problem of rule base generation boils down to a minimization problem, as
stated below. Minimize the objective function E:
$$E = O_{\text{actual}} - O_{\text{computed}} \qquad (2)$$
Conventional minimization techniques may not be applicable if the problem is very
complex. We apply the Artificial Bee Colony (ABC) algorithm to generate the rule base.
3. ARTIFICIAL BEE COLONY OPTIMIZATION
In the ABC model, the colony consists of three groups of bees: employed bees,
onlookers and scouts. It is assumed that there is only one artificial employed bee for
each food source; in other words, the number of employed bees in the colony is equal to
the number of food sources around the hive. Employed bees go to their food source, come
back to the hive and dance there. An employed bee whose food source has been
abandoned becomes a scout and starts to search for a new food source. Onlookers
watch the dances of the employed bees and choose food sources depending on the dances.
The main steps of the algorithm are given below:
Initial food sources are produced for all employed bees
REPEAT
Each employed bee goes to a food source in her memory and determines a
neighbor source, then evaluates its nectar amount and dances in the hive
Each onlooker watches the dance of employed bees and chooses one of their
sources depending on the dances, and then goes to that source. After choosing
a neighbor around that, she evaluates its nectar amount.
Abandoned food sources are determined and are replaced with the new food
sources discovered by scouts.
The best food source found so far is registered.
UNTIL (requirements are met)
In ABC, a population-based algorithm, the position of a food source represents a possible
solution to the optimization problem, and the nectar amount of a food source corresponds to
the quality (fitness) of the associated solution. The number of employed bees is equal to
the number of solutions in the population. In the first step, a randomly distributed initial
population (food source positions) is generated. After initialization, the population is
subjected to repeated cycles of the search processes of the employed, onlooker and scout
bees, respectively.
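The cycle of employed, onlooker and scout phases described above can be sketched as follows. This is a simplified illustration on a continuous test function, not the authors' C implementation: the function name `abc_minimize`, the parameters `n_sources`, `limit` and `max_cycles`, the neighbour move and the weight 1 / (1 + cost) are assumptions based on the commonly described form of ABC, and the sketch assumes a non-negative cost function and at least two food sources.

```python
import random

def abc_minimize(f, bounds, n_sources=10, limit=20, max_cycles=200, seed=0):
    """Minimise f over a box; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    # Initial food sources, one per employed bee.
    sources = [random_source() for _ in range(n_sources)]
    costs = [f(s) for s in sources]
    trials = [0] * n_sources          # cycles without improvement per source
    best = {"x": None, "cost": float("inf")}

    def register_best():
        i = min(range(n_sources), key=lambda j: costs[j])
        if costs[i] < best["cost"]:
            best["x"], best["cost"] = list(sources[i]), costs[i]

    def try_neighbour(i):
        # Move one coordinate of source i relative to a different source k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        candidate = list(sources[i])
        candidate[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        candidate[d] = min(max(candidate[d], lo), hi)
        cost = f(candidate)
        if cost < costs[i]:           # greedy selection keeps the better source
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    register_best()
    for _cycle in range(max_cycles):
        for i in range(n_sources):                    # employed bee phase
            try_neighbour(i)
        # Onlooker phase: sources with lower cost are chosen more often.
        weights = [1.0 / (1.0 + c) for c in costs]    # assumes f >= 0
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=weights, k=1)[0]
            try_neighbour(i)
        register_best()               # record the best before scouts discard anything
        for i in range(n_sources):                    # scout phase
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = f(sources[i])
                trials[i] = 0
        register_best()
    return best["x"], best["cost"]

def sphere(x):
    """Simple non-negative test function with its minimum at the origin."""
    return sum(v * v for v in x)

best_x, best_cost = abc_minimize(sphere, bounds=[(-5.0, 5.0)] * 2)
print(best_x, best_cost)
```

For rule base generation, the position of a food source would instead encode one choice of consequent per rule, and the cost would be derived from the error of expression (2) over the data set; that encoding is not shown here.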
4. RESULT ANALYSIS
The suggested approach has been applied to the identification of a fuzzy model for a rapid
Nickel-Cadmium (Ni-Cd) battery charger. The main objective in developing this charger
was to charge the batteries as quickly as possible without damaging them.
Input-output data consisting of 561 points were obtained through experimentation [22].
For this charger, the two input variables used to control the charging rate (Ct) are the
absolute temperature of the batteries (T) and its temperature gradient (dT/dt). Charging
rates are expressed as multiples of the rated capacity of the battery. The input-output
variables identified for the rapid Ni-Cd battery charger, along with their universes of
discourse, are listed in Table 1.
Variable                                            Minimum Value   Maximum Value
Input variable:  Temp. (T) [°C]                     0               50
Input variable:  Temp. Gradient (dT/dt) [°C/sec]    0               1
Output variable: Charging Rate (Ct) [A]             0               4
Table 1: Input and output variables for the rapid Ni-Cd battery charger along with their
universes of discourse
Let us assume that the temperature, with a universe of discourse ranging from 0 to 50 degrees
centigrade, has been partitioned into 3 fuzzy sets, namely temperature low, medium and
high. The temperature gradient is partitioned into two fuzzy sets (membership
functions), namely low and high. Initially, the parameters of the membership functions of the
input variables are set to arbitrary values. Once fuzzification of the inputs is carried out,
6 combinations of input membership functions (3*2=6), representing the antecedents of 6 rules,
are obtained. These 6 rules form the rule base for the system under identification. The rule
base is as yet incomplete, since for each rule the consequent still needs to be found. For the
given data set there are only 5 consequents from which to choose one as the consequent of a
particular rule. The specified set of consequents in this case is C1 = µultrafast, C2 = µhigh,
C3 = µmedium, C4 = µlow and C5 = µtrickle. The parameters of the antecedents and consequents
are chosen so as to fulfill the condition given by expression (2). The degree of compatibility
of any input data set with a rule, represented by Wi, can be easily computed using the
following formula:
W1 = min (µLOW (temperature), µLOW (temp_grad))
Once all the Wi are evaluated in this way, the right-hand side of output expression (1) can be
evaluated if proper values for Ci ∈ {ULTRAFAST, MED, LOW, HIGH, TRICKLE} can be chosen.
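The computation of the six firing strengths Wi with the MIN operator can be sketched as below. The membership function shapes and parameters are hypothetical (the text only says they are initialised arbitrarily), and the names `tri`, `TEMP_SETS`, `GRAD_SETS` and `firing_strengths` are introduced purely for this example.

```python
from itertools import product

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b.

    a == b or b == c gives a shoulder at the edge of the universe of discourse.
    """
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical parameters on the universes of discourse of Table 1.
TEMP_SETS = {"low": (0, 0, 25), "medium": (0, 25, 50), "high": (25, 50, 50)}
GRAD_SETS = {"low": (0, 0, 1), "high": (0, 1, 1)}

def firing_strengths(temperature, temp_grad):
    """Return W_i = min(mu_T, mu_G) for each of the 3 x 2 rule antecedents."""
    strengths = {}
    for t_label, g_label in product(TEMP_SETS, GRAD_SETS):
        mu_t = tri(temperature, *TEMP_SETS[t_label])
        mu_g = tri(temp_grad, *GRAD_SETS[g_label])
        strengths[(t_label, g_label)] = min(mu_t, mu_g)
    return strengths

W = firing_strengths(temperature=10.0, temp_grad=0.25)
for rule, w in W.items():
    print(rule, w)
```

For this input the two rules with a "high" temperature antecedent do not fire at all, so only four of the six consequents influence the computed output.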
The ABC algorithm was implemented in the C language to select the values of the consequents
so as to satisfy equation (2). It was observed that the algorithm was able to generate the
required rule base for the FLS shown in figure 3. With the application of the rule reduction
algorithm as given in [Step 2- (d)], the following set of rules is extracted by ABC.
Figure 2: Extracted Rules
4. PERFORMANCE OF BATTERY CHARGER
$$\mathrm{MSE} = \frac{1}{2N} \sum_{k=1}^{N} \left[ y(k) - y'(k) \right]^{2}$$
where y(k) = actual output,
y'(k) = computed output,
N = number of data points taken for model validation.
MSE = 0.3099
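For reference, the error measure with its 1/(2N) scaling can be computed as in the short sketch below. The sample values are hypothetical and are not the paper's validation data; the function name `mean_square_error` is likewise an assumption.

```python
def mean_square_error(actual, computed):
    """Sum of squared differences divided by 2N, as in the MSE formula above."""
    if not actual or len(actual) != len(computed):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, computed)) / (2 * n)

# Hypothetical outputs for illustration only.
y = [4.0, 2.0, 1.0, 0.5]
y_hat = [3.5, 2.0, 1.5, 0.5]
mse = mean_square_error(y, y_hat)
print(mse)  # 0.0625
```

Note that the factor 1/(2N) makes this value half of the more common 1/N mean squared error, which matters when comparing against results reported elsewhere.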
5. CONCLUSION
This paper proposed an ABC-based algorithm to enumerate the rule base for a Sugeno-type
fuzzy logic based system. The length of the path set represents the system output, i.e.
∑Wi*Ci / ∑Wi. The difference between the computed output and the actual output given in the
training data gives the error. This error is used to update the pheromone trail: the smaller
the error, the larger the amount of pheromone deposited on the path. This allows artificial
ants to choose a path with a higher pheromone deposit with higher probability. Finally, all
the ants follow the path that has a high pheromone deposit, leading to the shortest path.
This leads to the generation of rules that produce minimum error.
6. REFERENCES
[1] Singh D., “Solving Real Optimization Problem using Genetic Algorithm with
Employed Bee” International Journal of Computer Applications (0975 – 8887) Volume 42–
No.11, March 2012.
[2] Mohd Afizi Mohd Shukran, “Artificial Bee Colony based Data Mining Algorithms for
Classification Tasks” 2011.
[3] Mustafa M. Noaman, “Solving Shortest Common Supersequence Problem Using
Artificial Bee Colony Algorithm” The Research Bulletin of Jordan ACM, ISSN, Volume II
(III) PP-80.
[4] Gupta M., “An Efficient Modified Artificial Bee Colony Algorithm for Job
Scheduling Problem” International Journal of Soft Computing and Engineering (IJSCE)
ISSN: 2231-2307, Volume-1, Issue-6, January 2012
[5] Inova B., “Artificial bee colony algorithm for the capacitated vehicle routing
problem” Proceedings of the European Computing Conference 2010.
[6] Chang Jianghui, Zhao Yongsheng, Wen Chongzhu, “Research on Optimization of
Fuzzy Membership function based on Ant Colony Algorithm,” Proc of the 25th Chinese
Control Conference, Harbin, Aug, 2006.
[7] Ashita S. Bhagade, “Artificial Bee Colony (ABC) Algorithm for Vehicle Routing
Optimization Problem” International Journal of Soft Computing and Engineering (IJSCE)
ISSN: 2231-2307, Volume-2, Issue-2, May 2012
[8] Malek Alzaqebah, “Artificial bee colony search algorithm for examination
timetabling Problems” International Journal of the Physical Sciences Vol. 6(17), pp. 4264-
4272, September, 2011
[9] Adil Baykasoglu, “Artificial Bee Colony Algorithm and Its Application to
Generalized Assignment Problem” International Conference on Computational Intelligence
for Modeling, Control and Automation, Las Vegas.
[10] Marco Dorigo and Thomas Stützle, Ant Colony Optimization, Eastern Economy
Edition, PHI, 2005.
[11] Arun Khosla, Shakti Kumar, KK Aggarwal, Jagatpreet Singh, “Particle Swarm
Optimizer for fuzzy models,” IEEE Proc. on Fuzzy Systems, 2007.
[12] Marco Dorigo and Thomas Stützle, Ant Colony Optimization, Eastern Economy
Edition, PHI, 2005.
[13] M. Galea and Q. Shen, “Fuzzy Rules from ant-inspired computation,”Proc. IEEE Int’l
Conf. Fuzzy Systems, pp 1691-1696, 2004.
[14] Bhalla P., “Fuzzy Rule base generation from Numerical Data using Ant colony
optimization,” MAIMT-Journal of IT & Management. Vol. 1, No. 1 May-Oct, 2007,
pp 33-47.
[15] Chia-Feng J, H.J. Huang and C.M. Lu, “Fuzzy Controller Design by ant colony
optimization,” IEEE Proc. on Fuzzy Systems, 2007.
[16] Kumar S. “Introduction to Fuzzy Logic Based Systems”, Workshop on Intelligent
System Engineering (WISE-2010), 2010.
[17] Shakti Kumar, P.Bhalla and Amarpartap Singh, “Soft Computing Approaches to
Fuzzy System identification:A Survey”, IISN-2009,pp 402-411, 2009.
[18] M.S. Abadeh, J. Habibi and E. Soroush, “Induction of Fuzzy classification systems
using evolutionary ABC-based algorithms,” Proc. of the First Asia Int’l Conf. on Modeling
and Simulation (AMS’07), 2007
[19] Shakti K, P. Bhalla and S.Sharma, “Automatic Fuzzy Rule base Generation for
Intersystem Handover using Ant Colony Optimization Algorithm,” International Conference
on Intelligent Systems and Networks (IISN-2007), Feb 23-25, 2007, MAIMT, Jagadhri
Haryana, India, pp 764-773.
[20] Shakti Kumar, “Rule base generation using ant colony optimization,” Proc. Of the one
week workshop on applied soft computing (SOCO-2006), Haryana Engineering College,
Jagadhri, July 2006.
[21] Adil, B., Lale, Ö., and Pınar, T. 2007. Artificial Bee Colony Algorithm and Its
Application to Generalized Assignment Problem. ISBN 978-3-902613
[22] Andreas, W. 2003. The Shortest Common Supersequence Problem. ISBN
978-3-90232
[23] Barone, P., Bonizzoni P., Vedova, G.D., and Mauri, G. 2001. An approximation
algorithm for the shortest common Supersequence symposium on applied computing, 56-60.
[24] Dervis, K. 2010. Artificial bee colony algorithm. Scholarpedia. 5(3):6915.
[25] Dervis, K., and Bahriye, A. 2009. A comparative study of Artificial Bee Colony
algorithm. Applied Mathematics and Computation, 214, 108–132.
[26] G.Vasu, J. Nancy Namratha and V.Rambabu, “Large Scale Linear Dynamic System
Reduction Using Artificial Bee Colony Optimization Algorithm” International Journal of
Electrical Engineering & Technology (IJEET), Volume 3, Issue 1, 2012, pp. 145 - 155,
Published by IAEME.
[27] Lalit Kumar and Dr. Dheerendra Singh, “Solving Np-Hard Problem Using Artificial
Bee Colony Algorithm” International journal of Computer Engineering & Technology
(IJCET), Volume 4, Issue 1, 2013, pp. 171 - 177, Published by IAEME.