• B = {b_i(k)} is the observation probability matrix, such that b_i(k) is the probability that the observation O_k has been generated by state i. Elements of matrix B must satisfy the next two conditions:

$b_i(k) \ge 0$ where $1 \le i \le 3$   (3)

$\sum_k b_i(k) = 1$ where $1 \le i \le 3$   (4)

• π = {π_i} is the initial state distribution vector, and every π_i expresses the probability that the HMM chain will start at state i. Elements of vector π must satisfy the next two conditions:

$\pi_i \ge 0$ where $1 \le i \le 3$   (5)

$\sum_{i=1}^{3} \pi_i = 1$   (6)
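Conditions (1) through (6) amount to requiring that A, B and π are non-negative with unit row sums. As a minimal sketch of such a feasibility check (Python with NumPy; the function name and the small demo dimensions are our own, not the paper's):

    import numpy as np

    def is_feasible(A, B, pi, tol=1e-8):
        """True if A, B and pi satisfy conditions (1)-(6): every entry
        non-negative and every row (and the vector pi) summing to one."""
        for m in (A, B, pi.reshape(1, -1)):
            if (m < 0).any() or not np.allclose(m.sum(axis=1), 1.0, atol=tol):
                return False
        return True

    # A feasible 3-state model with 5 observation symbols, for illustration
    A = np.full((3, 3), 1.0 / 3.0)      # conditions (1)-(2)
    B = np.full((3, 5), 1.0 / 5.0)      # conditions (3)-(4)
    pi = np.array([1.0, 0.0, 0.0])      # conditions (5)-(6)
    print(is_feasible(A, B, pi))        # prints: True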
HMM training is the process of HMM parameter calculation. As shown in Figure 1, the training tools use the speech data and their transcriptions to estimate the HMM parameters; the recognizer then uses these HMMs to classify the unknown speech utterances.

Figure 1. HMM training process [12]

III. HMM TRAINING USING B-W ALGORITHM

The B-W algorithm provided by the MixtGaussian toolbox [9] has been used to train the HMM. After an initial guess of the HMM parameters is made, the B-W algorithm is run for 20 iterations to obtain more accurate parameters. As a result we get a continuous density mixture Gaussian HMM. Finally, transcriptions of unknown speech utterances are made by the recognizer module to determine how accurate the HMM's parameters are.

IV. HMM TRAINING USING HYBRID GA-BW ALGORITHM

In the MixtGaussian toolbox implementation, a 3-hidden-state continuous density mixture Gaussian HMM with 105 observation symbols has been used; this configuration can describe the speech utterances well. Thus, the HMM model parameters are λ = {A, B, π}, where A is a 3-by-3 transition probability matrix, B is a 3-by-105 observation probability matrix, and π is the initial state probability vector of size 3.

A. Encoding Method

It is vital to find a genetic representation of the model parameters before applying the GA to solve the optimization problem. In genetics, chromosomes are composed of a set of basic elements called genes; in our case, the elementary information is the elements of the probability matrices A, B and π. We choose to concatenate the rows of each matrix in the model parameters λ, so the chromosomes are represented as an array of real numbers, as shown in Fig. 2.

Figure 2. Chromosome encoding
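To make the row-concatenation encoding concrete, a minimal Python sketch (the helper names encode and decode are ours; the dimensions follow the paper's 3-state, 105-symbol configuration):

    import numpy as np

    N_STATES, N_SYMBOLS = 3, 105   # configuration used in the paper

    def encode(A, B, pi):
        """Concatenate the rows of A (3x3), B (3x105) and pi (3) into
        one real-valued chromosome of 9 + 315 + 3 = 327 genes."""
        return np.concatenate([A.ravel(), B.ravel(), pi])

    def decode(chrom):
        """Recover the model parameters lambda = (A, B, pi)."""
        a_end = N_STATES * N_STATES
        b_end = a_end + N_STATES * N_SYMBOLS
        A = chrom[:a_end].reshape(N_STATES, N_STATES)
        B = chrom[a_end:b_end].reshape(N_STATES, N_SYMBOLS)
        pi = chrom[b_end:]
        return A, B, pi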
B. Hybrid GA-BW

We present a hybrid GA-BW algorithm because of the slow convergence and the high computational power required by the classical GA, especially when the generated chromosomes cannot satisfy conditions (1), (2), ..., (6); such chromosomes are not included in the offspring and are replaced by new ones. To overcome the slow convergence problem, the initial generation P(0) is not generated randomly as in a normal GA, but by applying an iteration of the B-W algorithm, as shown in Figure 3. Every ten generations, three iterations of the B-W algorithm are applied to all chromosomes to allow the algorithm to converge faster.
1) Apply the B-W algorithm to generate the initial population P(0), where P(0) = {C_1, C_2, ..., C_N} and C_j is one chromosome.

2) Calculate the fitness function F(C_j) of every chromosome C_j within the current population P(t).

3) Select a few chromosomes for the intermediate population P'(t).

4) Apply crossover to some chromosomes in P'(t).

5) Apply mutation to a few chromosomes in P'(t).

6) Apply three iterations of the B-W algorithm to the population P(t) every ten generations.

7) t = t + 1; if not converged, go to 2).

Figure 3. Hybrid GA-BW algorithm
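The procedure of Figure 3 can be sketched in Python as follows; random_model, baum_welch and the GA operators are placeholders for the routines described in the surrounding subsections, not toolbox calls:

    def hybrid_ga_bw(data, pop_size, max_gens, random_model, baum_welch,
                     fitness, select, crossover, mutate):
        """Sketch of the hybrid GA-BW loop of Figure 3.
        baum_welch(chrom, data, n_iter) re-estimates a model;
        the GA operators work on encoded chromosomes."""
        # Step 1: seed P(0) with one B-W iteration rather than randomly
        pop = [baum_welch(random_model(), data, n_iter=1)
               for _ in range(pop_size)]
        for t in range(max_gens):
            # Step 2: evaluate every chromosome C_j in P(t)
            scores = [fitness(c, data) for c in pop]
            # Steps 3-5: selection, crossover and mutation build P'(t)
            parents = select(pop, scores, pop_size)
            pop = [mutate(c) for pair in zip(parents[::2], parents[1::2])
                   for c in crossover(*pair)]
            # Step 6: every ten generations, apply three B-W iterations
            # to all chromosomes
            if (t + 1) % 10 == 0:
                pop = [baum_welch(c, data, n_iter=3) for c in pop]
            # Step 7: t <- t + 1; a fixed generation budget stands in
            # here for the convergence test
        return max(pop, key=lambda c: fitness(c, data))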
C. Fitness Function

The fitness function is an evaluation mechanism for the chromosomes; a higher fitness value improves a chromosome's chances of being chosen for the next generation. The log likelihood [12] has been used: it represents the probability that the training observation utterances have been generated by the current model parameters, and it is a function of the following form [2]:

$F(C_j) = \log P(O \mid \lambda_j)$   (7)

where λ_j is the model encoded by chromosome C_j.
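Eq. (7) can be evaluated with the forward algorithm. The sketch below assumes a discrete-observation HMM for simplicity (the paper's models use mixture Gaussian emissions) and uses the scaled forward recursion to avoid underflow; decode is the hypothetical helper from the encoding sketch:

    import numpy as np

    def log_likelihood(A, B, pi, obs):
        """Scaled forward algorithm: log P(O | lambda) for one
        observation sequence obs (a list of symbol indices)."""
        alpha = pi * B[:, obs[0]]          # initialisation
        scale = alpha.sum()
        log_p = np.log(scale)
        alpha = alpha / scale
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]  # induction step
            scale = alpha.sum()
            log_p += np.log(scale)
            alpha = alpha / scale
        return log_p

    def fitness(chrom, utterances):
        """F(C_j) of Eq. (7): total log likelihood of the training
        utterances under the model decoded from the chromosome."""
        A, B, pi = decode(chrom)
        return sum(log_likelihood(A, B, pi, o) for o in utterances)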
D. Mutation

Mutation randomly selects a few chromosomes and alters some of their genes to produce new chromosomes. The "mutationadaptfeasible" function of the GA toolbox has been used so that the mutated chromosomes satisfy conditions (1), (2), ..., (6).
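"mutationadaptfeasible" is the adaptive feasible mutation of the MATLAB GA toolbox. A simple stand-in that mimics its intent, perturbing a few genes and then repairing the rows so that conditions (1)-(6) still hold, might look like this (our own construction, not the toolbox routine; encode and decode come from the encoding sketch):

    import numpy as np

    rng = np.random.default_rng()

    def mutate(chrom, rate=0.05, sigma=0.1):
        """Perturb a few genes, then repair: clip to non-negative and
        renormalise every probability row, per conditions (1)-(6)."""
        child = chrom.copy()
        mask = rng.random(child.shape) < rate
        child[mask] += rng.normal(0.0, sigma, mask.sum())
        A, B, pi = decode(child)
        def repair(m):
            m = np.clip(m, 1e-12, None)    # enforce non-negativity
            return m / m.sum(axis=-1, keepdims=True)
        return encode(repair(A), repair(B), repair(pi))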
E. Selection and Crossover

Selection mimics the survival-of-the-fittest mechanism seen in nature: chromosomes with higher fitness values have a greater probability of surviving into the succeeding generations. Some elements of the population pool are then selected for crossover, in which portions of genes are exchanged between each couple of chromosomes. The default GA toolbox selection and crossover functions have been used in the implementation.
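The paper relies on the toolbox defaults and does not name them; purely for illustration, roulette-wheel selection and single-point crossover (with the same row repair applied afterwards) could be sketched as:

    import numpy as np

    rng = np.random.default_rng()

    def select(pop, scores, k):
        """Roulette-wheel selection of k parents: higher fitness gives
        a higher survival probability. Scores are log likelihoods, so
        they are shifted to positive weights first."""
        w = np.asarray(scores) - min(scores) + 1e-9
        idx = rng.choice(len(pop), size=k, p=w / w.sum())
        return [pop[i] for i in idx]

    def crossover(p1, p2):
        """Single-point crossover: exchange portions of genes between a
        couple of chromosomes (repair rows as in the mutation sketch)."""
        cut = int(rng.integers(1, len(p1)))
        return (np.concatenate([p1[:cut], p2[cut:]]),
                np.concatenate([p2[:cut], p1[cut:]]))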
V. EXPERIMENTAL STUDY

The experimental study consists of using the HMM model parameters resulting from both the B-W and the hybrid GA-BW training systems to classify a set of unknown speech utterances. The speech database used in the study comes with the MixtGaussian toolbox download file: the training data contains the digits from 1 to 5, each digit repeated by 15 speakers, and the testing data contains the same digits repeated by 10 speakers. The recognition software is implemented in the MixtGaussian toolbox too. The test is repeated ten times, as shown in Table I.

TABLE I. RECOGNITION RATE FOR BOTH B-W AND GA-BW OVER 10 EXPERIMENTS

    Training    Recognition rate (%) for experiment number
    technique    1   2   3   4   5   6   7   8   9  10
    B-W         68  76  76  72  68  76  68  84  72  64
    GA-BW       76  64  88  72  84  76  72  80  76  72

From the results of Table I it is difficult to draw conclusions about the relative quality of the B-W and GA-BW algorithms in terms of recognition rate. However, it is clear from the results of Table II that the GA-BW algorithm performs better than the classical B-W algorithm.

TABLE II. MINIMUM, MAXIMUM AND AVERAGE RECOGNITION RATE FOR BOTH B-W AND GA-BW ALGORITHMS

    Training    Recognition rate (%)
    technique   Minimum   Maximum   Average
    B-W            64        84      72.67
    GA-BW          64        88      76
VI. CONCLUSION

In this work, two HMM training systems have been analyzed. The first is based on the classical Baum-Welch iterative procedure; the second is based on a genetic algorithm. The resulting HMM model parameters of each system have been used to classify the same unknown speech utterances with the same recognizer. Under the same conditions, the hybrid GA-BW algorithm performs better than the Baum-Welch method. We believe that this improvement is due to the global searching ability of the GA.

REFERENCES

[1] M. Shamsul Huda, J. Yearwood, G. Ranadhir, "A Hybrid Algorithm for Estimation of the Parameters of Hidden Markov Model based Acoustic Modeling of Speech Signals using Constraint-Based Genetic Algorithm and Expectation Maximization," 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), pp. 438-443, 2007.

[2] S. Kwong, C.W. Chau, K.F. Man, K.S. Tang, "Optimization of HMM topology and its model parameters by genetic algorithms," Pattern Recognition, Vol. 34, pp. 509-522, 2001.

[3] K. Won, A. Prügel-Bennett, A. Krogh, "Training HMM structure with genetic algorithm for biological sequence analysis," Bioinformatics, 20(18), pp. 3613-3619, 2004.

[4] K. Won, A. Prügel-Bennett, A. Krogh, "Evolving the Structure of Hidden Markov Models," IEEE Transactions on Evolutionary Computation, Vol. 10, No. 1, February 2006.

[5] L. Liu, H. Huo, B. Wang, "A Novel Optimization of Profile HMM by a Hybrid Genetic Algorithm," Journal of Information & Computational Science, pp. 734-741, Springer, 2005.

[6] K. Sun, J. Yu, "Video Affective Content Recognition Based on Genetic Algorithm Combined HMM," Lecture Notes in Computer Science, Springer, 2007.

[7] J. Xiao, L. Zou, C. Li, "Optimization of Hidden Markov Model by a Genetic Algorithm for Web Information Extraction," http://atlantis-press.com, accessed 22/12/2009.

[8] J. Xiao, L. Zou, C. Li, "Optimization of Hidden Markov Model by a Genetic Algorithm for Web Information Extraction," http://atlantis-press.com, accessed 22/12/2009.

[9] R. Barbara, "A Tutorial for the Course Computational Intelligence," http://www.igi.tugraz.at/lehre/CI, accessed 31-01-2010.

[10] S. Kwong, C.W. Chau, "Analysis of Parallel Genetic Algorithms on HMM based speech recognition system," IEEE Transactions on Consumer Electronics, Vol. 43, No. 4, 1997.

[11] C.W. Chau, S. Kwong, C.K. Diu, W.R. Fahinet, "Optimization of HMM by a genetic algorithm," Proceedings of ICASSP, pp. 1727-1730, 1997.

[12] S. Young, D. Ollason, V. Valtchev, P. Woodland, The HTK Book (for HTK Version 3.4.1), Cambridge University Engineering Department, 2009.