SlideShare une entreprise Scribd logo
1  sur  9
Télécharger pour lire hors ligne
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 85
EMOTIONAL TELUGU SPEECH SIGNALS CLASSIFICATION BASED
ON K-NN CLASSIFIER
P. Vijai Bhaskar1
, S. Ramamohana Rao2
1
Professor, ECE Department, AVN Institute of Engg & Tech, A.P. India, 2
Principal, Geethanjali College of Engineering
and Technology, A.P, India, pvijaibhaskar@gmail.com, Principal.gcet@gmail.com
Abstract
Speech processing is the study of speech signals, and the methods used to process them. In application such as speech coding, speech
synthesis, speech recognition and speaker recognition technology, speech processing is employed. In speech classification, the
computation of prosody effects from speech signals plays a major role. In emotional speech signals pitch and frequency is a most
important parameters. Normally, the pitch value of sad and happy speech signals has a great difference and the frequency value of
happy is higher than sad speech. But, in some cases the frequency of happy speech is nearly similar to sad speech or frequency of sad
speech is similar to happy speech. In such situation, it is difficult to recognize the exact speech signal. To reduce such drawbacks, in
this paper we propose a Telugu speech emotion classification system with three features like Energy Entropy, Short Time Energy,
Zero Crossing Rate and K-NN classifier for the classification. Features are extracted from the speech signals and given to the K-NN.
The implementation result shows the effectiveness of proposed speech emotion classification system in classifying the Telugu speech
signals based on their prosody effects. The performance of the proposed speech emotion classification system is evaluated by
conducting cross validation on the Telugu speech database.
Keywords: Emotion Classification, K-NN classifier, Energy Entropy, Short Time Energy, Zero Crossing Rate.
----------------------------------------------------------------------***-----------------------------------------------------------------------
1. INTRODUCTION
Speech is the most desirable medium of communication
between humans. There are several ways of characterizing the
communications potential of speech. In general, speech coding
can be considered to be a particular specialty in the broad field
of speech processing, which also includes speech analysis and
speech recognition. The purpose of speech coder is to convert
an analogue speech signal into digital form for efficient
transmission or storage and then to convert the received digital
signal back to analogue [1]. Nowadays, speech technology has
matured enough to be useful for many practical applications.
Its good performance, however, depends on the availability of
adequate training resources. There are many applications,
where resources for the domain or language of interest are
very limited. For example, in intelligence applications, it is
often impossible to collect the necessary speech resources in
advance as it is hard to predict which languages become the
next ones of interest [4].
Currently, speech recognition applications are becoming
increasingly advantageous. Also, different interactive speech
aware applications are widely available in the market. In
speech recognition, the sounds uttered by a speaker are
converted to a sequence of words recognized by a listener. The
logic of the process requires information to flow in one
direction: from sounds to words. This direction of information
flow is unavoidable and necessary for a speech recognition
model to function [2]. In many practical situations, an
automatic speech recognizer has to operate in several different
but well-defined acoustic environments. For example, the
same recognition task may be implemented using different
microphones or transmission channels. In this situation, it may
not be practical to recollect a speech corpus to train the
acoustic models of the recognizer [3]. Speech input offers high
bandwidth information and relative ease of use. It also permits
the user’s hands and eyes to be busy with a task, which is
particularly valuable when users are in motion or in natural
field settings [7]. Similarly, speech output is more impressive
and understandable than the text output. Speech interfacing
involves speech synthesis and speech recognition. Speech
synthesizer takes the text as input and converts it into the
speech output i.e. It acts as text to speech converter. Speech
recognizer converts the spoken word into text [5].
Prosody refers to the supra segmental features of natural
speech, such as rhythm and intonation [6]. Native speakers use
prosody to convey paralinguistic information such as
emphasis, intention, attitude and emotion. The prosody of a
word sequence can be described by a set of prosodic variables
such as prosodic phrase boundary, pitch accent, lexical stress,
syllable position and hesitation, etc. Among these prosodic
variables, pitch accent and into national phrase boundary have
the most salient acoustic correlates and maybe most
perceptually robust [8] [9]. Prosody is potentially convenient
in automatic speech understanding systems for some reasons
[10]. Prosody correlates with prosody may be used to
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 86
disambiguate syntactically distinct sentences with identical
phoneme strings. Since prosody is ambiguous without
phoneme information, and phonemes are ambiguous without
prosodic information [11] [12]
The outline structure of the paper is organized as follows. In
section 2, a brief review is made about the recent research
works related to speech classification that is emphasized in
relation to its problem statement. Next, Section 3 details the
proposed speech emotion classification system and in Section
4 discusses about the implementation results and comparative
results whereas as in Section 5 concludes the paper.
2. RELATED WORKS
A few of the most recent literature works in speech
classification are reviewed below
Gyorgy Szaszaket al. [13] have developed a modeling acoustic
processing approach for speech recognition. Acoustic
processing and modeling of the supra-segmental
characteristics of speech, with the aim of incorporating
advanced syntactic and semantic level processing of spoken
language for speech recognition/understanding tasks. The
proposed modeling approach was similar to the one used in
standard speech recognition, where basic HMM units were
trained and were connected according to the dictionary and
some grammar (language model) to obtain a recognition
network, along which recognition could be interpreted also as
an alignment process. In this proposed method, HMM
framework was used to model speech prosody, and to perform
initial syntactic and/or semantic level processing of the input
speech in parallel for standard speech recognition. As
acoustic–prosodic features, fundamental frequency and energy
are used. For this, HMMs for the different types of word-stress
unit contours were trained and then used for recognition and
alignment of such units from the input speech. This method
was also allows punctuation to be automatically marked.
Soheil Shafieea et al. [14] have proposed Speech Activity
Detectors (SADs) technique for speech recognition tasks. In
this proposed, a two-stage speech activity detection system
was focused which at first take advantage of a voice activity
detector to discard pause segments out of the audio signals;
this was done even in presence of stationary background
noises. To find the feature set in speech/non-speech
classification, a large set of robust features was introduced; the
optimal subset of these features was chosen by applying a
Genetic Algorithm (GA) to the initial feature set. It had been
discovered that Fractal dimensions of numeric series of
prosodic features are the most speech/non-speech
differentiating features. Models of the system were trained
over a Farsi database, FARSDAT, however, the final
experiments in the TIMIT English database had been
conducted.
Raul Fernandez et al. [15] have proposed graphical models on
the task of recognizing affective categories from prosody in
both acted and natural speech. A strength of this approach was
the integration and summarization of information using both
local and global prosodic phenomena. In this framework
speech was structurally modeled as a dynamically evolving
hierarchical model in which levels of the hierarchy were
determined by prosodic constituency and contain parameters
that evolve according to dynamical systems. The acoustic
parameters had been chosen to reflect four main components
of the speech thought to reflect paralinguistic and affect-
specific information: intonation, loudness, rhythm and voice
quality. The model was then evaluated on two different
corpora of fully spontaneous, effectively-colored, naturally
occurring speech between people: Call Home English and BT
Call Center. Here the ground truth labels were obtained from
examining the agreement of 29 human coders labeling arousal
and valence. Then finally detecting high arousal negative
valence speech in call centers
Md. Mijanur Rahmanet al. [16] have proposed MFCC signal
analysis technique for multi leveled pattern recognition task,
which includes speech segmentation, classification, feature
extraction and pattern recognition. In the proposed method, a
blind speech segmentation procedure was used to segment the
continuously spoken Bangla sentences into words/sub-words
like units using the endpoint detection technique. These
segmented words were classified according to the number of
syllables and the sizes of the segmented words. The MFCC
signal analysis technique was used to extract the features of
speech words, which including windowing. The developed
system achieved the segmentation accuracy rate at different
from total 24 sub-classes of segmented words.
Abhishek Jaywant et al. [17] have characterized to category-
specificity from speech prosody and facial expressions. In
communication involved processing nonverbal emotional cues
from auditory and visual stimuli. They employed a cross-
modal priming task using emotional stimuli with the same
valence but that diff erent in emotion category. After listening
to angry, sad, disgusted, or neutral vocal primes, subjects
rendered a facial aff ect decision about an emotionally
congruent or incongruent face target. In this results revealed
that participants made fewer errors when judging face targets
that conveyed the same emotion as the vocal prime, and
responded significantly faster for most emotions.
Astonishingly, participants responded slower when the prime
and target both conveyed disgust, perhaps due to attention
biases for disgust-related stimuli. They find that vocal
emotional expressions with similar valence are processed with
category specificity and that discrete emotion knowledge
implicitly aff ects the processing of emotional faces between
sensory modalities.
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 87
THE PROBLEM STATEMENT
Section 2 reviews the recent works related to the recognition
or classification of speech signals based on the prosody
effects. In continues speech signal classification, the prosody
of continuous speech depends on many separate aspects, such
as the meaning of the sentence and the speaker characteristics
and emotions. In most of the research works, speech
classification process based on prosody effects is done by
using local and global features like, energy, pitch, linear
predictive spectrum coding (LPCC), Mel-frequency
spectrum coefficients (MFCC), and Mel-energy spectrum
dynamic coefficients (MEDC), Intonation, Pitch contour,
Vocal effort, etc. Normally, the pitch value of sad and happy
speech signals has a great difference. Also, the frequency
value of happy is higher than sad speech. But, in some cases
the frequency of happy speech is nearly similar to sad speech
or frequency of sad speech is similar to happy speech. In such
situation, it is difficult to recognize the exact speech signal.
3. PROPOSED CLASSIFIER SYSTEM
. Our proposed emotion speech classification system classifies
the speech signals based on their prosody effects by using K-
NN Classifier. In our work, we have utilizes a Telugu speech
signals to accomplish classification process. The proposed
system mainly comprised of two stages namely, (i) Feature
extraction (ii) Emotion classification. These two stages are
consecutively performed and the more accurate results are
occurred and are discussed in Section 3.1 and 3.2 respectively.
3.1 Feature Extraction
The Telugu speech database used in the proposed system
consist speeches of seven male speakers and two female
speakers with four emotions each. In feature extraction stage,
there are three features are extracted from the input speech
signals and given to the classification process. In speech
classification feature extraction plays a most important role.
Because the efficient features extraction from the input speech
signals makes that the output to be more efficient and provide
high classification accuracy. In our work, we extract three
efficient features from the speech signals. The extracted three
efficient features namely,
 Energy Entropy
 Short Time Energy
 Zero Crossing Rate
These features are extracted in our proposed system is
explained below.
(i) Energy Entropy (Ee)
The energy level of an input speech words signal is rapidly
changed and these sudden changes in the speech signals are
measured. This measurement result is stated as energy entropy
feature. To calculate this feature, the input speech word
signals are divided into f number of frames and calculate
normalized energy for each frame. The energy entropy (Ee)
feature is calculated by using the formula which is stated as
follows,




1
0
2
2
2
)(.log
f
k
eE  -------- (1)









N
b
b
l
F
S
W
N
1
2
*
 --------- (2)
Where,
2
 - is the normalized energy
N - is the total number of blocks
lW - is the window length
bS - is the number of short blocks
F - is the frequency
By exploiting the energy entropy equation is given in Equ. (1)
Is applied to the speech word signals and obtain the energy
entropy feature (Ee).
(ii) Short Time Energy (Se)
The input speech signals energy level is to be increased
suddenly. So we measure this energy increment level in
speech signals is defined as short time energy. To calculate
the short time energy the input signal is divided into w number
of windows and calculates the windowing function for each
window. The short-time energy (Se) of speech signals
replicates the amplitude variation and is described as follows,




i
e iwhixS )(.)( 2
------ (3)
Where,
)(ix - is the input signal
)(wh - is the impulse response
By utilizing the equation is given in Equ. (3), the short time
energy (Se) feature is calculated from the input speech word
signals.
(iii) Zero Crossing Rate (ZCR)
Among these four features, the zero crossing rates is one of
the most dominant features for speech signal processing. The
zero crossing ratios are defined as the rate at which the speech
signal crosses zero can provide information about the source
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 88
of its creation or the ratio of number of time domain zero
crossings occurred to the frame length. The Zero crossing Rate
(ZCR) is calculated by using the sign functions, which is
stated below,





1
1
2
)}1(sgn{)}(sgn{1 k
x
CR
xyxy
M
Z
(4)
Where,
M - is the length of the sample
)}(sgn{ xy - is the sign function
The sign function )}(sgn{ xy is defined as,









0)(;1
0)(;0
0)(;1
)}(sgn{
xy
xy
xy
xy
(5)
The ZCR is calculated for each input speech word signals by
using the Equ. (4).
3.2 Emotion Classification
In emotion classification the speech signals are classified by
using the extracted features from the feature extraction stage.
The extracted features from the speech signals are given to the
K-NN classifier. The features are extracted for the training
database speech signals and given to the K-NN classifier to
perform the training process. During training stage, the speech
signals corresponding three features are taken as input to the
K-NN classifier. Here, we have taken three inputs as energy
entropy (Ee), short-time energy (Se) and Zero crossing Rate
(ZCR). The proposed emotion classification K-NN classifier
Technique is mentioned below.
3.3 K-Nearest Neighbor Technique as an Emotion
Recognizer
A more general version of the nearest neighbor technique
bases the classification of an unknown sample on the “votes”
of K of its nearest neighbor rather than on only it’s on single
nearest neighbor. The K-Nearest Neighbor classification
procedure is denoted is denoted by K-NN. If the costs of error
are equal for each class, the estimated class of an unknown
sample is chosen to be the class that is most commonly
represented in the collection of its K nearest neighbors.
Among the various methods of supervised statistical pattern
recognition, the Nearest Neighbor is the most traditional one,
it does not consider a priori assumptions about the
distributions from which the training examples are drawn. It
involves a training set of all cases. A new sample is classified
by calculating the distance to the nearest training case, the sign
of that point then determines the classification of the sample.
The K-NN classifier extends this idea by taking the K nearest
points and assigning the sign of the majority. It is common to
select K small and odd to break ties (typically 1, 3 or 5).
Larger K values help reduce the effects of noisy points within
the training data set, and the choice of K is often performed
through cross validation. In this way, given a input test sample
vector of features x of dimension n, we estimate its Euclidean
distance d equation 1 with all the training samples (y) and
classify to the class of the minimal distance.
𝑞 𝑥, 𝑦 = (𝑥𝑖
2𝑛
𝑖=1 - 𝑦𝑖
2
)--------(6)
The training examples are vectors in a multidimensional
feature space, each with a class label. The training phase of the
algorithm consists only of storing the feature vectors and class
labels of the training samples. In the classification phase, K is
a user-defined constant, and an unlabelled vector (a query or
test point) is classified by assigning the label which is most
frequent among the K training samples nearest to that query
point. Usually Euclidean distance is used as the distance
metric, however this is only applicable to continuous
variables.
3.4 K-NN Algorithm:
The k-NN algorithm can also be adapted for use in estimating
continuous variables. One such implementation uses an
inverse distance weighted average of the k-nearest
multivariate neighbors. This algorithm functions as follows:
Compute Euclidean or Mahalanobis distance from target plot
to those that were sampled.
1. Order samples taking for account and calculate distances.
2. Choose heuristically optimal K nearest neighbor based on
root mean square error q(x, y) done by cross validation
technique.
3. Calculate an inverse distance weighted average with the k-
nearest multivariate neighbors.
4. EXPERIMENTAL RESULTS
The proposed Telugu speech emotion classification technique
is implemented in the working platform of MATLAB (version
7.12) with machine configuration as follows
Processor: Intel core i5
OS: Windows xp
CPU speed: 3.20 GHz, RAM: 4GB
The performance of the proposed system is evaluated with
different person’s emotion speech signals and the results are
compared against the existing techniques. The input speech
signals are classified using the proposed speech classification
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 89
system using K-NN classifier. The sample (i) input normal,
happy, sad and angry emotion speech signals (ii) Extracted
features zero crossing rate, energy entropy, short time energy
for normal, happy, sad and angry speech signals are shown in
the fig.1
(a)
(b)
(c)
(d)
1(i)
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 90
(a) (b)
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 91
(c)
(d)
1(ii)
Figure 1: (i) Input (a) angry, (b) happy, (c) normal and (d) sad
emotion speech signals (ii) Extracted features of (a) angry, (b)
happy, (c) normal and (d) sad signals
The performance of proposed speech classification system is
analyzed by using the statistical measures. The statistical
measures [18] are used to measure the speech classification
performance. The performance analyses have shown that the
proposed system has been successfully classifies the input
speech signals based on their prosody which the speech
signals belongs to.
The Telugu speech signal database is created with 9 persons,
each person has four emotional speech signals are normal,
happy, sad and angry. These 9 person’s emotion speech
signals are given to the one cross validation process and the
performance measures of the classification results are listed in
Table .1
The K-NN classifier performance measures for the same
Telugu speech database are given in Table .1.
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 92
Table 1: Speech emotion classification Statistical performance measures of K-NN for 9 different Speakers.
As can be seen from table 1, our proposed emotion
classification K-NN system has given a considerable and
moderate classification performance.
CONCLUSIONS
In this paper, we proposed a Telugu speech emotion
classification system using K-NN Classifier. Here, the
classification process is made by extracting three features like
entropy, short time energy and zero crossing rates from the
input speech signals. The computed features were given to the
K-NN to Classifier accomplish the training process. The
proposed speech emotion classification system classifies the
Telugu speech signals based on their prosody by using the K-
NN classifier. During testing, if a set of emotional speech
signals is given as input it classifies the speech signals based
on the prosody which the speech signal belongs to. Human
emotions can be recognized from speech signals when facial
expressions or biological signals are not available. In this
work Emotions are recognized from speech signals using real
time database. In this work we presented an approach to
emotion recognition from speech signal. The future work will
be to conduct comparative study of various classifier using
different parameter selection method to improve and
accuracy.
REFERENCES
[1] Marwan Al-Akaidi, "Introduction to speech processing",
Fractal Speech Processing-Cambridge University Press, 2012
[2] Dennis Norris, James M. McQueen and Anne Cutler,
"Merging information in speech recognition: Feedback is
Never necessary", Behavioral and Brain Sciences, Vol. 23, pp.
299-370, 2000
[3] Leonardo Neumeyer and Mitchel Weintraub, "Probabilistic
Optimum Filtering for Robust Speech Recognition", In
Proceedings of IEEE International Conference on Acoustics,
Speech, and Signal Processing, Adelaide, SA, Australia, Vol.
1, pp. I/417 - I/420, 1994
[4] Lukas Burget, Petr Schwarz, Mohit Agarwal, Pinar
Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek,
Nagendra Goel, Martin Karafiat, Daniel Povey, Ariya
Rastrow, Richard C. Rose, Samuel Thomas, "Multilingual
Acoustic Modeling for Speech Recognition Based on
Subspace Gaussian Mixture Models", In Proceedings of IEEE
International Conference on Acoustics Speech and Signal
Processing, Dallas, TX, pp. 4334 - 4337, 2010,
[5] Kuldeep Kumar and R. K. Aggarwal, "Hindi Speech
Recognition System Using HTK", International Journal of
Computing and Business Research, Vol. 2, No. 2, May 2011
[6]. Ken Chen, Mark Hasegawa-Johnson and Aaron Cohen,
“An Automatic Prosody Labeling System Using Ann-Based
Syntactic-Prosodic Model and GMM-Based Acoustic-
Prosodic Model”, IEEE International Conference on
Acoustics, Speech, and Signal Processing, Vol. 1, pp. 509-
512, 2004.
[7] Vimala and Radha, "A Review on Speech Recognition
Challenges and Approaches", World of Computer Science and
Information Technology Journal, Vol. 2, No. 1, pp. 1-7, 2012
[8] Omair Khan, Wasfi G. Al-Khatib, Cheded Lahouari,
“Detection of Questions in Arabic Audio Monologues Using
Prosodic Features”, Ninth IEEE International Symposium on
Multimedia, pp. 29- 36, 2007
[9]. Stefanie Shattuck-Hufnage and Alice E. Turk, “A
Prosody Tutorial for Investigators of Auditory Sentence
Experiments Sensitivity FPR Accuracy Specificity PPR NPV FPR MCC
1 25 75 25 25 25 25 75 13
2 0 67 25 33 0 33 100 -12
3 25 75 25 25 13 38 88 -9
4 25 75 25 25 25 25 75 13
5 25 75 25 25 25 25 75 13
6 25 75 25 25 25 25 75 13
7 25 75 25 25 25 25 75 13
8 25 75 25 25 25 25 75 13
9 25 75 25 25 25 25 75 13
IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 93
Processing”, Journal of Psycholinguistic Research, Vol. 25,
No. 2, pp. 193-247, 1996
[10] Elizabeth Shriberga, Andreas Stolckea, Dilek Hakkani-
Türb, Gökhan Tur, “Prosody-based automatic segmentation of
speech into sentences and topics”, Speech Communication,
Vol. 32, No. 1–2, pp. 127–154, 2000
[11] Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen,
Sarah Borys, Sung-Suk Kim, Jennifer Cole, and Jeung-Yoon
Choi, “Prosody Dependent Speech Recognition on Radio
News Corpus of American English”, IEEE Transactions on
Speech and Audio Processing, Vol. 13, No. 6, pp. 1-15, 2005
[12] Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole,
Sarah Borys, Sung-Suk Kim, Aaron Cohen, Tong Zhang,
Jeung-Yoon Choi, Heejin Kim, Taejin Yoon, Sandra
Chavarria, “Simultaneous recognition of words and prosody in
the Boston University Radio Speech Corpus”, Speech
Communication, Vol. 46, No. 3–4, pp. 418–439, 2005
[13] KlaraVicsi, Gyorgy Szaszak, “Using prosody to improve
automatic speech recognition”, Speech Communication, Vol.
52, No. 5, pp. 413–426, 2010
[14] Soheil Shafieea, Farshad Almasganja, Bahram
Vazirnezhada, Ayyoob Jafari, “A two-stage speech activity
detection system considering fractal aspects of prosody”,
Pattern Recognition Letters, Vol. 31, No. 9, pp. 936–948,
2010
[15] Raul Fernandez, Rosalind Picard, “Recognizing affect
from speech prosody using hierarchical graphical models”,
Journal Speech Communication, Vol. 53, No. 9-10, pp. 1088-
1103, 2011
[16] Md. Mijanur Rahman, Md. Farukuzzaman Khan and Md.
Al-Amin Bhuiyan, “Continuous Bangla Speech Segmentation,
Classification and Feature Extraction”, IJCSI International
Journal of Computer Science Issues, Vol. 9, No. 2, No 1, pp.
67-75, 2012
[17] Abhishek Jaywant, Marc D. Pell, “Categorical processing
of negative emotions from speech prosody”, Speech
Communication, Vol. 54, No. 1, pp. 1–10, 2012
[18]http://en.wikipedia.org/wiki/Sensitivity_and_specificity
BIOGRAPHIES:
P. Vijay Bhaskar, Head of the ECE
Department, AVN Institute of
Engg.&Tech. –HYDERABAD obtained
his B. Tech., from JNTU, Kakinada and
M. Tech., from JNTU, Hyderabad, and
both First Class with Distinction and
pursuing PhD. from JNTU, Hyderabad,
and also having the overall teaching
experience of 18 Years and Published 09
papers in various National and International conferences, and
guided good number of B. Tech., and M. Tech., Projects
S. Rama Mohana Rao, who served for
25 years in ISRO Vikram Sarabhai
Space Centre Trivandrum from 1971-
1996 in various capabilities such as
Head, ELP, PCF, EFF and the last being
as Deputy Project Director ESP. He has
also served for 15 years in various
educational institutions as Professor,
HOD & Principal. who is an eminent personality known for
his versatility, and obtained his Ph D from IISC, Bangalore,
and also his credit publications in International & National
journals. He has over15 years of experience as Principal at
reputed Engineering colleges.

Contenu connexe

Tendances

Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerImplementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerIOSR Journals
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...IRJET Journal
 
Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...IJECEIAES
 
Personalising speech to-speech translation
Personalising speech to-speech translationPersonalising speech to-speech translation
Personalising speech to-speech translationbehzad66
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicijnlc
 
Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translationVaibhav Agarwal
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET Journal
 
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...IJECEIAES
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languageiosrjce
 
ZEHRA RAFIA SALMA
ZEHRA RAFIA SALMAZEHRA RAFIA SALMA
ZEHRA RAFIA SALMALINKE DIN
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
 
Ijartes v1-i1-002
Ijartes v1-i1-002Ijartes v1-i1-002
Ijartes v1-i1-002IJARTES
 
Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...ijnlc
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsIOSR Journals
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorWaqas Tariq
 
NLP and its Use in Education
NLP and its Use in EducationNLP and its Use in Education
NLP and its Use in Educationantonellarose
 
Creation of speech corpus for emotion analysis in Gujarati language and its e...
Creation of speech corpus for emotion analysis in Gujarati language and its e...Creation of speech corpus for emotion analysis in Gujarati language and its e...
Creation of speech corpus for emotion analysis in Gujarati language and its e...IJECEIAES
 

Tendances (19)

Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerImplementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
 
Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...
 
Personalising speech to-speech translation
Personalising speech to-speech translationPersonalising speech to-speech translation
Personalising speech to-speech translation
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabic
 
Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translation
 
Ey4301913917
Ey4301913917Ey4301913917
Ey4301913917
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
 
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi language
 
ZEHRA RAFIA SALMA
ZEHRA RAFIA SALMAZEHRA RAFIA SALMA
ZEHRA RAFIA SALMA
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
Ijartes v1-i1-002
Ijartes v1-i1-002Ijartes v1-i1-002
Ijartes v1-i1-002
 
Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...
 
Cf32516518
Cf32516518Cf32516518
Cf32516518
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
NLP and its Use in Education
NLP and its Use in EducationNLP and its Use in Education
NLP and its Use in Education
 
Creation of speech corpus for emotion analysis in Gujarati language and its e...
Creation of speech corpus for emotion analysis in Gujarati language and its e...Creation of speech corpus for emotion analysis in Gujarati language and its e...
Creation of speech corpus for emotion analysis in Gujarati language and its e...
 

En vedette

A comparative study on road traffic management systems
A comparative study on road traffic management systemsA comparative study on road traffic management systems
A comparative study on road traffic management systemseSAT Publishing House
 
Effects of different mole spacings on the yield of summer groundnut
Effects of different mole spacings on the yield of summer groundnutEffects of different mole spacings on the yield of summer groundnut
Effects of different mole spacings on the yield of summer groundnuteSAT Publishing House
 
Study of shape of intermediate sill on the design of stilling basin model
Study of shape of intermediate sill on the design of stilling basin modelStudy of shape of intermediate sill on the design of stilling basin model
Study of shape of intermediate sill on the design of stilling basin modeleSAT Publishing House
 
Active self interference cancellation techniques in
Active self interference cancellation techniques inActive self interference cancellation techniques in
Active self interference cancellation techniques ineSAT Publishing House
 
Web performance prediction using geo statistical method
Web performance prediction using geo statistical methodWeb performance prediction using geo statistical method
Web performance prediction using geo statistical methodeSAT Publishing House
 
The demonstration of fourier series to first year undergraduate engineering s...
The demonstration of fourier series to first year undergraduate engineering s...The demonstration of fourier series to first year undergraduate engineering s...
The demonstration of fourier series to first year undergraduate engineering s...eSAT Publishing House
 
Characterization of reusable software components for better reuse
Characterization of reusable software components for better reuseCharacterization of reusable software components for better reuse
Characterization of reusable software components for better reuseeSAT Publishing House
 
Experimental investigation, optimization and performance prediction of wind t...
Experimental investigation, optimization and performance prediction of wind t...Experimental investigation, optimization and performance prediction of wind t...
Experimental investigation, optimization and performance prediction of wind t...eSAT Publishing House
 
Fabrication and optimization of parameters for dye
Fabrication and optimization of parameters for dyeFabrication and optimization of parameters for dye
Fabrication and optimization of parameters for dyeeSAT Publishing House
 
Study of surface roughness for discontinuous
Study of surface roughness for discontinuousStudy of surface roughness for discontinuous
Study of surface roughness for discontinuouseSAT Publishing House
 
A review of pre combustion co2 capture in igcc
A review of pre combustion co2 capture in igccA review of pre combustion co2 capture in igcc
A review of pre combustion co2 capture in igcceSAT Publishing House
 
Transient voltage distribution in transformer winding (experimental investiga...
Transient voltage distribution in transformer winding (experimental investiga...Transient voltage distribution in transformer winding (experimental investiga...
Transient voltage distribution in transformer winding (experimental investiga...eSAT Publishing House
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingeSAT Publishing House
 
Performance evaluation of broadcast mac and aloha mac protocol for underwater...
Performance evaluation of broadcast mac and aloha mac protocol for underwater...Performance evaluation of broadcast mac and aloha mac protocol for underwater...
Performance evaluation of broadcast mac and aloha mac protocol for underwater...eSAT Publishing House
 
Semantic approach utilizing data mining and case based reasoning for it suppo...
Semantic approach utilizing data mining and case based reasoning for it suppo...Semantic approach utilizing data mining and case based reasoning for it suppo...
Semantic approach utilizing data mining and case based reasoning for it suppo...eSAT Publishing House
 
Automated water head controller for domestic application
Automated water head controller for domestic applicationAutomated water head controller for domestic application
Automated water head controller for domestic applicationeSAT Publishing House
 
Elevating forensic investigation system for file clustering
Elevating forensic investigation system for file clusteringElevating forensic investigation system for file clustering
Elevating forensic investigation system for file clusteringeSAT Publishing House
 
Lab view study of electrical power distribution system
Lab view study of electrical power distribution systemLab view study of electrical power distribution system
Lab view study of electrical power distribution systemeSAT Publishing House
 
Design and analysis of different digital communication systems and determinat...
Design and analysis of different digital communication systems and determinat...Design and analysis of different digital communication systems and determinat...
Design and analysis of different digital communication systems and determinat...eSAT Publishing House
 

En vedette (20)

A comparative study on road traffic management systems
A comparative study on road traffic management systemsA comparative study on road traffic management systems
A comparative study on road traffic management systems
 
Effects of different mole spacings on the yield of summer groundnut
Effects of different mole spacings on the yield of summer groundnutEffects of different mole spacings on the yield of summer groundnut
Effects of different mole spacings on the yield of summer groundnut
 
Study of shape of intermediate sill on the design of stilling basin model
Study of shape of intermediate sill on the design of stilling basin modelStudy of shape of intermediate sill on the design of stilling basin model
Study of shape of intermediate sill on the design of stilling basin model
 
Active self interference cancellation techniques in
Active self interference cancellation techniques inActive self interference cancellation techniques in
Active self interference cancellation techniques in
 
Web performance prediction using geo statistical method
Web performance prediction using geo statistical methodWeb performance prediction using geo statistical method
Web performance prediction using geo statistical method
 
The demonstration of fourier series to first year undergraduate engineering s...
The demonstration of fourier series to first year undergraduate engineering s...The demonstration of fourier series to first year undergraduate engineering s...
The demonstration of fourier series to first year undergraduate engineering s...
 
Characterization of reusable software components for better reuse
Characterization of reusable software components for better reuseCharacterization of reusable software components for better reuse
Characterization of reusable software components for better reuse
 
Experimental investigation, optimization and performance prediction of wind t...
Experimental investigation, optimization and performance prediction of wind t...Experimental investigation, optimization and performance prediction of wind t...
Experimental investigation, optimization and performance prediction of wind t...
 
Composites from natural fibres
Composites from natural fibresComposites from natural fibres
Composites from natural fibres
 
Fabrication and optimization of parameters for dye
Fabrication and optimization of parameters for dyeFabrication and optimization of parameters for dye
Fabrication and optimization of parameters for dye
 
Study of surface roughness for discontinuous
Study of surface roughness for discontinuousStudy of surface roughness for discontinuous
Study of surface roughness for discontinuous
 
A review of pre combustion co2 capture in igcc
A review of pre combustion co2 capture in igccA review of pre combustion co2 capture in igcc
A review of pre combustion co2 capture in igcc
 
Transient voltage distribution in transformer winding (experimental investiga...
Transient voltage distribution in transformer winding (experimental investiga...Transient voltage distribution in transformer winding (experimental investiga...
Transient voltage distribution in transformer winding (experimental investiga...
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labeling
 
Performance evaluation of broadcast mac and aloha mac protocol for underwater...
Performance evaluation of broadcast mac and aloha mac protocol for underwater...Performance evaluation of broadcast mac and aloha mac protocol for underwater...
Performance evaluation of broadcast mac and aloha mac protocol for underwater...
 
Semantic approach utilizing data mining and case based reasoning for it suppo...
Semantic approach utilizing data mining and case based reasoning for it suppo...Semantic approach utilizing data mining and case based reasoning for it suppo...
Semantic approach utilizing data mining and case based reasoning for it suppo...
 
Automated water head controller for domestic application
Automated water head controller for domestic applicationAutomated water head controller for domestic application
Automated water head controller for domestic application
 
Elevating forensic investigation system for file clustering
Elevating forensic investigation system for file clusteringElevating forensic investigation system for file clustering
Elevating forensic investigation system for file clustering
 
Lab view study of electrical power distribution system
Lab view study of electrical power distribution systemLab view study of electrical power distribution system
Lab view study of electrical power distribution system
 
Design and analysis of different digital communication systems and determinat...
Design and analysis of different digital communication systems and determinat...Design and analysis of different digital communication systems and determinat...
Design and analysis of different digital communication systems and determinat...
 

Similaire à Emotional telugu speech signals classification based on k nn classifier

Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemIJERA Editor
 
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and PhonemesEffect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemeskevig
 
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMESEFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMESkevig
 
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert Systemcsandit
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSijnlc
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignmentskevig
 
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...IJECEIAES
 
Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...IJECEIAES
 
Survey On Speech Synthesis
Survey On Speech SynthesisSurvey On Speech Synthesis
Survey On Speech SynthesisCSCJournals
 
Direct Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsDirect Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsIJCI JOURNAL
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURESSPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURESacijjournal
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327IJMER
 
Natural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and DifficultiesNatural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and Difficultiesijtsrd
 

Similaire à Emotional telugu speech signals classification based on k nn classifier (20)

Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis System
 
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and PhonemesEffect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
 
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMESEFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
 
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
 
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...
 
Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...
 
Ijetcas14 390
Ijetcas14 390Ijetcas14 390
Ijetcas14 390
 
Survey On Speech Synthesis
Survey On Speech SynthesisSurvey On Speech Synthesis
Survey On Speech Synthesis
 
Direct Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsDirect Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete Units
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURESSPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327
 
Natural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and DifficultiesNatural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and Difficulties
 

Plus de eSAT Publishing House

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnameSAT Publishing House
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...eSAT Publishing House
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnameSAT Publishing House
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...eSAT Publishing House
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaeSAT Publishing House
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingeSAT Publishing House
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...eSAT Publishing House
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...eSAT Publishing House
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...eSAT Publishing House
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a revieweSAT Publishing House
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...eSAT Publishing House
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard managementeSAT Publishing House
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallseSAT Publishing House
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...eSAT Publishing House
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...eSAT Publishing House
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaeSAT Publishing House
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structureseSAT Publishing House
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingseSAT Publishing House
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...eSAT Publishing House
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...eSAT Publishing House
 

Plus de eSAT Publishing House (20)

Likely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnamLikely impacts of hudhud on the environment of visakhapatnam
Likely impacts of hudhud on the environment of visakhapatnam
 
Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...Impact of flood disaster in a drought prone area – case study of alampur vill...
Impact of flood disaster in a drought prone area – case study of alampur vill...
 
Hudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnamHudhud cyclone – a severe disaster in visakhapatnam
Hudhud cyclone – a severe disaster in visakhapatnam
 
Groundwater investigation using geophysical methods a case study of pydibhim...
Groundwater investigation using geophysical methods  a case study of pydibhim...Groundwater investigation using geophysical methods  a case study of pydibhim...
Groundwater investigation using geophysical methods a case study of pydibhim...
 
Flood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, indiaFlood related disasters concerned to urban flooding in bangalore, india
Flood related disasters concerned to urban flooding in bangalore, india
 
Enhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity buildingEnhancing post disaster recovery by optimal infrastructure capacity building
Enhancing post disaster recovery by optimal infrastructure capacity building
 
Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...Effect of lintel and lintel band on the global performance of reinforced conc...
Effect of lintel and lintel band on the global performance of reinforced conc...
 
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
 
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...Wind damage to buildings, infrastrucuture and landscape elements along the be...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
 
Shear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a reviewShear strength of rc deep beam panels – a review
Shear strength of rc deep beam panels – a review
 
Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...Role of voluntary teams of professional engineers in dissater management – ex...
Role of voluntary teams of professional engineers in dissater management – ex...
 
Risk analysis and environmental hazard management
Risk analysis and environmental hazard managementRisk analysis and environmental hazard management
Risk analysis and environmental hazard management
 
Review study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear wallsReview study on performance of seismically tested repaired shear walls
Review study on performance of seismically tested repaired shear walls
 
Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...Monitoring and assessment of air quality with reference to dust particles (pm...
Monitoring and assessment of air quality with reference to dust particles (pm...
 
Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...Low cost wireless sensor networks and smartphone applications for disaster ma...
Low cost wireless sensor networks and smartphone applications for disaster ma...
 
Coastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of indiaCoastal zones – seismic vulnerability an analysis from east coast of india
Coastal zones – seismic vulnerability an analysis from east coast of india
 
Can fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structuresCan fracture mechanics predict damage due disaster of structures
Can fracture mechanics predict damage due disaster of structures
 
Assessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildingsAssessment of seismic susceptibility of rc buildings
Assessment of seismic susceptibility of rc buildings
 
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
 
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
 

Dernier

TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxPurva Nikam
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 

Dernier (20)

TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 

Emotional telugu speech signals classification based on k nn classifier

  • 1. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 85 EMOTIONAL TELUGU SPEECH SIGNALS CLASSIFICATION BASED ON K-NN CLASSIFIER P. Vijai Bhaskar1 , S. Ramamohana Rao2 1 Professor, ECE Department, AVN Institute of Engg & Tech, A.P. India, 2 Principal, Geethanjali College of Engineering and Technology, A.P, India, pvijaibhaskar@gmail.com, Principal.gcet@gmail.com Abstract Speech processing is the study of speech signals, and the methods used to process them. In application such as speech coding, speech synthesis, speech recognition and speaker recognition technology, speech processing is employed. In speech classification, the computation of prosody effects from speech signals plays a major role. In emotional speech signals pitch and frequency is a most important parameters. Normally, the pitch value of sad and happy speech signals has a great difference and the frequency value of happy is higher than sad speech. But, in some cases the frequency of happy speech is nearly similar to sad speech or frequency of sad speech is similar to happy speech. In such situation, it is difficult to recognize the exact speech signal. To reduce such drawbacks, in this paper we propose a Telugu speech emotion classification system with three features like Energy Entropy, Short Time Energy, Zero Crossing Rate and K-NN classifier for the classification. Features are extracted from the speech signals and given to the K-NN. The implementation result shows the effectiveness of proposed speech emotion classification system in classifying the Telugu speech signals based on their prosody effects. The performance of the proposed speech emotion classification system is evaluated by conducting cross validation on the Telugu speech database. Keywords: Emotion Classification, K-NN classifier, Energy Entropy, Short Time Energy, Zero Crossing Rate. ----------------------------------------------------------------------***----------------------------------------------------------------------- 1. INTRODUCTION Speech is the most desirable medium of communication between humans. There are several ways of characterizing the communications potential of speech. In general, speech coding can be considered to be a particular specialty in the broad field of speech processing, which also includes speech analysis and speech recognition. The purpose of speech coder is to convert an analogue speech signal into digital form for efficient transmission or storage and then to convert the received digital signal back to analogue [1]. Nowadays, speech technology has matured enough to be useful for many practical applications. Its good performance, however, depends on the availability of adequate training resources. There are many applications, where resources for the domain or language of interest are very limited. For example, in intelligence applications, it is often impossible to collect the necessary speech resources in advance as it is hard to predict which languages become the next ones of interest [4]. Currently, speech recognition applications are becoming increasingly advantageous. Also, different interactive speech aware applications are widely available in the market. In speech recognition, the sounds uttered by a speaker are converted to a sequence of words recognized by a listener. The logic of the process requires information to flow in one direction: from sounds to words. This direction of information flow is unavoidable and necessary for a speech recognition model to function [2]. In many practical situations, an automatic speech recognizer has to operate in several different but well-defined acoustic environments. For example, the same recognition task may be implemented using different microphones or transmission channels. In this situation, it may not be practical to recollect a speech corpus to train the acoustic models of the recognizer [3]. Speech input offers high bandwidth information and relative ease of use. It also permits the user’s hands and eyes to be busy with a task, which is particularly valuable when users are in motion or in natural field settings [7]. Similarly, speech output is more impressive and understandable than the text output. Speech interfacing involves speech synthesis and speech recognition. Speech synthesizer takes the text as input and converts it into the speech output i.e. It acts as text to speech converter. Speech recognizer converts the spoken word into text [5]. Prosody refers to the supra segmental features of natural speech, such as rhythm and intonation [6]. Native speakers use prosody to convey paralinguistic information such as emphasis, intention, attitude and emotion. The prosody of a word sequence can be described by a set of prosodic variables such as prosodic phrase boundary, pitch accent, lexical stress, syllable position and hesitation, etc. Among these prosodic variables, pitch accent and into national phrase boundary have the most salient acoustic correlates and maybe most perceptually robust [8] [9]. Prosody is potentially convenient in automatic speech understanding systems for some reasons [10]. Prosody correlates with prosody may be used to
  • 2. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 86 disambiguate syntactically distinct sentences with identical phoneme strings. Since prosody is ambiguous without phoneme information, and phonemes are ambiguous without prosodic information [11] [12] The outline structure of the paper is organized as follows. In section 2, a brief review is made about the recent research works related to speech classification that is emphasized in relation to its problem statement. Next, Section 3 details the proposed speech emotion classification system and in Section 4 discusses about the implementation results and comparative results whereas as in Section 5 concludes the paper. 2. RELATED WORKS A few of the most recent literature works in speech classification are reviewed below Gyorgy Szaszaket al. [13] have developed a modeling acoustic processing approach for speech recognition. Acoustic processing and modeling of the supra-segmental characteristics of speech, with the aim of incorporating advanced syntactic and semantic level processing of spoken language for speech recognition/understanding tasks. The proposed modeling approach was similar to the one used in standard speech recognition, where basic HMM units were trained and were connected according to the dictionary and some grammar (language model) to obtain a recognition network, along which recognition could be interpreted also as an alignment process. In this proposed method, HMM framework was used to model speech prosody, and to perform initial syntactic and/or semantic level processing of the input speech in parallel for standard speech recognition. As acoustic–prosodic features, fundamental frequency and energy are used. For this, HMMs for the different types of word-stress unit contours were trained and then used for recognition and alignment of such units from the input speech. This method was also allows punctuation to be automatically marked. Soheil Shafieea et al. [14] have proposed Speech Activity Detectors (SADs) technique for speech recognition tasks. In this proposed, a two-stage speech activity detection system was focused which at first take advantage of a voice activity detector to discard pause segments out of the audio signals; this was done even in presence of stationary background noises. To find the feature set in speech/non-speech classification, a large set of robust features was introduced; the optimal subset of these features was chosen by applying a Genetic Algorithm (GA) to the initial feature set. It had been discovered that Fractal dimensions of numeric series of prosodic features are the most speech/non-speech differentiating features. Models of the system were trained over a Farsi database, FARSDAT, however, the final experiments in the TIMIT English database had been conducted. Raul Fernandez et al. [15] have proposed graphical models on the task of recognizing affective categories from prosody in both acted and natural speech. A strength of this approach was the integration and summarization of information using both local and global prosodic phenomena. In this framework speech was structurally modeled as a dynamically evolving hierarchical model in which levels of the hierarchy were determined by prosodic constituency and contain parameters that evolve according to dynamical systems. The acoustic parameters had been chosen to reflect four main components of the speech thought to reflect paralinguistic and affect- specific information: intonation, loudness, rhythm and voice quality. The model was then evaluated on two different corpora of fully spontaneous, effectively-colored, naturally occurring speech between people: Call Home English and BT Call Center. Here the ground truth labels were obtained from examining the agreement of 29 human coders labeling arousal and valence. Then finally detecting high arousal negative valence speech in call centers Md. Mijanur Rahmanet al. [16] have proposed MFCC signal analysis technique for multi leveled pattern recognition task, which includes speech segmentation, classification, feature extraction and pattern recognition. In the proposed method, a blind speech segmentation procedure was used to segment the continuously spoken Bangla sentences into words/sub-words like units using the endpoint detection technique. These segmented words were classified according to the number of syllables and the sizes of the segmented words. The MFCC signal analysis technique was used to extract the features of speech words, which including windowing. The developed system achieved the segmentation accuracy rate at different from total 24 sub-classes of segmented words. Abhishek Jaywant et al. [17] have characterized to category- specificity from speech prosody and facial expressions. In communication involved processing nonverbal emotional cues from auditory and visual stimuli. They employed a cross- modal priming task using emotional stimuli with the same valence but that diff erent in emotion category. After listening to angry, sad, disgusted, or neutral vocal primes, subjects rendered a facial aff ect decision about an emotionally congruent or incongruent face target. In this results revealed that participants made fewer errors when judging face targets that conveyed the same emotion as the vocal prime, and responded significantly faster for most emotions. Astonishingly, participants responded slower when the prime and target both conveyed disgust, perhaps due to attention biases for disgust-related stimuli. They find that vocal emotional expressions with similar valence are processed with category specificity and that discrete emotion knowledge implicitly aff ects the processing of emotional faces between sensory modalities.
  • 3. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 87 THE PROBLEM STATEMENT Section 2 reviews the recent works related to the recognition or classification of speech signals based on the prosody effects. In continues speech signal classification, the prosody of continuous speech depends on many separate aspects, such as the meaning of the sentence and the speaker characteristics and emotions. In most of the research works, speech classification process based on prosody effects is done by using local and global features like, energy, pitch, linear predictive spectrum coding (LPCC), Mel-frequency spectrum coefficients (MFCC), and Mel-energy spectrum dynamic coefficients (MEDC), Intonation, Pitch contour, Vocal effort, etc. Normally, the pitch value of sad and happy speech signals has a great difference. Also, the frequency value of happy is higher than sad speech. But, in some cases the frequency of happy speech is nearly similar to sad speech or frequency of sad speech is similar to happy speech. In such situation, it is difficult to recognize the exact speech signal. 3. PROPOSED CLASSIFIER SYSTEM . Our proposed emotion speech classification system classifies the speech signals based on their prosody effects by using K- NN Classifier. In our work, we have utilizes a Telugu speech signals to accomplish classification process. The proposed system mainly comprised of two stages namely, (i) Feature extraction (ii) Emotion classification. These two stages are consecutively performed and the more accurate results are occurred and are discussed in Section 3.1 and 3.2 respectively. 3.1 Feature Extraction The Telugu speech database used in the proposed system consist speeches of seven male speakers and two female speakers with four emotions each. In feature extraction stage, there are three features are extracted from the input speech signals and given to the classification process. In speech classification feature extraction plays a most important role. Because the efficient features extraction from the input speech signals makes that the output to be more efficient and provide high classification accuracy. In our work, we extract three efficient features from the speech signals. The extracted three efficient features namely,  Energy Entropy  Short Time Energy  Zero Crossing Rate These features are extracted in our proposed system is explained below. (i) Energy Entropy (Ee) The energy level of an input speech words signal is rapidly changed and these sudden changes in the speech signals are measured. This measurement result is stated as energy entropy feature. To calculate this feature, the input speech word signals are divided into f number of frames and calculate normalized energy for each frame. The energy entropy (Ee) feature is calculated by using the formula which is stated as follows,     1 0 2 2 2 )(.log f k eE  -------- (1)          N b b l F S W N 1 2 *  --------- (2) Where, 2  - is the normalized energy N - is the total number of blocks lW - is the window length bS - is the number of short blocks F - is the frequency By exploiting the energy entropy equation is given in Equ. (1) Is applied to the speech word signals and obtain the energy entropy feature (Ee). (ii) Short Time Energy (Se) The input speech signals energy level is to be increased suddenly. So we measure this energy increment level in speech signals is defined as short time energy. To calculate the short time energy the input signal is divided into w number of windows and calculates the windowing function for each window. The short-time energy (Se) of speech signals replicates the amplitude variation and is described as follows,     i e iwhixS )(.)( 2 ------ (3) Where, )(ix - is the input signal )(wh - is the impulse response By utilizing the equation is given in Equ. (3), the short time energy (Se) feature is calculated from the input speech word signals. (iii) Zero Crossing Rate (ZCR) Among these four features, the zero crossing rates is one of the most dominant features for speech signal processing. The zero crossing ratios are defined as the rate at which the speech signal crosses zero can provide information about the source
  • 4. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 88 of its creation or the ratio of number of time domain zero crossings occurred to the frame length. The Zero crossing Rate (ZCR) is calculated by using the sign functions, which is stated below,      1 1 2 )}1(sgn{)}(sgn{1 k x CR xyxy M Z (4) Where, M - is the length of the sample )}(sgn{ xy - is the sign function The sign function )}(sgn{ xy is defined as,          0)(;1 0)(;0 0)(;1 )}(sgn{ xy xy xy xy (5) The ZCR is calculated for each input speech word signals by using the Equ. (4). 3.2 Emotion Classification In emotion classification the speech signals are classified by using the extracted features from the feature extraction stage. The extracted features from the speech signals are given to the K-NN classifier. The features are extracted for the training database speech signals and given to the K-NN classifier to perform the training process. During training stage, the speech signals corresponding three features are taken as input to the K-NN classifier. Here, we have taken three inputs as energy entropy (Ee), short-time energy (Se) and Zero crossing Rate (ZCR). The proposed emotion classification K-NN classifier Technique is mentioned below. 3.3 K-Nearest Neighbor Technique as an Emotion Recognizer A more general version of the nearest neighbor technique bases the classification of an unknown sample on the “votes” of K of its nearest neighbor rather than on only it’s on single nearest neighbor. The K-Nearest Neighbor classification procedure is denoted is denoted by K-NN. If the costs of error are equal for each class, the estimated class of an unknown sample is chosen to be the class that is most commonly represented in the collection of its K nearest neighbors. Among the various methods of supervised statistical pattern recognition, the Nearest Neighbor is the most traditional one, it does not consider a priori assumptions about the distributions from which the training examples are drawn. It involves a training set of all cases. A new sample is classified by calculating the distance to the nearest training case, the sign of that point then determines the classification of the sample. The K-NN classifier extends this idea by taking the K nearest points and assigning the sign of the majority. It is common to select K small and odd to break ties (typically 1, 3 or 5). Larger K values help reduce the effects of noisy points within the training data set, and the choice of K is often performed through cross validation. In this way, given a input test sample vector of features x of dimension n, we estimate its Euclidean distance d equation 1 with all the training samples (y) and classify to the class of the minimal distance. 𝑞 𝑥, 𝑦 = (𝑥𝑖 2𝑛 𝑖=1 - 𝑦𝑖 2 )--------(6) The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples. In the classification phase, K is a user-defined constant, and an unlabelled vector (a query or test point) is classified by assigning the label which is most frequent among the K training samples nearest to that query point. Usually Euclidean distance is used as the distance metric, however this is only applicable to continuous variables. 3.4 K-NN Algorithm: The k-NN algorithm can also be adapted for use in estimating continuous variables. One such implementation uses an inverse distance weighted average of the k-nearest multivariate neighbors. This algorithm functions as follows: Compute Euclidean or Mahalanobis distance from target plot to those that were sampled. 1. Order samples taking for account and calculate distances. 2. Choose heuristically optimal K nearest neighbor based on root mean square error q(x, y) done by cross validation technique. 3. Calculate an inverse distance weighted average with the k- nearest multivariate neighbors. 4. EXPERIMENTAL RESULTS The proposed Telugu speech emotion classification technique is implemented in the working platform of MATLAB (version 7.12) with machine configuration as follows Processor: Intel core i5 OS: Windows xp CPU speed: 3.20 GHz, RAM: 4GB The performance of the proposed system is evaluated with different person’s emotion speech signals and the results are compared against the existing techniques. The input speech signals are classified using the proposed speech classification
  • 5. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 89 system using K-NN classifier. The sample (i) input normal, happy, sad and angry emotion speech signals (ii) Extracted features zero crossing rate, energy entropy, short time energy for normal, happy, sad and angry speech signals are shown in the fig.1 (a) (b) (c) (d) 1(i)
  • 6. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 90 (a) (b)
  • 7. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 91 (c) (d) 1(ii) Figure 1: (i) Input (a) angry, (b) happy, (c) normal and (d) sad emotion speech signals (ii) Extracted features of (a) angry, (b) happy, (c) normal and (d) sad signals The performance of proposed speech classification system is analyzed by using the statistical measures. The statistical measures [18] are used to measure the speech classification performance. The performance analyses have shown that the proposed system has been successfully classifies the input speech signals based on their prosody which the speech signals belongs to. The Telugu speech signal database is created with 9 persons, each person has four emotional speech signals are normal, happy, sad and angry. These 9 person’s emotion speech signals are given to the one cross validation process and the performance measures of the classification results are listed in Table .1 The K-NN classifier performance measures for the same Telugu speech database are given in Table .1.
  • 8. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 92 Table 1: Speech emotion classification Statistical performance measures of K-NN for 9 different Speakers. As can be seen from table 1, our proposed emotion classification K-NN system has given a considerable and moderate classification performance. CONCLUSIONS In this paper, we proposed a Telugu speech emotion classification system using K-NN Classifier. Here, the classification process is made by extracting three features like entropy, short time energy and zero crossing rates from the input speech signals. The computed features were given to the K-NN to Classifier accomplish the training process. The proposed speech emotion classification system classifies the Telugu speech signals based on their prosody by using the K- NN classifier. During testing, if a set of emotional speech signals is given as input it classifies the speech signals based on the prosody which the speech signal belongs to. Human emotions can be recognized from speech signals when facial expressions or biological signals are not available. In this work Emotions are recognized from speech signals using real time database. In this work we presented an approach to emotion recognition from speech signal. The future work will be to conduct comparative study of various classifier using different parameter selection method to improve and accuracy. REFERENCES [1] Marwan Al-Akaidi, "Introduction to speech processing", Fractal Speech Processing-Cambridge University Press, 2012 [2] Dennis Norris, James M. McQueen and Anne Cutler, "Merging information in speech recognition: Feedback is Never necessary", Behavioral and Brain Sciences, Vol. 23, pp. 299-370, 2000 [3] Leonardo Neumeyer and Mitchel Weintraub, "Probabilistic Optimum Filtering for Robust Speech Recognition", In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, SA, Australia, Vol. 1, pp. I/417 - I/420, 1994 [4] Lukas Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra Goel, Martin Karafiat, Daniel Povey, Ariya Rastrow, Richard C. Rose, Samuel Thomas, "Multilingual Acoustic Modeling for Speech Recognition Based on Subspace Gaussian Mixture Models", In Proceedings of IEEE International Conference on Acoustics Speech and Signal Processing, Dallas, TX, pp. 4334 - 4337, 2010, [5] Kuldeep Kumar and R. K. Aggarwal, "Hindi Speech Recognition System Using HTK", International Journal of Computing and Business Research, Vol. 2, No. 2, May 2011 [6]. Ken Chen, Mark Hasegawa-Johnson and Aaron Cohen, “An Automatic Prosody Labeling System Using Ann-Based Syntactic-Prosodic Model and GMM-Based Acoustic- Prosodic Model”, IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, pp. 509- 512, 2004. [7] Vimala and Radha, "A Review on Speech Recognition Challenges and Approaches", World of Computer Science and Information Technology Journal, Vol. 2, No. 1, pp. 1-7, 2012 [8] Omair Khan, Wasfi G. Al-Khatib, Cheded Lahouari, “Detection of Questions in Arabic Audio Monologues Using Prosodic Features”, Ninth IEEE International Symposium on Multimedia, pp. 29- 36, 2007 [9]. Stefanie Shattuck-Hufnage and Alice E. Turk, “A Prosody Tutorial for Investigators of Auditory Sentence Experiments Sensitivity FPR Accuracy Specificity PPR NPV FPR MCC 1 25 75 25 25 25 25 75 13 2 0 67 25 33 0 33 100 -12 3 25 75 25 25 13 38 88 -9 4 25 75 25 25 25 25 75 13 5 25 75 25 25 25 25 75 13 6 25 75 25 25 25 25 75 13 7 25 75 25 25 25 25 75 13 8 25 75 25 25 25 25 75 13 9 25 75 25 25 25 25 75 13
  • 9. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 93 Processing”, Journal of Psycholinguistic Research, Vol. 25, No. 2, pp. 193-247, 1996 [10] Elizabeth Shriberga, Andreas Stolckea, Dilek Hakkani- Türb, Gökhan Tur, “Prosody-based automatic segmentation of speech into sentences and topics”, Speech Communication, Vol. 32, No. 1–2, pp. 127–154, 2000 [11] Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, and Jeung-Yoon Choi, “Prosody Dependent Speech Recognition on Radio News Corpus of American English”, IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 6, pp. 1-15, 2005 [12] Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole, Sarah Borys, Sung-Suk Kim, Aaron Cohen, Tong Zhang, Jeung-Yoon Choi, Heejin Kim, Taejin Yoon, Sandra Chavarria, “Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus”, Speech Communication, Vol. 46, No. 3–4, pp. 418–439, 2005 [13] KlaraVicsi, Gyorgy Szaszak, “Using prosody to improve automatic speech recognition”, Speech Communication, Vol. 52, No. 5, pp. 413–426, 2010 [14] Soheil Shafieea, Farshad Almasganja, Bahram Vazirnezhada, Ayyoob Jafari, “A two-stage speech activity detection system considering fractal aspects of prosody”, Pattern Recognition Letters, Vol. 31, No. 9, pp. 936–948, 2010 [15] Raul Fernandez, Rosalind Picard, “Recognizing affect from speech prosody using hierarchical graphical models”, Journal Speech Communication, Vol. 53, No. 9-10, pp. 1088- 1103, 2011 [16] Md. Mijanur Rahman, Md. Farukuzzaman Khan and Md. Al-Amin Bhuiyan, “Continuous Bangla Speech Segmentation, Classification and Feature Extraction”, IJCSI International Journal of Computer Science Issues, Vol. 9, No. 2, No 1, pp. 67-75, 2012 [17] Abhishek Jaywant, Marc D. Pell, “Categorical processing of negative emotions from speech prosody”, Speech Communication, Vol. 54, No. 1, pp. 1–10, 2012 [18]http://en.wikipedia.org/wiki/Sensitivity_and_specificity BIOGRAPHIES: P. Vijay Bhaskar, Head of the ECE Department, AVN Institute of Engg.&Tech. –HYDERABAD obtained his B. Tech., from JNTU, Kakinada and M. Tech., from JNTU, Hyderabad, and both First Class with Distinction and pursuing PhD. from JNTU, Hyderabad, and also having the overall teaching experience of 18 Years and Published 09 papers in various National and International conferences, and guided good number of B. Tech., and M. Tech., Projects S. Rama Mohana Rao, who served for 25 years in ISRO Vikram Sarabhai Space Centre Trivandrum from 1971- 1996 in various capabilities such as Head, ELP, PCF, EFF and the last being as Deputy Project Director ESP. He has also served for 15 years in various educational institutions as Professor, HOD & Principal. who is an eminent personality known for his versatility, and obtained his Ph D from IISC, Bangalore, and also his credit publications in International & National journals. He has over15 years of experience as Principal at reputed Engineering colleges.