SlideShare une entreprise Scribd logo
1  sur  4
Télécharger pour lire hors ligne
ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011



     Marathi Isolated Word Recognition System using
                MFCC and DTW Features
                   Bharti W. Gawali1, Santosh Gaikwad2, Pravin Yannawar3, Suresh C.Mehrotra4
                             1,2,3,4
                                 Department of Computer Science & Information Technology,
                                    Dr.Babasaheb Ambedkar Marathwada University,
                                            Aurangabad. 431001(MS) India.
                        Address: 1bharti_rokade@yahoo.co.in, , 2santosh.gaikwadcsit@gmail.com
                             3pravinyannawar@gmail.com , 4mehrotra_suresh@yahoo.com


Abstract— This paper presents a Marathi database and isolated                    II. MARATHI SPEECH DATABASE
Word recognition system based on Mel-frequency cepstral
                                                                           For accuracy in the speech recognition, we need a
coefficient (MFCC), and Distance Time Warping (DTW) as
features. For the extraction of the feature, Marathi speech           collection of utterances, which are required for training and
                                                                          testing. The Collection of utterances in proper manner is
database has been designed by using the Computerized Speech
Lab. The database consists of the Marathi vowels, isolated            called the database.
words starting with each vowels and simple Marathi sentences.             The generation of a corpus of Marathi Vowels, words
Each word has been repeated three times by the 35 speakers.           and sentences as well as the collection of speech data are
This paper presents the comparative recognition accuracy of           described below. The age group of speakers selected for the
DTW and MFCC.                                                         collection of database ranges from 22 to 35. Mother tongue
                                                                      of all the speakers was Marathi. The total number of speakers
Index Terms— CSL, MFCC, DTW, Spectrogram, Speech                      was 35 out of which 17 were Females and 18 were Males.
Recognition and statistical method                                    The vocabulary size of the database consists of
                                                                                  • Marathi Vowels: 105 samples
                      I.. INTRODUCTION                                            • Isolated words stating with each vowel:
    The Speech is the most prominent and natural form of                              420 Samples
communication between humans. There are various spoken                            • Sentences: 175 samples.
Languages thought the world [1]. Marathi is an Indo-Aryan             A. Acquisition setup
Language, spoken in western and central India. There are
90 million of fluent speakers all over world [2].However;                 To achieve a high audio quality the recording took place
there is lot of scope to develop systems using Indian                 in the10 X 10 rooms without noisy sound and effect of echo.
languages which are of different variations. Some work is             The Sampling frequency for all recordings was 11025 Hz in
done in this direction in isolated Bengali words, Hindi and           the Room temperature and normal humidity. The speaker
Telugu [3].The amount of work in Indian regional languages            were Seating in front of the direction of the microphone with
has not yet reached to a critical level to be used it as real         the Distance of about 12-15 cm [6]. The speech data is
communication tool, as already done in other languages in             collected with the help of Computerized speech laboratory
developed countries. Thus, this work was taken to focus on            (CSL) using the single channel. The CSL is most advanced
Marathi language. It is important to see that whether Speech          analysis system For speech and voice. It is a complete
Recognition System for Marathi can be carried out similar             hardware and software system with specifications and
pathways of reaserch as carried out in English [4, 5].In this         performance. It is an input/output recording device for a PC,
paper we are presenting work consists of the creation of              which has special features for reliable acoustic measurements
Marathi speech database and its speech recognition system             [7].
for isolated words.                                                   B. Vowels, Words and Sentences corpus
    The paper is divided into five sections, Section 1, gives             Marathi language uses Devanagari, a character based
Introduction, Section 2 deals with details of creating Marathi        script. A character represents one vowel and zero or more
speech database ,section 3,focuses on Recognition of isolated         consonants. There are 12 vowels and 36 consonants present
words using MFCC and DTW, Section 4 ,covers results and               in Marathi languages as shown in figure 1.a [8]. We have
conclusion followed by section 5 with the References.                 recorded vowels, isolated words starting from each vowel
                                                                      and the simple sentences which are used for communication
                                                                      in Government offices shown in figure 1.b. All sentences
                                                                      are interrogative and containing 5 to 6 words. The speech
                                                                      signals have been stored in form of wav file. Each vowels,
                                                                      words and sentences have been separated using CSL software
                                                                      and stored and given the labels of spoken words to the
                                                                      different files and folders.

© 2011 ACEEE                                                     21
DOI: 01.IJIT.01.01.73
ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011


                                                                       Pre-emphasis
                                                                           The speech is first pre-emphasis with the pre-emphasis
                                                                       filter 1-az-1 to spectrally flatten the signal.
                                                                        Framing and Windowing
                                                                           A speech signal is assumed to remain stationary in periods
                                                                       of approximately 20 ms. Dividing a discrete signal s[n] into
                                                                       frames in the time domain truncating the signal with a
                                                                       window function w[n]. This is done by multiplying the signal,
                                                                       consisting of N samples, . The frame is shifted 10 ms so that
                                                                       the overlapping between two adjacent frames is 50% to avoid
                                                                       the risk of losing the information from the speech signal.
                                                                       After dividing the signal into frames that contain nearly
                                                                       stationary signal blocks, the windowing function is applied.
                                                                        Fourier Transform
                                                                           To obtain a good frequency resolution, a 512 point Fast
                                                                       Fourier Transform (FFT) is used [12, 13]
                                                                       Mel-Frequency Filter Bank
                                                                           A filter bank is created by calculating a number of peaks,
                                                                           Uniformly spaced in the Mel-scale and then transforming
                                                                       the back to the normal frequency scale where they are used
                                                                       a speaks for the filter banks.
                                                                        Discrete Cosine Transform
                                                                           As the Mel-cepstrum coefficients contain only real parts,
                                                                       the Discrete Cosine Transform (DCT) can be used to achieve
                                                                       the Mel- cepstrum coefficients. There were 24 Coefficients
                                                                       out of that only 13 coefficients have been selected for the
                                                                       recognition system.
                                                                        Distance Measures
                                                                           There are some commonly used distance measures i.e.
                                                                       Euclidean Distance. Euclidian Distance of two vectors x and
                                                                       p is used measured. The result of MFCC on the vector of
                                                                       words starting with‘ ’are presented in Table 1.
                                                                        B. Dynamic Time Warping
          III. WORD RECOGNITION SYSTEM                                     The simplest way to recognize an isolated word sample
                                                                       is to compare it against a number of stored word templates
    There are several kinds of parametric representation of
                                                                       and determine the best match [14,15].DTW is an instance of
the acoustic signals. Among of them the Mel-Frequency
                                                                       the general class of algorithms and known as dynamic
cepstral Coefficient (MFCC) is most widely used [9].There
                                                                       programming .its time and space complexity is merely linear
are many work on MFCC, on the improvement and accuracy
                                                                       in duration of speech sample and the vocabulary size. The
in recognition [10].The recognition rate achieved by MCC,
                                                                       algorithm makes a single pass though a matrix of frame scores
LPC and Combined are given 87.5%, 91.61% and 84.6%
                                                                       while computing locally optimized segment of the global
respectively for 312 number of testing patterns [11].We have
                                                                       alignment path. The dynamic time warping algorithm
developed the recognition system using MFCC and
                                                                       provides a procedure to align in the test and reference patterns
DTW.paragraphs must be indented. All paragraphs must be
                                                                       to give the average distance associated with the optimal
justified, i.e. both left-justified and right-justified.
                                                                       warping path. The results of DTW are given in table 2.
 A. Mel Frequency Cepstral Coefficient
    It consists of various steps described below.                                  IV. RESULTS AND CONCLUSION
Speech Signal                                                               The aim here is to compare the performance of MFCC
                                                                       (where 13 coefficients are used), and DTW. The speech data
    The excitation signal is spectrally shaped by a vocal tract
                                                                       used in this experiment are the isolated words starting from
Equivalent filter. The outcome of this process is the sequence
                                                                       vowel‘’ in Marathi, spoken by female speakers.The test
of exciting signal called speech
                                                                       pattern is compared with the reference pattern to get the best
                                                                       Match. The symmetric form of DTW algorithm is used to
                                                                       optimally align in time the test and reference patterns and
                                                                  22
© 2011 ACEEE
DOI: 01.IJIT.01.01.73
ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011


and to give average distance associated with optimal warping                                 ACKNOWLEDGMENT
path. The recognition system uses the utterance of the spoken               The authors would like to thank the university authorities
word For training and the remaining utterances for testing.             for providing infrastructure to carry out experiment. This
   The Reference patterns are created for each word in the
                                                                        work has been supported by DST under Fast Track Scheme
vocabulary from the data in the training set by averaging (after
                                                                        entitled as “Design and Development of Marathi Speech
dynamic time warping) the patterns of utterances of the same
                                                                        Interface System”.
word. The comparative recognition accuracy is presented in
Table 3.




                                                                   23
© 2011 ACEEE
DOI: 01.IJIT.01.01.73
ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011


                                REFERENCES                                               [9] Steven B. Davis and Paul Mermelstein, “Comparison of
[1] (2010) The Wikipedia website [Online].                                                    Parametric Representations for Monosyllabic Word
     Av a i l a b l e : h t t p : / / e n . w i k i p e d i a . o r g / w i k i /             Recognition in Continuously Spoken sentences”, IEEE
     List_of_languages_by_number_of_native _speakers. Viewed                                  Transactions on Acoustics, Speech, and Signal Processing,
     10 Jan 2010.                                                                             vol. ASSP-28, No.4, August 1980.
[2] (2010)The Wikipedia website [Online].                                                [10] Qi, Li, Frank K, Soong and Olivier Siohan, “A High-
Available: http://en.wikipedia.org/wiki/Marathi_language. Viewed                              Performance Auditory Feature for Robust Speech
     12 Jan 2010.                                                                             Recognition”, 6th International Conferences on Spoken
[3] Samudravijaya K, P.V.S. Rao and S.S. Agrawal, “Hindi Speech                               languages processing, Beijing, October 2000
     database”, Proc. Int. Conf. Spoken language processing,                             [11] K. R. Aida – Zade, Ardil and S.S. Rustamov,”Investigation of
     ICSLP00,October Beijing 2000.                                                            Combined used of MFCC and LPC features in Speech
[4] F. Jelinek, L.R. Bahl, and R. L, Mercer, “Design of a linguistic                          Recognition Systems”, PWASET VOLUME 13 MAY 2006
     statistical decoder for the recognition of continuous speech”,                           ISSN 1307-6884, Proceeding of world academy of science,
     IEEE Trans. Informat. Theory, vol. IT-21, PP. 250-250, 1975.                             engineering and technology volume.
[5] Michael Grinm, Kristian Kroschel and Shrikanth Narayanan,                            [12] Hiromi Sakaguchi and Naoaki Kawaguchi, “Mathematical
     “The Vera AM Mittag German Audio – Visual Emotional                                      Modeling of Human Speech Processing Mechanism Based
     Speech Database.                                                                         on the Principle of Bain Internal Model of Vocal Tract” Journal
[6] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, B.                                   of the faculty of Engineering, Honshu University, No. 75,1995
     Weiss,       “A       database          of      German          Emotion             [13] Steven W. Smith, “the Scientist and Engineer’s Guide to
     Speech”,INTERSPEECH 2005, September, 4-8, Lisbon,                                        Digital Signal Processing”, California Technical Publishing,
     Portugal                                                                                 1977, page (s): 169-174
[7] The website for The Disordered Voice Database                                        [14] Wei HAN, Cheong-Fat CHAN, Chiu-Sing CHOY and Kong-
     Available:http://www.kayelemetrics.com/Product%20Info/                                   Pang PUN, “An efficient MFCC Extraction Method in Speech
     CSL%20Family/4500/4500.htm.                                                              Recognition”, 0-7803-9390-2/06/2006 IEEE
[8] [2010] The Wikipedia website[Online] Available: http://                              [15] K.K. PALIWAL, Anant AGARWAL and Sarvajit S. SINHA
     en.wikipedia.org/wiki/Devanagari                                                         “Amodification over Sakoe and Chiba’s dynamic Time
                                                                                              Warping Algorithm for isolated word recognition.” Signal
                                                                                              Processing 4.




© 2011 ACEEE                                                                        24
DOI: 01.IJIT.01.01.73

Contenu connexe

Tendances

Efficient Speech Emotion Recognition using SVM and Decision Trees
Efficient Speech Emotion Recognition using SVM and Decision TreesEfficient Speech Emotion Recognition using SVM and Decision Trees
Efficient Speech Emotion Recognition using SVM and Decision TreesIRJET Journal
 
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLABA GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLABsipij
 
Homomorphic speech processing
Homomorphic speech processingHomomorphic speech processing
Homomorphic speech processingsivakumar m
 
Speaker recognition on matlab
Speaker recognition on matlabSpeaker recognition on matlab
Speaker recognition on matlabArcanjo Salazaku
 
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET Journal
 
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...ijnlc
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327IJMER
 
VAW-GAN for disentanglement and recomposition of emotional elements in speech
VAW-GAN for disentanglement and recomposition of emotional elements in speechVAW-GAN for disentanglement and recomposition of emotional elements in speech
VAW-GAN for disentanglement and recomposition of emotional elements in speechKunZhou18
 
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM ModelASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Modelsipij
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By MatlabAnkit Gujrati
 
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODEL
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODELASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODEL
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODELsipij
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...ijcsit
 

Tendances (15)

Efficient Speech Emotion Recognition using SVM and Decision Trees
Efficient Speech Emotion Recognition using SVM and Decision TreesEfficient Speech Emotion Recognition using SVM and Decision Trees
Efficient Speech Emotion Recognition using SVM and Decision Trees
 
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLABA GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
 
speech enhancement
speech enhancementspeech enhancement
speech enhancement
 
Homomorphic speech processing
Homomorphic speech processingHomomorphic speech processing
Homomorphic speech processing
 
Kf2517971799
Kf2517971799Kf2517971799
Kf2517971799
 
Speaker recognition on matlab
Speaker recognition on matlabSpeaker recognition on matlab
Speaker recognition on matlab
 
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
 
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327
 
VAW-GAN for disentanglement and recomposition of emotional elements in speech
VAW-GAN for disentanglement and recomposition of emotional elements in speechVAW-GAN for disentanglement and recomposition of emotional elements in speech
VAW-GAN for disentanglement and recomposition of emotional elements in speech
 
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM ModelASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By Matlab
 
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODEL
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODELASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODEL
ASERS-CNN: ARABIC SPEECH EMOTION RECOGNITION SYSTEM BASED ON CNN MODEL
 
An Introduction To Speech Recognition
An Introduction To Speech RecognitionAn Introduction To Speech Recognition
An Introduction To Speech Recognition
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 

Similaire à Marathi Isolated Word Recognition System using MFCC and DTW Features

Combined feature extraction techniques and naive bayes classifier for speech ...
Combined feature extraction techniques and naive bayes classifier for speech ...Combined feature extraction techniques and naive bayes classifier for speech ...
Combined feature extraction techniques and naive bayes classifier for speech ...csandit
 
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...cscpconf
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSijnlc
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignmentskevig
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languageiosrjce
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...AIRCC Publishing Corporation
 
Incremental Difference as Feature for Lipreading
Incremental Difference as Feature for LipreadingIncremental Difference as Feature for Lipreading
Incremental Difference as Feature for LipreadingIDES Editor
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
 
Speech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingSpeech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingiosrjce
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...iosrjce
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionaryiosrjce
 

Similaire à Marathi Isolated Word Recognition System using MFCC and DTW Features (20)

Combined feature extraction techniques and naive bayes classifier for speech ...
Combined feature extraction techniques and naive bayes classifier for speech ...Combined feature extraction techniques and naive bayes classifier for speech ...
Combined feature extraction techniques and naive bayes classifier for speech ...
 
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...
COMBINED FEATURE EXTRACTION TECHNIQUES AND NAIVE BAYES CLASSIFIER FOR SPEECH ...
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi language
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
Incremental Difference as Feature for Lipreading
Incremental Difference as Feature for LipreadingIncremental Difference as Feature for Lipreading
Incremental Difference as Feature for Lipreading
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
H010625862
H010625862H010625862
H010625862
 
Speech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingSpeech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law companding
 
A017410108
A017410108A017410108
A017410108
 
A017410108
A017410108A017410108
A017410108
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
G010424248
G010424248G010424248
G010424248
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionary
 

Plus de IDES Editor

Power System State Estimation - A Review
Power System State Estimation - A ReviewPower System State Estimation - A Review
Power System State Estimation - A ReviewIDES Editor
 
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...IDES Editor
 
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...IDES Editor
 
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...IDES Editor
 
Line Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCLine Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCIDES Editor
 
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...IDES Editor
 
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingAssessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingIDES Editor
 
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...IDES Editor
 
Selfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsSelfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsIDES Editor
 
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...IDES Editor
 
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...IDES Editor
 
Cloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkCloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkIDES Editor
 
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetGenetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetIDES Editor
 
Enhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyEnhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyIDES Editor
 
Low Energy Routing for WSN’s
Low Energy Routing for WSN’sLow Energy Routing for WSN’s
Low Energy Routing for WSN’sIDES Editor
 
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...IDES Editor
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance AnalysisIDES Editor
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesIDES Editor
 
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...IDES Editor
 
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...IDES Editor
 

Plus de IDES Editor (20)

Power System State Estimation - A Review
Power System State Estimation - A ReviewPower System State Estimation - A Review
Power System State Estimation - A Review
 
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
 
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
 
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
 
Line Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCLine Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFC
 
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
 
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingAssessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
 
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
 
Selfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsSelfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive Thresholds
 
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
 
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
 
Cloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkCloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability Framework
 
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetGenetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
 
Enhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyEnhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through Steganography
 
Low Energy Routing for WSN’s
Low Energy Routing for WSN’sLow Energy Routing for WSN’s
Low Energy Routing for WSN’s
 
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance Analysis
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
 
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
 
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
 

Dernier

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Dernier (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Marathi Isolated Word Recognition System using MFCC and DTW Features

  • 1. ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011 Marathi Isolated Word Recognition System using MFCC and DTW Features Bharti W. Gawali1, Santosh Gaikwad2, Pravin Yannawar3, Suresh C.Mehrotra4 1,2,3,4 Department of Computer Science & Information Technology, Dr.Babasaheb Ambedkar Marathwada University, Aurangabad. 431001(MS) India. Address: 1bharti_rokade@yahoo.co.in, , 2santosh.gaikwadcsit@gmail.com 3pravinyannawar@gmail.com , 4mehrotra_suresh@yahoo.com Abstract— This paper presents a Marathi database and isolated II. MARATHI SPEECH DATABASE Word recognition system based on Mel-frequency cepstral For accuracy in the speech recognition, we need a coefficient (MFCC), and Distance Time Warping (DTW) as features. For the extraction of the feature, Marathi speech collection of utterances, which are required for training and testing. The Collection of utterances in proper manner is database has been designed by using the Computerized Speech Lab. The database consists of the Marathi vowels, isolated called the database. words starting with each vowels and simple Marathi sentences. The generation of a corpus of Marathi Vowels, words Each word has been repeated three times by the 35 speakers. and sentences as well as the collection of speech data are This paper presents the comparative recognition accuracy of described below. The age group of speakers selected for the DTW and MFCC. collection of database ranges from 22 to 35. Mother tongue of all the speakers was Marathi. The total number of speakers Index Terms— CSL, MFCC, DTW, Spectrogram, Speech was 35 out of which 17 were Females and 18 were Males. Recognition and statistical method The vocabulary size of the database consists of • Marathi Vowels: 105 samples I.. INTRODUCTION • Isolated words stating with each vowel: The Speech is the most prominent and natural form of 420 Samples communication between humans. There are various spoken • Sentences: 175 samples. Languages thought the world [1]. Marathi is an Indo-Aryan A. Acquisition setup Language, spoken in western and central India. There are 90 million of fluent speakers all over world [2].However; To achieve a high audio quality the recording took place there is lot of scope to develop systems using Indian in the10 X 10 rooms without noisy sound and effect of echo. languages which are of different variations. Some work is The Sampling frequency for all recordings was 11025 Hz in done in this direction in isolated Bengali words, Hindi and the Room temperature and normal humidity. The speaker Telugu [3].The amount of work in Indian regional languages were Seating in front of the direction of the microphone with has not yet reached to a critical level to be used it as real the Distance of about 12-15 cm [6]. The speech data is communication tool, as already done in other languages in collected with the help of Computerized speech laboratory developed countries. Thus, this work was taken to focus on (CSL) using the single channel. The CSL is most advanced Marathi language. It is important to see that whether Speech analysis system For speech and voice. It is a complete Recognition System for Marathi can be carried out similar hardware and software system with specifications and pathways of reaserch as carried out in English [4, 5].In this performance. It is an input/output recording device for a PC, paper we are presenting work consists of the creation of which has special features for reliable acoustic measurements Marathi speech database and its speech recognition system [7]. for isolated words. B. Vowels, Words and Sentences corpus The paper is divided into five sections, Section 1, gives Marathi language uses Devanagari, a character based Introduction, Section 2 deals with details of creating Marathi script. A character represents one vowel and zero or more speech database ,section 3,focuses on Recognition of isolated consonants. There are 12 vowels and 36 consonants present words using MFCC and DTW, Section 4 ,covers results and in Marathi languages as shown in figure 1.a [8]. We have conclusion followed by section 5 with the References. recorded vowels, isolated words starting from each vowel and the simple sentences which are used for communication in Government offices shown in figure 1.b. All sentences are interrogative and containing 5 to 6 words. The speech signals have been stored in form of wav file. Each vowels, words and sentences have been separated using CSL software and stored and given the labels of spoken words to the different files and folders. © 2011 ACEEE 21 DOI: 01.IJIT.01.01.73
  • 2. ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011 Pre-emphasis The speech is first pre-emphasis with the pre-emphasis filter 1-az-1 to spectrally flatten the signal. Framing and Windowing A speech signal is assumed to remain stationary in periods of approximately 20 ms. Dividing a discrete signal s[n] into frames in the time domain truncating the signal with a window function w[n]. This is done by multiplying the signal, consisting of N samples, . The frame is shifted 10 ms so that the overlapping between two adjacent frames is 50% to avoid the risk of losing the information from the speech signal. After dividing the signal into frames that contain nearly stationary signal blocks, the windowing function is applied. Fourier Transform To obtain a good frequency resolution, a 512 point Fast Fourier Transform (FFT) is used [12, 13] Mel-Frequency Filter Bank A filter bank is created by calculating a number of peaks, Uniformly spaced in the Mel-scale and then transforming the back to the normal frequency scale where they are used a speaks for the filter banks. Discrete Cosine Transform As the Mel-cepstrum coefficients contain only real parts, the Discrete Cosine Transform (DCT) can be used to achieve the Mel- cepstrum coefficients. There were 24 Coefficients out of that only 13 coefficients have been selected for the recognition system. Distance Measures There are some commonly used distance measures i.e. Euclidean Distance. Euclidian Distance of two vectors x and p is used measured. The result of MFCC on the vector of words starting with‘ ’are presented in Table 1. B. Dynamic Time Warping III. WORD RECOGNITION SYSTEM The simplest way to recognize an isolated word sample is to compare it against a number of stored word templates There are several kinds of parametric representation of and determine the best match [14,15].DTW is an instance of the acoustic signals. Among of them the Mel-Frequency the general class of algorithms and known as dynamic cepstral Coefficient (MFCC) is most widely used [9].There programming .its time and space complexity is merely linear are many work on MFCC, on the improvement and accuracy in duration of speech sample and the vocabulary size. The in recognition [10].The recognition rate achieved by MCC, algorithm makes a single pass though a matrix of frame scores LPC and Combined are given 87.5%, 91.61% and 84.6% while computing locally optimized segment of the global respectively for 312 number of testing patterns [11].We have alignment path. The dynamic time warping algorithm developed the recognition system using MFCC and provides a procedure to align in the test and reference patterns DTW.paragraphs must be indented. All paragraphs must be to give the average distance associated with the optimal justified, i.e. both left-justified and right-justified. warping path. The results of DTW are given in table 2. A. Mel Frequency Cepstral Coefficient It consists of various steps described below. IV. RESULTS AND CONCLUSION Speech Signal The aim here is to compare the performance of MFCC (where 13 coefficients are used), and DTW. The speech data The excitation signal is spectrally shaped by a vocal tract used in this experiment are the isolated words starting from Equivalent filter. The outcome of this process is the sequence vowel‘’ in Marathi, spoken by female speakers.The test of exciting signal called speech pattern is compared with the reference pattern to get the best Match. The symmetric form of DTW algorithm is used to optimally align in time the test and reference patterns and 22 © 2011 ACEEE DOI: 01.IJIT.01.01.73
  • 3. ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011 and to give average distance associated with optimal warping ACKNOWLEDGMENT path. The recognition system uses the utterance of the spoken The authors would like to thank the university authorities word For training and the remaining utterances for testing. for providing infrastructure to carry out experiment. This The Reference patterns are created for each word in the work has been supported by DST under Fast Track Scheme vocabulary from the data in the training set by averaging (after entitled as “Design and Development of Marathi Speech dynamic time warping) the patterns of utterances of the same Interface System”. word. The comparative recognition accuracy is presented in Table 3. 23 © 2011 ACEEE DOI: 01.IJIT.01.01.73
  • 4. ACEEE Int. J. on Information Technology, Vol. 01, No. 01, Mar 2011 REFERENCES [9] Steven B. Davis and Paul Mermelstein, “Comparison of [1] (2010) The Wikipedia website [Online]. Parametric Representations for Monosyllabic Word Av a i l a b l e : h t t p : / / e n . w i k i p e d i a . o r g / w i k i / Recognition in Continuously Spoken sentences”, IEEE List_of_languages_by_number_of_native _speakers. Viewed Transactions on Acoustics, Speech, and Signal Processing, 10 Jan 2010. vol. ASSP-28, No.4, August 1980. [2] (2010)The Wikipedia website [Online]. [10] Qi, Li, Frank K, Soong and Olivier Siohan, “A High- Available: http://en.wikipedia.org/wiki/Marathi_language. Viewed Performance Auditory Feature for Robust Speech 12 Jan 2010. Recognition”, 6th International Conferences on Spoken [3] Samudravijaya K, P.V.S. Rao and S.S. Agrawal, “Hindi Speech languages processing, Beijing, October 2000 database”, Proc. Int. Conf. Spoken language processing, [11] K. R. Aida – Zade, Ardil and S.S. Rustamov,”Investigation of ICSLP00,October Beijing 2000. Combined used of MFCC and LPC features in Speech [4] F. Jelinek, L.R. Bahl, and R. L, Mercer, “Design of a linguistic Recognition Systems”, PWASET VOLUME 13 MAY 2006 statistical decoder for the recognition of continuous speech”, ISSN 1307-6884, Proceeding of world academy of science, IEEE Trans. Informat. Theory, vol. IT-21, PP. 250-250, 1975. engineering and technology volume. [5] Michael Grinm, Kristian Kroschel and Shrikanth Narayanan, [12] Hiromi Sakaguchi and Naoaki Kawaguchi, “Mathematical “The Vera AM Mittag German Audio – Visual Emotional Modeling of Human Speech Processing Mechanism Based Speech Database. on the Principle of Bain Internal Model of Vocal Tract” Journal [6] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, B. of the faculty of Engineering, Honshu University, No. 75,1995 Weiss, “A database of German Emotion [13] Steven W. Smith, “the Scientist and Engineer’s Guide to Speech”,INTERSPEECH 2005, September, 4-8, Lisbon, Digital Signal Processing”, California Technical Publishing, Portugal 1977, page (s): 169-174 [7] The website for The Disordered Voice Database [14] Wei HAN, Cheong-Fat CHAN, Chiu-Sing CHOY and Kong- Available:http://www.kayelemetrics.com/Product%20Info/ Pang PUN, “An efficient MFCC Extraction Method in Speech CSL%20Family/4500/4500.htm. Recognition”, 0-7803-9390-2/06/2006 IEEE [8] [2010] The Wikipedia website[Online] Available: http:// [15] K.K. PALIWAL, Anant AGARWAL and Sarvajit S. SINHA en.wikipedia.org/wiki/Devanagari “Amodification over Sakoe and Chiba’s dynamic Time Warping Algorithm for isolated word recognition.” Signal Processing 4. © 2011 ACEEE 24 DOI: 01.IJIT.01.01.73