International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 0976 –
6464(Print), ISSN 0976 – 6472(Online), Volume 5, Issue 4, April (2014), pp. 07-18 © IAEME
A NEW SINUSOIDAL SPEECH CODING TECHNIQUE WITH SPEECH
ENHANCER AT LOW BIT RATES
Samer J. Alabed, Darmstadt University of Technology, Darmstadt, Germany
Eyad A. Ibrahim, Zarqa University of Technology, Zarqa, Jordan
ABSTRACT
Speech coding deals with the problem of reducing the bit rate required for representing
speech signals while preserving the quality of the speech reconstructed from that representation. In
this paper, we propose a novel speech coding technique, not only to compress speech signal at low
bit rate, but also to maintain its quality even if the received signal is corrupted by noise. The encoder
of the proposed technique is based on a speech analysis/synthesis model using a sinusoidal
representation, where the sinusoidal components are combined to form a close approximation of the
original speech waveform. In the proposed technique, each original frame is divided into voiced or
unvoiced sub-frames based on their energies. The aim of this division and classification is to choose
the best parameters that reduce the total bit rate and enable the receiver to recover the speech signal
with a good quality. The parameters involved in the analysis stage are extracted from the short-time
Fourier transform where the original speech signal is converted into frequency domain. Making use
of the peak-picking technique, amplitudes of the selected peaks with their associated frequencies and
phases of the original speech signal are extracted. In the next stage, novel parameter reduction and
quantization techniques are performed to reduce the bit rate while preserving the quality of the
recovered signal.
Keywords: Speech Coding, Speech Enhancement, Speech Compression, Waveform Speech Coder,
Sinusoidal Model, Source Coding.
1. INTRODUCTION
Due to the redundancy in speech signals, speech coding, which is used to compress speech, is one of
the most important speech processing steps. Speech coding, or compression, deals with the problem of
obtaining a compact representation of speech signals for efficient digital storage or transmission,
i.e., reducing the bit rate required for a speech representation while preserving the quality of the
speech reconstructed from that representation. Hence, the main objective of speech coding techniques is to
represent the speech signal with a minimum number of bits while maintaining its quality. Furthermore,
speech coding techniques are used to improve bandwidth utilization and power efficiency in several
applications, such as digital telephony, multimedia, and secure digital communications, which require
the speech signal to be in digital format to facilitate its processing, storage, and transmission.
Although digital speech brings flexibility and opportunities for encryption, it is also
associated, when uncompressed, with a high data rate and, hence, high requirements of transmission
bandwidth and storage. In wired communications, very large transmission bandwidths are now
available; in wireless and satellite communications, however, transmission bandwidth is limited.
Therefore, reducing the bit rate is necessary to reduce the required transmission bandwidth and
memory storage.
In order to reduce the bit rate of speech signal while preserving its quality, speech coding
provides sophisticated techniques to remove the redundancies and the irrelevant information from the
speech signal. There are two categories of speech coding techniques: (i) techniques based on linear
prediction [1] and (ii) techniques based on orthogonal transforms [1-19]. The techniques belonging
to the first category are very well known [13-19]; one of them, called regular pulse excitation
(RPE), is now used in the GSM standard [1]. The proposed technique described in detail in this
paper belongs to the second category.
The encoder (analysis stage) and the decoder (synthesis stage) are the two main components
of any speech coding technique. In the analysis stage, the encoder represents the speech signal in a
compact form using a few parameters: the analog speech signal s(t) is first sampled at a rate fs ≥
2fmax, where fmax is the maximum frequency content of s(t), and the sampled discrete-time signal is
denoted by s(n). Afterwards, one of the coding techniques, such as pulse code modulation (PCM),
differential PCM, predictive coding, etc., is used to encode the signal s(n). In the PCM coding
technique, the discrete-time signal s(n) is quantized to one of 2^R levels, where each sample s(n) is
represented by R bits. In sinusoidal speech coding [2-9], [12], the encoder takes a group of samples at
a time, extracts some parameters from them, and then converts the extracted parameters to binary
bits. After that, the binary signal is transmitted to the decoder. In the synthesis stage, the decoder
reconstructs the parameters from the received binary bits. Making use of the reconstructed parameters,
the decoder can recover the original speech signal.
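As an illustrative sketch of the PCM quantization just described (the uniform quantizer and the assumed [-1, 1) signal range are our own choices, not specified in the paper), each sample can be mapped to one of 2^R levels as follows:

```python
import numpy as np

def pcm_quantize(s, R=8):
    """Map each sample of s (assumed to lie in [-1, 1)) to one of 2**R levels."""
    levels = 2 ** R
    idx = np.floor((s + 1.0) / 2.0 * levels).astype(int)
    return np.clip(idx, 0, levels - 1)

def pcm_dequantize(idx, R=8):
    """Map level indices back to mid-point sample values."""
    levels = 2 ** R
    return (idx + 0.5) / levels * 2.0 - 1.0

s = np.array([-1.0, -0.5, 0.0, 0.5, 0.999])
s_hat = pcm_dequantize(pcm_quantize(s))
```

With R = 8 bits the quantization error per sample is at most half a step, i.e. 1/256 for this range.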
In the proposed technique, sinusoidal speech coding is used to reduce the required bit rate of
a speech signal while maintaining its quality. We first divide the speech signal into sub-frames and
make voiced/unvoiced classifications based on their energies. In the analysis stage and after
converting the speech frame into frequency domain using the short-time Fourier transform, all peaks
with their associated frequencies and phases are extracted using the peak-picking strategy. In the
next stage, novel parameter reduction and quantization techniques as well as the concept of birth and
death tracking of the involved frequencies are performed to reduce the required bit rate and enhance
the quality of the recovered signal.
The rest of this paper is organized as follows: In section two, the implementation of the
sinusoidal coder is introduced; this is followed by a discussion of the proposed technique in section
three. In the last section, we present the experimental results and conclusions.
2. IMPLEMENTATION OF THE SINUSOIDAL CODER
2.1. Analysis-synthesis model
A sinusoidal speech model is a vocoding strategy proposed in [1] to develop a new analysis/
synthesis technique characterized by the amplitudes, frequencies and phases of the speech sine
waves. This model has been shown to produce high-quality recovered speech at low data rates [1]-[12],
where the kth segment (frame) of the input speech is represented as a sum of a finite number of
sinusoidal waves with different amplitudes, frequencies, and phases, such that
s(n) = Σ_{k=1}^{P} A_k · sin(ω_k · n + θ_k)    (1)

where A_k, ω_k, θ_k, and P represent the amplitude, frequency, and phase of the kth sinusoidal wave,
and the number of possible peaks, respectively. It has also been shown that the sinusoidal encoder is
capable of representing both voiced and unvoiced speech frames [1]. In the analysis/synthesis model
and after dividing the original speech signal into small frames, the analysis stage is used to extract
parameters from each speech frame which represent it. The extracted parameters are used at the
synthesis stage to reconstruct the speech frames which should be as close as possible to the original
ones.
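The synthesis side of this model, Eq. (1), can be sketched as a direct sum of sinusoids; the frame length and parameter values below are illustrative assumptions:

```python
import numpy as np

def synthesize_frame(amps, freqs, phases, frame_len):
    """Eq. (1): s(n) = sum_k A_k * sin(w_k * n + theta_k), with the digital
    frequencies w_k given in radians per sample."""
    n = np.arange(frame_len)
    s = np.zeros(frame_len)
    for A, w, th in zip(amps, freqs, phases):
        s += A * np.sin(w * n + th)
    return s

# Two illustrative sinusoids over one 160-sample (20 ms at 8 kHz) frame.
frame = synthesize_frame([0.5, 0.25], [0.1 * np.pi, 0.3 * np.pi], [0.0, np.pi / 4], 160)
```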
2.2. Encoder stage
The encoder processes the speech signal and converts it to a set of parameters, before
quantizing them in order to transmit the resulting binary bits along the digital channel. In the
proposed technique, we focus on minimizing the overall bit rate required to represent the speech
signal while maintaining the perceptual quality of the reconstructed speech. First, the speech is
sampled at 8 kHz and divided into main frames. Afterward, the main frames are categorized by their
energies into voiced and unvoiced frames, so that an unvoiced frame is assigned fewer peaks than a
voiced frame.
In addition to that, each of the voiced main frames is further divided into N sub-frames which
are also classified according to their energies, so that the sub-frame with higher energy gets more
peaks than that with lower energy. The purpose of these classifications is to extract the best
parameters which represent speech frames to achieve low bit rate and good quality for the
reconstructed speech. The two parts of the proposed encoder stage are explained in the following
subsections.
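A minimal sketch of this energy-based classification and peak budgeting might look as follows; the threshold value and the linear rank-to-budget mapping are hypothetical, since the paper does not specify them:

```python
import numpy as np

def frame_energy(x):
    return float(np.sum(np.asarray(x, dtype=float) ** 2))

def classify_frames(frames, energy_threshold):
    """Label each main frame voiced or unvoiced by its energy."""
    return ["voiced" if frame_energy(f) > energy_threshold else "unvoiced" for f in frames]

def peak_budget(subframes, min_peaks=2, max_peaks=8):
    """Assign higher-energy sub-frames a larger peak budget via a linear ranking."""
    energies = np.array([frame_energy(sf) for sf in subframes])
    ranks = np.argsort(np.argsort(energies))  # rank 0 = lowest energy
    span = max(len(subframes) - 1, 1)
    return (min_peaks + np.round(ranks / span * (max_peaks - min_peaks))).astype(int)

subframes = [np.full(80, a) for a in (0.05, 0.2, 0.6)]
labels = classify_frames(subframes, energy_threshold=1.0)
budgets = peak_budget(subframes)
```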
2.2.1 Peak-picking strategy
In order to make the speech signal wide sense stationary, the length of each main frame
should be small enough. In the proposed technique, the encoder divides the speech signal into (20 to
40 ms) main frames and then transforms them into the frequency domain using the fast Fourier
transform (FFT) technique. A crucial part in a sinusoidal modeling system is peak detection since
the speech is reconstructed at the decoder using the detected peaks only. There are fundamental
problems in the estimation of the meaningful peaks and their corresponding parameters. Most of
these problems are related to the length of the analysis window where a short window is required to
follow rapid changes in the input signal and a long window is needed to estimate accurate
frequencies of the sinusoidal waves or to distinguish spectrally close sinusoids from each other. It
is worth mentioning that a Hanning window is used in the analysis stage, since its very good side-lobe
structure improves the speech quality.
In almost all sinusoidal analysis systems, peak detection and parameter estimation are
performed in the frequency domain. This is natural, since each stable sinusoid corresponds to an
impulse in the frequency domain; natural sounds, however, are not composed of infinite-duration
stable sinusoids. The simplest technique for extracting the sinusoidal waves of a speech signal is to
choose a large number of local maxima in the magnitude of the STFT, where a peak or local maximum in
the magnitude of the STFT indicates the presence of a sinusoidal wave. This method, often used in
audio coding applications, is very fast and produces a fixed bit rate. However, to achieve a low bit
rate, a small number of sinusoids should be chosen. A natural improvement of this technique is to use
a threshold for peak detection, where all local maxima of the STFT amplitudes above the threshold
are interpreted as sinusoidal peaks.
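A sketch of such threshold-based peak picking on the STFT magnitude, assuming a simple slope-change test for local maxima:

```python
import numpy as np

def pick_peaks(mag, threshold):
    """Return the bins where the STFT magnitude has a local maximum above the
    threshold, i.e. where the spectral slope changes from positive to negative."""
    return [i for i in range(1, len(mag) - 1)
            if mag[i] > threshold and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]]

# A pure tone with 8 cycles per 64-sample frame peaks at bin 8 of the spectrum.
tone = np.sin(2 * np.pi * 8 * np.arange(64) / 64)
mag = np.abs(np.fft.rfft(tone))
```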
In the proposed technique, the original speech is divided into main frames, each of which is
further divided into 6 sub-frames. The peaks are selected by finding the locations where the spectral
slope changes from positive to negative. A more accurate technique fits a parabola to each peak and
encodes the location of its vertex as the peak frequency. Usually, around eighty peaks are obtained
after this step. The obtained peaks are further reduced by the proposed reduction techniques,
described later, without significant loss of perceptual information. The amplitude spectrum is
illustrated in Fig. 1.
Fig. 1: Amplitude Spectral Domain of a Voiced Frame
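The parabola fit mentioned above is commonly realized as a three-point quadratic interpolation around each detected bin; a sketch (the exact fit used by the authors is not specified):

```python
def parabolic_peak(mag, i):
    """Fit a parabola through the three magnitude samples around bin i and
    return the fractional bin location of the vertex (standard three-point fit)."""
    a, b, c = mag[i - 1], mag[i], mag[i + 1]
    return i + 0.5 * (a - c) / (a - 2 * b + c)
```

For a symmetric peak the vertex stays at the center bin; an asymmetric neighborhood shifts it toward the larger neighbor.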
After performing the proposed reduction techniques, we extract the frequency locations
corresponding to the detected peaks as well as the significant phases. The last step is to quantize
them before transmitting them to the receiver.
2.2.2. Parameters optimization
In our proposed technique, the encoding of the speech frames is based on selecting the most
important peaks rather than encoding all peaks, by dividing each frame into sub-frames and making
proper classifications. The block diagram of our new encoder model is shown in Fig. 2 (a and b). In
this model, the original speech is divided in the time domain into main frames. After that, we
classify these main frames into voiced and unvoiced frames using an energy threshold, where the
energy of voiced frames is above this threshold value while the energy of unvoiced frames is below
it. If a main frame is voiced, it is divided into N sub-frames. Afterward, we classify the sub-frames
by energy, so that a sub-frame with higher energy gets more peaks than one with lower energy. If the
main frame is unvoiced, the same procedure is applied, but there is no energy classification and all
sub-frames receive the same number of peaks, namely the number chosen for the lowest-energy sub-frame
in a voiced frame. The purpose of dividing the main frames into N sub-frames and making the voiced
and unvoiced classification is to choose the best peaks in these sub-frames, enabling us to achieve a
low bit rate and a good quality for the reconstructed speech.
The parameter reduction is one of the most important parts of this model, since most errors
occur in this stage. The aim of this part is to reduce the number of parameters describing each main
frame to (15-30) parameters. In addition to the preceding reduction technique, a further reduction of
information comes from the quantization process, which justifies our main concern with this topic.
Hence, after classifying the frames and dividing them into sub-frames, the following three encoding
techniques are proposed to reduce the number of parameters.
A. Peak reduction,
B. Phase reduction,
C. Threshold reduction.
Fig. 2: (a) The Encoder Stage, (b) Parameter Extraction and Reduction Stage
A. Peak reduction technique
This technique is based on selecting the best N sinusoidal waves in each speech frame. The
value of N depends on the required data rate. The following encoding procedure summarizes this
technique:
[Fig. 2 block labels: (a) Segmentation and Voiced/Unvoiced Classification → Segmentation into Sub-frames → Energy Classification → Parameter Extraction and Reduction → Parameter Encoding → Channel Coding; (b) STFT → |·| and arctan → Parameter Reduction (Peak, Phase, and Threshold Methods) → Amplitude, Frequency, and Phase Coding → Quantization]
1. Select the largest peaks for each sub-frame, after converting it to the frequency domain.
2. If a group of peaks are close enough to each other, choose the largest peak to represent them.
It should be noted that after this step the speech signal still has very good quality, which
encourages us to proceed to the second reduction technique.
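A sketch of this peak reduction; the minimum frequency spacing used to decide whether peaks are "close enough" is a hypothetical parameter:

```python
import numpy as np

def reduce_peaks(amps, freqs, n_keep, min_sep_hz=50.0):
    """Keep up to n_keep of the largest peaks; when peaks lie closer than
    min_sep_hz (hypothetical spacing), only the largest of the cluster survives."""
    order = np.argsort(amps)[::-1]  # indices, largest amplitude first
    kept_a, kept_f = [], []
    for i in order:
        if all(abs(freqs[i] - f) >= min_sep_hz for f in kept_f):
            kept_a.append(amps[i])
            kept_f.append(freqs[i])
        if len(kept_f) == n_keep:
            break
    return kept_a, kept_f

best_amps, best_freqs = reduce_peaks([0.9, 0.8, 0.2, 0.1],
                                     [100.0, 120.0, 400.0, 900.0], n_keep=3)
```

Here the 0.8-amplitude peak at 120 Hz is absorbed by the larger 100 Hz peak next to it.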
B. Phase reduction technique
This type of reduction aims to reduce the number of phase parameters, and it can be performed
after determining whether the sub-frame is voiced or unvoiced, where a voiced frame has the following
characteristics:
• Its energy is greater than a preset threshold.
• Its zero crossing is less than that of the unvoiced (also less than a preset threshold value).
• It has a specific pitch value.
Note that the first of these criteria is sufficient on its own and minimizes the overall
complexity; therefore, we rely on it in the binary decision process. If the frame is voiced, i.e., it
has a large embedded energy, the encoder extracts its phases. Otherwise, the frame is considered
unvoiced and, in this case, its phases are estimated using the phase extraction equations proposed by
McAulay and Quatieri in [2], [7] or Ahmadi and Spanias in [3], [4]. Once this procedure is performed,
the number of phases is reduced with only a modest effect on speech quality; since the human ear is
less sensitive to phase distortion, the elimination is justified.
C. Threshold reduction technique
This technique is considered the most efficient of the reduction techniques described
previously, in the sense that it reduces the number of peaks without affecting perceptual quality. It
chooses a very small threshold value, so that all peaks below this value are eliminated. By doing
this, not only the number of amplitudes but also the corresponding numbers of frequency locations and
phases are reduced. Thus, this reduction technique reduces the total data rate required for
transmission and enhances the recovered speech frames by filtering out the peaks of the noise signal
whose amplitudes are below the threshold value. This filtration is therefore advantageous in itself.
On the other hand, increasing the threshold above a certain value corrupts the speech frame because
important informational peaks are filtered out. Therefore, the threshold value should be chosen based
on an exhaustive statistical study to confirm the optimal value.
After performing these reduction techniques, we end up having S amplitudes and frequencies plus
(0.5 S) phases. In other words, we have: S peaks plus S frequency locations plus (0.5 S) phases for
each main frame. In this paper, we use 6 bits for each amplitude and frequency location and 4 bits for
each phase.
Thus, the required data rate for each frame = (6 S + 6 S + 4 (0.5 S) ) = 14 S bits/frame. The
total data rate R can be computed as:
R = 14 S (bits/frame) * N (frame/s) = 14 N S bps.
Some extra bits can also be used for control and for error detection and correction. At this
point, we turn to the quantization process, which is of equal importance.
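The bit-rate arithmetic above can be checked with a few lines; the example values of S and N are hypothetical:

```python
def bit_rate(num_peaks, frames_per_second):
    """Per the text: 6 bits per amplitude, 6 bits per frequency location, and
    4 bits for each of the 0.5*S retained phases -> 14*S bits per frame."""
    bits_per_frame = 6 * num_peaks + 6 * num_peaks + 4 * (0.5 * num_peaks)
    return int(bits_per_frame * frames_per_second)

# e.g. S = 20 peaks per frame at N = 50 frames/s gives 14 * 20 * 50 bps.
```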
2.2.3. Modeling and encoding technique
The quantization process divides the dynamic range of the signal into a number of levels. The
number of levels is determined from the formula L = 2^k, where k is the word size and L is the number
of distinct words. We assign each level a specific word after rounding the sample to the nearest
level. This kind of quantization is called PCM.
In our model, the different techniques described in the next subsections are used to encode phase,
frequency location, and amplitude of each sinusoid.
A. Sinusoidal phase modeling and encoding
The bits used to quantize the phases can be reduced by minimizing their entropy. In order to
minimize the entropy of the phases, the encoder predicts differentially a phase from its past value
and encodes the phase difference rather than the phase itself which has less entropy than the actual
phase [3]. The differentially predicted phase is given by
θ̂_l^k = θ_l^{k-1} + ω_l^{k-1} · T,   l = 1, 2, ..., L,    (2)

where the superscript k denotes the frame number, ω_l^k is the frequency of the l-th sinusoid, T is
the time interval between frames, and L is the number of sinusoidal components. The phase differences
or residues are expressed as

Δθ_l^k = θ_l^k − θ̂_l^k,   l = 1, 2, ..., L,    (3)

where the actual phase θ_l^k is used to compute the phase difference Δθ_l^k.
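A sketch of the differential phase prediction of Eqs. (2)-(3); wrapping the residual to (-π, π] is our addition, commonly done so that the residual stays small for slowly varying phases:

```python
import numpy as np

def phase_residual(theta_prev, omega_prev, theta_curr, T):
    """Eqs. (2)-(3): predict the current phase from the previous frame's phase
    and frequency, then return the prediction residual, wrapped to (-pi, pi]."""
    theta_hat = theta_prev + omega_prev * T   # Eq. (2)
    residual = theta_curr - theta_hat         # Eq. (3)
    return float(np.angle(np.exp(1j * residual)))

res = phase_residual(0.0, 1.0, 0.25, 0.2)  # predicted 0.2, actual 0.25
```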
B. Sinusoidal frequency encoding
After transforming the speech frames into the frequency domain using the STFT strategy, the
frequency location indices are integer values; e.g., in Matlab, the spectrum has 512 points over both
sides. Taking one side (256 points), which represents the frequencies contained within one frame, the
frequency locations run from 1 to 256, corresponding to the frequency range from 0 to 4000 Hz, where
(4) is used to obtain the frequency:

frequency = (location − 1) · 4000 / frame size    (4)
The minimum number of bits normally required to encode each frequency location is 8; in this
model, however, only 6 bits per location are used and the results are almost identical. In the
proposed model, the first frequency locations represent low frequency components and the last
frequency locations represent high frequency components. Hence, we do not need to spend the same
number of bits on each frequency location: the higher frequency locations correspond to high
frequency components, which have less effect on speech perception. Therefore, higher frequency
locations can be quantized using fewer bits than lower frequency locations. This reduces the bit rate
while keeping the speech quality almost the same. Hence, to implement this idea, we developed the
following procedure:
1. Divide the frequency locations by the STFT size to normalize the frequency location vector,
obtaining (f_n).
2. The normalized frequency location vector is transformed into another domain (u_n) to reduce the
number of bits used to encode each frequency location, where (u_n) is given by

u_n = 64 · [log_e(1 + 4 · f_n) − 0.072] / 1.528    (5)

After calculating u_n, we obtain values within the range (1-64). Note that equation (5) is similar to
the µ-law used in digital signal processing to compress the speech signal.
3. Round the result and then convert the resulting value to binary.
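The companding of Eq. (5) and its inverse, Eq. (9), can be sketched as below; the constants follow our reading of the two equations (note that 1.528/64 = 0.023875, which makes the pair mutually inverse):

```python
import math

def compand_location(f_n):
    """Forward mapping of Eq. (5): log-compress a normalized frequency
    location into the (1-64) range used for 6-bit coding."""
    return 64.0 * (math.log(1.0 + 4.0 * f_n) - 0.072) / 1.528

def expand_location(u_n):
    """Inverse mapping of Eq. (9); note 1.528 / 64 = 0.023875."""
    return (math.exp(0.072 + 0.023875 * u_n) - 1.0) / 4.0
```

Before rounding, expanding a companded location recovers it exactly; rounding to an integer code introduces at most half a step of error.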
C. Sinusoidal amplitude encoding
This technique is also important, since the amplitude is sensitive to any change introduced by
the quantization process. Therefore, we propose an encoding technique that increases the resolution
by a factor of (6-12) over the resolution of PCM. Let us assume that we have the amplitudes
x(n) = [amp1, amp2, …, ampN], where N is the number of considered peaks; then the proposed
encoding technique is summarized as follows:
1. Take log_2(x_n) of each amplitude in order to reduce the dynamic range.
2. The results of the first step are all negative, since all involved amplitudes are less than unity.
3. The resulting dynamic range from the previous two steps is (-1, -20), because the lowest
amplitude is 10^-6, which is our predetermined threshold.
4. Take the absolute value of the results and then multiply them by β, where β is chosen to be 3
to make the dynamic range (1-64). Then, extract the values a_n using (6).
5. Sort the amplitudes a_n in ascending order together with the associated phases and frequencies
as a bundle. This step is justified because there is only a small difference between successive
amplitudes in the same frame.
6. Take the integer part (floor) of the first amplitude a_0 and convert it to binary (q_0).
7. Subtract the value found in step 6 (q_0) from all other amplitudes a_n.
8. Multiply the next amplitude a_n by a number α in the range (6-12).
9. Floor the value found in the previous step (q_i, where i = 1, …, N-1).
10. Convert the result to binary.
11. Subtract the output of step 9 divided by α from all remaining a_n's.
12. Repeat steps (8-11) until all a_n's are processed.
The general equations that represent the amplitude quantization are given by

a_n = β · |log_2(x(n))|    (6)

q_0 = Floor[a_0]    (7)

q_n = Floor[ α · ( a_n − q_0 − (1/α) · Σ_{i=1}^{n−1} q_i ) ]    (8)
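The encoding steps 1-12 and Eqs. (6)-(8), together with the decoder's Eqs. (10)-(11), can be sketched as follows, with α = 8 and β = 3 chosen from the ranges suggested in the text; the binary conversion steps are omitted:

```python
import math

ALPHA, BETA = 8, 3  # alpha picked from the (6-12) range; beta = 3 as in the text

def encode_amplitudes(x):
    """Eqs. (6)-(8): log-compress the amplitudes, sort ascending, encode the
    first value coarsely, then each difference with resolution 1/ALPHA."""
    a = sorted(BETA * abs(math.log2(v)) for v in x)   # Eq. (6), sorted (step 5)
    q = [math.floor(a[0])]                            # Eq. (7): q_0
    acc = float(q[0])                                 # running decoded value
    for an in a[1:]:
        qn = math.floor(ALPHA * (an - acc))           # Eq. (8)
        q.append(qn)
        acc += qn / ALPHA
    return q

def decode_amplitudes(q):
    """Eq. (10): accumulate the differences, then Eq. (11): undo the log."""
    d = float(q[0])
    out = [2 ** (-d / BETA)]
    for qn in q[1:]:
        d += qn / ALPHA
        out.append(2 ** (-d / BETA))
    return out

q = encode_amplitudes([0.5, 0.1, 0.25])
y = decode_amplitudes(q)  # amplitudes come back sorted from largest to smallest
```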
2.3. Decoder stage
The decoder is used to reconstruct the original signal by decoding the parameters extracted in
the encoder stage as shown in Fig. 3. These parameters are then used to reconstruct the speech
frames by linearly summing the sine waves of different amplitudes, frequencies, and phases.
2.3.1. Decoding strategy
This strategy converts the received binary representation of the parameters to a decimal form.
Three decoding techniques for the amplitudes, frequencies, and phases are required to recover them.
The reconstructed parameters should be as similar as possible to the original ones.
A. Phase decoding technique
This process can be summarized as follows:
1. Dequantize the received binary bits corresponding to the phase differences.
2. Predict the phases from their past values using equation (2).
3. Add the estimated phase found in the previous step to the phase difference found in step 1.
B. Frequency decoding technique
1. Convert the received binary bits to decimal form û_n.
2. The estimated normalized frequency location vector (z_n) is reconstructed from û_n using
equation (9), which is the inverse of equation (5):

z_n = [exp(0.072 + 0.023875 · û_n) − 1] / 4    (9)

3. Round z_n.
C. Amplitude decoding technique
Convert the binary signal to [q_0, q_1, …, q_N]; then we find

d_0 = q_0
d_1 = d_0 + q_1/α
d_2 = d_0 + q_1/α + q_2/α
...
d_n = q_0 + (1/α) · Σ_{i=1}^{n} q_i    (10)

where n+1 is the number of considered peaks. Note that after performing this step, the maximum error
occurs at n = 0; however, this error is very small. To reconstruct the signal parameter amplitudes
(y_n), we use the following equation:

y_n = 2^(−d_n/β)    (11)
Fig. 3: The Decoder Stage
3. ADVANTAGES OF THE PROPOSED SPEECH CODING TECHNIQUE
From the previously described sections, we can conclude that the proposed speech coding technique
• Enjoys a very efficient and effective encoding and decoding procedure.
• Gives a reconstructed speech signal with high quality.
• Reduces the data rate to (3.6-8) kbps.
• Enhances the original signal when the received speech signal is corrupted by additive noise.
• Does not depend on the pitch (the fundamental frequency).
• Can be considered noise immune.
• Reduces the total required transmitted power due to minimizing the required bit rate.
• Allows error detection and correction procedures.
4. EXPERIMENTAL RESULTS
1. From the literature, it is advised to use a window size equal to 2.5 times the average pitch
period; therefore, the size of the main frame is between (20-40) ms. The overlap-and-add
percentage is 33.3% at the transmitter, and the FFT size is equal to 512 points.
2. After an exhaustive statistical study, the threshold value used in Sec. 2.2.2-C is selected to be
less than 10^-6. As explained in Sec. 2.2.2-C, this step reduces the total number of peaks.
3. A Hamming window is employed.
4. The data rate of the proposed technique is between 3.6 kbps and slightly less than 8 kbps. We
remark that for high-quality speech, the data rate is less than 8 kbps, where the remaining bits
can be used for control and for error detection and correction.
5. At the decoder, we perform an overlap-and-add with a percentage equal to 50% to eliminate
discontinuity of the received speech.
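The 50% overlap-and-add at the decoder can be sketched as follows; the Hanning window used here mirrors the analysis stage and is an assumption for this sketch:

```python
import numpy as np

def overlap_add(frames, hop):
    """Window each synthesis frame and overlap-add it at the given hop size;
    hop = frame_len // 2 corresponds to the 50% overlap used at the decoder."""
    frame_len = len(frames[0])
    win = np.hanning(frame_len)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, f in enumerate(frames):
        out[i * hop : i * hop + frame_len] += win * f
    return out

signal = overlap_add([np.ones(8)] * 3, hop=4)  # 3 frames of length 8, 50% overlap
```

The tapered window endpoints ensure the stitched frames meet without audible discontinuities.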
5. CONCLUSIONS
In this research, we propose a computationally efficient low bit rate speech coding technique
based on the sinusoidal model with an efficient speech enhancer. The proposed technique can
reconstruct the transmitted speech signal at the decoder with good quality and intelligibility, even
if it is corrupted by thermal noise, at bit rates from 3.6 to 8 kbps. In our speech coding technique,
we propose novel encoding techniques to minimize the total number of parameters extracted from the
frequency domain, i.e., amplitudes, frequency locations, and phases. The most significant one is the
threshold technique, which not only reduces the number of parameters but also enhances the recovered
speech signal. After that, we introduced new techniques, i.e., phase coding, amplitude coding, and
frequency coding, to model and encode these parameters efficiently.
REFERENCES
1. A. Spanias, "Speech Coding: A Tutorial Review," Proc. of the IEEE, Vol. 82, No. 10,
pp. 1541-1582, Oct. 1994.
2. R.J. McAulay and T.F. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal
Representation," IEEE Trans. On ASSP, Vol. ASSP-34, No. 4, pp. 744-754, August 1986.
3. Sassan Ahmadi and Andreas S. Spanias, "New Techniques for Sinusoidal Coding of Speech at
2400 bps", Proc. Asilomar-96, Nov. 3-6, 1996, Pacific Grove, CA.
4. Sassan Ahmadi and Andreas Spanias, "Low-Bit Rate Speech Coding Based on Harmonic Sinusoidal
Models", in Proc. International Symposium on Digital Signal Processing (ISDSP), pp. 165-170,
July 1996.
5. Remy Boyer and Julie Rosier, "Iterative Method for Harmonic and Exponentially Sinusoidal
Models", Proc. of the 5th Int. Conference on Digital Audio Effects (DAFx-02), Hamburg,
Germany, September 26-28, 2002.
6. E. B. George and M. J. T. Smith. Speech analysis/synthesis and modification using an
analysis-by-synthesis/overlap-add sinusoidal model. IEEE Trans. Speech and Audio Proc.,
Vol.5, Number 5, pp.389–406, September 1997.
7. Robert J. McAulay and Thomas F. Quatieri, “Processing of Acoustic Waveforms,” United
States Patent, Dec. 28, 1999, Patent No.:Re.36, 478, Assignee: Massachusetts Institute of
Technology, Cambridge, Mass.
8. K. Vos, R. Vafin, R. Heusdens, and W. B. Kleijn, “High quality consistent analysis-synthesis
in sinusoidal coding”, in Proc. AES 17th Int. Conf., ’High-Quality Audio Coding’,
pp. 244 – 250, 1999.
9. Izmirli, O., “Non-harmonic Sinusoidal Modeling Synthesis Using Short-time High-resolution
Parameter Analysis” Proceedings of the COST G-6 Conference on Digital Audio Effects
(DAFX-00), Verona, Italy, December 7-9, 2000.
10. Harald Pobloth, Renat Vafin, and W. Bastiaan Kleijn, '' Polar Quantization of Sinusoids from
Speech Signal Blocks", EUROSPEECH 2003 – Geneva.
11. Mathieu Lagrange, Sylvain Marchand and Jean Bernard Rault, "Sinusoidal Parameter Extraction
and Component Selection in a Non Stationary Model", Proc. of the 5th Int. Conference on
Digital Audio Effects (DAFx-02), Hamburg, Germany, September 26-28, 2002.
12. Ibrahim Mansour and Samer J. Alabed, "Using Sinusoidal Model to Implement Sinusoidal Speech
Coder with Speech Enhancer", The 6th International Electrical and Electronics Engineering
Conference (JIEEEC), Volume 1, pp. 1-8, March 2006.
13. Kang Sangwon, Shin Yongwon, and Fischer Thomas. (2004). "Low-Complexity Predictive
Trellis-Coded Quantization of Speech Line Spectral Frequencies". IEEE Transactions on
Signal Processing, Vol. 52, No. 7.
14. P. Alku and T. Bäckström, "Linear Predictive Method for Improved Spectral Modeling of Lower Frequencies of Speech With Small Prediction Orders", IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 2, 2004.
15. B. Atal, "Predictive Coding of Speech at Low Bit Rates", IEEE Transactions on Communications, Vol. COM-30, No. 4, pp. 600-614, 1982.
16. A. C. den Brinker, V. Voitishchuk, and S. J. L. van Eijndhoven, "IIR-Based Pure Linear Prediction", IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 1, 2004.
17. P. Papamichalis, "Practical Approaches to Speech Coding", Prentice Hall, 1987.
18. A. Härmä, "Linear Predictive Coding With Modified Filter Structures", IEEE Transactions on Speech and Audio Processing, Vol. 9, No. 8, 2001.
19. H.-T. Hu and H.-T. Wu, "A Glottal-Excited Linear Prediction (GELP) Model for Low-Bit-Rate Speech Coding", Proc. Natl. Sci. Counc. ROC(A), Vol. 24, pp. 134-142, 2000.
20. P. N. Sudha and U. Eranna, "Source and Adaptive Channel Coding Techniques for Wireless Communication", International Journal of Electronics and Communication Engineering & Technology (IJECET), Volume 3, Issue 3, 2012, pp. 314-323, ISSN Print: 0976-6464, ISSN Online: 0976-6472.
21. P. Mahalakshmi and M. R. Reddy, "Speech Processing Strategies for Cochlear Prostheses - The Past, Present and Future: A Tutorial Review", International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 3, Issue 2, 2012, pp. 197-206, ISSN Print: 0976-6480, ISSN Online: 0976-6499.
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 

40120140504002

1. INTRODUCTION

Due to the redundancy present in speech signals, speech coding, used to compress speech, is one of the most important speech processing operations. Speech coding, or compression, deals with obtaining a compact representation of speech signals for efficient digital storage or transmission, i.e., with reducing the bit rate required to represent speech while preserving the quality of the speech reconstructed from that representation. Hence, the main objective of speech coding techniques is to
represent the speech signal with a minimum number of bits while maintaining its quality. Furthermore, speech coding techniques are used to improve bandwidth utilization and power efficiency in applications such as digital telephony, multimedia, and secure digital communications, which require the speech signal to be in digital format to facilitate its processing, storage, and transmission. Although digital speech brings flexibility and opportunities for encryption, it is also associated, when uncompressed, with a high data rate and, hence, high transmission bandwidth and storage requirements. In wired communications, very large transmission bandwidths are now available; in wireless and satellite communications, however, transmission bandwidth is limited. Therefore, reducing the bit rate is necessary to reduce the required transmission bandwidth and memory storage. To reduce the bit rate of a speech signal while preserving its quality, speech coding provides sophisticated techniques that remove redundant and irrelevant information from the speech signal.

There are two categories of speech coding techniques: i) techniques based on linear prediction [1], and ii) techniques based on orthogonal transforms [1-19]. The techniques belonging to the first category are very well known [13-19]; one of them, called regular pulse excitation (RPE), is now used in the GSM standard [1]. The technique proposed and described in detail in this paper belongs to the second category. The encoder (analysis stage) and the decoder (synthesis stage) are the two main components of any speech coding technique.
In the analysis stage, the encoder represents the speech signal in a compact form using a few parameters. The analog speech signal s(t) is first sampled at a rate fs ≥ 2fmax, where fmax is the maximum frequency content of s(t), and the sampled discrete-time signal is denoted by s(n). Afterwards, one of the coding techniques such as pulse code modulation (PCM), differential PCM, or predictive coding is used to encode the signal s(n). In the PCM coding technique, the discrete-time signal s(n) is quantized to one of 2^R levels, where each sample s(n) is represented by R bits. In sinusoidal speech coding [2-9], [12], the encoder takes a group of samples at a time, extracts some parameters from them, and then converts the extracted parameters to binary bits. After that, the binary signal is transmitted to the decoder. In the synthesis stage, the decoder reconstructs the parameters from the received binary bits. Making use of the reconstructed parameters, the decoder can recover the original speech signal.

In the proposed technique, sinusoidal speech coding is used to reduce the required bit rate of a speech signal while maintaining its quality. We first divide the speech signal into sub-frames and make voiced/unvoiced classifications based on their energies. In the analysis stage, after converting each speech frame into the frequency domain using the short-time Fourier transform, all peaks with their associated frequencies and phases are extracted using the peak-picking strategy. In the next stage, novel parameter reduction and quantization techniques, as well as the concept of birth and death tracking of the involved frequencies, are applied to reduce the required bit rate and enhance the quality of the recovered signal.

The layout of this paper is organized as follows: In section two, the implementation of the sinusoidal coder is introduced; this is followed by a discussion of the proposed technique in section three.
The last section presents the experimental results and conclusions.

2. IMPLEMENTATION OF THE SINUSOIDAL CODER

2.1. Analysis-synthesis model

The sinusoidal speech model is a vocoding strategy proposed in [1] to develop a new analysis/synthesis technique characterized by the amplitudes, frequencies, and phases of the component sine waves of speech. This model has been shown to produce high-quality recovered speech at low data rates [1]-[12], where the kth segment (frame) of the input speech is represented as a sum of a finite number of sinusoidal waves with different amplitudes, frequencies, and phases, such that
s(n) = \sum_{k=1}^{P} A_k \sin(\omega_k n + \theta_k)        (1)

where A_k, \omega_k, and \theta_k represent the amplitude, frequency, and phase of the kth sinusoidal wave, respectively, and P is the number of possible peaks. It has also been shown that the sinusoidal encoder is capable of representing both voiced and unvoiced speech frames [1]. In the analysis/synthesis model, after dividing the original speech signal into small frames, the analysis stage extracts from each speech frame the parameters that represent it. The extracted parameters are used at the synthesis stage to reconstruct the speech frames, which should be as close as possible to the original ones.

2.2. Encoder stage

The encoder processes the speech signal and converts it to a set of parameters, then quantizes them in order to transmit the resulting binary bits over the digital channel. In the proposed technique, we focus on minimizing the overall bit rate required to represent the speech signal while maintaining the perceptual quality of the reconstructed speech. First, the speech is sampled at 8 kHz and divided into main frames. Afterward, the main frames are categorized, based on their energies, into voiced and unvoiced frames, so that an unvoiced frame gets fewer peaks than a voiced frame. In addition, each voiced main frame is further divided into N sub-frames, which are also classified according to their energies, so that a sub-frame with higher energy gets more peaks than one with lower energy. The purpose of these classifications is to extract the best parameters representing the speech frames, so as to achieve a low bit rate and good quality for the reconstructed speech. The two parts of the proposed encoder stage are explained in the following subsections.
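As a concrete illustration of the synthesis side of Eq. (1), the following sketch (Python/NumPy; the amplitudes, frequencies, and phases are illustrative values, not taken from a real frame) sums a small set of sinusoids into one frame:

```python
import numpy as np

def synthesize_frame(amps, freqs, phases, n_samples, fs=8000.0):
    """Reconstruct a speech frame as a sum of sinusoids, Eq. (1):
    s(n) = sum_k A_k * sin(w_k * n + theta_k)."""
    n = np.arange(n_samples)
    s = np.zeros(n_samples)
    for A, f, th in zip(amps, freqs, phases):
        w = 2.0 * np.pi * f / fs          # digital frequency in rad/sample
        s += A * np.sin(w * n + th)
    return s

# A 20 ms frame (160 samples at 8 kHz) built from two sinusoids:
frame = synthesize_frame([0.5, 0.25], [200.0, 400.0], [0.0, np.pi / 4], 160)
```

The same routine serves at the decoder, which only ever sees the (quantized) parameter triples.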
2.2.1 Peak-picking strategy

In order to make the speech signal wide-sense stationary, the length of each main frame should be small enough. In the proposed technique, the encoder divides the speech signal into main frames of 20 to 40 ms and then transforms them into the frequency domain using the fast Fourier transform (FFT). A crucial part of a sinusoidal modeling system is peak detection, since the speech is reconstructed at the decoder using the detected peaks only. There are fundamental problems in the estimation of the meaningful peaks and their corresponding parameters. Most of these problems are related to the length of the analysis window: a short window is required to follow rapid changes in the input signal, while a long window is needed to estimate accurate frequencies of the sinusoidal waves or to distinguish spectrally close sinusoids from each other. It is worth mentioning that a Hanning window is used in the analysis stage, since its very good side-lobe structure improves the speech quality.

In almost all sinusoidal analysis systems, peak detection and parameter estimation are performed in the frequency domain. This is natural, since each stable sinusoid corresponds to an impulse in the frequency domain, although natural sounds are not infinite-duration stable sinusoids. The simplest technique for extracting the sinusoidal waves of a speech signal is to choose a large number of local maxima in the magnitude of the STFT, where a peak, or local maximum, in the magnitude of the STFT indicates the presence of a sinusoidal wave. This method, often used in audio coding applications, is very fast and produces a fixed bit rate. However, to achieve a low bit rate, a small number of sinusoids should be chosen. A natural improvement of this technique is to use a threshold for peak detection, where all local maxima of the STFT amplitudes above the threshold are interpreted as sinusoidal peaks.
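The threshold-based peak picking above, together with the parabolic peak refinement used in the next subsection, can be sketched as follows (Python/NumPy; the magnitude spectrum and threshold are illustrative, and the three-point quadratic fit is a common refinement assumed here, since the paper does not spell out its exact fit):

```python
import numpy as np

def pick_peaks(magnitude, threshold):
    """Select spectral peaks: bins where the slope changes from positive
    to negative and the magnitude exceeds the threshold."""
    peaks = []
    for i in range(1, len(magnitude) - 1):
        if magnitude[i - 1] < magnitude[i] >= magnitude[i + 1] \
                and magnitude[i] > threshold:
            peaks.append(i)
    return peaks

def parabolic_offset(y_left, y_peak, y_right):
    """Fractional-bin refinement of a peak location by fitting a parabola
    through the peak bin and its two neighbours."""
    denom = y_left - 2.0 * y_peak + y_right
    return 0.0 if denom == 0.0 else 0.5 * (y_left - y_right) / denom

mag = np.array([0.0, 0.2, 1.0, 0.3, 0.05, 0.4, 0.1])
peaks = pick_peaks(mag, 0.25)                      # -> bins 2 and 5
offset = parabolic_offset(mag[1], mag[2], mag[3])  # fractional shift of bin 2
```

Raising the threshold trades peaks (and hence bits) against reconstruction fidelity, which is exactly the lever the reduction techniques below exploit.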
In the proposed technique, the original speech is divided into main frames, each of which is further divided into 6 sub-frames. The peaks are selected by finding the locations where the spectral slope changes from positive to negative. A more accurate technique fits a parabola to each peak and encodes the location of its vertex as the peak frequency. Usually, around eighty peaks are obtained after performing this step. The obtained peaks are further reduced by the proposed reduction techniques, described later, without significant loss of perceptual information. The amplitude spectrum is illustrated in Fig. 1.

Fig. 1: Amplitude Spectral Domain of a Voiced Frame

After performing the proposed reduction techniques, we extract the frequency locations corresponding to the detected peaks as well as the significant phases. The last step is to quantize them before transmitting them to the receiver.

2.2.2. Parameter optimization

In our proposed technique, the encoding of the speech frames is based on selecting the most important peaks, rather than encoding all peaks, by dividing each frame into sub-frames and making the proper classifications. The block diagram of our new encoder model is shown in Fig. 2 (a and b). In this model, the original speech is divided in the time domain into main frames. After that, we classify these main frames into voiced and unvoiced frames using an energy threshold: the energy of voiced frames is above this threshold value, while the energy of unvoiced frames is below it. If a main frame is voiced, it is divided into N sub-frames. Afterward, we classify the sub-frames by energy, so that a sub-frame with higher energy gets more peaks than one with lower energy.
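The energy-based voiced/unvoiced decision above can be sketched as follows (Python/NumPy; the frame contents and the threshold value are illustrative):

```python
import numpy as np

def classify_frames(frames, energy_threshold):
    """Label each main frame voiced ('V') or unvoiced ('U') by comparing
    its energy, sum(s[n]^2), against a preset threshold."""
    return ['V' if float(np.sum(np.asarray(f) ** 2)) > energy_threshold else 'U'
            for f in frames]

# A loud frame and a quiet frame of 160 samples each:
frames = [0.5 * np.ones(160), 0.01 * np.ones(160)]
labels = classify_frames(frames, 1.0)  # ['V', 'U']
```

The same energy measure, applied per sub-frame within a voiced frame, drives how many peaks each sub-frame is allotted.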
If the main frame is unvoiced, the same procedure is applied, but there is no energy classification and all sub-frames get the same number of peaks, namely the number chosen for the lowest-energy sub-frame of a voiced frame. The purpose of dividing the main frames into N sub-frames and making the voiced/unvoiced classification is to choose the best peaks in these sub-frames, enabling us to achieve a low bit rate and good quality for the reconstructed speech.

Parameter reduction is one of the most important parts of this model, since most errors occur in this stage. The aim of this part is to reduce the number of parameters describing each main frame to 15-30 parameters. In addition to the preceding reduction technique, a further reduction of information comes from the quantization process, which justifies our main concern with this topic. Hence, after classifying the frames and dividing them into sub-frames, the following three encoding techniques are proposed to reduce the number of parameters:

A. Peak reduction,
B. Phase reduction,
C. Threshold reduction.

Fig. 2: (a) The Encoder Stage, (b) Parameter Extraction and Reduction Stage

A. Peak reduction technique

This technique is based on selecting the best N sinusoidal waves in each speech frame. The value of N depends on the required data rate. The following encoding procedure summarizes this technique:
1. Select the largest peaks in each sub-frame, after converting it to the frequency domain.
2. If a group of peaks is close enough together, choose the largest peak to represent the group.

It should be noted that after this step the speech signal still has very good quality, which encourages going forward to the second reduction technique.

B. Phase reduction technique

This type of reduction aims to reduce the phase parameters, which can be done after determining whether the sub-frame is voiced or unvoiced, where a voiced frame has the following characteristics:

• Its energy is greater than a preset threshold.
• Its zero-crossing rate is less than that of an unvoiced frame (and also less than a preset threshold value).
• It has a specific pitch value.

Note that the first of these criteria is sufficient by itself and keeps the overall complexity low; therefore, we rely on it in the binary decision process. If the frame is voiced, i.e., it has a large embedded energy, the encoder extracts its phases. Otherwise, the frame is considered unvoiced and, in this case, its phases are estimated using the phase extraction equations proposed by McAulay and Quatieri in [2], [7] or Ahmadi and Spanias in [3], [4]. Once this procedure is performed, the number of phases is reduced with only a modest effect on speech quality; since the human ear is less sensitive to phase distortion, the elimination is justified.

C. Threshold reduction technique

This technique is considered the most efficient of all the reduction techniques described previously, in the sense that it reduces the number of peaks without affecting the perceptual quality of the voice. It chooses a very small threshold value, so that all peaks below this value are eliminated.
By doing this, not only the number of amplitudes but also the corresponding numbers of frequency locations and phases are reduced. Thus, this reduction technique reduces the total data rate required for transmission and enhances the recovered speech frames by filtering out the peaks of the noise signal whose amplitudes are below the threshold value, so this filtering is doubly advantageous. On the other hand, increasing the threshold above a certain value produces a corrupted speech frame, because important informational peaks are filtered out. Therefore, the threshold value should be chosen based on an exhaustive statistical study to confirm the optimal value.

After performing these reduction techniques, we end up with S amplitudes and frequencies plus 0.5S phases. In other words, we have S peaks plus S frequency locations plus 0.5S phases for each main frame. In this paper, we use 6 bits for each amplitude and frequency location and 4 bits for each phase. Thus, the required data rate for each frame is (6S + 6S + 4(0.5S)) = 14S bits/frame. The total data rate R can be computed as R = 14S (bits/frame) × N (frames/s) = 14NS bps. Some extra bits can also be used for control and for error detection and correction. At this point, we turn to the quantization process, which is of equal importance.
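The bit-rate arithmetic above (6 bits per amplitude and per frequency location, 4 bits per phase, 0.5S phases) can be checked with a small sketch (Python; the peak count and frame rate are illustrative):

```python
def frame_bits(S, amp_bits=6, freq_bits=6, phase_bits=4):
    """Bits per main frame for S retained peaks: S amplitudes, S frequency
    locations, and 0.5*S phases -> 14*S with the paper's bit widths."""
    return S * amp_bits + S * freq_bits + int(0.5 * S) * phase_bits

def total_rate_bps(S, frames_per_second):
    """Total data rate R = 14*S*N bps (before any control/FEC bits)."""
    return frame_bits(S) * frames_per_second

rate = total_rate_bps(20, 50)  # 20 peaks per frame at 50 frames/s -> 14000 bps
```

This makes explicit how the threshold reduction, by shrinking S, scales the whole rate linearly.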
2.2.3. Modeling and encoding technique

Quantization is the process in which the dynamic range of the signal is divided into a number of levels. The number of levels is determined from the formula L = 2^k, where k is the word size and L is the number of distinct words. We assign each level to a specific word after rounding the sample to the nearest level. This kind of quantization is called PCM. In our model, the techniques described in the next subsections are used to encode the phase, frequency location, and amplitude of each sinusoid.

A. Sinusoidal phase modeling and encoding

The number of bits used to quantize the phases can be reduced by minimizing their entropy. To do so, the encoder differentially predicts each phase from its past value and encodes the phase difference rather than the phase itself, since the difference has less entropy than the actual phase [3]. The differentially predicted phase is given by

\hat{\theta}_l^k = \theta_l^{k-1} + \omega_l^{k-1} T,   l = 1, 2, ..., L,        (2)

where the superscript k denotes the frame number, \omega_l^k is the frequency of the l-th sinusoid, T is the time interval between frames, and L is the number of sinusoidal components. The phase differences, or residues, are expressed as

\Delta\theta_l^k = \theta_l^k - \hat{\theta}_l^k,   l = 1, 2, ..., L,        (3)

where the actual phase is used to compute the phase difference \Delta\theta_l^k.

B. Sinusoidal frequency encoding

After transforming the speech frames into the frequency domain using the STFT, the frequency location indices are integer values; e.g., in Matlab, the spectrum has 512 points counting both sides.
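The differential phase prediction of Eqs. (2)-(3) can be sketched as follows (Python; the phase and frequency values are illustrative, with omega in rad/s and T in seconds):

```python
def predict_phase(theta_prev, omega_prev, T):
    """Eq. (2): predicted phase = previous phase advanced by the previous
    frequency over the inter-frame interval T."""
    return theta_prev + omega_prev * T

def phase_residue(theta, theta_prev, omega_prev, T):
    """Eq. (3): residue between the actual phase and its prediction; the
    residue has lower entropy than the phase itself, so it codes cheaper."""
    return theta - predict_phase(theta_prev, omega_prev, T)

# A sinusoid whose phase advances exactly as predicted yields a zero residue:
res = phase_residue(theta=1.0 + 0.2 * 0.02, theta_prev=1.0,
                    omega_prev=0.2, T=0.02)
```

Only the residues are quantized and transmitted; the decoder rebuilds each phase by re-running the prediction and adding the received residue.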
By taking one side (256 points), which represents the frequencies contained within one frame, the frequency locations run from 1 to 256, corresponding to the frequency range from 0 to 4000 Hz, where (4) gives the frequency:

frequency = (location - 1) × 4000 / framesize        (4)

The minimum number of bits normally required to encode each frequency location is 8; in this model, however, only 6 bits per location are used and the results are almost the same. In the proposed model, the first frequency location represents the lowest frequency component and the last frequency location represents the highest. Hence, we do not need to spend the same number of bits on each frequency location: higher frequency locations correspond to high-frequency components, which have less effect on speech perception, and can therefore be quantized using fewer bits than lower frequency locations. This reduces the bit rate while keeping the speech quality almost the same. To implement this idea, we developed the following procedure:

1. Divide the frequency locations by the STFT size to normalize the frequency location vector, obtaining (f_n).
2. Transform the normalized frequency location vector to another domain (u_n), to reduce the number of bits used to encode each frequency location, where (u_n) is given by

u_n = [\log_e(1 + 4 f_n) - 0.072] / 1.528 × 64        (5)

After calculating u_n, we obtain values within the range 1-64. Note that equation (5) is similar to the µ-law companding used in digital signal processing to compress speech signals.

3. Round the result and then convert the resulting value to binary.

C. Sinusoidal amplitude encoding

This technique is also important, since the amplitudes are sensitive to any change introduced by the quantization process. We therefore propose an encoding technique that increases the resolution by a factor of 6-12 over the resolution of PCM. Let us assume that we have the amplitudes x(n) = [amp1, amp2, ..., ampN], where N is the number of considered peaks. The proposed encoding technique is summarized as follows:

1. Take log2(x_n) of the amplitudes in order to reduce the dynamic range.
2. The results of the first step are all negative, since all amplitudes involved are less than unity.
3. The resulting dynamic range of the previous two steps is (-1, -20), because the lowest amplitude is 10^-6, which is our predetermined threshold.
4. Take the absolute value of the results and then multiply them by (ß), where the value of (ß) is chosen to be 3 to make the dynamic range 1-64. Then extract the values of a_n using (6).
5. Sort the amplitudes (a_n) in ascending order together with their associated phases and frequencies as a bundle. This step is justified because we note that there is only a small difference between successive amplitudes in the same frame.
6. Take the integer part of the first amplitude a_0 (floor) and convert it to binary (q_0).
7.
Subtract the value found in step 6 (q_0) from all the other amplitudes (a_n).
8. Multiply the next amplitude (a_n) by a number (α) in the range 6-12.
9. Floor the value found in the previous step (q_i, where i = 1, ..., N-1).
10. Convert the result to binary.
11. Subtract the output of step 9, divided by α, from all remaining a_n's.
12. Repeat steps 8-11 until all a_n's are processed.

The general equations that represent the amplitude quantization are given by

a_n = \beta \cdot |\log_2(|x_n|)|        (6)

q_0 = Floor(a_0)        (7)

q_n = Floor\left( \alpha \left[ a_n - q_0 - \frac{1}{\alpha} \sum_{i=1}^{n-1} q_i \right] \right)        (8)
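Equations (6)-(8) can be sketched as follows (Python; α = 8 is an illustrative choice within the paper's 6-12 range, β = 3 as in the paper, and the two input amplitudes are illustrative):

```python
import math

def encode_amplitudes(x, alpha=8, beta=3):
    """Differential floor-quantization of Eqs. (6)-(8):
    a_n = beta*|log2(|x_n|)|           (Eq. 6)
    q_0 = floor(a_0)                   (Eq. 7)
    q_n = floor(alpha*(a_n - q_0 - sum(q_1..q_{n-1})/alpha))   (Eq. 8)"""
    a = sorted(beta * abs(math.log2(abs(v))) for v in x)  # Eq. (6), ascending
    q = [math.floor(a[0])]                                # Eq. (7)
    for n in range(1, len(a)):
        residual = a[n] - q[0] - sum(q[1:n]) / alpha
        q.append(math.floor(alpha * residual))            # Eq. (8)
    return q

codes = encode_amplitudes([0.5, 0.25])  # -> [3, 24]
```

Because successive sorted amplitudes differ only slightly, the residuals in Eq. (8) stay small, which is what makes the α-scaled floor quantization fine-grained.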
2.3. Decoder stage

The decoder reconstructs the original signal by decoding the parameters extracted in the encoder stage, as shown in Fig. 3. These parameters are then used to reconstruct the speech frames by linearly summing sine waves of different amplitudes, frequencies, and phases.

2.3.1. Decoding strategy

This strategy converts the received binary representation of the parameters to decimal form. Three decoding techniques, for the amplitudes, frequencies, and phases, are required to recover them. The reconstructed parameters should be as similar as possible to the original ones.

A. Phase decoding technique

This process can be summarized as follows:
1. Dequantize the received binary bits corresponding to the phase differences.
2. Predict the phases from their past values using equation (2).
3. Add the estimated phase found in the previous step to the phase difference found in step 1.

B. Frequency decoding technique

1. Convert the received binary bits to decimal form, û_n.
2. Reconstruct the estimated normalized frequency location vector (z_n) from û_n using equation (9), which is the inverse of equation (5):

z_n = [\exp(0.072 + 0.023875 \hat{u}_n) - 1] / 4        (9)

3. Round z_n.

C. Amplitude decoding technique

Convert the binary signal to [q_0, q_1, ..., q_N]; then we find

d_0 = q_0
d_1 = q_0 + q_1/\alpha
d_2 = q_0 + (q_1 + q_2)/\alpha
...
d_n = q_0 + \frac{1}{\alpha} \sum_{i=1}^{n} q_i        (10)

where n+1 is the number of considered peaks. Note that after performing this step, the maximum error occurs at n = 0; however, this error is very small. To reconstruct the signal parameter amplitudes (y_n), we use

y_n = 2^{-d_n/\beta}        (11)
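The decoding formulas can be sketched as follows (Python; since 1.528/64 = 0.023875, Eq. (9) exactly inverts the encoder's Eq. (5), and the codes [3, 24] correspond, with the illustrative choices α = 8 and β = 3, to the amplitudes 0.5 and 0.25):

```python
import math

def expand_frequency(u_hat):
    """Eq. (9): recover the normalized frequency location from the
    companded value; exact inverse of Eq. (5) since 1.528/64 = 0.023875."""
    return (math.exp(0.072 + 0.023875 * u_hat) - 1.0) / 4.0

def decode_amplitudes(q, alpha=8, beta=3):
    """Eqs. (10)-(11): accumulate d_n = q_0 + (1/alpha)*sum(q_1..q_n),
    then invert the log compression with y_n = 2**(-d_n/beta)."""
    y = []
    for n in range(len(q)):
        d_n = q[0] + sum(q[1:n + 1]) / alpha   # Eq. (10)
        y.append(2.0 ** (-d_n / beta))         # Eq. (11)
    return y

# Round-trip check of the frequency companding pair, Eq. (5) then Eq. (9):
u = (math.log(1.0 + 4.0 * 0.3) - 0.072) / 1.528 * 64.0
f_hat = expand_frequency(u)            # recovers 0.3 (before any rounding)
amps = decode_amplitudes([3, 24])      # -> [0.5, 0.25]
```

In the full coder the only losses come from rounding u_n and flooring in Eq. (8); the mappings themselves are exact inverses.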
Figure (3): The decoder stage (blocks: dequantization; phase, frequency, and amplitude decoding; sine wave generator; audio amplifier).

3. ADVANTAGES OF THE PROPOSED SPEECH CODING TECHNIQUE

From the previously described sections, we can conclude that the proposed speech coding technique
• Enjoys a very efficient and effective encoding and decoding procedure.
• Reconstructs the speech signal with high quality.
• Reduces the data rate to (3.6-8) kbps.
• Enhances the original signal when the received speech signal is corrupted by additive noise.
• Does not depend on the pitch (the fundamental frequency).
• Can be considered noise immune.
• Reduces the total required transmit power by minimizing the required bit rate.
• Allows error detection and correction procedures.

4. EXPERIMENTAL RESULTS

1. The literature advises using a window size equal to 2.5 times the average pitch period; therefore, the size of the main frame is between (20-40) ms. The overlap-and-add percentage is 33.3% at the transmitter, and the FFT size is equal to 512 points.
2. After an exhaustive statistical study, the threshold value used in Sec. 2.2.2-C is selected to be less than 10^-6. As explained in Sec. 2.2.2-C, this step reduces the total number of peaks.
3. A Hamming window is employed.
4. The data rate of the proposed technique ranges from 3.6 kbps to slightly less than 8 kbps. We remark that for high-quality speech, the data rate is less than 8 kbps, where the remaining bits can be used for control and for error detection and correction.
5. At the decoder, we perform overlap-and-add with a percentage equal to 50% to eliminate discontinuities in the received speech.
5. CONCLUSIONS

In this research, we propose a computationally efficient, low bit rate speech coding technique based on the sinusoidal model with an efficient speech enhancer. The proposed technique can reconstruct the transmitted speech signal at the decoder with good quality and intelligibility, even if it is corrupted by thermal noise, at bit rates from 3.6 to 8 kbps. In our speech coding technique, we propose novel encoding techniques to minimize the total number of parameters extracted from the frequency domain, i.e., amplitudes, frequency locations, and phases. The most significant one is the threshold technique, which not only reduces the number of parameters but also enhances the recovered speech signal. We then introduce new techniques, i.e., phase coding, amplitude coding, and frequency coding, to model and encode these parameters efficiently.