S. Prashantini et al., Int. Journal of Engineering Research and Applications (www.ijera.com)
ISSN: 2248-9622, Vol. 4, Issue 2 (Version 4), February 2014, pp. 53-59

RESEARCH ARTICLE                                                    OPEN ACCESS

A Privacy-Preserving Approach to Analyze Security in a VoIP System
S. Prashantini¹, T. J. Jeyaprabha¹, Dr. G. Sumathi²
¹Dept. of ECE, Sri Venkateswara College of Engineering, Chennai
¹Assistant Professor, Dept. of ECE, Sri Venkateswara College of Engineering, Chennai
²Professor & HOD-IMS, Department of IT, Sri Venkateswara College of Engineering, Chennai
ABSTRACT
Pre-processing of a speech signal serves various purposes in any speech processing application, including noise removal, endpoint detection, pre-emphasis, framing, windowing, and echo cancellation. Among these, silence removal together with endpoint detection is the fundamental step for applications such as speech and speaker recognition. The proposed method uses a multi-layer perceptron along with a hierarchical mixture model to classify the voiced part of a speech signal from the silence/unvoiced part, and shows better endpoint detection as well as silence removal. The study is based on timing-based traffic analysis attacks that can reconstruct the communication on end-to-end VoIP systems by exploiting the reduction or suppression of traffic generation whenever the sender detects voice inactivity. The silence between inter-packet times is removed in order to obtain clear communication. The proposed attacks can detect speakers of encrypted speech communications with high probability, across different codecs. In contrast to traditional traffic analysis attacks, the proposed attacks do not require simultaneous access to one traffic flow of interest at both sides.
Keywords – Speech recognition, silence suppression, passive analysis.

I. INTRODUCTION
Speech recognition (SR)
Speech recognition is the translation of spoken words into text. Some SR systems use "speaker-independent" recognition, while others require "training", in which an individual speaker reads sections of text into the SR system. Such systems analyze the person's specific voice and use it to fine-tune recognition of that person's speech, resulting in more accurate transcription. Systems that do not use training are called "speaker-independent" systems; systems that use training are called "speaker-dependent" systems. Speech recognition applications include voice user interfaces such as voice dialing, call routing, appliance control, and simple data entry.
Speaker recognition is the identification of
the person who is speaking by characteristics of their
voices (voice biometrics), also called voice
recognition. There are two major applications of
speaker recognition technologies and methodologies.
If the speaker claims to be of a certain identity and
the voice is used to verify this claim, this is called
verification or authentication. On the other hand,
identification is the task of determining an unknown
speaker's identity. In a sense, speaker verification is a 1:1 match, where one speaker's voice is matched to one template (also called a "voice print" or "voice model"), whereas speaker identification is a 1:N match, where the voice is compared against N templates.
To guarantee bandwidth for VoIP packets, a network device must be able to identify VoIP packets in all the IP traffic flowing through it; network devices use the source and destination IP addresses for this. This identification and grouping process is called classification, and it is the basis for providing any QoS.
Packet classification can be processor-intensive, so it should occur as far out toward the edge of the network as possible. Because every hop still needs to determine the treatment a packet should receive, a simpler, more efficient classification method is needed in the network core. This simpler classification is achieved through the hidden Markov model.
HMMs [2] are popular because they can be trained automatically and are simple and computationally feasible to use. In speech recognition, the hidden Markov model outputs a sequence of n-dimensional real-valued vectors (with n a small integer, such as 10), emitting one every 10 milliseconds. The vectors consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time window of speech, decorrelating the spectrum using a cosine transform, and then taking the first (most significant) coefficients. Each word, or (for more general speech recognition systems) each phoneme, has a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individually trained hidden Markov models for the separate words and phonemes.
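The feature-extraction chain just described — short-window Fourier transform, log spectrum, cosine transform, keep the leading coefficients — can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation; frame length and coefficient count are assumptions.

```python
import numpy as np

def cepstral_coefficients(frame, n_coeffs=10):
    """Cepstral features for one short speech frame, as described above:
    Fourier transform -> log magnitude spectrum -> cosine transform (DCT-II),
    keeping the first (most significant) coefficients."""
    spectrum = np.abs(np.fft.rfft(frame))
    log_spectrum = np.log(spectrum + 1e-10)   # avoid log(0)
    n = len(log_spectrum)
    k = np.arange(n)
    # DCT-II written out directly from its cosine-transform definition
    cepstrum = np.array([
        np.sum(log_spectrum * np.cos(np.pi * q * (2 * k + 1) / (2 * n)))
        for q in range(n_coeffs)
    ])
    return cepstrum

# One 10 ms frame at 8000 samples/s is 80 samples.
frame = np.sin(2 * np.pi * 440 * np.arange(80) / 8000.0)
print(cepstral_coefficients(frame).shape)
```

In practice a mel filterbank is applied before the cosine transform (giving MFCCs), but the pipeline above matches the plain cepstral description in the text.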
The term silence suppression is used in telephony to describe the process of not transmitting information over the network when one of the parties involved in a telephone call is not speaking, thereby reducing bandwidth usage. Voice is carried over a digital telephone network by converting the analog signal to a digital signal, which is then packetized and sent electronically over the network; the analog signal is re-created at the receiving end [6]. When one of the parties does not speak, background noise is picked up and sent over the network. This is inefficient, as the signal carries no useful information and bandwidth is wasted. Given that typically only one party in a conversation speaks at any one time, silence suppression can achieve overall bandwidth savings on the order of 50% over the duration of a telephone call.
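The ~50% figure follows directly from the one-speaker-at-a-time observation. A back-of-the-envelope check, assuming a hypothetical 64 kbit/s CBR codec and a 50% talk fraction (illustrative numbers, not measurements from the paper):

```python
# If only the talking party's packets are sent, the sent volume scales
# with the fraction of time that party is actually speaking.
codec_rate_kbps = 64      # assumed constant-bit-rate codec
talk_fraction = 0.5       # one party speaks at a time
call_seconds = 120

without_suppression = codec_rate_kbps * call_seconds                  # kbit
with_suppression = codec_rate_kbps * call_seconds * talk_fraction     # kbit

saving = 1 - with_suppression / without_suppression
print(f"bandwidth saving: {saving:.0%}")   # bandwidth saving: 50%
```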
In this paper, we detect the silence/unvoiced [14] part of a speech sample using a multi-layer perceptron algorithm. The algorithm uses statistical properties of noise based on packets, as well as packets based on threshold energy. In the experiments, speech is transferred through the Internet and categorized as packets. The results show better classification for the proposed method in both cases when compared against conventional silence/unvoiced detection methods. We assume that the background noise present in the utterances is Gaussian in nature; however, a speech signal may also be contaminated with other types of noise, in which case the corresponding properties of that noise distribution function are to be used for detection.
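The threshold-energy idea under the Gaussian-noise assumption can be sketched as below. This is a minimal illustration, not the paper's exact algorithm; the frame length, the number of frames used to estimate the noise, and the threshold factor k are all assumptions.

```python
import numpy as np

def detect_voiced(signal, frame_len=80, k=3.0, noise_frames=20):
    """Frame-level silence/voiced split. Assumes the first `noise_frames`
    frames are background (Gaussian) noise, estimates their energy
    statistics, and marks frames whose energy exceeds mean + k*std
    as voiced."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    mu, sigma = energy[:noise_frames].mean(), energy[:noise_frames].std()
    return energy > mu + k * sigma   # boolean mask: True = voiced frame

rng = np.random.default_rng(0)
silence = 0.01 * rng.standard_normal(1600)                   # 200 ms of noise
voiced = np.sin(2 * np.pi * 300 * np.arange(1600) / 8000.0)  # 200 ms tone
mask = detect_voiced(np.concatenate([silence, voiced]))
print(mask.sum(), "of", len(mask), "frames voiced")
```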
This paper is organized as follows. Section 2 describes the theoretical background. Section 3 presents the method, i.e., the algorithm, along with a description of its model and a definition of the measure of correctness. The results are presented in Section 4, and Section 5 concludes.

II. THEORETICAL BACKGROUND
2.1 SPEECH SIGNAL AND ITS BASIC PROPERTIES
The proposed system consists of a new class of traffic analysis attacks on encrypted speech communications, with the goal of detecting the speakers. These attacks are based on packet timing information only, and they can detect speakers of speech communications made with different codecs. We evaluate the proposed attacks with extensive experiments over different types of networks, including commercial anonymity networks and campus networks. The experiments show that the proposed traffic analysis attacks can detect speakers of encrypted speech communications with high accuracy.
The speech signal is a slowly time-varying signal in the sense that, when examined over a sufficiently short period of time, its characteristics are fairly stationary; over longer periods, however, the signal characteristics change to reflect the different speech sounds being spoken. Usually the first 200 ms or more (1600 samples at a sampling rate of 8000 samples/s) of a speech recording corresponds to silence (or background noise), because the speaker takes some time to start reading once recording begins.
2.2 DATAFLOW DIAGRAM

Input speech signal → Speech signal to voice data → Silence detection → Extract features → Classification → Encryption
2.3 MODULES USED
Speech Coding
In speech communications, an analog voice signal is first converted into a voice data stream by a chosen codec. Typically, compression is used in this step to reduce the data rate. The voice data stream is then packetized in small units of typically tens of milliseconds of voice and encapsulated in a packet stream over the Internet. In this paper, we focus on constant-bit-rate (CBR) codecs, since most codecs used in current speech communications are CBR.

Silence detection
In this module, silence periods in the voice signal are detected by checking a threshold range matched to the silence level. The silence is then eliminated and the voice is combined into feature values.
Voice Signal Waveform
The packet train is generated by feeding the voice signal to X-Lite, a popular speech communication tool. The correspondence between the silence periods in the voice signal and the gaps in the packet train is easily observed, although the length of a silence period differs from the length of the corresponding gap in the packet train.
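Because the packet train mirrors the silence structure, a timing-only observer can summarize a call's talk pattern from packet timestamps alone. A sketch of that idea, with an assumed 20 ms packetization interval and an illustrative gap tolerance (neither value is from the paper):

```python
import numpy as np

def gap_features(timestamps, pkt_interval=0.02, tol=1.5):
    """Turn a packet train into a talk-pattern feature set using timing only.
    Gaps noticeably longer than the packetization interval are taken as
    silence-suppressed periods."""
    gaps = np.diff(np.asarray(timestamps))
    silent = gaps > tol * pkt_interval
    return {
        "n_silence_gaps": int(silent.sum()),
        "total_silence": float(gaps[silent].sum()),
        "talk_time": float(gaps[~silent].sum()),
    }

# Packets every 20 ms with one ~320 ms silence gap in the middle.
ts = [0.02 * i for i in range(10)] + [0.5 + 0.02 * i for i in range(10)]
features = gap_features(ts)
print(features)
```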
Speaker Detection
The inputs to this step are the multi-layer perceptron (MLP) classifier with feed-forward back-propagation (FFBP) trained in the previous step, and the feature vectors generated from a pool of raw speech communication traces of interest [12]. The output of this step is the intermediate detection result, i.e., the speakers from the candidate pool with talk patterns closest to another talk pattern. The detection step can be divided into two phases. First, the likelihood of each feature vector is calculated with the trained MLP; the trace with the highest likelihood is declared if the intersection step is not used. To improve detection accuracy, the intermediate detection results can be fed into the optional intersection attack step.
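The two phases above reduce to an arg-max over classifier scores, optionally narrowed by intersecting the candidate sets from independent detection rounds. A minimal sketch (candidate names and scores are illustrative):

```python
def detect_speaker(likelihoods):
    """Phase 1: each candidate trace gets a likelihood from the trained
    classifier; the candidate with the highest likelihood is declared."""
    return max(likelihoods, key=likelihoods.get)

def intersection_attack(candidate_sets):
    """Optional phase 2: keep only candidates appearing in the top set
    of every independent detection round."""
    result = set(candidate_sets[0])
    for s in candidate_sets[1:]:
        result &= set(s)
    return result

print(detect_speaker({"alice": 0.7, "bob": 0.9, "carol": 0.4}))   # bob
print(intersection_attack([{"alice", "bob"}, {"bob", "carol"}]))  # {'bob'}
```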
Cross-Codec Detection
In this set of experiments, the training traces and the traces to be detected are generated with different codecs [12]. We believe this set of experiments is important for two reasons. In practice, training traces and the traces to be detected may be collected from speech communications made with different codecs. Moreover, since speech packets are encrypted and possibly padded to a fixed length, adversaries may not be able to differentiate speech communications made with different codecs.

III. METHOD
The algorithm described here is divided into two parts. The first part classifies the voice packets and identifies the delay using the MLP algorithm, while the second part classifies each packet according to the timing delay using the HMM model.
Multi-Layer Perceptron (MLP)
A multi-layer perceptron is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. An MLP utilizes a supervised learning technique called back-propagation for training the network. The MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable. A learning rule is applied in order to improve the values of the MLP weights over a training set T according to a given criterion function.
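A minimal numerical sketch of such a network, trained by back-propagation on XOR — the classic example of data a linear perceptron cannot separate. Layer sizes, learning rate, and iteration count are illustrative choices, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

initial_mse = None
for _ in range(10000):
    # Forward phase: weights fixed, signal propagated layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    if initial_mse is None:
        initial_mse = float(np.mean((out - y) ** 2))
    # Backward phase: error signal propagated back, weights adjusted.
    d_out = (out - y) * out * (1 - out)    # local gradient at output node
    d_h = (d_out @ W2.T) * h * (1 - h)     # local gradients at hidden nodes
    lr = 0.5
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

final_mse = float(np.mean((out - y) ** 2))
print(f"mse: {initial_mse:.3f} -> {final_mse:.3f}")
```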
Back-Propagation Algorithm
The back-propagation algorithm is used to train the MLP. Training proceeds in two phases.
In the forward phase, the synaptic weights of the network are fixed and the input signal is propagated through the network, layer by layer, until it reaches the output.
In the backward phase, an error signal is produced by comparing the output of the network with a desired response. The resulting error signal is propagated through the network layer by layer, but in the backward direction. In this phase, successive adjustments are applied to the synaptic weights of the network. The key factor in the calculation of the weight adjustment is the error signal at output neuron j. The credit-assignment problem arises here, and we may identify two distinct cases:
Case #1: Neuron j is an output node. The error signal is supplied to the neuron directly from its own desired response.
Case #2: Neuron j is a hidden node. When neuron j is located in a hidden layer of the network, there is no specified desired response for that neuron. Accordingly, the error signal for a hidden neuron must be determined recursively, working backwards in terms of the error signals of all the neurons to which that hidden neuron is connected.
In the final back-propagation formula for the local gradient, k ranges over the neurons connected to hidden neuron j. In summary, the resulting correction is applied to the synaptic weight connecting neuron i to neuron j. Any activation function used in multi-layer neural networks should be continuous; the most commonly used activation function is the sigmoidal nonlinearity.
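The equations referenced above did not survive extraction; the standard textbook forms they describe are restated below (φ is the activation function, v_j the local field, d_j the desired and y_j the actual output, η the learning rate — notation is the usual one, not reproduced from the paper):

```latex
% Case 1 (output node j): the error signal is available directly
\delta_j = e_j\,\varphi'(v_j), \qquad e_j = d_j - y_j
% Case 2 (hidden node j): computed recursively from the k neurons it feeds
\delta_j = \varphi'(v_j)\sum_{k}\delta_k\,w_{kj}
% Correction applied to the synaptic weight connecting neuron i to neuron j
\Delta w_{ji} = \eta\,\delta_j\,y_i
```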
3.1 ACTIVATION FUNCTION
It is preferred to use a sigmoid activation function that is an odd function of its argument; the hyperbolic tangent sigmoid function is the recommended one. Target (desired) values are chosen within the range of the sigmoid activation function.
3.2 HMM TRAINING MODEL

Fig 3.1: HMM model diagram
Speech recognition by a machine is a process broken into several phases, combining decisions probabilistically at all lower levels and making more deterministic decisions only at the highest level [3]. Computationally, it is a problem in which a sound pattern has to be recognized or classified into a category that represents a meaning to a human. Every acoustic signal can be broken into smaller, more basic sub-signals. As the complex sound signal is broken into smaller sub-sounds, different levels are created: at the top level are complex sounds, which are made of simpler sounds at lower levels, and going further down, the sounds become more basic, shorter, and simpler. At the lowest level, where the sounds are the most fundamental, a machine checks simple, more probabilistic rules for what each sound should represent. Once these sounds are put together into a more complex sound at an upper level, a new set of more deterministic rules should predict what the new complex sound represents. The uppermost level of deterministic rules should figure out the meaning of complex expressions. To expand our knowledge of speech recognition, we also need to take neural networks into consideration.
Modern general-purpose speech recognition systems are based on hidden Markov models. These are statistical models that output a sequence of symbols or quantities. HMMs are used in speech recognition because a speech signal can be viewed as a piecewise stationary or short-time stationary signal: on short time scales (e.g., 10 milliseconds), speech can be approximated as a stationary process. Speech can thus be thought of as a Markov model for many stochastic purposes.
As noted in Section I, the HMM outputs a sequence of cepstral-coefficient vectors, one every 10 milliseconds [2]. In each state, the hidden Markov model tends to have a statistical distribution that is a mixture of diagonal-covariance Gaussians, which gives a likelihood for each observed vector; the model for a sequence of words or phonemes is formed by concatenating the individually trained models for the separate words and phonemes.
Decoding of the speech (the term for what
happens when the system is presented with a new
utterance and must compute the most likely source
sentence) would probably use the Viterbi algorithm
to find the best path, and here there is a choice
between dynamically creating a combination hidden
Markov model, which includes both the acoustic and
language model information, and combining it
statically beforehand (the finite state transducer, or
FST, approach).
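The Viterbi decoding mentioned above can be sketched as follows: a dynamic program that finds the most likely hidden-state path for an observation sequence, done in log space for numerical stability. The two-state toy model at the bottom is illustrative, not from the paper:

```python
import numpy as np

def viterbi(log_A, log_B, log_pi, obs):
    """Most likely state path for `obs`.
    log_A: (S, S) log transition matrix, log_B: (S, V) log emission matrix,
    log_pi: (S,) log initial distribution."""
    S, T = log_A.shape[0], len(obs)
    delta = np.full((T, S), -np.inf)        # best log score ending in each state
    psi = np.zeros((T, S), dtype=int)       # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (from-state, to-state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):           # backtrack
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Two-state toy model: state 0 mostly emits symbol 0, state 1 symbol 1.
A = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
B = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
pi = np.log(np.array([0.5, 0.5]))
print(viterbi(A, B, pi, [0, 0, 1, 1, 1]))   # [0, 0, 1, 1, 1]
```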
Filter design is the process of designing a filter which meets a set of requirements; it is essentially an optimization problem in which the difference between the desired filter and the designed filter should be minimized. A general design goal is that the complexity ("complexity" here meaning the number of multipliers) of realizing the filter be as low as possible.
A codec is a device or computer program capable of encoding or decoding a digital data stream or signal. The word codec is a portmanteau of "coder-decoder" or, less commonly, "compressor-decompressor". A codec (the program) should not be confused with a coding or compression format or standard: a format is a document (the standard), a way of storing data, while a codec is a program (an implementation) which can read or write such files. In practice, however, "codec" is sometimes used loosely to refer to formats.
A codec encodes a data stream or signal for transmission, storage, or encryption, or decodes it for playback or editing. Codecs are used in videoconferencing, streaming media, and video editing applications. A video camera's analog-to-digital converter (ADC) converts its analog signals into digital signals, which are then passed through a video compressor for digital transmission or storage. A receiving device then runs the signal through a video decompressor and a digital-to-analog converter (DAC) for analog display. The term codec is also used as a generic name for a videoconferencing unit.
Speaker recognition [4], as described in Section I, identifies the person speaking from characteristics of their voice, whether as 1:1 verification against a single template or as 1:N identification against many.
IV. RESULTS
The training and test scenarios of the proposed approach are generated, along with comparisons of different voice signals, using MATLAB. MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical computation.
In the proposed system, the speech traces are given and simulated in order to obtain encrypted speech; the bandwidth of the speech is thereby reduced for the simulated samples.
The various simulation outputs show the encrypted speech along with the delay, and the correctness data acquired after the delay is reduced. Figure 4.1 shows the input speech signal along with its encrypted output of value 3 degrees. Figure 4.2 shows a periodogram of the input, which describes the bandwidth reduction.


Fig 4.1: Input signal along with encrypted output

Fig 4.2: Periodogram of a large input

Figure 4.3 shows the input signal with silence, i.e., with delay present in the transmission. Figure 4.4 shows the signal after the delay has been removed; the output obtained is free of silence.

Fig 4.3: Before silence removal

Fig 4.4: After silence removal

The various graphical representations explain the rate of the input signal along with the respective output signals. Figure 4.5 shows the graph of an input signal along with the initial conditions of the silence threshold value. Figure 4.6 shows the comparison of the detected silence and the original signal values, and Figure 4.7 gives the detection rate of signal transmissions.

Fig 4.5: Graphical representation of the silence threshold

Fig 4.6: Comparison of signals

Fig 4.7: Detection rate

V. CONCLUSION
The input consists of a speech signal, from which the silence in between is removed in order to give effective communication with reduced bandwidth. The hidden Markov model has been used effectively to obtain the voice traces. The proposed attacks are carried out through extensive experiments over different types of networks, including periodogram inputs. The experiments show that the proposed traffic analysis attacks can detect speakers of encrypted speech communications with high detection rates based on speech communication traces. The threshold energy used in this method is uniquely specified and depends on the packet transfer. The method is shown to be computationally efficient for real-time applications, and it performs better than conventional methods on speech samples collected from noisy as well as noise-free environments.

VI. ACKNOWLEDGEMENT
I would like to thank my guide, Mrs. T. J. Jeyaprabha, Assistant Professor, who guided me in completing this project, and also everyone who supported me in achieving my target.

REFERENCES
[1] Y.J. Pyun, Y.H. Park, X. Wang, D.S. Reeves, and P. Ning, “Tracing Traffic through Intermediate Hosts that Repacketize Flows,” Proc. IEEE INFOCOM ’07, May 2007.
[2] J.W. Deng and H.T. Tsui, “An HMM-Based Approach for Gesture Segmentation and Recognition,” Proc. Int’l Conf. Pattern Recognition (ICPR ’00), pp. 679-682, 2000.
[3] L.R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp. 267-296.
[4] R. Bakis, “Continuous Speech Recognition via Centisecond Acoustic States,” J. Acoustical Soc. of Am.
[5] G. Danezis and A. Serjantov, “Statistical Disclosure or Intersection Attacks on Anonymity Systems,” Proc. Sixth Information Hiding Workshop (IH ’04), pp. 293-308, May 2004.
[6] O. Berthold and H. Langos, “Dummy Traffic Against Long Term Intersection Attacks,” Proc. Privacy Enhancing Technologies Workshop (PET ’02), pp. 110-128, Apr. 2002.
[7] X. Wang, S. Chen, and S. Jajodia, “Tracking Anonymous Peer-to-Peer VoIP Calls on the Internet,” Proc. 12th ACM Conf. Computer and Comm. Security (CCS ’05), pp. 81-91, 2005.
[8] “Audio Signals Used for Experiments,” http://academic.csuohio.edu/zhu_y/isc2010/instruction.txt, 2011.
[9] Q. Sun, D.R. Simon, Y.-M. Wang, W. Russell, V.N. Padmanabhan, and L. Qiu, “Statistical Identification of Encrypted Web Browsing Traffic,” Proc. IEEE Symp. Security and Privacy (SP ’02), pp. 19-30, 2002.
[10] D. Herrmann, R. Wendolsky, and H. Federrath, “Website Fingerprinting: Attacking Popular Privacy Enhancing Technologies with the Multinomial Naïve-Bayes Classifier,” Proc. ACM Workshop Cloud Computing Security (CCSW ’09), pp. 31-42, 2009.
[11] L. Lu, E.-C. Chang, and M. Chan, “Website Fingerprinting and Identification Using Ordered Feature Sequences,” Proc. 15th European Conf. Research in Computer Security (ESORICS), D. Gritzalis, B. Preneel, and M. Theoharidou, eds., pp. 199-214, 2010.
[12] C.V. Wright, L. Ballard, S.E. Coull, F. Monrose, and G.M. Masson, “Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations,” Proc. IEEE Symp. Security and Privacy (SP ’08), pp. 35-49, 2008.
[13] C.V. Wright, L. Ballard, F. Monrose, and G.M. Masson, “Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob?,” Proc. 16th USENIX Security Symp., pp. 4:1-4:12, http://portal.acm.org/citation.cfm?id=1362903.1362907, 2007.
[14] C. Wright, S. Coull, and F. Monrose, “Traffic Morphing: An Efficient Defense against Statistical Traffic Analysis,” Proc. Network and Distributed System Security Symp. (NDSS ’09), Feb. 2009.
[15] M. Backes, G. Doychev, M. Dürmuth, and B. Köpf, “Speaker Recognition in Encrypted Voice Streams,” Proc. 15th European Symp. Research in Computer Security (ESORICS ’10), pp. 508-523, Sept. 2010.

Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
 
Dynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modellingDynamic Audio-Visual Client Recognition modelling
Dynamic Audio-Visual Client Recognition modelling
 

En vedette

UNICEF: Believe in zero campaign 2009
UNICEF: Believe in zero campaign 2009UNICEF: Believe in zero campaign 2009
UNICEF: Believe in zero campaign 2009Post Media
 
Sin título 2
Sin título 2Sin título 2
Sin título 2Gesasa
 
Actividad para el día de la constitución
Actividad para el día de la constituciónActividad para el día de la constitución
Actividad para el día de la constituciónmorrigan20062009
 
Tapasdediarios
TapasdediariosTapasdediarios
Tapasdediariostintaen3d
 
Baltimore SPUG - Worst Practices and Blunders
Baltimore SPUG - Worst Practices and BlundersBaltimore SPUG - Worst Practices and Blunders
Baltimore SPUG - Worst Practices and BlundersScott Hoag
 

En vedette (8)

UNICEF: Believe in zero campaign 2009
UNICEF: Believe in zero campaign 2009UNICEF: Believe in zero campaign 2009
UNICEF: Believe in zero campaign 2009
 
Sin título 2
Sin título 2Sin título 2
Sin título 2
 
Actividad para el día de la constitución
Actividad para el día de la constituciónActividad para el día de la constitución
Actividad para el día de la constitución
 
Portfolio 1
Portfolio 1Portfolio 1
Portfolio 1
 
Digestió
DigestióDigestió
Digestió
 
Tapasdediarios
TapasdediariosTapasdediarios
Tapasdediarios
 
Baltimore SPUG - Worst Practices and Blunders
Baltimore SPUG - Worst Practices and BlundersBaltimore SPUG - Worst Practices and Blunders
Baltimore SPUG - Worst Practices and Blunders
 
Bill Gates
Bill GatesBill Gates
Bill Gates
 

Similaire à H42045359

Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition Systemijtsrd
 
Audio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV FileAudio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV Fileijtsrd
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...AIRCC Publishing Corporation
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesNicole Heredia
 
Financial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsFinancial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsIJERA Editor
 
A Robust Speaker Identification System
A Robust Speaker Identification SystemA Robust Speaker Identification System
A Robust Speaker Identification Systemijtsrd
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approachijsrd.com
 
Course report-islam-taharimul (1)
Course report-islam-taharimul (1)Course report-islam-taharimul (1)
Course report-islam-taharimul (1)TANVIRAHMED611926
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Editor IJARCET
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Editor IJARCET
 
Effect of Singular Value Decomposition Based Processing on Speech Perception
Effect of Singular Value Decomposition Based Processing on Speech PerceptionEffect of Singular Value Decomposition Based Processing on Speech Perception
Effect of Singular Value Decomposition Based Processing on Speech Perceptionkevig
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...TELKOMNIKA JOURNAL
 
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...IJMER
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summaryAditya Deshmukh
 
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEMULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEIRJET Journal
 
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYDATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYcscpconf
 
A review of analog audio scrambling methods for residual intelligibility
A review of analog audio scrambling methods for residual intelligibilityA review of analog audio scrambling methods for residual intelligibility
A review of analog audio scrambling methods for residual intelligibilityAlexander Decker
 

Similaire à H42045359 (20)

Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition System
 
50120140502007
5012014050200750120140502007
50120140502007
 
Audio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV FileAudio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV File
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification Techniques
 
Financial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsFinancial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech Signals
 
A Robust Speaker Identification System
A Robust Speaker Identification SystemA Robust Speaker Identification System
A Robust Speaker Identification System
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
 
Course report-islam-taharimul (1)
Course report-islam-taharimul (1)Course report-islam-taharimul (1)
Course report-islam-taharimul (1)
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
 
V041203124126
V041203124126V041203124126
V041203124126
 
Effect of Singular Value Decomposition Based Processing on Speech Perception
Effect of Singular Value Decomposition Based Processing on Speech PerceptionEffect of Singular Value Decomposition Based Processing on Speech Perception
Effect of Singular Value Decomposition Based Processing on Speech Perception
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...
 
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summary
 
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
 
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEMULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
 
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYDATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
 
A review of analog audio scrambling methods for residual intelligibility
A review of analog audio scrambling methods for residual intelligibilityA review of analog audio scrambling methods for residual intelligibility
A review of analog audio scrambling methods for residual intelligibility
 

Dernier

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Dernier (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

H42045359

I. INTRODUCTION

Speech recognition (SR) is the translation of spoken words into text. Some SR systems use "speaker-independent" recognition, while others use "training," in which an individual speaker reads sections of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune the recognition of that person's speech, resulting in more accurate transcription. Systems that do not use training are called "speaker-independent" systems; systems that use training are called "speaker-dependent" systems. Speech recognition applications include voice user interfaces such as voice dialing, call routing, appliance control, search, and simple data entry.

Speaker recognition is the identification of the person who is speaking from characteristics of their voice (voice biometrics), also called voice recognition. There are two major applications of speaker recognition technologies and methodologies. If the speaker claims a certain identity and the voice is used to verify this claim, the task is called verification or authentication. Identification, on the other hand, is the task of determining an unknown speaker's identity. In a sense, speaker verification is a 1:1 match, in which one speaker's voice is matched against one template (also called a "voice print" or "voice model"), whereas speaker identification is a 1:N match, in which the voice is compared against N templates.

To guarantee bandwidth for VoIP packets, a network device must be able to identify VoIP packets in all the IP traffic flowing through it. Network devices use the source and destination IP addresses for this purpose. This identification and grouping process is called classification, and it is the basis for providing any QoS. Packet classification can be processor-intensive, so it should occur as far out toward the edge of the network as possible.
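The 1:1 verification versus 1:N identification distinction can be sketched as follows. This is an illustrative sketch only: the cosine-distance measure, the three-dimensional "voice prints," and the acceptance threshold are hypothetical placeholders, not the features or decision rule used later in this paper.

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def verify(voice, template, threshold=0.3):
    """1:1 match: accept the claimed identity if the probe is close to
    the single stored template for that identity."""
    return cosine_distance(voice, template) < threshold

def identify(voice, templates):
    """1:N match: return the index of the closest of N stored templates."""
    distances = [cosine_distance(voice, t) for t in templates]
    return int(np.argmin(distances))

# Hypothetical 3-dimensional "voice prints" for illustration only.
templates = [np.array([1.0, 0.0, 0.0]),
             np.array([0.0, 1.0, 0.0]),
             np.array([0.0, 0.0, 1.0])]
probe = np.array([0.9, 0.1, 0.0])

print(verify(probe, templates[0]))  # True: probe is close to the claimed template
print(identify(probe, templates))   # 0: template 0 is the nearest of the N
```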
Because every hop still needs to decide what treatment a packet should receive, a simpler, more efficient classification method is needed in the network core. This simpler classification is achieved through a hidden Markov model. HMMs [2] are popular because they can be trained automatically and are simple and computationally feasible to use. In speech recognition, the hidden Markov model outputs a sequence of n-dimensional real-valued vectors (with n a small integer, such as 10), emitting one of these every 10 milliseconds. The vectors consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time
window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most significant) coefficients. Each word (or, for more general speech recognition systems, each phoneme) has a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individually trained hidden Markov models for the separate words and phonemes.

The term silence suppression is used in telephony to describe the process of not transmitting information over the network when one of the parties involved in a telephone call is not speaking, thereby reducing bandwidth usage. Voice is carried over a digital telephone network by converting the analog signal to a digital signal, which is then packetized and sent electronically over the network. The analog signal is re-created at the receiving end of the network [6]. When one of the parties does not speak, background noise is picked up and sent over the network. This is inefficient, as the signal carries no useful information and bandwidth is therefore wasted. Given that typically only one party in a conversation speaks at any one time, silence suppression can achieve overall bandwidth savings on the order of 50% over the duration of a telephone call.

In this paper, we detect the silence/unvoiced [14] part of a speech sample using a multi-layer perceptron algorithm. The algorithm uses statistical properties of noise on a per-packet basis, as well as packets classified against a threshold energy. In the experiments, the speech is transferred through the Internet and divided into packets. The results show better classification for the proposed method in both cases when compared against conventional silence/unvoiced detection methods.
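The short-time cepstral analysis described earlier in this section (frame the signal every 10 ms, take a Fourier transform, decorrelate the log spectrum with a cosine transform, keep the first coefficients) can be sketched as below. The window length, Hamming weighting, and coefficient count are illustrative choices, not the paper's exact front end.

```python
import numpy as np

def cepstral_features(signal, rate=8000, frame_ms=10, n_coeffs=10):
    """Emit one n_coeffs-dimensional cepstral vector per 10 ms frame:
    FFT of a windowed short-time frame, log-magnitude spectrum,
    then a DCT-II to decorrelate, keeping the first coefficients."""
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    feats = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        spectrum = np.abs(np.fft.rfft(frame * np.hamming(frame_len)))
        log_spec = np.log(spectrum + 1e-10)
        # DCT-II basis applied to the log spectrum (decorrelation step).
        k = np.arange(len(log_spec))
        basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * k + 1)
                       / (2 * len(log_spec)))
        feats.append(basis @ log_spec)
    return np.array(feats)

# One second of toy audio at 8 kHz yields 100 frames of 10 coefficients each.
rng = np.random.default_rng(0)
audio = (np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
         + 0.01 * rng.standard_normal(8000))
print(cepstral_features(audio).shape)  # (100, 10)
```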
We assume that the background noise present in the utterances is Gaussian in nature; however, a speech signal may also be contaminated with other types of noise. In such cases, the corresponding properties of that noise distribution function should be used for detection.

This paper is organized as follows. Section 2 describes the theoretical background. Section 3 presents the method, i.e., the algorithm, along with a description of its model and a definition of the measure of correctness. The results are presented in Section 4, and Section 5 concludes.

II. THEORETICAL BACKGROUND

2.1 SPEECH SIGNAL AND ITS BASIC PROPERTIES

The proposed system consists of a new class of traffic analysis attacks on encrypted speech communications, with the goal of detecting the speakers of those communications. These attacks are based on packet timing information only, and they can detect speakers of speech communications made with different codecs. We evaluate the proposed attacks with extensive experiments over different types of networks, including commercial anonymity networks and campus networks. The experiments show that the proposed traffic analysis attacks can detect speakers of encrypted speech communications with high accuracy.

The speech signal is a slowly time-varying signal in the sense that, when examined over a sufficiently short period of time, its characteristics are fairly stationary; over long periods of time, however, the signal characteristics change to reflect the different speech sounds being spoken. Usually the first 200 msec or more (1600 samples at a sampling rate of 8000 samples/sec) of a speech recording corresponds to silence (or background noise), because the speaker takes some time to begin reading when recording starts.
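Under the Gaussian background-noise assumption, the leading 200 ms of a recording can be used to estimate the noise statistics and derive a decision threshold for the voiced/silence split. A minimal sketch follows; the 3-sigma rule and the mean-absolute-amplitude test are illustrative choices, not the paper's exact criterion.

```python
import numpy as np

RATE = 8000
NOISE_SAMPLES = 1600  # first 200 ms at 8000 samples/sec

def noise_threshold(signal, k=3.0):
    """Estimate Gaussian noise statistics from the leading silence and
    return a k-sigma amplitude threshold for voiced/silence decisions."""
    noise = signal[:NOISE_SAMPLES]
    mu, sigma = np.mean(noise), np.std(noise)
    return mu + k * sigma

def is_voiced(frame, threshold):
    """Call a frame voiced if its mean absolute amplitude exceeds the
    threshold derived from the background noise."""
    return np.mean(np.abs(frame)) > threshold

rng = np.random.default_rng(1)
silence = 0.01 * rng.standard_normal(NOISE_SAMPLES)        # leading noise
speech = np.sin(2 * np.pi * 300 * np.arange(8000) / RATE)  # toy voiced part
recording = np.concatenate([silence, speech])

t = noise_threshold(recording)
print(is_voiced(recording[:NOISE_SAMPLES], t))   # False: background noise
print(is_voiced(recording[NOISE_SAMPLES:], t))   # True: voiced segment
```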
2.2 DATAFLOW DIAGRAM

Input speech signal → Speech signal to voice data → Silence detection → Extract features → Classification → Encryption

2.3 MODULES USED

Speech Coding
In speech communications, an analog voice signal is first converted into a voice data stream by a chosen codec. Typically, compression is used in this step to reduce the data rate. The voice data stream is then packetized in small units of typically tens of milliseconds of voice and encapsulated in a packet stream over the Internet. In this paper, we focus on constant bit rate (CBR) codecs, since most codecs used in current speech communications are CBR codecs.
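The coding-and-packetization step above, combined with silence suppression, can be sketched as follows. The 20 ms packet size and the energy test are illustrative assumptions, not the parameters of any specific codec.

```python
import numpy as np

RATE = 8000
PACKET_MS = 20  # tens of milliseconds of voice per packet (illustrative)

def packetize(signal, threshold=0.05):
    """Split a voice stream into fixed-size (CBR-style) packets, dropping
    frames below the energy threshold (silence suppression): no traffic
    is generated while the sender detects voice inactivity."""
    n = int(RATE * PACKET_MS / 1000)
    packets = []
    for i in range(0, len(signal) - n + 1, n):
        frame = signal[i:i + n]
        if np.mean(np.abs(frame)) > threshold:  # active speech only
            packets.append((i / RATE, frame))   # (timestamp, payload)
    return packets

# One second of talk followed by one second of near-silence.
talk = np.sin(2 * np.pi * 300 * np.arange(RATE) / RATE)
quiet = 0.001 * np.ones(RATE)
stream = np.concatenate([talk, quiet])

packets = packetize(stream)
print(len(packets))  # 50: only the one-second talk spurt is transmitted
```

The gaps between consecutive packet timestamps are exactly the silence periods an observer can see on the wire, which is the side channel the timing-based attacks in this paper exploit.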
Silence Detection
In this module, silence periods in the voice signal are detected. This is done by checking which frames fall within a threshold range corresponding to the silence level. These frames are then eliminated, and the remaining voice is combined into feature values.

Voice Signal Waveform
The packet train is generated by feeding the voice signal to X-Lite, a popular speech communication tool. The correspondence between the silence periods in the voice signal and the gaps in the packet train is easily observed. Note that the length of a silence period differs from the length of the corresponding gap in the packet train.

Speaker Detection
The inputs to this step are the multi-layer perceptron (MLP) classifier with feed-forward back-propagation (FFBP) trained in the previous step, and the feature vectors generated from a pool of raw speech communication traces of interest [12]. The output of this step is the intermediate detection result, i.e., the speakers from the candidate pool whose talk patterns are closest to the talk pattern of interest. The detection step can be divided into two phases. First, the likelihood of each feature vector is calculated with the trained MLP. If the intersection step is not used, the trace with the highest likelihood is declared. To improve detection accuracy, the intermediate detection results can be fed into the optional intersection attack step.

Cross-Codec Detection
In this set of experiments, the training traces and the traces to be detected are generated with different codecs [12]. We believe this set of experiments is important for two reasons. First, in practice, training traces and the traces to be detected may be collected from speech communications made with different codecs. Second, since speech packets are encrypted and possibly padded to a fixed length, adversaries may not be able to differentiate speech communications made with different codecs.
METHOD
The algorithm described here is divided into two parts. The first part classifies the voice packets and identifies the delay using the MLP algorithm, while the second part classifies each packet according to its timing delay using the HMM model.

Multilayer Perceptron (MLP)
A multi-layer perceptron is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. An MLP utilizes a supervised learning technique called back-propagation for training the network. The MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable. A learning rule is applied in order to improve the values of the MLP weights over a training set T according to a given criterion function.

Back-Propagation Algorithm
The back-propagation algorithm is used to train the multi-layer neural network. The training proceeds in two phases. In the forward phase, the synaptic weights of the network are fixed and the input signal is propagated through the network layer by layer until it reaches the output. In the backward phase, an error signal is produced by comparing the output of the network with a desired response. The resulting error signal is propagated through the network, layer by layer, but in the backward direction. In this phase, successive adjustments are applied to the synaptic weights of the network. The key factor involved in the calculation of the weight adjustment is the error signal at the output neuron j, and this is where the credit-assignment problem arises.
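The two-phase forward/backward procedure just described can be sketched in a minimal NumPy implementation. The 2-4-1 topology, learning rate, and XOR task below are illustrative assumptions only; the paper's classifier is trained on traffic features, not on XOR:

```python
# Sketch: forward and backward phases of back-propagation on a tiny MLP.
# The 2-4-1 topology, learning rate 0.5, and XOR data are illustrative
# assumptions, not the classifier configuration used in the paper.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.normal(0, 1, (2, 4))   # input -> hidden synaptic weights
b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1))   # hidden -> output synaptic weights
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(5000):
    # Forward phase: weights fixed, signal propagated layer by layer.
    H = np.tanh(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    losses.append(float(np.mean((T - Y) ** 2)))
    # Backward phase: error signal propagated in the backward direction.
    dY = (Y - T) * Y * (1 - Y)          # local gradient at the output node
    dH = (dY @ W2.T) * (1 - H ** 2)     # local gradients at hidden nodes
    W2 -= 0.5 * H.T @ dY                # successive synaptic adjustments
    b2 -= 0.5 * dY.sum(axis=0)
    W1 -= 0.5 * X.T @ dH
    b1 -= 0.5 * dH.sum(axis=0)
```

The hidden-node gradient `dH` is exactly the recursive "Case #2" computation discussed next: a hidden neuron's error signal is assembled from the error signals of the neurons it feeds.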
In this context we may identify two distinct cases.
Case #1: Neuron j is an output node. The error signal is supplied to the neuron directly from its own desired response.
Case #2: Neuron j is a hidden node. When neuron j is located in a hidden layer of the network, there is no specified desired response for that neuron. Accordingly, the error signal for a hidden neuron has to be determined recursively, working backwards in terms of the error signals of all the neurons to which that hidden neuron is connected. In the final back-propagation formula for the local gradient, k represents the number of neurons that are connected to hidden neuron j. In summary, the correction is applied to the synaptic weight connecting neuron i to neuron j. Any activation function used in multilayer neural networks should be continuous; the most commonly used activation function is a sigmoidal nonlinearity.

3.1 ACTIVATION FUNCTION
It is preferred to use a sigmoid activation function that is an odd function of its arguments. The hyperbolic sigmoid function is the recommended
one. Target (desired) values are chosen within the range of the sigmoid activation function.

3.2 HMM TRAINING MODEL
Fig 3.1: HMM model diagram

The model combines decisions probabilistically at all lower levels and makes more deterministic decisions only at the highest level [3]. Speech recognition by a machine is a process broken into several phases. Computationally, it is a problem in which a sound pattern has to be recognized or classified into a category that represents a meaning to a human. Every acoustic signal can be broken into smaller, more basic sub-signals. As the complex sound signal is broken into smaller sub-sounds, different levels are created: at the top level we have complex sounds, which are made of simpler sounds at the level below, and going further down we reach ever more basic, shorter, and simpler sounds. At the lowest level, where the sounds are the most fundamental, a machine checks simple, more probabilistic rules of what each sound should represent. Once these sounds are put together into a more complex sound at the level above, a new set of more deterministic rules predicts what the new complex sound represents. The uppermost level of deterministic rules should figure out the meaning of complex expressions. To expand our knowledge of speech recognition, we also need to take neural networks into consideration. Modern general-purpose speech recognition systems are based on Hidden Markov Models (HMMs). These are statistical models that output a sequence of symbols or quantities. HMMs are used in speech recognition because a speech signal can be viewed as a piecewise stationary signal or a short-time stationary signal: on short time scales (e.g., 10 milliseconds), speech can be approximated as a stationary process.
Speech can thus be thought of as a Markov model for many stochastic purposes. HMMs [2] are popular because they can be trained automatically and are simple and computationally feasible to use. In speech recognition, the hidden Markov model outputs a sequence of n-dimensional real-valued vectors (with n a small integer, such as 10), emitting one of these every 10 milliseconds. The vectors consist of coefficients obtained by taking a Fourier transform of a short time window of speech, decorrelating the spectrum using a cosine transform, and then keeping the first (most significant) coefficients. The hidden Markov model tends to have in each state a statistical distribution that is a mixture of diagonal-covariance Gaussians, which gives a likelihood for each observed vector. Each word, or (for more general speech recognition systems) each phoneme, has a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individually trained hidden Markov models for the separate words and phonemes. Decoding of the speech (the term for what happens when the system is presented with a new utterance and must compute the most likely source sentence) would typically use the Viterbi algorithm to find the best path; here there is a choice between dynamically creating a combination hidden Markov model, which includes both the acoustic and language model information, and combining the two statically beforehand (the finite state transducer, or FST, approach).

Filter design is the process of designing a filter that meets a set of requirements. It is essentially an optimization problem: the difference between the desired filter and the designed filter should be minimized. A general design goal is that the complexity ("complexity" in this paper means the number of multipliers) needed to realize the filter be as low as possible.
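The Viterbi decoding mentioned above can be sketched on a toy two-state HMM (silence vs. speech, emitting low or high frame energy). All probabilities here are made-up illustrative numbers, not trained model parameters from the paper:

```python
# Sketch: Viterbi decoding for a toy 2-state HMM (silence vs. speech).
# All probabilities are illustrative made-up values, not from the paper.
import numpy as np

A  = np.array([[0.7, 0.3],   # transition: silence -> {silence, speech}
               [0.2, 0.8]])  #             speech  -> {silence, speech}
B  = np.array([[0.9, 0.1],   # emission: silence -> {low, high} energy
               [0.2, 0.8]])  #           speech  -> {low, high} energy
pi = np.array([0.6, 0.4])    # initial state distribution

def viterbi(obs):
    """Return the most likely state sequence for an observation sequence."""
    delta = pi * B[:, obs[0]]              # best path probability per state
    back = []                              # backpointers per time step
    for o in obs[1:]:
        trans = delta[:, None] * A         # trans[i, j]: best path ending i -> j
        back.append(trans.argmax(axis=0))
        delta = trans.max(axis=0) * B[:, o]
    # Backtrack from the most probable final state.
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

# low, low, high, high, high energy observations:
print(viterbi([0, 0, 1, 1, 1]))  # -> [0, 0, 1, 1, 1]
```

A production decoder would work in log probabilities to avoid underflow on long sequences; the direct products above are kept only for readability.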
A codec is a device or computer program capable of encoding or decoding a digital data stream or signal. The word codec is a portmanteau of “coder-decoder” or, less commonly, “compressor-decompressor”. A codec (the program) should not be confused with a coding or compression format or standard: a format is a document (the standard), a way of storing data, while a codec is a program (an implementation) that can read or write such files. In practice, however, “codec” is sometimes used loosely to refer to formats. A codec encodes a data stream or signal for transmission, storage, or encryption, or decodes it for playback or editing. Codecs are used in videoconferencing, streaming media, and video editing applications. A video camera's analog-to-digital converter (ADC) converts its analog signals into digital signals, which are then passed through a video compressor for digital transmission or storage. A receiving device then runs the signal through a video decompressor, then a digital-to-analog converter (DAC) for analog display. The term codec
is also used as a generic name for a videoconferencing unit.

Speaker recognition [4] is the identification of the person who is speaking by characteristics of their voice (voice biometrics), also called voice recognition. There are two major applications of speaker recognition technologies and methodologies. If the speaker claims to be of a certain identity and the voice is used to verify this claim, this is called verification or authentication. On the other hand, identification is the task of determining an unknown speaker's identity. In a sense, speaker verification is a 1:1 match, where one speaker's voice is matched to one template (also called a “voice print” or “voice model”), whereas speaker identification is a 1:N match, where the voice is compared against N templates.

IV. RESULTS
The training and test scenarios of the proposed approach are generated, along with comparisons of different voice signals, using MATLAB simulation software. MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical computation. Using MATLAB, you can solve technical computing problems. In the proposed system, the speech traces are given and simulated in order to obtain an encrypted speech; the bandwidth of the speech is thereby reduced, for which the simulated samples are given. The various simulation outputs describe the encrypted speech along with the delay, as well as the corrected data acquired after the reduction of the delay. Figure 4.1 shows the input speech signal along with its encrypted output of value 3 degrees. Figure 4.2 shows a periodogram of the input, which describes the bandwidth reduction.
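The 1:N identification described above can be sketched as a nearest-template search. The feature vectors and the Euclidean distance measure below are illustrative assumptions; a real system would score trained models (such as the MLP classifier) rather than raw vectors:

```python
# Sketch: speaker identification as a 1:N template match.
# The 3-dimensional feature vectors and the Euclidean distance measure are
# illustrative assumptions, not the paper's actual features or scoring.
import math

def identify(voice, templates):
    """Return the index of the template closest to the voice feature vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(templates)), key=lambda i: dist(voice, templates[i]))

# Three enrolled speakers' voice prints (hypothetical features):
prints = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5], [0.4, 0.4, 0.9]]
print(identify([0.75, 0.25, 0.45], prints))  # -> 1 (nearest voice print)
```

Verification (1:1) would instead compare the distance to a single claimed template against an acceptance threshold.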
Fig 4.1: Input signal along with encrypted output
Fig 4.2: Periodogram of a large input

Figure 4.3 shows the input signal with silence, i.e., with delay present in the transmission. Figure 4.4 shows the signal after the delay has been removed; the output obtained is therefore free of silence.

Fig 4.3: Before silence removal
Fig 4.4: After silence removal
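The before/after behaviour of Figures 4.3 and 4.4 can be sketched with a simple frame-energy threshold. The frame length and threshold value below are illustrative assumptions, not the paper's tuned parameters:

```python
# Sketch: energy-threshold silence removal on a sampled signal.
# The frame length and threshold are illustrative assumptions, not the
# uniquely specified threshold energy described in the paper.

def remove_silence(samples, frame_len=4, threshold=0.01):
    """Keep only frames whose mean squared amplitude exceeds the threshold."""
    voiced = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:       # below threshold -> silence, dropped
            voiced.extend(frame)
    return voiced

signal = [0.5, -0.4, 0.6, -0.5,      # voiced frame (kept)
          0.0, 0.001, -0.001, 0.0,   # silent frame (removed)
          0.3, -0.35, 0.4, -0.3]     # voiced frame (kept)
print(len(remove_silence(signal)))   # -> 8 samples remain
```

The dropped frame corresponds to the transmission gap visible in Figure 4.3, which disappears in Figure 4.4.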
The various graphical representations explain the rate of the input signal along with the respective output signals. Figure 4.5 shows the graph of an input signal along with the initial conditions of the silence threshold value. Figure 4.6 shows the comparison between the detected silence and the original signal values, and Figure 4.7 gives the detection rate of signal transmissions.

Fig 4.5: Graphical representation of silence threshold
Fig 4.6: Comparison of signals
Fig 4.7: Detection rate

V. CONCLUSION
The input consists of a speech signal, from which the silence in between is removed in order to give effective communication with reduced bandwidth. The Hidden Markov Model has been used effectively to obtain the voice traces. The proposed attacks are carried out through extensive experiments over different types of networks, including periodogram inputs. The experiments show that the proposed traffic analysis attacks can detect speakers of encrypted speech communications with high detection rates based on speech communication traces. The threshold energy used in this method is uniquely specified and depends on the packet transfer. The method is shown to be computationally efficient for real-time applications and performs better than conventional methods for speech samples collected from noisy as well as noise-free environments.

VI. ACKNOWLEDGEMENT
I would like to thank my guide Mrs. T. J. Jeyaprabha, Assistant Professor, who guided me to complete this project, and also thank everyone who supported me in achieving my target.

REFERENCES
[1] Y.J. Pyun, Y.H. Park, X. Wang, D.S. Reeves, and P. Ning, “Tracing Traffic through Intermediate Hosts that Repacketize Flows,” Proc. IEEE INFOCOM ’07, May 2007.
[2] J.W. Deng and H.T.
Tsui, “An HMM-Based Approach for Gesture Segmentation and Recognition,” Proc. Int’l Conf. Pattern Recognition (ICPR ’00), pp. 679-682, 2000.
[3] L.R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[4] R. Bakis, “Continuous Speech Recognition via Centisecond Acoustic States,” J. Acoustical Soc. of Am.
[5] G. Danezis and A. Serjantov, “Statistical Disclosure or Intersection Attacks on Anonymity Systems,” Proc. Sixth Information Hiding Workshop (IH ’04), pp. 293-308, May 2004.
[6] O. Berthold and H. Langos, “Dummy Traffic Against Long Term Intersection Attacks,” Proc. Privacy Enhancing Technologies Workshop (PET ’02), pp. 110-128, Apr. 2002.
[7] X. Wang, S. Chen, and S. Jajodia, “Tracking Anonymous Peer-to-Peer VoIP Calls on the Internet,” Proc. 12th ACM Conf. Computer and Comm. Security (CCS ’05), pp. 81-91, 2005.
[8] “Audio Signals Used for Experiments,” http://academic.csuohio.edu/zhu_y/isc2010/instruction.txt, 2011.
[9] Q. Sun, D.R. Simon, Y.-M. Wang, W. Russell, V.N. Padmanabhan, and L. Qiu, “Statistical Identification of Encrypted Web Browsing Traffic,” Proc. IEEE Symp. Security and Privacy (SP ’02), pp. 19-30, 2002.
[10] D. Herrmann, R. Wendolsky, and H. Federrath, “Website Fingerprinting: Attacking Popular Privacy Enhancing Technologies with the Multinomial Naïve-Bayes Classifier,” Proc. ACM Workshop Cloud Computing Security (CCSW ’09), pp. 31-42, 2009.
[11] L. Lu, E.-C. Chang, and M. Chan, “Website Fingerprinting and Identification Using Ordered Feature Sequences,” Proc. 15th European Conf. Research in Computer Security (ESORICS ’10), D. Gritzalis, B. Preneel, and M. Theoharidou, eds., pp. 199-214, 2010.
[12] C.V. Wright, L. Ballard, S.E. Coull, F. Monrose, and G.M. Masson, “Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations,” Proc. IEEE Symp. Security and Privacy (SP ’08), pp. 35-49, 2008.
[13] C.V.
Wright, L. Ballard, F. Monrose, and G.M. Masson, “Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob?,” Proc. 16th USENIX Security Symp., pp. 4:1-4:12, http://portal.acm.org/citation.cfm?id=1362903.1362907, 2007.
[14] C. Wright, S. Coull, and F. Monrose, “Traffic Morphing: An Efficient Defense against Statistical Traffic Analysis,” Proc. Network and Distributed System Security Symp. (NDSS ’09), Feb. 2009.
[15] M. Backes, G. Doychev, M. Dürmuth, and B. Köpf, “Speaker Recognition in Encrypted Voice Streams,” Proc. 15th European Symp. Research in Computer Security (ESORICS ’10), pp. 508-523, Sept. 2010.