speech processing basics

sivakumar m
Lecturer at jntuacep
Speech Processing
• Fundamentals of Digital Speech Processing
1. Anatomy and physiology of speech organs
2. The process of speech production
3. The acoustic theory of speech production
4. Digital models for speech signals
Applications of Speech Processing
• 1. Speech recognition: speech to text
• 2. Speech understanding: the meaning matters rather than the exact words, as in speech translation
• 3. Speech synthesis: text to speech, so that a computer can speak to you
• 4. Word processing: checking and correcting spelling, grammar, and style
• 5. Text prediction: speeding up word processing
• 6. Automatic summarization: topic identification and summary generation
• 7. Text mining: finding the necessary data
• Anatomy: the study of the structure of the bodies of people or animals.
• Physiology: the study of how the bodies of people and animals function. Here it includes understanding the higher-order mechanisms within the human central nervous system that account for speech production.
• Acoustics: the scientific study of sound.
• Phonetics: the study of the sounds of words and of the sounds used in languages.
• Phoneme: the smallest unit of sound that is significant in a language.
• Articulation: the action of producing a sound or word clearly, in speech or music.
• Linguistics: the study of the way in which language works.
• Semantics: the branch of linguistics that deals with the meanings of words and sentences.
Speech Processing
[Diagram: speech processing and the fields it draws on]
• Signal processing: Fourier transforms, discrete-time filters, AR(MA) models
• Information theory: entropy, communication theory, rate-distortion theory
• Statistical SP: stochastic models
• Acoustics: psychoacoustics, room acoustics, speech production
• Phonetics
• Algorithms (programming)
ASR: Application
© James Glass, MIT
[Figure: Automatic Speech Recognition block diagram with the blocks Voice Input, Analog to Digital, Speech Engine (Recognition), Acoustic Model, Language Model, Display, and Feedback]
Speech Generation
• First, the talker formulates a message (in his mind) that he wants to transmit to the listener via speech.
• Message formulation can be thought of as creating printed text that expresses the words of the message.
• The next step is the conversion of the message into a language code.
• This roughly corresponds to converting the printed text of the message into a sequence of phonemes corresponding to the sounds that make up the words, together with the pitch accent associated with those sounds.
• Once the language code is chosen, the talker must execute a series of neuromuscular commands that cause the vocal cords to vibrate when appropriate and shape the vocal tract so that the proper sequence of speech sounds is produced. The final output is an acoustic signal.
Speech Recognition
• First, the listener processes the acoustic signal along the basilar membrane in the inner ear, which provides a running spectrum analysis of the incoming signal.
• A neural transduction process converts the spectral signal at the output of the basilar membrane into activity on the auditory nerve.
• This neural activity is converted into a language code at higher centers of processing within the brain, and message comprehension is achieved.
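The "running spectrum analysis" mentioned above is loosely analogous to a short-time Fourier transform, in which the signal is cut into overlapping frames and a spectrum is computed for each one. The sketch below is only an engineering analogy, not a model of the ear. The function name, frame length, hop size, and test tone are all assumptions chosen for illustration, and it uses only the Python standard library, where real code would more likely use a numerical library.

```python
import cmath
import math

def running_spectrum(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum per overlapping frame (naive STFT)."""
    # Hann window: tapers the frame edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT; for real input only bins 0..frame_len/2 are needed.
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        spectra.append(mags)
    return spectra

# Toy input (assumed values): a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
spectra = running_spectrum(tone)

# Locate the strongest frequency bin of the first frame and convert to Hz.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
peak_hz = peak_bin * fs / 64
print(len(spectra), peak_bin, peak_hz)
```

Each entry of `spectra` describes the frequency content of one short stretch of the signal, so the list as a whole shows how the spectrum evolves over time.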
• The lungs and the associated muscles act as the source of air for exciting the vocal mechanism.
• Muscle force pushes air out of the lungs (shown as a piston pushing up within a cylinder) and through the bronchi and trachea.
• When the vocal cords are tensed, the air flow causes them to vibrate, producing so-called voiced speech sounds.
• When the vocal cords are relaxed, the air flow must pass through a constriction in the vocal tract to produce a sound. It thereby becomes turbulent, producing so-called unvoiced speech sounds.
Classifications
• 1. Silence (S): no speech is produced.
• 2. Unvoiced (U): the vocal cords are not vibrating, so the speech signal is aperiodic or random in nature.
• 3. Voiced (V): the vocal cords vibrate periodically as air flows from the lungs, so the speech signal is approximately periodic.
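One common way to approximate this three-way labelling in software is to compute two simple measurements per frame: short-time energy (near zero for silence) and zero-crossing rate (high for noise-like unvoiced sounds, low for periodic voiced sounds). The slides do not prescribe a method, so the sketch below is an assumption. The function name and both thresholds are hypothetical and would need tuning on real recordings.

```python
import math
import random

def classify_frame(frame, energy_floor=1e-4, zcr_split=0.25):
    """Label one frame as 'S' (silence), 'U' (unvoiced) or 'V' (voiced).

    energy_floor and zcr_split are illustrative thresholds, not standard values.
    """
    # Short-time energy: mean squared amplitude of the frame.
    energy = sum(x * x for x in frame) / len(frame)
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    if energy < energy_floor:
        return "S"
    return "U" if zcr > zcr_split else "V"

# Synthetic 30 ms frames at an assumed 8 kHz sampling rate.
fs = 8000
rng = random.Random(0)
silence = [0.0] * 240
voiced = [math.sin(2 * math.pi * 120 * n / fs) for n in range(240)]  # periodic
unvoiced = [rng.uniform(-0.3, 0.3) for _ in range(240)]              # noise-like

labels = [classify_frame(f) for f in (silence, voiced, unvoiced)]
print(labels)
```

Real speech is far less clean than these synthetic frames, so practical systems usually combine more features and smooth the decisions over neighbouring frames.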
Speech Waveform Characteristics
• Loudness
• Voiced/Unvoiced.
• Pitch.
– Fundamental frequency.
• Spectral envelope.
– Formants.
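For voiced speech, the fundamental frequency can be estimated from the waveform by looking for the lag at which a frame best matches a shifted copy of itself. The sketch below uses a plain autocorrelation search. It is one of several possible approaches, and the function name, search range (60 to 400 Hz), and synthetic test frame are assumptions made for illustration.

```python
import math

def estimate_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame.

    Searches lags between fs/f_max and fs/f_min for the autocorrelation peak.
    The search range is an assumed, rough span of human pitch.
    """
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with itself shifted by `lag` samples.
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

# Synthetic voiced frame: 200 Hz fundamental plus a weaker second harmonic.
fs = 8000
f0 = 200.0
frame = [math.sin(2 * math.pi * f0 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / fs)
         for n in range(400)]

pitch = estimate_pitch(frame, fs)
print(pitch)
```

The frame must be longer than the largest lag searched (here 133 samples), which is why pitch analysis typically uses frames covering at least two pitch periods.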
Speech Waveform Characteristics (cont.)
[Figure: example waveforms of voiced speech, /ih/, and unvoiced speech, /s/]
Phoneme Hierarchy
Speech sounds
• Vowels: iy, ih, ae, aa, ah, ao, ax, eh, er, ow, uh, uw
• Diphthongs: ay, ey, oy, aw
• Consonants
– Glide: w, y
– Plosive: p, b, t, d, k, g
– Nasal: m, n, ng
– Fricative: f, v, th, dh, s, z, sh, zh, h
– Retroflex liquid: r
– Lateral liquid: l
Language dependent. About 50 in English.
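The hierarchy above maps naturally onto a small lookup table. The sketch below encodes only the symbols that appear on the slide (38 in total, so the chart is not an exhaustive inventory of the roughly 50 English phonemes it mentions). The dictionary layout and the helper name `phoneme_class` are assumptions for illustration.

```python
# Phoneme classes exactly as listed on the slide; no symbols have been added.
PHONEME_CLASSES = {
    "vowel": ["iy", "ih", "ae", "aa", "ah", "ao", "ax", "eh",
              "er", "ow", "uh", "uw"],
    "diphthong": ["ay", "ey", "oy", "aw"],
    "glide": ["w", "y"],
    "plosive": ["p", "b", "t", "d", "k", "g"],
    "nasal": ["m", "n", "ng"],
    "fricative": ["f", "v", "th", "dh", "s", "z", "sh", "zh", "h"],
    "retroflex liquid": ["r"],
    "lateral liquid": ["l"],
}

# Invert the table so each phoneme symbol maps to its class name.
CLASS_OF = {symbol: cls
            for cls, symbols in PHONEME_CLASSES.items()
            for symbol in symbols}

def phoneme_class(symbol):
    """Return the class of a phoneme symbol, or 'unknown' if it is not listed."""
    return CLASS_OF.get(symbol, "unknown")

print(phoneme_class("sh"), phoneme_class("ng"), len(CLASS_OF))
```

A lookup such as `phoneme_class("ch")` returns `"unknown"`, because that symbol does not appear in the chart.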
Signal processing
Digital speech processing
• Speech signals are composed of a sequence of sounds.
• The arrangement of these sounds is governed by the rules of language.
• The study of these rules and their implications in human communication is the domain of linguistics.
• The study and classification of the sounds of speech is called phonetics.