Digital Speech Processing

 Dept. of Computer Science & Engineering
 Shahjalal University of Science & Technology
Course Description
– Review of digital signal processing

– Fundamentals of speech production and perception

– Basic techniques for digital speech processing:
   • short-time energy, magnitude, autocorrelation
   • short-time Fourier analysis
   • homomorphic methods
   • linear predictive methods
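
The short-time measures named above can be sketched as follows. This is a minimal illustration rather than the course's own lab code: the frame length, hop size, rectangular (unweighted) window, and the synthetic silence-plus-tone input are all assumptions chosen for the example, and the helper names are hypothetical.

```python
import math

def frames(signal, frame_len, hop):
    """Yield successive frames of frame_len samples, advancing by hop samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def short_time_magnitude(frame):
    """Sum of absolute sample values in the frame."""
    return sum(abs(x) for x in frame)

def short_time_autocorrelation(frame, lag):
    """Sum of products of samples that are `lag` apart, within the frame."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

# Synthetic input (an assumption for illustration): 400 samples of silence
# followed by 400 samples of a 100 Hz sine, sampled at 8 kHz.
fs = 8000
signal = [0.0] * 400 + [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]

frame_list = list(frames(signal, frame_len=200, hop=200))
energies = [short_time_energy(f) for f in frame_list]
magnitudes = [short_time_magnitude(f) for f in frame_list]

# Third frame lies entirely inside the tone.
tone_frame = frame_list[2]
r_full_period = short_time_autocorrelation(tone_frame, 80)  # one 100 Hz period
r_half_period = short_time_autocorrelation(tone_frame, 40)  # half a period
print(energies, magnitudes, r_full_period, r_half_period)
```

The energy and magnitude are zero in the silent frames and large in the tone frames, while the autocorrelation is positive at a lag of one period and negative at half a period.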
Course Description…
– Speech estimation methods
    • speech/non-speech detection
    • voiced/unvoiced/non-speech segmentation/classification
    • pitch detection
    • formant estimation
– Applications of speech signal processing
    • Speech coding
    • Speech synthesis
    • Speech recognition/natural language processing
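
As one illustration of the estimation methods listed above, pitch detection can be sketched by picking the lag that maximizes the short-time autocorrelation. This is only one of several possible approaches and is not claimed to be the method used in the course; the search range of 50 to 400 Hz, the function name, and the synthetic two-harmonic test frame are assumptions.

```python
import math

def autocorr_pitch(frame, fs, f_min=50.0, f_max=400.0):
    """Estimate pitch in Hz as fs / lag, where lag maximizes the
    autocorrelation of the frame within the allowed lag range."""
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag

# Synthetic "voiced" frame (assumption): 125 Hz fundamental plus a weaker
# second harmonic, 60 ms long at an 8 kHz sampling rate.
fs = 8000
frame = [
    math.sin(2 * math.pi * 125 * n / fs) + 0.5 * math.sin(2 * math.pi * 250 * n / fs)
    for n in range(480)
]
estimate = autocorr_pitch(frame, fs)
print(f"estimated pitch: {estimate:.1f} Hz")
```

On this clean synthetic frame the estimate should land close to the 125 Hz fundamental. Real speech usually needs extra steps (voicing decisions, smoothing across frames) that this sketch leaves out.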
Book Information
Textbook:
L. R. Rabiner and R. W. Schafer,
Theory and Applications of Digital Speech
Processing, Prentice-Hall Inc., 2011

Recommended Supplementary Textbook:
• T. F. Quatieri, Principles of Discrete-Time Speech
Processing, Prentice Hall Inc., 2002
Laboratory work: using MATLAB
Speech Processing
Speech is one of the most intriguing signals that humans work
  with every day.

• Purpose of speech processing:
– To understand speech as a means of communication;
– To represent speech for transmission and reproduction;
– To analyze speech for automatic recognition and extraction of
   information
– To discover some physiological characteristics of the talker.
The Speech Stack

• Speech Applications —
  coding, synthesis, recognition, understanding,
  verification, language translation, speed-up/slow-down
• Speech Algorithms —speech-silence
  (background), voiced-unvoiced decision, pitch
  detection, formant estimation
• Speech Representations —
  temporal, spectral, homomorphic, LPC
• Fundamentals —
  acoustics, linguistics, pragmatics, speech
  perception
Speech Coding

Speech Coding is the process of transforming a speech
  signal into a representation for efficient transmission
  and storage of speech
   – narrowband and broadband wired telephony
   – cellular communications
   – Voice over IP (VoIP) to utilize the Internet as a real-time
      communications medium
   – secure voice for privacy and encryption for national
      security applications
   – extremely narrowband communications
      channels, e.g., battlefield applications using HF radio
   – storage of speech for telephone answering
      machines, prerecorded messages
Speech Synthesis
Synthesis of Speech is the process of generating a speech
signal using computational means for effective
human-machine interactions.
Speech Synthesis…
– machine reading of text or email messages
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language
  phrasebooks, dictionaries, crossword puzzle
  helpers
– announcement machines that provide information
  such as stock quotes, airline schedules, weather
  reports, etc.
Speech Synthesis Examples
[Examples: natural vs. synthetic speech]
Speech Recognition and understanding

Recognition and Understanding of Speech is the process of
  extracting usable linguistic information from a speech signal
  in support of human-machine communication by voice
   – command and control (C&C) applications, e.g., simple commands for
      spreadsheets, presentation graphics, appliances
   – voice dictation to create letters, memos, and other documents
   – natural language voice dialogues with machines to enable help
      desks, call centers
   – voice dialing for cellphones and from PDAs and other small devices
   – agent services such as calendar entry and update, address list
      modification and entry, etc.
Speech Recognition Demos
Pattern Matching Problems
• Speech recognition
• Speaker recognition
• Speaker verification
• Word spotting
• Automatic indexing of speech recordings
Other Speech Applications
• Speaker Verification for secure access to
   premises, information, virtual spaces
• Speaker Recognition for legal and forensic purposes—national
   security; also for personalized services
• Speech Enhancement for use in noisy environments, to eliminate
   echo, to align voices with video segments, to change voice
   qualities, and to speed up or slow down prerecorded speech
   (e.g., talking books, rapid review of material, careful scrutiny of
   spoken material), potentially improving the intelligibility and
   naturalness of speech
• Language Translation to convert spoken words in one language to
   another to facilitate natural language dialogues between people
   speaking different languages, e.g., tourists, business people
Speech/DSP Enabled Devices
Digital Speech Processing
• DSP:
– obtaining discrete representations of the speech signal
– theory, design, and implementation of numerical procedures
(algorithms) for processing the discrete representation in order to
achieve a goal (recognizing the signal, modifying the time scale
of the signal, removing background noise from the signal, etc.)
• Why DSP:
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive DSP chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations
via suitable techniques
What We Will Be Learning
• Review some basic DSP concepts
• Speech production model—acoustics, articulatory concepts,
   speech production models
• Speech perception model—ear models, auditory signal
   processing, equivalent acoustic processing models
• Time domain processing concepts—speech properties, pitch,
   voiced-unvoiced, energy, autocorrelation, zero-crossing rates
• Short time Fourier analysis methods—digital filter banks,
   spectrograms, analysis-synthesis systems, vocoders
• Homomorphic speech processing—cepstrum, pitch detection,
   formant estimation, homomorphic vocoder
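
Among the time-domain concepts listed above, the zero-crossing rate is easy to illustrate. The sketch below assumes a simple sign-change count per frame and uses a low-frequency sine and uniform random noise as rough stand-ins for voiced and unvoiced speech; both the stand-in signals and the function name are assumptions made for the example.

```python
import math
import random

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

fs = 8000
# Stand-in for a voiced frame (assumption): 100 Hz sine, 50 ms at 8 kHz.
voiced_like = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]

# Stand-in for an unvoiced frame (assumption): uniform noise, fixed seed.
rng = random.Random(0)
unvoiced_like = [rng.uniform(-1.0, 1.0) for _ in range(400)]

zcr_voiced = zero_crossing_rate(voiced_like)
zcr_unvoiced = zero_crossing_rate(unvoiced_like)
print(zcr_voiced, zcr_unvoiced)
```

The noise-like frame crosses zero far more often than the low-frequency periodic frame, which is why the zero-crossing rate is often paired with short-time energy in voiced-unvoiced decisions.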
What We Will Be Learning…
• Linear predictive coding methods—autocorrelation
   method, covariance method, lattice methods, relation to vocal
   tract models
• Speech waveform coding and source models—delta
   modulation, PCM, mu-law, ADPCM, vector
   quantization, multipulse coding, CELP coding
• Methods for speech synthesis and text-to-speech systems—
   physical models, formant models, articulatory
   models, concatenative models
• Methods for speech recognition—the Hidden Markov Model
   (HMM)
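
Of the waveform coding topics listed above, mu-law companding is compact enough to sketch here. The code uses the standard continuous mu-law curve with mu = 255 on samples normalized to [-1, 1]; the 8-bit uniform quantizer and the test amplitude of 0.01 are assumptions for illustration, and this is not a bit-exact telephony codec.

```python
import math

MU = 255.0

def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] using sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: sgn(y) * ((1 + mu)^|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(v, bits=8):
    """Uniform quantizer on [-1, 1] with step 2^-(bits-1) (illustrative)."""
    levels = 2 ** (bits - 1)
    return round(v * levels) / levels

# A quiet sample (assumption): amplitude 0.01 of full scale.
x = 0.01
uniform_error = abs(quantize(x) - x)
mu_law_error = abs(mu_law_expand(quantize(mu_law_compress(x))) - x)
print(uniform_error, mu_law_error)
```

For this quiet sample the companded path gives a smaller quantization error than plain uniform quantization at the same bit depth, which is the motivation for companding low-amplitude speech.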
