SlideShare une entreprise Scribd logo
1  sur  22
BY
Tahmeena naheed(043)
Tehzeeb khan marwat(016)
Tayyaba Rani (046)
University of Wah, Wahcantt
 What is Speech Recognition?
also known as automatic speech recognition or computer
speech recognition which means understanding voice by the
computer and performing any required task.
 Where can it be used?
military operations
System control/navigation
radiologist
Commercial/Industrial applications
Voice dialing
Voice Input Analog to Digital Acoustic Model
Language Model
Display Speech EngineFeedback
 Acoustic Model
An acoustic model is created by taking audio recordings of
speech, and their text transcriptions, and using software to create
statistical representations of the sounds that make up each word.
It is used by a speech recognition engine to recognize speech.
 Language Model
 Language modeling is used in many natural language
processing applications such as speech recognition tries to
capture the properties of a language, and to predict the next word
in a speech sequence.
 There are two types of speech recognition. One is
called speaker-dependent and the other is speaker-
independent. Speaker-dependent software is commonly
used for dictation software, while speaker-independent
software is more commonly found in telephone
applications.
 Speaker-dependent software works by learning the
unique characteristics of a single person’s voice, in a
way similar to voice recognition. New users must first
“train” the software by speaking to it, so the computer
can analyze how the person talks. This often means
users have to read a few pages of text to the computer
before they can use the speech recognition software.
 Speaker-independent software is designed to recognize
anyone’s voice, so no training is involved. This means it
is the only real option for applications such as interactive
voice response systems — where businesses can’t ask
callers to read pages of text before using the system.
The downside is that speaker-independent software is
generally less accurate than speaker-dependent
software.
 Speech recognition engines that are speaker
independent generally deal with this fact by limiting the
grammars they use. By using a smaller list of recognized
words, the speech engine is more likely to correctly
recognize what a speaker said.
 Digitization
 Signal processing
 Phonetics
Speech recognition
• Digitization
– Converting analogue signal into digital
representation
• Signal processing
– Separating speech from background noise
• Phonetics
– Variability in human speech
.
 Analogue to digital conversion
 Sampling and quantizing
Sampling is converting a continuous signal into a discrete signal
 Quantizing is the process of approximating a continuous range of values
 Use filters to measure energy levels for various
points on the frequency spectrum
 Knowing the relative importance of different
frequency bands (for speech) makes this process
more efficient
 E.g. high frequency sounds are less informative, so
can be sampled using a broader bandwidth (log
scale)
 Noise cancelling microphones
◦ Two mics, one facing speaker, the other facing away
◦ Ambient noise is roughly same for both mics
 Knowing which bits of the signal relate to
speech
Speaker
Recognition
Speech
Recognition
parsing
and
arbitration
S1
S2
SK
SN
Speaker
Recognition
Speech
Recognition
parsing
and
arbitration
Switch on
Channel 9
S1
S2
SK
SN
Speaker
Recognition
Speech
Recognition
parsing
and
arbitration
Who is
speaking?
Annie
David
Cathy
S1
S2
SK
SN
“Authentication”
Speaker
Recognition
Speech
Recognition
parsing
and
arbitration
What is he
saying?
On,Off,TV
Fridge,Door
S1
S2
SK
SN
“Understanding”
Speaker
Recognition
Speech
Recognition
parsing
and
arbitration
What is he
talking
about?
Channel->TV
Dim->Lamp
On->TV,Lamp
S1
S2
SK
SN
“Switch”,”to”,”channel”,”nine”
“Inferring and execution”
Face
Recognition
Gesture
Recognition
parsing
and
arbitration
S1
S2
SK
SN
“Authentication” “Understanding” “Inferring and execution”
 Definition
◦ It is the method of recognizing a person based on his voice
◦ It is one of the forms of biometric identification
 Depends of speaker specific
characteristics.
 Advantages
• People with disabilities
◦ Organizations - Increases productivity, reduces costs and errors.
◦ Lower operational Costs
◦ Advances in technology will allow consumers and businesses to
implement speech recognition systems at a relatively low cost.
 Cell-phone users can dial pre-programmed numbers by voice
command.
 Users can trade stocks through a voice-activated trading system.
 Speech recognition technology can also replace touch-tone dialing
resulting in the ability to target customers that speak different
languages
 Difficult to build a perfect system.
 Conversations
◦ Involves more than just words (non-verbal
communication; stutters etc.
◦ Every human being has differences such as their voice,
mouth, and speaking style.
 Filtering background noise is a task that can even
be difficult for humans to accomplish.
 Accuracy will become better and better.
 Dictation speech recognition will gradually become
accepted.
 Small hand-held writing tablets for computer speech
recognition dictation and data entry will be developed, as
faster processors and more memory become available.
 Greater use will be made of "intelligent systems" which
will attempt to guess what the speaker intended to say,
rather than what was actually said, as people often
misspeak and make unintentional mistakes.
 Microphone and sound systems will be designed to
adapt more quickly to changing background noise levels,
different environments, with better recognition of
extraneous material to be discarded.

Contenu connexe

Tendances

Voice Recognition
Voice RecognitionVoice Recognition
Voice Recognition
Amrita More
 
TEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptxTEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptx
Nsaroj kumar
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
ankit_saluja
 

Tendances (20)

A seminar report on speech recognition technology
A seminar report on speech recognition technologyA seminar report on speech recognition technology
A seminar report on speech recognition technology
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentation
 
Voice Recognition
Voice RecognitionVoice Recognition
Voice Recognition
 
Automatic Speech Recognition
Automatic Speech RecognitionAutomatic Speech Recognition
Automatic Speech Recognition
 
Artificial intelligence for speech recognition
Artificial intelligence for speech recognitionArtificial intelligence for speech recognition
Artificial intelligence for speech recognition
 
Speech Recognition System
Speech Recognition SystemSpeech Recognition System
Speech Recognition System
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Speech Recognition
Speech Recognition Speech Recognition
Speech Recognition
 
TEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptxTEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptx
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Automatic speech recognition system
Automatic speech recognition systemAutomatic speech recognition system
Automatic speech recognition system
 
Speech processing
Speech processingSpeech processing
Speech processing
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Speech Recognition by Iqbal
Speech Recognition by IqbalSpeech Recognition by Iqbal
Speech Recognition by Iqbal
 
Artificial intelligence Speech recognition system
Artificial intelligence Speech recognition systemArtificial intelligence Speech recognition system
Artificial intelligence Speech recognition system
 
Text to-speech & voice recognition
Text to-speech & voice recognitionText to-speech & voice recognition
Text to-speech & voice recognition
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
SPEECH RECOGNITION USING NEURAL NETWORK
SPEECH RECOGNITION USING NEURAL NETWORK SPEECH RECOGNITION USING NEURAL NETWORK
SPEECH RECOGNITION USING NEURAL NETWORK
 
Introduction to text to speech
Introduction to text to speechIntroduction to text to speech
Introduction to text to speech
 

En vedette

Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
Ilhaan Marwat
 
Organization Structure and Design
Organization Structure and DesignOrganization Structure and Design
Organization Structure and Design
Marwan H. Noman
 
Mechanistic vs organic organisation
Mechanistic vs organic organisationMechanistic vs organic organisation
Mechanistic vs organic organisation
rogerfed
 
Ch 9 organizational structure and design
Ch 9 organizational structure and designCh 9 organizational structure and design
Ch 9 organizational structure and design
Nardin A
 

En vedette (15)

Superstitions
SuperstitionsSuperstitions
Superstitions
 
internship report on IESCO Wapda final project 2014
internship report on IESCO Wapda final project 2014internship report on IESCO Wapda final project 2014
internship report on IESCO Wapda final project 2014
 
Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
Case study: HOW PAKTEL FLOPES AND ZONG COME INTO MARKET...
 
CHAPTER 10 Managing Human Resources
CHAPTER 10                Managing Human ResourcesCHAPTER 10                Managing Human Resources
CHAPTER 10 Managing Human Resources
 
presentation about guns...
presentation about guns...presentation about guns...
presentation about guns...
 
Assinment 2
Assinment 2Assinment 2
Assinment 2
 
Management 9 chapter Organizational Structure & Design
Management 9 chapter                        Organizational Structure & DesignManagement 9 chapter                        Organizational Structure & Design
Management 9 chapter Organizational Structure & Design
 
Organization Structure and Design
Organization Structure and DesignOrganization Structure and Design
Organization Structure and Design
 
Free masonry
Free masonryFree masonry
Free masonry
 
Superstitions
SuperstitionsSuperstitions
Superstitions
 
Mechanistic vs organic organisation
Mechanistic vs organic organisationMechanistic vs organic organisation
Mechanistic vs organic organisation
 
Representation of male and female in media
Representation of male and female in mediaRepresentation of male and female in media
Representation of male and female in media
 
Mobilink project
Mobilink projectMobilink project
Mobilink project
 
Organization Structure
Organization StructureOrganization Structure
Organization Structure
 
Ch 9 organizational structure and design
Ch 9 organizational structure and designCh 9 organizational structure and design
Ch 9 organizational structure and design
 

Similaire à Speech Recognition in Artificail Inteligence

Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
ankit_saluja
 
Abstract of speech recognition
Abstract of speech recognitionAbstract of speech recognition
Abstract of speech recognition
Vinay Jaisriram
 
Paper on Speech Recognition
Paper on Speech RecognitionPaper on Speech Recognition
Paper on Speech Recognition
Thejus Joby
 

Similaire à Speech Recognition in Artificail Inteligence (20)

Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01
 
Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech Recognition
 
Seminar
SeminarSeminar
Seminar
 
30
3030
30
 
Artificial Intelligence- An Introduction
Artificial Intelligence- An IntroductionArtificial Intelligence- An Introduction
Artificial Intelligence- An Introduction
 
Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Artificial intelligence - research areas
Artificial intelligence - research areasArtificial intelligence - research areas
Artificial intelligence - research areas
 
Abstract of speech recognition
Abstract of speech recognitionAbstract of speech recognition
Abstract of speech recognition
 
Speech recognizers & generators
Speech recognizers & generatorsSpeech recognizers & generators
Speech recognizers & generators
 
AI for voice recognition.pptx
AI for voice recognition.pptxAI for voice recognition.pptx
AI for voice recognition.pptx
 
Assign
AssignAssign
Assign
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
 
Enterprise Voice Technology Solutions: A Primer
Enterprise Voice Technology Solutions: A PrimerEnterprise Voice Technology Solutions: A Primer
Enterprise Voice Technology Solutions: A Primer
 
Paper on Speech Recognition
Paper on Speech RecognitionPaper on Speech Recognition
Paper on Speech Recognition
 
FINAL report
FINAL reportFINAL report
FINAL report
 
Google Voice-to-text
Google Voice-to-textGoogle Voice-to-text
Google Voice-to-text
 
How does speech recognition AI work.pdf
How does speech recognition AI work.pdfHow does speech recognition AI work.pdf
How does speech recognition AI work.pdf
 
Voice recognition system
Voice recognition systemVoice recognition system
Voice recognition system
 
Top 10 Best Speech Recognition Software
Top 10 Best Speech Recognition Software Top 10 Best Speech Recognition Software
Top 10 Best Speech Recognition Software
 

Dernier

The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Dernier (20)

Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 

Speech Recognition in Artificail Inteligence

  • 1. BY Tahmeena naheed(043) Tehzeeb khan marwat(016) Tayyaba Rani (046) University of Wah, Wahcantt
  • 2.  What is Speech Recognition? also known as automatic speech recognition or computer speech recognition which means understanding voice by the computer and performing any required task.
  • 3.  Where can it be used? military operations System control/navigation radiologist Commercial/Industrial applications Voice dialing
  • 4. Voice Input Analog to Digital Acoustic Model Language Model Display Speech EngineFeedback
  • 5.  Acoustic Model An acoustic model is created by taking audio recordings of speech, and their text transcriptions, and using software to create statistical representations of the sounds that make up each word. It is used by a speech recognition engine to recognize speech.  Language Model  Language modeling is used in many natural language processing applications such as speech recognition tries to capture the properties of a language, and to predict the next word in a speech sequence.
  • 6.  There are two types of speech recognition. One is called speaker-dependent and the other is speaker- independent. Speaker-dependent software is commonly used for dictation software, while speaker-independent software is more commonly found in telephone applications.  Speaker-dependent software works by learning the unique characteristics of a single person’s voice, in a way similar to voice recognition. New users must first “train” the software by speaking to it, so the computer can analyze how the person talks. This often means users have to read a few pages of text to the computer before they can use the speech recognition software.
  • 7.  Speaker-independent software is designed to recognize anyone’s voice, so no training is involved. This means it is the only real option for applications such as interactive voice response systems — where businesses can’t ask callers to read pages of text before using the system. The downside is that speaker-independent software is generally less accurate than speaker-dependent software.  Speech recognition engines that are speaker independent generally deal with this fact by limiting the grammars they use. By using a smaller list of recognized words, the speech engine is more likely to correctly recognize what a speaker said.
  • 8.
  • 9.  Digitization  Signal processing  Phonetics Speech recognition
  • 10. • Digitization – Converting analogue signal into digital representation • Signal processing – Separating speech from background noise • Phonetics – Variability in human speech .
  • 11.  Analogue to digital conversion  Sampling and quantizing Sampling is converting a continuous signal into a discrete signal  Quantizing is the process of approximating a continuous range of values  Use filters to measure energy levels for various points on the frequency spectrum  Knowing the relative importance of different frequency bands (for speech) makes this process more efficient  E.g. high frequency sounds are less informative, so can be sampled using a broader bandwidth (log scale)
  • 12.  Noise cancelling microphones ◦ Two mics, one facing speaker, the other facing away ◦ Ambient noise is roughly same for both mics  Knowing which bits of the signal relate to speech
  • 19.  Definition ◦ It is the method of recognizing a person based on his voice ◦ It is one of the forms of biometric identification  Depends of speaker specific characteristics.
  • 20.  Advantages • People with disabilities ◦ Organizations - Increases productivity, reduces costs and errors. ◦ Lower operational Costs ◦ Advances in technology will allow consumers and businesses to implement speech recognition systems at a relatively low cost.  Cell-phone users can dial pre-programmed numbers by voice command.  Users can trade stocks through a voice-activated trading system.  Speech recognition technology can also replace touch-tone dialing resulting in the ability to target customers that speak different languages
  • 21.  Difficult to build a perfect system.  Conversations ◦ Involves more than just words (non-verbal communication; stutters etc. ◦ Every human being has differences such as their voice, mouth, and speaking style.  Filtering background noise is a task that can even be difficult for humans to accomplish.
  • 22.  Accuracy will become better and better.  Dictation speech recognition will gradually become accepted.  Small hand-held writing tablets for computer speech recognition dictation and data entry will be developed, as faster processors and more memory become available.  Greater use will be made of "intelligent systems" which will attempt to guess what the speaker intended to say, rather than what was actually said, as people often misspeak and make unintentional mistakes.  Microphone and sound systems will be designed to adapt more quickly to changing background noise levels, different environments, with better recognition of extraneous material to be discarded.