Report on Speech Recognition AI
Tehmeena Naheed (043)
E-mail:
Tayyaba Rani (046)
E-mail :
Tehzeeb Khan Marwat (016)
E-mail :
Abstract:
Artificial Intelligence is becoming a popular field in computer science. In this report we explored its
history, major accomplishments, and the visions of its creators. We looked at how Artificial Intelligence
experts influence reporting and engineered a survey to gauge public opinion. We also examined expert
predictions concerning the future of the field as well as media coverage of its recent accomplishments.
These results were then used to explore the links between expert opinion, public opinion, and media
coverage.
Introduction:
Artificial Intelligence has been studied for decades and is still one of the most elusive subjects in
Computer Science. This is partly due to how large and nebulous the subject is. AI ranges from machines
truly capable of thinking to search algorithms used to play board games. It has applications in nearly
every way we use computers in society. This paper examines the history of artificial intelligence
from theory to practice and from its rise to fall, highlighting a few major themes and advances.
Goal:
There have been various trends in AI ever since its inception. In the earlier days of Artificial Intelligence,
there was an enormous amount of hype about the possibilities of computer technology in creating
intelligent machines. These expectations were unrealistic. We wish to examine the current views
expressed by both experts and laypeople about the nature of Artificial Intelligence, as well as about the
possibilities of AI technology in the near future. In examining both of these we will consider the extent
to which expert opinions and the current trends in Artificial Intelligence align with the views and
opinions of laypeople. From this we hope to comprehend the extent to which the opinions held by
laypeople correspond to the actual innovations in Artificial Intelligence, as well as its past and future
applications.
Speech Recognition:
Definition:
Artificial Intelligence (AI) is the science and engineering of making intelligent machines, especially
intelligent computer programs. Intelligence itself cannot be precisely defined, but AI can be described
as the branch of computer science dealing with the simulation of intelligent behavior by machines.
Speaker independency:
Speech quality varies from person to person. It is therefore difficult to build an electronic system
that recognizes everyone’s voice. By limiting the system to the voice of a single person, the system
becomes not only simpler but also more reliable. The computer must be trained on the voice of that
particular individual. Such a system is called a speaker-dependent system. Speaker-independent systems
can be used by anybody and can recognize any voice, even though voice characteristics vary widely from
one speaker to another. Most of these systems are costly and complex, and they have very limited
vocabularies.
It is important to consider the environment in which the speech recognition system has to work. The
grammar used by the speaker and accepted by the system, noise level, noise type, position of the
microphone, and the speed and manner of the user’s speech are some factors that may affect the quality
of speech recognition.
Environmental influence:
Real applications demand that the performance of the recognition system be unaffected by changes in
the environment. However, when a system is trained and tested under different conditions, the
recognition rate drops unacceptably. We need to be concerned about the variability present when
different microphones are used in training and testing, and specifically during development of
procedures. Such care can significantly improve the accuracy of recognition systems that use desktop
microphones. Acoustical distortions can degrade the accuracy of recognition systems. Obstacles to
robustness include additive noise from machinery, competing talkers, reverberation from surface
reflections in a room, and spectral shaping by microphones and the vocal tracts of individual speakers.
These sources of distortion fall into two complementary classes: additive noise, and distortions
resulting from the convolution of the speech signal with an unknown linear system. A number of
algorithms for speech enhancement have been proposed. These include the following:
1. Spectral subtraction of DFT (discrete Fourier transform) coefficients
2. MMSE (minimum mean-square error) techniques to estimate the DFT coefficients of corrupted speech
3. Spectral equalization to compensate for convolutional distortions
4. Spectral subtraction and spectral equalization combined
Although relatively successful, all these methods depend on the assumption of independence of the
spectral estimates across frequencies. Improved performance can be obtained with an MMSE estimator in
which correlation among frequencies is modeled explicitly.
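As an illustration of the first method in the list above, the following is a minimal sketch of magnitude spectral subtraction, not the implementation of any particular system. It assumes non-overlapping rectangular frames, that the first few frames contain only noise, and a small spectral floor to avoid negative magnitudes. The function name `spectral_subtraction`, its parameters, and the synthetic test signal are illustrative choices that do not come from this report.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, noise_frames=5, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from each frame.

    Assumption: the first `noise_frames` frames contain noise only.
    """
    noisy = np.asarray(noisy, dtype=float)
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)

    # DFT coefficients of every frame, split into magnitude and phase.
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise estimate: average magnitude over the assumed speech-free frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the estimate; keep at least a small fraction of the original
    # magnitude so that no bin becomes negative.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Resynthesize each frame using the noisy phase.
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return cleaned.reshape(-1)

# Synthetic example: a tone that starts after five silent frames, plus white noise.
rng = np.random.default_rng(0)
frame_len = 256
t = np.arange(20 * frame_len)
clean = np.sin(2 * np.pi * 8 * t / frame_len)
clean[: 5 * frame_len] = 0.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

enhanced = spectral_subtraction(noisy, frame_len=frame_len, noise_frames=5)
print("MSE before:", np.mean((noisy - clean) ** 2))
print("MSE after: ", np.mean((enhanced - clean) ** 2))
```

Each frequency bin is processed on its own here, which is exactly the independence assumption the text points out as a limitation of these methods.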
Speaker-specific features:
Speaker identity correlates with the physiological and behavioral characteristics of the speaker. These
characteristics exist both in the vocal tract characteristics and in the voice source characteristics,
as well as in the dynamic features spanning several segments. The most common short-term spectral
measurements currently used are the cepstral coefficients derived from Linear Predictive Coding (LPC)
and their regression coefficients. A spectral envelope reconstructed from a truncated set of cepstral
coefficients is much smoother than one reconstructed from LPC coefficients. Therefore, it provides a
more stable representation from one repetition to another of a particular speaker’s utterances. As for
the regression coefficients, typically the first- and second-order coefficients are extracted at every
frame period to represent the spectral dynamics. These coefficients are derivatives of the time
functions of the cepstral coefficients and are called the delta and delta-delta cepstral coefficients,
respectively.
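The delta and delta-delta coefficients described above can be sketched as follows. This is a minimal example under stated assumptions, not the procedure used by any specific recognizer. It assumes the common regression formula d_t = sum_n n (c_{t+n} - c_{t-n}) / (2 sum_n n^2) over a window of `width` frames on each side, with edge frames repeated as padding. The function name `delta` and the toy feature matrix are hypothetical.

```python
import numpy as np

def delta(features, width=2):
    """First-order regression (delta) coefficients along the time axis.

    `features` has shape (frames, coefficients). Edge frames are repeated
    so that the output has the same shape as the input.
    """
    features = np.asarray(features, dtype=float)
    n_frames = len(features)
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))

    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]    # c[t + n]
        behind = padded[width - n : width - n + n_frames]   # c[t - n]
        out += n * (ahead - behind)
    return out / denom

# Toy data: 10 frames, 2 coefficients that change linearly over time.
cepstra = np.outer(np.arange(10), [3.0, -1.0])
d1 = delta(cepstra)   # delta coefficients
d2 = delta(d1)        # delta-delta coefficients
print(d1[4], d2[4])
```

For features that change linearly in time, the delta values away from the edges equal the slope and the delta-delta values are zero, which gives a quick sanity check of the formula.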
Speech Recognition:
The user communicates with the application through the appropriate input device, i.e. a microphone.
The Recognizer converts the analog signal into a digital signal for speech processing. A stream of text
is generated after the processing. This source-language text becomes input to the Translation Engine,
which converts it to the target-language text.
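The flow described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the analog-to-digital step is modeled as sampling at 16 kHz and rounding to 16-bit integers, and `toy_recognize` and `toy_translate` are invented stand-ins for the Recognizer and the Translation Engine, whose real interfaces are not given in this report.

```python
import numpy as np

def digitize(analog, sample_rate=16000, duration=0.01):
    """Sample a continuous-time signal and quantize it to 16-bit integers.

    `analog` is a function of time in seconds with values in [-1, 1].
    """
    times = np.arange(int(round(sample_rate * duration))) / sample_rate
    values = np.clip(analog(times), -1.0, 1.0)
    return np.round(values * 32767).astype(np.int16)

def speech_to_translation(analog, recognize, translate):
    """Microphone signal -> digital samples -> source text -> target text."""
    samples = digitize(analog)
    source_text = recognize(samples)
    return translate(source_text)

# Hypothetical stand-ins; a real recognizer and translator would replace these.
def toy_recognize(samples):
    return "hello"

def toy_translate(text):
    return {"hello": "bonjour"}.get(text, text)

def tone(t):
    # A 440 Hz tone standing in for the microphone signal.
    return 0.5 * np.sin(2 * np.pi * 440 * t)

samples = digitize(tone)
result = speech_to_translation(tone, toy_recognize, toy_translate)
print(len(samples), samples.dtype, result)
```

Passing the recognizer and translator in as functions keeps the two stages independent, mirroring the separation between the Recognizer and the Translation Engine in the description.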
Salient Features:
1. Input Modes
• Through Speech Engine
• Through soft copy
2. Interactive Graphical User Interface
3. Format Retention
4. Fast and standard translation
5. Interactive Pre-processing tool
• Spell checker.
• Phrase marker.
• Proper noun, date and other package-specific identifier input format.
• Input Format: .txt, .doc, .rtf.
• User-friendly selection of multiple output.
• Online thesaurus for selection of contextually appropriate synonym.
• Online word addition, grammar creation and updating facility.
• Personal account creation and inbox management.
Applications:
One of the main benefits of a speech recognition system is that it lets the user do other work
simultaneously. The user can concentrate on observation and manual operations, and still control the
machinery by voice input commands. Another major application of speech processing is in military
operations. Voice control of weapons is an example. With reliable speech recognition equipment, pilots
can give commands and information to the computers by simply speaking into their microphones; they
don’t have to use their hands for this purpose. Another good example is a radiologist scanning hundreds
of X-rays, ultrasonograms, and CT scans while simultaneously dictating conclusions to a speech
recognition system connected to word processors. The radiologist can focus attention on the images
rather than on writing the text. Voice recognition could also be used on computers for making airline
and hotel reservations. A user simply states their needs to make a reservation, cancel a reservation,
or make enquiries about schedules.
Conclusion:
Speech recognition technology has many uses. It helps physically challenged skilled persons, who can do
their work using this technology without pushing any buttons. Automatic speech recognition (ASR)
technology is also used in military weapons and in research centers. Nowadays this technology is also
used by CID officers, who use it to uncover criminal activities.