Hearing by seeing: Can improving the
visibility of the speaker's lips make you hear
better?
Najwa Alghamdi, MSc
Bio
• Lecturer in the Information Technology Department, CCIS, KSU.
• SKERG member
• PhD candidate at the University of Sheffield.
• Member of the Computer Graphics and Virtual Reality research group and
the Speech and Hearing research group
• Supervised by: Dr. Steve Maddock, Prof. Guy Brown and Dr. Jon Barker.
• My research investigates methods for enhancing visual speech
intelligibility* to support hard-of-hearing people (cochlear implant (CI)
users in particular).
• Alghamdi, Najwa / Maddock, Steve / Brown, Guy J. / Barker, Jon (2015):
"Investigating the impact of artificial enhancement of lip visibility on the
intelligibility of spectrally-distorted speech", In FAAVSP-2015, 93-98.
* Speech intelligibility is a measure of how comprehensible speech is in given conditions
28/04/2016 © King Saud University - The University of Sheffield
Introduction
• Cochlear implants help profoundly deaf people
• to become more aware of everyday sounds
• to understand speech better when combined
with lip-reading
• The sound waveform is separated by band-pass
filters into different frequency components
• Users initially describe the sound as
“mechanical” and “synthetic”
[Figure: cochlear implant (CI), with examples labelled Real and Synthesized]
Introduction: Training after implantation
• Auditory training consists of formal listening activities whose goal is to
optimize speech perception.
• Auditory training helps the CI user make use of the new ‘hearing’
• Typically, training uses audio-only speech stimuli
• Recent studies suggest that using visual speech stimuli in the
training may maximize the benefit from the training (Bernstein et
al., 2013)
[Figure panels: Audio-only; Audiovisual]
Introduction: Enhancing training videos
• Lander and Capek (2013) found that increasing and decreasing lip
visibility by applying lipstick and concealer improved the
speechreading performance of words and sentences compared to
natural, unadorned lips
• Our idea is to artificially colour a speaker’s lips in a video sequence
to improve lip visibility
[Figure panels: natural lips; with lipstick; with concealer]
Aim of Research
• Investigate whether or not artificially
enhancing the appearance of a speaker’s lips:
• Supports lip-reading thus improving the
intelligibility of visual speech
• Improves auditory training
• Preliminary step: study non-native, normal-hearing listeners using
cochlear implant simulation. Why?
• Both CI users and non-native listeners face internal adverse
conditions when listening to CI-processed speech:
• Limited linguistic knowledge in non-native listeners (Bent et al., 2009)
• Damaged inner ear in a CI user
• Non-native listeners may help predict the performance of CI users
Enhancement Method
1. Automatic tracking using Faceware Analyzer*, which produces a landmarks XML file
2. XML parser
3. Smoothing landmarks using piecewise bicubic Bézier curves
4. Colour & luminance blending
5. Lip contour smoothing using average filter
*http://facewaretech.com/products/software/analyzer/
Enhancement Method
[Figure panels: Original; Simulated]
Method: Subjects
• 46 non-native, Saudi listeners from King Saud University, Riyadh,
Saudi Arabia
• Minimum IELTS score = 5.5
• Subjects are split into groups
Group | Size | Pre-test | Training | Post-test
A (Audio-only) | 13 | A | A | A
V (Audiovisual) | 19 | A | V | A
E (Enhanced audiovisual) | 14 | A | E | A
Method: Stimuli
• We used the Grid corpus*
• Example: ‘bin blue at L 8 please’
• Audio and video (facial) recordings of 1000
sentences × 34 talkers (18 male, 16 female)
• We used audio and video recordings made
by a single talker
Command | Colour | Preposition | Letter | Digit | Adverb
bin, lay, place, set | blue, green, red, white | at, by, in, with | 25 (no ‘w’) | 10 | again, now, please, soon
*http://spandh.dcs.shef.ac.uk/gridcorpus/
Method: Stimuli
• The Grid videos are processed to produce the different stimuli
• The subjects need to identify the colour, the letter and the digit
keyword of a Grid stimulus in all training and testing sessions
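The keyword task can be illustrated with a short sketch. It assumes that a response is scored as the number of correctly identified keywords out of three; the slides do not state the exact scoring rule, and the names `keywords` and `score` are hypothetical. Digits are written as the characters "0" to "9", which is also an assumption about the transcription.

```python
# Vocabulary as listed in the Grid corpus table.
COMMANDS = ["bin", "lay", "place", "set"]
COLOURS = ["blue", "green", "red", "white"]
PREPOSITIONS = ["at", "by", "in", "with"]
LETTERS = list("ABCDEFGHIJKLMNOPQRSTUVXYZ")  # 25 letters, no 'W'
DIGITS = [str(d) for d in range(10)]          # assumed transcription of the 10 digits
ADVERBS = ["again", "now", "please", "soon"]


def keywords(sentence):
    """Return the three keywords (colour, letter, digit) of a Grid sentence."""
    command, colour, preposition, letter, digit, adverb = sentence.split()
    return colour, letter.upper(), digit


def score(presented, response):
    """Count how many of the three keywords in the response match the stimulus."""
    return sum(p == r for p, r in zip(keywords(presented), response))


# Number of distinct sentences the six-slot vocabulary allows.
n_sentences = (len(COMMANDS) * len(COLOURS) * len(PREPOSITIONS)
               * len(LETTERS) * len(DIGITS) * len(ADVERBS))

print(keywords("bin blue at L 8 please"))
print(score("bin blue at L 8 please", ("blue", "M", "8")))
```

Only three of the six slots are scored, so the command, preposition and adverb act as carrier words.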
• A (Audio-only): Grid audio stimuli of a single speaker are spectrally
distorted (vocoded) to simulate CI-processed speech (Tabri et al., 2011)
• V (Audiovisual): The audio tracks of Grid videos are replaced with
the spectrally distorted Audio-only stimuli
• E (Enhanced audiovisual): The speaker's lips in the Audiovisual stimuli are
automatically tracked and artificially coloured
Results: three sets
1. The impact of using E speech in auditory
training
• training gain = post-test - pre-test
2. A comparison of the intelligibility of A, V
and E speech
• Training scores can be used to provide a
subjective intelligibility assessment
3. Letter confusion matrices from post-test
• Understand the possible sources of
confusion when identifying letters during
the audio-only post-test
1. The Impact of Using Enhanced Audiovisual
Speech in Auditory Training
Measure | A | V | E | ANOVA | Post-hoc
Pre-test mean scores | 14% | 14% | 13% | - | -
Post-test mean scores | 46% | 54% | 71% | p = 0.04 | p = 0.037
Training gain | 32% | 40% | 58% | p = 0.01 | p = 0.009
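As a quick check, the training gain row can be recomputed from the reported mean scores using training gain = post-test − pre-test. This is a minimal sketch; the dictionary and function names are hypothetical.

```python
# Mean scores (%) from the results table, keyed by training group.
pre_test = {"A": 14, "V": 14, "E": 13}
post_test = {"A": 46, "V": 54, "E": 71}


def training_gain(pre, post):
    """Return post-test minus pre-test for every group, in percentage points."""
    return {group: post[group] - pre[group] for group in pre}


gains = training_gain(pre_test, post_test)
print(gains)
```

The result agrees with the gains reported for the A, V and E groups.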
2. A comparison of the intelligibility of A, V and E speech
$$\text{Speech intelligibility of } X \text{ speech} = \frac{\sum_{k=1}^{3} \text{Identification score by } X \text{ subjects in Session } k}{60} \times 100$$
where X = {A, V or E}
ANOVA p = 0.008
Post-hoc p = 0.006, between A & E
3. Letter Confusion Matrices from post-test results
• Letter identification was the most challenging task
• 25 letters to choose from [no ‘W’]
• Due to the vocoding process, some letters sound similar: (P,B), (G,T),
(M,N) and vowels
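A letter confusion matrix of this kind can be built from (presented, response) pairs. The sketch below is illustrative only: the trial data are invented for the example, and the function names are hypothetical.

```python
from collections import Counter

LETTERS = list("ABCDEFGHIJKLMNOPQRSTUVXYZ")  # 25 letters, no 'W'


def confusion_matrix(trials):
    """Rows are presented letters, columns are responses; cells are counts."""
    counts = Counter(trials)
    return [[counts[(p, r)] for r in LETTERS] for p in LETTERS]


def accuracy(trials):
    """Proportion of trials where the response equals the presented letter."""
    return sum(p == r for p, r in trials) / len(trials)


def most_confused(trials, top=3):
    """Most frequent (presented, response) error pairs."""
    errors = Counter((p, r) for p, r in trials if p != r)
    return errors.most_common(top)


# Invented example trials: (letter presented, user's response).
trials = [("P", "B"), ("P", "B"), ("P", "P"), ("M", "N"), ("G", "G")]
matrix = confusion_matrix(trials)
print(accuracy(trials))
print(most_confused(trials, top=1))
```

Off-diagonal cells with high counts point to letter pairs that listeners confuse.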
[Figure: letter confusion matrix; axes: letter presented in the test vs. user’s response]
Clusters
1 Diphthongs
2 Contains plosive sounds
3 Contains nasal sounds
4 Contains fricative sounds
5 Contains a lateral approximant sound
3. Letter Confusion Matrices from post-test results
• The analysis of the letter confusion matrices for the audio-only
post-test shows that E subjects were better at letter and diphthong
identification:
Measure | E | V | A
Letter identification | 75% | 65% | 55%
Vowel identification | 81% | 66% | 52%
[Figure panels: E; V; A]
3. Letter Confusion Matrices from post-test results
• The visual signal might impede learning the discrimination of
visually similar sounds such as P & B.
[Figure panels: E; V; A]
Conclusions
• Audio-only post-training tests suggest that the enhanced visual
signal improves the training gain of participants
• Intelligibility of spectrally-distorted speech is improved when a
corresponding enhanced visual signal is introduced
• Next steps: Expand the study; Similar experiment on a group of CI
and hearing aid users
Current Experiment in SKERG
• Evaluation study of a new enhancement method that exaggerates the
speaking style of the speaker in the video.
[Figure panels: Normal; Exaggerated; Exaggerated with lipstick]
Amalghamdi1@sheffield.ac.uk
www.najwa-alghamdi.net
This research has been supported by the Saudi
Ministry of Education, King Saud University and
Faceware Technologies Inc.
References
• T. Bent, A. Buchwald, and D. B. Pisoni, “Perceptual adaptation and intelligibility of
multiple talkers for two types of degraded speech,” The Journal of the Acoustical
Society of America, vol. 126, no. 5, pp. 2660–2669, 2009.
• L. E. Bernstein, E. T. Auer Jr, S. P. Eberhardt, and J. Jiang, “Auditory perceptual
learning for speech perception can be enhanced by audiovisual training,” Frontiers
in Neuroscience, vol. 7, 2013.
• M. F. Dorman and P. C. Loizou, “The identification of consonants and vowels by
cochlear implant patients using a 6-channel continuous interleaved sampling
processor and by normal-hearing subjects using simulations of processors with
two to nine channels,” Ear and Hearing, vol. 19, no. 2, pp. 162–166, 1998.
• K. Lander and C. Capek, “Investigating the impact of lip visibility and talking style
on speechreading performance,” Speech Communication, vol. 55, no. 5, pp. 600–
605, 2013.
• D. Tabri, K. M. S. A. Chacra, and T. Pring, “Speech perception in noise by
monolingual, bilingual and trilingual listeners,” International Journal of Language
& Communication Disorders, vol. 46, no. 4, pp. 411–422, 2011.
[Figure: letter confusion matrix; axes: letter presented in the test vs. user’s response]
Colouring the lips
• Smoothing the lip contours
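$$B(t) = \sum_{i=0}^{n} b_{i,n}(t)\, P_i, \quad t \in [0, 1]$$
• Where $P_i$ are control points and $b_{i,n}(t)$ is a Bernstein polynomial given
by
$$b_{i,n}(t) = \binom{n}{i}\, t^{i} (1 - t)^{n - i}$$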
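Contour smoothing with a Bézier curve can be sketched as follows. This is a minimal illustration that assumes a single cubic segment with four invented control points; the slides use piecewise curves fitted to tracked lip landmarks, which this sketch does not reproduce. The function names are hypothetical.

```python
from math import comb


def bernstein(i, n, t):
    """Bernstein basis polynomial of degree n, index i, evaluated at t."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] from its 2-D control points."""
    n = len(control_points) - 1
    x = sum(bernstein(i, n, t) * px for i, (px, _) in enumerate(control_points))
    y = sum(bernstein(i, n, t) * py for i, (_, py) in enumerate(control_points))
    return (x, y)


def sample_curve(control_points, samples=20):
    """Return evenly spaced points (in t) along the curve."""
    return [bezier_point(control_points, k / (samples - 1)) for k in range(samples)]


# Invented control points standing in for four lip landmarks.
lip_segment = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
curve = sample_curve(lip_segment, samples=5)
print(curve)
```

The curve passes through the first and last control points and is pulled towards the inner ones, which is what smooths a jagged landmark contour.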
Luminance Blending
• Luminance blending was also used to improve colour blending under
different lighting conditions
• This was accomplished by applying luminance blending in luma/chroma
(Y′CbCr) space and then converting the results to the RGB space using the
following equations
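One way to picture blending in luma/chroma space is sketched below. It assumes the full-range BT.601 (JFIF) conversion between RGB and Y′CbCr and a simple linear blend of the chroma channels that leaves the luma untouched; the slides do not specify which conversion or blending rule was used, and `tint_pixel`, `alpha` and the colour values are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) RGB -> Y'CbCr; an assumed choice of conversion."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, rounded and clipped to the 0..255 range."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(min(255, max(0, round(v))) for v in (r, g, b))


def tint_pixel(pixel, lip_colour, alpha=0.6):
    """Blend only the chroma towards lip_colour; keep the pixel's own luma."""
    y, cb, cr = rgb_to_ycbcr(*pixel)
    _, target_cb, target_cr = rgb_to_ycbcr(*lip_colour)
    cb = (1 - alpha) * cb + alpha * target_cb
    cr = (1 - alpha) * cr + alpha * target_cr
    return ycbcr_to_rgb(y, cb, cr)


original = (200, 120, 110)   # invented skin-like pixel
lipstick = (180, 30, 60)     # invented target lip colour
tinted = tint_pixel(original, lipstick)
print(tinted)
```

Keeping the original luma preserves shading and highlights on the lips, so the added colour follows the lighting in each frame.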
CI simulation
• The GRID audio was spectrally distorted using an eight-channel sine-wave vocoder
(AngelSim*).
• Normal-hearing listeners can perform comparably to CI users when listening
through 8 channels (Dorman and Loizou, 1998)
• The noise fluctuation of a noise vocoder is not present in a real CI (Bent et al., 2009),
so we used the sine-wave vocoder.
• The vocoding process:
1. The signal was divided into 8 channels by band-pass filters [200 to 7,000 Hz]
(slope = 24 dB/octave);
2. Each channel was then low-pass filtered at 160 Hz (slope = 24 dB/octave) to obtain the
envelope;
3. The envelope of each channel modulated a sine wave that replaced the original
signal in that channel
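The three vocoding steps can be sketched in plain Python. This is a simplified illustration, not AngelSim's implementation. It assumes log-spaced band edges, second-order band-pass biquads, a rectifier followed by a one-pole low-pass for the envelope (both far shallower than 24 dB/octave), and sine carriers at each band's geometric centre frequency. All function names are hypothetical.

```python
import math


def band_edges(f_lo=200.0, f_hi=7000.0, n_channels=8):
    """Log-spaced channel edges between f_lo and f_hi (assumed spacing)."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / n_channels) for i in range(n_channels + 1)]


def bandpass(x, fs, f1, f2):
    """Second-order band-pass biquad (audio-EQ cookbook form, 0 dB peak gain)."""
    f0 = math.sqrt(f1 * f2)
    q = f0 / (f2 - f1)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def envelope(x, fs, cutoff=160.0):
    """Rectify, then smooth with a one-pole low-pass at the given cutoff."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for xn in x:
        state = (1 - a) * abs(xn) + a * state
        env.append(state)
    return env


def vocode(x, fs, n_channels=8):
    """Sum of sine carriers, each modulated by its channel's envelope."""
    edges = band_edges(n_channels=n_channels)
    out = [0.0] * len(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(x, fs, lo, hi), fs)
        fc = math.sqrt(lo * hi)  # assumed carrier: geometric centre of the band
        for n, e in enumerate(env):
            out[n] += e * math.sin(2 * math.pi * fc * n / fs)
    return out


fs = 16000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(800)]
vocoded = vocode(tone, fs)
```

The sampling rate must exceed twice the top band edge, so 16 kHz is used for the 7 kHz upper limit. A closer reproduction would swap the filters for 4th-order designs to match the 24 dB/octave slopes.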
*http://www.tigerspeech.com
[Figure: healthy cochlea (قوقعة سليمة)]
[Figure: cochlear implant (قوقعة الكترونية)]
neurosensory hearing-loss conditions

Contenu connexe

Tendances

BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...ijnlc
 
Interspeech 2017 s_miyoshi
Interspeech 2017 s_miyoshiInterspeech 2017 s_miyoshi
Interspeech 2017 s_miyoshiHiroyuki Miyoshi
 
Using Instant Messaging For Collaborative Learning
Using Instant Messaging For  Collaborative LearningUsing Instant Messaging For  Collaborative Learning
Using Instant Messaging For Collaborative LearningElly Lin
 
Using text-to-speech in exams - practical solutions and pitfalls, UK perspective
Using text-to-speech in exams - practical solutions and pitfalls, UK perspectiveUsing text-to-speech in exams - practical solutions and pitfalls, UK perspective
Using text-to-speech in exams - practical solutions and pitfalls, UK perspectiveAbi James
 
Sadia Ppt
Sadia PptSadia Ppt
Sadia PptCYUT
 
Russian english workshop world-call2013
Russian english workshop world-call2013Russian english workshop world-call2013
Russian english workshop world-call2013Konstantin Shestakov
 
A study of the availability and use of assistive technology with dyslexic pup...
A study of the availability and use of assistive technology with dyslexic pup...A study of the availability and use of assistive technology with dyslexic pup...
A study of the availability and use of assistive technology with dyslexic pup...Abi James
 
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...hiij
 
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...hiij
 

Tendances (9)

BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
 
Interspeech 2017 s_miyoshi
Interspeech 2017 s_miyoshiInterspeech 2017 s_miyoshi
Interspeech 2017 s_miyoshi
 
Using Instant Messaging For Collaborative Learning
Using Instant Messaging For  Collaborative LearningUsing Instant Messaging For  Collaborative Learning
Using Instant Messaging For Collaborative Learning
 
Using text-to-speech in exams - practical solutions and pitfalls, UK perspective
Using text-to-speech in exams - practical solutions and pitfalls, UK perspectiveUsing text-to-speech in exams - practical solutions and pitfalls, UK perspective
Using text-to-speech in exams - practical solutions and pitfalls, UK perspective
 
Sadia Ppt
Sadia PptSadia Ppt
Sadia Ppt
 
Russian english workshop world-call2013
Russian english workshop world-call2013Russian english workshop world-call2013
Russian english workshop world-call2013
 
A study of the availability and use of assistive technology with dyslexic pup...
A study of the availability and use of assistive technology with dyslexic pup...A study of the availability and use of assistive technology with dyslexic pup...
A study of the availability and use of assistive technology with dyslexic pup...
 
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
 
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
LANGUAGE CHARACTERISTICS SUPPORTING EARLY ALZHEIMER'S DIAGNOSIS THROUGH MACHI...
 

En vedette

Usability Heuristics for the Design of Interactive Attention Assessment and R...
Usability Heuristics for the Design of Interactive Attention Assessment and R...Usability Heuristics for the Design of Interactive Attention Assessment and R...
Usability Heuristics for the Design of Interactive Attention Assessment and R...HCI Lab
 
Human Factors in the Design and Evaluation of Bioinformatics Tools
Human Factors in the Design and Evaluation of Bioinformatics ToolsHuman Factors in the Design and Evaluation of Bioinformatics Tools
Human Factors in the Design and Evaluation of Bioinformatics ToolsHCI Lab
 
Human Factors in the Design of Emotion-Aware Gaming
Human Factors in the Design of Emotion-Aware GamingHuman Factors in the Design of Emotion-Aware Gaming
Human Factors in the Design of Emotion-Aware GamingHCI Lab
 
Introduction to Rayz
Introduction to RayzIntroduction to Rayz
Introduction to RayzRayz Co.
 
Balancing Priorities in the Academic Workplace
Balancing Priorities in the Academic WorkplaceBalancing Priorities in the Academic Workplace
Balancing Priorities in the Academic WorkplaceAreej Al-Wabil
 
Simulation Modelling in Healthcare: Challenges and Trends
Simulation Modelling in Healthcare: Challenges and TrendsSimulation Modelling in Healthcare: Challenges and Trends
Simulation Modelling in Healthcare: Challenges and TrendsHCI Lab
 
Brain-Computer Interfaces: Applying our Minds to Human-Computer Interaction
Brain-Computer Interfaces: Applying our Minds to Human-Computer InteractionBrain-Computer Interfaces: Applying our Minds to Human-Computer Interaction
Brain-Computer Interfaces: Applying our Minds to Human-Computer InteractionAreej Al-Wabil
 
تنافسية الابداع والابتكار - مهارات الوصول للعالمية
تنافسية الابداع والابتكار  - مهارات الوصول للعالميةتنافسية الابداع والابتكار  - مهارات الوصول للعالمية
تنافسية الابداع والابتكار - مهارات الوصول للعالميةAreej Al-Wabil
 
Social Innovations for Urban Sustainability: Towards Smarter Cities
Social Innovations for Urban Sustainability: Towards Smarter CitiesSocial Innovations for Urban Sustainability: Towards Smarter Cities
Social Innovations for Urban Sustainability: Towards Smarter Cities Areej Al-Wabil
 
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsInnovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsAreej Al-Wabil
 
Selecting Publication Venues in Computing Disciplines
Selecting Publication Venues in Computing DisciplinesSelecting Publication Venues in Computing Disciplines
Selecting Publication Venues in Computing DisciplinesAreej Al-Wabil
 
Games To Explain Human Factors
Games To Explain Human FactorsGames To Explain Human Factors
Games To Explain Human FactorsScott Abel
 
Prototyping in Tangible Interfaces for Complex Systems
Prototyping in Tangible Interfaces for Complex SystemsPrototyping in Tangible Interfaces for Complex Systems
Prototyping in Tangible Interfaces for Complex SystemsAreej Al-Wabil
 
A Study of the Challenges to Information and Communications Technology in Gir...
A Study of the Challenges to Information and Communications Technology in Gir...A Study of the Challenges to Information and Communications Technology in Gir...
A Study of the Challenges to Information and Communications Technology in Gir...HCI Lab
 
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...HCI Lab
 

En vedette (20)

Usability Heuristics for the Design of Interactive Attention Assessment and R...
Usability Heuristics for the Design of Interactive Attention Assessment and R...Usability Heuristics for the Design of Interactive Attention Assessment and R...
Usability Heuristics for the Design of Interactive Attention Assessment and R...
 
Human Factors in the Design and Evaluation of Bioinformatics Tools
Human Factors in the Design and Evaluation of Bioinformatics ToolsHuman Factors in the Design and Evaluation of Bioinformatics Tools
Human Factors in the Design and Evaluation of Bioinformatics Tools
 
GoogleDay_17Apr2013
GoogleDay_17Apr2013GoogleDay_17Apr2013
GoogleDay_17Apr2013
 
ICDR
ICDRICDR
ICDR
 
PSUCASE2014
PSUCASE2014PSUCASE2014
PSUCASE2014
 
Human Factors in the Design of Emotion-Aware Gaming
Human Factors in the Design of Emotion-Aware GamingHuman Factors in the Design of Emotion-Aware Gaming
Human Factors in the Design of Emotion-Aware Gaming
 
Introduction to Rayz
Introduction to RayzIntroduction to Rayz
Introduction to Rayz
 
Balancing Priorities in the Academic Workplace
Balancing Priorities in the Academic WorkplaceBalancing Priorities in the Academic Workplace
Balancing Priorities in the Academic Workplace
 
Dsc_5may2013
Dsc_5may2013Dsc_5may2013
Dsc_5may2013
 
Simulation Modelling in Healthcare: Challenges and Trends
Simulation Modelling in Healthcare: Challenges and TrendsSimulation Modelling in Healthcare: Challenges and Trends
Simulation Modelling in Healthcare: Challenges and Trends
 
Lokomat
LokomatLokomat
Lokomat
 
Brain-Computer Interfaces: Applying our Minds to Human-Computer Interaction
Brain-Computer Interfaces: Applying our Minds to Human-Computer InteractionBrain-Computer Interfaces: Applying our Minds to Human-Computer Interaction
Brain-Computer Interfaces: Applying our Minds to Human-Computer Interaction
 
تنافسية الابداع والابتكار - مهارات الوصول للعالمية
تنافسية الابداع والابتكار  - مهارات الوصول للعالميةتنافسية الابداع والابتكار  - مهارات الوصول للعالمية
تنافسية الابداع والابتكار - مهارات الوصول للعالمية
 
Social Innovations for Urban Sustainability: Towards Smarter Cities
Social Innovations for Urban Sustainability: Towards Smarter CitiesSocial Innovations for Urban Sustainability: Towards Smarter Cities
Social Innovations for Urban Sustainability: Towards Smarter Cities
 
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsInnovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
 
Selecting Publication Venues in Computing Disciplines
Selecting Publication Venues in Computing DisciplinesSelecting Publication Venues in Computing Disciplines
Selecting Publication Venues in Computing Disciplines
 
Games To Explain Human Factors
Games To Explain Human FactorsGames To Explain Human Factors
Games To Explain Human Factors
 
Prototyping in Tangible Interfaces for Complex Systems
Prototyping in Tangible Interfaces for Complex SystemsPrototyping in Tangible Interfaces for Complex Systems
Prototyping in Tangible Interfaces for Complex Systems
 
A Study of the Challenges to Information and Communications Technology in Gir...
A Study of the Challenges to Information and Communications Technology in Gir...A Study of the Challenges to Information and Communications Technology in Gir...
A Study of the Challenges to Information and Communications Technology in Gir...
 
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...
Brain, Body and Bytes: Designing Brain-Computer Interaction Beyond Medical ...
 

Similaire à Improving Hearing by Enhancing Lip Visibility

Proposal-Defense-Presentation-Template-2.pptx
Proposal-Defense-Presentation-Template-2.pptxProposal-Defense-Presentation-Template-2.pptx
Proposal-Defense-Presentation-Template-2.pptxPedroManatad1
 
Using a record sorfware to promote High school students English listening and...
Using a record sorfware to promote High school students English listening and...Using a record sorfware to promote High school students English listening and...
Using a record sorfware to promote High school students English listening and...HanaTiti
 
Research proposal Presentation 1
Research proposal Presentation 1Research proposal Presentation 1
Research proposal Presentation 1amirahjuned
 
Deaf Speech Assessment Using Digital Processing Techniques
Deaf Speech Assessment Using Digital Processing TechniquesDeaf Speech Assessment Using Digital Processing Techniques
Deaf Speech Assessment Using Digital Processing Techniquessipij
 
Jalt 2012...spreading it...vocab sig presentation event...flyer & program final
Jalt 2012...spreading it...vocab sig presentation event...flyer & program finalJalt 2012...spreading it...vocab sig presentation event...flyer & program final
Jalt 2012...spreading it...vocab sig presentation event...flyer & program finalAndy Boon
 
developing listening skills through technology
developing listening skills through technologydeveloping listening skills through technology
developing listening skills through technologyabidayou
 
Academic English language policies and their impacts on language practices in...
Academic English language policies and their impacts on language practices in...Academic English language policies and their impacts on language practices in...
Academic English language policies and their impacts on language practices in...Ali Karakaş
 
PODCASTING; READING 5
PODCASTING; READING 5PODCASTING; READING 5
PODCASTING; READING 5cirauqui
 
Research Proposal Presentation 1
Research Proposal Presentation 1Research Proposal Presentation 1
Research Proposal Presentation 1amirahjuned
 
Ictpresentationgroup8 200318073203
Ictpresentationgroup8 200318073203Ictpresentationgroup8 200318073203
Ictpresentationgroup8 200318073203IPBABUNY
 
Ict presentation group 8
Ict presentation  group 8Ict presentation  group 8
Ict presentation group 8AbdRajab1
 
R211 okada sawaumi ito2017 effects of observing model video presentations on ...
R211 okada sawaumi ito2017 effects of observing model video presentations on ...R211 okada sawaumi ito2017 effects of observing model video presentations on ...
R211 okada sawaumi ito2017 effects of observing model video presentations on ...Takehiko Ito
 
Mixed approach blended learning as a theoretical framework for the applicati...
Mixed approach blended learning as a theoretical framework  for the applicati...Mixed approach blended learning as a theoretical framework  for the applicati...
Mixed approach blended learning as a theoretical framework for the applicati...suhailaabdulaziz
 
Tech assisted language learning tasks in an efl setting- use of hand phone re...
Tech assisted language learning tasks in an efl setting- use of hand phone re...Tech assisted language learning tasks in an efl setting- use of hand phone re...
Tech assisted language learning tasks in an efl setting- use of hand phone re...James Cook University
 
ELSA's Speech Recognition Overview
ELSA's Speech Recognition OverviewELSA's Speech Recognition Overview
ELSA's Speech Recognition OverviewLinhVu946763
 
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...Anushiya Sethupathy
 
Problems and Difficulties of Speaking That Encounter English Language Student...
Problems and Difficulties of Speaking That Encounter English Language Student...Problems and Difficulties of Speaking That Encounter English Language Student...
Problems and Difficulties of Speaking That Encounter English Language Student...inventionjournals
 

Similaire à Improving Hearing by Enhancing Lip Visibility (20)

Proposal-Defense-Presentation-Template-2.pptx
Proposal-Defense-Presentation-Template-2.pptxProposal-Defense-Presentation-Template-2.pptx
Proposal-Defense-Presentation-Template-2.pptx
 
Using a record sorfware to promote High school students English listening and...
Using a record sorfware to promote High school students English listening and...Using a record sorfware to promote High school students English listening and...
Using a record sorfware to promote High school students English listening and...
 
Research proposal Presentation 1
Research proposal Presentation 1Research proposal Presentation 1
Research proposal Presentation 1
 
Deaf Speech Assessment Using Digital Processing Techniques
Deaf Speech Assessment Using Digital Processing TechniquesDeaf Speech Assessment Using Digital Processing Techniques
Deaf Speech Assessment Using Digital Processing Techniques
 
MAL
MALMAL
MAL
 
Jalt 2012...spreading it...vocab sig presentation event...flyer & program final
Jalt 2012...spreading it...vocab sig presentation event...flyer & program finalJalt 2012...spreading it...vocab sig presentation event...flyer & program final
Jalt 2012...spreading it...vocab sig presentation event...flyer & program final
 
developing listening skills through technology
developing listening skills through technologydeveloping listening skills through technology
developing listening skills through technology
 
Academic English language policies and their impacts on language practices in...
Academic English language policies and their impacts on language practices in...Academic English language policies and their impacts on language practices in...
Academic English language policies and their impacts on language practices in...
 
PODCASTING; READING 5
PODCASTING; READING 5PODCASTING; READING 5
PODCASTING; READING 5
 
Research Proposal Presentation 1
Research Proposal Presentation 1Research Proposal Presentation 1
Research Proposal Presentation 1
 
Ictpresentationgroup8 200318073203
Ictpresentationgroup8 200318073203Ictpresentationgroup8 200318073203
Ictpresentationgroup8 200318073203
 
Ict presentation group 8
Ict presentation  group 8Ict presentation  group 8
Ict presentation group 8
 
R211 okada sawaumi ito2017 effects of observing model video presentations on ...
R211 okada sawaumi ito2017 effects of observing model video presentations on ...R211 okada sawaumi ito2017 effects of observing model video presentations on ...
R211 okada sawaumi ito2017 effects of observing model video presentations on ...
 
Mixed approach blended learning as a theoretical framework for the applicati...
Mixed approach blended learning as a theoretical framework  for the applicati...Mixed approach blended learning as a theoretical framework  for the applicati...
Mixed approach blended learning as a theoretical framework for the applicati...
 
Tech assisted language learning tasks in an efl setting- use of hand phone re...
Tech assisted language learning tasks in an efl setting- use of hand phone re...Tech assisted language learning tasks in an efl setting- use of hand phone re...
Tech assisted language learning tasks in an efl setting- use of hand phone re...
 
Abstract
AbstractAbstract
Abstract
 
ELSA's Speech Recognition Overview
ELSA's Speech Recognition OverviewELSA's Speech Recognition Overview
ELSA's Speech Recognition Overview
 
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...
A COMPARATIVE STUDY OF CAPTIONED VIDEO AND FACE-TO-FACE INSTRUCTION IN TEACHI...
 
Problems and Difficulties of Speaking That Encounter English Language Student...
Problems and Difficulties of Speaking That Encounter English Language Student...Problems and Difficulties of Speaking That Encounter English Language Student...
Problems and Difficulties of Speaking That Encounter English Language Student...
 
Spaan
SpaanSpaan
Spaan
 

Improving Hearing by Enhancing Lip Visibility

  • 1. Hearing by seeing: Can improving the visibility of the speaker's lips make you hear better? Najwa Alghamdi, MSc
  • 2. Bio • Lecturer in the Information Technology Department, CCIS, KSU. • SKERG member • PhD candidate at the University of Sheffield. • Member of the Computer Graphics and Virtual Reality research group and the Speech and Hearing research group • Supervised by: Dr. Steve Maddock, Prof. Guy Brown and Dr. Jon Barker. • My research investigates methods for enhancing visual speech intelligibility* to support hard-of-hearing people (cochlear implant (CI) users in particular). • Alghamdi, Najwa / Maddock, Steve / Brown, Guy J. / Barker, Jon (2015): "Investigating the impact of artificial enhancement of lip visibility on the intelligibility of spectrally-distorted speech", in FAAVSP-2015, 93-98. * Speech intelligibility is a measure of how comprehensible speech is in given conditions. 28/04/2016 © King Saud University - The University of Sheffield
  • 3. Introduction • Cochlear implants help profoundly deaf people • to become more aware of everyday sounds • to understand speech better when combined with lip-reading • The sound waveform is separated by band-pass filters into different frequency components • Users initially describe the sound as “mechanical” and “synthetic” • [Audio examples: real, synthesized. Figure: cochlear implant (CI)]
  • 4. Introduction: Training after implantation • Auditory training consists of formal listening activities whose goal is to optimize speech perception. • Auditory training helps the CI user learn to use the new ‘hearing’ • Typically, training uses audio-only speech stimuli • Recent studies suggest that using visual speech stimuli in the training may maximize the benefit from the training (Bernstein et al., 2013) • [Figure: audio-only vs. audiovisual training]
  • 5. Introduction: Enhancing training videos • Lander and Capek (2013) found that increasing and decreasing lip visibility by applying lipstick and concealer improved the speechreading performance of words and sentences compared to natural, unadorned lips • Our idea is to artificially colour a speaker’s lips in a video sequence to improve lip visibility • [Figure: natural lip, with lipstick, with concealer]
  • 6. Aim of Research • Investigate whether or not artificially enhancing the appearance of a speaker’s lips: • Supports lip-reading, thus improving the intelligibility of visual speech • Improves auditory training • Preliminary step: study non-native, normal-hearing listeners using a cochlear implant simulation. Why? • Both CI users and non-native listeners deal with internal adverse conditions when listening to CI-processed speech: • Linguistic knowledge in non-native listeners (Bent et al., 2009) • Damaged inner ear in a CI user • Non-native listeners may help predict the performance of CI users
  • 7. Enhancement Method • Automatic tracking using Faceware Analyzer* (output: landmarks XML file) • XML parser • Smoothing landmarks using piecewise bicubic Bézier curves • Colour & luminance blending • Lip contour smoothing using an average filter • *http://facewaretech.com/products/software/analyzer/
  • 8. Enhancement Method • [Figure: original vs. simulated lip colour]
  • 9. Method: Subjects • 46 non-native, Saudi listeners from King Saud University, Riyadh, Saudi Arabia • Minimum IELTS score = 5.5 • Subjects are split into three groups; the pre-test and post-test are audio-only (A) for all groups:
      Group                       Size   Pre-test   Training   Post-test
      A  (Audio-only)             13     A          A          A
      V  (Audiovisual)            19     A          V          A
      E  (Enhanced audiovisual)   14     A          E          A
  • 10. Method: Stimuli • We used the Grid corpus* • Example: ‘bin blue at L 8 please’ • Audio and video (facial) recordings of 1000 sentences × 34 talkers (18 male, 16 female) • We used audio and video recordings made by a single talker • *http://spandh.dcs.shef.ac.uk/gridcorpus/ • Sentence structure:
      command       bin, lay, place, set
      colour        blue, green, red, white
      preposition   at, by, in, with
      letter        25 letters (no ‘W’)
      digit         10 digits
      adverb        again, now, please, soon
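The six-slot structure of a Grid sentence can be illustrated with a short sketch. The word lists come from the slide; writing the digits as "0" to "9" and the function names are assumptions made for illustration only.

```python
import random
import string

# Word lists as listed on the slide. The digit spelling ("0".."9") is an assumption.
COMMANDS = ["bin", "lay", "place", "set"]
COLOURS = ["blue", "green", "red", "white"]
PREPOSITIONS = ["at", "by", "in", "with"]
LETTERS = [c for c in string.ascii_uppercase if c != "W"]  # 25 letters, no 'W'
DIGITS = [str(d) for d in range(10)]
ADVERBS = ["again", "now", "please", "soon"]

# Slot order: command, colour, preposition, letter, digit, adverb.
SLOTS = [COMMANDS, COLOURS, PREPOSITIONS, LETTERS, DIGITS, ADVERBS]


def random_grid_sentence(rng):
    """Draw one word per slot and join them into a Grid-style sentence."""
    return " ".join(rng.choice(slot) for slot in SLOTS)


def count_possible_sentences():
    """Number of distinct sentences the six slots can produce."""
    total = 1
    for slot in SLOTS:
        total *= len(slot)
    return total


if __name__ == "__main__":
    print(random_grid_sentence(random.Random(0)))
    print(count_possible_sentences())
```

The slots allow far more combinations than the 1000 sentences recorded per talker, so each talker covers only a sample of the grammar.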
  • 11. Method: Stimuli • The Grid videos are processed to produce the different stimuli • The subjects need to identify the colour, the letter and the digit keyword of a Grid stimulus in all training and testing sessions • A: Grid audio stimuli of a single speaker are spectrally distorted (vocoded) to simulate CI-processed speech (Tabri et al., 2011) • V: The audio tracks of Grid videos are replaced with the spectrally distorted audio-only stimuli • E: The speaker's lips in the audiovisual stimuli are automatically tracked and artificially coloured
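The identification task can be sketched as a small scoring function. The slides do not give the scoring rule, so this assumes one point per correctly identified keyword (colour, letter, digit); the function name and the case-insensitive comparison are illustrative choices.

```python
# Positions of the three scored keywords in a six-word Grid sentence:
# command(0) colour(1) preposition(2) letter(3) digit(4) adverb(5)
KEYWORD_POSITIONS = {"colour": 1, "letter": 3, "digit": 4}


def score_keywords(target_sentence, colour, letter, digit):
    """Return how many of the three keywords were identified correctly (0 to 3).

    Assumed scoring rule: one point per matching keyword, case-insensitive.
    """
    words = target_sentence.lower().split()
    response = {"colour": colour, "letter": letter, "digit": digit}
    return sum(
        1
        for name, position in KEYWORD_POSITIONS.items()
        if words[position] == str(response[name]).lower()
    )


if __name__ == "__main__":
    # Colour and letter are right, digit is wrong.
    print(score_keywords("bin blue at L 8 please", "blue", "l", "3"))
```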
  • 12. Results: three sets 1. The impact of using E speech in auditory training • training gain = post-test - pre-test 2. A comparison of the intelligibility of A, V and E speech • Training scores can be used to provide a subjective intelligibility assessment 3. Letter confusion matrices from the post-test • Understand the possible sources of confusion when identifying letters during the audio-only post-test
  • 13. 1. The Impact of Using Enhanced Audiovisual Speech in Auditory Training
      Measure                 A     V     E     ANOVA      Post-hoc
      Pre-test mean scores    14%   14%   13%
      Post-test mean scores   46%   54%   71%   p = 0.04   p = 0.037
      Training gain           32%   40%   58%   p = 0.01   p = 0.009
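The training-gain row follows from the definition given earlier in the deck (training gain = post-test - pre-test). A minimal check using the reported group means, with the dictionary layout chosen here only for illustration:

```python
# Mean scores (percent) per group, as reported on the slide.
pre_test = {"A": 14, "V": 14, "E": 13}
post_test = {"A": 46, "V": 54, "E": 71}


def training_gain(pre, post):
    """Training gain per group: post-test score minus pre-test score."""
    return {group: post[group] - pre[group] for group in pre}


gains = training_gain(pre_test, post_test)

if __name__ == "__main__":
    for group, gain in gains.items():
        print(group, gain)
```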
  • 14. 2. A comparison of the intelligibility of A, V and E speech • $\text{Speech intelligibility of } X \text{ speech} = \dfrac{\sum_{k=1}^{3} \text{identification score by } X \text{ subjects in session } k}{60} \times 100$, where $X \in \{A, V, E\}$ • ANOVA: p = 0.008 • Post-hoc: p = 0.006, between A and E
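The intelligibility measure can be sketched as follows, reading it as the sum of identification scores over the three training sessions, divided by 60 and scaled to a percentage. Treating 60 as the maximum attainable total is an assumption, and the session scores below are made-up placeholders rather than study data.

```python
def speech_intelligibility(session_scores, max_total=60):
    """Percentage intelligibility from per-session identification scores.

    session_scores: one identification score per training session (three in the study).
    max_total: assumed maximum total score across all sessions (60 on the slide).
    """
    return sum(session_scores) / max_total * 100


if __name__ == "__main__":
    # Hypothetical scores for sessions 1..3, for illustration only.
    print(speech_intelligibility([10, 15, 20]))
```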
  • 15. 3. Letter Confusion Matrices from post-test results • Letter identification was the most challenging task • 25 letters to choose from [no ‘W’] • Due to the vocoding process, some letters sound similar: (P,B), (G,T), (M,N) and vowels • [Figure: confusion matrix. Axes: letter presented in the test, user’s response] • Clusters: 1. Diphthongs 2. Contains plosive sounds 3. Contains nasal sounds 4. Contains fricative sounds 5. Contains a lateral approximant sound
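A letter confusion matrix of this kind can be built by counting (presented, response) pairs. The sketch below uses invented trials, not the study's responses, and the helper names are illustrative.

```python
from collections import Counter


def confusion_matrix(trials):
    """Count how often each presented letter received each response.

    trials: iterable of (presented_letter, response_letter) pairs.
    Returns a Counter keyed by (presented, response).
    """
    return Counter((presented, response) for presented, response in trials)


def percent_correct(trials, subset=None):
    """Percentage of trials answered correctly, optionally limited to a set of presented letters."""
    considered = [(p, r) for p, r in trials if subset is None or p in subset]
    if not considered:
        return 0.0
    correct = sum(1 for p, r in considered if p == r)
    return 100 * correct / len(considered)


if __name__ == "__main__":
    # Hypothetical trials: P heard as B, M heard as N, the rest correct.
    trials = [("P", "B"), ("P", "P"), ("M", "N"), ("A", "A")]
    print(confusion_matrix(trials))
    print(percent_correct(trials))
```

Off-diagonal cells such as ("P", "B") are the confusions; passing a subset of letters to `percent_correct` gives a per-cluster identification rate.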
  • 16. 3. Letter Confusion Matrices from post-test results • The analysis of the letter confusion matrices for the audio-only post-test shows that E subjects were better at letter and diphthong identification:
      Measure                 E     V     A
      Letter identification   75%   65%   55%
      Vowel identification    81%   66%   52%
    [Figure: confusion matrices for groups E, V and A]
  • 17. 3. Letter Confusion Matrices from post-test results • The visual signal might impede learning the discrimination of visually similar sounds such as P and B. • [Figure: confusion matrices for groups E, V and A]
  • 18. Conclusions • Audio-only post-training tests suggest that the enhanced visual signal improves the training gain of participants • Intelligibility of spectrally-distorted speech is improved when a corresponding enhanced visual signal is introduced • Next steps: expand the study; run a similar experiment on a group of CI and hearing aid users
  • 19. Current Experiment in SKERG • Evaluation study of a new enhancement method that exaggerates the speaking style of the speaker in the video. • [Figure: normal, exaggerated, exaggerated with lipstick]
  • 20. Amalghamdi1@sheffield.ac.uk • www.najwa-alghamdi.net • This research has been supported by the Saudi Ministry of Education, King Saud University and Faceware Technologies Inc.
  • 21. References • T. Bent, A. Buchwald, and D. B. Pisoni, “Perceptual adaptation and intelligibility of multiple talkers for two types of degraded speech,” The Journal of the Acoustical Society of America, vol. 126, no. 5, pp. 2660–2669, 2009. • L. E. Bernstein, E. T. Auer Jr, S. P. Eberhardt, and J. Jiang, “Auditory perceptual learning for speech perception can be enhanced by audiovisual training,” Frontiers in Neuroscience, vol. 7, 2013. • M. F. Dorman and P. C. Loizou, “The identification of consonants and vowels by cochlear implant patients using a 6-channel continuous interleaved sampling processor and by normal-hearing subjects using simulations of processors with two to nine channels,” Ear and Hearing, vol. 19, no. 2, pp. 162–166, 1998. • K. Lander and C. Capek, “Investigating the impact of lip visibility and talking style on speechreading performance,” Speech Communication, vol. 55, no. 5, pp. 600–605, 2013. • D. Tabri, K. M. S. A. Chacra, and T. Pring, “Speech perception in noise by monolingual, bilingual and trilingual listeners,” International Journal of Language & Communication Disorders, vol. 46, no. 4, pp. 411–422, 2011.
  • 22. [Figure: letter confusion matrix. Axes: letter presented in the test, user’s response]
  • 23. Colouring the lips • Smoothing the lip contours: $C(t) = \sum_{i=0}^{n} P_i \, B_{i,n}(t)$ • where $P_i$ are control points and $B_{i,n}(t)$ is a Bernstein polynomial given by $B_{i,n}(t) = \binom{n}{i} \, t^{i} (1 - t)^{n-i}$
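A Bézier curve built from control points and Bernstein polynomials can be evaluated with a few lines of code. The sketch below uses the standard definition, in which a curve of degree n is the sum of its control points weighted by the Bernstein polynomials C(n, i) t^i (1 - t)^(n - i). The slide's own notation was not recoverable, so the symbols, the function names and the sample control points are assumptions.

```python
import math


def bernstein(i, n, t):
    """Bernstein basis polynomial of degree n and index i, evaluated at t in [0, 1]."""
    return math.comb(n, i) * t**i * (1 - t) ** (n - i)


def bezier_point(control_points, t):
    """Point on the Bezier curve defined by 2D control points at parameter t."""
    n = len(control_points) - 1
    x = sum(bernstein(i, n, t) * px for i, (px, _) in enumerate(control_points))
    y = sum(bernstein(i, n, t) * py for i, (_, py) in enumerate(control_points))
    return (x, y)


if __name__ == "__main__":
    # Hypothetical cubic segment, e.g. four landmarks along one stretch of a lip contour.
    points = [(0, 0), (1, 2), (3, 2), (4, 0)]
    for step in range(5):
        print(bezier_point(points, step / 4))
```

A piecewise curve would chain several such cubic segments, each with its own four control points, so that the end of one segment is the start of the next.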
  • 24. Luminance Blending • Luminance blending was also used to improve colour blending under different lighting conditions • It was applied in luma/chroma (Y'CbCr) space, and the results were then converted to RGB space using the following equations
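The conversion equations on this slide did not survive, so the following is only a sketch of how blending in Y'CbCr space might look. It assumes the full-range JFIF (BT.601) conversion, which may differ from the one used in the study, and the blending rule (take the chroma from the lip colour while keeping most of the pixel's own luma) and its parameters are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range JFIF/BT.601 RGB -> Y'CbCr (assumed variant), inputs in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, with each channel clamped to 0..255."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(min(255.0, max(0.0, c)) for c in (r, g, b))


def blend_lip_pixel(pixel, lip_colour, chroma_alpha=1.0, luma_alpha=0.0):
    """Blend a lip colour into one pixel in Y'CbCr space (hypothetical rule).

    chroma_alpha: how much of the lip colour's chroma (Cb, Cr) to apply.
    luma_alpha: how much of the lip colour's luma to apply; 0 keeps the
    pixel's own brightness, so shading and highlights are preserved.
    """
    py, pcb, pcr = rgb_to_ycbcr(*pixel)
    ly, lcb, lcr = rgb_to_ycbcr(*lip_colour)
    y = (1 - luma_alpha) * py + luma_alpha * ly
    cb = (1 - chroma_alpha) * pcb + chroma_alpha * lcb
    cr = (1 - chroma_alpha) * pcr + chroma_alpha * lcr
    return ycbcr_to_rgb(y, cb, cr)


if __name__ == "__main__":
    # Hypothetical skin-toned pixel recoloured with a red lipstick colour.
    print(blend_lip_pixel((180, 120, 110), (200, 30, 60)))
```

Keeping the pixel's luma is one plausible way to make the colouring robust to lighting changes, since the brightness pattern of the original frame is left untouched.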
  • 25. CI simulation • The Grid audio was spectrally distorted using an eight-channel sine-wave vocoder (AngelSim*). • Normal-hearing listeners can perform in a comparable way to CI users when hearing no more and no fewer than 8 channels (Dorman et al., 1998) • The noise fluctuation of a noise vocoder is not present in a real CI (Bent et al., 2009), so we used the sine-wave vocoder. • The vocoding process: 1. The signal is divided into 8 channels by band-pass filters [200 to 7,000 Hz] (slope = 24 dB/octave); 2. Each channel is then low-pass filtered at 160 Hz (slope = 24 dB/octave) to obtain the envelope; 3. The envelope of each channel modulates a sine wave that replaces the signal frequency • *http://www.tigerspeech.com
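The three vocoding steps can be sketched in plain Python. This is not AngelSim's implementation: the log-spaced band edges, the geometric-mean carrier frequencies, the rectification step and the use of cascaded biquad filters (Audio EQ Cookbook formulas) to approximate the 24 dB/octave slopes are all assumptions made for illustration.

```python
import math


def biquad(x, b, a):
    """Apply one biquad with normalised coefficients b=(b0, b1, b2), a=(a1, a2)."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def bandpass_coeffs(f_lo, f_hi, fs):
    """Second-order band-pass (0 dB peak gain) centred on the band's geometric mean."""
    f0 = math.sqrt(f_lo * f_hi)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * f0 / (f_hi - f_lo))  # Q = f0 / bandwidth
    a0 = 1 + alpha
    return (alpha / a0, 0.0, -alpha / a0), (-2 * math.cos(w0) / a0, (1 - alpha) / a0)


def lowpass_coeffs(fc, fs):
    """Second-order low-pass with Q = 1/sqrt(2)."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * math.sqrt(0.5))
    c, a0 = math.cos(w0), 1 + alpha
    return ((1 - c) / 2 / a0, (1 - c) / a0, (1 - c) / 2 / a0), (-2 * c / a0, (1 - alpha) / a0)


def sine_vocode(signal, fs, n_channels=8, f_min=200.0, f_max=7000.0, env_cutoff=160.0):
    """Sine-wave vocoder sketch; fs should comfortably exceed 2 * f_max."""
    # Assumed channel layout: band edges spaced evenly on a log-frequency axis.
    edges = [f_min * (f_max / f_min) ** (i / n_channels) for i in range(n_channels + 1)]
    out = [0.0] * len(signal)
    for lo, hi in zip(edges, edges[1:]):
        # Step 1: band-pass; four cascaded sections give roughly 24 dB/octave skirts.
        band = signal
        bp = bandpass_coeffs(lo, hi, fs)
        for _ in range(4):
            band = biquad(band, *bp)
        # Step 2: envelope by rectifying, then low-pass (two sections, about 24 dB/octave).
        env = [abs(v) for v in band]
        lp = lowpass_coeffs(env_cutoff, fs)
        for _ in range(2):
            env = biquad(env, *lp)
        # Step 3: the envelope modulates a sine carrier at the band centre.
        fc = math.sqrt(lo * hi)
        for n, e in enumerate(env):
            out[n] += max(e, 0.0) * math.sin(2 * math.pi * fc * n / fs)
    return out


if __name__ == "__main__":
    fs = 16000
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(1600)]  # 0.1 s test tone
    vocoded = sine_vocode(tone, fs)
    print(len(vocoded), max(abs(v) for v in vocoded))
```

The output keeps each band's slow amplitude envelope but discards its fine spectral detail, which is the kind of spectral distortion the study uses to simulate CI-processed speech.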
  • 26. Healthy cochlea (قوقعة سليمة)
  • 27. Cochlear implant (قوقعة الكترونية) • Neurosensory hearing-loss conditions

Editor's notes

  1. Why CI users? 18% of Saudi children are hard of hearing. Cochlear implant surgeries are flourishing in Saudi Arabia. The number of CI users is increasing.
  2. The cochlea (the inner ear) is responsible for pitch perception. It is composed of thousands of hair cells that are arranged in pitch order. The vibration of sound causes the cochlear fluid to move, which stimulates the hair cells. The brain then perceives sounds through electrical pulses passed on by the stimulated hair cells.