Thrust 2: Interaction Modeling
(2013-14 academic years)
Defined last year by Shri:
1. Analyze and model spatio-temporal context
surrounding (leading to, caused by) behavioral
events.
2. Quantitative description and understanding of the
underlying structure (e.g., timing, phasing of
salient events) and emergence and evolution of
interactions.
Progress since Last Thrust 2 Con Call:
(April 4, 2013)
1. CVPR paper
2. Nine Expeditions-related papers (or more?)
in Interspeech 2013 (Lyon, France)
(some are Thrust 2)
3. Synchronization issues for multimodal RABC
database (all but Kinect).
Thrust 1 / Thrust 2 dependencies
• Thrust 1 is concerned with extracting
high/low-level features for further analysis.
• Thrust 1’s raison d’être is to serve Thrust 2
with descriptors.
• Thrust 2’s obligation is to convey its needs to
Thrust 1.
• We have already developed single-mode
processing to analyze behavior; now it is time
to use multi-mode methods.
Thrust 1 / Thrust 2 dependencies
• We need a comprehensive list of Thrust 1
deliverables.
1. Speech and Audio examples:
• Voicing intervals
• Diarization (who is talking when with cross-talk)
• Word locations (KWS)
• Non-linguistic events (laughter, whining, crying, sighs)
• Prosodic features
• Spectral/articulatory features
2. Q-sensors
• EDA
• Temperature
• Accelerometers
• Statistical measures associated with various states and timings
Thrust 1 / Thrust 2 dependencies
• We need a comprehensive list of Thrust 1
deliverables.
3. Vision
• …. (many)
4. Gaze
• --- (many)
5. Kinect
• --- (many)
Thrust 1: Please fill this out for us.
Thrust 2 projects using RABC
(all multi-modal)
1. Engagement score analysis (revisited)
– We have engagement scores on all 157 sessions
(only 43 are fully annotated).
– Develop optimal categorization using “ground
truth” in the annotated sessions.
– Test using non-annotated sessions with Thrust 1
primitives.
Thrust 2 projects using RABC
(all multi-modal)
2. Parsing of stages (revisited)
– The only attempt was using KWS (~80% correct)
– Obvious extension uses Kinect object tracking
– Others include EDA, para-linguistic detection, gaze,
posture, etc.
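
A minimal sketch of KWS-based stage parsing, assuming the keyword spotter returns time-stamped word hits and that each stage starts at the first hit of one of its keywords. The stage names, keyword sets, and hit times below are hypothetical placeholders, not the ones used in the earlier attempt.

```python
# Hypothetical stage-to-keyword map; the real keyword lists are not given in the slides.
STAGE_KEYWORDS = {
    "greeting": {"hello", "hi"},
    "ball": {"ball"},
    "book": {"book"},
    "hat": {"hat"},
}


def parse_stages(hits, session_end):
    """Split a session into stages from keyword-spotting hits.

    hits: list of (time_sec, word) sorted by time.
    Returns a list of (stage, start_sec, end_sec).
    """
    word_to_stage = {w: s for s, words in STAGE_KEYWORDS.items() for w in words}
    starts = []
    seen = set()
    for t, word in hits:
        stage = word_to_stage.get(word.lower())
        # A stage begins at the first hit of any of its keywords;
        # later hits of an already-started stage are ignored.
        if stage is not None and stage not in seen:
            seen.add(stage)
            starts.append((stage, t))
    segments = []
    for k, (stage, t) in enumerate(starts):
        end = starts[k + 1][1] if k + 1 < len(starts) else session_end
        segments.append((stage, t, end))
    return segments


# Made-up hits for illustration only.
hits = [(1.2, "hello"), (15.0, "ball"), (18.3, "ball"),
        (62.5, "book"), (70.1, "ball"), (110.0, "hat")]
segments = parse_stages(hits, session_end=150.0)
```

Cues from other modalities (for example Kinect object tracking) could supply additional boundary candidates in the same (time, label) form.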
Thrust 2 projects using RABC
(all multi-modal)
3. Response to name:
– Every RABC session starts with greeting.
– After formal session, child’s name is uttered
twice (once from side, once from front)
a. What responses are predictable? (All modes)
b. Can we ascertain when name was called from
child’s reactions w/o audio cues?
c. This will require gaze, head-orientation, EDA,
posture, etc.
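
One way to quantify a response to name is sketched below, assuming a head-orientation track is available as (time, angle) pairs, where the angle is the head's offset from the caller's direction. The function name, the 3-second window, and the 15-degree threshold are illustrative assumptions, not values from the project.

```python
def response_latency(call_time, track, window=3.0, max_angle=15.0):
    """Return seconds from the name call to the first frame in which the head
    is within max_angle degrees of the caller, or None if that never happens
    inside the window.

    track: list of (time_sec, angle_deg) sorted by time.
    """
    for t, angle in track:
        if t < call_time:
            continue  # frame precedes the name call
        if t > call_time + window:
            break  # past the response window
        if abs(angle) <= max_angle:
            return t - call_time
    return None


# Made-up head-orientation frames for illustration only.
track = [(10.0, 80.0), (10.5, 75.0), (11.0, 40.0), (11.5, 10.0), (12.0, 5.0)]
latency = response_latency(10.2, track)
no_response = response_latency(10.2, track, max_angle=2.0)
```

The same pattern could be applied to gaze, EDA, or posture tracks by swapping the per-frame test.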
Multimodal Retrieval
• We currently have the ability to retrieve
interesting regions of an RABC session through
keyword spotting and diarization.
• Easy extension using any of the other
modalities.
• Challenging extensions would be:
– Find when child is unhappy (requires multi-modes)
– Find when child is not responding.
– …
Multimodal Retrieval
Examples:
Is a child making eye contact while an examiner is
talking? (gaze and diarization)
Where is a child looking while an examiner is
talking? (voice activity detector + gaze tracker +
head orientation).
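
A query such as "eye contact while the examiner is talking" reduces to intersecting two lists of time intervals, one per modality. The sketch below assumes each detector outputs sorted, non-overlapping (start, end) intervals in seconds; the variable names and sample intervals are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the time spans covered by both interval lists.

    Both inputs must be sorted by start time and non-overlapping.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Made-up detector outputs for illustration only.
examiner_speech = [(0.0, 2.5), (4.0, 7.0), (10.0, 12.0)]       # from diarization
child_gaze_at_examiner = [(1.0, 3.0), (6.5, 10.5)]             # from gaze tracker
eye_contact_while_talking = intersect(examiner_speech, child_gaze_at_examiner)
```

Further modalities can be added by intersecting the result with another interval list.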
Multimodal Linguistic Analysis
Classify examiner’s speech into the four sentence types:
1. Statement
2. Question (“Are you ready to play with new toys?”,
“where is the yellow duck?”, etc.)
3. Exclamation (“It’s a hat! It’s on my head!”)
4. Command (“Look at my book”)
• Useful for retrieval and parsing.
• Child’s responses and turn-taking to each type can be
tracked.
What Happens During Cross-Talk?
1) terminal overlaps: a speaker assumes the other speaker
has finished or is about to finish their turn. Can include
head nodding and gestures.
2) continuers: examples of the continuer’s phrases are
“mm hm” or “uh huh.” (speech and para-linguistics)
3) conditional access to the turn: the current speaker
yields their turn or invites another speaker to interject in
the conversation, usually as a collaborative effort. (all sorts
of gestures)
4) chordal: a non-serial occurrence of turns, such as
laughter, smiling, and gesturing.
Example of Multi-Modal Retrieval/Analysis
During the ball/book play in RABC, what is a child’s
response when a ball/book is presented? Is a child
looking at the ball/book and the examiner back and forth?
(word spotting + gaze tracker + head orientation).
Is a child making vocalizations? If so, is it a positive or
negative response? (word spotting + emotion classifier)
Does a child smile when a ball/book is presented? (word
spotting + smile detector)
Previous Project Example
Uni-modal paralinguistic event detection
– Detection of laughter in children’s speech
– Training data from FAU Aibo Emotion Corpus (AEC)
consists of spontaneous vocalizations/verbalizations by
adolescents
– Testing on ~10 Rapid ABC sessions yielded accuracy of
70.58%
– Robust laughter detector with good generalization
properties that can be used for paralinguistic event
diarization
Current Project Example
Multi-modal paralinguistic event detection
– Laughter detection using audio
– Smile detection using Omron.
– These have different time scales and frame rates.
– Fusion of confidence scores is likely to improve
accuracy.
– Might EDA also be useful for this?
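
A minimal sketch of score-level (late) fusion for two detectors with different frame rates: both confidence tracks are linearly interpolated onto a common time grid and then combined by a weighted average. The 100 Hz and 30 fps rates, the 0.1 s grid step, and the equal weights are assumptions for illustration; this is not the fusion method used in the project.

```python
from bisect import bisect_right


def sample(times, scores, t):
    """Linearly interpolate a score track at time t (times strictly increasing)."""
    if t <= times[0]:
        return scores[0]
    if t >= times[-1]:
        return scores[-1]
    k = bisect_right(times, t)  # times[k-1] <= t < times[k]
    t0, t1 = times[k - 1], times[k]
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * scores[k - 1] + w * scores[k]


def fuse(t_audio, s_audio, t_video, s_video, step=0.1, w_audio=0.5):
    """Resample both tracks onto one grid and take a weighted average."""
    start = max(t_audio[0], t_video[0])
    end = min(t_audio[-1], t_video[-1])
    n = int((end - start) / step) + 1
    grid = [start + k * step for k in range(n)]
    fused = [w_audio * sample(t_audio, s_audio, t)
             + (1.0 - w_audio) * sample(t_video, s_video, t)
             for t in grid]
    return grid, fused


# Made-up constant score tracks: laughter scores at 100 Hz, smile scores at 30 fps.
t_audio = [k / 100 for k in range(200)]
s_audio = [0.8] * 200
t_video = [k / 30 for k in range(60)]
s_video = [0.4] * 60
grid, fused = fuse(t_audio, s_audio, t_video, s_video)
```

A slower signal such as EDA could be added as a third track in the same way, with the weights tuned on annotated sessions.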
Basic Science Required for Multi-Modal
Analyses: Fusion
Example from USC
Basic Science Required for Multi-Modal
Analyses: Fusion
Example from GT: binary fusion method with different analysis lengths
Opportunistic use of other data as it may
become available
• BU children’s diagnostic data. (Helen Tager-Flusberg)
– Includes audio and video of at-risk ASD children with more
vocalizations.
• RABC Floor time (natural but very uncontrolled)
• ADOS (only available to USC)
• CFD table sessions.
Thrust 2 action items
• Set up blog like Thrust 1
• Identify who is participating
• Monthly con-call for everyone.
• Set up time-table for new / revisited
challenges.

Thrust 2 07-aug-2013_v3 (1)
