Expressive Gestures for NAO Robot Communication
1. Expressive Gestures for NAO
Quoc Anh Le- Catherine Pelachaud
CNRS, LTCI, Telecom-ParisTech, France
NAO TechDay, 13/06/2012, Paris
2. Motivation
Importance of expressive
gestures [Li et al, 2009]
• Communicating messages
• Expressing affective states
Relation between gesture and
speech [Kendon, 2004]
• Two aspects of the same
process of utterance
• Complement and supplement
Believability and life-likeness
• Robot should communicate in
a human-like way (emotion,
personality, etc.) [Fong, 2003]
page 1 NAO TechDay 2012 Le Quoc Anh & Catherine Pelachaud
3. Objectives
Generate communicative gestures for the Nao robot
• Integrated within an existing virtual agent platform
• Nonverbal behaviors described symbolically
• Synchronization (gestures and speech)
• Expressivity of gestures
GVLEX project (Gesture & Voice for Expressive Reading)
• The robot tells a story expressively.
• Partners: LIMSI (linguistic aspects), Aldebaran (robotics),
Acapela (speech synthesis), Telecom ParisTech
(expressive gestures)
4. State of the art
Several recent initiatives:
• Salem and Kopp (2012): ASIMO robot, the virtual agent
framework MAX, gesture description with MURML.
• Holroyd and Rich (2011): Melvin robot, motion scripts
with BML, simple gestures, feedback to synchronize
gestures and speech.
• Ng-Thow-Hing et al. (2010): ASIMO robot, gesture
selection, synchronization between gestures and speech.
• Nozawa et al. (2006): motion scripts with MPML-HP,
HOAP-1 robot.
Our system: focuses on expressivity and on synchronizing
gestures with speech, using a common SAIBA-compliant
platform [Kopp, 2006] for both Greta and Nao.
5. Our methodology
Gestures are described with a symbolic language (BML)
Gestural expressivity (amplitude, fluidity, power,
repetition, speed, stiffness, …)
Gestures are elaborated from a storytelling video corpus
(Martin et al., 2009)
The animation is executed by translating symbolic
descriptions into joint values
6. Problem and Solution
Using a common framework to control both virtual and
physical agents raises several problems:
• Different degrees of freedom
• Limits on movement space and speed
Solution:
• Use the same representation language
- same algorithm for selecting and planning gestures
- different algorithm for creating the animation
• Elaborate one gesture repository for the robot and
another one for the Greta agent
• Gesture movement space and velocity specification
7. Steps
1. Build a library of gestures from a corpus of storytelling videos: the gesture
shapes need not be identical (between the human, the virtual agent, and the
robot), but they must convey the same meaning.
2. Use the GRETA system to generate gestures for Nao
• Following the SAIBA framework
- Two representation languages: FML (Function Markup Language) and BML (Behavior
Markup Language)
- Three separate modules: planning communicative intents, selecting and planning
gestures, and realizing gestures
[Diagram: Text → Intent Planning → FML → Behavior Planning → BML → Behavior Realizer]
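The three SAIBA stages can be sketched as chained modules. The function names and placeholder bodies below are illustrative assumptions, not the actual GRETA code; they only show how text flows through intent planning (FML), behavior planning (BML), and behavior realization.

```python
# Minimal sketch of the SAIBA pipeline (hypothetical function names).

def plan_intent(text):
    """Intent planning: annotate text with communicative functions (FML)."""
    return {"fml": {"performative": "inform", "text": text}}

def plan_behavior(fml_doc):
    """Behavior planning: select a gesture from the lexicon (BML)."""
    return {"bml": {"gesture": "beat", "sync": "tm1",
                    "text": fml_doc["fml"]["text"]}}

def realize_behavior(bml_doc):
    """Behavior realization: expand BML into timed keyframes."""
    return [(0.0, "stroke-start"), (0.5, "stroke-end")]

keyframes = realize_behavior(plan_behavior(plan_intent("J'ai très faim!")))
```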
8. Global diagram
[Diagram: FML/BML input → Gesture Selection (from the Nao Lexicon or the Greta
Lexicon) → Synchronization with speech → Planning of gesture duration →
Modification of gesture expressivity → KEYFRAMES]
10. Gesture Specification
Gesture->Phases->Hands (wrist position, palm orientation, hand shape,...)
[Kendon, 2004]
Only stroke phases are specified; the other phases are generated automatically
by the system.
<gesture id="greeting" category="ICONIC" min_time="1.0" hand="RIGHT">
  <phase type="STROKE-START" twohand="ASSYMMETRIC">
    <hand side="RIGHT">
      <vertical_location>YUpperPeriphery</vertical_location>
      <horizontal_location>XPeriphery</horizontal_location>
      <location_distance>ZNear</location_distance>
      <hand_shape>OPEN</hand_shape>
      <palm_orientation>AWAY</palm_orientation>
    </hand>
  </phase>
  <phase type="STROKE-END" twohand="ASSYMMETRIC">
    <hand side="RIGHT">
      <vertical_location>YUpperPeriphery</vertical_location>
      <horizontal_location>XExtremePeriphery</horizontal_location>
      <location_distance>ZNear</location_distance>
      <hand_shape>OPEN</hand_shape>
      <palm_orientation>AWAY</palm_orientation>
    </hand>
  </phase>
</gesture>
An example of a greeting gesture
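For illustration, such a symbolic description can be loaded into a simple data structure. This Python sketch is a hypothetical helper, not part of the original system; it uses the standard xml.etree module and assumes the well-formed greeting-gesture XML above.

```python
import xml.etree.ElementTree as ET

# One stroke-start phase of the greeting gesture (abbreviated).
GESTURE_XML = """
<gesture id="greeting" category="ICONIC" min_time="1.0" hand="RIGHT">
  <phase type="STROKE-START" twohand="ASSYMMETRIC">
    <hand side="RIGHT">
      <vertical_location>YUpperPeriphery</vertical_location>
      <horizontal_location>XPeriphery</horizontal_location>
      <location_distance>ZNear</location_distance>
      <hand_shape>OPEN</hand_shape>
      <palm_orientation>AWAY</palm_orientation>
    </hand>
  </phase>
</gesture>
"""

def parse_gesture(xml_text):
    """Turn a symbolic gesture description into nested dicts."""
    root = ET.fromstring(xml_text)
    phases = []
    for phase in root.findall("phase"):
        # Map each hand side to its symbolic properties (tag -> value).
        hands = {h.get("side"): {c.tag: c.text for c in h}
                 for h in phase.findall("hand")}
        phases.append({"type": phase.get("type"), "hands": hands})
    return {"id": root.get("id"), "phases": phases}

g = parse_gesture(GESTURE_XML)
```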
11. Synchronization of gestures with speech
The stroke phase coincides with or precedes the emphasized words of the
speech [McNeill, 1992]
Gesture stroke-phase timing is specified by synch points
12. Synchronization of gestures with speech
Algorithm
• Compute the preparation phase
• Delete the gesture if there is not enough time
• Add a hold phase to fit the gesture's planned duration
• Co-articulation between consecutive gestures:
- If there is enough time, insert a retraction phase (i.e. return to the rest position)
- Otherwise, go directly from the end of the stroke to the preparation phase of the
next gesture
[Diagram: gesture phase timelines marking start/end and stroke-start/stroke-end points]
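The scheduling rules above can be sketched as follows. The fixed preparation duration and the numeric values are assumptions for illustration; the real system derives phase durations from the gesture description.

```python
# Sketch of the synchronization algorithm (simplified).
PREP_DURATION = 0.4  # assumed fixed preparation time, in seconds

def schedule(stroke_start, stroke_end, previous_end, planned_end):
    """Return scheduled phases, or None if the gesture must be dropped."""
    # The preparation phase is scheduled backwards from the stroke,
    # so the stroke lands on the emphasized word.
    prep_start = stroke_start - PREP_DURATION
    if prep_start < previous_end:
        return None  # not enough time to prepare: delete the gesture
    phases = [("preparation", prep_start, stroke_start),
              ("stroke", stroke_start, stroke_end)]
    if stroke_end < planned_end:
        # Pad with a hold phase to fit the planned duration.
        phases.append(("hold", stroke_end, planned_end))
    return phases
```

A gesture whose stroke starts at 1.0 s with nothing before it gets a preparation, a stroke, and a hold; one whose stroke starts at 0.2 s cannot fit its preparation and is dropped.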
13. Gestural Expressivity vs. Affective States
A set of gesture dimensions [Hartmann, 2005]
• Spatial Extent (SPC): Amplitude of gesture movement
• Temporal Extent (TMP): Speed of gesture movement
• Power (PWR): Acceleration of gesture movement
• Fluidity (FLD): Smoothness and Continuity
• Repetition (REP): Number of stroke phases in a gesture movement
• Stiffness (STF): Tension/Flexibility
Example [Mancini, 2008]
Affective state   SPC    TMP    FLD    PWR
Sadness           Low    Low    High   Low
Happiness         High   High   High   High
Anger             High   High   Low    High
14. Spatial Extent (SPC)
A real number in the interval [-1 .. 1]
• Zero corresponds to a neutral behavior
• -1 corresponds to small and contracted movements
• 1 corresponds to wide and large movements
Preserving the meaning of the gesture
• Each gesture specifies modifiable and unmodifiable
dimensions
• Example: negation (the vertical position is fixed)
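One way to realize this is to scale only the modifiable dimensions of a pose around its neutral value. The pose representation, the 0.5 scaling factor, and the numeric offsets below are illustrative assumptions, not values from the original system.

```python
# Sketch: SPC in [-1, 1] scales a gesture's amplitude, while fixed
# ("unmodifiable") dimensions are left untouched, as in the negation
# example where the vertical position must not change.

NEUTRAL = {"x": 0.10, "y": 0.25, "z": 0.15}  # neutral wrist offsets (assumed)

def apply_spc(pose, spc, fixed=("y",)):
    """Scale modifiable dimensions by (1 + 0.5 * spc); spc in [-1, 1]."""
    scale = 1.0 + 0.5 * spc
    return {axis: (v if axis in fixed else v * scale)
            for axis, v in pose.items()}

wide = apply_spc(NEUTRAL, 1.0)      # wide, expansive variant
narrow = apply_spc(NEUTRAL, -1.0)   # small, contracted variant
```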
15. Temporal Extent (TMP)
A real number in the interval [-1 .. 1]
• Zero corresponds to a neutral behavior
• Slow if the value is negative
• Fast if the value is positive
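TMP can similarly be treated as a duration multiplier: positive values shorten phase durations (faster movement), negative values lengthen them. The formula and its 0.5 factor are assumptions for illustration, not the original mapping.

```python
# Sketch: TMP in [-1, 1] rescales a phase's neutral duration.

def apply_tmp(duration, tmp):
    """tmp = 0 leaves the neutral duration unchanged."""
    return duration / (1.0 + 0.5 * tmp)
```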
16. Power (PWR)
A real number in the interval [-1 .. 1]
• Zero corresponds to a neutral behavior
• More powerful movements correspond to higher
acceleration
Affects hand shape (closed vs. open)
• More relaxed/open if the value is negative
• A fist corresponds to 1
Affects the duration of stroke phase repetitions
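The hand-shape side of PWR can be sketched as a linear mapping from the parameter to a hand aperture. The linear form is an assumption for illustration; only the endpoints (negative = relaxed/open, 1 = fist) come from the slide.

```python
# Sketch: map PWR in [-1, 1] to a hand aperture in [0, 1],
# where 1 means fully open/relaxed and 0 means a fist.

def hand_aperture(pwr):
    return (1.0 - pwr) / 2.0
```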
17. Fluidity (FLD)
A real number in the interval [-1 .. 1]
• Zero corresponds to a neutral behavior
• Higher values allow smooth and continuous
execution of movements
• Lower values create discontinuity in the movements
Not yet implemented for Nao
18. Repetition (REP)
Number of stroke-phase repetitions in a gesture
movement
Tendency toward rhythmic repetition of specific
movements
Each stroke coincides with an emphasized word
(or words) of the speech
19. Animation Computation & Execution
Schedule and plan gesture phases
Compute expressivity parameters
Translate symbolic descriptions into joint values
Execute the animation
• Send timed key positions to the robot using
available APIs
• The animation is obtained by interpolating between joint
values with the robot's built-in proprietary procedures.
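The last step can be sketched with NAOqi's ALMotion.angleInterpolation call, which interpolates each listed joint through the given angle values at the given absolute times. The joint names exist on Nao, but the angle and time values here are illustrative, not taken from the original system.

```python
# Sketch: timed key positions for two right-arm joints.
names = ["RShoulderPitch", "RElbowRoll"]        # joints to animate
angles = [[0.5, 1.0, 0.5], [0.3, 0.8, 0.3]]     # radians, one list per joint
times = [[0.4, 1.0, 1.6], [0.4, 1.0, 1.6]]      # seconds, absolute

def execute(motion_proxy):
    """Send the keyframes; the robot interpolates between them.

    motion_proxy would be e.g. ALProxy("ALMotion", robot_ip, 9559).
    """
    motion_proxy.angleInterpolation(names, angles, times, True)
```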
20. Example
Gesture prototype in the lexicon:

<lexicon>
  <gesture id="hungry" category="BEAT" hand="BOTH">
    <phase type="STROKE" twohand="SYMMETRIC">
      <hand side="RIGHT">
        <vertical_location>YCenterCenter</vertical_location>
        <horizontal_location>XCenter</horizontal_location>
        <location_distance>ZMiddle</location_distance>
        <hand_shape>CLOSED</hand_shape>
        <palm_orientation>INWARD</palm_orientation>
      </hand>
    </phase>
  </gesture>
</lexicon>

BML with anger expressivity:

<bml>
  <speech id="s1" start="0.0"
    vce=speaker=Antoine spd=180
    Et le troisième dit en colère: vce=speaker=AntoineLoud spd=200
    <tm id="tm1"/>J'ai très faim!
  </speech>
  <gesture id="hungry" start="s1:tm1" end="start+1.5" stroke="0.5">
    <PWR.value>1.0</PWR.value>
    <SPC.value>0.6</SPC.value>
    <TMP.value>0.2</TMP.value>
    <FLD.value>0</FLD.value>
    <STF.value>0</STF.value>
    <REP.value>0</REP.value>
  </gesture>
</bml>

BML with sadness expressivity:

<bml>
  <speech id="s1" start="0.0"
    vce=speaker=Antoine spd=180
    Et le troisième dit tristement: vce=speaker=AntoineSad spd=90
    <tm id="tm1"/>J'ai très faim!
  </speech>
  <gesture id="hungry" start="s1:tm1" end="start+1.5" stroke="0.5">
    <PWR.value>-1.0</PWR.value>
    <SPC.value>-0.3</SPC.value>
    <TMP.value>-0.2</TMP.value>
    <FLD.value>0</FLD.value>
    <STF.value>0</STF.value>
    <REP.value>0</REP.value>
  </gesture>
</bml>

The same gesture prototype is used with different expressivity values: anger
(the speech reads "Et le troisième dit en colère: J'ai très faim!", i.e. "And
the third said angrily: I'm very hungry!") versus sadness ("tristement",
i.e. "sadly").
21. Video demo: Nao tells a story
22. Conclusion
Conclusion
• A gesture model has been designed and implemented for Nao, taking into
account the physical constraints of the robot.
• Common platform for both virtual agent and robot
• Expressivity model
Future work
• Create gestures with different emotional colouring and personal style
• Validate the model through perceptual evaluations
23. Acknowledgment
This work has been funded by the ANR GVLEX project.
It is supported by members of the TSI laboratory,
Telecom-ParisTech.