Identifying Emotions Expressed by Mobile Users through 2D Surface and 3D Motion Gestures (Ubicomp 2012)
1. Identifying Emotions
Expressed by Mobile Users
through 2D and 3D Gestures
Céline Coutrix, Nadine Mandran
CNRS & Laboratoire d’Informatique de Grenoble
3. Context
• Users may wish to express emotions
anytime, anywhere, e.g.:
• To communicate (with a social network,
very close friends, a doctor, etc.)
• To adapt interaction to increase
performance or satisfaction
• etc.
4. Context
• Current techniques for identifying
emotions:
• facial expressions,
• thermal imaging of faces,
• vocal intonations,
• language,
• galvanic skin response,
• electromyography of the face,
• heart rate,
• etc.
11. Problem
[Diagram: subjects → gestures → sensing → computation → emotions?]
• Are all subjects similar, or is each subject specific?
• Which gestures should be studied?
• What dimension(s) need to be captured?
• What is the minimum relevant information to
compute?
15. Goal
• Uncover directions for more focused studies
• Which subjects, gestures, sensing and
computation will we need to study?
17. Field Study: Goal
• Are all subjects similar, or is each subject specific?
➡ Explore several subjects (⊃ specific subjects)
• Which gestures should be studied?
➡ Explore free gesturing (⊃ specific types of gestures)
• What dimension(s) need to be captured?
➡ Explore accelerometers and touch events
• What is the minimum relevant information to compute?
➡ Explore a wide range of gesture descriptors
19. Field Study: Data Collection
• 12 adult subjects
• Expert everyday users of mobile tactile devices
• Diverse profiles:
• 6 males and 6 females,
• Aged 25-47 (mean = 33, s.d. = 7),
• Areas of work: software, biology, didactics,
social work, executive assistance, ski instruction
and mountain guiding
20. Field Study: Data Collection
• Experience Sampling Method
• Application installed on their personal, everyday
mobile device
• Gestures reported by users
• touchscreen events and accelerometers
• Emotions reported by users (PAD model)
• Over at least 15 days
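The kind of record such a logging application might store can be sketched as follows; the class name, fields and rating scale are our assumptions for illustration, not the study's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class GestureSample:
    """One experience-sampling report: raw gesture signals plus the
    self-reported PAD (Pleasure, Arousal, Dominance) ratings."""
    subject_id: str
    # (t, x, y) touchscreen events: time in seconds, screen coordinates
    touches: List[Tuple[float, float, float]] = field(default_factory=list)
    # (t, ax, ay, az) accelerometer readings
    accel: List[Tuple[float, float, float, float]] = field(default_factory=list)
    pleasure: float = 0.0  # PAD self-reports (scale is an assumption)
    arousal: float = 0.0
    dominance: float = 0.0

    def is_3d_only(self) -> bool:
        """True when the gesture has motion data but no surface touches."""
        return bool(self.accel) and not self.touches
```

Such a split matches the dataset described in the notes, where some reported gestures are 3D only and the rest combine 2D surface and 3D motion data.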
21. Field Study: Data Collection
• Both event-contingent sampling and signal-contingent
sampling:
• Report when experiencing a particular affective state
➡ not to miss intense, rarely occurring affective
states
• Report when receiving a text message (1 per day, at a
random time)
➡ to capture sufficient data per subject
➡ to capture more neutral affective states
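The signal-contingent half of this protocol (one prompt per day at a random time) can be sketched in a few lines; the 9:00-21:00 waking-hours window is our assumption, not stated in the study.

```python
import random
from datetime import datetime, timedelta

def next_signal_prompt(day, start_hour=9, end_hour=21, rng=random):
    """Pick one uniformly random prompt time in the day's waking window,
    mimicking signal-contingent sampling (one message per day at a
    random time). The window bounds are illustrative assumptions."""
    window_start = day.replace(hour=start_hour, minute=0,
                               second=0, microsecond=0)
    window_seconds = (end_hour - start_hour) * 3600
    return window_start + timedelta(seconds=rng.uniform(0, window_seconds))
```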
36. Computing Descriptors
• Spectrum
• Gap that maximizes the difference between
the most and least important frequencies
• Number of important frequencies
• Most important frequency
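One plausible reading of these three spectrum descriptors can be sketched as follows; the concrete definitions (spectral peak, largest drop in the sorted magnitude spectrum as the important/unimportant boundary) are our interpretation, not the paper's code.

```python
import numpy as np

def spectral_descriptors(signal, fs):
    """Sketch of the three spectrum descriptors, under our own
    interpretation: peak frequency, count of important frequencies,
    and the gap separating important from unimportant frequencies."""
    mags = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    # Most important frequency: the peak of the magnitude spectrum.
    peak_freq = freqs[np.argmax(mags)]
    # Gap: largest drop between consecutive sorted magnitudes, read as
    # the boundary between important and unimportant frequencies.
    sorted_mags = np.sort(mags)[::-1]
    drops = sorted_mags[:-1] - sorted_mags[1:]
    gap = float(np.max(drops))
    threshold = sorted_mags[np.argmax(drops)]  # smallest "important" magnitude
    n_important = int(np.sum(mags >= threshold))
    return peak_freq, n_important, gap
```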
37. From descriptors
• 249 descriptors:
• Statistically analyzed to find
• the relevant/irrelevant information
• the redundant/non-redundant information
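A screening pass of this kind might look as follows; the Pearson-correlation criterion and both thresholds are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def screen_descriptors(X, pad, r_relevant=0.2, r_redundant=0.9):
    """One plausible screening pass: a descriptor is relevant if it
    correlates with at least one PAD dimension, and flagged redundant
    if it almost duplicates an earlier descriptor."""
    n_desc = X.shape[1]
    # Correlation of every descriptor with every PAD dimension.
    r_pad = np.array([[np.corrcoef(X[:, j], pad[:, k])[0, 1]
                       for k in range(pad.shape[1])] for j in range(n_desc)])
    relevant = np.max(np.abs(r_pad), axis=1) >= r_relevant
    # Pairwise redundancy among descriptors.
    r_xx = np.corrcoef(X, rowvar=False)
    redundant = np.zeros(n_desc, dtype=bool)
    for j in range(n_desc):
        for i in range(j):
            if abs(r_xx[i, j]) >= r_redundant:
                redundant[j] = True
    return relevant, redundant
```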
39. Results: Goal
• Are all subjects similar, or is each subject specific?
➡ Aggregated subjects vs. specific subjects
• Which gestures should be studied?
➡ Free gesturing vs. specific types of gestures
• What dimension(s) need to be captured?
➡ Do we need more sensors?
• What is the minimum relevant information to compute?
➡ Relevant/irrelevant and redundant/non-redundant
descriptors
40. Results: Aggregated or
subject-specific?
• For aggregated subjects
• Correlations are low (|r| < 0.40)
➡ little relationship between single
descriptors and PAD
41. Results: Aggregated or
subject-specific?
• For specific subjects
➡ Further work needs to investigate
correlations for single subjects
43. Free vs. Specific Gestures
• Which gestures should be studied?
➡ Manual exploration of the dataset
44. Free vs. Specific Gestures
• Moving in all directions after holding still
• expresses high arousal
46. Free vs. Specific Gestures
• Tapping: a dual model?
• The number of strokes could follow
Pleasure and/or Arousal
• This would make sense
• Needs confirmation
47. Free vs. Specific Gestures
• Tapping: number of strokes vs. Pleasure?
[Plot: Pleasure (ZP) against number of strokes]
50. Free vs. Specific Gestures
• Tapping: does the number of strokes follow Pleasure or Arousal?
52. Results: Need for More
Information
• Arousal has the most correlations with descriptors
• Need more information for Pleasure and Dominance
• Descriptors, sensors, gestures, subjects?
[Bar chart: number of descriptors correlated with each of Pleasure,
Arousal and Dominance, scale 0-100]
54. Results: Relevant and
Irrelevant Descriptors
• 3D motion descriptors have the most correlations
to affective dimensions
[Bar chart: number of descriptors correlated to at least one affective
dimension, 3D motion vs. 2D surface descriptors, scale 0-90]
55. Results: Example 3D Motion
Descriptor
• Amplitude of the length of the derivative of high-pass
filtered acceleration vs. Arousal (from calm to excited)
• r = −0.31
56. Results: Redundant and
Non-redundant Descriptors
• Can we group redundant descriptors to
strengthen the relationship between non-redundant
groups of descriptors and P, A, D?
➡ Principal component analysis
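A minimal PCA over standardized descriptors, as used here to spot redundancy, can be sketched via SVD; this is a generic sketch, not the authors' implementation. Descriptors whose loading vectors point the same way evolve together (redundant); near-perpendicular loadings are non-redundant.

```python
import numpy as np

def pca(X, n_components=3):
    """Minimal PCA via SVD on standardized descriptors. The loadings
    are the correlations of each descriptor with each principal axis,
    so they live inside the unit circle shown on the slides."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    # Loading_jk = corr(descriptor j, component k).
    loadings = Vt[:n_components].T * (S[:n_components] / np.sqrt(len(X)))
    scores = Z @ Vt[:n_components].T        # gesture coordinates in PC space
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]
    return scores, loadings, explained
```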
57. Results: Redundant and
Non-redundant Descriptors
• 92 significantly correlated descriptors
• Reduce to the descriptors most correlated to PAD and
least correlated with each other
• if we keep 5 descriptors, then we keep
• 85% of the correlation with descriptors for
Arousal
• 104% of the correlation with descriptors for
Pleasure and Dominance
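A greedy selection in this spirit (most correlated to PAD, least correlated to those already kept) might look like the following mRMR-style sketch; the scoring rule is our assumption, not the paper's procedure.

```python
import numpy as np

def pick_descriptors(X, pad, k=5):
    """Greedily pick k descriptors: best absolute PAD correlation,
    penalized by correlation to descriptors already chosen."""
    n = X.shape[1]
    r_pad = np.array([max(abs(np.corrcoef(X[:, j], pad[:, d])[0, 1])
                          for d in range(pad.shape[1])) for j in range(n)])
    r_xx = np.abs(np.corrcoef(X, rowvar=False))
    chosen = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for j in range(n):
            if j in chosen:
                continue
            penalty = max(r_xx[j, c] for c in chosen) if chosen else 0.0
            score = r_pad[j] - penalty
            if score > best_score:
                best, best_score = j, score
        chosen.append(best)
    return chosen
```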
58. Results: Redundant and
Non-redundant Descriptors
• 5 descriptors
‣ FAccAmplitudeY and FAccMinX
‣ FAccAmplitudeZ and FAccMinZ
‣ GapBetweenHighLowSpectrumY
59. Results: Relevant and
Non-redundant Descriptors
• In particular, some non-redundant groups of
similar descriptors tend to evolve with Arousal
• Minima of z projections of acceleration and
jerk
• Duration and spectral descriptors on x and
y projections of raw or low-pass filtered
acceleration
• We can reduce down to 3 descriptors for Arousal
61. Conclusion
[Diagram: subjects → gestures → sensing → computation → emotions?]
• Are all subjects similar, or is each subject
specific?
➡ Specific subjects need to be explored
for free gesturing
62. Conclusion
• Which gestures should be studied?
➡ Free gesturing, but specific types of
gestures are promising, e.g. tapping
63. Conclusion
• What dimension(s) need to be captured?
➡ Touches for particular gestures
➡ 3D motion through accelerometers
➡ Need for more information
64. Conclusion
• What is the minimum relevant information
to compute?
➡ 5 descriptors can be enough
➡ 3 descriptors can be enough for Arousal
Editor's Notes
We started from the idea that users may wish to express emotions or affective states anytime and anywhere: for example, to communicate emotions to distant others, a social network or very close friends, or doctors; or for the interaction to adapt and increase performance or satisfaction, among other purposes. Just like "static" affective interaction, users may benefit from mobile affective interaction.
So far, researchers have been working on modalities to identify affective expressions: for instance, facial expressions, thermal imaging, vocal intonations, language, galvanic skin response, electromyography of the face, heart rate, and so on.
All these modalities are valuable, but in mobile situations they are either intrusive (like electrodes on the face), expensive (like a thermal camera), or not suitable for mobile use: too heavy, or not working as well as in static situations, for instance due to difficult lighting conditions or background noise.
So... can we find a technique for identifying emotions that is discreet, cheap and mobile, i.e. works in any condition?
We took the approach of using smartphones for this. They are not literally cheap, but they are in the sense that we already own them for other purposes and carry them every day and everywhere. And of course, they are made to be mobile.
Then, what modality available on today's mobile phones can be discreet? The camera is difficult, since it would not be usable in a pocket, for example when you do not want people to see you using it. So we explored gesturing with the device. Indeed, very common hardware on smartphones can be used for this, for instance accelerometers and touchscreens.
So we investigated the following question: is it possible to identify emotions explicitly expressed by mobile users through gestures? I would like to make clear that we address the problem of identifying to which extent a link exists between intentional expressive gestures and the emotions users explicitly wish to express. We address neither the identification of intimate emotions nor the implicit gestural activity of a user throughout the day.
The problem is very large and we need to rephrase it.
Researchers have contributed to this problem by: 1) exploring surface and motion gestures spontaneously performed by users, but only as mappings to commands, not emotions; 2) as said before, exploring modalities to identify affective expressions, but not with a cheap, unintrusive and mobile purpose. So we wish to take a first step towards mobile gestures for affective computing.
Our goal was to uncover directions for more focused studies, to see where there could be a link between gestures and emotions.
In particular, we will try to answer these questions.
To test this idea, we designed a field study to collect gestures and emotions and investigate their relationship. We put as few constraints on users as possible, in order to obtain a realistic dataset.
As gyroscopes were not present in all devices, we chose to leave them out of this study. We could have run a longer study, but as this was an early first study, we wanted to test the idea before committing to a long, time-consuming one.
TODO: justify why not another model of emotions.
Subjects' variance in reporting is a risk, but this is the chance we take to gather realistic data.
Among the 188 gestures reported, 36 are 3D only and 152 are combined 2D/3D gestures. Here are two examples of the kind of data we collected.
With gravity filtered out: x (left-right of the device), y (bottom-top of the device) and z (back-front of the device).
1) in order to reveal the periodicity of a gesture.
Here the coefficient is −0.31, one of the best coefficients we found, which is rather poor. So we looked for directions to explain this low result and ways to improve it, and we found interesting differences between subjects!
We manually classified the samples.
A first interesting class was the one where the device moves in all directions after being held still for a moment. We classified these gestures as different from all the others because of these two phases; others, on the contrary, show movement in all directions from the beginning to the end of the gesture. This holds with 95% confidence but with a small number of samples, so it needs to be confirmed with a larger sample.
If we plot Pleasure against the number of strokes, at first sight it seems noisy.
Looking closer, there is an interesting pattern. TODO: check whether the flat segments could be outliers.
It reminded us of an increase up to a ceiling, but with too many outliers to be promising.
So we tried to explain the outliers. When plotting the number of strokes of tapped gestures along Arousal, we realized they followed another pattern. So it would be interesting to investigate whether tapped gestures indeed follow this mixed model, with more than 18 samples. TODO: stress that this is something to test with more samples.
Here is an example of a 3D motion descriptor that is correlated to Arousal: the ... This means that when the change in acceleration increases, the arousal increases too. But the coefficient is −0.3, which is low.
So we thought we should group redundant descriptors together, so as to improve the results by considering non-redundant descriptors.
The resulting space has 3 dimensions. These two graphs show two projections of the 3D space: on the right, onto the surface defined by the new axes 1 and 2; on the left, onto the surface defined by the new axes 2 and 3.
Blue vectors show each descriptor's vector in the space. As we will see, what is interesting here is the length of the vectors and their angle with the axes; this is why Principal Component Analysis represents descriptors as vectors.
In this space, similarly evolving descriptors have angles close to each other, and descriptors evolving differently are perpendicular. So there is no need to consider all of them in a later study, but only one per angle; otherwise it would be redundant.
Too much information gets noisy.
In addition, the closer a descriptor is to the unit circle, the more it contributed to the linear combination that builds the axes; that is, the more it helps explain the differences between gestures. As the projections on axis 1 are the longest, it is where we can locate the gestures most easily.
For example, the two calm and excited samples we saw before are very far apart on axis 1.
In conclusion, to characterize a gesture with a minimum number of descriptors in a future study, we should consider the descriptors closest to the unit circle, because these are the ones explaining most of the variation between gestures, AND those with the furthest-apart angles, so that they explain different aspects of the variation and are not redundant.
In this space, the PAD variables can be located along the axes thanks to the known correlations to descriptors. As said before, the correlations are low, so the PAD vectors are far from the unit circle.
But some descriptors point in the same direction as the affective dimensions; they tend to evolve in a similar way to those dimensions. There are 7 such groups, represented here as colored vectors, whereas descriptors are points. Among these, when the green ones increase, ZA tends to increase; when the pink ones increase, ZA tends to decrease.