Emochat: Emotional instant messaging with the Epoc headset
ABSTRACT
Title of Document: EMOCHAT: EMOTIONAL INSTANT
MESSAGING WITH THE EPOC HEADSET
Franklin Pierce Wright
Master of Science
2010
Directed By: Asst. Professor of Human-Centered Computing
Dr. Ravi Kuber
Information Systems
Interpersonal communication benefits greatly from the emotional information
encoded by facial expression, body language, and tone of voice; however, this
information is noticeably missing from typical instant message communication. This
work investigates how instant message communication can be made richer by
including emotional information provided by the Epoc headset. First, a study
establishes that the Epoc headset is capable of inferring some measures of affect with
reasonable accuracy. Then, the novel EmoChat application is introduced which uses
the Epoc headset to convey facial expression and levels of basic affective states
during instant messaging sessions. A study compares the emotionality of
communication between EmoChat and a traditional instant messaging environment.
Results suggest that EmoChat facilitates the communication of emotional information
more readily than a traditional instant messaging environment.
EMOCHAT: EMOTIONAL INSTANT MESSAGING WITH THE EPOC
HEADSET
By
Franklin Pierce Wright
Thesis submitted to the Faculty of the Graduate School of the
University of Maryland, Baltimore County, in partial fulfillment
of the requirements for the degree of
Master of Science
2010
List of Tables
Table 3.1 Common emoticons in Western and Eastern cultures
Table 3.2 Example of hapticons
Table 4.1 Facial expression features measured by the Epoc headset
Table 4.2 Headset and self-reported levels of affect per subject, per trial
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
Table 4.7 Major themes identified in subjective affect survey results
Table 5.1 Facial movements and affective information used by EmoChat
Table 5.2 EmoChat experimental groups
Table 5.3 ETAQ scores for each subject-pair and both experimental conditions
Table 5.4 Spearman correlation matrix between avatar features and perceived frequency of emotional states (N=5)
Table 5.5 Wilcoxon's signed rank test for significant difference between score means (N=10)
Table 5.6 Linguistic categories with significant differences between experimental conditions
Table 5.7 LIWC affective processes hierarchy
Table 5.8 LIWC relativity hierarchy
List of Figures
Figure 3.1 Examples of expressive avatars
Figure 4.1 Initialization screen for the TetrisClone application
Figure 4.2 TetrisClone application during trials
Figure 4.3 Example output from the TetrisClone application
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement
Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement
Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
Figure 5.1 The EmoChat client application
Figure 5.2 EmoChat server application
Figure 5.3 Mean scores from the richness questionnaire, questions 1-5
Figure 5.4 Mean scores from the richness questionnaire, questions 5-10
Figure 5.5 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q1-5 (N=5)
Figure 5.6 Comparison of mean responses to REQ between subjects with headsets versus without headsets, in the EmoChat condition, Q5-10 (N=5)
1 Introduction
This chapter introduces the importance of emotion in interpersonal communication,
and presents some of the challenges with including emotion in instant messages. The
purpose of this thesis is then stated, followed by an overview of the document
structure.
1.1 The Importance of Emotion
Consider the following statement:
“The Yankees won again.”
Does the person who makes this remark intend to be perceived as pleased or
disappointed? Enthusiastic or resentful? The remark is purposely emotionally
ambiguous to illustrate just how powerful the inclusion or absence of emotion can be.
If the same remark were said with a big grin on the face, or with the sound of
excitement in the voice, we would certainly understand that this person was quite
pleased that his team was victorious.
If the speaker displayed slumped shoulders and a head tilted downward we would
assume that he was certainly less than jubilant.
It is clear that emotions play a very important role in interpersonal communication,
and without them, communication would be significantly less efficient. A statement
that contains emotion implies context, without the necessity of explicit clarification.
In some cases, how something is said may be as important as what is said.
1.2 Instant Messaging
Real-time text-based communication is still on the rise. Instant messaging, in one
form or another, has infiltrated nearly all aspects of our digital lives, and shows no
sign of retreat. From work, to school, to play, it’s becoming more and more difficult
to shield ourselves from that popup, or that minimized window blinking in the task
bar, or that characteristic sound our phones make when somebody wants to chat with
us. We are stuck with this mode of communication for the foreseeable future.
1.3 The Emotional Problem with Instant Messaging
As convenient as it is, this text-based communication has inherent difficulties
conveying emotional information. It generally lacks intonation and the subtle non-
verbal cues that make face-to-face communication the rich medium that it is. Facial
expression, posture, and tone of voice are among the highest bandwidth vehicles of
emotional information transfer (Pantic, Sebe, Cohn, & Huang, 2005), but are
noticeably absent from typical text-based communication. According to Kiesler and
colleagues, computer-mediated communication (CMC) in general is “observably
poor” for facilitating the exchange of affective information, and note that CMC
participants perceive the interaction as more impersonal, resulting in less favorable
evaluations of partners (Kiesler, Zubrow, Moses, & Geller, 1985).
The humble emoticon has done its best to remedy the situation by allowing text
statements to be qualified with the ASCII equivalent of a smile or frown. While this
successfully aids in conveying positive and negative affect (Rivera, Cooke, & Bauhs,
1996), emoticons may have trouble communicating more subtle emotions. Other
solutions that have been proposed to address this problem are reviewed in chapter 3.
Each solution is successful in its own right, and may be applicable in different
situations. This work examines a novel method for conveying emotion in CMC,
which is offered to the community as another potential solution.
1.4 Purpose of this Work
The main goal of this body of work is to investigate how instant message
communication is enriched by augmenting messages with emotional content, and
whether this can be achieved through the use of brain-computer interface (BCI)
technology. The Emotiv Epoc headset is a relatively new BCI peripheral intended
for use by consumers and is marketed as being capable of inferring levels of basic
affective states including excitement, engagement, frustration, and meditation. A
study presented in this work attempts to validate those claims by comparing data
reported by the headset with self-reported measures of affect during game play at
varied difficulty levels. The novel EmoChat application is then introduced, which
integrates the Epoc headset into an instant messaging environment to control the
facial expressions of a basic animated avatar, and to report levels of basic affective
states. A second study investigates differences between communication with
EmoChat and a traditional instant messaging environment. It is posited that the
EmoChat application, when integrated with the Epoc headset, facilitates
communication that contains more emotional information, that can be described as
richer, and that conveys emotional information more accurately than with traditional
IM environments.
In the end, this complete work intends to provide, first, a starting point for other
researchers interested in investigating applications that implement the Epoc headset,
and second, results which may support the decision to apply the Epoc in computer-
mediated communication settings.
1.5 Structure of this Document
The remaining chapters of this work are structured as follows:
Chapter 2 provides an overview of emotion, including historical perspectives, and
how emotions are related to affective computing. Chapter 3 reviews existing
techniques for conveying emotion in instant messaging environments. Chapter 4
details a study to determine the accuracy of the Epoc headset. Chapter 5 introduces
EmoChat, a novel instant messaging environment for exchanging emotional
information. A study compares EmoChat with a traditional instant messaging
environment. Chapter 6 summarizes the contributions this work makes, and
compares the techniques for conveying emotion used in EmoChat with techniques
described in the literature.
2 Emotion
This chapter describes some of the historical perspectives on emotion, and introduces
its role in affective computing. It intends to provide a background helpful for the
study of emotional instant messaging.
2.1 What is Emotion?
The problem of defining what constitutes human emotion has plagued psychologists
and philosophers for centuries, and there is still no generally accepted description
among researchers or laypersons. A complicating factor in attempting to define
emotion is our incomplete understanding of the complexities of the human brain.
Some theorists have argued that our perceptions enter the limbic system of the brain
and trigger immediate action without consultation with the more developed cortex.
Others argue that the cortex plays a very important role in assessing how we relate to
any given emotionally relevant situation, and subsequently provides guidance about
how to feel.
2.1.1 The Jamesian Perspective
In 1884 psychologist William James hypothesized that any emotional experience with
which physiological changes may be associated requires that those same
physiological changes be expressed before the experience of the emotion (James,
1884). In essence, James believed that humans feel afraid because we run from a
bear, and not that we run from a bear because we feel afraid. James emphasized the
physical aspect of emotional experience causation over the cognitive aspect. This
physical action before the subjective experience of an emotion has subsequently been
labeled a "Jamesian" response. For historical accuracy, note that at about the same
time that James was developing his theory, Carl Lange independently developed a
very similar theory. Collectively, their school of thought is referred to as
James-Lange (Picard, 1997).
2.1.2 The Cognitive-Appraisal Approach
In contrast to James’ physical theory of emotion, a string of psychologists later
developed several different cognitive-based theories. Notable among them is the
cognitive-appraisal theory, developed by Magda Arnold, and later extended by
Richard Lazarus, which holds that emotional experience starts not with a physical
response, but with a cognitive interpretation (appraisal) of an emotionally-inspiring
situation (Reisenzein, 2006). In continuing with the bear example, Arnold and
Lazarus would have us believe that we hold in our minds certain evaluations of the
bear-object (it is dangerous and bad for us), we see that the bear is running toward us
(is, or will soon be present), we anticipate trouble if the bear reaches us (poor coping
potential), and so we experience fear and run away.
Certainly, there are valid examples of situations that seem to trigger Jamesian
responses. Consider the fear-based startle response when we catch a large object
quickly approaching from the periphery. It is natural that we sometimes react to
startling stimuli before the experience of the fear emotion, jumping out of the way
before we even consciously know what is happening to us. Conversely, consider joy-
based pride after a significant accomplishment. It seems as though pride could only
be elicited after a cognitive appraisal determines that (a) the accomplishment-object is
positive, (b) it has been achieved despite numerous challenges, and (c) it will not be
stripped away. If examples can be found that validate both the James-Lange
approach and the cognitive-appraisal approach, is one theory more correct than the
other?
2.1.3 Component-Process Theory
It is now suggested that emotional experience may result from very complex
interaction between the limbic system and cortex of the brain, and that emotions can
be described as having both physical and cognitive aspects (Picard, 1997).
Encompassing this point of view, that a comprehensive theory of emotion should
consider both cognitive and physical aspects, is the component-process model
supported by Klaus Scherer (Scherer, 2005). This model describes emotion as
consisting of synchronized changes in several neurologically based subsystems,
including, cognitive (appraisal), neurophysiologic (bodily symptoms), motivational
(action tendencies), motor expression (facial and vocal expression), and subjective
feeling (emotional experience) components. Note that Scherer regards “subjective
feeling” as a single element among many in what constitutes an “emotion.”
2.2 Emotion and Related Affective States
2.2.1 Primary versus Complex
Some classes of emotion are more basic than others. These are the emotions that
seem the most Jamesian in nature: hard-coded, almost reflex-like responses that,
from an evolutionary perspective, contribute the most to our survival. Fear and anger
are among these basic, primary, emotions. Picard labels these types of emotions as
“fast primary,” and suggests that they originate in the limbic system. This is in
contrast to the “slow secondary,” or cognitive-based emotions that require time for
introspection and appraisal, and therefore require some cortical processing. Scherer
calls this slow type of emotion “utilitarian,” in contrast with the fast, which he terms,
“aesthetic.” An important distinction should be made between emotions and other
related affective states such as moods, preferences, attitudes, and sentiments. A
distinguishing factor of emotions is the comparatively short duration when considered
among the other affective states.
2.3 Expressing Emotion
2.3.1 Sentic Modulation
If emotion has both physical and cognitive aspects, it seems natural that some
emotions can be experienced without much, if any, outward expression. Interpersonal
communication may benefit from those overt expressions of emotion that can be
perceived by others. Picard discusses what she calls "sentic modulation," overt or
covert changes in physiological features that, although they do not constitute emotion
alone, act as a sort of symptom of emotional experience (Picard, 1997). The easiest
of these sentic responses to recognize are arguably facial expression, tone of voice,
and posture, or body language. Of these three, research suggests that facial
expression is the highest bandwidth, with regard to the ability to convey emotional
state (Pantic, et al., 2005). There are other, more covert, symptoms of emotional
experience, including heart rate, blood pressure, skin conductance, pupil dilation,
perspiration, respiration rate, and temperature (Picard, 1997). Recent research has
also demonstrated that some degree of emotionality may be inferred from
neurologic response as measured by electroencephalogram (Khalili & Moradi, 2009;
Sloten, Verdonck, Nyssen, & Haueisen, 2008). Facial expression deserves additional
attention, being one of the most widely studied forms of sentic modulation. Ekman
and others have identified six basic emotion/facial expression combinations that
appear to be universal across cultures including, fear, anger, happiness, sadness,
disgust, and surprise (Ekman & Oster, 1979). The universality of these facial
expressions, and the fact that they are so widely understood, suggests that it should
be quite easy to infer emotion from them.
2.4 Measuring Emotion
2.4.1 Self-Report Methods
Perhaps the most widely used method for determining emotional state is through self-
report. This technique asks a subject to describe an emotion he or she is
experiencing, or to select one from a pre-made list. Scherer discusses some of the
problems associated with relying on self-reported emotional experience. On one
hand, limiting self-report of emotion to a single list of words from which the subject
must choose the most appropriate response may lead to emotional “priming,” and/or a
misrepresentation of true experience. On the other hand, allowing a subject to
generate a freeform emotional word to describe experience adds significant difficulty
to any type of analysis (Scherer, 2005). Another method for self-report measurement
is to have a subject identify emotional state within the space of some dimension.
Emotional response is often described as falling somewhere in the two-dimensional
valence/arousal space proposed by Lang. This dimensional model of affect
deconstructs specific emotions into some level of valence (positive feelings versus
negative feelings), and some level of arousal (high intensity versus low intensity)
(Lang, 1995). As an example, joyful exuberance is categorized by high valence and
high arousal, while feelings of sadness demonstrate low valence and low arousal.
Lang posits that all emotion falls somewhere in this two-dimensional space. Problems
may arise when emotion is represented in this space without knowing the triggering
event, since several distinct emotions may occupy similar locations in valence-arousal
space, e.g., intense anger versus intense fear, both located in high-arousal, low-
valence space (Scherer, 2005).
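As an illustration of the dimensional model, mapping discrete emotion labels to points in valence/arousal space can be sketched as follows. The coordinate values below are hypothetical placements chosen for illustration, not taken from Lang's data:

```python
import math

# Hypothetical placements in Lang's two-dimensional space:
# (valence, arousal), each scaled to [-1, 1]. Illustrative only.
EMOTION_SPACE = {
    "joyful exuberance": (0.8, 0.8),    # high valence, high arousal
    "sadness":           (-0.6, -0.5),  # low valence, low arousal
    "anger":             (-0.7, 0.8),   # low valence, high arousal
    "fear":              (-0.8, 0.7),   # low valence, high arousal
    "contentment":       (0.6, -0.4),   # high valence, low arousal
}

def nearest_emotion(valence, arousal):
    """Return the labeled emotion closest to a point in valence/arousal space."""
    return min(
        EMOTION_SPACE,
        key=lambda label: math.dist((valence, arousal), EMOTION_SPACE[label]),
    )

print(nearest_emotion(0.7, 0.9))  # → joyful exuberance
```

Note that anger and fear occupy nearly the same region (high arousal, low valence), which illustrates Scherer's caveat: the dimensional representation alone cannot always separate distinct emotions.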
2.4.2 Concurrent Expression Methods
An objective way to measure emotional state is by inferring user affect by monitoring
sentic modulation. This method requires the use of algorithms and sensors in order to
perceive the symptoms of emotion and infer state, and is significantly more
challenging than simply asking a person how he feels (Tetteroo, 2008). Still, this is
an active area of research within the affective computing domain because user
intervention is not required to measure the emotion, which may be beneficial in some
cases. Techniques include using camera or video input along with classification
algorithms to automatically detect emotion from facial expression (Kaliouby &
Robinson, 2004), monitoring galvanic skin response to estimate level of arousal
(Wang, Prendinger, & Igarashi, 2004), and using AI learning techniques to infer
emotional state from electroencephalograph signals (Khalili & Moradi, 2009; Sloten,
et al., 2008).
2.5 Conclusion
Picard defines affective computing as "computing that relates to, arises from, or
influences emotion" (Picard, 1997). According to Picard, some research in this
domain focuses on developing methods of inferring emotional state from sentic user
characteristics (facial expression, physiologic arousal level, etc.), while other research
focuses on methods that computers could use to convey emotional information
(avatars, sound, color, etc.) (Picard, 1997). A study of emotional instant messaging
is necessarily a study of affective computing. In the context of instant messaging,
Tetteroo separates these two research areas of affective computing into the study of
input techniques, and output techniques (Tetteroo, 2008). The next chapter reviews
how these techniques are used during instant message communication to convey
emotional information.
3 Existing Techniques for Conveying Emotion during
Instant Messaging
3.1 Introduction
A review of the current literature on emotional communication through instant
message applications has identified several techniques for enriching text-based
communication with affective content. These techniques can be broadly classified as
either input techniques, inferring or otherwise reading in the emotion of the user, or
output techniques, displaying or otherwise conveying the emotion to the partner
(Tetteroo, 2008). These categories are reviewed in turn.
3.2 Input Techniques
Research concerning how emotions can be read into an instant messaging system
generally implements one of several methods: inference from textual cues, inference
through automated facial expression recognition, inference from physiologic data, or
manual selection.
3.2.1 Textual Cues
Input methods that use text cues to infer the emotional content of a message generally
implement algorithms that parse the text of a message and compare its contents with a
database of phrases or keywords for which emotional content is known. Yeo
implements a basic dictionary of emotional terms, such as “disappointed,” or
“happy,” that incoming messages are checked against. When a match is found, the
general affective nature of the message can be inferred (Yeo, 2008). Others have
used more complicated natural language processing algorithms to account for the
subtleties of communicative language (Neviarouskaya, Prendinger, & Ishizuka,
2007).
Another example of using text cues to infer the emotion of a message involves simply
parsing text for occurrences of standard emoticons. The presence of the smiley
emoticon could indicate a positively valenced message, while a frowning emoticon
could indicate negative valence, as implemented by Rovers & Essen (2004).
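A minimal version of both textual-cue approaches, a keyword dictionary in the style of Yeo (2008) combined with emoticon matching as in Rovers and Essen (2004), might look like the following sketch. The word and emoticon lists are illustrative stand-ins, not taken from either system:

```python
# Illustrative affect lexicons; real systems use far larger dictionaries
# or full natural language processing.
POSITIVE_WORDS = {"happy", "great", "glad", "wonderful"}
NEGATIVE_WORDS = {"disappointed", "sad", "terrible", "angry"}
POSITIVE_EMOTICONS = {":)", ":-)", "(^_^)"}
NEGATIVE_EMOTICONS = {":(", ":-(", "(T_T)"}

def infer_valence(message):
    """Infer the general affective nature of a message from keywords
    and emoticons; returns 'positive', 'negative', or 'neutral'."""
    score = 0
    for token in message.lower().split():
        word = token.strip(".,!?")
        if word in POSITIVE_WORDS or token in POSITIVE_EMOTICONS:
            score += 1
        if word in NEGATIVE_WORDS or token in NEGATIVE_EMOTICONS:
            score -= 1
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(infer_valence("The Yankees won again :)"))  # → positive
print(infer_valence("I am so disappointed"))      # → negative
```

A scoring approach like this degrades gracefully: messages with no recognized cue simply fall back to neutral rather than forcing an emotional label.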
3.2.2 Automated Expression Recognition
Automated expression recognition uses classification algorithms to infer emotion
from camera or video images of a subject’s face.
Kaliouby and Robinson use an automated “facial affect analyzer” in this manner to
infer happy, surprised, agreeing, disagreeing, confused, indecisive, and neutral states
of affect (Kaliouby & Robinson, 2004). The classifier makes an evaluation about
affective state based on information about the shape of the mouth, the presence of
teeth, and head gestures such as nods.
3.2.3 Physiologic Data
Some physiologic data is known to encode levels of affect, including galvanic skin
response, skin temperature, heart beat and breathing rate, pupil dilation, and electrical
activity measured from the surface of the scalp (Picard, 1997). Wang and colleagues
used GSR data to estimate levels of arousal in an instant messaging application
(Wang, et al., 2004). Specifically, spikes in the GSR data were used to infer high
levels of arousal, and the return to lower amplitudes signaled decreased level of
arousal.
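The spike-based arousal heuristic described above can be sketched with a simple threshold rule. The threshold and the sample values here are invented for illustration; Wang and colleagues do not publish their exact algorithm:

```python
def arousal_from_gsr(samples, baseline, spike_threshold=0.3):
    """Label each GSR sample 'high' or 'low' arousal: a spike well above
    baseline signals high arousal, a return toward baseline signals
    decreased arousal."""
    return [
        "high" if s - baseline > spike_threshold else "low"
        for s in samples
    ]

# Invented skin-conductance readings (microsiemens) around a 2.0 baseline.
readings = [2.0, 2.1, 2.9, 3.1, 2.2, 2.0]
print(arousal_from_gsr(readings, baseline=2.0))
# → ['low', 'low', 'high', 'high', 'low', 'low']
```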
Output from electroencephalograph (EEG) has also been used to classify emotional
state into distinct categories within arousal valence space by several researchers
(Khalili & Moradi, 2009; Sloten, et al., 2008). These studies use AI learning
techniques to classify affective state into a small number of categories with moderate
success.
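A toy version of such a category classifier, using a nearest-centroid rule over EEG band-power features, is sketched below. The two-feature space and the class centroids are fabricated for illustration; the cited systems learn from far richer feature sets:

```python
import math

# Fabricated class centroids in a two-feature space:
# (alpha band power, beta band power), arbitrary units.
CENTROIDS = {
    "high arousal / negative valence": (0.2, 0.9),
    "high arousal / positive valence": (0.4, 0.8),
    "low arousal / positive valence":  (0.9, 0.2),
}

def classify_eeg(features):
    """Assign an EEG feature vector to the nearest class centroid."""
    return min(CENTROIDS, key=lambda c: math.dist(features, CENTROIDS[c]))

print(classify_eeg((0.85, 0.25)))  # → low arousal / positive valence
```

In practice the centroids (or a more capable model such as an SVM or neural network) would be fitted to labeled training data rather than hand-placed.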
3.2.4 Manual Selection
The most basic method of adding emotional content to an instant message is by
simple manual selection or insertion. This method can take the form of a user
selecting from a list of predefined emotions or emotional icons with a mouse click, or
by explicitly inserting a marker, e.g., an emoticon, directly into the body of the
message text. This type of input technique is widely used, and is seen in research by
Fabri and Moore (2005), Sanchez and colleagues (2006), and Wang and colleagues
(2004).
3.3 Output Techniques
Output techniques describe methods that can be used to display emotional content to
a chat participant after it has been input into the system. These techniques generally
involve using emoticons, expressive avatars, haptic devices, and kinetic typography.
3.3.1 Emoticons
Emoticons are typically understood as small text-based or graphical representations of
faces that characterize different affective states, and have been ever evolving in an
attempt to remedy the lack of non-verbal cues during text chat (Lo, 2008). Examples
of commonly used emoticons are presented in the table below.
Meaning Western Emoticon Eastern Emoticon
Happy :-) (^_^)
Sad :-( (T_T)
Surprised :-o O_o
Angry >:-( (>_<)
Wink ;-) (~_^)
Annoyed :-/ (>_>)
Table 3.1 Common emoticons in Western and Eastern cultures
Emoticons are perhaps the most widely used method for augmenting textual
communication with affective information. A survey of 40,000 Yahoo Messenger
users reported that 82% of respondents used emoticons to convey emotional
information during chat (Yahoo, 2010). A separate study by Kayan and colleagues
explored differences in IM behavior between Asian and North American users and
reported that of 34 total respondents, 100% of Asian subjects used emoticons while
72% of North Americans (with an aggregate of 85% of respondents) used emoticons
on a regular basis (Kayan, Fussell, & Setlock, 2006). These usage statistics
underscore the prevalence of emoticons in IM communication.
Sanchez and colleagues introduced an IM application with a unique twist on the
standard emoticon. Typical emoticons scroll with the text they are embedded in, and
so lack the ability to convey anything more than brief fleeting glimpses of emotion.
This novel application has a persistent area for emoticons that can be updated as often
as the user sees fit, and does not leave the screen as messages accumulate (Sanchez,
et al., 2006). Building on Russell’s model of affect (Russell, 1980), the team
developed 18 different emoticons, each with three levels of intensity to represent a
significant portion of valence/arousal emotional space.
3.3.2 Expressive Avatars
Using expressive avatars to convey emotion during IM communication may be
considered a close analog to the way emotion is encoded by facial expression during
face-to-face conversation, considering that facial expression is among the highest
bandwidth channels of sentic modulation (Pantic, et al., 2005).
Figure 3.1 Examples of expressive avatars

A study by Kaliouby and Robinson presents an instant messaging application called
FAIM, which uses automated facial expression recognition to infer affect, and
displays an expressive avatar reflecting that affect to the chat partner (Kaliouby &
Robinson, 2004). Affective states currently supported by FAIM include happy,
surprised, agreeing, disagreeing, confused, indecisive, and neutral.

Fabri and Moore investigated the use of animated avatars capable of emotional facial
expressions in an instant messaging environment (Fabri & Moore, 2005). They
compared this with a condition in which the avatar was not animated and did not
change facial expressions, except for minor random eyebrow movement. The
hypothesis was that the condition in question would result in a higher level of
“richness,” comprised of high levels of task involvement, enjoyment, sense of
presence, and sense of copresence. Participants interacted through the IM application
during a classical survival exercise, in which both subjects were tasked with
collectively ordering a list of survival items in terms of importance. An avatar
representing each chat partner could be made to display one of Ekman’s six universal
facial expressions, including happiness, surprise, anger, fear, sadness, and disgust
(Ekman & Oster, 1979), by clicking on a corresponding icon in the interface.
Significant results from the study indicated higher levels of task involvement and
copresence in the expressive avatar condition, equally high levels of presence in both
conditions, and a higher level of enjoyment in the non-expressive avatar condition.
The AffectIM application developed by Neviarouskaya also uses expressive avatars
to convey emotion during instant message communication (Neviarouskaya, et al.,
2007). Rather than requiring a user to select an expression from a predefined set,
AffectIM infers the emotional content of a message by analyzing the text of the
message itself, and automatically updates an avatar with the inferred emotion. A
comparison study identified differences between separate configurations of the
AffectIM application: one in which emotions were automatically inferred, one that
required manual selection of a desired emotion, and one that selected an emotion
automatically in a pseudo-random fashion (Neviarouskaya, 2008). The study
compared “richness” between conditions, comprised of interactivity, involvement,
sense of copresence, enjoyment, affective intelligence, and overall satisfaction.
Significant differences indicated a higher sense of copresence in the automatic
condition than in the random condition, and higher levels of emotional intelligence in
both the automatic and manual conditions than in the random condition.
3.3.3 Haptic Devices
Haptic instant messaging is described as instant messaging that employs waveforms
of varying frequencies, amplitudes, and durations, transmitted and received by
purpose-built haptic devices (force-feedback joysticks, haptic touchpads, etc.), to
which special emotional meaning can be attached (Rovers & Essen, 2004). Rovers
and Essen introduce their idea of “hapticons,” which are described as “small
programmed force patterns that can be used to communicate a basic notion in a
similar manner as ordinary icons are used in graphical user interfaces.” Their
preliminary application, HIM, parses instant message text for occurrences of
traditional emoticons, e.g., :), etc., and sends a predefined waveform to any
connected haptic devices. For example, a smiley face sends a waveform with
moderate frequency that slowly grows in amplitude, while a frowny face is
represented by several abrupt pulses with high frequency and amplitude.
Emoticon Meaning Hapticon Waveform
:-) Happy (moderate frequency, slowly growing amplitude)
:-( Sad (abrupt high-frequency, high-amplitude pulses)
Table 3.2 Example of hapticons
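The emoticon-to-waveform mapping that HIM performs can be sketched as follows. This is an illustrative Python sketch, not the HIM implementation; the exact frequencies, durations, and function names are assumptions, with only the qualitative waveform shapes taken from the description above.

```python
import math

# Hypothetical hapticon recipes: each emoticon maps to a waveform shape.
# Parameter values are illustrative; Rovers & Essen do not publish them.
HAPTICONS = {
    ":)": {"freq_hz": 40.0, "envelope": "ramp_up"},   # moderate frequency, growing amplitude
    ":(": {"freq_hz": 150.0, "envelope": "pulses"},   # abrupt high-frequency pulses
}

def render_waveform(emoticon, duration_s=1.0, sample_rate=1000):
    """Synthesize a force waveform (samples in [-1, 1]) for an emoticon."""
    recipe = HAPTICONS[emoticon]
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        carrier = math.sin(2 * math.pi * recipe["freq_hz"] * t)
        if recipe["envelope"] == "ramp_up":
            amp = t / duration_s            # amplitude slowly grows over the gesture
        else:                               # "pulses": several abrupt on/off bursts
            amp = 1.0 if (i // (n // 8)) % 2 == 0 else 0.0
        samples.append(amp * carrier)
    return samples

def parse_message(text):
    """Return the waveforms triggered by emoticons found in a message."""
    return [render_waveform(e) for e in HAPTICONS if e in text]
```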
The ContactIM application developed by Oakley and O’Modhrain takes a different
approach to integrating haptic information with an instant messaging environment. A
plugin for the Miranda IM environment was created that mimics the effects of tossing
a ball between partners by using a force enabled haptic device such as the Phantom or
a standard force-feedback joystick (Oakley, 2003). The application is designed to
allow each user to impart a specific velocity and trajectory to the ball during a throw.
The generated momentum of the ball is persistent until the chat partner picks it up. In
that way, the act of tossing the ball may convey some degree of emotionality, e.g., a
lightly thrown ball as a playful flirtatious gesture, or a fast throw to indicate
disagreement or anger. Emphasis is placed on the asynchronous nature of typical
instant message use, and the application has been designed to suit this mode of
interaction by keeping the general characteristics of the tossed ball persistent until
interaction by the receiver changes it.
3.3.4 Kinetic Typography
Kinetic typography is described as the real-time modification of typographic
characteristics such as animation, color, font, and size, and may be used to
convey affective information (Yeo, 2008). Yeo developed an IM client that inferred
affective meaning through keyword pattern matching, and used kinetic typography to
update the text of messages in real time (Yeo, 2008).
An instant messaging client developed by Wang and colleagues represents emotion in
arousal/valence space by combining kinetic typography with galvanic skin response.
Manually selected text animations are meant to represent valence, while GSR that is
recorded and displayed to the chat partner represents level of arousal. Users were
asked when they felt the most involved during the online communication, and
answers typically corresponded to peaks in GSR level (Wang, et al., 2004). The
study participants reported that the inclusion of arousal/valence information made the
communication feel more engaging and that it was preferred over traditional text-only
chat, although some users indicated that they would not always want their partner to
be aware of their arousal level (Wang, et al., 2004).
3.4 Conclusion
This chapter has separated the major components of emotional instant messaging into
two categories: input techniques and output techniques. Among input techniques,
inference from textual cues, inference through automated facial expression
recognition, inference from physiologic data, and manual selection have been
reviewed. Output techniques that were discussed include emoticons, expressive
avatars, haptic devices, and kinetic typography.
The next chapter introduces the Epoc headset and describes some of its capabilities.
This headset is used as the emotional input device for the EmoChat system discussed
in a subsequent chapter, and can be thought of as using automated facial expression
recognition in combination with physiologic data to infer and convey emotion. The
next chapter also presents a study that investigates the validity of the Epoc affect
classifier.
4 Study 1: Validating the Emotiv Epoc Headset
4.1 Introduction
This study investigates the validity of the Epoc headset in terms of how accurately it
measures levels of excitement, engagement, and frustration. Self-reported measures
of excitement, engagement, and frustration are collected after games of Tetris are
played at varied difficulty levels. The self-reported measures are compared with data
from the headset to look for relationships.
4.2 Overview of the Epoc Headset
The EmoChat application makes use of the Epoc headset for measuring affective state
and facial expression information. This headset, developed by Emotiv, was one of
the first consumer-targeted BCI devices to become commercially available.
Alternative BCI devices that were considered include the Neurosky Mindset and the
OCZ Neural Impulse Actuator. The Epoc was selected because of the comparatively
large number of electrodes (14) that it uses to sense electroencephalography (EEG)
and electromyography (EMG) signals, and the resulting capabilities. Additionally,
the Epoc has a growing community of active developers who form a support network
for other people using the software development kit to integrate headset capabilities
with custom applications.
Traditional EEG devices require the use of a conductive paste in order to reduce
electrical impedance and improve conductivity between the electrode and the scalp.
The Epoc device, however, replaces this conductive paste with saline-moistened felt
pads, which reduces set up time and makes clean up much easier.
A software development kit provides an application programming interface to allow
integration with homegrown applications, and a utility called EmoKey can be used to
associate any detection with any series of keystrokes for integration with legacy
applications. The developers have implemented three separate detection “suites”
which monitor physiologic signals in different ways, and are reviewed below.
4.2.1 Expressiv Suite
This suite monitors EMG activity to detect facial expressions including left/right
winks, blinks, brow furrowing/raising, left/right eye movement, jaw clenching,
left/right smirks, smiles, and laughter. The detection sensitivity can be modified
independently for each feature and for different users. Universal detection signatures
are included for each feature, but signatures can also be trained to increase accuracy.
Lower Face Movements Upper Face Movements Eye Movements
Smirk Right Brow Raise Look Left
Smirk Left Brow Furrow Look Right
Smile Wink Right
Laugh Wink Left
Jaw Clench Blink
Table 4.1 Facial expression features measured by the Epoc headset
4.2.2 Affectiv Suite
The Affectiv suite monitors levels of basic affective states including instantaneous
excitement, average excitement, engagement, frustration, and meditation. Detection
algorithms for each state are proprietary and have not been released to the public,
therefore the given labels may be somewhat arbitrary, and may or may not accurately
reflect affective state. The goal of the present study is to determine the accuracy of
these detections with a longer-term goal of investigating whether this information can
be used to augment the instant messaging experience through the presentation of
emotional content.
4.2.3 Cognitiv Suite
This suite allows a user to train the software to recognize an arbitrary pattern of
electrical activity measured by EEG/EMG that is associated with a specific,
repeatable thought or visualization. The user may then reproduce this specific pattern
to act as the trigger for a binary switch. Skilled users may train and be monitored for
up to 4 different thought patterns at once.
4.3 The Need for Validation
The Epoc affectiv suite purports to measure levels of excitement, engagement,
frustration, and meditation; however, the algorithms used to infer these states are
proprietary and closed-source. There has been little research that references the Epoc
headset, perhaps because it is still new and relatively unknown. No studies thus far
have evaluated the accuracy of its affective inference algorithms. Campbell and
colleagues used raw EEG data from the headset as input to a P300-based selection
engine (Campbell, et al., 2010), and several others have reviewed the device (Andrei,
2010; Sherstyuk, Vincent, & Treskunov, 2009). API methods are provided to retrieve
each affectiv suite score at any given moment, but they do not reveal how each
score is calculated. It is understandable that Emotiv has chosen to keep this part of
their intellectual property out of the public domain, but if these affectiv measurements
are to be used in any serious capacity by researchers or developers, evidence should
be provided to support the claim that reported affectiv suite excitement levels are
reasonable estimates of actual subject excitement levels, that affectiv suite
engagement levels are reasonable estimates of actual subject engagement levels, and
so on.
4.4 Experimental Design
A study was designed to determine the accuracy of the Epoc affectiv suite by
presenting study participants with stimuli intended to elicit different levels of
affective and physiologic responses (in the form of game play at varied levels), and
measuring for correlation between output from the headset and self-reported affective
experience. Since the overall goal of this thesis work is to investigate how the
inclusion of affective information enriches instant message communication, the
excitement, engagement, and frustration headset detections are validated here. It is
thought that they are the most applicable to a study of affective communication.
The study design used in the present study is adapted from similar work by Chanel
and colleagues, during which physiological metrics were monitored as participants
played a Tetris game (Chanel, Rebetez, Betrancourt, & Pun, 2008). The difficulty
level of the game was manipulated in order to elicit differing affective states. Self-
report methods were also used to collect data about participants’ subjective
experience of affect. The goal of the study was to use these physiological metrics
with machine learning techniques to classify affective experience into three
categories, including, anxiety, engagement, and boredom. It is thought that the
anxiety category of the Chanel study may be a close analog to the frustration
component of the Epoc affectiv suite.
4.4.1 TetrisClone System Development
A small Tetris application (TetrisClone) was developed to automate the experiment
and to aid with data collection. The application was written in C# using Microsoft
Visual Studio 2008 and interfaces with the Emotiv Epoc headset through the supplied
API.
The initialization screen for TetrisClone can be seen in fig. 4.1. This screen collects
the test subject’s name and is used to start logging data coming from the Epoc
headset. After logging begins, the right panel can be hidden so that it does not
distract the subject during the experiment. A screenshot of the TetrisClone application
during one of the trials is presented in fig. 4.2.
Figure 4.1 Initialization screen for the TetrisClone application
Figure 4.2 TetrisClone application during trials
4.4.2 Measures
4.4.2.1 Questionnaires
All participants completed a total of 3 surveys to collect basic demographic
information, self-reported levels of affect during the experiment, and open-ended
opinions about the causes of affect during game play. These questionnaires are
provided in appendices A-C. Demographics questions asked about age, gender,
familiarity with Tetris, and skill at computer/console games. Self-reported levels of
affect questions asked subjects to rate their experiences of excitement, engagement,
and frustration on a 5-point Likert scale between trial games. Open-ended questions
asked subjects to describe what game events made them feel excited, engaged, and
frustrated.
4.4.2.2 Headset Data
The TetrisClone application receives affective state information from the Epoc
headset at approximately 2 Hz and writes these to a hidden text box control (visible
in the left pane of fig. 4.1). As a participant completes one portion of the experiment
and moves on to the next, the contents of the text box control are output to a CSV file.
In this way, it becomes easier to determine which groupings of headset data records
are associated with which points in the experiment.
Figure 4.3 Example output from the TetrisClone application
The output CSV files themselves contain time-stamped, headset-reported levels of
excitement (short term), excitement (long term), engagement, frustration, and
meditation; however, the present study is only concerned with the excitement (short
term), engagement, and frustration components.
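The buffer-then-flush logging scheme described above can be sketched as follows. The actual TetrisClone logger is written in C#; this Python sketch is illustrative only, and the column names and class interface are assumptions.

```python
import csv
import time

# Assumed column layout, based on the description of the CSV output above.
AFFECT_COLUMNS = ["timestamp", "excitement_short", "excitement_long",
                  "engagement", "frustration", "meditation"]

class HeadsetLogger:
    """Buffer ~2 Hz affect samples, then flush one CSV per experiment segment."""

    def __init__(self):
        self.buffer = []

    def on_sample(self, levels):
        # `levels` is a dict of the five affectiv scores for one ~2 Hz update.
        row = {"timestamp": time.time(), **levels}
        self.buffer.append(row)

    def flush_segment(self, path):
        # Called as the participant moves to the next portion of the experiment,
        # so each CSV file groups the records belonging to one segment.
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=AFFECT_COLUMNS)
            writer.writeheader()
            writer.writerows(self.buffer)
        self.buffer = []
```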
4.5 Experimental Procedures
A total of seven subjects participated in the experiment; however, data from one subject
was incomplete due to problems with maintaining a consistently good signal quality
from the headset. This incomplete data is not included in the analysis. The
remaining subjects, (N=6), were all male aged 25-30 (mean=28.5), pretty to very
familiar with the Tetris game (mode=very familiar), occasionally to very frequently
played computer or console games (mode=very frequently), but never to rarely played
Tetris (mode=rarely). Participants rated themselves as being average to far above
average skill (mode=above average) when it came to playing computer or console
games.
Subjects arrived, were seated in front of a laptop, asked to review and sign consent
forms, and then completed the Demographics Questionnaire (Appendix A). The
subjects were then fitted with the Epoc headset. Care was taken to ensure that each
headset sensor reported strong contact quality in the Control Panel software. The
self-scaling feature of the Epoc headset required 15-20 minutes prior to data
collection. During this time, the subjects were asked to play several leisurely games
of Tetris as a warm-up activity.
Once the headset adjusted to the subjects, the Tetris game/Headset data recorder was
launched. Subjects played through a series of three Tetris games to determine “skill
level,” as calculated by averaging the highest levels reached during each game. The
TetrisClone application has no maximum level cap, although levels above 10 are so
difficult that progressing further is not practical. After each game the subjects rested
for 45 seconds to allow any heightened emotional states to return to baseline.
Once skill level was determined, three experimental conditions were calculated
automatically as follows:
High Difficulty Level = Skill Level + 2
Med Difficulty Level = Skill Level
Low Difficulty Level = Skill Level – 2
The subjects then played through a random ordered set of 6 trials consisting of 2
games at each difficulty level, e.g., [8,6,6,4,8,4]. Trials were randomized to account
for order effects. During each trial, games lasted for 3 minutes at a constant
difficulty/speed. If the subjects reached the typical game over scenario, the playing
field was immediately reset and the subjects continued playing until the 3 minutes
were over. At the end of each round, the subjects completed a portion of the
Experiment Questionnaire (Appendix B). The subjects rested for 45 seconds after
completing the questionnaire, but before beginning the next round, to allow emotional
state to return to baseline. Headset logging stopped after all 6 rounds had been
played.
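The skill-level arithmetic and trial randomization above can be restated as a short sketch (Python for illustration; the experiment software itself is C#, and the function names here are mine):

```python
import random

def difficulty_conditions(skill_games):
    """Derive the three difficulty conditions from the skill-assessment games.

    `skill_games` holds the highest level reached in each of the 3 games;
    skill level is their average, rounded to the nearest whole level.
    """
    skill = round(sum(skill_games) / len(skill_games))
    return {"low": skill - 2, "med": skill, "high": skill + 2}

def trial_order(conditions, games_per_condition=2, seed=None):
    """Return a randomized sequence of 6 trial difficulty levels."""
    trials = []
    for level in conditions.values():
        trials.extend([level] * games_per_condition)
    random.Random(seed).shuffle(trials)  # randomize to account for order effects
    return trials
```

For a subject whose three skill games top out at level 6, `difficulty_conditions([6, 6, 6])` yields low/med/high levels of 4, 6, and 8, and `trial_order` shuffles them into a sequence such as [8, 6, 6, 4, 8, 4].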
After the subjects finished all game play tasks the facilitator removed the headset, and
subjects completed the final Post-Experiment Questionnaire (Appendix C). The
subject was paid for his time, signed a receipt, and was free to leave.
4.6 Results and Analysis
4.6.1 Headset-Reported versus Self-Reported Levels of Affect
The main goal for this study was to determine whether or not the Epoc headset
reports data that is congruent with self-reported data of the same features. This was
done in order to establish the validity of the Epoc affectiv suite. Headset and self-
report data were compared a number of different ways. Each trial yielded 3 minutes’
worth of headset data sampled at 2 Hz (approximately 360 samples x 3 affective
features per sample = 1080 individual data elements), which were to be compared
with a single self-reported level of affect for each of the 3 affective features in
question (excitement, engagement, and frustration). Headset data from each trial was
reduced to 3 individual data elements by taking the mean of all sampled values for
each of the three affective features. Headset means from each trial were then paired
with the corresponding self-reported levels of affect for that trial. These data are
reproduced in the table below.
The non-parametric Spearman’s rho was selected as the correlation metric for
determining statistical dependence between headset and self-reported levels of affect
because of the mixed ordinal-interval data. The resulting correlation coefficients and
significances were calculated with SPSS and are presented in the table below (N=36).
Correlation Pair Coefficient Sig. (2-tailed)
Excitement 0.261 0.125
Engagement 0.361 0.030
Frustration -0.033 0.849
Table 4.3 Spearman correlation between headset and self-reported levels of affect (N=36)
This analysis suggests that of the three features of affect that were examined,
engagement appears to significantly correlate (p=.03) between what is reported by the
headset and what is experienced by the subject. No significant correlation of
excitement or frustration between headset and self-reported affect levels was found.
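The reduction-and-correlation procedure can be restated as a short sketch. The thesis analysis was run in SPSS; the pure-Python version below is illustrative only (function names are mine), computing Spearman's rho as the Pearson correlation of rank-transformed data.

```python
from statistics import mean

def trial_means(samples):
    """Reduce one trial's ~2 Hz headset samples to a single mean per feature.

    `samples` is a list of dicts keyed by affective feature; the 360-odd
    samples of a trial collapse to 3 numbers, one per feature.
    """
    return {k: mean(s[k] for s in samples)
            for k in ("excitement", "engagement", "frustration")}

def _ranks(values):
    """Average 1-based ranks, with tied values sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of 1-based positions i..j
        for k in order[i:j + 1]:
            ranks[k] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Pairing each trial's headset mean with the corresponding 5-point self-report rating and passing the two lists to `spearman_rho` reproduces the kind of comparison shown in table 4.3.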
The levels of headset reported excitement, engagement, and frustration presented in
table 4.2 are average levels over entire 3 minute trials. It could be possible that self-
reported affect levels collected after the trials might better relate to headset levels
averaged over smaller subdivisions of the trial time. For example, a subject who
experienced high levels of frustration during the last 15 seconds of game play, but
low levels of frustration at all other times may have self-reported a high level of
frustration after the trial, even though the subject generally experienced low levels.
To investigate whether headset data averaged over smaller subdivisions of trial time
better correlated with self-reported data, headset data was averaged from time slices
of the last 15, 30, 60, and first 60 seconds of trial data. Spearman’s rho was
calculated for each new dataset by comparing with the same self-report data. These
results are provided in the table below (N=36), along with original correlation results
from table 4.3.
Correlation Pair Time Division Coefficient Sig. (2-tailed)
Excitement all 3min 0.261 0.125
last 60s 0.133 0.439
last 30s 0.21 0.219
last 15s 0.174 0.31
first 60s 0.285 0.092
Engagement all 3min 0.361* 0.030
last 60s 0.340* 0.042
last 30s 0.291 0.085
last 15s 0.316 0.061
first 60s 0.229 0.179
Frustration all 3min -0.033 0.849
last 60s -0.102 0.554
last 30s -0.049 0.775
last 15s 0.005 0.977
first 60s 0.176 0.305
Table 4.4 Spearman correlation of headset and self-report data for varied time divisions (N=36)
This analysis suggests that no new significant relationships between headset and self-
report data are found when analyzing headset data from specific subdivisions of time
(last 60, 30, 15 seconds, and first 60 seconds), however, it does appear that self-
reported excitement and frustration levels correlate best with averaged headset data
from the first 60s of each trial.
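The time-slicing described above amounts to selecting the first or last n seconds of each trial's 2 Hz sample stream. A minimal illustrative sketch (Python; names are mine, not the thesis code):

```python
SAMPLE_RATE_HZ = 2  # affectiv updates arrive at roughly 2 Hz

def time_slice(samples, seconds, from_end=True):
    """Select the headset samples for one subdivision of a trial.

    `samples` is the chronological list of one trial's headset samples;
    seconds=60 with from_end=True gives the last 60 s, from_end=False
    the first 60 s.
    """
    n = int(seconds * SAMPLE_RATE_HZ)
    return samples[-n:] if from_end else samples[:n]

def slice_means(samples, feature):
    """Mean feature level for each time division used in table 4.4."""
    divisions = {"all 3min": samples,
                 "last 60s": time_slice(samples, 60),
                 "last 30s": time_slice(samples, 30),
                 "last 15s": time_slice(samples, 15),
                 "first 60s": time_slice(samples, 60, from_end=False)}
    return {name: sum(s[feature] for s in sub) / len(sub)
            for name, sub in divisions.items()}
```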
The data were further analyzed by calculating grand means for each difficulty
condition, and for both headset and self-reported levels of affect. Grand means are
presented in the table below.
Condition H_exc H_eng H_fru S_exc S_eng S_fru
low 0.29 0.54 0.42 2.75 3.25 1.25
med 0.31 0.55 0.46 3.33 3.58 2.08
high 0.30 0.57 0.43 3.25 3.75 3.25
Table 4.5 Grand mean headset and self-reported levels of affect per difficulty level
Spearman’s rho was again used as the metric for determining statistical dependence
between headset and self-reported affective levels of the grand mean data. Resulting
correlation coefficients and significances were calculated with SPSS and are
presented in the table below (N=3).
Correlation Pair Coefficient Sig. (2-tailed)
Excitement 1.00 0.01
Engagement 1.00 0.01
Frustration 0.50 0.667
Table 4.6 Grand mean Spearman correlation between headset and self-reported levels of affect (N=3)
This analysis suggests very high correlation of excitement and engagement between
headset and self-reported levels of affect (p=.01), however these results should be
interpreted with caution, considering that this is a correlation between means and the
number of values being compared is small (N=3). No significant correlation of
frustration between headset and self-report levels of affect was found.
To visualize the relationship between grand means of headset and self-reported affect
features, line charts are presented below.
Figure 4.4 Comparison of grand mean headset and self-reported levels of excitement
Figure 4.5 Comparison of grand mean headset and self-reported levels of engagement
Figure 4.6 Comparison of grand mean headset and self-reported levels of frustration
The significant correlation (p=.01) between headset and self-report levels of
excitement and engagement are apparent in fig. 4.4 and fig. 4.5. Frustration is seen to
correlate moderately well from low to medium difficulty conditions; however the
degree and direction of change from medium to high difficulty conditions are clearly
in disagreement. As noted above, cautious interpretation of figures 4.4-4.6 is prudent
considering the use of grand means and small number of data points, but the
emergence of relationships between headset and self-reported levels of excitement
and engagement is suggested.
4.6.2 Subjective Causes of Affect during Gameplay
After the game play portion of the experiment concluded, subjects were asked to
complete a brief survey to collect their subjective opinions about what caused their
experiences of excitement, engagement, and frustration during game play. Participant
responses were analyzed to identify general themes. The data were coded for these
themes, which have been aggregated and are presented in the table below.
Q1. What types of events caused you to get excited during game play?
Extracted Themes Participant Responses
(S3) “Doing well”
Competent game
(S1) “Clearing many lines at once”
performance
(S1) “Seeing a particularly good move open up”
(S5) “Speed increase of block decent”
Game speed/difficulty (S2) “When blocks got faster”
(S4) “Game speed”
(S5) “Block failing to land where desired”
(S3) “A poor move”
Poor game performance (S4) “A mistake during otherwise good game play”
(S6) “End of game when doing bad”
(S2) “When the blocks would get too high”
(S3) “A good sequence of pieces”
Positively perceived game
(S1) “Getting pieces I was hoping for”
variables
(S6) “Seeing blocks I needed to make 4 rows”
Q2. What made the game engaging?
Extracted Themes Participant Responses
(S2) “The intensity”
Game speed/difficulty
(S1) “Speed increases”
(S4) “You have to think fast”
(S6) “Have to think about what’s happening on board”
Cognitive load
(S1) “Anticipating future moves”
(S1) “Seeing game board get more full”
(S3) “Small learning curve in general”
Game simplicity (S1) “Simplicity”
(S4) “Few input options, enough to stay interesting”
Q3. What happened during the game that made you feel frustrated?
Extracted Themes Participant Responses
(S5) “Too many of same block in a row”
(S3) “Bad sequence of blocks”
Negatively perceived game (S1) “Getting the same piece over and over”
variables (S6) “Not getting the blocks I needed”
(S5) “Block did not fit pattern I was constructing”
(S1) “Getting undesired pieces”
(S5) “When a block would land out of place”
Poor game performance
(S3) “Poor move”
(S3) “Game moving too quickly”
Game speed/difficulty
(S4) “Game speed increase”
Table 4.7 Major themes identified in subjective affect survey results
4.7 Discussion
4.7.1 Consistency in the Present Study
The main goal of this study was to determine how accurately the Epoc headset
measures levels of excitement, engagement, and frustration in order to establish the
validity of the Epoc affectiv suite. To this end, the TetrisClone application was
developed and used to log headset output while subjects played games of Tetris at
varied difficulty levels. During the study, subjects were asked to self-report how
excited, engaged, and frustrated they felt for each game that they played.
The responses to the self-report excitement, engagement, and frustration questions
were then statistically compared with the output from the headset. This analysis
suggested that self-reported levels of engagement correlated well with levels reported
by the headset. To a lesser degree, the analysis suggested that excitement levels
measured by the headset correlated fairly well with self-reported levels. Frustration
levels measured by the headset, however, did not appear to correlate with self-
reported levels.
Subjective responses about what made the game engaging seem to corroborate the
self-report and headset data. General trends in the data described engagement as
increasing over low, medium, and high difficulty levels. The two main themes
identified in responses to “what makes the game engaging,” were game
speed/difficulty and cognitive load. As level increases, game speed inherently also
increases. It makes sense that increased difficulty of the game should demand greater
concentration, more planning, and more efficient decision making—all suggestive of
increased cognitive load. With respect to existing literature, the Chanel study, on
which the experimental design of the present study was based, found a similar upward
linear trend in participant arousal; however, the relationship between engagement and
arousal has not been established.
Excitement trends in self-report and headset data generally showed an increase from
low to medium difficulty, then a slight decrease in the high difficulty condition.
Responses to the question, “what types of events caused you to get excited during
game play,” support this trend. General themes extracted from responses to the
question include competent game performance, game speed/difficulty, poor game
performance, and positively perceived game variables (such as getting a block type
you were hoping for). Game speed increases as difficulty condition increases, so its
contribution to overall excitement level is always present in quantities that increase
with difficulty level. It might be assumed that competent game performance and poor
game performance have a balancing effect on one another, i.e., when one increases
the other decreases, thereby creating a single contributing factor to excitement that is
always present, and arguably stable. The decisive contributor to excitement level
may be the positively perceived game variables. It seems feasible that game variables
that happen to be in the player’s favor should occur at a similar frequency, regardless
of difficulty level. It may also be feasible that these occurrences are less noticed in
higher difficulty levels due to increased cognitive load; a mind occupied by game
tasks of greater importance. This lack of recognition of positive game variables may
be the reason that excitement increases from low to medium difficulty conditions, but
then decreases in the high condition. A similar trend reported by the Chanel study
occurs in the valence dimension. Valence is shown to increase from low to medium
difficulty conditions, then decrease in the high condition, although the relationship
between valence and excitement has not been established.
4.7.2 Future Direction
It might be beneficial to take a more granular approach to validating output from the
Epoc headset to determine whether specific game events, e.g., a mistake or an optimal
placement, influence affect levels reported by the headset. The present study only
looked at average headset output over large spans of time, but there is a great deal of
variability in the data, some of which might have a relationship with game events.
This more granular approach would require the ability to record specific game events,
e.g., clearing a line, and cross referencing with data from the headset. This could be
accomplished by recording video of the game play. It might also yield interesting
results if headset data were tested for any correlation with other known physiological
measures of affect such as GSR, or skin temperature.
4.8 Conclusion
With the accuracy of at least some of the headset-reported levels of affect established,
an instant message application was developed that uses output from the headset to
control a simple animated avatar and display data from the affectiv suite during
messaging sessions. This application is called EmoChat. In the next chapter, a study
investigates how EmoChat can be used to enrich emotional content during IM
sessions.
5 Study 2: Emotional Instant Messaging with EmoChat
5.1 Introduction
Existing instant messaging environments generally fail to capture non-verbal cues
that greatly increase information transfer bandwidth during face-to-face
communication. The present study is a step toward investigating the use of a low-
cost EEG device as a means to incorporate lost non-verbal information during
computer-mediated communication.
An Instant Messaging application (EmoChat) has been developed that integrates with
the Emotiv Epoc headset to capture facial movements that are used to animate an
expressive avatar. Output from the headset is also used to convey basic affective
states of the user (levels of excitement, engagement, and frustration).
The present study examines how emotional information is transferred differently
between users of the EmoChat application and a “traditional” instant messaging
environment in terms of emotionality. This study addresses the following research
questions:
1. Does the system facilitate communication that generally contains more
emotional information?
2. Does the system provide a greater degree of richness (as defined in section
5.3.1)?
3. Is the emotional state of participants more accurately conveyed/interpreted?
4. How usable is a system that implements this technology?
5.2 EmoChat System Development
5.2.1 Overview
EmoChat is a client/server application that facilitates the exchange of emotional
information during instant message communication. The application was developed
with C# in Microsoft Visual Studio 2008.
Traditional instant messaging environments typically rely on manually generated
emoticons in order to shape the emotional meaning of a message. EmoChat
introduces a novel way to capture and convey emotional meaning by integrating with
the Emotiv Epoc headset, a low cost, commercially available EEG device that is
capable of inferring facial expression and basic affective information from raw EEG
data.
Figure 5.1 The EmoChat client application
Facial expression information that is captured by the Epoc headset is passed to
EmoChat and used to animate a simple avatar with brow, eye, and mouth movements.
Affective information captured by the headset is used to modify the value of a series
of progress bar style widgets. The previous validation study suggested that
excitement and engagement are reasonably estimated by the headset; however, it was
decided that other affective measures from the headset (frustration and meditation)
would also be presented in the EmoChat application to give the users a chance to
decide for themselves whether or not these measures are of any value.
Although the application has been specifically designed to integrate with the Epoc
headset, a headset is not required. All facial movements and affective levels may be
manually manipulated by the user at a granular level; for example, a user may
override brow control while leaving eye, mouth, and affect control determined by the headset.
Manual override of facial and affect control is permitted whether or not a headset is
being used. A summary of the facial movements and affective information conveyed
by EmoChat is presented below:
Eyebrow         Eyes          Mouth            Affect
Strong raise    Blink         Laugh            Excitement
Weak raise      Left wink     Strong smile     Average excitement
Neutral         Right wink    Weak smile       Engagement
Weak furrow     Neutral       Neutral          Frustration
Strong furrow   Left look     Clench (frown)   Meditation
                Right look
Table 5.1 Facial movements and affective information used by EmoChat
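The per-channel mapping in Table 5.1, combined with the granular manual override described above, can be sketched as follows. This is an illustrative sketch only: EmoChat itself was written in C#, and the names used here (CHANNEL_STATES, resolve_channel) are hypothetical, not taken from the application.

```python
# Discrete display states per facial channel, transcribed from Table 5.1.
CHANNEL_STATES = {
    "eyebrow": ["strong raise", "weak raise", "neutral",
                "weak furrow", "strong furrow"],
    "eyes":    ["blink", "left wink", "right wink", "neutral",
                "left look", "right look"],
    "mouth":   ["laugh", "strong smile", "weak smile", "neutral",
                "clench (frown)"],
}

def resolve_channel(channel, headset_state, manual_override=None):
    """Return the state to display for one facial channel.

    A manual override, when set, takes precedence over the headset reading;
    channels without an override continue to track the headset, so a user
    may override the brow alone while eyes and mouth stay headset-driven.
    """
    state = manual_override if manual_override is not None else headset_state
    if state not in CHANNEL_STATES[channel]:
        raise ValueError(f"unknown state {state!r} for channel {channel}")
    return state
```

Because the override is resolved per channel, the same logic applies uniformly whether or not a headset is connected: with no headset, every channel simply carries a manual value.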
5.2.2 Traditional Environment
A "traditional" instant messaging application was approximated by removing the
avatar and affect meters from the original EmoChat application, so that subjects
would be required to convey emotional information strictly through text. This
trimmed-down IM environment presents subjects with standard chat input and
output panes.
5.2.3 Application Architecture
The EmoChat application architecture follows the client/server model. A server
application listens for connections from networked client applications. After a
connection is established, the server monitors for data transmissions. Once data is
received, the server retransmits the data to all other connected clients. The clients
and server may reside on one or more computers networked across a LAN or WAN.
This configuration allows the server to log all communication events for offline
analysis without requiring any extra effort by client users.
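The relay-and-log behaviour described above can be sketched as follows. This is a minimal sketch in Python rather than the thesis's C#, with hypothetical names (RelayServer, receive) and in-memory message queues standing in for network sockets.

```python
class RelayServer:
    """Sketch of the relay model: the server accepts client registrations,
    retransmits each incoming message to all other connected clients, and
    logs every event centrally for offline analysis."""

    def __init__(self):
        self.clients = {}  # client_id -> outbound message queue
        self.log = []      # (sender_id, payload) event log

    def connect(self, client_id):
        """Register a newly connected client."""
        self.clients[client_id] = []

    def receive(self, sender_id, payload):
        """Log the message, then fan it out to every client but the sender."""
        self.log.append((sender_id, payload))
        for client_id, queue in self.clients.items():
            if client_id != sender_id:
                queue.append(payload)
```

Because logging happens at the single relay point, clients need no extra instrumentation, which is the property the thesis relies on for offline analysis.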
Figure 5.2 EmoChat server application
EmoChat introduces a novel way to share emotional information during instant
message communication by integrating with one of the first commercially available
brain-computer interface devices. This application not only demonstrates a new and
interesting way to enrich computer-mediated communication, but also serves as an
evaluation tool for the Epoc EEG headset and its inference algorithms. The results
from studies performed with EmoChat may contribute toward a foundation for future
research into BCI applications.
5.3 Experimental Design
The present study follows a crossover-repeated measures design (within-subjects).
Paired subjects spend time chatting with one another using both the EmoChat
application (HE condition), and a “traditional” instant messaging environment (TE
condition). The order in which subjects are presented with these environments is
determined by random assignment to one of two groups.
Group   Condition Order
I       TE then HE
II      HE then TE
Table 5.2 EmoChat experimental groups
Additionally, roles of expresser (EX) and perceiver (PX) are divided between each
subject pair. These roles are used to determine which version of the Emotional
Transfer Accuracy Questionnaire (ETAQ) is completed (discussed below). EX and
PX roles are swapped by paired subjects between each experimental condition.
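The counterbalanced assignment above, random group assignment fixing the condition order, with EX/PX roles swapping between conditions, can be sketched as follows. The names (CONDITION_ORDERS, build_schedule) are illustrative, not part of the study materials.

```python
import random

# Condition orders from Table 5.2: Group I sees TE then HE, Group II the reverse.
CONDITION_ORDERS = {"I": ("TE", "HE"), "II": ("HE", "TE")}

def build_schedule(pair, group, rng):
    """Return the two sessions for one subject pair.

    Each session is (condition, expresser): the named subject takes the
    expresser (EX) role and the other subject is the perceiver (PX).
    Roles swap between the first and second condition.
    """
    a, b = pair
    first_ex = rng.choice([a, b])           # who expresses in condition one
    second_ex = b if first_ex == a else a   # roles swap for condition two
    first_cond, second_cond = CONDITION_ORDERS[group]
    return [(first_cond, first_ex), (second_cond, second_ex)]
```

Randomizing group membership per pair, while deriving condition order and role swaps deterministically from it, keeps order effects balanced across the sample.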