Eduardo Coutinho - Psychoacoustic cues to emotion in speech prosody and music
1. SWISS CENTER FOR
AFFECTIVE SCIENCES
Psychoacoustic cues to emotion in
Music and Speech Prosody
Eduardo Coutinho
SwissNex, San Francisco
May 2013
2. Aims of this talk
• Basic acoustic features allow people to perceive
emotional meaning in music and the human voice
• Both media achieve emotional expression through
particular configurations of these features
• It is possible to make accurate predictions of the
emotions expressed by music and voice from
acoustic features alone
3. Speech Prosody
• Changes in tone of voice while speaking
communicate emotional meaning (among
other things) independently of verbal
understanding
• Consistent associations exist between patterns
of acoustic cues (e.g., speech rate, F0,
loudness) and particular emotions
• Similar emotions are communicated
intra- and cross-culturally through
similar arrangements of acoustic cues
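Two of the acoustic cues named on this slide, F0 and loudness, can be measured directly from the waveform. The following is a minimal sketch, not the method used in the talk: it assumes a crude autocorrelation pitch estimate and an RMS level in dB as a stand-in for loudness, and it runs on a synthetic 200 Hz tone. The function names, the 75 to 500 Hz search range, and the test signal are all illustrative assumptions.

```python
import math

def rms_db(frame):
    """RMS level of a frame in dB relative to full scale (a rough loudness proxy)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(max(rms, 1e-12))

def estimate_f0(frame, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate F0 in Hz from the autocorrelation peak within a plausible lag range."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation of the frame with itself at this lag
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic stand-in for a voiced speech frame: a 200 Hz sine, 0.1 s at 8 kHz
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]

f0 = estimate_f0(tone, sample_rate)   # fundamental frequency estimate in Hz
level = rms_db(tone)                  # level in dB re full scale
```

Real speech would need framing, voicing detection, and a perceptual loudness model; this sketch only shows that such cues are computable from the signal alone.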
JOURNAL OF CROSS-CULTURAL PSYCHOLOGY
Scherer et al. / VOCAL EMOTION EXPRESSION
Whereas the perception of emotion from facial expression has been extensively studied cross-culturally, little
is known about judges’ ability to infer emotion from vocal cues. This article reports the results from a
study conducted in nine countries in Europe, the United States, and Asia on vocal emotion portrayals of
anger, sadness, fear, joy, and neutral voice as produced by professional German actors. Data show an overall
accuracy of 66% across all emotions and countries. Although accuracy was substantially better than chance,
there were sizable differences ranging from 74% in Germany to 52% in Indonesia. However, patterns of confusion
were very similar across all countries. These data suggest the existence of similar inference rules from
vocal expression across cultures. Generally, accuracy decreased with increasing language dissimilarity
from German in spite of the use of language-free speech samples. It is concluded that culture- and
language-specific paralinguistic patterns may influence the decoding process.
EMOTION INFERENCES FROM VOCAL EXPRESSION
CORRELATE ACROSS LANGUAGES AND CULTURES
KLAUS R. SCHERER
University of Geneva, Switzerland
RAINER BANSE
Humboldt University Berlin, Germany
HARALD G. WALLBOTT
University of Salzburg, Austria
One of the key issues of current debate in the psychology of emotion concerns the universality
versus cultural relativity of emotional expression. This has important implications for the
central question of the nature and function of emotion. Although there is a general consensus
that both biological and cultural factors contribute to the emotion process (see Mesquita,
Frijda, & Scherer, 1997), the relative contribution of each of the factors, or the respective
amount of variance explained, remains to be explored. An ideal way to study this issue empirically
is to compare outward manifestations of emotional reactions with similar appraisals of
Decoding speech prosody in five languages
WILLIAM FORDE THOMPSON and L-L. BALKWILL
4. Japanese Psychological Research
2004, Volume 46, No. 4, 337–349
Special Issue: Cognition and emotion in music
Recognition of emotion in Japanese, Western, and
Hindustani music by Japanese listeners
LAURA-LEE BALKWILL
Queen’s University, Kingston, Ontario K7L 3N6, Canada
WILLIAM FORDE THOMPSON
University of Toronto, Mississauga, Ontario M5S 1A1, Canada
RIE MATSUNAGA
Department of Psychology, Graduate School of Letters, Hokkaido University,
Kita-ku, Sapporo 060-0810, Japan
Abstract: Japanese listeners rated the expression of joy, anger and sadness in Japanese,
Western, and Hindustani music. Excerpts were also rated for tempo, loudness, and complexity.
Listeners were sensitive to the intended emotion in music from all three cultures,
and judgments of emotion were related to judgments of acoustic cues. High ratings of joy
were associated with music judged to be fast in tempo and melodically simple. High ratings
of sadness were associated with music judged to be slow in tempo and melodically
complex. High ratings of anger were associated with music judged to be louder and more
complex. The findings suggest that listeners are sensitive to emotion in familiar and
unfamiliar music, and this sensitivity is associated with the perception of acoustic cues that
transcend cultural boundaries.
Key words: recognition of emotion, music, cross-culture.
Music is strongly associated with emotions.
Evocative music is used in advertising, television,
movies, and the music industry, and the effects
are powerful. Listeners readily interpret the
Robitaille, 1992; Terwogt & Van Grinsven, 1991).
In some cases, listening to music may give
rise to changes in mood and arousal (Husain,
Thompson, & Schellenberg, 2002; Thompson,
Music
• Listeners perceive emotional meaning by
attending (consciously or unconsciously)
to structural aspects of the acoustic signal
• Consistent associations between acoustic
patterns and particular emotions
• At least some emotions are
communicated universally by means of
similar acoustic profiles
Current Biology 19, 1–4, April 14, 2009 ©2009 Elsevier Ltd All rights reserved DOI 10.1016/j.cub.2009.02.058
Universal Recognition
of Three Basic Emotions in Music
Thomas Fritz,1,* Sebastian Jentschke,2 Nathalie Gosselin,3
Daniela Sammler,1 Isabelle Peretz,3 Robert Turner,1
Angela D. Friederici,1 and Stefan Koelsch1,4,*
1Max Planck Institute for Human Cognitive and Brain Sciences
04103 Leipzig
Germany
2UCL Institute of Child Health
London WC1N 1EH
UK
3Department of Psychology and BRAMS
Université de Montréal
Montréal H2V 4P3
Canada
4Department of Psychology
Pevensey Building
University of Sussex
Falmer BN1 9QH
UK
Summary
It has long been debated which aspects of music perception
are universal and which are developed only after exposure to
a specific musical culture [1–5]. Here, we report a crosscultural
study with participants from a native African population
(Mafa) and Western participants, with both groups being
naive to the music of the other respective culture. Experiment 1
investigated the ability to recognize three basic
emotions (happy, sad, scared/fearful) expressed in Western
music. Results show that the Mafas recognized happy, sad,
and scared/fearful Western music excerpts above chance,
indicating that the expression of these basic emotions in
Western music can be recognized universally. Experiment
2 examined how a spectral manipulation of original, naturalistic
music affects the perceived pleasantness of music in
Western as well as in Mafa listeners. The spectral manipulation
modified, among other factors, the sensory dissonance
of the music. The data show that both groups preferred original
Western music and also original Mafa music over their
spectrally manipulated versions. It is likely that the sensory
dissonance produced by the spectral manipulation was at
least partly responsible for this effect, suggesting that
consonance and permanent sensory dissonance universally
influence the perceived pleasantness of music.
Results and Discussion
The expression of emotions is a basic feature of Western
music, and the capacity of music to convey emotional expressions
is often regarded as a prerequisite to its appreciation in
Western cultures. This is not necessarily the case in non-Western
music cultures, many of which do not similarly
emphasize emotional expressivity, but rather may appreciate
music for qualities such as group coordination in rituals. To
our knowledge, there has not yet been a con
tion into the universals of the recognition of e
sion in music and music appreciation. Th
musical universals with Western music stimu
ipants who are completely naive to Western
viduals from non-Western cultures who hav
Western music occasionally, and perhap
explicit attention to it (e.g., while listening
watching a movie), do not qualify as part
musical knowledge is usually acquired imp
even shaped through inattentive listening ex
individuals investigated in our study belon
one of approximately 250 ethnic groups that
ulation of Cameroon. They are located in the
the Mandara mountain range, an area cult
a result of a high regional density of endem
more remote Mafa settlements do not have
and are still inhabited by many individuals w
tional lifestyle and have never been exposed
The investigation of the recognition of e
sions conveyed by the music of other cultu
addressed in three previous studies [1, 7,
aimed to investigate cues that transcend cu
and the authors made an effort to include l
prior exposure to the music presented
listening to Hindustani music). Although th
significantly enhanced our understanding
experience may influence music perception
in these studies were exposed to the mas
also inadvertently to emotional cues of the r
music (for example, by the association of this
To draw clear conclusions about music univ
is necessary to address music listeners wh
culturally isolated from one another. Her
a research paradigm to investigate the reco
emotion in two groups: Mafa listeners naive
and a group of Western listeners naive to M
iment 1 was designed to examine the rec
basic emotions as expressed by Western m
and scared/fearful), using music pieces tha
previously to investigate the recognition of
brain-damaged patients [9, 10].
Data from experiment 1 showed that al
expressions (happy, sad, and scared/fearful
above chance level by both Western an
(Figure 1A, see also Supplemental Data av
statistical evaluation; note that the Mafa lis
been exposed to Western music before). H
listeners showed considerable variability in t
and 2 of the 21 Mafa participants perfo
level. The mechanism underlying the univer
emotional expressions conveyed by Western
appears to be quite similar for both Western
Mafas: an analysis of rating tendencies reveal
and Westerners relied on temporal cues and
judgment of emotional expressions, althoug
more marked in Western listeners (see Sup
and Discussion for details). For the tempo,
*Correspondence: fritz@cbs.mpg.de (T.F.), koelsch@cbs.mpg.de (S.K.)
5. Communication of Emotions in Vocal Expression and Music Performance:
Different Channels, Same Code?
Patrik N. Juslin and Petri Laukka
Uppsala University
Many authors have speculated about a close relationship between vocal expression of emotions and
musical expression of emotions, but evidence bearing on this relationship has unfortunately been lacking.
This review of 104 studies of vocal expression and 41 studies of music performance reveals similarities
between the 2 channels concerning (a) the accuracy with which discrete emotions were communicated
to listeners and (b) the emotion-specific patterns of acoustic cues used to communicate each emotion. The
patterns are generally consistent with K. R. Scherer’s (1986) theoretical predictions. The results can
explain why music is perceived as expressive of emotion, and they are consistent with an evolutionary
perspective on vocal expression of emotions. Discussion focuses on theoretical accounts and directions
for future research.
Music: Breathing of statues.
Perhaps: Stillness of pictures. You speech, where speeches end.
You time, vertically poised on the courses of vanishing hearts.
Feelings for what? Oh, you transformation of feelings into
. . . audible landscape!
that proposals about a close relationship between vocal expression
and music have a long history (Helmholtz, 1863/1954, p. 371;
Rousseau, 1761/1986; Spencer, 1857). In a classic article, “The
Origin and Function of Music,” Spencer (1857) argued that vocal
music, and hence instrumental music, is intimately related to vocal
expression of emotions. He ventured to explain the characteristics
Psychological Bulletin Copyright 2003 by the American Psychological Association, Inc.
2003, Vol. 129, No. 5, 770–814 0033-2909/03/$12.00 DOI: 10.1037/0033-2909.129.5.770
Shared expressive acoustic profiles?
11. Computational framework
• Recurrent neural network
• Nonlinear regression model
• Training phase: learn from
human listeners
• Evaluation phase: perceive
emotional meaning as
human listeners
[Figure: recurrent network diagram with input units, hidden units (H0 ... Hn), context units (C0 ... Cn), and output units (O0, O1). Diagram labels: Music, Speech; Psychoacoustic encoding; Psychoacoustic features; Arousal, Valence.]
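The framework on this slide, a recurrent network with context units used as a nonlinear regression model, can be sketched as a forward pass. This is a hedged illustration only: the slides do not give layer sizes, activation functions, or the training procedure, so the tanh hidden units, linear outputs, random untrained weights, and the three made-up input features below are all assumptions. The class name `SimpleRecurrentNet` is hypothetical.

```python
import math
import random

class SimpleRecurrentNet:
    """Forward pass of a context-unit recurrent network (random, untrained weights)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = mat(n_hidden, n_in)        # input units   -> hidden units
        self.w_ctx = mat(n_hidden, n_hidden)   # context units -> hidden units
        self.w_out = mat(n_out, n_hidden)      # hidden units  -> output units
        self.context = [0.0] * n_hidden        # context starts at zero

    def step(self, features):
        hidden = []
        for w_i, w_c in zip(self.w_in, self.w_ctx):
            total = sum(w * x for w, x in zip(w_i, features))
            total += sum(w * c for w, c in zip(w_c, self.context))
            hidden.append(math.tanh(total))    # assumed nonlinearity
        self.context = hidden                  # context units copy the hidden state
        # Linear outputs, since the model is used for regression
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]

    def run(self, sequence):
        self.context = [0.0] * len(self.context)   # reset memory per excerpt
        return [self.step(frame) for frame in sequence]

# Hypothetical input: 4 time frames of 3 made-up psychoacoustic features each
frames = [[0.2, 0.5, 0.1], [0.3, 0.6, 0.1], [0.8, 0.9, 0.4], [0.7, 0.9, 0.5]]
net = SimpleRecurrentNet(n_in=3, n_hidden=5, n_out=2)
trajectory = net.run(frames)   # one [arousal, valence] pair per frame
```

Because the context units feed the previous hidden state back in, the same input frame can yield different outputs depending on what preceded it. In the training phase described on the slide, the weights would be fitted to listeners' ratings rather than left random.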
21. Summary
• Low-level acoustic parameters are fundamental
cues in the expression of emotion in music and
speech.
• Affective information is, at least partially, encoded
as dynamic acoustic patterns
• A significant part of listeners’ perceived emotions
in music and speech can be predicted from a small
set of basic variables in human audition
22. Possible applications
• Diagnosis of psychiatric/psychological conditions (speech)
• Improvement of emotional communication proficiency
(e.g., speakers, actors, singers, musicians)
• Human-computer and computer-mediated human-human
interaction: automatic recognition of emotion in voice
• Music selection for ...
• ... Mood induction/regulation
• ... Cognitive skills enhancement and patient recovery
• ... Music in public spaces
• Improvement of hearing aid responses to music
23. Thank you!
• Collaborators: Prof. Angelo Cangelosi, Prof. Nikki Dibben
• Contact: eduardo.coutinho@unige.ch
• More information: www.eadward.org