2. ICMI ’21 Companion, October 18–22, 2021, Montréal, QC, Canada Jennifer Hamet Bagnou et al.
Current technological developments allow virtual agent plat-
forms to train human users to acquire a set of social skills and thus
reduce these social dysfunctions. Several studies have shown the
effectiveness of such platforms in the management of disorders
such as schizophrenia, social anxiety, or Autism Spectrum Disorder
[Coffey 2017, Klaassen 2018, Rogers 2017, Wang 2012, Howard 2020,
Hoque 2013]. However, these recent virtual agent platforms are
not yet able to initiate and maintain fluid, natural and personalised
interactions with users. For example, virtual agents are not yet
able to continuously adapt their behaviours during the interaction
according to the verbal and non-verbal cues expressed by the user
and the social context throughout the interaction [Chollet 2018,
Ali 2015]. Moreover, most existing virtual agent platforms do not
follow the recognised framework of Cognitive Behavioural Therapies
(CBT), notably Social Skills Training (SST) and cognitive
restructuring, which makes it difficult to adapt the virtual agent
platforms to existing therapies.
In addition, we need to identify performance measures that
would allow us, on the one hand, to evaluate the social skills of in-
dividuals and, on the other hand, to follow their progression during
training [Dindar 2020, Stadler 2020].
2 RELATED WORK
Collaborative Problem Solving (CPS) is defined as the joint ability
to solve problems and work towards a common goal while socially
collaborating with each other in a group of individuals [O’Neil
2010, Stadler 2020]. It therefore represents a relevant framework for
studying and training social skills. Indeed, as individuals enter the
workforce, they are expected to work with others to solve complex,
non-automatable problems, make decisions and generate new ideas,
which require skills associated with CPS. We also interact with
others to solve problems in our private lives, whether with family,
friends or strangers. In all these contexts, CPS requires
sufficient cognitive problem-solving and social collaboration skills
[Hesse 2015, Graesser 2018, Stadler 2020].
The Program for International Student Assessment (PISA) 2012
builds on the theoretical framework of the Assessment and Teaching
of 21st century skills (ATC21S) project and defines CPS as the
ability "to engage effectively in a process in which two or more
agents attempt to solve a problem by sharing the understanding and
effort required to reach a solution and by pooling their knowledge,
skills, and efforts to achieve that solution" [OECD 2017]. Thus, CPS
would involve two categories of skills: cognitive skills associated
with problem-solving processes and social skills associated with
collaborative processes [Andrews-Todd 2018, Hesse 2015, Dindar
2020].
Regarding cognitive skills, group members need to work together
to develop a shared understanding of the problem situation, ex-
change information, discuss the most appropriate strategies for
solving the problem, and monitor and revise their strategies until
the group’s goals are met [Barron 2003, Slof 2010, Zimmerman
2011]. Indicators of these skills can be summarized under two head-
ings: task regulation and knowledge construction. Task regulation
refers to learners’ ability to set goals, manage resources, analyse
and organize the problem space, explore a problem systematically,
gather information, and tolerate ambiguity. Knowledge construc-
tion concerns the person’s ability to understand the problem and
test hypotheses.
Regarding social skills, communication processes between team
members can either facilitate or hinder the collaborative processes
in the cognitive dimension [Janssen 2012]. The theoretical frame-
work proposed by [Hesse 2015] reports three strands of indicators
that summarize the social skills specific to CPS and reflect the collab-
orative aspect of problem solving: participation, perspective taking,
and social regulation. Participation is the foundation of engage-
ment with the task and other collaborators, and is reflected in how
people act or interact to accomplish tasks. Perspective-taking skills
focus on the quality of interaction between students, reflecting
the student’s level of awareness of collaborators’ knowledge and
resources as well as their ability to respond. Social regulation refers
to strategies used in collaboration, such as negotiation, initiative
taking, self-evaluation, and taking responsibility.
2.1 Scales for assessing social interactions
Studying social interactions requires collecting behaviours of par-
ticipants and annotating the observed behaviours during social
interactions between participants. In order to use CPS as a frame-
work for studying social interaction skills, we selected two scales
to qualitatively assess participants’ social performance. These are
the Social Performance Rating Scale [Fydrich 1998, Stevens 2010,
Ramdhonee-Dowlot 2021] and the Social and Cognitive Skills in
Collaborative Solving Problems [Hesse 2015].
The Social Performance Rating Scale (SPRS) assesses behaviours
within a social interaction. It includes the following items: conver-
sation flow, vocal quality, length (speech rate/pressure), gaze or
eye contact, and discomfort. It is designed to be applied to the ob-
servation of videotaped or live conversations between two people.
Observers are asked to rate behaviours on a 5-point scale.
For the Social and Cognitive Skills in Collaborative Solving Problems
scale, we drew on [Hesse 2015]. This scale assesses social and
cognitive CPS abilities through 18 items (9 for the social part
and 9 for the cognitive part), such as the level of interaction,
the adaptation of the discourse to the interlocutor, and the formulation
of the problem-solving steps. Each item is rated on a
scale from 0 (low) to 2 (high).
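As a rough illustration of how one completed rating sheet for this scale could be checked and summarised, the sketch below stores a sheet as two mappings from item name to score. The item names (s1..s9, c1..c9), the function name and the choice to sum items into a social and a cognitive subtotal are assumptions made for this example; the description above only fixes the number of items (9 social, 9 cognitive) and the 0 to 2 rating range.

```python
ITEMS_PER_PART = 9        # 9 social and 9 cognitive items, as described above
VALID_SCORES = {0, 1, 2}  # 0 = low, 2 = high


def summarise_cps_sheet(social, cognitive):
    """Validate one rater's sheet for one participant and return subtotals.

    `social` and `cognitive` map (hypothetical) item names to scores.
    Summing items into subtotals is an assumption of this sketch.
    """
    for part_name, part in (("social", social), ("cognitive", cognitive)):
        if len(part) != ITEMS_PER_PART:
            raise ValueError(
                f"{part_name} part needs {ITEMS_PER_PART} items, got {len(part)}"
            )
        for item, score in part.items():
            if score not in VALID_SCORES:
                raise ValueError(
                    f"{part_name} item {item!r} has invalid score {score!r}"
                )
    return {"social": sum(social.values()), "cognitive": sum(cognitive.values())}


# Placeholder item names and made-up scores, for illustration only.
social_sheet = {f"s{i}": 1 for i in range(1, 10)}
cognitive_sheet = {f"c{i}": 2 for i in range(1, 10)}
print(summarise_cps_sheet(social_sheet, cognitive_sheet))
```

Rejecting incomplete or out-of-range sheets before any aggregation keeps data-entry errors out of later reliability analyses.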
These two scales are written in English. Our experiment was con-
ducted with French-speaking participants. As suggested by [Wild
2005] in their guide to good practice in translation and cultural
adaptation, a first translation must be done from English to French,
then a second from French to English. This forward/backward
translation method provides a quality control check for consistency
between the translation and the original version. A translation
of the two scales was performed at our request by Lionbridge, a
company specializing in language and cross-cultural adaptation.
3 EXPERIMENTAL PROTOCOL
3.1 Procedure
Conducting research about assessing and training social skills re-
quires selecting relevant experimental tasks. Economic games, such
as the "prisoner’s dilemma", have become popular paradigms for
exploring CPS. Indeed, the robust behavioural patterns observed
3. A Framework for the Assessment and Training of Collaborative Problem-Solving Social Skills ICMI ’21 Companion, October 18–22, 2021, Montréal, QC, Canada
across studies suggest that the "prisoner’s dilemma" can be an
important assessment tool for evaluating social cognition [Bland
2017]. We adapted two of these widely used games, the "prisoner’s
dilemma" and the "survival task", to make them collaborative, and
created a third game, "Investment in students’ social service
organisation". In game #1, "prisoner’s dilemma", we asked a pair of
participants to reach a consensus as to whether they want to confess
or to stay silent (knowing that another pair of participants is in
the same situation). In game #2, "survival task", participants are
survivors of a plane crash. They are asked to make a list of 10 items
that they consider important for survival and to rank the 5 most
important items. In game #3, the pair of participants has to donate
to the school service organisation. They need to determine the
common donation amount within a limited time and discuss how
this amount should be used to improve the existing services
provided to students. We selected these games because they
correspond to the definition of CPS, i.e., "a coordinated attempt
by two or more people to share their skills and knowledge with
the goal of constructing and maintaining a unified solution to a
problem" [Andrews-Todd 2018, Hesse 2015, Dindar 2020].
We recruited six participants (three of whom were female), randomly
divided into three pairs. Participants were asked to perform
the three tasks via a video conferencing platform, which allowed
remote participation during the COVID-19 pandemic.
Each participant was instructed to collaborate with the other
participant to find a common solution to a problem in a maximum
of 10 minutes. Acoustic and visual data of the participants were
recorded. We collected a total of 18 audio and video recordings (6
participants x 3 games). Following each game, participants completed
a custom qualitative questionnaire on the clarity of the
instructions, the level of anxiety elicited by the task, and their interest
and commitment during the interaction.
3.2 Raters and Assessment Procedure
The previous section described how
we collected video recordings of social interactions between participants.
The next step was to use the two selected scales to
annotate the observed behaviours and interactions.
Two psychology-trained raters and one clinical psychologist, all
with doctoral degrees, conducted the SPRS and social and cognitive
CPS scale ratings. The raters were previously introduced to the
protocol and the two scales. They received the scales in the form of
an Excel spreadsheet with the following instruction: "You are going
to watch three videos depicting interactions between two people
(e.g. one video for each game). After each video, you must evaluate
the social/cognitive performance of each participant. Please note
that you must watch the video at least twice before beginning
your evaluation. Once the evaluation is completed, you must watch
the video again at least once before you can finally validate your
answers."
4 RESULTS
4.1 Inter-rater reliability
For scale validation, inter-rater agreement was measured by calculating
the intra-class correlation coefficient, ICC(2,k). This coefficient
quantifies the reliability of ratings by comparing the variability
of different ratings of the same individual to the total variation of
all ratings and all individuals [Shrout 1979]. In our case, a sample
of three raters was selected and each of them rated all participants.
Therefore, we opted for the two-way random-effects ICC with the mean
of the raters as the unit of assessment. All participants performed
all three CPS tasks. However, these tasks can be a source of variability
in participants’ performance. Thus, analyses were done on a task-by-task basis.
The intraclass correlation coefficient ranges from 0 to 1; values
below 0.5 indicate poor reliability, values between 0.5 and 0.75
indicate moderate reliability, values between 0.75 and 0.9 indicate
good reliability, and values above 0.9 indicate excellent reliability
[Koo 2016].
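To make the computation concrete, the following sketch estimates ICC(2,k) for one item in one task from a complete participants-by-raters table and maps the value onto the reliability bands quoted above. It follows the two-way ANOVA formulation as we understand it from [Shrout 1979], ICC(2,k) = (BMS - EMS) / (BMS + (JMS - EMS) / n), where BMS, JMS and EMS are the between-participants, between-raters and residual mean squares and n is the number of participants. The function names and the ratings are assumptions made for illustration; they are not the study's analysis code or data.

```python
def icc_2k(ratings):
    """ICC(2,k): two-way random effects, absolute agreement, mean of k raters.

    `ratings` is a complete table with one row per participant and one
    column per rater (no missing values).
    """
    n = len(ratings)       # participants (rated targets)
    k = len(ratings[0])    # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-participants mean square
    jms = ss_cols / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (jms - ems) / n)


def interpret_icc(icc):
    """Map an ICC value onto the bands quoted above from [Koo 2016]."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Made-up scores for one item in one game: 6 participants x 3 raters.
example = [
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 5],
    [3, 3, 4],
    [4, 5, 4],
    [2, 2, 3],
]
value = icc_2k(example)
print(round(value, 2), interpret_icc(value))
```

With only six participants per task, such estimates carry wide uncertainty. The estimate can also fall below 0 when raters disagree more than participants differ, and it is undefined when every rating in the table is identical.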
The results indicate moderate to excellent reliability. This is
the case, for example, for the action item in the Social CPS scale and
for the gaze and discomfort items in the SPRS. However, some
items, such as organization (Social CPS), perseverance (Cognitive
CPS) and voice quality (SPRS), obtain poor ICCs for the three collaborative
games.
We observed variability in performance between participants on
the different items of the three scales. This variability means that it
is possible to discriminate between the performances of individuals.
For example, for the SPRS gaze item, the average performance varies
from 2.33 to 5.
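The kind of per-participant average reported here can be sketched as follows, assuming that each participant's score on an item is the mean of the three raters' ratings (the text does not state how the average was formed). The participant labels and ratings are made up for illustration.

```python
from statistics import mean

# Made-up SPRS gaze ratings (5-point scale) from three raters per participant.
gaze_ratings = {
    "P1": [2, 2, 3],
    "P2": [5, 5, 5],
    "P3": [4, 3, 4],
}


def mean_per_participant(ratings):
    """Average the raters' scores for each participant, rounded to 2 decimals."""
    return {p: round(mean(scores), 2) for p, scores in ratings.items()}


def score_range(means):
    """Return the lowest and highest participant average."""
    values = list(means.values())
    return min(values), max(values)


means = mean_per_participant(gaze_ratings)
print(means, score_range(means))
```

A wide range between the lowest and highest averages is what allows an item to discriminate between individuals; an item on which every participant averages the same value carries no such information.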
Concerning the responses to the qualitative questionnaire, participants
considered game #3 to be the best understood and most
interesting. Game #2 was considered the game in which it was easiest
to start and maintain the conversation.
5 DISCUSSION
These preliminary results provide initial evidence of medium to
high reliability for the annotation of the collected data using the two
scales. The discrepancies observed between our results and those of
the authors who created and applied the two scales [Fydrich 1998,
Hesse 2015] can be attributed to the difference between the tasks
we selected and their tasks. For example, [Fydrich 1998] conducted
their validation with a population with anxiety disorders interacting
for 3 minutes with a confederate.
These results also show variable reliability across tasks. For
example, the "prisoner’s dilemma" task has the highest intra-class
correlation coefficients for the SPRS, while the "survival task" has the lowest. The
opposite result is found for the social CPS scale. We are exploring
how to make the three tasks more equivalent before making inter-
task performance comparisons.
In the future, we plan to better adapt these two scales to our
games and to improve our annotation protocol in order to achieve
better reliability. We intend to use the resulting corpus for exploring
the relations between attention and social interactions, and also for
inspiring the design and validation of virtual characters for social
skills training.
ACKNOWLEDGMENTS
Part of the work described in this article was funded by the ANR
Project TAPAS (ANR-19-JSTS-0001).
REFERENCES
[1] Ali, M.R., Crasta, D., Jin, L., Baretto, A., Pachter, J., Rogge, R.D., Hoque, M.E.:
LISSA - live interactive social skill assistance. In: 2015 International Conference
on Affective Computing and Intelligent Interaction, ACII 2013, pp. 173–179 (2015)
[2] Andrews-Todd, J., & Forsyth, C. M. (2018). Exploring social and cognitive dimen-
sions of collaborative problem solving in an open online simulation-based task.
Computers in Human Behavior. https://doi.org/10.1016/j.chb.2018.10.025.
[3] Bland A.R., Roiser, J.P., Mehta, M.A., Schei, T., Sahakian, B.J., Robbins, T.W.,
and Elliott, R. (2017). Cooperative Behavior in the Ultimatum Game and Pris-
oner’s Dilemma Depends on Players’ Contributions. Front. Psychol. 8:1017. doi:
10.3389/fpsyg.2017.01017
[4] Barron, B. (2003). When smart groups fail. The Journal of the Learning Sciences.
https://doi.org/10.1207/S15327809JLS1203_1.
[5] Bellack, A. S. (2004). Skills Training for People with Severe Mental Illness.
Psychiatric Rehabilitation Journal, 27(4), 375–391.
https://doi.org/10.2975/27.2004.375.391
[6] Bellack, A. S., Mueser, K. T., Gingerich, S., & Agresta, J. (2013). Social skills training
for schizophrenia: A step-by-step guide (2nd ed.). Guilford Press.
[7] Casner-Lotto, J., & Barrington, L. (2006). Are they really ready for work? Employ-
ers’ perspectives on the basic knowledge and applied skills of new entrants to the
21st century U.S. Workforce. Washington, DC: The conference board, partnership
for 21st century skills, corporate voices for working families, and society for
human resource management. Retrieved from http://eric.ed.gov/?id=ED519465.
[8] Chollet, M., Neubauer, C., Ghate, P., & Scherer, S. (2018). Influence of individual
differences when training public speaking with virtual audiences. Proceedings of
the 18th International Conference on Intelligent Virtual Agents, IVA 2018, 1–7.
https://doi.org/10.1145/3267851.3267874
[9] Coffey, A. J., Kamhawi, R., Fishwick, P., & Henderson, J. (2017). The efficacy of
an immersive 3D virtual versus 2D web environment in intercultural sensitivity
acquisition. Educational Technology Research & Development, 65(2), 455–479.
[10] Dindar, M., Järvelä, S., Järvenoja, H. (2020). Interplay of Metacognitive Experiences
and Performance in Collaborative Problem Solving. Comput. Educ., 154, 103922.
[11] Fydrich, T., Chambless, D. L., Perry, K. J., Buergener, F., & Beazley, M. B. (1998).
Behavioral assessment of social performance: a rating system for social phobia.
Behaviour Research and Therapy, 36(10), 995–1010.
https://doi.org/10.1016/s0005-7967(98)00069-2
[12] Graesser, A. C., Dowell, N., Hampton, A. J., Lippert, A. M., Li, H., & Shaffer,
D. W. (2018). Building intelligent conversational tutors and mentors for team
collaborative problem solving: Guidance from the 2015 Program for International
Student Assessment. In Building intelligent tutoring systems for teams: What
matters (pp. 173–211). Emerald Publishing Limited.
[13] Gresham, F. M., & Elliott, S. N. (1984). Assessment and classification of children’s
social skills: A review of methods and issues. School Psychology Review, 13,
292–301.
[14] Grover, R. L., Nangle, D. W., Buffie, M., & Andrews, L. A. (2020). Defining social
skills. In Social Skills Across the Life Span (pp. 3–24). Elsevier.
https://doi.org/10.1016/b978-0-12-817752-5.00001-9
[15] Hesse F., Care E., Buder J., Sassenberg K., Griffin P. (2015). A Framework for Teach-
able Collaborative Problem Solving Skills. In: Griffin P., Care E. (eds) Assessment
and Teaching of 21st Century Skills. Educational Assessment in an Information
Age. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-9395-7_2
[16] Hops, H. (1983). Children’s social competence and skill: Current research practices
and future directions. Behavior Therapy, 14, 3–18.
doi:10.1016/s0005-7894(83)80084-7
[17] Hoque, M.E., Courgeon, M., Martin, J., Mutlu, B., Picard, R.W.: Mach: My au-
tomated conversation coach. In: Proc. ACM International Joint Conference on
Pervasive and Ubiquitous Computing, UbiComp 2013 (to appear, 2013)
[18] Howard, M. C., & Gutworth, M. B. (2020). A meta-analysis of virtual reality
training programs for social skill development. Computers and Education, 144,
103707. https://doi.org/10.1016/j.compedu.2019.103707
[19] Janssen, J., Erkens, G., Kirschner, P. A., & Kanselaar, G. (2012). Task-related
and social regulation during online collaborative learning. Metacognition and
Learning, 7(1), 25–43. https://doi.org/10.1007/s11409-010-9061-5
[20] Klaassen, R. G., de Vries, P., & Ceulemans, D. S. (2018). The applicability of
feedback of virtual speech app metrics in a presentation technique course. SEFI
2018 conference.
[21] Kingery, J. N., Erdley, C. A., & Scarpulla, E. (2020). Developing social skills. In D.
W. Nangle, C. A. Erdley & R. A. Schwartz-Mette (Eds), Social skills across the life
span (pp. 25–45). London, UK: Academic Press.
[22] Little S.G., Swangler J., Akin-Little A. (2017). Defining Social Skills. In: Matson J.
(eds) Handbook of Social Behavior and Skills in Children. Autism and Child
Psychopathology Series. Springer, Cham.
https://doi.org/10.1007/978-3-319-64592-6_2
[23] Milligan, K., Sibalis, A., Morgan, A., & Phillips, M. (2017). Social competence:
Consideration of behavioral, cognitive, and emotional factors. In J. L. Matson
(Ed.), Handbook of social behavior and skills in children (pp. 63–82). Cham:
Springer International Publishing. https://doi.org/10.1007/978-3-319-64592-6_5.
[24] Moody, C. T., & Laugeson, E. A. (2020). Social Skills Training in Autism Spectrum
Disorder Across the Lifespan. Child and adolescent psychiatric clinics of North
America, 29(2), 359–371. https://doi.org/10.1016/j.chc.2019.11.001
[25] OECD (2017). PISA 2015 collaborative problem-solving framework. Retrieved
fromJuly https://www.oecd.org/pisa/pisaproducts/Draft%20PISA%202015%
20Collaborative%20Problem%20Solving%20Framework%20.pdf.
[26] OECD (2017). PISA 2015 results (volume V): Collaborative problem solving, PISA.
Paris: OECD Publishing. https://doi.org/10.1787/9789264285521-en.
[27] O’Neil, H.F., S.H. Chuang, E.L. Baker (2010), "Computer-based feedback for
computer-based collaborative problem-solving", in Ifenthaler, D., P. Pirnay-
Dummer and N.M. Seel (eds.), Computer-based diagnostics and systematic analy-
sis of knowledge, Springer-Verlag, New York, NY.
[28] Ramdhonee-Dowlot, K., Balloo, K., & Essau, C. A. (2021). Effectiveness of the Super
Skills for Life programme in enhancing the emotional wellbeing of children and
adolescents in residential care institutions in a low-and middle-income country: A
randomised waitlist-controlled trial. Journal of Affective Disorders, 278, 327–338.
[29] Rogers, S. A. (2017). A massively multiplayer online role-playing game with
language learning strategic activities to improve English grammar, listening,
reading, and vocabulary. University of South Alabama.
[30] Rosen, Y., & Foltz, P. (2014). Assessing collaborative problem solving through
automated technologies. Research and Practice in Technology Enhanced Learning,
9(3), 389–410.
[31] Scoular, C., Care, E., & Hesse, F. W. (2017). Designs for operationalizing col-
laborative problem solving for automated assessment. Journal of Educational
Measurement, 54, 12–35.
[32] Slof, B., Erkens, G., Kirschner, P. A., Janssen, J., & Phielix, C. (2010). Fostering
complex learning-task performance through scripting student use of computer
supported representational tools. Computers & Education. https://doi.org/10.
1016/j.compedu.2010.07.016.
[33] Shrout, P.E., & Fleiss, J.L. (1979). Intraclass Correlation: Uses in Assessing Rater
Reliability. Psychological Bulletin 86: 420–28.
[34] Stadler, M., Herborn, K., Mustafić, M., and Greiff, S. (2020). The assessment
of collaborative problem solving in PISA 2015: an investigation of
the validity of the PISA 2015 CPS tasks. Comput. Educat. 157:103964.
doi: 10.1016/j.compedu.2020.103964
[35] Wang, C. X., Calandra, B., Hibbard, S. T., & Lefaiver, M. L. M. (2012). Learning
effects of an experimental EFL program in Second Life. Educational Technology
Research & Development, 60(5), 943–961.
[36] Zimmerman, B. J., & Schunk, D. H. (2011). Self-regulated learning and perfor-
mance: An introduction and an overview. In B. Zimmerman, & D. H. Schunk
(Eds.), Handbook of self-regulation of learning and performance. New York:
Routledge.