My presentation at the International Conference on Artificial Intelligence in Education (AIED’2020)
8th July 2020
Di Mitri D., Schneider J., Trebing K., Sopka S., Specht M., Drachsler H. (2020) Real-Time Multimodal Feedback with the CPR Tutor. In: Bittencourt I., Cukurova M., Muldner K., Luckin R., Millán E. (eds) Artificial Intelligence in Education. AIED 2020. Lecture Notes in Computer Science, vol 12163. Springer, Cham
https://link.springer.com/chapter/10.1007/978-3-030-52237-7_12
1. Real-Time Multimodal Feedback with the CPR Tutor
Full paper at the International Conference on Artificial
Intelligence in Education (AIED’2020)
8th July 2020
daniele.dimitri@ou.nl - @dimstudio
Daniele Di Mitri, Jan Schneider, Kevin Trebing
Sasa Sopka, Marcus Specht, and Hendrik Drachsler
2. Di Mitri et al. - Real Time Multimodal Feedback 2
Imagine this learning situation
3. Multimodal Tutors
This project implements the idea of the
Multimodal Tutor: an Intelligent Tutoring System
for psychomotor learning tasks.
The Multimodal Tutor leverages multimodal data
to enrich the learner’s digital representation.
In this study we present the CPR Tutor, a
Multimodal Tutor for cardiopulmonary
resuscitation (CPR) training.
Di Mitri, D. (2020) The Multimodal Tutor: Adaptive Feedback from Multimodal
Experiences. PhD Dissertation. Open Universiteit, The Netherlands.
[Multimodal data extend ITSs]
4. Multimodal Feedback Loops
- Multimodal data in learning is attracting more and more
attention
- Still, most of the related studies stand at the level of
“data geology”
- e.g. they investigate correlations or predictability
- they do not provide direct and immediate
feedback to the learner
- Problem: multimodal feedback loops are technically
complex.
We identify five main challenges: data collection,
storing, processing, annotation and exploitation.
[Multimodal feedback loop diagram: “Keep me in the loop!”]
5. Why CPR training?
• CPR can be taught individually to
a single learner
• CPR is a highly standardized
procedure
• CPR has clear and well-defined
criteria to measure the quality
• CPR is a highly relevant skill
6. Previous work*: Detecting CPR mistakes
Hardware setup:
• Microsoft Kinect v2
• Myo Armband
• Laerdal ResusciAnne QCPR manikin + SimPad
Dataset collected:
• ~5,500 chest compressions (CCs) from 14 experts
• each CC tagged with 5 classes
Trained 5 Neural Networks to classify CPR mistakes
*Di Mitri, D., Schneider, J., Specht, M., & Drachsler, H. (2019). Detecting mistakes in
CPR training with multimodal data and neural networks. Sensors (Switzerland), 19(14)
7. CPR Performance indicators
Indicator             Ideal value
Compression rate      100 to 120 compr./min (*)
Compression depth     5 to 6 cm (*)
Compression release   0 to 1 cm (*)
Arms position         Elbows locked (†)
Body position         Using body weight (†)
(*) assessed by the ResusciAnne manikin
(†) not measured by the ResusciAnne manikin
8. Feedback with the CPR Tutor
[Voice feedback prompts: “Lock your arms!”, “Use your body weight!”,
“Release the compression!”, “Check compression depth!”,
plus a metronome sound at 110 bpm]
9. System Architecture
Two main software components:
- CPRTutor app (C#)
for data collection & feedback
- SharpFlow script (Python)
for data analysis and machine learning
Two third-party components:
- Visual Inspection Tool
for data annotation
- Multimodal Learning Hub
for the data synchronisation logic and the
data storing format
Two flows of data:
1) Offline data collection
2) Real-time exploitation
All the components are available
Open Source on GitHub!
10. Flow 1) Offline Training
[Diagram: Sensors → CPRTutor (C# app) → MLT session files
(Kinect.json, Myo.json, Video.mp4) → SharpFlow (Python 3) →
model training → five ML models: ClassRate, ClassRelease,
ClassDepth, ArmsLocked, BodyWeight]
11. Flow 2) Real-time Exploitation
[Diagram: Sensors → CPRTutor (C# app, TCP client) → chunk
(1 CC ≈ 0.5 sec) → SharpFlow (Python 3, TCP server) →
classification with the five ML models (ClassRate, ClassRelease,
ClassDepth, ArmsLocked, BodyWeight) → feedback]
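The CPRTutor-to-SharpFlow exchange above can be sketched in a few lines. This is a minimal illustration only: the length-prefixed JSON framing and the message fields are assumptions for the sketch, not the actual CPRTutor/SharpFlow wire protocol.

```python
import json
import socket
import struct

def send_msg(sock, obj):
    """Send a JSON message with a 4-byte big-endian length prefix.

    The framing here is an assumed convention for illustration; the real
    CPRTutor wire format may differ.
    """
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("socket closed")
        buf += part
    return buf

# In-process demo over a socket pair: the "client" sends one chunk
# (a 17 x 52 window for a single chest compression), the "server" reads it.
client, server = socket.socketpair()
send_msg(client, {"chunk": [[0.0] * 52] * 17})
received = recv_msg(server)
```

In the real system the server side would pass the received window to the trained models and send the predicted labels back over the same connection.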
12. Data storing
[Example of Myo data serialised in JSON and an example
annotation.json, both in the MLT data format]
MLT = Meaningful Learning Task
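As a rough illustration of what a serialised sensor frame could look like, here is a hypothetical sketch in Python; the field names (`applicationName`, `frameAttributes`, the attribute keys) are assumptions for the example, not the exact MLT schema produced by the Multimodal Learning Hub.

```python
import json

# Hypothetical single Myo frame; all field names are illustrative
# assumptions, not the actual MLT data format.
myo_frame = {
    "applicationName": "Myo",
    "timestamp": 31.25,  # seconds since session start (assumed convention)
    "frameAttributes": {
        "AccelerometerX": 0.01,
        "AccelerometerY": -0.98,
        "AccelerometerZ": 0.12,
        "GyroscopeX": 1.5,
        "GyroscopeY": -0.3,
        "GyroscopeZ": 0.7,
    },
}

serialised = json.dumps(myo_frame, indent=2)  # what would land in Myo.json
restored = json.loads(serialised)             # round-trip back to a dict
```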
13. Data annotation
With the Visual Inspection Tool
1. Load each recorded session
2. Load the SimPad external
annotation file, providing CC
segmentation and values for
classRate, classDepth, classRelease
3. Manually annotate armsLocked and
bodyWeight by inspecting the
video sequences
Di Mitri D., Schneider J., Specht M., Drachsler H. (2019) Read Between the Lines: An
Annotation Tool for Multimodal Data for Learning. LAK19 Proceedings
14. Data representation
INPUT SPACE:
3D tensor of shape:
[4803 CCs x 17 time-bins x 52 attributes]
HYPOTHESIS SPACE:
Five binary target classes:
(1) classRate, (2) classDepth, (3) classRelease,
(4) armsLocked, (5) bodyWeight
[Raw data format example]
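The input and hypothesis spaces above can be expressed directly as array shapes. The sketch below uses zero-filled placeholders with the stated dimensions; the real corpus is of course not reproduced here.

```python
import numpy as np

# Illustrative shapes only: zero-filled stand-ins for the real corpus.
n_ccs, n_bins, n_attrs = 4803, 17, 52

# Input space: one 17 x 52 window per chest compression (CC)
X = np.zeros((n_ccs, n_bins, n_attrs), dtype=np.float32)

# Hypothesis space: five binary target classes per CC
targets = ["classRate", "classDepth", "classRelease", "armsLocked", "bodyWeight"]
y = np.zeros((n_ccs, len(targets)), dtype=np.int8)
```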
15. Real-time Exploitation steps
1. Activity Detection:
rule-based approach to detect the CC
using ShoulderLeftY fluctuations
2. Mistake classification:
one LSTM network trained to classify 5
different classes
3. Feedback Logic:
feedback prioritisation based on the
presence and priority of the detected
mistakes
[the three steps of real-time exploitation]
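Step 1, the rule-based activity detection, can be sketched as a simple excursion detector on the vertical shoulder coordinate. The threshold value and the exact rule below are assumptions for illustration, not the CPRTutor's actual parameters.

```python
# Minimal sketch of rule-based CC detection on ShoulderLeftY fluctuations.
# The threshold is an assumed value, not the CPRTutor's actual setting.
def detect_compressions(shoulder_y, threshold=0.02):
    """Detect downward-then-upward excursions of the shoulder height.

    shoulder_y: sequence of ShoulderLeftY samples.
    Returns a list of (start_index, end_index) pairs, one per detected CC.
    """
    compressions = []
    going_down = False
    start = 0
    for i in range(1, len(shoulder_y)):
        if not going_down and shoulder_y[i] < shoulder_y[i - 1] - threshold:
            going_down, start = True, i - 1      # downward stroke begins
        elif going_down and shoulder_y[i] > shoulder_y[i - 1] + threshold:
            compressions.append((start, i))      # release detected
            going_down = False
    return compressions
```

Each detected (start, end) pair would then be cut into the 17 x 52 window fed to the classifier in step 2.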
16. Neural Network configuration
The Neural Network was configured with two
stacked LSTM layers followed by two dense layers.
• a first LSTM with input shape 17x52 with 128
hidden units;
• a second LSTM with 64 hidden units;
• a fully-connected layer with 32 units and a
sigmoid activation function;
• a fully-connected layer with 5 units (the
number of target classes) followed by a
sigmoid activation.
[generic LSTM network]
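The described architecture can be sketched as follows. The paper's slide does not name a framework, so PyTorch is an assumption here; the layer sizes follow the slide, while details such as taking the last time step are implementation choices of this sketch.

```python
import torch
import torch.nn as nn

class CPRNet(nn.Module):
    """Sketch of the slide's architecture: two stacked LSTMs + two dense layers."""

    def __init__(self, n_bins=17, n_attrs=52, n_classes=5):
        super().__init__()
        self.lstm1 = nn.LSTM(n_attrs, 128, batch_first=True)  # 17x52 in, 128 hidden
        self.lstm2 = nn.LSTM(128, 64, batch_first=True)       # 64 hidden units
        self.fc1 = nn.Linear(64, 32)                          # 32 units, sigmoid
        self.fc2 = nn.Linear(32, n_classes)                   # 5 outputs, sigmoid

    def forward(self, x):                      # x: [batch, 17, 52]
        x, _ = self.lstm1(x)
        x, _ = self.lstm2(x)
        x = torch.sigmoid(self.fc1(x[:, -1]))  # last time step (assumed here)
        return torch.sigmoid(self.fc2(x))      # 5 independent binary outputs

model = CPRNet()
out = model(torch.zeros(8, 17, 52))            # a dummy batch of 8 CC windows
```

The final sigmoid makes the five outputs independent binary predictions, matching the five target classes rather than a single softmax over mutually exclusive labels.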
17. Study procedure
1. Expert group
data collection
(N=10)
•collecting the data
corpus
2. Data annotation
and model training
•training the models
3. Feedback group
(N=10)
•testing the feedback
functionality of the CPR
Tutor
The experiment was run at the Medical Simulation Centre AIXTRA, Uniklinik Aachen, Germany.
18. Expert group (n=10)
Each expert executed 4 CPR
sessions of 1 minute each
(only CC – no rescue breath):
1. regular session
2. arms not locked mistake
3. regular session
4. body weight mistake
In two of the sessions the experts were
asked to deliberately make mistakes.
[instructions given to experts]
19. Feedback group (n=10)
Ten users tested the
system, each performing
two sessions in
alternating order:
- 1 minute without
feedback
- 1 minute with
feedback
[feedback explanation given to users before starting]
The users were not novices!
20. Expert Group results
Note: due to the imbalanced class distribution, the dataset
was reduced to 3434 samples (−28.5%).
21. Feedback Group results (1)
Error rate (per target class):
i: i-th CC in the series
j: target class
n: number of CCs in a 10-second window
the ratio of predicted zeros and ones
Performance:
[Plot of one session with feedback]
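The slide's error-rate formula is not reproduced in this transcript, so the sketch below is a hedged reading of the legend: the fraction of the n chest compressions in a 10-second window whose prediction for a given target class is a mistake. The 0/1 encoding of mistakes is an assumption of the sketch.

```python
# Hedged sketch: error rate for one target class over one window.
# Assumption: 1 encodes "mistake predicted", 0 "correct".
def error_rate(window_predictions):
    """window_predictions: list of 0/1 mistake labels for the n CCs
    falling inside a 10-second window."""
    if not window_predictions:
        return 0.0
    return sum(window_predictions) / len(window_predictions)
```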
23. Short-term positive effect of feedback
[Plot: average error rates’ first derivative,
10 seconds before and 10 seconds after the
feedback event]
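The before/after comparison in the plot can be illustrated numerically. The numbers below are synthetic, invented for the sketch; the point is only that a more negative mean first derivative after feedback indicates the error rate falling faster.

```python
import numpy as np

# Synthetic error-rate samples around a feedback event (invented numbers).
before = np.array([0.50, 0.52, 0.51, 0.50])  # 10 s before feedback
after = np.array([0.50, 0.44, 0.38, 0.33])   # 10 s after feedback

slope_before = np.diff(before).mean()  # mean first derivative before
slope_after = np.diff(after).mean()    # mean first derivative after
```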
24. Conclusions
This study provides a proof of concept of a real-time multimodal feedback loop.
This study provided:
1) an architecture design
2) a methodological approach to design multimodal feedback
3) a field study on real-time feedback for CPR training.
The experimental results suggest that the CPR Tutor yields a short-term
improvement of CPR performance.
25. Future developments
• To measure long-term improvements, we need to collect
more data from more participants
• Testing complete novices could produce more interesting
results for the CPR Tutor
• The rule-based CC detection could be replaced with automatic
detection using a streaming approach
• Neural networks are well suited to detecting regularities (e.g.
CC rate), but better approaches are possible
• The Kinect camera could be replaced by a webcam with body-pose
estimation libraries (OpenPose, PoseNet, …)