Lak12 - Leeds - Deriving Group Profiles from Social Media
1. Deriving Group Profiles from Social Media to
Facilitate the Design of Simulated Environments
for Learning
Ahmad Ammari, Lydia Lau, Vania Dimitrova
The University of Leeds, UK
at
Learning Analytics and Knowledge 2012, Vancouver, Canada
2. In this presentation …
• Vision of ImREAL as motivation
• Potential of semantics in smart social
spaces for learning applications
• Experimental study on combining
semantics and machine learning for group
profiling of digital traces
• Lessons learned
• Future challenges
3. Immersive Reflective Experience-based Adaptive Learning: Vision
[Diagram labels: In a simulator for learning; Forethought; Reflection; In the real world]
4. Consortium (2010-13)
University of Leeds, UK
- Project Coordinator/Scientific Coordinator
Trinity College Dublin, Ireland
Graz University of Technology, Austria
University of Erlangen-Nuremberg, Germany
Delft University of Technology, The Netherlands
Imaginary Srl, Italy
EmpowerTheUser Ltd, Ireland
5. Smart Social Spaces – semantic underpinning
[Diagram labels: Sensors & collectors; Noise filtration; Semantic augmentation; Ontologies; Group profiling service; Viewpoint semantic service; Semantic query service; Semantic data browsers; Smart social spaces]
6. This talk …
1. Sensors & collectors
2. Noise filtration + supervised machine learning
3. Group profiling + unsupervised machine learning
[Diagram labels: Ontologies; Smart social spaces; Interpersonal skills for Job interview]
7. Noise Filtration Service
• Input: social media content (e.g. YouTube comments)
• Filters noise from social media content by removing content that is not useful for generating social profiles
• Output: clean social media content, author IDs
Support service to social profiling services.
Clean content reflects the authors' awareness of domain aspects (e.g. Job Interview)
8. The Social Noise Filtration Service:
Methodology
[Diagram labels: Experimentally Controlled Ground Truth Corpus; Semantically Enriched Bag of Words (BoW); Analyze Comments; Term–Comment Matrix (Training Corpus); Public Comments on YouTube; Pre-Process; SCORE; SCORES]
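The scoring and filtering idea can be sketched roughly as follows. This is a minimal illustration and not the service's actual scoring method, which the slides do not specify: the terms, weights, and the `score_comment` / `filter_noise` helpers are hypothetical, and summing term weights is an assumption. Only the threshold of 4 comes from the speaker notes.

```python
import re

# Hypothetical enriched bag of words: domain term -> weight (illustrative values only).
ENRICHED_BOW = {
    "gestures": 2.0,
    "eye contact": 2.5,
    "communication": 1.5,
    "personality": 1.5,
    "culture": 1.0,
    "interview": 0.5,
}

# The speaker notes report that a threshold of 4 was enough;
# treating it as an inclusive lower bound is an assumption.
THRESHOLD = 4.0


def score_comment(text, bow=ENRICHED_BOW):
    """Sum the weights of the bag-of-words terms that occur in the comment."""
    normalized = " ".join(re.findall(r"[a-z']+", text.lower()))
    padded = f" {normalized} "
    return sum(weight for term, weight in bow.items() if f" {term} " in padded)


def filter_noise(comments, threshold=THRESHOLD):
    """Keep the (author_id, text) pairs whose score reaches the threshold."""
    return [(author, text) for author, text in comments
            if score_comment(text) >= threshold]


comments = [
    ("author_1", "You have to put the background, education, personality, and the "
                 "culture of the individual into consideration. Gestures are often "
                 "misunderstood and not the clearest form of communication."),
    ("author_2", "Interview on Wednesday, hope it goes well"),
]
clean = filter_noise(comments)
```

With these made-up weights, the first comment passes the filter and the second is discarded as noise, mirroring the input (comments) and output (clean content plus author IDs) described for the service.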
9. Example Comments
• Score 8.0: I think trying to decipher gestures as to have a general meaning is a bit too vague. You have to put the background, education, personality, and the culture of the individual into consideration. Gestures are often misunderstood and not the clearest form of communication. For example…
• Score 7.7: …I will comment that most of us have grown up with being told that strong eye contact (without looking psychotic) is good … However, I agree that you notice if someone is not used to it and seems intimidated. At this point it is a good to look away periodically.
• Score 0.68: Interview on Wednesday, hope it goes well
10. Group Profiling
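[Diagram: comments are split into Relevant and Noise; the relevant comments feed two profiling paths, P1 (clustering-based group profiles) and P2 (demographic-based group profiles); example demographic labels: Adult, Female, USA, UK]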
11. Exploration experiment
Purpose is to answer the following:
Q1: Can we generate useful group profiles
to aid training professionals in identifying
learning needs?
Q2: Can we derive learning domain
concepts to augment learner models?
12. Dataset used
• Number of job interview-related YouTube videos: 17
• Number of comments retrieved: 1465
• Number of remaining comments after noise filtration: 471 (32%)
• Number of unique comment authors: 393
• Comment-to-author ratio: 1.20
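The two derived figures on this slide follow directly from the raw counts. The short check below recomputes them; the variable names are illustrative.

```python
retrieved = 1465   # comments retrieved
remaining = 471    # comments left after noise filtration
authors = 393      # unique comment authors

retention = remaining / retrieved            # share of comments kept
comments_per_author = remaining / authors    # comment-to-author ratio

print(f"Retained after filtration: {retention:.0%}")
print(f"Comments per author: {comments_per_author:.2f}")
```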
13. Sample Output
Clustering–based Group Profiles
Third largest group – Size: 36 Authors, 9% of
population
14. Sample Output
Demographic–based Group Profiles
• Location: GB – Age: from 20 to 40 years – Frequent job interview concepts: Interview_good, eye_contact, eyes, interviewer, hope, helpful
• Location: US – Age: from 20 to 40 years – Frequent job interview concepts: Good_Interview, people, company, interviewer, time, girl, experience, answer, money, questions, nervous, education, fingers, hands
• Location: Asia – Age: from 20 to 40 years – Frequent job interview concepts: questions, answers, candidate, interview_guide, money, pay, job_guide, watch
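A demographic-based profile of this kind can be approximated by grouping authors on location and age band, then counting concepts per group. The sketch below is an assumption about how such a profile might be built, not the authors' implementation: the records, the 20-year band width, and the `age_band` / `demographic_profiles` helpers are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (location, age, concepts found in the author's clean comments).
records = [
    ("GB", 25, ["eye_contact", "interviewer", "hope"]),
    ("GB", 34, ["eye_contact", "helpful"]),
    ("US", 29, ["money", "interviewer", "nervous"]),
    ("US", 38, ["money", "experience"]),
    ("US", 52, ["company"]),
]


def age_band(age, width=20):
    """Map an age to a (low, high) band, e.g. 25 -> (20, 40). Band width is an assumption."""
    low = (age // width) * width
    return (low, low + width)


def demographic_profiles(records, top_n=3):
    """Return the most frequent concepts for each (location, age band) group."""
    counts = defaultdict(Counter)
    for location, age, concepts in records:
        counts[(location, age_band(age))].update(concepts)
    return {group: [concept for concept, _ in counter.most_common(top_n)]
            for group, counter in counts.items()}


profiles = demographic_profiles(records)
```

Comparing the resulting concept lists across groups is what allows observations such as money being frequent in one location and absent in another.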
15. Lessons Learned
• On noise filtration
– Choice of threshold for noise filtration?
– What is “inappropriate” content?
– Can “promotional” content be detected?
• On potential of group profiles to aid
training professionals and learner model
augmentation
– Authentic comments were liked
– Would be useful to know more about the
viewpoints within a group
16. Future work
• Increase use of semantics (e.g. for viewpoint extraction)
• Improve quality of group profiling (e.g. by understanding the impact of clusters sorted by age)
• How to get more accurate demographic data (e.g. 'Place' from YouTube was not reliable)
17. Deriving Group Profiles from Social Media to
Facilitate the Design of Simulated Environments
for Learning
http://www.imreal-project.eu/
Ahmad Ammari, Lydia Lau, Vania Dimitrova
The University of Leeds, UK
Editor's notes
(2) Explain the background and provide the context to understand why we tackled the problem in the way we did. (3) Details of the work for this paper.
ImREAL stands for (1), with ‘interpersonal communications’ as the learning domain. (2) Our stakeholders are adult learners, trainers, and developers of the simulated learning environments. Two main problems: (i) what learners learn in the simulated environment can be disconnected from the real world; (ii) simulator developers have limited resources to cater for a wide range of learning experiences. Adult learners learn particularly well through exchanging experiences with others, hence ImREAL’s solution adopts a three-pronged approach: (4) (5) pedagogy (SRL); (6) technology (making use of social media as a rich source of experiences); (7) a socio-technical approach to narrow the gap between simulator and real-world experiences.
In Leeds, we are excited by the potential of digital traces in social spaces as additional sources of experience. We need to work out a pipeline from getting raw content from social spaces to providing useful experiences for learning. (1) (2) We need sensors to collect content… (currently guided by users) (2). (3) We acknowledge the noisiness of these spaces, hence noise filtration (guided by semantics). (4) We also use ontologies to augment or enrich the content with semantics (for further processing, e.g. query). A range of intelligent services is then built on these (5, 6, 7). All the components require some level of human and machine working together to make each other smarter: synergy.
First objective: how to mine the digital traces in social spaces to derive profiles of user groups? (We mainly deal with comments.)
The top two comments have high scores due to the presence of body language and emotion concepts. The bottom comment is clearly of no use. (Our experiments showed that a threshold of 4 is enough.)
(0) Link relevant comments to individuals (YouTube API for user profiles). Cluster these comments using text-based similarity (to show awareness of domain concepts). Use demographic data to profile groups so we can spot trends (e.g. what are the common concepts amongst 40-50-year-old females in the US/UK when discussing job interviews).
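The note does not name the clustering algorithm or the similarity measure. As one possible illustration, the sketch below uses a simple single-pass "leader" clustering over Jaccard similarity of word sets; this technique, the 0.3 similarity cut-off, and the sample comments are all assumptions made for the example.

```python
def tokens(text):
    """Lower-cased set of whitespace-separated words in a comment."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def leader_cluster(comments, min_similarity=0.3):
    """Assign each comment to the first cluster whose leader is similar enough,
    otherwise start a new cluster with the comment as its leader."""
    clusters = []  # each cluster: {"leader": token set, "members": [comment index, ...]}
    for idx, text in enumerate(comments):
        toks = tokens(text)
        for cluster in clusters:
            if jaccard(toks, cluster["leader"]) >= min_similarity:
                cluster["members"].append(idx)
                break
        else:
            clusters.append({"leader": toks, "members": [idx]})
    return [cluster["members"] for cluster in clusters]


comments = [
    "keep eye contact with the interviewer",
    "eye contact with the interviewer matters",
    "how much money should i ask for",
    "ask for more money",
]
groups = leader_cluster(comments)
print(groups)  # [[0, 1], [2, 3]]
```

Here the two eye-contact comments and the two money comments end up in separate groups; a group's size relative to the author population gives figures like the "36 Authors, 9% of population" shown on the sample output slide.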
Q2 aims to solve the classic ‘cold start’ problem for learner modelling.
Example of a learning need that could be identified: applicants in this group need to learn how to give good answers to interviewer questions related to little or no previous job experience.
GB – no money being mentioned!
(0) We have developed a pipeline which seemed to work; however, (1) e.g. a swear word may convey emotion: is that inappropriate?
(3) May use a range of sources to get more accurate ‘location’ data.