ESWC 2014 Tutorial Part 4
1. Social Web: Where are the Semantics?
ESWC 2014
Miriam Fernández, Victor Rodríguez,
Andrés García-Silva, Oscar Corcho
Ontology Engineering Group, UPM, Spain
Knowledge Media Institute, The Open University
2. Outline
• Part 1: Understanding Social Media
– Theory: background & applications described in this tutorial
– Hands on: data extraction from Twitter and Facebook
• Part 2: Using semantics to represent data from SNS
– Theory: Using SW to represent content, users and relations
– Hands on: applying and extending SIOC
• Part 3: Using semantics to understand social media conversations
– Theory: Using semantics to understand topics in social media
– Hands on: using LDA to extract topics from social media
• Part 4: Using semantics to understand user behaviour
3. Implicit vs. Explicit Semantics
• Implicit Semantics
– Implicit semantics, also called statistical semantics, focuses on
extracting word sense by studying the patterns of human word usage in
massive collections of text or other human-generated data
– It does not rely on an explicit formalisation/conceptualisation of
knowledge
• Explicit Semantics
– Explicit semantics focuses on the analysis of content, supported by
explicit conceptualisations in the form of ontologies and
knowledge bases
ESWC 2014 Social Web: Where are the Semantics? 3
4. Explicit Semantics
• From the Web of human-generated content (unstructured)
– The Web of unstructured text (posts / documents) and links
• To the Web of machine-understandable content (structured)
– The Web of objects and relations
5. Obtaining explicit semantics from social media content
• The annotators extract entities (classes / individuals) and relations
from the text and link them to object URIs
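The annotation step can be illustrated with a minimal sketch. It assumes a tiny hand-written gazetteer (`GAZETTEER`) with made-up `example.org` URIs and a hypothetical `annotate` function; real annotators resolve mentions against a knowledge base and also extract relations, which this toy lookup does not attempt.

```python
import re

# Hypothetical gazetteer: surface form -> (entity URI, entity type).
# The URIs are placeholders, not identifiers from a real knowledge base.
GAZETTEER = {
    "open university": ("http://example.org/resource/Open_University", "Organisation"),
    "twitter": ("http://example.org/resource/Twitter", "Organisation"),
    "madrid": ("http://example.org/resource/Madrid", "Place"),
}


def annotate(text):
    """Return (surface form, URI, type, start offset) for each gazetteer match."""
    annotations = []
    for surface, (uri, entity_type) in GAZETTEER.items():
        # Whole-word, case-insensitive match of the surface form.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.group(0), uri, entity_type, match.start()))
    # Report the annotations in the order they appear in the text.
    return sorted(annotations, key=lambda a: a[3])


post = "Great talk from the Open University team, now trending on Twitter"
for surface, uri, entity_type, start in annotate(post):
    print(f"{surface!r} -> {uri} ({entity_type})")
```

Each mention in the post ends up linked to a URI and a type, which is the form the later analyses (topic, sentiment, behaviour) build on.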
6. Using Semantics to Analyse Topic Evolution
• LDA topics are identified by a set of keywords
– Difficult to assess their meaning and evolution
• Use explicit semantics to characterise topics as concrete entities
7. Using Semantics to Analyse Topic Evolution
• Analyse the appearance of concepts
– Within a group
– Across groups
– Over time
• Type filtering
• Interlinking with other datasets (data.open.ac.uk)
8. Using Semantics To Analyse Sentiment
• Sentiment analysis on social media
– Offers fast and cheap access to the public's feelings towards brands,
businesses, people, etc.
– Comes with additional challenges
– Current approaches
• Lexical-based
• Machine Learning
– Explicit semantics are often neglected
9. Using Semantics to Analyse Sentiment
• Add semantics as additional features into the training set
• Results
– Incorporating semantics increases accuracy by 6.5% for negative
sentiment and by 4.8% for positive sentiment
– The use of explicit semantics is more appropriate when the datasets
being analysed are large and cover a wide range of topics
Saif, Hassan; He, Yulan; Alani, Harith (2012). Semantic sentiment analysis of Twitter. In: 11th
International Semantic Web Conference (ISWC 2012)
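A minimal sketch of what "adding semantics as additional features" can look like in practice. The entity-to-concept mapping (`CONCEPTS`) and the `extract_features` helper are hypothetical and chosen only for illustration; they are not the feature set used in the cited paper.

```python
from collections import Counter

# Hypothetical entity -> semantic concept mapping (illustrative only).
CONCEPTS = {
    "iphone": "Product",
    "ipad": "Product",
    "apple": "Company",
    "london": "City",
}


def extract_features(tweet, use_semantics=True):
    """Bag-of-words counts, optionally augmented with semantic concept counts."""
    tokens = tweet.lower().split()
    features = Counter(f"word={token}" for token in tokens)
    if use_semantics:
        # Add one extra feature per recognised entity, named after its concept.
        features.update(
            f"concept={CONCEPTS[token]}" for token in tokens if token in CONCEPTS
        )
    return features


print(extract_features("love my new iphone"))
print(extract_features("hate my new ipad"))
```

The two tweets share no entity word, but both carry the feature `concept=Product`, so a classifier trained on these vectors can generalise across entities of the same type.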
10. Using Semantics To Analyse Sentiment
“Words that occur in similar context tend to have similar meaning”, Wittgenstein (1953)
• SentiCircles
– Integrates implicit and explicit semantics to analyse sentiment
– Outperforms other lexicon labelling methods and overtakes the
state-of-the-art SentiStrength approach in accuracy, with a marginal drop in F-measure
Saif, Hassan; Fernandez, Miriam; He, Yulan; Alani, Harith (2014). SentiCircles for Tweet-level
Sentiment Analysis. ESWC 2014 (conference presentation on the 27th, 14:00)
11. Using Semantics To Analyse Sentiment
12. Using Semantics to Analyse User Behaviour
• Goal
– Monitor and capture member activities
– Analyse emerging behaviour over time
– Understand the correlation of behaviour with community evolution
• Approach
– Identify behavioural features and behaviour roles
– Create an ontology to model behavioural roles and behaviour
features
– Use semantic rules to infer user roles in online communities
– Study role composition patterns
Angeletou, S., Rowe, M. and Alani, H. (2011) Modelling and Analysis of User Behaviour in Online
Communities, 10th International Semantic Web Conference (ISWC 2011), Bonn, Germany
Rowe, Matthew; Fernandez, Miriam; Angeletou, Sofia and Alani, Harith (2013). Community analysis through
semantic rules and role composition derivation. Journal of Web Semantics: Science, Services and Agents on
the World Wide Web, 18(1) pp. 31–47
13. Behavioural roles and features
Table 1. Roles and the feature-to-level mappings

Role                        Feature                           Level
Elitist                     In-Degree Ratio                   low
                            Bi-directional Threads Ratio      high
                            Bi-directional Neighbours Ratio   low
Grunt                       Bi-directional Threads Ratio      med
                            Bi-directional Neighbours Ratio   med
                            Average Posts per Thread          low
                            STD of Posts per Thread           low
Joining Conversationalist   Thread Initiation Ratio           low
                            Average Posts per Thread          high
                            STD of Posts per Thread           high
Popular Initiator           In-Degree Ratio                   high
                            Thread Initiation Ratio           high
Popular Participants        In-Degree Ratio                   high
                            Thread Initiation Ratio           low
                            Average Posts per Thread          med
                            STD of Posts per Thread           med
Supporter                   In-Degree Ratio                   med
                            Bi-directional Threads Ratio      med
                            Bi-directional Neighbours Ratio   med
Taciturn                    Bi-directional Threads Ratio      low
                            Bi-directional Neighbours Ratio   low
                            Average Posts per Thread          low
                            STD of Posts per Thread           low
Ignored                     Posts Replied Ratio               low
Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing discussion forums using
common user roles. In Proc. Web Science Conf. (WebSci10), Raleigh, NC: US, 2010.
14. Modelling user features and interactions
• OUBO: The OU Behaviour Ontology (http://purl.org/net/oubo/0.3)
15. Encoding Rules in Ontologies with SPIN
16. Apply rules to infer user roles over time
1.- Construct features for community users at a given time step
2.- Derive bins using equal-frequency binning
– Popularity-low cutoff = 0.5
– Initiation-high cutoff = 0.4
3.- Use the skeleton rule base to construct rules using bin levels
– Skeleton rule: Popularity=low, Initiation=high -> roleA
– Constructed rule: Popularity < 0.5, Initiation > 0.4 -> roleA
4.- Apply rules to infer user roles and community composition
5.- Repeat steps 1-4 for the following time steps
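The five steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the SPIN-based implementation from the tutorial: the feature names (`popularity`, `initiation`), the roles `roleA` and `roleB`, the three-level binning and all numeric values are made up, and every user is assumed to have a value for every feature.

```python
from collections import Counter


def equal_frequency_cutoffs(values, n_bins=3):
    """Cut-off points that split the sorted values into n_bins of near-equal size."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


def level(value, cutoffs, labels=("low", "med", "high")):
    """Map a feature value to the label of the bin it falls into."""
    for cutoff, label in zip(cutoffs, labels):
        if value < cutoff:
            return label
    return labels[-1]


# Hypothetical skeleton rule base: role -> required level per feature.
SKELETON_RULES = {
    "roleA": {"popularity": "low", "initiation": "high"},
    "roleB": {"popularity": "high", "initiation": "low"},
}


def infer_roles(users):
    """users: {user: {feature: value}} for one time step -> {user: role or None}."""
    features = {f for feats in users.values() for f in feats}
    # Step 2: derive the bin cut-offs from this time step's value distribution.
    cutoffs = {
        f: equal_frequency_cutoffs([feats[f] for feats in users.values()])
        for f in features
    }
    roles = {}
    for name, feats in users.items():
        levels = {f: level(v, cutoffs[f]) for f, v in feats.items()}
        # Steps 3-4: the first skeleton rule whose levels all match gives the role.
        roles[name] = next(
            (role for role, cond in SKELETON_RULES.items()
             if all(levels.get(f) == lvl for f, lvl in cond.items())),
            None,
        )
    return roles


def role_composition(roles):
    """Number of users per inferred role (None = no rule matched)."""
    return Counter(roles.values())


# Step 1 and step 5: one feature table per time step (illustrative values).
time_steps = [
    {
        "ann": {"popularity": 0.1, "initiation": 0.9},
        "bob": {"popularity": 0.2, "initiation": 0.8},
        "cat": {"popularity": 0.4, "initiation": 0.5},
        "dan": {"popularity": 0.5, "initiation": 0.4},
        "eve": {"popularity": 0.8, "initiation": 0.2},
        "fay": {"popularity": 0.9, "initiation": 0.1},
    },
]
for step, users in enumerate(time_steps):
    roles = infer_roles(users)
    print(step, roles, dict(role_composition(roles)))
```

Because the cut-offs are recomputed at every time step, the same raw feature value can map to different levels as the community changes, which is why the rules are built from a skeleton rather than fixed thresholds.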
17. Analyse the role composition of the community
• Investigate the correlation between the role composition and the
students’ performance
18. Analyse the role composition of the community
• Allow Policy Makers to focus on a smaller set of users, with whom
they may want to engage more closely
19. Analyse the role composition of the community
• Development of models to predict community health based on role
compositions and evolution of user behaviour
– Health Indicators
• Churn Rate: proportion of users who leave the network in a given time segment
• User Count: number of users who posted at least once
• Seeds / Non seeds: proportion of posts that get responses vs. those that don’t
• Clustering coefficient: measures the cohesion within the network
– Results
• Accurate detection of community health is possible using role composition
information
• There is no “one size fits all” model
Rowe, M. and Alani, H. (2012) What Makes Communities Tick? Community Health Analysis using Role
compositions. International Conference on Social Computing, 2012
[Figure: four ROC plots (TPR against FPR), one per health indicator: Churn Rate, User Count, Seeds / Non-seeds Proportion, Clustering Coefficient]
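The four health indicators listed on this slide can be computed from a simple post log, as in the sketch below. The post format (`id`, `author`, `reply_to`) and all function names are assumptions made for this illustration. Two interpretations are also assumed: a user "leaves" when they posted in one time segment but not in the next, and the seeds measure is reported as the share of posts that received at least one reply.

```python
from itertools import combinations


def user_count(posts):
    """Number of distinct users who posted at least once in the segment."""
    return len({p["author"] for p in posts})


def churn_rate(previous_posts, current_posts):
    """Share of users active in the previous segment who are absent from the current one."""
    before = {p["author"] for p in previous_posts}
    now = {p["author"] for p in current_posts}
    return len(before - now) / len(before) if before else 0.0


def seed_share(posts):
    """Share of posts in the segment that received at least one reply."""
    replied_to = {p["reply_to"] for p in posts if p["reply_to"] is not None}
    seeds = sum(1 for p in posts if p["id"] in replied_to)
    return seeds / len(posts) if posts else 0.0


def reply_edges(posts):
    """(replier, original author) pairs derived from the reply_to links."""
    author_of = {p["id"]: p["author"] for p in posts}
    return [
        (p["author"], author_of[p["reply_to"]])
        for p in posts
        if p["reply_to"] in author_of
    ]


def average_clustering(edges):
    """Mean local clustering coefficient of the undirected reply graph."""
    neighbours = {}
    for a, b in edges:
        if a != b:  # ignore self-replies
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    coefficients = []
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count the pairs of neighbours that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbours[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


previous = [
    {"id": 1, "author": "ann", "reply_to": None},
    {"id": 2, "author": "bob", "reply_to": 1},
    {"id": 3, "author": "cat", "reply_to": None},
]
current = [
    {"id": 4, "author": "ann", "reply_to": None},
    {"id": 5, "author": "dan", "reply_to": 4},
]
print("users:", user_count(previous))
print("churn:", churn_rate(previous, current))
print("seed share:", seed_share(previous))
print("clustering:", average_clustering(reply_edges(previous)))
```

Indicators like these would serve as the prediction targets, with the role composition of each time segment as the input features.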
20. Challenges: How would you address them?
• Scalability
– Communities exceed millions of users
– Infrastructures must support hundreds of millions of discussion threads
• Growth (real-time analysis)
– Speed of new incoming data / stream processing
• Concept vs. keyword based data acquisition/pre-processing
– How to filter certain tags?
– Which new topics emerge?
– How do topics evolve over time?
– Authorship in social media: who copies whom?
• Multilingualism
– We all speak different languages
• Understanding the user and acting accordingly
– We all have different personalities, behaviours and preferences