Predicting Communication Intention
in (Enterprise) Social Networks
Charalampos “Harris” Chelmis
Computer Science, University of Southern California
Thanks to: Viktor K. Prasanna, Ming Hsieh Department of Electrical Engineering, USC
Vikram Sorathia, Co-founder & CEO at Kensemble Tech Labs LLP
Social Networks are Everywhere
2
• Movie Networks
• Affiliation/co-authorship networks
• Professional networks
• Friendship networks
• Information networks
• Organizational Networks
• Q&A websites
• Even networks
• Multiple applications
 Targeted marketing
 Personalization
− Content delivery
 Recommendation
− People to connect, items to buy, movies to watch
 Law enforcement
− Fraud detection
− Guilt by association
 Epidemiology
 Information dissemination/propagation
 …
• Users interact with one another and content they create and
consume
 Rich interactions
− Friendships based on similarity
− Following based on interest
 Noisy
Social Network Analysis
3
• Collaboration Enabling Technologies
 Multiple communication channels
 Spread of timely and relevant information
 Search for data and experts
Collaboration Technologies at the Workplace
4
• Main focus on business perspective
 Less noisy than online social networks
 Q&A
 Problem solving
 Information seeking
• But also
 Assist in breaking barriers
 Team building
 Knowledge propagation
• Opportunities
 Expert identification
− Experts vs. Influencers
 Information Flow
 Trends
− Technology adoption
− Company focus
Collaboration at the Workplace
5
• More Opportunities
 Collective Knowledge
− Generation
− Sharing
 Collaborative Knowledge Management
− How do employees work together to complete tasks?
− How does innovation happen?
− Best practices
• Difficulties
 Informal interactions
 Heterogeneous, unstructured data
 How to formally model knowledge?
Collaboration at the Workplace
6
• Descriptive Modeling
 Social network analysis
• Predictive modeling
 Link prediction
 Attribute prediction
• Typically networked data are represented as graphs
 Nodes (e.g., users)
 Edges
− Social relations
− Interactions
− Information flow
− Similarity
 Weight
− Communication frequency
− Communication cost (e.g., distance)
− Reciprocity
− Type of interaction (e.g., family member, friend, or officemate)
Networked Data Modeling
7
• Heterogeneous object and link types
• Both nodes and edges may carry attributes
• Attribute dependencies
 Correlation between attribute values and link structure
− e.g. link prediction based on auxiliary information
 Correlation among attributes of related nodes
− e.g. collaborative filtering
• Node dependencies
 e.g. groups/communities
• Partial observations
 e.g. labels
But Networked Data are Very Different than Graphs
8
• Big Data
 Billions of users
 Billions of connections
 Billions of “documents”
• Temporality
 Affiliations
 Interests
 Friendships
• Context
 Spatial
 Temporal
 Topical
• Content multimodality
 Text
 Multimedia
Networked Data ≠ Graphs
9
• Edges are more than links
 Type
− e.g. like vs. comment vs. share
 Trust
 Sentiment
 Strength
 Time
 Number
• Edges “reveal” something about the relation between nodes
 Prior “interaction” to compute similarity
Networked Data ≠ Graphs
10
Networked Data ≠ Graphs
11
• Integrated informal communication
• Context sensitive
• Temporal
• External Sources
• Analysis of implicit relations
Holistic Modeling of Complex Networks
12
Multiple collaborative platforms
Multimodal, heterogeneous content from various sources
Meta-information about content
- Social Algebraic Operations
- Complex mining and analysis
- Correlation of different domains
- Temporal, semantic analysis
context, time, content, connection
• Directed communication graph G = (V,E)
 Node u represents a user
 Edge e = (u,v) exists iff user u has sent at least one message to user v
• Input
 G0 = (V0,E0), subgraph of G consisting of all nodes in G and a subset of
edges in G
• Output
 Ranked list L of edges, not present in G0, such that $L \subseteq E \setminus E_0$
Predicting Intention of Communication
13
[Figure: input graph G0 and output graph G1, each drawn around a user u]
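The problem setup above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the message log, the user names, and the helper names `build_graph` and `candidate_edges` are all hypothetical. It builds the directed graph G from (sender, recipient) pairs, takes an observed subset E0, and enumerates the absent edges from which the ranked list L would be drawn.

```python
from itertools import permutations

def build_graph(messages):
    """Return (nodes, edges); (u, v) is an edge iff u sent v at least one message."""
    nodes, edges = set(), set()
    for sender, recipient in messages:
        nodes.update((sender, recipient))
        edges.add((sender, recipient))
    return nodes, edges

def candidate_edges(nodes, observed_edges):
    """All ordered pairs (u, v), u != v, that are absent from the observed graph G0."""
    return {(u, v) for u, v in permutations(nodes, 2) if (u, v) not in observed_edges}

# Hypothetical toy message log: (sender, recipient) pairs.
messages = [("ann", "bob"), ("ann", "bob"), ("bob", "cat"), ("cat", "ann")]
V, E = build_graph(messages)
E0 = {("ann", "bob"), ("bob", "cat")}   # observed subset of E (assumed split)
candidates = candidate_edges(V, E0)     # the ranked list L is drawn from these
hidden = E - E0                         # edges a good ranking should place first
```

Because edges are ordered pairs, (ann, bob) and (bob, ann) are distinct candidates, which is what makes the task directional.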
• Edge semantics:
 Conversation between users rather than friendship
• Contextual – Temporal Properties
 m1(u1,u2,g1) ≠ m1(u1,u2,g2)
• Directionality Matters
 m1(u1,u2,g1) ≠ m1(u2,u1,g1)
• “What makes people initiate conversations with strangers?”
• “With whom do individuals choose to collaborate and why?”
≠ Link Prediction
14
• The tendency to relate to people with similar characteristics
 status, beliefs, etc.
• Fundamental concept underlying social theories (e.g. Blau 1977)
• Fundamental basis for links in many types of social networks
 “Similar” nodes tend to cluster together
• How does this help us solve our problem?
Homophily
15
• Machine learning
 Probabilistic, supervised, computationally expensive
• Node attributes
 No semantics
 We instead exploit multiple features of variable types
• Network structure
How to Compute Similarity?
16
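Structural measures for a pair of nodes u and v, where Γ(u) denotes the set of neighbors of u:
Graph Distance: length of the shortest path between u and v
Common Neighbors: $|\Gamma(u) \cap \Gamma(v)|$
Jaccard Coefficient: $\frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
Adamic/Adar: $\sum_{z \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log |\Gamma(z)|}$
Preferential Attachment: $|\Gamma(u)| \cdot |\Gamma(v)|$
Katz: $\sum_{\ell=1}^{\infty} \beta^{\ell} \, |\mathrm{paths}^{\langle \ell \rangle}_{u,v}|$
Random walks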
• If there is a tie between x and y and one between y and z, then in
a transitive network x and z will also be connected
• Such structural clues have been traditionally used for link
prediction
• Consider what happens if edge semantics change
• Or if we further include context
Transitivity
17
[Figure: triads over nodes x, y and z, with edge labels “asks” and “?”]
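The neighborhood-based structural clues used for classic link prediction are simple to compute. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the toy graph and function names are illustrative only. Graph distance, Katz and random-walk scores are left out to keep it short.

```python
import math

def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v (the triadic-closure signal)."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree); rare neighbors count more."""
    # Degree-1 neighbors are skipped because log(1) = 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v] if len(adj[z]) > 1)

def preferential_attachment(adj, u, v):
    """Product of the two degrees."""
    return len(adj[u]) * len(adj[v])

# Hypothetical undirected toy graph.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
scores = {
    "common_neighbors": common_neighbors(adj, "a", "d"),
    "jaccard": jaccard(adj, "a", "d"),
    "adamic_adar": adamic_adar(adj, "a", "d"),
    "preferential_attachment": preferential_attachment(adj, "a", "d"),
}
```

All four scores are symmetric in u and v, which is one reason they do not carry over directly once edges mean "u wrote to v" rather than friendship.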
Communication Network
18
[Figure: Threaded Discussion, Bipartite Graph, Post-Reply Network]
Augmented, Directed Post-Reply Network
19
• We model a user as the union of her:
 connections and
 content
• We characterize microblogs using a set of attributes
 each feature according to its type
 Textual Features
− raw textual content (bag-of-words)
− #hashtags
− Groups
 Temporal Features
− Date
− Time
• WordNet: enrich concepts with conceptually, semantically and
lexically related terms
 Synonyms
 Hypernyms
 Hyponyms
User Representation
20
• Semantic Similarity of textual concepts
 Jaccard Index: $s(a,b) = s(S_a, S_b) = \frac{|S_a \cap S_b|}{|S_a \cup S_b|}$
 Synonym-based similarity: $s_s(a,b) = s_s(S_a, S_b)$
 Hypernym-based similarity: $s_h(a,b) = s_h(S_a, S_b)$
 Hyponym-based similarity: $s_{hp}(a,b) = s_{hp}(S_a, S_b)$
• Calculate Semantic Similarity using weighted sum
Semantic Similarity of Textual Features
21
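As a rough illustration of the weighted semantic similarity, together with the lexical (Levenshtein) score that the talk pairs with it, the sketch below compares two tags. Several things here are assumptions: the `related` dictionary is hand-written rather than pulled from WordNet, the weights are arbitrary, and normalizing edit distance by the longer string is just one common way to turn it into a similarity.

```python
def jaccard(a, b):
    """Jaccard index of two collections treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def levenshtein(s, t):
    """Edit distance by dynamic programming over two rows."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

def levenshtein_similarity(s, t):
    """ASSUMPTION: 1 - distance / length of the longer string."""
    longest = max(len(s), len(t))
    return 1.0 - levenshtein(s, t) / longest if longest else 1.0

def tag_similarity(a, b, related, weights=(0.5, 0.25, 0.25)):
    """max(weighted semantic similarity, lexical similarity) for two tags."""
    semantic = sum(
        w * jaccard(related[a][kind], related[b][kind])
        for w, kind in zip(weights, ("synonyms", "hypernyms", "hyponyms"))
    )
    return max(semantic, levenshtein_similarity(a, b))

# Hypothetical, hand-written stand-in for WordNet lookups.
vehicle_terms = {"synonyms": {"car", "auto", "automobile"},
                 "hypernyms": {"vehicle"}, "hyponyms": {"cab"}}
related = {
    "car": vehicle_terms,
    "automobile": vehicle_terms,
    "cart": {"synonyms": {"cart"}, "hypernyms": {"vehicle"}, "hyponyms": set()},
}
same_concept = tag_similarity("car", "automobile", related)
near_spelling = tag_similarity("car", "cart", related)
```

In this toy data the semantic score wins for "car" vs. "automobile", while the lexical score wins for "car" vs. "cart", which is the behavior the max is meant to provide.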
• Caveat: concepts belong to the same subtree
 Solution: compute similarity between the union of annotations: $s(S_a \cup H_a \cup Hp_a,\ S_b \cup H_b \cup Hp_b)$
• Account for lexical similarity: Levenshtein similarity
• Select the highest similarity, either semantic or lexical
 $s_{tg}(a,b) = \max\{\, w_s\, s_s(a,b) + w_h\, s_h(a,b) + w_{hp}\, s_{hp}(a,b),\ \mathrm{LevenshteinSimilarity}(a,b) \,\}$
Semantic Similarity of Textual Features
22
• Textual Similarity between bag-of-words features:
 tf.idf weight vector representation
 Cosine similarity
• Date Similarity: $s_d(d_1,d_2) = \begin{cases} 1 - \frac{|d_1 - d_2|}{T_d}, & |d_1 - d_2| \le T_d \\ 0, & \text{otherwise} \end{cases}$
• Time Similarity: $s_t(t_1,t_2) = \begin{cases} 1 - \frac{|t_1 - t_2|}{T_t}, & |t_1 - t_2| \le T_t \\ 0, & \text{otherwise} \end{cases}$
• Timestamp similarity: $s_{df}(x,y) = w_d\, s_d(x_d, y_d) + w_t\, s_t(x_t, y_t)$
Feature Similarity
23
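The date and time similarities can be sketched as one windowed function applied twice and then combined with weights. The window sizes (30 days, 12 hours), the equal weights, and the choice to compare time of day linearly rather than on a 24-hour circle are assumptions made for this example only.

```python
from datetime import datetime

def window_similarity(delta, window):
    """1 - |delta| / T inside the window T, and 0 once |delta| exceeds it."""
    delta = abs(delta)
    return 0.0 if delta > window else 1.0 - delta / window

def timestamp_similarity(x, y, t_date=30.0, t_time=12.0, w_date=0.5, w_time=0.5):
    """Weighted sum of date similarity (in days) and time-of-day similarity (in hours)."""
    day_gap = (x.date() - y.date()).days
    hour_gap = (x.hour + x.minute / 60) - (y.hour + y.minute / 60)
    return (w_date * window_similarity(day_gap, t_date)
            + w_time * window_similarity(hour_gap, t_time))

# Two hypothetical posts, 15 days and 6 hours apart.
first = datetime(2011, 3, 1, 9, 0)
second = datetime(2011, 3, 16, 15, 0)
similarity = timestamp_similarity(first, second)
```

Keeping date and time as separate features lets two posts written at the same hour on different days still count as partially similar.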
• We use a variation of Hausdorff point set distance measure:
 Average of the maximum similarity of features in set A with respect to
features in set B
 $S_H(A,B) = \frac{1}{|A|} \sum_{k=1}^{|A|} \max_i \, sim(a_k, b_i)$
 $sim(a_k, b_i)$: any similarity measure on set elements $a_k$ and $b_i$
 Measure is asymmetric with respect to the sets
Feature Set Similarity
24
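A minimal sketch of this set similarity follows, using exact match as the element-level measure. The tag sets are made up, and returning 0 for an empty set is a convention chosen here, not something stated in the talk.

```python
def hausdorff_similarity(A, B, sim):
    """Average, over each a in A, of the best match sim(a, b) found in B."""
    if not A or not B:
        return 0.0  # convention chosen for this sketch
    return sum(max(sim(a, b) for b in B) for a in A) / len(A)

def exact_match(a, b):
    """Simplest possible element similarity: 1 for identical items, else 0."""
    return 1.0 if a == b else 0.0

# Hypothetical tag sets: A is fully covered by B, but not the other way round.
A = ["sales", "q3"]
B = ["sales", "q3", "budget", "forecast"]
a_to_b = hausdorff_similarity(A, B, exact_match)
b_to_a = hausdorff_similarity(B, A, exact_match)
```

The two directions differ because the average runs over the first argument only: every element of A finds a perfect match in B, while half of B finds nothing in A.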
• A weighted function of content and network proximity: $S(u,v) = \lambda\, S_C(u,v) + (1-\lambda)\, S_N(u,v)$
 λ controls the tradeoff between content and network proximity
 Asymmetric with respect to users
• Content Proximity
 User similarity with respect to their microblogs: $S_C(u_1,u_2) = \frac{1}{|p^{u_1}|} \sum_{k=1}^{|p^{u_1}|} \max_i S(p_k^{u_1}, p_i^{u_2})$
 Similarity of microblogs
− Combined weighted value of respective attributes similarities: $S(p_1,p_2) = w_g\, s_g(p_{1g}, p_{2g}) + w_{tg}\, S_H(p_{1tg}, p_{2tg}) + w_{tx}\, s_{tx}(p_1,p_2) + w_{df}\, s_{df}(p_1,p_2)$
• Network Proximity: $S_N(u,v)$ [formula not recoverable from the extracted slide]
User Similarity
25
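The λ-weighted combination of content and network proximity might look like the sketch below. Two parts are assumptions and should not be read as the authors' definitions: posts are reduced to word sets compared by Jaccard overlap, and `network_proximity` uses shared neighbors normalized by the first user's neighborhood size purely as a stand-in, since the slide's formula for it is not legible.

```python
def word_overlap(p, q):
    """Jaccard overlap of two posts represented as word sets (simplification)."""
    return len(p & q) / len(p | q) if p | q else 0.0

def content_proximity(posts_u, posts_v, post_sim):
    """Average, over u's posts, of the best-matching post of v."""
    if not posts_u or not posts_v:
        return 0.0
    return sum(max(post_sim(p, q) for q in posts_v) for p in posts_u) / len(posts_u)

def network_proximity(adj, u, v):
    # ASSUMPTION: stand-in measure, shared neighbors over |neighbors of u|.
    return len(adj[u] & adj[v]) / len(adj[u]) if adj[u] else 0.0

def user_similarity(u, v, posts, adj, post_sim, lam=0.8):
    """lam * content proximity + (1 - lam) * network proximity."""
    return (lam * content_proximity(posts[u], posts[v], post_sim)
            + (1 - lam) * network_proximity(adj, u, v))

# Hypothetical users, posts and neighborhoods.
posts = {"u": [{"budget", "review"}], "v": [{"budget", "review"}, {"lunch"}]}
adj = {"u": {"x", "y"}, "v": {"x"}, "x": {"u", "v"}, "y": {"u"}}
u_to_v = user_similarity("u", "v", posts, adj, word_overlap)
v_to_u = user_similarity("v", "u", posts, adj, word_overlap)
```

With these toy inputs the score from u to v differs from the score from v to u, mirroring the asymmetry noted on the slide.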
• First construct the augmented communication graph G(V,E)
• Given a user u,
 compute users similarity
− For all posts of user u with respect to all other users in the network
 For all facets
Communication Intention Prediction
26
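The prediction step described above amounts to scoring every user that u has not yet written to and keeping the top k. The sketch below assumes some `similarity(u, v)` callable is already available; the lookup table standing in for it, and the alphabetical tie-break, are choices made for this example.

```python
def rank_candidates(u, users, existing_edges, similarity, k=5):
    """Top-k users v, ordered by similarity(u, v), with no existing edge (u, v)."""
    scored = [
        (similarity(u, v), v)
        for v in users
        if v != u and (u, v) not in existing_edges
    ]
    # Highest score first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [v for _, v in scored[:k]]

# Hypothetical precomputed similarities standing in for the full model.
table = {("ann", "bob"): 1.0, ("ann", "cat"): 0.7, ("ann", "dan"): 0.9}

def lookup(u, v):
    return table.get((u, v), 0.0)

users = ["ann", "bob", "cat", "dan"]
existing = {("ann", "bob")}
top_two = rank_candidates("ann", users, existing, lookup, k=2)
```

Here bob is excluded despite the highest score because ann has already messaged him, so the recommendation list contains only new conversation partners.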
• Complete snapshot (June 2010 – August 2011) of a corporate microblogging
service, which resembles Twitter
 4,213 unique users
 16,438 messages in total
− 8,174 thread starters
− 8,264 replies
 8,139 threads
 88 discussion groups
 637 unique #hashtags
Dataset
27
• In our evaluation we focus on the Largest Connected Component
 582 users
 3,773 directed edges
 11,684 messages
 Average degree = 12.97
• Clustering coefficient = 0.2311 >> cc_random = 0.0223
• Clustering coefficient as a function of node degree
 Average clustering coefficient decreases with increasing node degree
 Higher for nodes of low degree: significant clustering among low-degree
nodes
Dataset
28
[Figure: clustering coefficient vs. number of neighbors]
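For readers unfamiliar with the statistic, the local clustering coefficient of a node is the fraction of its neighbor pairs that are themselves connected. The sketch below treats the graph as undirected, which is an assumption; the slides do not say how direction was handled. The toy graph is hypothetical.

```python
from itertools import combinations

def local_clustering(adj, u):
    """Fraction of pairs of u's neighbors that are connected to each other."""
    neighbors = adj[u]
    if len(neighbors) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (len(neighbors) * (len(neighbors) - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, u) for u in adj) / len(adj)

# Hypothetical undirected toy graph: a triangle a-b-c plus a pendant node d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
by_node = {u: local_clustering(adj, u) for u in adj}
overall = average_clustering(adj)
```

Even in this tiny graph the degree-2 nodes a and c are more clustered than the degree-3 node b, the same qualitative pattern reported for the dataset.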
• Directed messages received vs. directed messages sent
 Scattered across the diagonal
 The cumulative distribution of the out-degree to in-degree ratio exhibits
high correlation between in-degree and out-degree
 Tendency of users to reply back when they receive a message from
other users?
29
• Four-fold cross validation
• Randomly sample 100 users & recommend top-k links for each user
• Accuracy measures
 Precision@k: $\frac{1}{|S|} \sum_{p \in S} \frac{N_k(p)}{k}$
 Recall@k: $\frac{1}{|S|} \sum_{p \in S} \frac{|F_p \cap R_p|}{|F_p|}$
 MRR: $\frac{1}{|S|} \sum_{p \in S} \frac{1}{rank_p}$
• Baselines
 Random
− Random selection
 Shared Vocabulary
− Cosine similarity based on #hashtags vector
 Shortest distance
− Length of the shortest path
 Common neighbors
− $sim(u,v) = |\Gamma(u) \cap \Gamma(v)|$
Evaluation
30
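The three accuracy measures can be sketched with their standard information-retrieval definitions, which is an assumption about how the talk applies them: for each sampled user, `ranked` is the recommended list and `relevant` is the set of users they actually went on to message. All names and data below are illustrative.

```python
def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommendations that are correct."""
    return sum(1 for v in ranked[:k] if v in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Share of the true links that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    return sum(1 for v in ranked[:k] if v in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / position of the first correct recommendation, or 0 if none is correct."""
    for position, v in enumerate(ranked, 1):
        if v in relevant:
            return 1.0 / position
    return 0.0

def mean_over_users(metric, rankings, truth, **kwargs):
    """Average a per-user metric over every sampled user."""
    return sum(metric(rankings[u], truth[u], **kwargs) for u in rankings) / len(rankings)

# Hypothetical recommendations and ground truth for two sampled users.
rankings = {"ann": ["dan", "cat", "eve"], "bob": ["eve", "ann", "cat"]}
truth = {"ann": {"cat"}, "bob": {"eve", "cat"}}
p_at_2 = mean_over_users(precision_at_k, rankings, truth, k=2)
r_at_2 = mean_over_users(recall_at_k, rankings, truth, k=2)
mrr = mean_over_users(reciprocal_rank, rankings, truth)
```

Precision and recall reward getting correct users anywhere in the top k, while MRR depends only on how early the first correct user appears.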
Lexical and Topical Alignment
• Is there a global vocabulary in the corporate microblogging service?
 Hashtags vocabulary
 “Groups vocabulary”
• Select user pairs at random and measure number of shared tags
 Average nst = 1.001
 Most common case is the absence of shared tags
• However adjacent users in social networks tend to share common
interests due to homophily
 We measure user homophily with respect to hashtags as a function of
the distance of users in the network
• Select user pairs at random and measure number of shared groups
 Average nsg = 1
 Most common case is the absence of shared groups
31
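Measuring vocabulary overlap between two users reduces to set intersection on their hashtags, and the shared-vocabulary baseline reduces to cosine similarity between hashtag frequency vectors. The sketch below uses made-up hashtag counts; the function names are illustrative.

```python
import math
from collections import Counter

def shared_tags(tags_u, tags_v):
    """Number of distinct hashtags used by both users."""
    return len(set(tags_u) & set(tags_v))

def cosine(tf_u, tf_v):
    """Cosine similarity between two hashtag frequency vectors."""
    dot = sum(tf_u[t] * tf_v[t] for t in tf_u.keys() & tf_v.keys())
    norm = (math.sqrt(sum(c * c for c in tf_u.values()))
            * math.sqrt(sum(c * c for c in tf_v.values())))
    return dot / norm if norm else 0.0

# Hypothetical hashtag usage counts per user.
ann = Counter({"#sales": 2, "#q3": 1})
bob = Counter({"#sales": 1, "#hr": 1})
overlap = shared_tags(ann, bob)
similarity = cosine(ann, bob)
```

Averaging `shared_tags` over randomly chosen pairs, or over pairs at a fixed network distance, gives the kind of statistic discussed on these slides.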
Lexical Alignment
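• Average number of shared (distinct) hashtags for two users as a
function of their distance d along the network
 Cosine similarity of the users' hashtag frequency vectors: $\frac{\sum_t tf_u(t)\, tf_v(t)}{\sqrt{\sum_t tf_u(t)^2}\, \sqrt{\sum_t tf_v(t)^2}}$
 $U_{tags}$, defined in terms of $n_t(u)$ and $n_t(v)$ [formula not recoverable from the extracted slide]
• Shared hashtags vocabulary up to distance 6!
32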
• Bold indicates best performing baseline
• Percentage lift
 the % improvement achieved over the best performing baseline
Methods Comparison
33
• How to choose the best values of λ and the weighting factors?
• Different datasets may lead to different optimal values
 Grid search over ranges of values for these parameters
 Measure accuracy on the validation set for each configuration setting
Weight Scheme Selection
34
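A grid search of this kind can be sketched as an exhaustive loop over λ values and weight vectors, keeping whichever configuration scores best on the validation set. The `evaluate` function below is a made-up stand-in that peaks at an arbitrary point; in practice it would run the predictor and return a validation metric such as MRR.

```python
from itertools import product

def simplex_weights(n, steps):
    """All n-tuples of multiples of 1/steps that sum to 1."""
    return [
        tuple(c / steps for c in combo)
        for combo in product(range(steps + 1), repeat=n)
        if sum(combo) == steps
    ]

def grid_search(evaluate, lambdas, weight_grid):
    """Return the (score, lam, weights) triple with the highest validation score."""
    best = None
    for lam, weights in product(lambdas, weight_grid):
        score = evaluate(lam, weights)
        if best is None or score > best[0]:
            best = (score, lam, weights)
    return best

def evaluate(lam, weights):
    # HYPOTHETICAL validation score, highest at lam = 0.8 and equal weights.
    return -abs(lam - 0.8) - abs(weights[0] - 0.5)

lambdas = [i / 10 for i in range(11)]
weight_grid = simplex_weights(2, 4)
best_score, best_lambda, best_weights = grid_search(evaluate, lambdas, weight_grid)
```

The number of configurations grows quickly with the number of features and the grid resolution, so coarse grids are a reasonable starting point.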
• λ = 0 considers only network proximity
• λ = 1 considers only content similarity
• All schemes perform better than the baseline
• Good value for λ is approximately 0.8
Effect of Parameter λ
35
• Effect of weighting schemes on accuracy per user
• Different weighting schemes perform better for different users
 Feature importance is user specific
• Need personalization to achieve better accuracy overall
Effect of Weighting Scheme
36
• Average precision@5 of users having k
 (a) posts or
 (b) neighbors in the communication network
 The more statistical evidence the better the overall precision
Content Availability and Structural Proximity
37
• MRR as a function of λ for various restrictions
• Greater statistical evidence results in more accurate predictions
Content Availability and Structural Proximity
38
• Performed modeling and analysis of informal communication at the
workplace
• We introduced the problem of communication intention prediction
• We addressed this problem by exploiting auxiliary information
 Holistic modeling of structural clues and semantically enriched
content
• We tested the efficiency of our approach in a real-world dataset
 The more statistical evidence available, the more accurate predictions
 Need for personalization
• Potential applications
 Contextual expert recommendation for Q&A
 Search for “interesting” people to collaborate
• Open problems
 Scalability
 Replication of results for online social media
Conclusion and Open Problems
39
• Semantic Social Network Analysis for the Enterprise
Contextual Recommendation
40
Employee ID:
• Semantic Social Network Analysis for the Enterprise
 Instantiate our modeling in Ontology
 Collaboration analytics at the workplace
 Real-world data evaluation
Contextual Recommendation
41
Contextual ego-
network analysis
Expert
Identification
Semantic Analysis
• Questions?
• Resources
 http://www-scf.usc.edu/~chelmis/index.php
 http://pgroup.usc.edu/wiki/CSS
• Please send all inquiries to chelmis@usc.edu
Thank you!
42

Contenu connexe

Tendances

CS6010 Social Network Analysis Unit III
CS6010 Social Network Analysis   Unit IIICS6010 Social Network Analysis   Unit III
CS6010 Social Network Analysis Unit IIIpkaviya
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Xiaohan Zeng
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018Arsalan Khan
 
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...Matthew Rowe
 
TruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkTruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkLora Aroyo
 
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...BAINIDA
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011guillaume ereteo
 
How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...Jeromy Anglim
 
2009 Node XL Overview: Social Network Analysis in Excel 2007
2009 Node XL Overview: Social Network Analysis in Excel 20072009 Node XL Overview: Social Network Analysis in Excel 2007
2009 Node XL Overview: Social Network Analysis in Excel 2007Marc Smith
 
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...1crore projects
 
Multidimensional Patterns of Disturbance in Digital Social Networks
Multidimensional Patterns of Disturbance in Digital Social NetworksMultidimensional Patterns of Disturbance in Digital Social Networks
Multidimensional Patterns of Disturbance in Digital Social NetworksDimitar Denev
 
IT6701 Information Management - Unit I
IT6701 Information Management - Unit I  IT6701 Information Management - Unit I
IT6701 Information Management - Unit I pkaviya
 
Big social data analytics - social network analysis
Big social data analytics - social network analysis Big social data analytics - social network analysis
Big social data analytics - social network analysis Jari Jussila
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network AnalysisPremsankar Chakkingal
 
Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Doug Needham
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreWael Elrifai
 
A comparative study of social network analysis tools
A comparative study of social network analysis toolsA comparative study of social network analysis tools
A comparative study of social network analysis toolsDavid Combe
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISrathnaarul
 

Tendances (20)

CS6010 Social Network Analysis Unit III
CS6010 Social Network Analysis   Unit IIICS6010 Social Network Analysis   Unit III
CS6010 Social Network Analysis Unit III
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018
 
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
TruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkTruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social Network
 
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011
 
How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...
 
2009 Node XL Overview: Social Network Analysis in Excel 2007
2009 Node XL Overview: Social Network Analysis in Excel 20072009 Node XL Overview: Social Network Analysis in Excel 2007
2009 Node XL Overview: Social Network Analysis in Excel 2007
 
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...
Asymmetric Social Proximity Based Private Matching Protocols for Online Socia...
 
Multidimensional Patterns of Disturbance in Digital Social Networks
Multidimensional Patterns of Disturbance in Digital Social NetworksMultidimensional Patterns of Disturbance in Digital Social Networks
Multidimensional Patterns of Disturbance in Digital Social Networks
 
IT6701 Information Management - Unit I
IT6701 Information Management - Unit I  IT6701 Information Management - Unit I
IT6701 Information Management - Unit I
 
Big social data analytics - social network analysis
Big social data analytics - social network analysis Big social data analytics - social network analysis
Big social data analytics - social network analysis
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview.
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and more
 
A comparative study of social network analysis tools
A comparative study of social network analysis toolsA comparative study of social network analysis tools
A comparative study of social network analysis tools
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSIS
 
02 Descriptive Statistics (2017)
02 Descriptive Statistics (2017)02 Descriptive Statistics (2017)
02 Descriptive Statistics (2017)
 

En vedette

Unpacking the social media phenomenon: towards a research agenda
Unpacking the social media phenomenon: towards a research agendaUnpacking the social media phenomenon: towards a research agenda
Unpacking the social media phenomenon: towards a research agendaAndrey Markin
 
Homophily and influence in social networks
Homophily and influence in social networksHomophily and influence in social networks
Homophily and influence in social networksNicola Barbieri
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph miningDavid Gleich
 
DeepWalk: Online Learning of Representations
DeepWalk: Online Learning of RepresentationsDeepWalk: Online Learning of Representations
DeepWalk: Online Learning of RepresentationsBryan Perozzi
 
Big Data: Social Network Analysis
Big Data: Social Network AnalysisBig Data: Social Network Analysis
Big Data: Social Network AnalysisMichel Bruley
 
Social Network Analysis Using Gephi
Social Network Analysis Using Gephi Social Network Analysis Using Gephi
Social Network Analysis Using Gephi Goa App
 
Community detection in graphs
Community detection in graphsCommunity detection in graphs
Community detection in graphsNicola Barbieri
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
[Report] The Rise of Digital Influence, by Brian Solis
[Report] The Rise of Digital Influence, by Brian Solis[Report] The Rise of Digital Influence, by Brian Solis
[Report] The Rise of Digital Influence, by Brian SolisAltimeter, a Prophet Company
 

En vedette (12)

Unpacking the social media phenomenon: towards a research agenda
Unpacking the social media phenomenon: towards a research agendaUnpacking the social media phenomenon: towards a research agenda
Unpacking the social media phenomenon: towards a research agenda
 
Homophily and influence in social networks
Homophily and influence in social networksHomophily and influence in social networks
Homophily and influence in social networks
 
Link prediction
Link predictionLink prediction
Link prediction
 
Social Networks
Social NetworksSocial Networks
Social Networks
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph mining
 
DeepWalk: Online Learning of Representations
DeepWalk: Online Learning of RepresentationsDeepWalk: Online Learning of Representations
DeepWalk: Online Learning of Representations
 
Big Data: Social Network Analysis
Big Data: Social Network AnalysisBig Data: Social Network Analysis
Big Data: Social Network Analysis
 
Social Network Analysis Using Gephi
Social Network Analysis Using Gephi Social Network Analysis Using Gephi
Social Network Analysis Using Gephi
 
Community detection in graphs
Community detection in graphsCommunity detection in graphs
Community detection in graphs
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
[Report] The Rise of Digital Influence, by Brian Solis
[Report] The Rise of Digital Influence, by Brian Solis[Report] The Rise of Digital Influence, by Brian Solis
[Report] The Rise of Digital Influence, by Brian Solis
 

Similaire à Predicting Communication Intention in Social Media

LSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLocal Social Summit
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smithMarc Smith
 
Vinci2011会议演讲PPT
Vinci2011会议演讲PPTVinci2011会议演讲PPT
Vinci2011会议演讲PPTdasiyjun
 
Tutorial on Relationship Mining In Online Social Networks
Tutorial on Relationship Mining In Online Social NetworksTutorial on Relationship Mining In Online Social Networks
Tutorial on Relationship Mining In Online Social Networkspjing2
 
Sas web 2010 lora-aroyo
Sas web 2010 lora-aroyoSas web 2010 lora-aroyo
Sas web 2010 lora-aroyoLora Aroyo
 
Vinci2011会议演讲PPT
Vinci2011会议演讲PPTVinci2011会议演讲PPT
Vinci2011会议演讲PPTdasiyjun
 
Exploring Generative Models of Tripartite Graphs for Recommendation in Social...
Exploring Generative Models of Tripartite Graphs for Recommendation in Social...Exploring Generative Models of Tripartite Graphs for Recommendation in Social...
Exploring Generative Models of Tripartite Graphs for Recommendation in Social...Charalampos Chelmis
 
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014James Powell
 
Complex Networks Analysis @ Universita Roma Tre
Complex Networks Analysis @ Universita Roma TreComplex Networks Analysis @ Universita Roma Tre
Complex Networks Analysis @ Universita Roma TreMatteo Moci
 
Sylva workshop.gt that camp.2012
Sylva workshop.gt that camp.2012Sylva workshop.gt that camp.2012
Sylva workshop.gt that camp.2012CameliaN
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczIoan Toma
 
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...Marc Smith
 
Social Network Analysis - an Introduction (minus the Maths)
Social Network Analysis - an Introduction (minus the Maths)Social Network Analysis - an Introduction (minus the Maths)
Social Network Analysis - an Introduction (minus the Maths)Katy Jordan
 
20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …Marc Smith
 
20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...Marc Smith
 
Graph Theoretic Model for Community Wireless Networks
Graph Theoretic Model for Community Wireless NetworksGraph Theoretic Model for Community Wireless Networks
Graph Theoretic Model for Community Wireless NetworksABDELAAL
 

Similaire à Predicting Communication Intention in Social Media (20)

Predicting Communication Intention in Social Media

  • 1. Predicting Communication Intention in (Enterprise) Social Networks Charalampos “Harris” Chelmis Computer Science, University of Southern California Thanks to: Viktor K. Prasanna, Ming Hsieh Department of Electrical Engineering, USC Vikram Sorathia, Co-founder & CEO at Kensemble Tech Labs LLP •All audio is muted. •If you dialed in, you MUST enter your audio pin to be able to ask questions! •We recommend that you keep your phone muted, and unmute yourself when you need to ask questions. •You can view the upcoming seminar schedule at www.milibo.com/talent/events.aspx
  • 2. Social Networks are Everywhere 2 • Movie Networks • Affiliation/co-authorship networks • Professional networks • Friendship networks • Information networks • Organizational Networks • Q&A websites • Even networks
  • 3. • Multiple applications  Targeted marketing  Personalization − Content delivery  Recommendation − People to connect, items to buy, movies to watch  Law enforcement − Fraud detection − Guilt by association  Epidemiology  Information dissemination/propagation  … • Users interact with one another and content they create and consume  Rich interactions − Friendships based on similarity − Following based on interest  Noisy Social Network Analysis 3
  • 4. • Collaboration Enabling Technologies  Multiple communication channels  Spread of timely and relevant information  Search for data and experts Collaboration Technologies at the Workplace 4
  • 5. • Main focus on business perspective  Less noisy than online social networks  Q&A  Problem solving  Information seeking • But also  Assist in breaking barriers  Team building  Knowledge propagation • Opportunities  Expert identification − Experts vs. Influencers  Information Flow  Trends − Technology adoption − Company focus Collaboration at the Workplace 5
  • 6. • More Opportunities  Collective Knowledge − Generation − Sharing  Collaborative Knowledge Management − How do employees work together to complete tasks? − How does innovation happen? − Best practices • Difficulties  Informal interactions  Heterogeneous, unstructured data  How to formally model knowledge? Collaboration at the Workplace 6
  • 7. • Descriptive Modeling  Social network analysis • Predictive modeling  Link prediction  Attribute prediction • Typically networked data are represented as graphs  Nodes (e.g., users)  Edges − Social relations − Interactions − Information flow − Similarity  Weight − Communication frequency − Communication cost (e.g., distance) − Reciprocity − Type of interaction (e.g., family member, friend, or officemate) Networked Data Modeling 7
  • 8. • Heterogeneous object and link types • Both nodes and edges may carry attributes • Attribute dependencies  Correlation between attribute values and link structure − e.g. link prediction based on auxiliary information  Correlation among attributes of related nodes − e.g. collaborative filtering • Node dependencies  e.g. groups/communities • Partial observations  e.g. labels But Networked Data are Very Different than Graphs 8
  • 9. • Big Data  Billions of users  Billions of connections  Billions of “documents” • Temporality  Affiliations  Interests  Friendships • Context  Spatial  Temporal  Topical • Content multimodality  Text  Multimedia Networked Data ≠ Graphs 9
  • 10. • Edges are more than links  Type − e.g. like vs. comment vs. share  Trust  Sentiment  Strength  Time  Number • Edges “reveal” something about the relation between nodes  Prior “interaction” to compute similarity Networked Data ≠ Graphs 10
  • 11. Networked Data ≠ Graphs 11
  • 12. • Integrated informal communication • Context sensitive • Temporal • External Sources • Analysis of implicit relations Holistic Modeling of Complex Networks 12 [Figure: multiple collaborative platforms; multimodal, heterogeneous content from various sources; meta-information about content; operations: social algebraic operations, complex mining and analysis, correlation of different domains, temporal and semantic analysis; dimensions: context, time, content, connection]
  • 13. • Directed communication graph G = (V,E)  Node u represents a user  Edge e = (u,v) exists iff user u has sent at least one message to user v • Input  G0 = (V0,E0), subgraph of G consisting of all nodes in G and a subset of edges in G • Output  Ranked list L of edges, not present in G0, such that $L \subseteq E \setminus E_0$ Predicting Intention of Communication 13 [Figure: input graph G0 and output graph G1 around a user u]
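The problem setup on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the message log, the user names, and the choice of which edge to hide are all made up, and dropping self-addressed messages is an assumption.

```python
from collections import defaultdict

def build_communication_graph(messages):
    """Directed graph as {sender: set(recipients)} from (sender, recipient) pairs."""
    graph = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:  # assumption: self-addressed messages are ignored
            graph[sender].add(recipient)
    return dict(graph)

def edge_set(graph):
    """Flatten the adjacency map into a set of directed (u, v) edges."""
    return {(u, v) for u, nbrs in graph.items() for v in nbrs}

# Hypothetical message log: repeated messages still yield a single edge.
messages = [("ann", "bob"), ("ann", "bob"), ("bob", "ann"),
            ("ann", "cy"), ("cy", "bob")]
G = build_communication_graph(messages)
E = edge_set(G)

# G0 keeps all nodes but only a subset of the edges; the hidden edges
# are what a ranked list L should ideally recover.
E0 = E - {("ann", "cy")}
hidden = E - E0
```

Direction is preserved throughout: ("ann", "bob") and ("bob", "ann") are distinct edges, which matches the definition of e = (u,v) above.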
  • 14. • Edge semantics:  Conversation between users rather than friendship • “What makes people initiate conversations with strangers?” • “With whom do individuals choose to collaborate and why?” ≠ Link Prediction 14 • Contextual – Temporal Properties: m1(u1,u2,g1) ≠ m1(u1,u2,g2) • Directionality Matters: m1(u1,u2,g1) ≠ m1(u2,u1,g1)
  • 15. • The tendency to relate to people with similar characteristics  status, beliefs, etc. • Fundamental concept underlying social theories (e.g. Blau 1977) • Fundamental basis for links in many types of social networks  “Similar” nodes tend to cluster together • How does this help us solve our problem? Homophily 15
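  • 16. • Machine learning  Probabilistic, supervised, computationally expensive • Node attributes  No semantics  We instead exploit multiple features of variable types • Network structure How to Compute Similarity? 16 Network structure measures for a node pair (u, v), with $\Gamma(u)$ the set of neighbors of u: Graph Distance: length of shortest path between u and v; Common Neighbors: $|\Gamma(u) \cap \Gamma(v)|$; Jaccard Coefficient: $\frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$; Adamic/Adar: $\sum_{z \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log |\Gamma(z)|}$; Preferential Attachment: $|\Gamma(u)| \cdot |\Gamma(v)|$; Katz: $\sum_{\ell=1}^{\infty} \beta^{\ell} \, |\mathrm{paths}^{\langle \ell \rangle}_{u,v}|$; Random walks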
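The neighborhood-based measures in this table are short enough to write out. The sketch below assumes an undirected toy graph stored as a dict of neighbor sets; the graph and node names are illustrative, and the guard against degree-1 neighbors in Adamic/Adar is an added safety check rather than part of the definition.

```python
import math

def common_neighbors(nbrs, u, v):
    return len(nbrs[u] & nbrs[v])

def jaccard_coefficient(nbrs, u, v):
    union = nbrs[u] | nbrs[v]
    return len(nbrs[u] & nbrs[v]) / len(union) if union else 0.0

def adamic_adar(nbrs, u, v):
    # Rare (low-degree) common neighbors count more; skip degree-1 nodes
    # because log(1) = 0 would divide by zero.
    return sum(1.0 / math.log(len(nbrs[z]))
               for z in nbrs[u] & nbrs[v] if len(nbrs[z]) > 1)

def preferential_attachment(nbrs, u, v):
    return len(nbrs[u]) * len(nbrs[v])

# Hypothetical undirected graph: a-b, a-c, b-c, b-d, c-d.
nbrs = {"a": {"b", "c"}, "b": {"a", "c", "d"},
        "c": {"a", "b", "d"}, "d": {"b", "c"}}
```

For the unconnected pair (a, d), both common neighbors b and c have degree 3, so the Adamic/Adar score is 2 / log 3.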
  • 17. • If there is a tie between x and y and one between y and z, then in a transitive network x and z will also be connected • Such structural clues have been traditionally used for link prediction • Consider what happens if edge semantics change • Or if we further include context Transitivity 17 [Figure: triads over nodes x, y, z, with an edge labeled “asks” and a “?”]
  • 20. • We model a user as a union of her:  connections and  her content • We characterize microblogs using a set of attributes  each feature according to its type  Textual Features − raw textual content (bag-of-words) − #hashtags − Groups  Temporal Features − Date − Time • WordNet: enrich concepts with conceptually, semantically and lexically related terms  Synonyms  Hypernyms  Hyponyms User Representation 20
  • 21. • Semantic Similarity of textual concepts, with $S_a$, $H_a$, $Hp_a$ the synonym, hypernym and hyponym sets of concept a  Jaccard Index: $s(a,b) = s(S_a, S_b) = \frac{|S_a \cap S_b|}{|S_a \cup S_b|}$  Synonym-based similarity: $s_s(a,b) = s(S_a, S_b)$  Hypernym-based similarity: $s_h(a,b) = s(H_a, H_b)$  Hyponym-based similarity: $s_{hp}(a,b) = s(Hp_a, Hp_b)$ • Calculate Semantic Similarity using weighted sum Semantic Similarity of Textual Features 21
  • 22. • Caveat: concepts belong to the same subtree  Solution: compute similarity between the union of annotations • Account for lexical similarity: Levenshtein similarity • Select the highest similarity, either semantic or lexical Semantic Similarity of Textual Features 22 $s_{tg}(a,b) = \max\{\mathrm{LevenshteinSimilarity}(a,b),\; w_s s_s(a,b) + w_h s_h(a,b) + w_{hp} s_{hp}(a,b),\; s(S_a \cup H_a \cup Hp_a,\, S_b \cup H_b \cup Hp_b)\}$
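A rough sketch of the "highest of semantic or lexical" rule follows. Several things here are assumptions: the hand-written `lexicon` stands in for WordNet lookups, the weights are arbitrary, and Levenshtein similarity is taken to be 1 minus the edit distance divided by the longer string's length, which the slides do not spell out.

```python
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def levenshtein(s, t):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

def levenshtein_similarity(s, t):
    longest = max(len(s), len(t))
    return 1.0 - levenshtein(s, t) / longest if longest else 1.0

def tag_similarity(a, b, lexicon, w_s=0.5, w_h=0.3, w_hp=0.2):
    """Max of lexical similarity, weighted semantic sum, and union-set Jaccard."""
    keys = ("syn", "hyper", "hypo")
    S_a, H_a, Hp_a = (set(lexicon.get(a, {}).get(k, ())) for k in keys)
    S_b, H_b, Hp_b = (set(lexicon.get(b, {}).get(k, ())) for k in keys)
    weighted = (w_s * jaccard(S_a, S_b) + w_h * jaccard(H_a, H_b)
                + w_hp * jaccard(Hp_a, Hp_b))
    union = jaccard(S_a | H_a | Hp_a, S_b | H_b | Hp_b)
    return max(levenshtein_similarity(a, b), weighted, union)

# Hypothetical lexicon entries standing in for WordNet.
lexicon = {
    "car": {"syn": {"auto", "automobile"}, "hyper": {"vehicle"}, "hypo": {"cab"}},
    "automobile": {"syn": {"auto", "car"}, "hyper": {"vehicle"}, "hypo": {"cab"}},
}
```

With an empty lexicon the function degrades to pure lexical matching, so near-identical spellings such as "teamwork" and "teamworks" still score highly.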
  • 23. • Textual Similarity between bag-of-words features:  tf.idf weight vector representation  Cosine similarity • Date Similarity: $s_d(d_1,d_2) = 0$ if $|d_1 - d_2| > T_d$, and $1 - \frac{|d_1 - d_2|}{T_d}$ otherwise • Time Similarity: $s_t(t_1,t_2) = 0$ if $|t_1 - t_2| > T_t$, and $1 - \frac{|t_1 - t_2|}{T_t}$ otherwise • Timestamp similarity: $s_{df}(x,y) = w_d\, s_d(x_d, y_d) + w_t\, s_t(x_t, y_t)$ Feature Similarity 23
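The date and time similarities share one shape: a linear decay that reaches zero at a threshold. A small sketch, assuming dates are day numbers and times are hours of the day; the thresholds and weights below are placeholders, not values from the talk.

```python
def window_similarity(x, y, T):
    """1 at identical values, decaying linearly to 0 at a gap of T, 0 beyond."""
    gap = abs(x - y)
    return 0.0 if gap > T else 1.0 - gap / T

def timestamp_similarity(x, y, w_d=0.5, w_t=0.5, T_d=30, T_t=12):
    """x and y are (day_number, hour_of_day) pairs; units are an assumption."""
    return (w_d * window_similarity(x[0], y[0], T_d)
            + w_t * window_similarity(x[1], y[1], T_t))
```

For example, two posts 15 days and 6 hours apart score 0.5 on each facet under these placeholder thresholds, and 0.5 overall. Note that this sketch treats hours linearly, so 23:00 and 01:00 count as 22 hours apart.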
  • 24. • We use a variation of Hausdorff point set distance measure: $S_H(A,B) = \frac{1}{|A|} \sum_{k=1}^{|A|} \max_i \, sim(a_k, b_i)$  Average of the maximum similarity of features in set A with respect to features in set B  $sim(a_k, b_i)$: any similarity measure on set elements ak and bi  Measure is asymmetric with respect to the sets Feature Set Similarity 24
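The set-level measure is easy to check by hand on a tiny example. In this sketch the element similarity is plain equality, purely for illustration; returning 0 for an empty set is an assumption.

```python
def set_similarity(A, B, sim):
    """Average, over elements of A, of the best match found in B."""
    if not A or not B:
        return 0.0  # assumption: no evidence means no similarity
    return sum(max(sim(a, b) for b in B) for a in A) / len(A)

def exact(a, b):
    return 1.0 if a == b else 0.0

A = ["#cloud", "#security"]
B = ["#cloud"]
forward = set_similarity(A, B, exact)   # only one of A's two tags is matched
backward = set_similarity(B, A, exact)  # B's single tag is fully matched
```

The two directions differ (0.5 versus 1.0), which is the asymmetry the slide points out.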
  • 25. • A weighted function of content and network proximity: $S(u,v) = \lambda\, S_C(u,v) + (1-\lambda)\, S_N(u,v)$  λ controls the tradeoff between content and network proximity • Content Proximity: $S_C(u_1,u_2) = \frac{1}{|u_1^p|} \sum_{k=1}^{|u_1^p|} \max_i S(u_{1k}^p, u_{2i}^p)$  User similarity with respect to their microblogs  Similarity of microblogs − Combined weighted value of respective attributes similarities: $S(p_1,p_2) = w_g\, s_g(p_1,p_2) + w_{tg}\, S_H(p_1^{tg}, p_2^{tg}) + w_{tx}\, s_{tx}(p_1,p_2) + w_{df}\, s_{df}(p_1,p_2)$ • Network Proximity: $S_N(u,v)$ User Similarity 25 Asymmetric with respect to users
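The two weighted combinations on this slide can be sketched as plain functions. The facet names (groups, tags, text, timestamp), the weights, and the numeric inputs are placeholders; how the network proximity score is computed is left outside this sketch.

```python
def post_similarity(facet_sims, weights):
    """Weighted sum of per-facet similarities between two posts."""
    return sum(weights[f] * facet_sims[f] for f in weights)

def user_similarity(s_content, s_network, lam):
    """lam = 1 uses content proximity only; lam = 0 uses network proximity only."""
    return lam * s_content + (1.0 - lam) * s_network

# Hypothetical facet scores and weights for one pair of posts.
weights = {"groups": 0.25, "tags": 0.25, "text": 0.25, "timestamp": 0.25}
facet_sims = {"groups": 1.0, "tags": 0.5, "text": 0.5, "timestamp": 0.0}
s_post = post_similarity(facet_sims, weights)
```

With content proximity 0.6 and network proximity 0.2, setting lam to 0.8 gives 0.8 × 0.6 + 0.2 × 0.2 = 0.52.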
  • 26. • First construct the augmented communication graph G(V,E) • Given a user u,  compute users similarity − For all posts of user u with respect to all other users in the network  For all facets Communication Intention Prediction 26
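One way to turn pairwise similarity into a prediction is to rank, for a given user, everyone they have not yet messaged. This is a sketch under assumptions: `score` is any user similarity function, ties are broken alphabetically, and the toy scores are invented.

```python
def recommend(u, users, already_contacted, score, k=5):
    """Top-k users that u has not messaged yet, by descending similarity."""
    candidates = [v for v in users if v != u and v not in already_contacted]
    return sorted(candidates, key=lambda v: (-score(u, v), v))[:k]

# Hypothetical similarity scores from user "u" to three colleagues.
toy_scores = {("u", "a"): 0.9, ("u", "b"): 0.4, ("u", "c"): 0.7}
top = recommend("u", ["u", "a", "b", "c"], {"a"},
                lambda x, y: toy_scores[(x, y)], k=2)
```

User "a" is excluded because an edge to "a" already exists, so the ranked list contains only new potential recipients.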
  • 27. • Complete snapshot (June 2010 – August 2011) of a corporate microblogging service, which resembles Twitter  4,213 unique users  16,438 messages in total − 8,174 thread starters − 8,264 replies  8,139 threads  88 discussion groups  637 unique #hashtags Dataset 27
  • 28. • In our evaluation we focus on the Largest Connected Component  582 users  3,773 directed edges  11,684 messages  Average degree = 12.97 • Clustering coefficient = 0.2311 >> cc_random = 0.0223 • Clustering coefficient as a function of node degree  Average clustering coefficient decreases with increasing node degree  Higher for nodes of low degree: significant clustering among low-degree nodes Dataset 28
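The reported figures are mutually consistent, which can be checked with a few lines of arithmetic. Two assumptions are made here: average degree counts both in- and out-links (2|E|/|V|), and the random baseline is the expected clustering coefficient of a random graph of the same size and density, average degree divided by the number of nodes.

```python
n_users, n_edges = 582, 3773

# Each directed edge contributes one out-link and one in-link.
avg_degree = 2 * n_edges / n_users

# Expected clustering of a same-size random graph (assumed baseline).
cc_random = avg_degree / n_users

# How many times more clustered the observed network is.
lift = 0.2311 / cc_random
```

Under these assumptions the numbers on the slide are reproduced to the stated precision, and the observed clustering is roughly ten times the random baseline.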
  • 29. Number of Neighbors • Directed messages received vs. directed messages sent  Scattered across the diagonal  The cumulative distribution of the out-degree to in-degree ratio exhibits high correlation between in-degree and out-degree  Tendency of users to reply back when they receive a message from other users? 29
  • 30. • Four-fold cross validation • Randomly sample 100 users & recommend top-k links for each user • Accuracy measures  Precision@k: $\frac{1}{|S|} \sum_{p \in S} \frac{N_k(p)}{k}$  Recall@k: $\frac{1}{|S|} \sum_{p \in S} \frac{|F_p \cap R_p|}{|F_p|}$  MRR: $\frac{1}{|S|} \sum_{p \in S} \frac{1}{rank_p}$ • Baselines  Random − Random selection  Shared Vocabulary − Cosine similarity based on #hashtags vector  Shortest distance − Length of the shortest path  Common neighbors − $sim(u,v) = |\Gamma(u) \cap \Gamma(v)|$ Evaluation 30
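The three accuracy measures can be sketched as follows. The reading of the symbols is an assumption: each sampled user has a ranked recommendation list and a set of held-out true recipients, a hit is a recommended user in that set, and the rank used by MRR is the position of the first hit.

```python
def precision_at_k(ranked, relevant, k):
    return sum(1 for v in ranked[:k] if v in relevant) / k

def recall_at_k(ranked, relevant, k):
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    for position, v in enumerate(ranked, 1):
        if v in relevant:
            return 1.0 / position
    return 0.0  # no correct recommendation anywhere in the list

def mean_over_users(metric, runs):
    """Average a per-user metric over (ranked_list, relevant_set) pairs."""
    return sum(metric(ranked, relevant) for ranked, relevant in runs) / len(runs)

# Two hypothetical sampled users.
runs = [(["b", "c", "d"], {"c"}), (["a", "d", "c"], {"a", "c"})]
p_at_2 = mean_over_users(lambda r, rel: precision_at_k(r, rel, 2), runs)
r_at_2 = mean_over_users(lambda r, rel: recall_at_k(r, rel, 2), runs)
mrr = mean_over_users(reciprocal_rank, runs)
```

In this toy run each user gets one hit in their top 2, so precision@2 is 0.5, while recall@2 and MRR both average to 0.75.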
  • 31. Lexical and Topical Alignment • Is there a global vocabulary in the corporate microblogging service?  Hashtags vocabulary  “Groups vocabulary” • Select user pairs at random and measure number of shared tags  Average nst = 1.001  Most common case is the absence of shared tags • However adjacent users in social networks tend to share common interests due to homophily  We measure user homophily with respect to hashtags as a function of the distance of users in the network • Select user pairs at random and measure number of shared groups  Average nsg = 1  Most common case is the absence of shared groups 31
  • 32. Lexical Alignment • Average number of shared (distinct) hashtags for two users as a function of their distance d along the network • Cosine similarity of hashtag frequency vectors: $\sigma_{tags}(u,v) = \frac{\sum_t tf_u(t)\, tf_v(t)}{\sqrt{\sum_t tf_u(t)^2}\, \sqrt{\sum_t tf_v(t)^2}}$ • Shared hashtags vocabulary up to distance 6! 32
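Both lexical alignment measures, the count of shared distinct hashtags and the cosine similarity of hashtag frequency vectors, fit in a short sketch. The hashtag lists are invented, and returning 0 for a user with no hashtags is an assumption.

```python
import math
from collections import Counter

def shared_tags(tags_u, tags_v):
    """Number of distinct hashtags used by both users."""
    return len(set(tags_u) & set(tags_v))

def tag_cosine(tags_u, tags_v):
    """Cosine similarity between raw hashtag frequency vectors."""
    tf_u, tf_v = Counter(tags_u), Counter(tags_v)
    dot = sum(tf_u[t] * tf_v[t] for t in tf_u.keys() & tf_v.keys())
    norm = (math.sqrt(sum(c * c for c in tf_u.values()))
            * math.sqrt(sum(c * c for c in tf_v.values())))
    return dot / norm if norm else 0.0

# Hypothetical hashtag usage, with repeats for repeated use.
u_tags = ["#ai", "#ai", "#cloud"]
v_tags = ["#ai", "#hr"]
```

Here the users share one distinct tag, and the cosine is 2 / (√5 · √2), about 0.63, because the first user used the shared tag twice.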
  • 33. • Bold indicates best performing baseline • Percentage lift  the % improvement achieved over the best performing baseline Methods Comparison 33
  • 34. • How to choose best values of λ and weighting factors? • Different datasets may lead to different optimal values  Grid search over ranges of values for these parameters  Measure accuracy on the validation set for each configuration setting Weight Scheme Selection 34
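The grid search described here amounts to an exhaustive loop over parameter combinations. In the sketch below, `evaluate` is a stand-in for "measure accuracy on the validation set"; the toy objective and the candidate grids are made up for illustration.

```python
from itertools import product

def grid_search(evaluate, lambdas, weight_grid):
    """Return (accuracy, lam, weights) for the best-scoring configuration."""
    best = None
    for lam, weights in product(lambdas, weight_grid):
        accuracy = evaluate(lam, weights)
        if best is None or accuracy > best[0]:
            best = (accuracy, lam, weights)
    return best

def toy_evaluate(lam, weights):
    # Stand-in for validation accuracy; peaks at lam = 0.8, equal weights.
    return -(lam - 0.8) ** 2 - abs(weights[0] - 0.5)

lambdas = [i / 10 for i in range(11)]            # 0.0, 0.1, ..., 1.0
weight_grid = [(0.25, 0.75), (0.5, 0.5), (0.75, 0.25)]
best_accuracy, best_lam, best_weights = grid_search(toy_evaluate, lambdas, weight_grid)
```

In a real run, `evaluate` would compute a measure such as precision@k on held-out validation edges for each setting, and the chosen configuration would then be scored once on the test fold.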
  • 35. • λ = 0 only considers network proximity • λ = 1 only considers content similarity • All schemes perform better than the baseline • Good value for λ is approximately 0.8 Effect of Parameter λ 35
  • 36. • Effect of weighting schemes on accuracy per user • Different weighting schemes perform better for different users  Feature importance is user specific • Need personalization to achieve better accuracy overall Effect of Weighting Scheme 36
  • 37. • Average precision@5 of users having k  (a) posts or  (b) neighbors in the communication network  The more statistical evidence, the better the overall precision Content Availability and Structural Proximity 37
  • 38. • MRR as a function of λ for various restrictions • Greater statistical evidence results in more accurate predictions Content Availability and Structural Proximity 38
  • 39. • Performed modeling and analysis of informal communication at the workplace • We introduced the problem of communication intention prediction • We addressed this problem by exploiting auxiliary information  Holistic modeling of structural clues and semantically enriched content • We tested the efficiency of our approach on a real-world dataset  The more statistical evidence available, the more accurate the predictions  Need for personalization • Potential applications  Contextual expert recommendation for Q&A  Search for “interesting” people to collaborate with • Open problems  Scalability  Replication of results for online social media Conclusion and Open Problems 39
  • 40. • Semantic Social Network Analysis for the Enterprise Contextual Recommendation 40 [Figure: interface with an “Employee ID:” field]
  • 41. • Semantic Social Network Analysis for the Enterprise  Instantiate our modeling in Ontology  Collaboration analytics at the workplace  Real-world data evaluation Contextual Recommendation 41 [Figure panels: contextual ego-network analysis; expert identification; semantic analysis]
  • 42. • Questions? • Resources  http://www-scf.usc.edu/~chelmis/index.php  http://pgroup.usc.edu/wiki/CSS • Please send all inquiries to chelmis@usc.edu Thank you! 42