1. Socio-semantic network algorithms for a point-of-view-based visualization of online communities
Juan David CRUZ GOMEZ
Cécile BOTHOREL
LUSSI Department
2. The dimensions of a social network
Real-world social networks store both
social and structural information about the
actors
For example, a social network built from DBLP
(a scientific bibliographic database)
represents authors writing about different
computer science fields who can be
connected in several ways
This social network has two main
dimensions:
-Structural dimension
-Compositional dimension
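As a minimal sketch, the two dimensions can be held side by side in plain Python: the structural dimension as a co-authorship edge list and the compositional dimension as per-author topic counts. The author names, topics, and counts below are invented for illustration and are not taken from DBLP; the helper names are hypothetical.

```python
from collections import defaultdict

# Structural dimension: who is connected to whom
# (hypothetical co-authorship links, not real DBLP data).
edges = [("ana", "bob"), ("bob", "cho"), ("cho", "dev")]

# Compositional dimension: what each actor is like
# (hypothetical number of papers per topic).
profiles = {
    "ana": {"databases": 4, "data mining": 1},
    "bob": {"databases": 2},
    "cho": {"data mining": 3, "math": 1},
    "dev": {"math": 5},
}


def neighbors(edge_list):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def main_topic(profile):
    """Return the topic with the most papers in a profile."""
    return max(profile, key=profile.get)


adjacency = neighbors(edges)
```

Keeping the two structures separate makes it explicit that a question such as "who works with whom" uses only `adjacency`, while "who writes about what" uses only `profiles`.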
2 Juan David Cruz Gómez
3. The dimensions of a social network
[Figure: Authors, Papers, and Topics; the social network is derived from the author-domain network]
4. The dimensions of a social network
[Figure: Authors, Papers, and Topics; a partition of topics (Topics, Data M., Math) is derived from the author-domain network]
5. The dimensions of a social network
Integration process
?
How to find an expert in some domain with multidisciplinary competences?
How to find experts in related topics that could be easily connected?
6. Towards a new definition of community
Each dimension is used to answer a limited spectrum of questions…
Integrating both dimensions can help to solve more complex questions about the data.
-Find groups of connected experts in different topics for a project
Both dimensions are included in the community detection process to unveil highly connected people with similar profiles
This model can be used with different social networks and applications
This requires, first, a new definition of community and, second, a set of tools and methods supporting that definition
7. The general approach
How to integrate structural and composition variables to
generate an affiliation variable to induce a partition with
groups of well connected and similar nodes?
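One way to picture the question above is the following minimal sketch. It is not the algorithm presented in this talk: it assumes a simple weighted mix of a structural score (Jaccard overlap of closed neighborhoods) and a compositional score (cosine similarity of feature vectors), keeps the edges whose mixed score reaches a threshold, and uses the resulting connected components as the affiliation variable. The function names, the parameters `alpha` and `threshold`, and the toy data are all hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def jaccard(adj, u, v):
    """Jaccard overlap of the closed neighborhoods of u and v."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def affiliation(adj, features, alpha=0.5, threshold=0.5):
    """Return a node -> group label map (an assumed, illustrative scheme).

    An edge is kept when alpha * structural + (1 - alpha) * compositional
    reaches the threshold; groups are the connected components of kept edges.
    """
    kept = {u: set() for u in adj}
    for u in adj:
        for v in adj[u]:
            score = (alpha * jaccard(adj, u, v)
                     + (1 - alpha) * cosine(features[u], features[v]))
            if score >= threshold:
                kept[u].add(v)
                kept[v].add(u)
    labels, next_label = {}, 0
    for start in sorted(adj):
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first walk over the kept edges
            node = stack.pop()
            for nb in kept[node]:
                if nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return labels


# Toy data (hypothetical): a triangle of "db" authors linked to two "math" authors.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
       "d": {"c", "e"}, "e": {"d"}}
features = {"a": {"db": 1}, "b": {"db": 1}, "c": {"db": 1},
            "d": {"math": 1}, "e": {"math": 1}}
labels = affiliation(adj, features)
```

In this toy graph the edge between `c` and `d` is dropped because the two nodes share few neighbors and no topics, so the partition separates the well connected, similar nodes on each side.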
8. The general approach
How to design a visual model to analyze the affiliation
variables in light of the other two variables?
9. Used methods – community detection
Type | Example | Variable
Partitional | Node indexing [Rattigan et al 2007] | Structural
Partitional | Clique enumeration [Du et al 2007] | Structural
Spectral | Sigmoid commute time kernel [Yen et al 2009] | Structural
Divisive | Highest betweenness removal [Newman 2001] | Structural
Divisive | Betweenness generalization [Newman & Girvan 2002] | Structural
Modularity | Hierarchical approach [Newman 2004] | Structural
Modularity | Modularity variation [Clauset et al 2004] | Structural
Modularity | Fast unfolding of communities [Blondel et al 2008] | Structural
Other | Game theoretic approach [Mehrotra et al 2012] | Structural
Other | Random walk for structural and semantic similarity [Zhou et al 2009] | Structural + Composition
Other | My community detection algorithm | Structural + Composition
Most of the methods use only the structural variable. Our method combines the structural
and composition variables, reducing the number of a priori assumptions
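Several of the structural methods listed above optimize modularity. As a minimal sketch, the standard Newman-Girvan modularity of a partition of an unweighted, undirected graph can be computed as Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the sum of degrees in c. The function name and the toy graph are illustrative only; note that this score uses the structural variable alone.

```python
def modularity(edges, labels):
    """Newman-Girvan modularity of a partition (unweighted, undirected graph)."""
    m = len(edges)
    intra = {}   # community -> number of edges with both endpoints inside it
    degree = {}  # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Toy graph (hypothetical): two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_split = modularity(toy_edges, two_groups)   # 5/14
q_single = modularity(toy_edges, one_group)   # 0.0
```

Splitting the toy graph into its two triangles scores higher than putting every node in a single community, which is the behavior modularity-based methods exploit.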
10. Used methods - Advantages
■ We combine methods from data and graph mining
to integrate the variables and analyze the result
■ The selected methods were chosen because of:
● Their execution speed: generally linear
complexity, allowing us to manipulate large data sets
● In the case of compositional information, their flexibility for
defining different features from different spaces
● Their use of common quality measures, allowing
benchmarking and integration with other elements
■ The model was built to be a framework that allows
the user to go from data selection to the visual
analysis of the variables
11. Integration of variables – Experiments
Results of the experiment with the DBLP co-authorship network
■ Each community contains authors from different domains, taking into
account cross-domain information
■ The density measure for each point of view is lower
■ The entropy is lower than the reference value
■ In general, the results are better, in terms of density and entropy, than
those reported by Zhou et al., 2009, who use the same
data set
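The slides do not give the exact formulas behind these two measures. The sketch below assumes definitions in the spirit of those commonly used to evaluate attributed-graph clustering: density as the fraction of edges that fall inside communities, and entropy as the size-weighted average Shannon entropy of one categorical attribute within each community. Function names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def density(edges, labels):
    """Fraction of edges whose two endpoints share a community."""
    inside = sum(1 for u, v in edges if labels[u] == labels[v])
    return inside / len(edges)


def entropy(attribute, labels):
    """Size-weighted mean Shannon entropy (bits) of an attribute per community."""
    groups = defaultdict(list)
    for node, community in labels.items():
        groups[community].append(attribute[node])
    total = len(labels)
    result = 0.0
    for values in groups.values():
        counts = Counter(values)
        h = -sum((c / len(values)) * math.log2(c / len(values))
                 for c in counts.values())
        result += len(values) / total * h
    return result


# Toy data (hypothetical): two triangles joined by one edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
topic = {0: "db", 1: "db", 2: "dm", 3: "math", 4: "math", 5: "math"}

d = density(toy_edges, partition)   # 6 of the 7 edges are internal
e = entropy(topic, partition)       # only community A mixes two topics
```

Under these assumed definitions, a higher density means better connected communities and a lower entropy means more homogeneous profiles inside each community.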
12. Visualization of communities – Experiments – DBLP social network
10000 nodes, 65734 edges, 862 communities of authors with different profiles: topics and number of publications…
The visual model allows us to identify important authors at different levels…
[Figure labels: Authors connecting different communities; Authors connected only with authors in the same community]
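The two kinds of authors highlighted in the figure can be told apart directly from a partition. The sketch below is only an illustration of that distinction, not the visual model itself: a node is called "boundary" if at least one of its edges crosses into another community and "inner" otherwise. The function name, role names, and toy data are assumptions.

```python
def classify_nodes(edges, labels):
    """Mark each node 'boundary' if it has a neighbor in another community, else 'inner'."""
    role = {node: "inner" for node in labels}
    for u, v in edges:
        if labels[u] != labels[v]:
            role[u] = role[v] = "boundary"
    return role


# Toy data (hypothetical): two triangles joined by the edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

roles = classify_nodes(toy_edges, partition)
boundary = sorted(n for n, r in roles.items() if r == "boundary")
```

Here only nodes 2 and 3 connect different communities; the remaining nodes are connected only with members of their own community.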
13. Milestones of the work
1 Integration of variables in a social network
- Full exploitation of the information available in social networks
- Ample spectrum of analyses and applications: transversality of communities, sight beyond sight…
2 New/open problem
- A developing field in social network analysis: this is one of the first works using the SN information in this way
- A growing research interest!
3 New definition of community in SN
- Applicable to other domains like personal visual analytics, social marketing, biology, impact and spreading measurements…
14. Conclusion
■ Outline of a new definition of community in social
networks
■ Formalization of the problem of integration of
variables
■ Definition of a general manipulation method:
● Novel community detection algorithm
● Use of different types of data
● The use of this method can be extended to other
domains (biology, marketing, business administration)
■ Integration of variables of divergent nature
■ Analysis and visualization methods for exploring
communities
15. Thank you for your attention
Do you have questions?
Editor's notes
In this graph, several well-connected nodes remain as inner nodes. These nodes can be seen as gurus in their communities; however, they are not connected with other communities (which deal with other topics)