- First version was a guest lecture about Network Visualization in the class "Data Visualization" taught by Dr. Sharon Hsiao in the QMSS program at Columbia University http://www.columbia.edu/~ih2240/dataviz/index.htm
- This updated version was delivered in our class on SNA at PUC Chile in the MPGI master program.
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PUC Chile
1. Network Visualization + Gephi Tutorial
Denis Parra, Ph.D., Assistant Professor, Pontifical Catholic University of Chile
MPGI ~ Master Program
Monday, October 5th, 2015
2. Expectations
• What will you learn at the end of this class?
1. Basic concepts of Networks, Graphs, and Social Network Analysis (SNA)
2. Systems/Applications that make use of network visualizations
3. Recent Research on Network Visualization
4. How to use a Network Visualization and Analysis tool (Gephi) ~ in-class tutorial
5. Bonus: Where do I find data sets to do more cool visualizations?
10/5/15 @denisparra | 2
4. We live in a connected world
• … and we need visualization models to represent networks such as:
– Online social networks: Facebook, Twitter ~ people connected online
– Information networks: WWW ~ web pages connected through hyperlinks
– Computer networks: the Internet ~ computers and routers connected through wired/wireless connections
• What is a network? (Easley and Kleinberg, 2011) “a network is any collection of objects in which some pairs of these objects are connected by links”.
5. A bit of history: Graph models
• Around 1735, the mathematician Leonhard Euler laid the foundation for graph theory by creating a model to represent the problem of the “7 bridges of Königsberg”
Source: http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg and “Linked” by A.-L. Barabási
6. A formal definition of Graph
• Based on (Easley and Kleinberg, 2011): A graph is a way of specifying relationships among a collection of items. A graph consists of a set of objects, called nodes, with certain pairs of these objects connected by links called edges.
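The definition above (a set of nodes plus edges connecting certain pairs) maps naturally onto an adjacency list. The following is a minimal sketch, not part of the original slides; the node names and the `build_graph` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as an adjacency list: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


# Hypothetical edges, invented for illustration only.
edges = [("Ana", "Ben"), ("Ben", "Carla"), ("Carla", "Ana"), ("Carla", "Dan")]
graph = build_graph(edges)

nodes = sorted(graph)                            # the set of objects (nodes)
num_edges = sum(len(n) for n in graph.values()) // 2  # each edge is stored twice

print(nodes)
print(num_edges)
```

Storing neighbors in sets keeps membership checks cheap, which helps when computing the measures introduced later in the class.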
7. Graphs as Models of Networks
• Based on (Easley and Kleinberg, 2011): “Graphs are useful because they serve as mathematical models of network structures.”
• But keep in mind: Graphs are only one way to represent networks (though the most popular)
{ arc, link, edge }
{ node, vertex }
Source: L. Adamic SNA class @coursera
8. The Historic Development of Network Visualization
• The following slides are based on the work of Pfeffer and Freeman (2015)
Pfeffer, Juergen & Freeman, Lin. Methods of Social Network Visualization. Encyclopedia of Complexity and Systems Science, 2nd Edition, Springer Reference.
• They divide this historic development into three categories:
1. Nodes, Links, Shape, Size
2. Substance-Based Layout
3. Two-Mode Networks
9. Overall View of the Visualizations
Reference: Pfeffer, Juergen & Freeman, Lin (forthcoming). Methods of Social Network Visualization. Encyclopedia of Complexity and Systems Science, 2nd Edition, Springer Reference. http://www.pfeffer.at/data/visposter/
16. The tennis players’ social network
Figure: a network whose nodes are Sharonpova, Sharapova, Serena, Li Na, Rafa, Djoker, Soderling, Prof. Parra, and Roger
17. Some Types of Networks
• Hereinafter, I will refer interchangeably to graphs and networks. Here are some types:
– Undirected (Facebook friendships)
– Directed (Twitter following)
– Multimode (Amazon user-product)
– Weighted (Facebook likes)
– and more…
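The difference between undirected, directed, and weighted networks shows up directly in how edges are stored. Below is a minimal sketch under stated assumptions: the `add_edge`, `out_degree`, and `in_degree` helpers and the user names are hypothetical, and the graph is a plain dictionary of dictionaries (source -> target -> weight).

```python
def add_edge(graph, source, target, weight=1.0, directed=True):
    """Store an edge; for undirected networks the reverse edge is stored too."""
    graph.setdefault(source, {})[target] = weight
    graph.setdefault(target, {})
    if not directed:
        graph[target][source] = weight


def out_degree(graph, node):
    """Number of edges leaving the node (e.g. accounts this user follows)."""
    return len(graph[node])


def in_degree(graph, node):
    """Number of edges arriving at the node (e.g. followers of this user)."""
    return sum(1 for targets in graph.values() if node in targets)


# Directed example in the spirit of "Twitter following" (hypothetical users).
following = {}
add_edge(following, "alice", "bob")
add_edge(following, "carol", "bob")
add_edge(following, "bob", "alice")

# Undirected, weighted example (hypothetical weights).
friendships = {}
add_edge(friendships, "alice", "bob", weight=9, directed=False)

print(in_degree(following, "bob"), out_degree(following, "bob"))
print(friendships["bob"]["alice"])
```

In an undirected network in-degree and out-degree coincide, which is why a single "degree" is usually reported there.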
18. Analyzing a network: SNA
• How do we analyze a network?
• How do we compare different networks?
• This class is about network visualizations, but some foundational concepts of SNA need to be understood first.
• Let’s see ways to describe the network at the local and global levels
Source: http://moviegalaxies.com
19. Measures in SNA
Node-level metrics
• Centrality
– (In/Out) Degree
– Betweenness
– Closeness
– Eigenvector
• Clustering coefficient
Graph-level metrics
• Size
• Diameter (longest shortest path)
• Average path length
• Average [node metric]
• These are only a few representative measures
• For further understanding of these measures, see the presentation by Giorgos Cheliotis on SlideShare, from slide 8: http://www.slideshare.net/gcheliotis/social-network-analysis-3273045
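A few of the measures listed above can be computed by hand on a small graph. The sketch below is illustrative only and assumes an undirected, connected graph stored as an adjacency list; the four-node graph and all function names are hypothetical. It computes degree, the local clustering coefficient, the diameter, and the average path length.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected graph: a triangle A-B-C plus a pendant node D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def degree(graph, node):
    """Node-level: number of direct neighbors."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Node-level: fraction of pairs of neighbors that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return linked / (k * (k - 1) / 2)


def distances_from(graph, start):
    """Breadth-first search: shortest distance (in edges) to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter_and_average_path_length(graph):
    """Graph-level: longest shortest path and mean shortest path (connected graph)."""
    lengths = [
        d
        for source in graph
        for target, d in distances_from(graph, source).items()
        if target != source
    ]
    return max(lengths), sum(lengths) / len(lengths)


diameter, average_path_length = diameter_and_average_path_length(graph)
print(degree(graph, "C"), clustering_coefficient(graph, "C"))
print(diameter, average_path_length)
```

Tools such as Gephi report the same quantities; computing them once on a toy graph makes the tool's numbers easier to interpret.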
20. Interpretation of measures
Source: http://www.slideshare.net/gcheliotis/social-network-analysis-3273045, slide 24
Interpretation in Social Networks
Degree: How many people can this person reach directly?
Betweenness: How likely is this person to be the most direct route between two people in the network?
21. Interpretation of measures
Source: http://www.slideshare.net/gcheliotis/social-network-analysis-3273045, slide 24
Interpretation in Social Networks
Closeness: How fast can this person reach everyone in the network?
Eigenvector: How well is this person connected to other well-connected people?
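The closeness interpretation ("how fast can this person reach everyone?") can be made concrete with a short sketch. It assumes an unweighted, undirected graph and uses one common definition: the number of reachable nodes divided by the sum of shortest-path distances to them. The `closeness` function and the three-person chain are hypothetical.

```python
from collections import deque


def closeness(graph, node):
    """Closeness centrality: reachable nodes / sum of shortest distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    reachable = len(dist) - 1
    return reachable / total if total > 0 else 0.0


# Hypothetical chain of three people: A - B - C.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

print(closeness(chain, "B"))  # the middle person reaches everyone in one step
print(closeness(chain, "A"))
```

The person in the middle of the chain gets the highest score, matching the intuition that they can reach everyone fastest.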
22. Two more concepts…
• Total possible number of edges in a network with n nodes
#edges = n * (n - 1) / 2 (undirected network)
#edges = n * (n - 1) (directed network)
• (Shortest) Path: the shortest sequence of edges to be followed to reach a node B from a node A in a network. What is the length of the shortest path between Rafa Nadal and Sharonpova?
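Both concepts on this slide can be checked with a few lines of code. The sketch below is an illustration, not the slide's answer: the figure with the tennis network is not reproduced in this transcript, so the edges used here are invented and clearly hypothetical; only the node names come from the slides. The function names are also assumptions.

```python
from collections import deque


def max_edges(n, directed=False):
    """Total possible number of edges among n nodes (no self-loops)."""
    ordered_pairs = n * (n - 1)
    return ordered_pairs if directed else ordered_pairs // 2


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return None


# HYPOTHETICAL edges: the real ones are in the slide's figure, not in this text.
hypothetical_edges = [
    ("Rafa", "Roger"),
    ("Rafa", "Djoker"),
    ("Roger", "Serena"),
    ("Serena", "Sharonpova"),
]
tennis = {}
for a, b in hypothetical_edges:
    tennis.setdefault(a, set()).add(b)
    tennis.setdefault(b, set()).add(a)

print(max_edges(9))                    # 9 nodes, undirected
print(max_edges(94, directed=True))    # 94 nodes, directed
print(shortest_path_length(tennis, "Rafa", "Sharonpova"))
```

With these invented edges the path Rafa, Roger, Serena, Sharonpova has length 3; the answer for the network drawn on the slide may differ.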
23. Practice the learned concepts…
• Practice the learned concepts by comparing these 2 movie networks (characters’ interactions): Traffic (2000) and Forrest Gump (1994)
Source: http://moviegalaxies.com
26. Network Components
• (from G. Cheliotis) “many large groups and online communities have a core of densely connected users … and a much larger periphery”
• Source: http://www.slideshare.net/gcheliotis/social-network-analysis-3273045, page 34
• (from L. Adamic) “if the largest component encompasses a significant fraction of the graph, it is called the giant component”
• Source: https://class.coursera.org/sna-2012-001/class/index, week 1 slides
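Components, and the giant component in particular, can be found with a simple traversal. This is a minimal sketch assuming an undirected graph stored as an adjacency list; the seven-node example and the `connected_components` helper are hypothetical.

```python
def connected_components(graph):
    """Return the connected components of an undirected graph, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical network: one large group, one pair, and one isolated node.
graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
    "g": set(),
}

components = connected_components(graph)
largest = components[0]
fraction = len(largest) / len(graph)

print(len(components), sorted(largest), round(fraction, 2))
```

Whether the largest component counts as "giant" depends on how large a fraction of the graph one considers significant; the quote above does not fix a threshold.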
27. Remarks and Further topics in SNA
• With the concepts already described, we will attempt to visualize and analyze two networks in the NodeXL & Gephi tutorial.
• Not covered in this class, but worth mentioning, are other SNA topics:
– Network growth/formation: Erdős–Rényi, Watts-Strogatz, Barabási-Albert (preferential attachment)
– Community structure: Girvan-Newman, Clauset-Newman-Moore (max-modularity), affinity propagation, etc.
– Processes in networks: diffusion, epidemics, innovation, etc.
– Network motifs: small subgraphs that are over-represented in the network
29. Examples of Applications
• These are a few examples of applications that make use of Network Visualizations:
– Truthy
– Moviegalaxies
– Poderopedia
– TwitterScope
– LinkedIn Maps
• These ARE NOT tools for generic Visualization and Analysis (we’ll see those in the tutorial section)
30. Truthy
• Information Diffusion research at Indiana U.
• http://truthy.indiana.edu
31. MovieGalaxies
• Visualize and discuss the characters of movies as networks
• http://moviegalaxies.com
32. Poderopedia
• Who is who in business and politics in Chile?
• Knight Foundation: Top 10 digital tools for journalists (Feb 4, 2013) http://www.knightfoundation.org/blogs/knightblog/2013/2/4/new-digital-tools-journalists-10-learn/
33. TwitterScope
• A visual monitor of tweets in real time. This is an enhanced graph model.
• http://tibesti.research.att.com/twitterscope/
36. Recent Research (~by Feb 2013)
• Can we go Beyond the Graph?
• ManyNets
• HivePlots
• Orion
• GraphPrism
• Motif Simplifications
• GeoSpatial Network Visualization
37. Social Network Visualization: Can we go Beyond the Graph? (2006)
• The authors argue that social network visualization for end users should go beyond the graph-only paradigm
• http://web.media.mit.edu/~fviegas/papers/viegas-cscw04.pdf
40. Orion (2011)
• Different visualizations to present network data
• http://vis.stanford.edu/papers/orion
a) Sorted matrix
b) Node-link diagram
c) Plot of betweenness for two networks
42. GraphPrism (2012)
• GraphPrism: Compact Visualization of Network Structure, inspired by B-Matrices
• http://vis.stanford.edu/papers/graphprism
43. Motif Simplification (2012)
• Use of fans and parallel glyphs to improve readability
• http://hcil2.cs.umd.edu/trs/2012-11/2012-11.pdf
44. MuxViz: Multilayer Networks (2014)
• Multilayer analysis and visualization of networks
• http://muxviz.net/index.php
45. 4. Using a Network Visualization Tool (NodeXL & Gephi in a nutshell)
47. How do I format my network data?
• Depends on your information needs. What do you want to describe?
– GDF http://guess.wikispot.org/The_GUESS_.gdf_format
– GEXF http://gexf.net/format/
– GraphML http://graphml.graphdrawing.org
– Pajek Net format http://vlado.fmf.uni-lj.si/pub/networks/pajek/doc/pajekman.pdf
– CSV https://gephi.org/users/supported-graph-formats/csv-format/
• For a summary and examples, check https://gephi.org/users/supported-graph-formats/
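Of the formats listed above, a CSV edge list is usually the quickest to produce from code. The sketch below is an assumption-laden illustration: the `edges_to_csv` helper and the example edges are hypothetical, and the column names Source, Target, and Weight are the ones Gephi's spreadsheet import is generally understood to recognize; check the Gephi format page linked above before relying on them.

```python
import csv
import io


def edges_to_csv(edges):
    """Serialize (source, target, weight) tuples as a CSV edge list."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed header names; verify against the Gephi documentation.
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical weighted edges, invented for illustration.
edges = [("Rafa", "Roger", 3), ("Roger", "Serena", 1)]
csv_text = edges_to_csv(edges)
print(csv_text)

# To save it for import into a tool:
# with open("edges.csv", "w", newline="") as handle:
#     handle.write(csv_text)
```

Richer formats such as GEXF or GraphML are worth the extra effort when nodes and edges carry attributes (labels, types, timestamps) that a flat edge list cannot express cleanly.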
48. How do I format my Data?
49. For the rest of my classes
• Will use this as reference for igraph analysis
52. Final Remarks
• In this class you learnt:
– Basic concepts of networks, graphs, and SNA
– Existing applications that make use of network visualizations
– Research related to network visualization
– How to use a network visualization and analysis tool
• My final message:
– The graph model is great, but try to move beyond the graph-only visualization.
– Think of ways to create visualizations that help to make sense of the different properties inherent to the network and to its elements (nodes and links). R and JavaScript give you enough power to implement them.
53. Thanks!
• Questions?
• denisparra@gmail.com or @denisparra
• Check my academic web page http://web.ing.puc.cl/~dparra/
• and my research blog http://kawinproject.wordpress.com
55. Where do I find cool NetVis?
• http://www.visualcomplexity.com/vc/
Where do I find network datasets?
• Jure Leskovec’s page http://snap.stanford.edu/data/
• Mark Newman’s page http://www-personal.umich.edu/~mejn/netdata/
• Gephi wiki datasets http://wiki.gephi.org/index.php/Datasets
• From CMU’s Graphlab http://graphlab.org/downloads/datasets/
56. Recommended books
• Linked by Albert-László Barabási
• Networks, Crowds, and Markets by D. Easley and J. Kleinberg (pre-print available free online)
58. • Do you R?
– Temporal networks with igraph and R (with 20 lines of code!) http://markov.uc3m.es/2012/11/temporal-networks-with-igraph-and-r-with-20-lines-of-code/
59. LineSets (InfoVis 2011)
• Alper et al. (UCSB and Microsoft Research)
• Extend a concept from subway maps to sets of items
60. Denis Parra’s Research
• Using network visualization in recommendation approaches: “Visualizing Recommendations to Support Exploration, Transparency and Controllability” by Verbert, Parra, Brusilovsky and Duval, IUI Conference (2013).
61. Denis Parra’s Research
• Plotting the edge weight distributions of several networks to compare and explain the performance of community algorithms
62. Denis Parra’s Research
• Twitter in academic events: A study of temporal usage, communication, sentiment and topical patterns in 16 Computer Science conferences (http://dx.doi.org/10.1016/j.comcom.2015.07.001)
Editor's Notes
9*8 = 72, 72 / 2 = 36
94*93 = 8742
Density: Network density is the proportion of edges in a network relative to the total number of possible edges.
Diameter: The diameter of a network is the length (#edges) of the longest shortest path between any two nodes.
Clustering Coefficient: A measure of the likelihood that two associates of a node are associates themselves. A higher clustering coefficient indicates a greater cliquishness.
Path Length: The distance between a pair of nodes in the network. Average path length is the average of these distances over all pairs of nodes.
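The density definition in these notes combines an observed edge count with the total possible number of edges. A minimal sketch follows; the `density` function name and the edge counts used in the example are hypothetical, while the node counts (9 and 94) are the ones appearing in the arithmetic above.

```python
def density(num_nodes, num_edges, directed=False):
    """Proportion of existing edges relative to all possible edges."""
    possible = num_nodes * (num_nodes - 1)
    if not directed:
        possible /= 2
    return num_edges / possible if possible else 0.0


# Hypothetical edge counts, invented for illustration.
print(density(9, 12))                    # 12 of 36 possible undirected edges
print(density(94, 437, directed=True))   # 437 of 8742 possible directed edges
```

A density close to 1 means almost every possible pair is connected; large real-world social networks are typically far sparser.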