SlideShare a Scribd company logo
1 of 37
Download to read offline
Social Network Analysis with R ∗
Yanchang Zhao
http://www.RDataMining.com
R and Data Mining Course
Beijing University of Posts and Telecommunications,
Beijing, China
July 2019
∗
Chapter 11: Social Network Analysis, in R and Data Mining: Examples and Case
Studies. http://www.rdatamining.com/docs/RDataMining-book.pdf
1 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
2 / 37
Network and Graph
Nodes, vertices or entities
Edges, links or relationships
Network analysis, graph mining
Link prediction, community/group detection, entity resolution,
recommender system, information propogation modeling
3 / 37
Graph Databases
Neo4j: https://neo4j.com
Giraph on Hadoop: http://giraph.apache.org
GraphX on Spark: http://spark.apache.org/graphx/
4 / 37
Social Network Analysis
Graph construction
Graph query
Centrality measures
Graph visualization
Clustering and community detection
5 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
6 / 37
Graph Construction
Tom, Ben, Bob and Mary are friends of John.
Alice and Wendy are friends of Mary.
Wendy is a friend of David.
library(igraph)
# nodes
nodes <- data.frame(
name = c("Tom","Ben","Bob","John","Mary","Alice","Wendy","David"),
gender = c("M", "M", "M", "M", "F", "F", "F", "M"),
age = c( 16, 30, 42, 29, 26, 32, 18, 22)
)
# relations
edges <- data.frame(
from = c("Tom", "Ben", "Bob", "Mary", "Alice", "Wendy", "Wendy"),
to = c("John", "John", "John", "John","Mary", "Mary", "David")
)
# build a graph object
g <- graph.data.frame(edges, directed=F, vertices=nodes)
7 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
8 / 37
Graph Visualization
layout1 <- g %>% layout_nicely() ## save layout for reuse
g %>% plot(vertex.size = 30, layout = layout1)
Tom
Ben
Bob
John
Mary
AliceWendy
David
9 / 37
Graph Visualization (cont.)
## use blue for male and pink for female
colors <- ifelse(V(g)$gender=="M", "skyblue", "pink")
g %>% plot(vertex.size=30, vertex.color=colors, layout=layout1)
Tom
Ben
Bob
John
Mary
AliceWendy
David
10 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
11 / 37
Graph Query
## nodes
V(g)
## + 8/8 vertices, named, from 8dfec3f:
## [1] Tom Ben Bob John Mary Alice Wendy David
## edges
E(g)
## + 7/7 edges from 8dfec3f (vertex names):
## [1] Tom --John Ben --John Bob --John John --Mary
## [5] Mary --Alice Mary --Wendy Wendy--David
## immediate neighbors (friends) of John
friends <- ego(g,order=1,nodes="John",mindist=1)[[1]] %>% print()
## + 4/8 vertices, named, from 8dfec3f:
## [1] Tom Ben Bob Mary
## female friends of John
friends[friends$gender == "F"]
## + 1/8 vertex, named, from 8dfec3f:
## [1] Mary
12 / 37
Graph Query (cont.)
## 1- and 2-order neighbors (friends) of John
g2 <- make_ego_graph(g, order=2, nodes="John")[[1]]
g2 %>% plot(vertex.size=30, vertex.color=colors)
Tom
Ben
Bob
John
Mary
Alice
Wendy
13 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
14 / 37
Friendship Graph
Tom
Ben
Bob
John
Mary
AliceWendy
David
15 / 37
Centrality Measures
Degree: the number of adjacent edges; indegree and
outdegree for directed graphs
Closeness: the inverse of the average length of the shortest
paths to/from all other nodes
Betweenness: the number of shortest paths going through a
node
degree <- g %>% degree() %>% print()
## Tom Ben Bob John Mary Alice Wendy David
## 1 1 1 4 3 1 2 1
closeness <- g %>% closeness() %>% round(2) %>% print()
## Tom Ben Bob John Mary Alice Wendy David
## 0.06 0.06 0.06 0.09 0.09 0.06 0.07 0.05
betweenness <- g %>% betweenness() %>% print()
## Tom Ben Bob John Mary Alice Wendy David
## 0 0 0 15 14 0 6 0
16 / 37
Centrality Measures (cont.)
Eigenvector centrality: the values of the first eigenvector of
the graph adjacency matrix
Transivity, a.k.a clustering coefficient, measures the
probability that the adjacent nodes of a node are connected.
eigenvector <- evcent(g)$vector %>% round(2) %>% print()
## Tom Ben Bob John Mary Alice Wendy David
## 0.45 0.45 0.45 1.00 0.85 0.38 0.48 0.22
transitivity <- g %>% transitivity(type = "local") %>% print()
## [1] NaN NaN NaN 0 0 NaN 0 NaN
17 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
18 / 37
Static Network Visualization
Static network visualization
Fast in rendering big graphs
For very big graphs, the most efficient way is to save
visualization result into a file, instead of directly to screen.
Save network diagram into files: pdf(), bmp(), jpeg(),
png(), tiff()
library(igraph)
## plot directly to screen when graph is small
plot(g)
## for big graphs, save visualization to a PDF file
pdf("mygraph.pdf")
plot(g)
graphics.off() ## or dev.off()
19 / 37
Interactive Network Visualization
Coordinates of other nodes are not adjusted when moving a
node.
Can be slow when rendering big graphs
Save network diagram into files: visSave(), visExport()
visIgraph(g, idToLabel=T) %>%
## highlight nodes connected to a selected node
visOptions(highlightNearest=T) %>%
## use different icons for different types (groups) of nodes
visGroups(groupname="person", shape="icon",
icon=list(code="f007")) %>%
... %>%
## use FontAwesome icons
addFontAwesome() %>%
## add legend of nodes
visLegend() %>%
## to save to file
visSave(file = "network.html")
20 / 37
Interactive Network Visualization (cont.)
Dynamically adjusting coordinates for better visualization
Very slow when rendering big graphs
x <- toVisNetworkData(g)
visNetwork(nodes=x$nodes, edges=x$edges)%>%
## use different icons for different types (groups) of nodes
visGroups(groupname="person", shape="icon",
icon=list(code="f007")) %>%
... %>%
## use FontAwesome icons
addFontAwesome() %>%
## add legend of nodes
visLegend()
21 / 37
Load Graph Data
## download graph data
url <- "http://www.rdatamining.com/data/graph.rdata"
download.file(url, destfile = "./data/graph.rdata")
library(igraph)
# load graph data into R
# what will be loaded: g, nodes, edges
load("../data/graph.rdata")
22 / 37
Build a Graph
head(nodes, 3)
## name type
## 1 T9 tid
## 2 T24 tid
## 3 T13 tid
head(edges, 3)
## from to
## 1 T9 P27
## 2 T24 P8
## 3 T13 P2
## buid a graph object
g <- graph.data.frame(edges, directed = F, vertices = nodes)
g
## IGRAPH 9597c42 UN-B 61 60 --
## + attr: name (v/c), type (v/c)
## + edges from 9597c42 (vertex names):
## [1] T9 --P27 T24--P8 T13--P2 T27--P10 T29--P29 T2 --P27
## [7] T16--P21 T27--P20 T17--P30 T14--P20 T29--P22 T14--P17
## [13] T21--P18 T18--P9 T4 --P5 T9 --A29 T24--A28 T13--A21 23 / 37
Example of Static Network Visualization
library(igraph)
plot(g, vertex.size=12, vertex.label.cex=0.7,
vertex.color=as.factor(V(g)$type), vertex.frame.color=NA)
T9
T24
T13
T27
T29
T2
T16
T17
T14
T21
T18
T4
P27
P8
P2
P10
P29
P21
P20
P30
P22
P17
P18
P9
P5
A29
A28
A21
A24
A1A15
A23
A7
A10
A5
A13
A12
N5
N7
N14
N8
N26
N2
N24
N4
N17
N23
N27
N12
E20
E3
E12
E9
E25
E14
E24
E23
E19
E22
E1
E15
24 / 37
Example of Interactive Network Visualization
library(visNetwork)
V(g)$group <- V(g)$type
## visualization
data <- toVisNetworkData(g)
visNetwork(nodes=data$nodes, edges=data$edges) %>%
visGroups(groupname="tid",shape="icon",icon=list(code="f15c")) %>%
visGroups(groupname="person",shape="icon",icon=list(code="f007")) %>%
visGroups(groupname="addr",shape="icon",icon=list(code="f015")) %>%
visGroups(groupname="phone",shape="icon",icon=list(code="f095")) %>%
visGroups(groupname="email",shape="icon",icon=list(code="f0e0")) %>%
addFontAwesome() %>%
visLegend()
25 / 37
26 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
27 / 37
R Packages
Network analysis: igraph, sna, statnet
Network visualization: visNetwork
Interface with graph databases: RNeo4j
28 / 37
Package igraph †
V(g), E(g): nodes and edges of graph g
degree, betweenness, closeness, transitivity:
various centrality scores
neighborhood: neighborhood of graph vertices
cliques, largest.cliques, maximal.cliques,
clique.number: find cliques, ie. complete subgraphs
clusters, no.clusters: maximal connected components
of a graph and the number of them
fastgreedy.community, spinglass.community:
community detection
cohesive.blocks: calculate cohesive blocks
induced.subgraph: create a subgraph of a graph (igraph)
read.graph, write.graph: read and writ graphs from and
to files of various formats
†
https://cran.r-project.org/package=igraph
29 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
30 / 37
Wrap Up
Package igraph and sna
Static visualization
Can visualize nodes with shapes, images and icons
Visualise very large graph
Support network analysis and graph mining
Package visNetwork
Interactive visualization
Can visualize nodes with shapes, images and icons
Image rendering can be very slow for large graphs
Designed for visualization only, and does not support network
analysis and graph mining
31 / 37
Contents
Graph and Social Network Analysis
Graph Construction
Graph Visualization
Graph Query
Centrality Measures
Advanced Graph Visualization
R Packages
Wrap Up
Further Readings and Online Resources
32 / 37
Further Readings
Social network analysis (SNA)
https://en.wikipedia.org/wiki/Social_network_analysis
igraph – a network analysis package, supporting R, Python
and C/C++
http://igraph.org
sna – an R package for social network analysis
https://cran.r-project.org/web/packages/sna/index.html
statnet – software tools for the analysis, simulation and
visualization of network data; also available as an R package
http://www.statnet.org
visNetwork – an R package for network visualization
http://datastorm-open.github.io/visNetwork/
33 / 37
Online Resources
Book titled R and Data Mining: Examples and Case
Studies [Zhao, 2012]
http://www.rdatamining.com/docs/RDataMining-book.pdf
R Reference Card for Data Mining
http://www.rdatamining.com/docs/RDataMining-reference-card.pdf
Free online courses and documents
http://www.rdatamining.com/resources/
RDataMining Group on LinkedIn (27,000+ members)
http://group.rdatamining.com
Twitter (3,300+ followers)
@RDataMining
34 / 37
The End
Thanks!
Email: yanchang(at)RDataMining.com
Twitter: @RDataMining
35 / 37
How to Cite This Work
Citation
Yanchang Zhao. R and Data Mining: Examples and Case Studies. ISBN
978-0-12-396963-7, December 2012. Academic Press, Elsevier. 256
pages. URL: http://www.rdatamining.com/docs/RDataMining-book.pdf.
BibTex
@BOOK{Zhao2012R,
title = {R and Data Mining: Examples and Case Studies},
publisher = {Academic Press, Elsevier},
year = {2012},
author = {Yanchang Zhao},
pages = {256},
month = {December},
isbn = {978-0-123-96963-7},
keywords = {R, data mining},
url = {http://www.rdatamining.com/docs/RDataMining-book.pdf}
}
36 / 37
References I
Zhao, Y. (2012).
R and Data Mining: Examples and Case Studies, ISBN 978-0-12-396963-7.
Academic Press, Elsevier.
37 / 37

More Related Content

What's hot

Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programmingizahn
 
An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)Dataspora
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching moduleSander Timmer
 
R-programming-training-in-mumbai
R-programming-training-in-mumbaiR-programming-training-in-mumbai
R-programming-training-in-mumbaiUnmesh Baile
 
4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply FunctionSakthi Dasans
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factorskrishna singh
 
R basics
R basicsR basics
R basicsFAO
 
Grouping & Summarizing Data in R
Grouping & Summarizing Data in RGrouping & Summarizing Data in R
Grouping & Summarizing Data in RJeffrey Breen
 
R Programming: Mathematical Functions In R
R Programming: Mathematical Functions In RR Programming: Mathematical Functions In R
R Programming: Mathematical Functions In RRsquared Academy
 
R Programming: Importing Data In R
R Programming: Importing Data In RR Programming: Importing Data In R
R Programming: Importing Data In RRsquared Academy
 
2 R Tutorial Programming
2 R Tutorial Programming2 R Tutorial Programming
2 R Tutorial ProgrammingSakthi Dasans
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in RFlorian Uhlitz
 
Data analysis with R
Data analysis with RData analysis with R
Data analysis with RShareThis
 

What's hot (20)

R language
R languageR language
R language
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
R language introduction
R language introductionR language introduction
R language introduction
 
An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching module
 
Language R
Language RLanguage R
Language R
 
R-programming-training-in-mumbai
R-programming-training-in-mumbaiR-programming-training-in-mumbai
R-programming-training-in-mumbai
 
4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors
 
R basics
R basicsR basics
R basics
 
R studio
R studio R studio
R studio
 
Grouping & Summarizing Data in R
Grouping & Summarizing Data in RGrouping & Summarizing Data in R
Grouping & Summarizing Data in R
 
R Programming: Mathematical Functions In R
R Programming: Mathematical Functions In RR Programming: Mathematical Functions In R
R Programming: Mathematical Functions In R
 
R Programming: Importing Data In R
R Programming: Importing Data In RR Programming: Importing Data In R
R Programming: Importing Data In R
 
2 R Tutorial Programming
2 R Tutorial Programming2 R Tutorial Programming
2 R Tutorial Programming
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in R
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in R
 
Data analysis with R
Data analysis with RData analysis with R
Data analysis with R
 
Chapter 10 ds
Chapter 10 dsChapter 10 ds
Chapter 10 ds
 

Similar to RDataMining slides-network-analysis-with-r

DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernelsivaderivader
 
Social network-analysis-in-python
Social network-analysis-in-pythonSocial network-analysis-in-python
Social network-analysis-in-pythonJoe OntheRocks
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)Ankur Dave
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
Graph computation
Graph computationGraph computation
Graph computationSigmoid
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
 
CMSC 350 FINAL PROJECT
CMSC 350 FINAL PROJECTCMSC 350 FINAL PROJECT
CMSC 350 FINAL PROJECTHamesKellor
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Spark Summit
 
software engineering modules iii & iv.pptx
software engineering  modules iii & iv.pptxsoftware engineering  modules iii & iv.pptx
software engineering modules iii & iv.pptxrani marri
 
Dagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphsDagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphsArijit Khan
 
Sigmod11 outsource shortest path
Sigmod11 outsource shortest pathSigmod11 outsource shortest path
Sigmod11 outsource shortest pathredhatdb
 
[1D6]RE-view of Android L developer PRE-view
[1D6]RE-view of Android L developer PRE-view[1D6]RE-view of Android L developer PRE-view
[1D6]RE-view of Android L developer PRE-viewNAVER D2
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...AMD Developer Central
 

Similar to RDataMining slides-network-analysis-with-r (20)

DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
 
Social network-analysis-in-python
Social network-analysis-in-pythonSocial network-analysis-in-python
Social network-analysis-in-python
 
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
GraphX: Graph Analytics in Apache Spark (AMPCamp 5, 2014-11-20)
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
 
Igraph
IgraphIgraph
Igraph
 
Graph computation
Graph computationGraph computation
Graph computation
 
QMC: Undergraduate Workshop, Tutorial on 'R' Software - Yawen Guan, Feb 26, 2...
QMC: Undergraduate Workshop, Tutorial on 'R' Software - Yawen Guan, Feb 26, 2...QMC: Undergraduate Workshop, Tutorial on 'R' Software - Yawen Guan, Feb 26, 2...
QMC: Undergraduate Workshop, Tutorial on 'R' Software - Yawen Guan, Feb 26, 2...
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
 
K10765 Matlab 3D Mesh Plots
K10765 Matlab 3D Mesh PlotsK10765 Matlab 3D Mesh Plots
K10765 Matlab 3D Mesh Plots
 
CMSC 350 FINAL PROJECT
CMSC 350 FINAL PROJECTCMSC 350 FINAL PROJECT
CMSC 350 FINAL PROJECT
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
 
R basics
R basicsR basics
R basics
 
software engineering modules iii & iv.pptx
software engineering  modules iii & iv.pptxsoftware engineering  modules iii & iv.pptx
software engineering modules iii & iv.pptx
 
Nx tutorial basics
Nx tutorial basicsNx tutorial basics
Nx tutorial basics
 
Dagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphsDagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphs
 
Sigmod11 outsource shortest path
Sigmod11 outsource shortest pathSigmod11 outsource shortest path
Sigmod11 outsource shortest path
 
[1D6]RE-view of Android L developer PRE-view
[1D6]RE-view of Android L developer PRE-view[1D6]RE-view of Android L developer PRE-view
[1D6]RE-view of Android L developer PRE-view
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
 

More from Yanchang Zhao

RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rYanchang Zhao
 
RDataMining slides-data-exploration-visualisation
RDataMining slides-data-exploration-visualisationRDataMining slides-data-exploration-visualisation
RDataMining slides-data-exploration-visualisationYanchang Zhao
 
RDataMining slides-association-rule-mining-with-r
RDataMining slides-association-rule-mining-with-rRDataMining slides-association-rule-mining-with-r
RDataMining slides-association-rule-mining-with-rYanchang Zhao
 
RDataMining-reference-card
RDataMining-reference-cardRDataMining-reference-card
RDataMining-reference-cardYanchang Zhao
 
Text Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter DataText Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter DataYanchang Zhao
 
Association Rule Mining with R
Association Rule Mining with RAssociation Rule Mining with R
Association Rule Mining with RYanchang Zhao
 
Time Series Analysis and Mining with R
Time Series Analysis and Mining with RTime Series Analysis and Mining with R
Time Series Analysis and Mining with RYanchang Zhao
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with RYanchang Zhao
 
Data Clustering with R
Data Clustering with RData Clustering with R
Data Clustering with RYanchang Zhao
 
Data Exploration and Visualization with R
Data Exploration and Visualization with RData Exploration and Visualization with R
Data Exploration and Visualization with RYanchang Zhao
 
Introduction to Data Mining with R and Data Import/Export in R
Introduction to Data Mining with R and Data Import/Export in RIntroduction to Data Mining with R and Data Import/Export in R
Introduction to Data Mining with R and Data Import/Export in RYanchang Zhao
 
An Introduction to Data Mining with R
An Introduction to Data Mining with RAn Introduction to Data Mining with R
An Introduction to Data Mining with RYanchang Zhao
 
Time series-mining-slides
Time series-mining-slidesTime series-mining-slides
Time series-mining-slidesYanchang Zhao
 
R Reference Card for Data Mining
R Reference Card for Data MiningR Reference Card for Data Mining
R Reference Card for Data MiningYanchang Zhao
 

More from Yanchang Zhao (14)

RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-r
 
RDataMining slides-data-exploration-visualisation
RDataMining slides-data-exploration-visualisationRDataMining slides-data-exploration-visualisation
RDataMining slides-data-exploration-visualisation
 
RDataMining slides-association-rule-mining-with-r
RDataMining slides-association-rule-mining-with-rRDataMining slides-association-rule-mining-with-r
RDataMining slides-association-rule-mining-with-r
 
RDataMining-reference-card
RDataMining-reference-cardRDataMining-reference-card
RDataMining-reference-card
 
Text Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter DataText Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter Data
 
Association Rule Mining with R
Association Rule Mining with RAssociation Rule Mining with R
Association Rule Mining with R
 
Time Series Analysis and Mining with R
Time Series Analysis and Mining with RTime Series Analysis and Mining with R
Time Series Analysis and Mining with R
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with R
 
Data Clustering with R
Data Clustering with RData Clustering with R
Data Clustering with R
 
Data Exploration and Visualization with R
Data Exploration and Visualization with RData Exploration and Visualization with R
Data Exploration and Visualization with R
 
Introduction to Data Mining with R and Data Import/Export in R
Introduction to Data Mining with R and Data Import/Export in RIntroduction to Data Mining with R and Data Import/Export in R
Introduction to Data Mining with R and Data Import/Export in R
 
An Introduction to Data Mining with R
An Introduction to Data Mining with RAn Introduction to Data Mining with R
An Introduction to Data Mining with R
 
Time series-mining-slides
Time series-mining-slidesTime series-mining-slides
Time series-mining-slides
 
R Reference Card for Data Mining
R Reference Card for Data MiningR Reference Card for Data Mining
R Reference Card for Data Mining
 

Recently uploaded

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

RDataMining slides-network-analysis-with-r

  • 1. Social Network Analysis with R ∗ Yanchang Zhao http://www.RDataMining.com R and Data Mining Course Beijing University of Posts and Telecommunications, Beijing, China July 2019 ∗ Chapter 11: Social Network Analysis, in R and Data Mining: Examples and Case Studies. http://www.rdatamining.com/docs/RDataMining-book.pdf 1 / 37
  • 2. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 2 / 37
  • 3. Network and Graph Nodes, vertices or entities Edges, links or relationships Network analysis, graph mining Link prediction, community/group detection, entity resolution, recommender system, information propogation modeling 3 / 37
  • 4. Graph Databases Neo4j: https://neo4j.com Giraph on Hadoop: http://giraph.apache.org GraphX on Spark: http://spark.apache.org/graphx/ 4 / 37
  • 5. Social Network Analysis Graph construction Graph query Centrality measures Graph visualization Clustering and community detection 5 / 37
  • 6. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 6 / 37
  • 7. Graph Construction Tom, Ben, Bob and Mary are friends of John. Alice and Wendy are friends of Mary. Wendy is a friend of David. library(igraph) # nodes nodes <- data.frame( name = c("Tom","Ben","Bob","John","Mary","Alice","Wendy","David"), gender = c("M", "M", "M", "M", "F", "F", "F", "M"), age = c( 16, 30, 42, 29, 26, 32, 18, 22) ) # relations edges <- data.frame( from = c("Tom", "Ben", "Bob", "Mary", "Alice", "Wendy", "Wendy"), to = c("John", "John", "John", "John","Mary", "Mary", "David") ) # build a graph object g <- graph.data.frame(edges, directed=F, vertices=nodes) 7 / 37
  • 8. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 8 / 37
  • 9. Graph Visualization layout1 <- g %>% layout_nicely() ## save layout for reuse g %>% plot(vertex.size = 30, layout = layout1) Tom Ben Bob John Mary AliceWendy David 9 / 37
  • 10. Graph Visualization (cont.) ## use blue for male and pink for female colors <- ifelse(V(g)$gender=="M", "skyblue", "pink") g %>% plot(vertex.size=30, vertex.color=colors, layout=layout1) Tom Ben Bob John Mary AliceWendy David 10 / 37
  • 11. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 11 / 37
  • 12. Graph Query ## nodes V(g) ## + 8/8 vertices, named, from 8dfec3f: ## [1] Tom Ben Bob John Mary Alice Wendy David ## edges E(g) ## + 7/7 edges from 8dfec3f (vertex names): ## [1] Tom --John Ben --John Bob --John John --Mary ## [5] Mary --Alice Mary --Wendy Wendy--David ## immediate neighbors (friends) of John friends <- ego(g,order=1,nodes="John",mindist=1)[[1]] %>% print() ## + 4/8 vertices, named, from 8dfec3f: ## [1] Tom Ben Bob Mary ## female friends of John friends[friends$gender == "F"] ## + 1/8 vertex, named, from 8dfec3f: ## [1] Mary 12 / 37
  • 13. Graph Query (cont.) ## 1- and 2-order neighbors (friends) of John g2 <- make_ego_graph(g, order=2, nodes="John")[[1]] g2 %>% plot(vertex.size=30, vertex.color=colors) Tom Ben Bob John Mary Alice Wendy 13 / 37
  • 14. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 14 / 37
  • 16. Centrality Measures Degree: the number of adjacent edges; indegree and outdegree for directed graphs Closeness: the inverse of the average length of the shortest paths to/from all other nodes Betweenness: the number of shortest paths going through a node degree <- g %>% degree() %>% print() ## Tom Ben Bob John Mary Alice Wendy David ## 1 1 1 4 3 1 2 1 closeness <- g %>% closeness() %>% round(2) %>% print() ## Tom Ben Bob John Mary Alice Wendy David ## 0.06 0.06 0.06 0.09 0.09 0.06 0.07 0.05 betweenness <- g %>% betweenness() %>% print() ## Tom Ben Bob John Mary Alice Wendy David ## 0 0 0 15 14 0 6 0 16 / 37
  • 17. Centrality Measures (cont.) Eigenvector centrality: the values of the first eigenvector of the graph adjacency matrix Transivity, a.k.a clustering coefficient, measures the probability that the adjacent nodes of a node are connected. eigenvector <- evcent(g)$vector %>% round(2) %>% print() ## Tom Ben Bob John Mary Alice Wendy David ## 0.45 0.45 0.45 1.00 0.85 0.38 0.48 0.22 transitivity <- g %>% transitivity(type = "local") %>% print() ## [1] NaN NaN NaN 0 0 NaN 0 NaN 17 / 37
  • 18. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 18 / 37
  • 19. Static Network Visualization Static network visualization Fast in rendering big graphs For very big graphs, the most efficient way is to save visualization result into a file, instead of directly to screen. Save network diagram into files: pdf(), bmp(), jpeg(), png(), tiff() library(igraph) ## plot directly to screen when graph is small plot(g) ## for big graphs, save visualization to a PDF file pdf("mygraph.pdf") plot(g) graphics.off() ## or dev.off() 19 / 37
  • 20. Interactive Network Visualization Coordinates of other nodes are not adjusted when moving a node. Can be slow when rendering big graphs Save network diagram into files: visSave(), visExport() visIgraph(g, idToLabel=T) %>% ## highlight nodes connected to a selected node visOptions(highlightNearest=T) %>% ## use different icons for different types (groups) of nodes visGroups(groupname="person", shape="icon", icon=list(code="f007")) %>% ... %>% ## use FontAwesome icons addFontAwesome() %>% ## add legend of nodes visLegend() %>% ## to save to file visSave(file = "network.html") 20 / 37
  • 21. Interactive Network Visualization (cont.) Dynamically adjusting coordinates for better visualization Very slow when rendering big graphs x <- toVisNetworkData(g) visNetwork(nodes=x$nodes, edges=x$edges)%>% ## use different icons for different types (groups) of nodes visGroups(groupname="person", shape="icon", icon=list(code="f007")) %>% ... %>% ## use FontAwesome icons addFontAwesome() %>% ## add legend of nodes visLegend() 21 / 37
  • 22. Load Graph Data ## download graph data url <- "http://www.rdatamining.com/data/graph.rdata" download.file(url, destfile = "./data/graph.rdata") library(igraph) # load graph data into R # what will be loaded: g, nodes, edges load("../data/graph.rdata") 22 / 37
  • 23. Build a Graph head(nodes, 3) ## name type ## 1 T9 tid ## 2 T24 tid ## 3 T13 tid head(edges, 3) ## from to ## 1 T9 P27 ## 2 T24 P8 ## 3 T13 P2 ## buid a graph object g <- graph.data.frame(edges, directed = F, vertices = nodes) g ## IGRAPH 9597c42 UN-B 61 60 -- ## + attr: name (v/c), type (v/c) ## + edges from 9597c42 (vertex names): ## [1] T9 --P27 T24--P8 T13--P2 T27--P10 T29--P29 T2 --P27 ## [7] T16--P21 T27--P20 T17--P30 T14--P20 T29--P22 T14--P17 ## [13] T21--P18 T18--P9 T4 --P5 T9 --A29 T24--A28 T13--A21 23 / 37
  • 24. Example of Static Network Visualization library(igraph) plot(g, vertex.size=12, vertex.label.cex=0.7, vertex.color=as.factor(V(g)$type), vertex.frame.color=NA) T9 T24 T13 T27 T29 T2 T16 T17 T14 T21 T18 T4 P27 P8 P2 P10 P29 P21 P20 P30 P22 P17 P18 P9 P5 A29 A28 A21 A24 A1A15 A23 A7 A10 A5 A13 A12 N5 N7 N14 N8 N26 N2 N24 N4 N17 N23 N27 N12 E20 E3 E12 E9 E25 E14 E24 E23 E19 E22 E1 E15 24 / 37
  • 25. Example of Interactive Network Visualization library(visNetwork) V(g)$group <- V(g)$type ## visualization data <- toVisNetworkData(g) visNetwork(nodes=data$nodes, edges=data$edges) %>% visGroups(groupname="tid",shape="icon",icon=list(code="f15c")) %>% visGroups(groupname="person",shape="icon",icon=list(code="f007")) %>% visGroups(groupname="addr",shape="icon",icon=list(code="f015")) %>% visGroups(groupname="phone",shape="icon",icon=list(code="f095")) %>% visGroups(groupname="email",shape="icon",icon=list(code="f0e0")) %>% addFontAwesome() %>% visLegend() 25 / 37
  • 27. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 27 / 37
  • 28. R Packages Network analysis: igraph, sna, statnet Network visualization: visNetwork Interface with graph databases: RNeo4j 28 / 37
  • 29. Package igraph † V(g), E(g): nodes and edges of graph g degree, betweenness, closeness, transitivity: various centrality scores neighborhood: neighborhood of graph vertices cliques, largest.cliques, maximal.cliques, clique.number: find cliques, ie. complete subgraphs clusters, no.clusters: maximal connected components of a graph and the number of them fastgreedy.community, spinglass.community: community detection cohesive.blocks: calculate cohesive blocks induced.subgraph: create a subgraph of a graph (igraph) read.graph, write.graph: read and writ graphs from and to files of various formats † https://cran.r-project.org/package=igraph 29 / 37
  • 30. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 30 / 37
  • 31. Wrap Up Package igraph and sna Static visualization Can visualize nodes with shapes, images and icons Visualise very large graph Support network analysis and graph mining Package visNetwork Interactive visualization Can visualize nodes with shapes, images and icons Image rendering can be very slow for large graphs Designed for visualization only, and does not support network analysis and graph mining 31 / 37
  • 32. Contents Graph and Social Network Analysis Graph Construction Graph Visualization Graph Query Centrality Measures Advanced Graph Visualization R Packages Wrap Up Further Readings and Online Resources 32 / 37
  • 33. Further Readings Social network analysis (SNA) https://en.wikipedia.org/wiki/Social_network_analysis igraph – a network analysis package, supporting R, Python and C/C++ http://igraph.org sna – an R package for social network analysis https://cran.r-project.org/web/packages/sna/index.html statnet – software tools for the analysis, simulation and visualization of network data; also available as an R package http://www.statnet.org visNetwork – an R package for network visualization http://datastorm-open.github.io/visNetwork/ 33 / 37
  • 34. Online Resources Book titled R and Data Mining: Examples and Case Studies [Zhao, 2012] http://www.rdatamining.com/docs/RDataMining-book.pdf R Reference Card for Data Mining http://www.rdatamining.com/docs/RDataMining-reference-card.pdf Free online courses and documents http://www.rdatamining.com/resources/ RDataMining Group on LinkedIn (27,000+ members) http://group.rdatamining.com Twitter (3,300+ followers) @RDataMining 34 / 37
  • 36. How to Cite This Work Citation Yanchang Zhao. R and Data Mining: Examples and Case Studies. ISBN 978-0-12-396963-7, December 2012. Academic Press, Elsevier. 256 pages. URL: http://www.rdatamining.com/docs/RDataMining-book.pdf. BibTex @BOOK{Zhao2012R, title = {R and Data Mining: Examples and Case Studies}, publisher = {Academic Press, Elsevier}, year = {2012}, author = {Yanchang Zhao}, pages = {256}, month = {December}, isbn = {978-0-123-96963-7}, keywords = {R, data mining}, url = {http://www.rdatamining.com/docs/RDataMining-book.pdf} } 36 / 37
  • 37. References I Zhao, Y. (2012). R and Data Mining: Examples and Case Studies, ISBN 978-0-12-396963-7. Academic Press, Elsevier. 37 / 37