3. About Me
Marc A. Smith
Chief Social Scientist
Connected Action Consulting Group
Marc@connectedaction.net
http://www.connectedaction.net
http://www.codeplex.com/nodexl
http://www.twitter.com/marc_smith
http://delicious.com/marc_smith/Paper
http://www.flickr.com/photos/marc_smith
http://www.facebook.com/marc.smith.sociologist
http://www.linkedin.com/in/marcasmith
http://www.slideshare.net/Marc_A_Smith
http://www.smrfoundation.org
April 10-12, Chicago, IL
13. A network is born whenever two GUIDs are joined.
Vertices (A, B):
Username      Attributes
@UserName1    Value, value
@UserName2    Value, value
Edge (A to B):
Vertex1       Vertex2       “Edge” Attribute   “Vertex1” Attribute   “Vertex2” Attribute
@UserName1    @UserName2    value              value                 value
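The vertex and edge tables on this slide can be sketched as two small record types. This is only an illustration: the class names, the `attributes` dictionaries, and the "mentions" label are assumptions made for the example, not NodeXL's own schema.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str  # unique identifier, e.g. a username
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    vertex1: str  # identifier of the first endpoint
    vertex2: str  # identifier of the second endpoint
    attributes: dict = field(default_factory=dict)


# Hypothetical placeholder data mirroring the slide's two tables.
vertices = {
    "@UserName1": Vertex("@UserName1", {"attribute": "value"}),
    "@UserName2": Vertex("@UserName2", {"attribute": "value"}),
}

# Joining two identifiers creates one edge, and with it a network.
edges = [Edge("@UserName1", "@UserName2", {"relationship": "mentions"})]

# Every edge endpoint should refer to a known vertex.
assert all(e.vertex1 in vertices and e.vertex2 in vertices for e in edges)
print(len(vertices), "vertices,", len(edges), "edge")
```

Keeping vertices in a dictionary keyed by identifier makes it cheap to check that an edge refers to known endpoints.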
16. Social Networks
History: from the dawn of time!
Theory and method: 1934 -> Jacob L. Moreno
http://en.wikipedia.org/wiki/Jacob_L._Moren
Jacob Moreno’s early social network diagram of positive and negative relationships among members of a football team.
Originally published in Moreno, J. L. (1934). Who shall survive? Washington, DC: Nervous and Mental Disease Publishing Company.
18. Social network diagram of relationships among workers in a factory illustrates the positions
different workers occupy within the workgroup.
Originally published in Roethlisberger, F., and Dickson, W. (1939). Management and
the worker. Cambridge, UK: Cambridge University Press.
41. Welser, Howard T., Eric Gleave, Danyel Fisher, and Marc Smith. 2007.
Visualizing the Signatures of Social Roles in Online Discussion Groups.
The Journal of Social Structure. 8(2).
Figure labels: Experts & “Answer People”, Discussion people, Discussion starters, Topic setters
42. NodeXL: Network Overview Discovery and Exploration add-in for Excel 2007/2010
A minimal network can illustrate
the ways different locations
have different values for
centrality and degree
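As a rough illustration of the point above, the sketch below counts degree and degree centrality for a small, made-up undirected network. The node names and edges are hypothetical and are not taken from the slide's figure.

```python
from collections import defaultdict

# Hypothetical undirected network: D is a hub, E links the hub to F.
edges = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]

neighbors = defaultdict(set)
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

n = len(neighbors)
degree = {node: len(adj) for node, adj in neighbors.items()}
# Degree centrality: share of the other n - 1 nodes a node is tied to.
degree_centrality = {node: d / (n - 1) for node, d in degree.items()}

for node in sorted(degree):
    print(node, degree[node], round(degree_centrality[node], 2))
```

Even in a network this small, the hub, the bridge, and the leaves end up with clearly different values.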
52. SNA questions for social media:
1. What does my topic network look like?
2. What does the topic I aspire to be look like?
3. What is the difference between #1 and #2?
4. How does my map change as I intervene?
What do #SQLPass and #PASSBAC look like?
59. Social Network Theory
http://en.wikipedia.org/wiki/Social_network
Central tenet
Social structure emerges from the aggregate of relationships (ties)
among members of a population
Phenomena of interest
Emergence of cliques and clusters from patterns of relationships
Centrality (core), periphery (isolates), betweenness
Methods
Surveys, interviews, observations, log file analysis, computational analysis of matrices
(Hampton & Wellman, 1999; Paolillo, 2001; Wellman, 2001)
Source: Richards, W. (1986). The NEGOPY network analysis program. Burnaby, BC: Department of Communication, Simon Fraser University. pp. 7-16
60. SNA 101
[Figure: example network with nodes A through I]
• Node
  – “actor” on which relationships act; 1-mode versus 2-mode networks
• Edge
  – Relationship connecting nodes; can be directional
• Cohesive Sub-Group
  – Well-connected group; clique; cluster (A, B, D, E in the figure)
• Key Metrics
  – Centrality (group or individual measure)
    • Number of direct connections that individuals have with others in the group (usually look at incoming connections only)
    • Measure at the individual node or group level
  – Cohesion (group measure)
    • Ease with which a network can connect
    • Aggregate measure of shortest path between each node pair at network level reflects average distance
  – Density (group measure)
    • Robustness of the network
    • Number of connections that exist in the group out of 100% possible
  – Betweenness (individual measure)
    • # shortest paths between each node pair that a node is on
    • Measure at the individual node level
• Node roles
  – Peripheral – below average centrality (C in the figure)
  – Central connector – above average centrality (D in the figure)
  – Broker – above average betweenness (E in the figure)
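The group-level and node-level measures listed above can be sketched in plain Python. The five-node network below is hypothetical (it is not the slide's figure), the graph is treated as undirected and connected, and betweenness is counted over unordered node pairs; other tools may normalize these values differently.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected, connected network: a triangle with a tail.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
nodes = sorted({n for e in edges for n in e})
adj = {n: set() for n in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)


def bfs(source):
    """Return (distances, shortest-path counts) from source to each node."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                paths[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                paths[w] += paths[u]
    return dist, paths


info = {n: bfs(n) for n in nodes}
pairs = list(combinations(nodes, 2))

# Density: existing ties out of all possible ties.
density = len(edges) / len(pairs)
# Cohesion: average shortest-path distance over all node pairs.
avg_distance = sum(info[s][0][t] for s, t in pairs) / len(pairs)

# Betweenness: share of shortest paths between each pair that pass through v.
betweenness = {n: 0.0 for n in nodes}
for s, t in pairs:
    d_st, sigma_st = info[s][0][t], info[s][1][t]
    for v in nodes:
        if v not in (s, t) and info[s][0][v] + info[v][0][t] == d_st:
            betweenness[v] += info[s][1][v] * info[v][1][t] / sigma_st

# Node roles, using degree as the centrality measure.
degree = {n: len(adj[n]) for n in nodes}
mean_degree = sum(degree.values()) / len(nodes)
mean_betweenness = sum(betweenness.values()) / len(nodes)
peripheral = [n for n in nodes if degree[n] < mean_degree]
connectors = [n for n in nodes if degree[n] > mean_degree]
brokers = [n for n in nodes if betweenness[n] > mean_betweenness]

print(density, avg_distance, betweenness)
print(peripheral, connectors, brokers)
```

In this example the node that closes the triangle and starts the tail is both a central connector and a broker, while the end of the tail is peripheral.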
61. NodeXL: Free/Open Social Network Analysis add-in for Excel 2007/2010
makes graph theory as easy as a pie chart, with integrated analysis of social
media sources. See: http://nodexl.codeplex.com
63. Goal: Make SNA easier
• Existing Social Network Tools are challenging for many novice users
• Tools like Excel are widely used
• Leveraging a spreadsheet as a host for SNA lowers barriers to
network data analysis and display
68. The Content summary
spreadsheet displays the most
frequently used URLs, hashtags,
and user names within the
network as a whole and within
each calculated sub-group.
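A content summary of this kind can be approximated with simple frequency counts. The sketch below is not NodeXL's implementation: the sample tweets, group numbers, and regular expressions are assumptions chosen for illustration, and the patterns are deliberately rough.

```python
import re
from collections import Counter

# Hypothetical sample: (group id, tweet text) pairs.
tweets = [
    (1, "Loving #nodexl maps http://example.org/a @alice"),
    (1, "More #nodexl and #sna http://example.org/a"),
    (2, "@bob see #sna http://example.org/b"),
]

# Rough patterns; real tweet parsing needs more care.
PATTERNS = {
    "urls": re.compile(r"https?://\S+"),
    "hashtags": re.compile(r"#\w+"),
    "users": re.compile(r"@\w+"),
}


def summarize(texts, top=3):
    """Return the most frequent URLs, hashtags, and user names in texts."""
    summary = {}
    for kind, pattern in PATTERNS.items():
        counts = Counter()
        for text in texts:
            counts.update(pattern.findall(text))
        summary[kind] = counts.most_common(top)
    return summary


# Summary for the network as a whole.
overall = summarize([text for _, text in tweets])

# Summary within each sub-group.
groups = {}
for group, text in tweets:
    groups.setdefault(group, []).append(text)
by_group = {g: summarize(texts) for g, texts in groups.items()}

print(overall)
print(by_group)
```

Running the same counting function over the whole network and over each group keeps the two views directly comparable.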
75. Social Media Research Foundation
People: University Faculty, Students, Industry, Independent Researchers, Developers
Disciplines: Computer Science, HCI, CSCW, Machine Learning, Information Visualization, UI/UX, Social Science/Sociology, Network Analysis, Collective Action
Institutions: University of Maryland, Oxford Internet Institute, Stanford University, Microsoft Research, Illinois Institute of Technology, Connected Action, Cornell, Morningside Analytics
76. What we are trying to do:
Open Tools, Open Data, Open Scholarship
Build the “Firefox of GraphML” – open tools for collecting and
visualizing social media data
Connect users to network analysis – make
network charts as easy as making a pie chart
Connect researchers to social media data sources
Archive: Be the “Allen Very Large Telescope Array” for Social
Media data – coordinate and aggregate the results of many users’
data collection and analysis
Create open access research papers & findings
Make “collections of connections” easy for users to manage
77. What we have done: Open Tools
NodeXL
Data providers (“spigots”)
• ThreadMill Message Board
• Exchange Enterprise Email
• Voson Hyperlink
• SharePoint
• Facebook
• Twitter
• YouTube
• Flickr
78. What we have done: Open Data
NodeXLGraphGallery.org
• User generated collection of network
graphs, datasets and annotations
• Collective repository for the research
community
• Published collections of data from a
range of social media data sources to
help students and researchers connect
with data of interest and relevance
81. What we want to do:
(Build the tools to) map the social web
Move NodeXL to the web: (Node[NOT]XL)
• Node for Google Doc Spreadsheets?
• WebGL Canvas? D3.JS? Sigma.JS
Connect to more data sources of interest:
• RDF, Gmail, NYT, Citation Networks
Solve hard network manipulation UI problems:
• Modal transform, Time series, Automated layouts
Grow and maintain archives of social media network data sets for research use.
Improve network science education:
• Workshops on social media network analysis
• Live lectures and presentations
• Videos and training materials
82. How you can help
Sponsor a feature
Sponsor workshops
Sponsor a student
Schedule training
Sponsor the foundation
Donate your money, code, computation, storage, bandwidth, data or
employees’ time
Help promote the work of the Social Media Research Foundation
83. Charting Collections of Social
Media Connections with NodeXL
Maps and reports for social media networks
April 10-12, Chicago, IL
A tutorial on analyzing social media networks is available from: casci.umd.edu/NodeXL_Teaching
Different positions within a network can be measured using network metrics.
The network of connections among people who tweeted “#My2K” over the 1-day, 21-hour, 39-minute period from Sunday, 06 January 2013 at 03:30 UTC to Tuesday, 08 January 2013 at 01:09 UTC.
The graph represents a network of 268 Twitter users whose recent tweets contained "#cmgrchat OR #smchat". The network was obtained on Friday, 18 January 2013 at 15:44 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The tweets were made over the 3-day, 21-hour, 15-minute period from Monday, 14 January 2013 at 18:23 UTC to Friday, 18 January 2013 at 15:38 UTC.
The graph represents a network of 1,227 Twitter users whose recent tweets contained "lumia". The network was obtained on Saturday, 12 January 2013 at 19:52 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The tweets were made over the 5-hour, 1-minute period from Saturday, 12 January 2013 at 14:36 UTC to Saturday, 12 January 2013 at 19:37 UTC.
The graph represents a network of 1,260 Twitter users whose recent tweets contained "flotus". The network was obtained on Friday, 18 January 2013 at 18:26 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The tweets were made over the 3-hour, 3-minute period from Friday, 18 January 2013 at 15:16 UTC to Friday, 18 January 2013 at 18:20 UTC.
The graph represents a network of 399 Twitter users whose recent tweets contained "http://www.nytimes.com/2013/01/11/opinion/krugman-coins-against-crazies.html". The network was obtained on Friday, 11 January 2013 at 14:27 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The tweets were made over the 12-hour, 32-minute period from Friday, 11 January 2013 at 01:52 UTC to Friday, 11 January 2013 at 14:24 UTC.
The graph represents a network of 388 Twitter users whose recent tweets contained "delllistens OR dellcares". The network was obtained on Tuesday, 19 February 2013 at 17:44 UTC. There is an edge for each follows relationship. There is an edge for each "replies-to" relationship in a tweet. There is an edge for each "mentions" relationship in a tweet. There is a self-loop edge for each tweet that is not a "replies-to" or "mentions". The tweets were made over the 6-day, 21-hour, 58-minute period from Tuesday, 12 February 2013 at 19:34 UTC to Tuesday, 19 February 2013 at 17:33 UTC.
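The edge rules repeated in these graph descriptions (follows, "replies-to", "mentions", and a self-loop for any other tweet) can be sketched as follows. The record fields (`author`, `reply_to`, `mentions`), the sample users, and the "tweet" label for self-loops are hypothetical; this shows the rule as stated in the descriptions, not NodeXL's importer.

```python
# Hypothetical tweet records; field names are illustrative only.
tweets = [
    {"author": "ann", "reply_to": "bob", "mentions": []},
    {"author": "bob", "reply_to": None, "mentions": ["cat", "dan"]},
    {"author": "cat", "reply_to": None, "mentions": []},
]
follows = [("ann", "bob")]  # (follower, followed)


def build_edges(tweets, follows):
    """Return (source, target, relationship) triples."""
    edges = [(a, b, "follows") for a, b in follows]
    for t in tweets:
        author = t["author"]
        if t["reply_to"]:
            edges.append((author, t["reply_to"], "replies-to"))
        for user in t["mentions"]:
            edges.append((author, user, "mentions"))
        if not t["reply_to"] and not t["mentions"]:
            # A tweet that addresses nobody becomes a self-loop.
            edges.append((author, author, "tweet"))
    return edges


edges = build_edges(tweets, follows)
for edge in edges:
    print(edge)
```

Storing the relationship type on each edge lets the same list be filtered later, for example to map only replies.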