1. Exploring the Structure of Government
on the Web
Presentation by Robert Ackland at DISC2013,
12-14 December 2013, Daegu, South Korea
Robert Ackland (Australian National University)
Paul Henman (University of Queensland)
Tim Graham (University of Queensland)
Homepage: https://researchers.anu.edu.au/researchers/ackland-rj
Project: http://voson.anu.edu.au
2. VOSON Project at the ANU (http://voson.anu.edu.au): Teaching,
research and tool development in areas of computational social
science, network science, web science since 2003
3. Background
● Government use of the Internet has rapidly evolved.
● While this evolution has been examined in terms of the
content, usability and interactivity of sites, the institutional
structure of government on the web is less explored.
● Australian Research Council-funded project titled "The
institutional structure of e-government: a cross-policy,
cross-country comparison" (Henman, Ackland, Margetts)
4. Overall aims of project
● Aim 1: Assess whether government hyperlink networks reflect
offline institutional structures
● Is e-government facilitating joined-up government, or are
jurisdictional boundaries still a significant barrier?
● Whalen (2011) studied the hyperlink structure of the US .gov
domain, assessing the correspondence between the online structure of
the US government and its offline hierarchy.
● The major difference is that our project compares the UK and Australia, identifying
both similarities and contrasts in the relationship between institutional
structure and online presence.
5. ● Aim 2: Use hyperlink data to assess the “nodality” of government (Hood &
Margetts 2007): is government at the centre of informational networks on
the Web?
● Nodality affects whether government messages are received by the population.
● The Web might increase government nodality, but can also decrease it
through increased competition from other information providers (who may
destabilise/confuse/subvert the messages and actions of government).
Example: anti-vaccination lobby groups.
● We ask: is government using the web to enhance its visibility? Are there
differences in nodality across policy domains and countries (AU and UK)?
● Our approach differs from that used by Escher et al. (2006):
  ● Escher et al. focused only on the UK Foreign Office (and its US and Australian
  counterparts); our analysis includes other sectors of government, allowing cross-country and cross-sector comparisons.
  ● We collect more hyperlink data, allowing us to identify the connections between sites
  that link to (or are linked to by) government sites. We can construct nodality
  measures that are different from those used by Escher et al. (e.g. those requiring
  complete network data).
6. Webometrics (link count analysis)
● focus on egonetworks, rather than complete networks
● typically only know attributes of ego, not alters
7. Today – some methodological aspects
● Hyperlink network data collection (VOSON)
● Network reduction techniques
● Community structure in government hyperlink networks
● Coding websites (machine learning)
9. ● Manually identified AU and UK government seed pages (typically, entry pages
to government websites):
  ● AU – 88 pages
  ● UK – 92 pages
● Used the VOSON software (http://voson.anu.edu.au) to construct hyperlink
network data using a two-stage approach:
  ● Stage 1:
    ● The VOSON in-built crawler crawled the seed sites, finding internal pages linked to from the entry
    page. It collected outbound links from each of the internal pages, and also text content.
    ● The Bing API was used to find all inbound links to each of the internal pages (including the seed page).
  ● Stage 2:
    ● Every new page discovered above (i.e. pages that either link to or are linked to by a government
    web page) was then crawled by the VOSON in-built crawler to find connections among these pages.
● Data collected in 2012
11. VOSON 2.0 web interface works with Firefox, Chrome, Safari, iPad
VOSON+NodeXL allows construction and import of hyperlink networks
from within NodeXL
13. ● Network size (pages):
  ● AU: 1,517,020 nodes (pages)
  ● UK: 1,588,757 nodes (pages)
● First major network reduction technique: construct a network
of websites rather than pages
  ● VOSON has an approach for automatically grouping pages into
  “pagegroups”
  ● e.g. for AU, 6,694 pages from the Australian Taxation Office are all
  included in a single node “ato.gov.au”
● Full network size (pagegroups/sites):
  ● AU: 110,665 nodes (pagegroups), 290,031 edges
  ● UK: 109,161 nodes (pagegroups), 280,580 edges
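The reduction from pages to sites can be illustrated with a minimal sketch. The slides do not say how VOSON forms pagegroups, so the rule used here (hostname with a leading "www." removed) and the function names `pagegroup` and `collapse_to_sites` are assumptions for illustration only. The edge weight counts distinct pages on the source site that link to the target site, which matches how the deck later describes its edge weights.

```python
from collections import defaultdict
from urllib.parse import urlparse


def pagegroup(url):
    """Hypothetical grouping rule: the hostname without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def collapse_to_sites(page_links):
    """Collapse page-level links (source_url, target_url) into site-level edges.

    The weight of edge (a, b) is the number of distinct pages on site a
    that contain at least one link to site b. Links within a site are dropped.
    """
    linking_pages = defaultdict(set)
    for src, dst in page_links:
        a, b = pagegroup(src), pagegroup(dst)
        if a and b and a != b:
            linking_pages[(a, b)].add(src)
    return {edge: len(pages) for edge, pages in linking_pages.items()}


# Made-up page-level links, not data from the project.
page_links = [
    ("http://www.ato.gov.au/a.html", "http://www.treasury.gov.au/"),
    ("http://www.ato.gov.au/b.html", "http://treasury.gov.au/budget"),
    ("http://www.ato.gov.au/a.html", "http://www.ato.gov.au/b.html"),
]
site_edges = collapse_to_sites(page_links)
print(site_edges)
```

Grouping by hostname is the simplest possible rule; a real pagegroup rule would also need to decide how to treat subdomains and hosted pages.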
14. ● Gephi map of the UK network, showing only the 30K+ nodes with
indegree+outdegree>1. There is not much analytical potential in this
visualisation.
15. ●
In future work we will be investigating
approaches for removing edges to reveal
the “backbone” of UK and AU government
hyperlink networks
● e.g. Serrano, M., Boguñá, M. and A.
Vespignani (2009): “Extracting the
multiscale backbone of complex weighted
networks,” PNAS, 106(16), 6483-6488.
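As a rough sketch of what such edge removal could look like, the following implements the disparity filter from the Serrano et al. paper for a directed, weighted edge list. An edge of weight w at a node with strength s and degree k gets the p-value (1 - w/s)^(k-1), and is kept if it is significant from either endpoint. The function name `disparity_backbone`, the threshold `alpha=0.05` and the toy weights are assumptions, not choices made in the project.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges that are significant under the disparity filter.

    edges: dict {(i, j): weight} for a directed, weighted graph.
    Each edge is tested against the out-edges of i and the in-edges of j.
    """
    out_strength, in_strength = defaultdict(float), defaultdict(float)
    out_degree, in_degree = defaultdict(int), defaultdict(int)
    for (i, j), w in edges.items():
        out_strength[i] += w
        out_degree[i] += 1
        in_strength[j] += w
        in_degree[j] += 1

    def p_value(w, strength, degree):
        # With a single edge there is nothing to compare against,
        # so the edge is never significant from that side.
        if degree < 2:
            return 1.0
        return (1.0 - w / strength) ** (degree - 1)

    kept = {}
    for (i, j), w in edges.items():
        p_out = p_value(w, out_strength[i], out_degree[i])
        p_in = p_value(w, in_strength[j], in_degree[j])
        if min(p_out, p_in) < alpha:
            kept[(i, j)] = w
    return kept


# Toy example: site "a" sends almost all of its links to "b".
edges_example = {("a", "b"): 98, ("a", "c"): 1, ("a", "d"): 1}
backbone = disparity_backbone(edges_example)
print(backbone)
```

Lowering `alpha` gives a sparser backbone; the filter is local to each node, so weak edges of low-strength sites can survive while weak edges of hubs are pruned.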
17. Some approaches for 'community'
detection in networks
● Modularity maximisation (Lancichinetti & Fortunato, 2012)
● Edge-Betweenness (Girvan & Newman, 2001)
● Fast-Greedy (Clauset et al, 2004)
● Multi-Level (Blondel et al, 2008)
● Walktrap (Pons & Latapy, 2005)
● Infomap (Rosvall, Axelsson & Bergstrom, 2009)
18. ● The hyperlink networks we have collected
are both directed and weighted (the weight
on the edge from node i to node j is the number of
pages on site i with links to site j)
● Of the above, only Edge-Betweenness
and Infomap support directed and
weighted graphs
19. Edge-Betweenness
● We found that the Edge-Betweenness
algorithm (as implemented in igraph/R)
does not scale well.
● In a test run with the UK hyperlink network, the
algorithm had not completed after 24 hours
of running.
20. Infomap
● See: http://www.mapequation.org
● Scales well for large, dense networks
● Information-theoretic approach, appropriate to this network,
where there is a flow of information and attention
  ● If site i links to site j, we can think of a flow of information from j to i and
  a flow of attention from i to j.
  ● We do not have data on the flow of web users from site i to site j, i.e.
  'clickstream data'
  ● We therefore assume that the number of pages on site i that
  contain hyperlinks to site j (these are our edge weights) is proportional
  to the flow of attention/information
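The proportionality assumption can be made concrete with a small sketch: if attention leaves site i in proportion to the edge weights, the probability of moving from i to j is the weight of (i, j) divided by the total out-weight of i. The function name `transition_probabilities` and the example weights are illustrative assumptions, not part of Infomap or VOSON.

```python
def transition_probabilities(edges):
    """Turn edge weights into per-source transition probabilities.

    edges: dict {(i, j): weight}, where weight is the number of pages on
    site i that link to site j. Returns {(i, j): weight / out-weight of i}.
    """
    out_weight = {}
    for (i, _), w in edges.items():
        out_weight[i] = out_weight.get(i, 0) + w
    return {(i, j): w / out_weight[i] for (i, j), w in edges.items()}


# Toy example: three pages on "a" link to "b", one page links to "c".
probs = transition_probabilities({("a", "b"): 3, ("a", "c"): 1})
print(probs)
```

Sites with no outlinks never appear as a source here, which is exactly the case that needs special handling in a random-walk method.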
21. First attempt...
● Tried Infomap as implemented in R/igraph (v. 0.6.5)
● Results: Not good! The algorithm consistently generated a single
massive community (approx. 95% of nodes) and thousands
of tiny communities (1 or 2 nodes per community)
● Results do not pass the ‘sanity test’ (i.e. face validity)
● The problem:
  ● Many nodes in the UK network have no outlinks
  ● Therefore, the effect of teleportation in the Infomap algorithm is
  significant (it randomly connects nodes)
  ● This problem was solved in Lambiotte and Rosvall (2012)
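To see why nodes without outlinks matter, here is a minimal sketch of an ordinary PageRank-style random walk with teleportation on a weighted, directed graph. It is not the Infomap code and not the Lambiotte and Rosvall correction; the damping value 0.85, the function name `pagerank` and the toy graph are assumptions. A walker at a node with no outlinks has nowhere to go, so all of its mass is redistributed at random, which artificially connects otherwise unrelated nodes.

```python
def pagerank(edges, nodes, damping=0.85, iterations=100):
    """Power-iteration PageRank for a directed, weighted graph.

    edges: dict {(i, j): weight}; nodes: list of all node names.
    Mass sitting on dangling nodes (no outlinks) is spread uniformly.
    """
    out_strength = {v: 0.0 for v in nodes}
    for (i, _), w in edges.items():
        out_strength[i] += w
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if out_strength[v] == 0)
        # Teleportation plus the redistributed dangling mass.
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for (i, j), w in edges.items():
            new_rank[j] += damping * rank[i] * w / out_strength[i]
        rank = new_rank
    return rank


# Toy graph: "gov" links out, while "blog" and "news" have no outlinks.
nodes_example = ["gov", "blog", "news"]
ranks = pagerank({("gov", "blog"): 3, ("gov", "news"): 1}, nodes_example)
print(ranks)
```

In this toy graph two of the three nodes are dangling, so most of the walker's movement comes from random jumps rather than from hyperlinks.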
22. Second attempt...
● Results from Lambiotte and Rosvall (2012) were recently
incorporated into the Infomap algorithm
● This latest code is not yet integrated in R/igraph
● So, next steps:
  ● Download and compile the C++ source code for Infomap (v. 0.12.13)
  ● http://www.mapequation.org/code.html
  ● Run the standalone Infomap algorithm
● Using the Infomap Map Generator, we can examine the community
structure of the UK network at different scales (varying the number of
communities displayed and the number of links between communities)
23. 17 out of 4571 communities (44% of all flow)
24. 45 out of 4571 communities (70% of all flow)
25. ● Each community is named after the website that has the highest
flow and PageRank in that particular community (i.e. the ‘top
dog’ website)
● Distribution of flow across the network follows a power law
  ● There are many communities, but a very small percentage ‘hog’ all
  the flow across the network
  ● Top 5% of communities (229 out of 4571) account for about
  86% of all flow in the network
● Infomap uses an implementation of the PageRank algorithm to
calculate the ‘importance’ of each community (aggregate PageRank
of all websites in that community)
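A concentration figure such as "top 5% of communities" can be computed with a short sketch like the one below. The function name `top_share` and the flow values are hypothetical; only the counts 229 and 4571 come from the slide.

```python
def top_share(flows, fraction):
    """Share of total flow held by the top `fraction` of communities."""
    ranked = sorted(flows, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# 5% of 4571 communities rounds to the 229 quoted on the slide.
top_count = round(4571 * 0.05)

# Made-up flow values for five communities.
example_share = top_share([50, 30, 10, 5, 5], 0.2)
print(top_count, example_share)
```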
26. Preliminary findings
● Extremely influential communities form around social media
and blogging platforms
● A massive amount of flow is directed through the ‘Twitter’
community (e.g. from Twitter to www.parliament.uk)
● Many UK seed sites form influential communities (i.e. Top
20), but not all.
● Somewhat unexpectedly, two UK Gov ‘business’ websites
each form highly influential communities
  ● http://www.direct.gov.uk (community rank #4, 0.048% of all flow
  throughout network)
  ● http://bis.gov.uk (community rank #8, 0.025% of all flow
  throughout network)
28. ● To understand the structure of government hyperlink networks, we need to
know something about the websites in these networks
  ● Generic top-level domains (.edu, .com, .org etc.) will only give very coarse-grained information on who these sites are
  ● What policy domain are they in? (health, education, social security?)
● This is social science research, so we need more information on nodes
● Options:
1. Manually code every site (not feasible, as we have >100K sites)
2. Manually code a subset of sites e.g. the “most important” sites based on a
centrality measure (scientifically valid?)
3. Manually code a sample of sites (e.g. adaptive sampling). To be explored in
future...
4. Manually code a training dataset and then use machine learning to predict website
type
● The following is a summary of preliminary work on approach 4...
29. Data collection
● A subset of 'important' websites in the UK network was
coded into discrete policy domains by a human coder
  ● Subset chosen as seed sites plus sites connected to two
  or more seed sites
  ● e.g. coding: ‘Community services’, ‘Health’, ‘Foreign
  Affairs’
● Need to collect and ‘clean’ the HTML data from
websites in the network
  ● While the original VOSON crawl collected text content
  for all websites crawled, for this proof of concept we
  re-collected the text content (in future we will use the
  VOSON-collected text data)
30. Text processing
● R ‘XML’ package used to clean the HTML
(strip HTML tags, remove white spaces,
remove strange ASCII characters, convert to
lowercase, extract key word frequencies)
● 2157 websites were usable (i.e. with ‘clean’
web text and a known policy domain)
● Machine Learning using the ‘RTextTools’
package in R (supervised learning for text
classification)
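The cleaning steps listed above can be sketched as follows. The project used the R ‘XML’ package; this Python standard-library version is a stand-in, and the class `TextExtractor`, the function `keyword_frequencies` and the sample HTML are all illustrative assumptions.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_frequencies(html):
    """Strip tags, lowercase, keep alphabetic words, and count them."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Keeping only runs of a-z also drops extra whitespace and odd characters.
    return Counter(re.findall(r"[a-z]+", text))


sample = ("<html><body><h1>Health</h1><p>Public  health &amp; housing</p>"
          "<script>var x = 1;</script></body></html>")
freqs = keyword_frequencies(sample)
print(freqs.most_common(3))
```

The resulting word counts are the kind of feature that a supervised text classifier can be trained on.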
31. Support Vector Machine (SVM)
● Websites with known policy codes = 2157
  ● SVM ‘training sample’ = 2000
  ● SVM ‘test sample’ = 157
● Some example results of classification:

                 PRECISION  RECALL  F-SCORE
Education        0.94       0.83    0.88
Employment       1.00       0.14    0.25
Environment      0.99       0.79    0.88
Foreign Affairs  1.00       0.44    0.61
Health           0.52       0.97    0.68
Housing          0.96       0.79    0.87
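The F-score column is the harmonic mean of precision and recall, F = 2PR / (P + R). The short sketch below recomputes it from the precision and recall values in the table; the function name `f_score` and the dictionary layout are just for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported on the slide.
results = {
    "Education": (0.94, 0.83),
    "Employment": (1.00, 0.14),
    "Environment": (0.99, 0.79),
    "Foreign Affairs": (1.00, 0.44),
    "Health": (0.52, 0.97),
    "Housing": (0.96, 0.79),
}
f_scores = {name: round(f_score(p, r), 2) for name, (p, r) in results.items()}
print(f_scores)
```

The Employment row shows why the F-score is useful: perfect precision with recall of 0.14 means the classifier rarely labels a site as Employment, and the F-score of 0.25 reflects that.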
32. SVM Conclusion
● Surprising level of accuracy
● Future work will involve:
  ● More data (will use HTML collected via VOSON)
  ● Investigate different machine learning algorithms
37. Previous studies
Level: Individual researcher level
Authors | Result
Newman (2001) | Small-world effect existed between co-authors, and the degree distribution roughly follows the power law, in co-authorship networks in the fields of physics, biomedicine and computer science.
Barabasi et al. (2002) | Co-authorship network in mathematics and neuroscience is scale-free, and the network evolution is characterized by preferential attachment.
Ramasco et al. (2004) | Co-authorship network in the field of condensed matter showed that the degree distribution follows a power law.
Tomassini and Luthi (2007) | Co-authorship network in the field of genetic programming changes in accordance with preferential attachment.
Wagner and Leydesdorff (2005) | International co-authorship grew based on the principle of preferential attachment, although the attachment mechanism was not fitted to a pure power law.
Moody (2004) | Co-authorship network in sociology does not have a small-world structure.
Brantle and Fallah (2011) | Collaboration network of patent inventors has a scale-free power law property.
38. Previous studies
Level: Organization level
Authors | Result
Verspagen and Duysters (2004) | Strategic technology alliances, in the two technology fields of chemicals and food, could be characterized as small worlds.
Powell et al. (2005) | The alliance network among dedicated biotech firms is scale-free.
Gay and Dousset (2005) | The alliance network in the biotechnology industry has a small-world effect with a scale-free property based on preferential attachment.
Barber et al. (2006); Breschi and Cusmano (2004) | Both studies reported the existence of small-world and scale-free property in inter-organizational R&D relationships from EU-FP Programmes data.
39. Brief history of governmental policy for UIG collaboration (‘00~’11)
40. Brief history of governmental policy for UIG collaboration (‘00~’11)
42. Methodology
Network topological analysis
Measures | Definition
Density |
Average degree |
Average path length |
Diameter | The largest geodesic path length in the network
Clustering coefficient |
Degree centralization |
Power law distribution |
43. Methodology
Centrality measures
Measures | Definition
Degree centrality | $C_D(i) = \frac{\sum A_i}{n-1}$
* $A_i$ = the number of direct links of node i
* $n$ = the total number of nodes
Closeness centrality | $C_C(i) = \frac{n-1}{\sum_j D_{ij}}$
* $D_{ij}$ = the number of links in the geodesic linking node i and node j
* $n$ = the total number of nodes
Betweenness centrality | $C_B(i) = \frac{\sum_{j<k} g_{jk}(i)/g_{jk}}{(n-1)(n-2)/2}$
* $g_{jk}$ = the number of geodesics linking node j and node k
* $g_{jk}(i)$ = the number of geodesics linking node j and node k that contain node i
* $n$ = the total number of nodes
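As a minimal sketch of how the first two measures can be computed, the code below evaluates degree and closeness centrality on a small undirected graph using breadth-first search. It reads the numerator of the degree formula as the number of direct links of node i, which is an assumption about the slide's notation; the function names and the three-node graph are illustrative. Betweenness is omitted to keep the example short.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: geodesic length from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    """Number of direct links of node i divided by (n - 1)."""
    n = len(adj)
    return {i: len(adj[i]) / (n - 1) for i in adj}


def closeness_centrality(adj):
    """(n - 1) divided by the sum of geodesic lengths from node i."""
    n = len(adj)
    result = {}
    for i in adj:
        dist = shortest_path_lengths(adj, i)
        total = sum(d for j, d in dist.items() if j != i)
        # Defined here only when every other node is reachable from i.
        result[i] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return result


# A path graph a - b - c: node b sits in the middle.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
print(degree, closeness)
```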
55. Agenda
1. 3-Helix as a meso-level notion
– Epicycle in a grander tech-psych-inst
cycle
2. Speed (differentials) as high-level
system metric
– Roles of buffering institutions and ICT
– Need for smart engagement
3. Applying 3-helix in the developing
world
4. SUNY Korea’s joint TS/CS research
56. 3-Helix papers published in
Technological Forecasting &
Social Change
• Wilfred Dolfsma, Loet Leydesdorff “Lock-in and break-out from
technological trajectories: Modeling and policy implications,” 76(7),
Sept. 2009, 932-941.
• Raul Gouvea, Sul Kassicieh, M.J.R. Montoya “Using the quadruple
helix to design strategies for the green economy,” 80(2), Feb. 2013,
221-230.
• Øivind Strand, Loet Leydesdorff “Where is synergy indicated in the
Norwegian innovation system? Triple-Helix relations among
technology, organization, and geography,” 80(3), Mar. 2013, 471-484.
• Inga A. Ivanova, Loet Leydesdorff “Rotational symmetry and the
transformation of innovation systems in a Triple Helix of university–
industry–government relations,” In Press, Corrected Proof, Available
online 19 Sept. 2013.
57. In D.S. Oh & F. Phillips (Eds),
Technopolis: Best Practices for Science
and Technology Cities (Springer, 2014)
• E. Becker, B. Burger and T. Hülsmann,
“Regional Innovation and Cooperation
among Industries, Universities, R&D
Institutes, and Governments”
• F. Phillips, S. Alarakhia and P.
Limprayoon, “The Triple Helix:
International Cases and Critical
Summary”
• José Alberto Sampaio Aranha,
“Arrangement of Actors in the Triple
Helix Innovation”
58. IC2 Model
• Preceded 3-helix by several years
• But only parts were made mathematical (Bard et al)
Academia
Industry
Government
Community
Talent
Technology
Capital
Know-How
Market Needs
Value-Added
Economic Development
59. The math of Academic-Government-Industry
dynamics is interesting,
but...
It is just part of a bigger picture.
60. The cycle of innovation and change:
Lab to society & back again
Technological
Innovation
New desires
& dreams
New ways to
organize (Public &
private)
Note how this
schema extends
Everett Rogers’
more linear
model.
New Products
& Services
New ways to
Interact socially
New ways of producing
and using
products & services
61. We might think all the elements
move together in an orderly way.
Social Needs
Institutional Change
Technological Change
Psychological Change
Organizational
Change
62. But in a free-market economy,
they do not.
• They continually
engage and
disengage.
• Sometimes they
move each other
only by friction.
• 90% of MOT and
Tech Policy
problems stem from
the differing speeds
of the 3 sectors.
63. Example: Transportation
• Mobile-web rideshare
services
– Gain VC investment
– Start operations
– Get shut down by city
governments trying to
regulate them under old taxi
rules.
• Institutions have changed
slower than technology
and social demand.
64. Example: Health
• An elderly person dies
because he was too proud
to wear
– A medical bracelet
– or
– An emergency signaller.
• Psychology has changed
slower than technology.
65. Example: Software
• Record companies and publishers
– Sue student MP3 pirates
– Develop DRM software that further alienates
customers
– Can’t adapt away from paper and CD
publishing.
• Business organizations change more
slowly than technology and social
demand.
66. Example: More and more often,
social/institutional change outpaces
tech change - or will do so soon.
• In most of the world, an excess of funds
is chasing too few growth investment
opportunities.
• Fewer US companies are making IPOs.
• Small-government activists rail
indiscriminately against direct
government monetary support for new
technologies.
See Phillips (2011).
67. This can be good.
• Individual creativity
may bloom.
• Mistakes...
– Can be undone
efficiently.
– Don’t necessarily infect
the whole system.
68. It (disengagement) can be bad.
• Alienation
• Lack of coordination and cooperation
• Little institutional or organizational
creativity
• Waste and pollution
• Lives lost
69. Speed as the system metric
• Really, speed
differentials among the
sectors.
• A “clutch” and
“transmission” are
needed.
• The question is less
how to engage, but
rather, when.
• The key is not
engagement per se,
but smart (well-timed)
engagement.
70. Not bridging organizations, but
buffering organizations
• Civic groups
• Workforce training programs
• Economic development agencies
• Technology brokers
• Open innovation integrators
• Accountancies
• Industry associations
• NGOs
The IC2 Model partially captured this.
• Incubators
• Law firms
• Venture capital
• TTOs
71. 3-Helix as meso-level construct: An
epicycle within the Technology-Psychology-Institutional dynamic
• Macro: Tech-Psych-Inst
• Meso: Aca-Gov-Indus
– “Triple Helix”
• Micro:
– Dynamics within people and
within organizations;
– Technology life cycles
• The buffering institutions
span all 3 levels.
[Diagram labels: Tech, Inst, (3-Helix)]
72. What causes TOPI* disengagement?
*Technological-Organizational-Psychological-Institutional
• Bad marketing, bad market research
• Mistrust, bad service
• Technology inaccessible to underserved
populations
• Competition among de facto standards
(e.g., VHS vs Beta)
• Lack of vision
• Poor design of information &
communication products and programs.
75. Marketing guru Geoffrey Moore says,
• “People have disengaged, for ... self-preservation.”
– With “consequences for consumer and brand marketing,
– “and long-term implications for education, health care,
citizen participation, and workforce involvement.
• “So engagement is rightfully going to be a big
investment theme.”
76. Moore: Engagement is taking
center stage in business.
• Off-line retailers are using digital interactions/devices in their
in-store experiences.
– Example: Starbucks.
• “Social marketing foster[s] engagement around topics that ...
reflect well upon the sponsor.”
– Example: Sephora.
• “Big data analytics drive communications that can break
through the wall of detachment.”
– Example: Obama campaign 2012.
77. Moore is saying
• Advertising used to be
like this.
– Annoying! Consumers
disengaged.
• Now with social media,
mobile web, Yelp.com,
– Consumers share product
reviews & complaints.
– Advertisers have to treat
consumers more gently.
– To make us want to
continually re-engage.
• Engaging doesn’t mean
shouting.
81. People are proud to
participate electronically.
• Fighting crime
– Zapruder film; Rodney King videos
• Supporting favorite businesses, authors
– Amazon reviews
• For post-disaster aid
– Crowd-mapping of post-earthquake Haiti
• Crowd-funding research projects and
entrepreneurs
• Though there are abuses.
82. Source: Ganti et al, Mobile
Crowdsensing: Current State and
Future Challenges.
83. Micro Level: Workforce
Engagement
• Definition: The measure of whether
employees merely do the minimum required
of them, versus proactively driving innovation
and new value for the organization.
• Thus, engagement
– “can only ever be partially accounted for by
deploying the latest new collaborative technology,
– “and probably significantly less than many of its
proponents would have you believe.”
Source: Hinchcliffe
86. ICT for engagement? Summary
• ICT alone cannot create/sustain engagement.
– Human intervention, via buffering institutions, can achieve
ICT-aided engagement.
• ICT, especially sensing and crowdsourcing, may
assist in deciding when to engage.
– Thus achieving smart engagement.
• This applies to all 3 levels (macro, meso, micro) of
our multi-level Technology & Society diagram.
87. For many countries where
central government direction is
the norm, 3-helix thinking is
premature.
• Indonesia, Mongolia
• USA: Industry lobbying government
presents a slightly different problem...
89. In sum, the problem is not disengagement, but mis-engagement
among governments, people,
organizations and products, due to:
• Speed differentials (i.e., poor timing)
• Lack of vision
• Poor design of information & communication
products and programs.
– Lack of feedback
– Excess complexity, leading to slow comprehension and
adoption
– Excess technology push (solutions without problems)
– Excess demand pull (unrealistic expectations)
– Other factors
90. SUNY Korea’s research agenda
• Combine social science and computer science...
• To find principles of IT design that more quickly
lead to engagement that is...
– Well-timed
– Smart
– Satisfying
• Among
– Individuals
– Businesses
– Government institutions
– Technology developers
• With secure applications in several techno-policy
domains (health, energy, etc.).
91. Some Implications
• For IT: Meeting users halfway
• For managers: Engagement plans for
each constituency
• For theorists:
– Modeling the moderating effect of buffering
institutions
– Impact of coalitions on the 3-helix dynamic
92. The math of Academic-Government-Industry
dynamics is interesting,
but...
It is just part of a bigger picture.
93. An aside: Spatializing
an innovation
diffusion model
F. Phillips, On S-curves and Tipping Points. Tech.
Forecasting & Social Change, 74(6), July 2007,
715-730.
Alan M. Turing, The chemical basis of morphogenesis. Philosophical Transactions of the
Royal Society of London. B 237, 37–72 (1952)
http://www.cgjennings.ca/toybox/turingmorph/
94. References
• http://davidsasaki.name/2013/01/beyond-technology-for-transparency/
• A. Charnes, S. Littlechild and S. Sorensen, “Core-stem Solutions of
N-person Essential Games.” Socio-Econ. Plan. Sci. Vol. I, pp. 649-660 (1973).
• David Watson, The Engaged University. Routledge, 2013.
• Dion Hinchcliffe, “Does technology improve employee engagement?”
Enterprise Web 2.0, Nov. 5, 2013. http://www.zdnet.com/does-technology-improve-employee-engagement-7000021695/
• Jonathan Bard, Boaz Golany and Fred Phillips, “Bubble Planning
and the Mathematics of Consortia.” Third International Conference
on Technology Policy and Innovation, Austin, Texas, September,
1999.
• F. Phillips, The state of technological and social change:
Impressions. Technological Forecasting & Social Change. 78(6), July
2011, 1072-1078.
96. A Network Analysis of Web-Citations
Among the World’s Universities
George A. Barnett
Department of Communication
University of California, Davis
gbarnett@ucdavis.edu
Daegu Gyeongbuk International Social Network
Conference
December 12-14, 2013
97. Research Aims
• Network Analysis of URL-citations among
– 1,000 universities with greatest presence on WWW (1 million
edges)
– In 58 different countries
– Multi-level analysis (both Universities & Countries)
• Antecedent factors that determine the network’s
structure
– University level
• Physical distance
• Same country
• Language of instruction
• Size
• Ph.D. granting
• Prestige
• Research Excellence (Nobel Prizes)
– National Level
• Hyperlink Connections
• International Bandwidth Capacity
• GDP, GDP/capita
• International Student Flows
• Nobel Prizes
98. Data—Web-Citations
• Web-citations among universities collected using Google
– 2,100 X 2,100 matrix of universities (4,407,900 cells) generated
– Search query: “university A webdomain” site:university B webdomain,
e.g. "harvard.edu" site:stanford.edu
– Not all URL-citations are links, e.g., email addresses in co-authored
papers
– Removed universities with no ties & the smaller of a university’s
multiple domains; retained the 1,000 most interlinked universities
– Matrix of inter-citations aggregated to the national level
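The query scheme can be sketched in a few lines: one query per ordered pair of distinct university domains, which is also where the cell count comes from (2,100 × 2,099 = 4,407,900 off-diagonal cells). The function names `citation_queries` and `off_diagonal_cells` are illustrative, and the sketch only builds query strings; it does not call any search engine.

```python
from itertools import permutations


def citation_queries(domains):
    """Yield (cited, citing, query) for every ordered pair of distinct domains.

    The query asks for pages on the citing domain that mention the cited
    domain, following the pattern "harvard.edu" site:stanford.edu.
    """
    for cited, citing in permutations(domains, 2):
        yield cited, citing, f'"{cited}" site:{citing}'


def off_diagonal_cells(n):
    """Number of ordered pairs among n universities, excluding self-pairs."""
    return n * (n - 1)


queries = list(citation_queries(["harvard.edu", "stanford.edu"]))
cells = off_diagonal_cells(2100)
print(queries, cells)
```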
99. Data--Antecedents
University Level
Physical Location
− Google Maps
Country
− ccTLD of website (USA--.edu)
Language of Instruction
− Country of University (India & Singapore—English)
Size of University
− Europe -- (EUMIDA)
(http://thedatahub.org/dataset/eumida)
− U.S. -- College Handbook 2012
− Asia, Africa, Oceania, Latin America & Canada –
Universities’ Websites
Prestige
− U.S. News, World’s Best Universities 2012
http://www.usnews.com/education/
Nobel Prizes
− (http://www.nobelprize.org)
100. Data--Antecedents
National Level
Total Hyperlinks
− Barnett & Park (2012)
International Internet Bandwidth,
GDP & population
− TeleGeography (2012)
(http://www.telegeography.com/)
Student Exchange
− UNESCO (http://stats.uis.unesco.org/unesco)
International Co-authorships
− Leydesdorff & Wagner (2008)
International Citations
− Science Citation Index
101. Results - Universities
• Over 9.6 million links among 1,000 universities
• Density = .606
• Mean # of Links = 24.0; S.D. = 2,208.6
• Greatest # of links (322,000)
– Universität Trier & Rheinisch-Westfälische
Technische Hochschule Aachen, two German
institutions that host huge & popular bibliographic
systems (DBLP & SunSite)
104. Results – Clusters of Universities
Cluster
Defining Attributes
1. German, Swiss & Italian, not English, central, low prestige, less bandwidth
connections
2. English (U.S., Canada, U.K., Australia), central, high prestige, strong bandwidth
connections
3. Low prestige, peripheral, less bandwidth connections
4. English, not French, peripheral, no Ph.D.s, strong bandwidth connections
5. Continental Europe, not English
6. Chinese, less bandwidth connections
7. French, not English, peripheral, lower prestige
8. English, primarily (Jesuit Institutions), peripheral, low prestige
9. English, peripheral
10. Japanese & other Asian, peripheral, little bandwidth connections
106. Results - National
• N = 58 Countries
• Density = .924
• United States most central, followed by Germany, U.K., Canada
– >30% of links ; >4 million outward & 1.9 million inward
– Eigenvector centrality 10 times > Germany
• Gini = .672, a core-periphery structure
– U.S. (359), Germany (67), U.K. (67) & Canada (38) account for 53.1% of the universities
– These four nations account for 68.3% of the links
– Links distributed by power law; concentrated in a few countries
• Cluster Analysis – 1 group of countries centered about U.S. & U.K.
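For readers unfamiliar with the Gini coefficient, a minimal sketch of the calculation is given below. The slides do not say which quantity the reported value of .672 was computed over, so the inputs here are made-up numbers and the function name `gini` is illustrative. A value of 0 means links are spread evenly; values near 1 mean they are concentrated in very few countries.

```python
def gini(values):
    """Gini coefficient of a list of non-negative values.

    Uses the sorted-rank form: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


even = gini([1, 1, 1, 1])          # every country holds the same share
concentrated = gini([0, 0, 0, 1])  # one country holds everything
print(even, concentrated)
```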
108. Results – Predicting the Structure of
the University URL-citation Network
• Physical Distance Between Campuses
– QAP Correlation = .005: no relationship between
physical distance and web-citations
• Same Country
– QAP Correlation = .065
– Links: 78.4% domestic; 21.6% international
– No Links: 6.1% domestic; 93.9% international
– Mean Link Strength: 1,415 domestic; 42.5 international
• Web-citations tend to be domestic
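A QAP correlation compares two network matrices cell by cell and judges significance by permuting the rows and columns of one matrix together. The sketch below shows the general idea under simple assumptions: Pearson correlation over off-diagonal cells, 200 random permutations, and a two-sided p-value. The function names and the toy matrix are illustrative and are not the software or settings used in the study.

```python
import random
from statistics import mean


def offdiag(m):
    """Flatten a square matrix, skipping the diagonal."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def qap_correlation(a, b, permutations=1000, seed=0):
    """Return (observed correlation, permutation p-value) for matrices a and b."""
    rng = random.Random(seed)
    n = len(a)
    yb = offdiag(b)
    observed = pearson(offdiag(a), yb)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Relabel nodes of a: rows and columns are permuted together.
        permuted = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(offdiag(permuted), yb)) >= abs(observed):
            hits += 1
    return observed, hits / permutations


# Toy 4 x 4 matrix of link counts, correlated with itself.
toy = [[0, 5, 1, 0], [4, 0, 2, 1], [1, 3, 0, 6], [0, 1, 7, 0]]
observed, p_value = qap_correlation(toy, toy, permutations=200)
print(observed, p_value)
```

Permuting rows and columns together keeps each node's ties intact, which is what makes the test appropriate for dyadic data where cells are not independent.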
110. Results – Predicting University
Centrality in Network -- Regression
In-degree
R2
F
P
Size (log)
English
Bandwidth
Rating
Out-Degree
.350
47.94
.000
ß
.279
-.025
.268
.465
Betweenness
.489
85.16
.000
t
6.49
-.516
5.70
10.53
all p< .001, except English for In-degree
ß
.123
.356
.302
.323
t
3.22
8.50
7.31
8.25
Eigenvector
.579
122.25
.000
ß
.282
.185
.336
.502
t
8.13
4.86
8.94
14.12
.310
39.94
.000
ß
.150
.214
.208
.348
t
3.36
4.40
4.33
7.65
111. Results – Predicting the Structure of the
URL-citation Network-National Level
• QAP Correlations with National Level Network
– Co-Authorships: .772
– Citations: .967
– Hyperlinks: .545
– Student Flows: .270
– Missing Data: N = 52 on all except Student Flows,
N = 48
113. Results – Predicting National Centrality
in the Network -- Regression
In-degree
.524
33.78
.000
35.12
.670
ß
R2
F
P
Out-Degree
ß
t
Nobels
English
Population .482 4.80
GDP/capita .722 7.19
GDP
.000
t
.184 2.27
.398 4.70
.797 9.28
All relations are significant p < .02
Betweenness
22.99
ß
.505
.000
t
.443 4.33
.720 7.03
Eigenvector
.642
31.05
.000
ß t
.553 5.07
.183 2.15
.258 2.41
114. Discussion
• So where is academic knowledge produced?
– Primarily at prestigious English speaking institutions in the U.S.A. &
U.K. , but also in Canada & Germany
• Distance is unrelated to dissemination & collaboration via the
Internet
• Universities tend to link to others from the same country
• Ten clusters- One composed of most prestigious institutions,
suggesting exchanges of knowledge among this group
• Centrality predicted by university size, its prestige (whether it
offered doctoral degrees, its U.S. News ranking, the number of
its faculty’s Nobel Prizes), language of instruction (English), &
national international bandwidth capacity
115. Discussion
• At the national level, the countries formed a single group
centered about the U.S. & the U.K.
• U.S. is the most central, followed by Germany, U.K. & Canada
– They accounted for the majority of the universities in the network
• The International Network has a core-periphery structure
with a few countries accounting for the majority of the links
• International co-authorships, citations, student exchanges &
the number of links among the individual countries are
strongly predictive of the network’s structure
• Centrality is predicted by a country’s population & GDP;
depending on the measure, it may also be predicted by
language of instruction (English) & the number of Nobel Prizes
116. Discussion
• Results are consistent with Seeber, et al. (2012)
– European university hyperlink network displays a
center-periphery structure
– centrality a function of the universities’ reputation
– This study extends their conclusions to the global
academic community
117. Discussion
• Consistent with Ortega & Aguillo (2009)
– “The world-class university network graph is comprised of national
sub-networks that merge in a central core where the principal
universities of each country pull their networks toward international
link relationships. This network rests on the United States, which
dominates the world network in conjunction with the aggregation of
the European ones, especially the British and the German sub-networks. This situation may be caused mainly by the technological
development of these countries and the production of international
content, that is, English web pages. This second reason might explain
the apparent backward situation of some East Asian countries.”
• World Systems Theory
– Telephone (Barnett, 2001, 2012)
– Internet (Barnett & Park, 2005, 2012; Park, Barnett & Chung, 2011)
– Student flows (Barnett & Wu, 1995; Chen & Barnett, 2000; Jiang,
2013)
– Patents, trademarks and copyrights (Nam & Barnett, 2011).
118. Discussion
• Global academic community as a self-organizing system
– Academic network may be considered an autopoietic or self-replicated system
– Evolved from traditional scientific activities (co-authorship,
citing the research of others & other behaviors that required
the sharing of information among scholars)
– Krippendorff defines an autopoietic system as “a network of
processes that produces all the components necessary to
embody the very process that produces it”. The network
recursively produces its components through the interaction
in this historical reproductive network of postings on
university websites & links among institutions
119. Discussion
• There are environmental constraints that limit the
possible states into which this system may evolve
• issues of information property
• policies of individual universities & national governments
• scientific funding agencies (U.S. National Science Foundation)
• Academic networks co-evolved with other global
institutions
• Universally, higher education is developing common
curricula especially in the sciences (Lechner & Boli,
2005). This seems to be reflected in pattern of
universities’ hyperlinks and web-citations
120. Thank you!
See:
Barnett, G.A., Park, H.W., Jiang, K., Tang, C., & Aguillo, I.F. (2013),
“A multi-level network analysis of web-citations among the
world’s universities”, Scientometrics, DOI 10.1007/s11192-013-1070-0
121. Virtual Knowledge Studio (VKS)
“Webometrics Studies” Revisited
in the Age of “Big Data”
Assoc. Prof. Dr. Han Woo PARK
CyberEmotions Research Institute
Dept. of Media & Communication
YeungNam University
214-1 Dae-dong, Gyeongsan-si,
Gyeongsangbuk-do 712-749
Republic of Korea
www.hanpark.net
cerc.yu.ac.kr
eastasia.yu.ac.kr
asia-triplehelix.org
122. Big data
The term “big data” refers to “analytical technologies that
have existed for years but can now be applied faster, on
a greater scale and are accessible to more users” (Miller,
2013).
Big data sizes may vary per discipline.
Characteristics: Gartner’s 3Vs plus SAS’s Variability and Complexity and IBM’s
Veracity
- Volume (amount of data), Velocity (speed of data in and
out), Variety (range of data types and sources)
- Variability: Data flows can be highly inconsistent with
daily, seasonal, and event-triggered peak data loads
- Complexity: Multiple data sources requiring cleaning,
linking, and matching the data across systems
- Veracity: 1 in 3 business leaders don’t trust the
information they use to make decisions.
http://en.wikipedia.org/wiki/Big_data
http://www-01.ibm.com/software/data/bigdata/
125. Data-driven Research that focuses on
extracting meaningful data from techno-socio-economic systems to discover
some hidden patterns.
Today’s “big” is probably tomorrow’s “medium” and
next week’s “small” and thus the most effective definition of “big data” may be derived when the size of data
itself becomes part of the research problem.
Loukides (2012)
126. Introduction
Webometrics is broadly defined as the study of web-based content (e.g., text, images, audio-visual objects, and
hyperlinks) with primarily quantitative indicators for
social science research goals and visualization techniques
derived from information science and social network
analysis.
127. • Han Woo Park
- “hidden” and “relational” data about
lots of people as well as the few
individuals, or small groups
• Lev Manovich
- “surface” data about lots of people (i.e.,
statistical, mathematical or computational
techniques for analyzing data)
- “deep” data about the few individuals or small
groups (i.e., hermeneutics, participant
observation, thick description, semiotics, and
close reading)
128. First type of Webometrics
• Hyperlink Network Analysis
- Inter-linkage: who linked to whom matrix
- Co-inlink : a link to two different nodes from a third node
- Co-outlink : A link from two different nodes to a third node
Björneborn (2003)
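The three link types above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny "who linked to whom" matrix `adj` and the function names are hypothetical and are not taken from any of the studies cited in these slides.

```python
# Hypothetical "who linked to whom" matrix: adj[i][j] == 1 means site i links to site j.
adj = [
    [0, 1, 1, 0],  # site 0 links to sites 1 and 2
    [0, 0, 0, 1],  # site 1 links to site 3
    [0, 0, 0, 1],  # site 2 links to site 3
    [0, 0, 0, 0],  # site 3 links to nobody
]

def co_inlinks(adj, a, b):
    """Count third nodes that link to both a and b (co-inlinks of a and b)."""
    return sum(
        1 for s in range(len(adj))
        if s not in (a, b) and adj[s][a] and adj[s][b]
    )

def co_outlinks(adj, a, b):
    """Count third nodes that both a and b link to (co-outlinks of a and b)."""
    return sum(
        1 for t in range(len(adj))
        if t not in (a, b) and adj[a][t] and adj[b][t]
    )

print(co_inlinks(adj, 1, 2))   # sites 1 and 2 are both linked from site 0
print(co_outlinks(adj, 1, 2))  # sites 1 and 2 both link to site 3
```

In this toy matrix, sites 1 and 2 share one co-inlink (from site 0) and one co-outlink (to site 3), while sites 0 and 3 share neither.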
129. Inter-link network analysis diagram among Korean e-science sites within public domain
WCU
WEBOMETRICS
INSTITUTE
Mapping the e-science landscape
In South Korea using the Webometrics method
131. Findings
As seen in Figure 4, the network structure shows a clear butterfly pattern. There is one hub (ghism)
that belongs to Park Gyun-Hye (Park GH, www.cyworld.com/ghism), the daughter of ex-president
Park Jeong-Hee and one of two major GNP candidates (along with president-elect Lee MB) in the
2007 presidential race.
Figure 4: Cyworld Mini-hompies of Korean legislators
How do social scientists use link data
from search engines to understand
Internet-based political and electoral
communication?
INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS
Case 2. Cyworld Mini-hompies of Korean Legislators
132. Sociology of Hyperlink Networks of Web 1.0,
Web 2.0, and Twitter
A Case Study of South Korea
133. Introduction
‣ Online & offline lives ➭ co-constructing (e.g. Beer & Burrows, 2007)
‣ Politicians communicate with their constituencies using different platforms
‣ Questions:
- What are the structural similarities and/or differences in South Korean
politicians’ networks from Web 1.0 to Web 2.0 (and Twitter)?
- Are online structures similar to structures in the physical world?
- Are online patterns affected by offline relationships?
‣ Related studies conducted:
- online social network analysis
- online networks in Web 2.0
- role of Twitter on online politics
134. 2001
2000
‣ 59 isolated in 2000
‣ more centralised in 2001
‣ network of 2001 ➭ a ‘star’ network
- might be affected by political events
➭ presidential election in 2001
Web 1.0
135. 2005
2006
‣hubs disappearing
‣easy use of blogs
‣Clear boundaries between different parties
‣strong presence of GNP Assembly members
➭ party policy on using blogs
Web 2.0
138. Bi-linked network of politically active
A-list Korean citizen blogs (July 2005)
URI=Centre
DLP=Left
GNP=Right
Just A-list blogs exchanging links with politicians
139. Affiliation network diagram using pages
linked to Lee’s and Park’s sites
N = 901 (Lee: 215, Park: 692, Shared: 6)
143. “Those studies perpetuate the idea that linking
behaviour is not random, and that links are ‘socially
significant in some way’. In this perspective, links
have an ‘information side-effect’, they can be used
to understand other facts even though they were
not individually designed to do so: ‘information
side-effects are by-products of data intended for
one use which can be mined in order to understand
some tangential, and possibly larger scale,
phenomena’
144. Park and his colleagues were
extensively cited: 9 times!
• Barnett GA, Chung CJ and Park HW (2011) Uncovering transnational hyperlink patterns
and web mediated contents: a new approach based on cracking .com domain. Social
Science Computer Review 29(3): 369–384.
• Hsu C and Park HW (2011) Sociology of hyperlink networks of Web 1.0, Web 2.0, and
Twitter: a case study of South Korea. Social Science Computer Review 29(3): 354–368.
• Park HW (2003) Hyperlink network analysis: a new method for the study of social
structure on the web. Connections 25(1): 49–61.
• Park HW (2010) Mapping the e-science landscape in South Korea using the
webometrics method. Journal of Computer-Mediated Communication 15(2): 211–229.
• Park HW and Jankowski NW (2008) A hyperlink network analysis of citizen blogs in
South Korean politics. Javnost: The Public 15(2): 5–16.
• Park HW and Thelwall M (2003) Hyperlink analyses of the World Wide Web: a review.
Journal of Computer-Mediated Communication 8(4).
• Park HW and Thelwall M (2008) Developing network indicators for ideological
landscapes from the political blogosphere in South Korea. Journal of Computer-Mediated Communication 13(4): 856–879.
• Park HW, Kim C and Barnett GA (2004) Socio-communicational structure among political
actors on the web in South Korea. New Media & Society 6(3): 403–423.
• Park HW, Thelwall M and Kluver R (2005) Political hyperlinking in South Korea: technical
indicators of ideology and content. Sociological Research Online 12(3).
145. A comment from those who are
NOT doing a hyperlink analysis
• In a chapter of The Sage Handbook of
Online Research Methods edited by
Fielding et al. (2008), Hogan emphasizes
that ‘link analysis’ has become an active
research domain in examining social
behavior online.
146. A threat to Webometrics
• The key application in this area is to collect
some incoming, outgoing, inter-linking, and
co-linking data from search engines
- AltaVista in early 2000
- Yahoo renewed the AltaVista’s hyperlink
commands via “Site Explorer” and its API
- Yahoo discontinued its API option for
interlinkage data in April 2011, and finally
stopped its popular Site Explorer service in
November 2011
148. A new proposal
• Mike Thelwall
- URL citation searches with the Bing search
API facilities
• Liwen Vaughan
- Incoming hyperlinks from Alexa.com
Can these "alternative" techniques be
acceptable for scientific publishing?
149. A new proposal : SEO Tools
• Search Engine Optimization Tools
- http://www.majesticseo.com/
- http://www.opensiteexplorer.org/
- https://ahrefs.com/
Enrique Orduña-Malea & John J.
Regazzi (2013). Influence of the academic
Library on U.S. university reputation:
a webometric approach. Technologies. 1,
26-43, http://www.mdpi.com/2227-7080/1/2/26
150. Webometrics Ranking of
World Universities
The link visibility data is collected from the two
most important providers of this
information: Majestic SEO and ahrefs.
Both use their own crawlers, generating different
databases that should be used jointly for filling
gaps or correcting mistakes.
The indicator is the product of square root of the
number of backlinks and the number of
domains originating those backlinks, so it is not
only important the link popularity but even
more the link diversity.
The maximum of the normalized results is the
impact indicator.
http://www.webometrics.info/en/Methodology
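A minimal sketch of the indicator described above, written in Python. It reads "the product of square root of the number of backlinks and the number of domains" as sqrt(backlinks) × domains, and it assumes that "normalized" means dividing each provider's values by that provider's maximum; the slide does not state the normalization, so this is an assumption. The site names and counts are invented for illustration.

```python
import math

def raw_visibility(backlinks, domains):
    """sqrt(backlinks) * domains, one reading of the slide's description."""
    return math.sqrt(backlinks) * domains

def impact_indicator(counts):
    """counts: {site: {provider: (backlinks, referring_domains)}}.

    Normalizes each provider's raw values by that provider's maximum
    (an assumed normalization) and keeps the larger normalized value per site.
    """
    providers = {p for per_site in counts.values() for p in per_site}
    best = {site: 0.0 for site in counts}
    for provider in providers:
        raw = {
            site: raw_visibility(*per_site[provider])
            for site, per_site in counts.items()
            if provider in per_site
        }
        top = max(raw.values())
        if top == 0:
            continue  # provider has no usable data
        for site, value in raw.items():
            best[site] = max(best[site], value / top)
    return best

# Hypothetical counts for three universities from two providers.
sample = {
    "univ_a": {"majestic": (100, 10), "ahrefs": (400, 5)},
    "univ_b": {"majestic": (25, 4), "ahrefs": (100, 20)},
    "univ_c": {"majestic": (16, 5), "ahrefs": (36, 10)},
}
print(impact_indicator(sample))
```

Because the domain count enters linearly and the backlink count only through its square root, a site with many links from few domains scores lower than one with the same links spread over many domains, which matches the slide's emphasis on link diversity.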
151. Interlinkage among world universities
• Barnett, G.A., Park, H. W., Jiang, K., Tang, C.,
& Aguillo, I. F. (2013 forthcoming). A Multi-Level Network Analysis of Web-Citations
Among The World’s Universities.
Scientometrics*.
Isidro F. Aguillo
“Large interlinking matrix (1000*1000) are no
longer possible to obtain. Perhaps national
academic systems (200 or 300 institutions)”
152. Intentional inattention
among Information Scientists?
• Robert Ackland (2013). Web Social Science.
- http://voson.anu.edu.au/
• Richard Rogers (2013). Digital
Methods.
- https://www.issuecrawler.net/index.php
- https://www.digitalmethods.net/Dmi/Tool
Database
153. Let us move to Web Visibility Analysis
Frequently occurring key words in e-science webpages in Korea
Created on Many Eyes(http://many-eyes.com)
Words are larger according to the frequency of their occurrence but their
positions are randomly-chosen for the best visualization
154. Websites retrieved more than two times
Note: Websites are larger according to their frequency of retrieval; however, their
colors and locations are randomly-chosen for the best visualization
155. 2nd type of Webometrics: Web Visibility
Web visibility as an indicator of online political power
Presence or appearance of actors or issues being discussed by
the public (Internet users) on the web.
Tracking web visibility is powerful way to get an insight into
public reactions to actors or issues.
Recent studies indicate positive relationships
between politicians’ web visibility levels and election results.
Also, the co-occurrence web visibility between two
politicians represents their hidden online political
relationships based on the public perception.
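159. Using e-research tools: web visibility analysis
In the blogosphere, candidates’ web visibility levels correlated closely with
their vote counts (Lim & Park, 2010, JKDAS).
[Chart: actual votes and average number of blogs per candidate. Candidates: Kyeong Dae-su (경대수), Jeong Beom-gu (정범구), Jeong Won-heon (정원헌), Park Gi-su (박기수), Lee Tae-hui (이태희), Kim Gyeong-hoe (김경회). Values shown: 29,120; 19,427; 14,218; 3,071; 2,125; 504.]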
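161. I. Characteristics and influence of social media
Case: the October 26 by-election (2)
• Built and analyzed a name network from names mentioned together on Facebook
• In the early phase the two candidates were mentioned about equally;
in the middle phase, as Park Won-soon and his supporters were mentioned,
supporters of candidate Na Kyung-won dropped out of view;
in the final phase the network reorganized around Park Won-soon and the campaign ended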
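162. I. Comparing centrality in the semantic network
Case: the October 26 by-election (2)
• Frequency analysis of the words appearing in the content of messages
about the Seoul mayoral election
• Candidate Na Kyung-won’s frequency declined from the early phase; apart from
late appearances in talk about her contest with candidate Park Won-soon and
about the election result, she remained on the periphery of the discourse throughout
• The Ahn Cheol-soo effect was large early on and faded after the middle phase;
however, frequent mentions of the Hannara Party (Grand National Party) revealed
sentiment against the ruling party, which indicates the character of the election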
163.
As Lim & Park (2011, 2013)
claim, the use of web
mentions of politicians’
names is particularly useful
for hierarchically ranking
individual politicians.
However, it may not
sufficiently capture the
entropy probability of an
event (hidden in changing
communication structures)
resulting from the amount of
information conveyed by the
occurrence of that event
(Shannon, 1948).
164.
Taleb (2012) argues that society
can be conceived as a complex
fabric consisting of the extended
disorder family including
uncertainty, chance, entropy, etc.
Therefore, such disorder system
can be better derived from
empirical data mining, not
obtained by a priori theorem.
Uncertainty exists when three or
more events take place
simultaneously and is
increasingly beyond the control of
individual events (Leydesdorff,
2008).
165.
In social and communication
sciences, entropy-based
indicators have been widely
used for exploring entropy
values generated from
university-industry-government (UIG)
relationships.
This “Triple Helix Model”
(THM) can be applied to
the co-occurrence of
two or three terms in
the public search engine
database
166. Mapping Election Campaigns Through Negative Entropy:
Triple and Quadruple Helix Approach
to Korea’s 2012 Presidential Election
Social media platforms have become a notable venue for Korean
voters wishing to share their opinions and predictions with others
(Park et al., 2011; Sams & Park, 2013).
Politicians have made increasing use of SNSs to provide updates
and communicate with citizens (Hsu & Park, 2012).
With the increasing proliferation of smartphones and portable
computers in Korea, SNSs have been widely used for facilitating
political discourse.
Prior studies have found that Web 1.0 contents tended to contain the
more enduring political and electoral statements of the public in
various contexts.
167. Introduction
To better understand the dynamics of the 2012 presidential election
in Korea, this study estimates the web visibility of the three major
candidates— Geun-Hye Park (PARK), Cheol-Soo Ahn (AHN), and
Jae-In Moon (MOON)—in the entire digital sphere.
168. Literature Review
The total probabilistic entropy (uncertainty) produced by changes in one or
two dimensions is always positive, which is in accordance with the second
law of thermodynamics (Theil, 1972, p. 59).
On the other hand, the relative contribution of each event to the
summation in three or four dimensions can be positive, zero, or negative
(configurational information).
This configurational information provides a measure of synergy within a
complex communication system. Network effects occur in a systemic and
nonlinear manner when loops in the configuration generate redundancies
in relationships between three or four events (Leydesdorff, 2008).
169. Method: Data collection
The number of hits for each search query per media
channel (Facebook, Twitter, and Google) was harvested.
The hit counts obtained from Google.com were
employed to look primarily at entropies represented on a
set of digitally accessible documents (e.g., online
versions of newspapers, online word-of-mouth, Web 1.0
contents, etc.).
We measured the occurrence and co-occurrence of the
politicians’ names based on their bilateral, trilateral, and
quadruple relationships by using Boolean operators.
For example, we measured the number of web and
social media mentions referring only to PARK (that is, no
mention of AHN, MOON, or the term “president”).
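The Boolean queries described above might be generated as in the following Python sketch. The query syntax (quoted terms joined by AND, with a leading minus sign for exclusion) is an assumption for illustration; the slides do not give the exact operators used on each platform, nor whether co-occurrence queries also excluded the remaining terms. The function names are hypothetical.

```python
from itertools import combinations

# The four search terms named on the slide.
TERMS = ("PARK", "AHN", "MOON", "president")

def build_query(included, all_terms=TERMS):
    """Require every term in `included` and exclude all other terms.

    Uses an assumed syntax: "A" AND "B" -"C" -"D".
    """
    required = " AND ".join(f'"{t}"' for t in included)
    excluded = " ".join(f'-"{t}"' for t in all_terms if t not in included)
    return f"{required} {excluded}".strip()

def all_queries(all_terms=TERMS):
    """One query per non-empty subset: single, bilateral, trilateral, quadruple."""
    return {
        subset: build_query(subset, all_terms)
        for size in range(1, len(all_terms) + 1)
        for subset in combinations(all_terms, size)
    }

for subset, query in all_queries().items():
    print(subset, "->", query)
```

Four terms yield 15 non-empty subsets, so each media channel would need 15 hit counts per period under this scheme.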
171. Literature Review
Twitter can be very effective to amplify messages particularly in terms of their
one-to-many mode of communication (Barash & Golder, 2010).
Twitter is viable both as a political news and communication channel
(González-Bailón, Borge-Holthoefer, Rivero & Moreno, 2011; Hsu & Park,
2011, 2012; Otterbacher, Shapiro, & Hemphill, 2013)
and to citizens who look for platforms for political participation and engagement
(Hsu, Park, & Park, 2013; Kim & Park, 2011; Tufekci & Wilson, 2012).
172. Literature Review
The mode of information sharing on Facebook differs from that on Twitter.
Facebook functions as a living room where friends talk to one another.
Facebook can be a mixture of interpersonal and mass channels for the sharing of
informational as well as social messages in a context of political campaign (Bond
et al., 2012; Effing, van Hillegersberg, & Huibers, 2011; Robertson, Vatrapu, &
Medina, 2010; Vitak et al., 2011).
Both Twitter and Facebook communications seem to be biased because the two
platforms have been particularly dominated by the “2040 Generation”, who are
generally categorized as political liberals in Korea (Kwak et al., 2011).
173. Research questions
Therefore, it is important to examine which (social) media
conversations are likely to generate more entropy than
others, and which politicians do so:
RQ 1) What (social) media generate (negative) entropy more than
others across different periods?
RQ 2) Which politician (or which pair of politicians) generates
entropy more than others for bilateral, trilateral, or quadruple
relationships across various media and periods?
175.
Entropy values (expressed as T for transmission)
for bilateral relationships are, by definition,
positive. Here T is defined as the difference in
uncertainty when the probability distributions of
two incidents (e.g., i and j) are combined. The
mutual information transmission capacity,
expressed in T values, is measured by “bits” of
information (for a more detailed mathematical
definition, see Leydesdorff, 2003):
$$H_i = -\sum_i p_i \log_2 p_i, \qquad H_{ij} = -\sum_i \sum_j p_{ij} \log_2 p_{ij}$$
$$H_{ij} = H_i + H_j - T_{ij}, \qquad T_{ij} = H_i + H_j - H_{ij} \tag{1}$$
Here Tij is zero if the two distributions are mutually
independent and positive otherwise (Theil, 1972).
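As a worked illustration of the bilateral transmission, the Python sketch below computes T from a joint probability table. The two example tables are invented and are not data from the study; the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def transmission(joint):
    """T_ij = H_i + H_j - H_ij for a 2-D table of joint probabilities."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    h_i = entropy(row_marginals)
    h_j = entropy(col_marginals)
    h_ij = entropy([p for row in joint for p in row])
    return h_i + h_j - h_ij

independent = [[0.25, 0.25], [0.25, 0.25]]  # knowing i says nothing about j
dependent = [[0.5, 0.0], [0.0, 0.5]]        # i fully determines j

print(transmission(independent))
print(transmission(dependent))
```

The independent table gives T = 0 bits and the fully dependent one gives T = 1 bit, consistent with the statement that T is zero for mutually independent distributions and positive otherwise.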
176.
On the other hand, T values for trilateral and quadruple
relationships can be negative, positive, or zero depending on the
size of contributing terms. Therefore, it is necessary to compare
the absolute value of each (negative) entropy value when entropy
values are calculated for trilateral and quadruple relationships. In
the case of entropy values for trilateral and quadruple
relationships, the higher the absolute entropy value, the more
balanced the communication system is. Let p denote PARK; a,
AHN; and m, MOON and formulate mutual information in these
three dimensions as follows (Abramson, 1963, p. 129):
$$T_{pam} = H_p + H_a + H_m - H_{pa} - H_{pm} - H_{am} + H_{pam} \tag{2}$$
Here we are interested not only in information on mutual
relationships between these three candidates but also in semantic
relationships with respect to the term “president.” Accordingly, we
measure the entropy value by using mutual information in these
four dimensions (here “r” denotes “president”):
$$T_{pamr} = H_p + H_a + H_m + H_r - H_{pa} - H_{pm} - H_{pr} - H_{am} - H_{ar} - H_{mr} + H_{pam} + H_{par} + H_{pmr} + H_{amr} - H_{pamr} \tag{3}$$
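The alternating-sign pattern in the trilateral and quadruple formulas can be sketched generically in Python: entropies of odd-sized subsets of dimensions are added and those of even-sized subsets are subtracted. The joint distributions below are invented toy examples (not the election data), and the function names are hypothetical.

```python
import math
from itertools import combinations, product

def entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, axes):
    """Sum a joint distribution {state_tuple: prob} down to the given axes."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[a] for a in axes)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(joint, n_dims):
    """Add entropies of odd-sized subsets of dimensions, subtract even-sized ones."""
    total = 0.0
    for size in range(1, n_dims + 1):
        sign = 1 if size % 2 == 1 else -1
        for axes in combinations(range(n_dims), size):
            total += sign * entropy(marginal(joint, axes).values())
    return total

# Toy case 1: three independent fair binary events.
independent = {state: 1 / 8 for state in product((0, 1), repeat=3)}
# Toy case 2: the third event is the XOR of the first two.
xor = {(x, y, x ^ y): 1 / 4 for x in (0, 1) for y in (0, 1)}

print(mutual_information(independent, 3))
print(mutual_information(xor, 3))
```

The independent case gives 0 bits, while the XOR case gives -1 bit: no pair of events shares information, yet the three together are fully determined, which is the kind of negative value the slide refers to. Passing `n_dims=4` with four-element states applies the same rule to the quadruple case.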
180. Discussion and conclusions
Twitter has scored the most negative entropy
values and Facebook followed. Google came last.
This indicates that Twitter is the most open
communication system.
The entropy values for liberal candidates (AHN and
MOON) have been higher than those of their conservative
opponent PARK, more so on social media than in the Google
sphere.
This may not be surprising because both Twitter
and Facebook have particularly appealed to
Korean citizens from their late teens to
early 40s.
181. Discussion and conclusions
PARK’s entropy has been slightly higher on
Google than that of her liberal challenger MOON.
PARK was successful in garnering strong support
from senior voters in their 50s and 60s, who accounted
for 39% of the population, up from 29% a decade
ago (Wall Street Journal, 2012).
Exit polls also revealed that PARK gained support
from 62% of voters in their 50s and 72% of voters
in their 60s. Indeed, the most significant statistic on
the election was turnout: 65.2%, 72.5%, and 78.7% of
South Koreans in their 20s, 30s, and 40s respectively
went to the polling booth, compared with 89.9% of those
in their 50s and 78.8% of those over 60.
182. Paper-code
Keynote Speech
“Creativity and TRIZ for the Knowledge Network
Analysis in the Emerging Big Data Research”
- DISC 2013 2013. 12. 14.
Dr. Jae Ho Park, Ph.D.
Managing Director of GRCIOP
Professor Emeritus Jae Ho Park
Yeungnam University
183. Curriculum Vitae
December 14, 2013
Professor emeritus Jae H. Park, Ph.D
-
Professor Emeritus , Industrial and Organizational Psychology,
Yeungnam University, South Korea
- Chairman, Global TRIZ Conference, Organizing Committee
- Chairman, Korean Society of Creativity
- Managing Director, GRCIOP Research Center
- Senior Advisor, ICEDR (International Consortium for Executive
Development Research), Boston, USA
- Ph.D., Organizational Psychology, Goettingen University, Germany
- MA, Social Psychology, Seoul National University
- BA, Seoul National University
<Academic Career> -
Harvard University, Research Professor. USA
University of Michigan, Exchange Professor, Ann Arbor, Michigan, USA
Yokohama National University, Research Fellow Professor, Japan
CSPP(California School of Professional Psychology), Teaching Professor, 1999-2000
Senior Advisor, ICEDR(International Consortium for Executive Development Research), USA
Visiting Professor, Meio University, Japan, current
Partner, THT Cross-cultural Consulting, Amsterdam, the Netherlands
Partner, SYMLOG Consulting Group, San Diego, USA
Licensee, Center for Creative Leadership (CCL), Greensboro, USA,
Partner, Global Integration, UK
184. <International Consulting and Training>
Samsung Electronics; Creativity and Innovation “Change Begins with Me”
Samsung New Management, Train the trainers for 6,000 managers.
JMA (Japan Management Association) and FMIC (Future Management and
Innovation Consulting, Japan), SYMLOG Diagnosis, Team-building and
Coaching, Tokyo, Japan
- LG Philips Displays, M & A Process Consultation, Coaching, Diagnosis
LG Electronics, DAC(White electronics Division), Changwon, Korea
Hyundai Motor Company, Creativity and Innovation Program, Korea
Samsung Electronics, Large Scale Change, Korea
BorgWarner, Detroit, USA
Ericsson, Sweden
Applied Materials Korea, Coaching and Consultation, Seoul, Korea
Goldman Sachs, Integration Project Coaching, with THT Consulting Group, 2007
MetLife, Coaching for Asset Managers, 2007
Mirae Assets Stock Company, Creativity Coaching, 2010
Team-building and Innovation, Trondheim University, Norway
185. <International Network>
Center for Creative Leadership, Partner, Licensee, North Carolina, USA
SPGR Consulting, Oslo, Norway
JMAC(Japan Management Association Consulting) Tokyo, Japan
SYMLOG Consulting Group, Researcher and Partner, San Diego, USA
Global Integration, Partner, London, United Kingdom
Japan Creativity Research Center, Partner, Tokyo, Japan
THT Cross-cultural Consulting(Trompenaars & Turner), Amsterdam, Partner,
the Netherlands
ICEDR(International Consortium for Executive Development Research) Boston, USA
<Consultant and Advisor >
Samsung HRD
Center
Samsung Electronics
Samsung SDI
LG Education Center
LG Electronics
POSCO HRD Center
<Contact>
Phone; 82-53-810-2230(Office)
Fax; 82-53-810-4610
Mobile; 82-10-8751-7579
email; grciop@gmail.com
186. TRIZ Founder
G. S. Altshuller
(1926~1998)
Father of TRIZ
Global TRIZ Conference 2013 | www.koreatrizcon.kr
Seoul Trade Exhibition & Convention, Seoul, Korea | July 09-11, 2013
187. What is TRIZ?
“TRIZ is a tool for thinking,
but not instead of thinking”
G. Altshuller
194. Research Areas
◦ Understanding creative cognition and
computation
◦ Creativity to stimulate breakthrough in
science and engineering
◦ Educational approaches that encourage
creativity
◦ Supporting creativity with IT
197. Edison and Altshuller
• Everybody can be an Inventor
• TRIZ Diffusion; No cost
• Developed TRIZ in Prison
• Benevolent Mentor
• (Dialectics; ideal Communist)
200. Various views on TRIZ
• From Knowledge Management
• From 6 Sigma
• From Engineering Design
• From Innovation
• From Creativity
• From R&D
• Etc…
202. TRIZ as a Science
Technical
Systems
Social
Systems
Natural
Systems
TRIZ
N&A Narbut, 2003
203. 5 Levels of Invention
① Apparent Solution (32%)
- Simple
② Simple Improvement within current system (45%)
③ Major improvement (18%)
- within same science
④ Innovation within current system (4%)
- Application of a different science principle
⑤ Pioneer Invention (1%)
- New principle and Paradigm Shift
205. Common Approach vs. TRIZ
Common Approach:
- Innovation involves the creation of new ideas
- Trained in the notion of the ‘great idea’. Popular mythology - “Einstein” as model. Belief that ‘six months in the lab beats one hour spent in the library’.
TRIZ:
- Innovation involves adapting existing ideas
- Tap existing solutions. Look outside of discipline and to Nature. Key benefit: reduces perceived risk of innovation (predictable, higher chance of success).
207. Creativity and TRIZ
Korea Academic TRIZ Association
Industry-Academia Knowledge sharing
Contributor for industry competitiveness and
creative talent by TRIZ
Founded in May 2010
Participating universities and companies: 29 Univ., 32 Co.
Homepage: www.katatriz.or.kr
208. Main Activities
Expanded use of
TRIZ and social
contribution
Evolution
Nurturing
creative talent
MATRIZ & KATA
MOU
Problem-solving,
Patent-creation
Biz. TRIZ research
Univ. professor
Workshop
Anti-school violence
program
TRIZ education
Charity fair
TRIZ Youth Academy
Lectures
for SMEs
Consulting for SMEs
problem-solving
Technical TRIZ
application
2010
2011
2012
2013
Time
209. TRIZ Activities in Korea
Company : Development of Innovative Products,
Problem-Solving and Patents Creation
Core tech & innovative product
Foundation of TRIZ Univ.
TRIZ Elite
Development of POSCO methodology
TRIZ research group
Internal TRIZ Conference
Mixing DFSS & TRIZ
Strategic R&D patent creation
Patent creation
On-site TRIZ process designed to
TRIZ research group
improve on-site work performance
210. TRIZ Activities in Korea
University : Utilizing TRIZ in subject of “Creative design”
POSTECH
Master course curriculum
TRIZ Project organization
YONSEI
Creative engineering education
Inter-discipline activities and courses
Engineering certification program
HANYANG
Creative design education
Business management and
creative design curriculum
POLYTECHNIC
Mechanical engineering-focused courses
KOREA/RUSSIA cooperation center
※ TRIZ application supported by the government and research institutions
(i.e. Ministry of Trade, Industry and Energy and ETRI)
212. Recognition that
(technical) systems evolve
Towards the increase of ideality
By overcoming Contradiction
Mostly with minimal introduction of (free) Resources
Thus, for creative problem solving
TRIZ provides a dialectic way of thinking, i.e.,
To understand the problem as a system
To imagine the Ideal solution first
And solve Contradiction
213. GRCIOP Global Network
ICEDR (International Consortium for Executive
Development Research) (USA)
Global Integration(United Kingdom)
SYMLOG Consulting Group(USA)
Center for Creative Leadership(USA)
THT Consulting(the Netherlands)
Endre Sjovold Association(Norway)
215. The context
The rise of “new media” has transformed politics,
economics, and societies.
But, “Internet Studies” as a field ignores the
geopolitical issues associated with the rise of new
media technologies
Lots of emphasis on “politics” and the internet, but little on the
relations between states
“Arab Spring”-events occur, but the focus remains primarily on
a domestic context
Likewise, traditional IR theory focuses primarily on
elite level strategy, and doesn’t have the tools to
account for publics
217. Issue 1: The implications of a “networked” globe
on geopolitics
Shifting configurations of influence
Networked, rather than hierarchical
Highly transnational
“foreign” vs “domestic” doesn’t capture the reality
The conversation has become global, especially among
elites
Values
Politics
Economics
But, influence depends on your connectedness to the
global conversation
Thus, dependent on access to technological infrastructure
238. Issue 2: Information Access/Control
Crowd Sourced
Unprecedented access to sensitive information
Stratified
Customized
“The spread of information networks is forming a
new nervous system for our planet. When something
happens in Haiti or Hunan, the rest of us learn about
it in real time-from real people.”
US Sec. of State Hillary Clinton, 2010
239. Wikileaks: Crowd-sourced espionage or
invaluable public service?
Revealed US war plans
and operations, as well as
diplomatic secrets
Led to multiple
recriminations, including
attempted assassination
of Saudi ambassador
Snowden: hero or
traitor?
242. Issue Three: Policies
Re-articulation of “national interest”
Alec J. Ross and “21st Century Statecraft”
“addresses new forces propelling change in international
relations that are pervasive, disruptive, and difficult to
predict.” US Dept of State
Perhaps what we can predict
Publics more important than elites
Don’t assume you can keep secrets
Companies comply with national laws more for reputational
reasons than for fear of sanction
243. The Internet Freedom Agenda
“Countries that restrict free access to information or
violate the basic rights of internet users risk walling
themselves off from the progress of the next
century.” Hillary Clinton, January 2010, Remarks
on Internet Freedom
“Let’s be clear. This disclosure is not just an attack
on America-it’s an attack on the international
community.” Hillary Clinton, November 2010, after
the Wikileaks release.
Conclusion: no set of easy answers
244. Final thoughts…..
We need far more sustained attention to the impact
of new media in between states, as well as within
states.
Unrealistic to simply say “NO,” no matter how loudly
we say it. The technology won’t be unmade.
We are in uncharted, and largely unstudied,
territory, and our policies are being driven by what is
technically feasible, rather than what is desirable.
245. A project from the Social Media Research Foundation: http://www.smrfoundation.org
246. About Me
Introductions
Marc A. Smith
Chief Social Scientist
Connected Action Consulting Group
Marc@connectedaction.net
http://www.connectedaction.net
http://www.codeplex.com/nodexl
http://www.twitter.com/marc_smith
http://delicious.com/marc_smith/Paper
http://www.flickr.com/photos/marc_smith
http://www.facebook.com/marc.smith.sociologist
http://www.linkedin.com/in/marcasmith
http://www.slideshare.net/Marc_A_Smith
http://www.smrfoundation.org
248. Social Media Research Foundation
People
Disciplines
Institutions
University
Faculty
Computer Science
University of Maryland
Students
HCI, CSCW
Oxford Internet Institute
Industry
Machine Learning
Stanford University
Independent
Information Visualization
Microsoft Research
Researchers
UI/UX
Illinois Institute of
Technology
Developers
Social Science/Sociology
Connected Action
Network Analysis
Cornell
Collective Action
Morningside Analytics
249. What we are trying to do:
Open Tools, Open Data, Open Scholarship
• Build the “Firefox of GraphML” – open tools for
collecting and visualizing social media data
• Connect users to network analysis – make
network charts as easy as making a pie chart
• Connect researchers to social media data sources
• Archive: Be the “Allen Very Large Telescope Array”
for Social Media data – coordinate and aggregate
the results of many users’ data collection and
analysis
• Create open access research papers & findings
• Make “collections of connections” easy for users
to manage
250. What we have done: Open Tools
• NodeXL
• Data providers (“spigots”)
– ThreadMill Message Board
– Exchange Enterprise Email
– Voson Hyperlink
– SharePoint
– Facebook
– Twitter
– YouTube
– Flickr
251. What we have done: Open Data
• NodeXLGraphGallery.org
– User generated collection
of network graphs,
datasets and annotations
– Collective repository for
the research community
– Published collections of
data from a range of social
media data sources to help
students and researchers
connect with data of
interest and relevance
256. There are many kinds of ties….
Send, Mention,
Like, Link, Reply, Rate, Review, Favorite, Friend, Follow, Forward, Edit, Tag, Comment, Check-in…
http://www.flickr.com/photos/stevendepolo/3254238329
257. Social Network Theory
http://en.wikipedia.org/wiki/Social_network
• Central tenet
– Social structure emerges from the aggregate of relationships (ties) among members of a population
• Phenomena of interest
– Emergence of cliques and clusters from patterns of relationships
– Centrality (core), periphery (isolates), betweenness
• Methods
– Surveys, interviews, observations, log file analysis, computational analysis of matrices
Source: Richards, W. (1986). The NEGOPY network analysis program. Burnaby, BC: Department of Communication, Simon Fraser University. pp.716
(Hampton & Wellman, 1999; Paolillo, 2001; Wellman, 2001)
258. SNA 101
• Node
– “actor” on which relationships act; 1-mode versus 2-mode networks
• Edge
– Relationship connecting nodes; can be directional
• Cohesive Sub-Group
– Well-connected group; clique; cluster
• Key Metrics
– Centrality (group or individual measure)
  • Number of direct connections that individuals have with others in the group (usually look at incoming connections only)
  • Measure at the individual node or group level
– Cohesion (group measure)
  • Ease with which a network can connect
  • Aggregate measure of shortest path between each node pair at network level reflects average distance
– Density (group measure)
  • Robustness of the network
  • Number of connections that exist in the group out of 100% possible
– Betweenness (individual measure)
  • # shortest paths between each node pair that a node is on
  • Measure at the individual node level
• Node roles
– Peripheral – below average centrality
– Central connector – above average centrality
– Broker – above average betweenness
[Figure: example network diagrams with nodes labelled A to I]
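The metrics and node roles listed on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the five-node graph is made up (it is not the network drawn on the slide), ties are treated as undirected so centrality is plain degree rather than incoming connections, betweenness uses Brandes' algorithm for unweighted graphs, and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def degree(adj):
    """Centrality as the number of direct connections per node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def density(adj):
    """Existing undirected ties divided by the number of possible ties."""
    n = len(adj)
    ties = sum(len(nbrs) for nbrs in adj.values()) / 2
    return ties / (n * (n - 1) / 2)

def average_distance(adj):
    """Cohesion as the mean shortest-path length over connected node pairs."""
    total = pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

def betweenness(adj):
    """Number of shortest paths between other node pairs that pass through each node."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

def roles(adj):
    """Label nodes using the slide's 'above/below average' role definitions."""
    deg, btw = degree(adj), betweenness(adj)
    mean_deg = sum(deg.values()) / len(deg)
    mean_btw = sum(btw.values()) / len(btw)
    labels = {}
    for v in adj:
        labels[v] = []
        if deg[v] < mean_deg:
            labels[v].append("peripheral")
        if deg[v] > mean_deg:
            labels[v].append("central connector")
        if btw[v] > mean_btw:
            labels[v].append("broker")
    return labels

# Hypothetical example: a triangle A-B-C with a tail C-D-E.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
print(degree(graph))
print(density(graph), average_distance(graph))
print(betweenness(graph))
print(roles(graph))
```

In this toy graph, C is both a central connector and a broker, D is a broker only, and E is peripheral, which shows how the role labels can diverge from simple degree.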
260. NodeXL
Free/Open Social Network Analysis add-in for Excel 2007/2010 makes graph
theory as easy as a pie chart, with integrated analysis of social media sources.
http://nodexl.codeplex.com
263. Goal: Make SNA easier
• Existing Social Network Tools are challenging
for many novice users
• Tools like Excel are widely used
• Leveraging a spreadsheet as a host for SNA
lowers barriers to network data analysis and
display
275. Social Network Maps Reveal
Key influencers in any topic.
Sub-groups.
Bridges.
276. NodeXL
Network Overview Discovery and Exploration add-in for Excel 2007/2010
A minimal network can
illustrate the ways different
locations have different values
for centrality and degree
280. Welser, Howard T., Eric Gleave, Danyel Fisher,
and Marc Smith. 2007. Visualizing the Signatures
of Social Roles in Online Discussion Groups.
The Journal of Social Structure. 8(2).
Experts and “Answer People”
Discussion people, Topic setters
Discussion starters, Topic setters
311. SNA questions for social media:
1. What does my topic network look like?
2. What does the topic I aspire to be look like?
3. What is the difference between #1 and #2?
4. How does my map change as I intervene?
What does #YourHashtag look like?
320. What is Social Network Analysis?
How is it useful for the humanities?
1. New framework for analysis
2. Data visualization allows new perspectives – less linear, more comprehensive
Social Network Analysis and Ancient History
Diane H. Cline, Ph.D.
University of Cincinnati
322. The Content summary
spreadsheet displays the most
frequently used URLs, hashtags,
and user names within the
network as a whole and within
each calculated sub-group.
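A content summary of this kind can be approximated with simple counting. The sketch below is not NodeXL's implementation: the regular expressions, the `content_summary` function, the group labels and the sample posts (including the user names) are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Simplified patterns (assumptions); real social media text needs more care.
URL = re.compile(r"https?://\S+")
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")

def content_summary(posts, top_n=3):
    """posts: iterable of (group, text) pairs.

    Returns {group: {"urls": [...], "hashtags": [...], "users": [...]}},
    plus an "ALL" entry covering the network as a whole.
    Assumes no sub-group is itself named "ALL".
    """
    counters = defaultdict(
        lambda: {"urls": Counter(), "hashtags": Counter(), "users": Counter()}
    )
    for group, text in posts:
        urls = URL.findall(text)
        # Remove URLs first so "#fragment" parts are not counted as hashtags.
        rest = URL.sub(" ", text)
        hashtags = [h.lower() for h in HASHTAG.findall(rest)]
        users = [m.lower() for m in MENTION.findall(rest)]
        for key in ("ALL", group):
            counters[key]["urls"].update(urls)
            counters[key]["hashtags"].update(hashtags)
            counters[key]["users"].update(users)
    return {
        group: {kind: c.most_common(top_n) for kind, c in kinds.items()}
        for group, kinds in counters.items()
    }

# Hypothetical posts already assigned to sub-groups G1 and G2.
sample_posts = [
    ("G1", "Mapping #sna with @nodexl http://nodexl.codeplex.com"),
    ("G1", "More #SNA maps via @NodeXL"),
    ("G2", "#disc2013 talk by @marc_smith http://www.smrfoundation.org"),
]
summary = content_summary(sample_posts)
print(summary["ALL"]["hashtags"])
print(summary["G1"]["users"])
```

Lower-casing hashtags and user names before counting is a design choice here, so that "#SNA" and "#sna" are treated as the same item.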
326. NodeXL as a Teaching Tool
I. Getting Started with Analyzing Social Media Networks
1. Introduction to Social Media and Social Networks
2. Social media: New Technologies of Collaboration
3. Social Network Analysis
II. NodeXL Tutorial: Learning by Doing
4. Layout, Visual Design & Labeling
5. Calculating & Visualizing Network Metrics
6. Preparing Data & Filtering
7. Clustering & Grouping
III. Social Media Network Analysis Case Studies
8. Email
9. Threaded Networks
10. Twitter
11. Facebook
12. WWW
13. Flickr
14. YouTube
15. Wiki Networks
http://www.elsevier.com/wps/find/bookdescription.cws_home/723354/description
327. What we want to do:
(Build the tools to) map the social web
• Move NodeXL to the web: (Node[NOT]XL)
– Node for Google Doc Spreadsheets?
– WebGL Canvas? D3.JS? Sigma.JS
• Connect to more data sources of interest:
– RDF, MediaWikis, Gmail, NYT, Citation Networks
• Solve hard network manipulation UI problems:
– Modal transform, Time series, Automated layouts
• Grow and maintain archives of social media network data sets for
research use.
• Improve network science education:
– Workshops on social media network analysis
– Live lectures and presentations
– Videos and training materials
328. NodeXL Results
• Easy to learn, yet powerful and insightful
• Widely used by both students and researchers
• Free and open source software
• World-wide team of collaborators
Malik S, Smith A, Papadatos P, Li J, Dunne C, and Shneiderman B (2013), "TopicFlow: Visualizing topic
alignment of Twitter data over time", In ASONAM '13.
Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps
to NetViz Nirvana: Evaluating social network analysis with NodeXL", In CSE '09. pp. 332-339.
DOI:10.1109/CSE.2009.120
Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from
overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.
Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E
(2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264.
DOI:10.1145/1556460.1556497
329. How you can help
• Sponsor a feature
• Sponsor workshops
• Sponsor a student
• Schedule training
• Sponsor the foundation
• Donate your money, code, computation, storage, bandwidth, data or employees’ time
• Help promote the work of the Social Media Research Foundation
332. A project from the Social Media Research Foundation: http://www.smrfoundation.org
334. International Collaboration &
Green Technology Generation
Assessing the East Asian
Environmental Regime
Matthew A. Shapiro
Illinois Institute of Technology
matthew.shapiro@iit.edu
335. Impetus
• Shapiro and Nugent (2012) “Institutions and the
sources of innovation” in IJPP
• Total factor productivity is hindered by collaboration if
institutions are absent or if not beyond TFP threshold
• Shapiro (2013) “Regionalism’s challenge to the
pollution haven hypothesis” in Pacific Review
• Regional efforts to eliminate pollution are
multifaceted
• Support
• East Asia Institute
• Asiatic Research Institute, Korea University
336. [Diagram: conceptual model. International institutions and regional institutions (with links "To other regions") sit above three countries. Each of Country 1, Country 2 and Country 3 has five elements: FDI, ecologists, domestic R&D funding, institutions, and pollution. Signed arrows, (+) or (-), are labelled with the pollution haven hypothesis, the contra-pollution haven hypothesis (-), and the epistemic community hypothesis.]
342. Research Questions
• Are the Northeast Asian countries key
collaborators in pursuit of green R&D?
• Yes, particularly in recent years.
• Are the Northeast Asian countries
collaborating extensively with each other?
• Not as much as they collaborate with countries
beyond the region.
• Implications?
343. Green R&D
• Patents
• IPC Green Inventory
– Alternative energy production
– Transportation
– Energy conservation
– Waste management
– Agriculture/forestry
– Administrative aspects
– Nuclear power generation
344. Alternative energy production
• Biofuels
• Integrated gasification combined cycle
• Fuel cells
• Pyrolysis or gasification of biomass
• Harnessing energy from manmade waste
• Hydro energy
• Ocean thermal energy conversion
• Wind energy
• Solar energy
• Geothermal energy
• Other production or use of heat not derived from combustion
• Using waste heat
• Devices for producing mechanical power from muscle energy
Energy conservation
• Storage of electrical energy
• Power supply circuitry
• Measurement of electricity consumption
• Storage of thermal energy
• Low energy lighting
• Thermal building insulation, in general
• Recovering mechanical energy
345. Data Collection
• Source: USPTO
• Collection method: Leydesdorff’s tools
• Unit of analysis: country of inventor
346. Data Description
• Dates: 1990-2013
• 129,640 total inventors
• 242,331 total nodes based on country classification
• Assumption: Any collaboration is valued, so proportionate share of patent inventorship is ignored.
[Chart labels, inventor country codes: IL, BE, IN, IT, CN, CH, NZ, TW, all others, AU, KR, DK, CA, GB, NL, FR, US, DE, JP]
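One way to read the counting assumption above is "whole counting": a country is counted once per patent no matter how many of its inventors are listed, and every pair of distinct countries on a patent gets one tie. The sketch below illustrates that reading only. It is not the procedure used by the data collection tools named on the previous slide, the sample patents are invented, and the set of codes treated as Northeast Asian is an assumption.

```python
from collections import Counter
from itertools import combinations

# Assumption: the slides do not list which country codes form the region.
NORTHEAST_ASIA = {"CN", "JP", "KR", "TW"}

def country_network(patents):
    """patents: iterable of lists of inventor country codes, one list per patent.

    Returns (nodes, edges): per-country patent counts and per-pair
    collaboration counts, ignoring each country's share of inventors.
    """
    nodes = Counter()
    edges = Counter()
    for inventor_countries in patents:
        countries = sorted(set(inventor_countries))  # one count per country
        nodes.update(countries)
        edges.update(combinations(countries, 2))     # one tie per country pair
    return nodes, edges

def regional_split(edges, region=NORTHEAST_ASIA):
    """Count ties inside the region versus ties that cross its boundary."""
    intra = sum(n for (a, b), n in edges.items() if a in region and b in region)
    extra = sum(n for (a, b), n in edges.items() if (a in region) != (b in region))
    return intra, extra

# Hypothetical patents, each given as the countries of its inventors.
sample_patents = [
    ["US", "US", "JP"],
    ["KR", "JP"],
    ["KR", "CN", "US"],
    ["DE"],
]
nodes, edges = country_network(sample_patents)
intra, extra = regional_split(edges)
print(nodes)
print(edges)
print(intra, extra)
```

Comparing the intra-regional and extra-regional tie counts is one simple way to approach the second research question, although the real analysis may weight or filter ties differently.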
361. Implications
• Empirical
• R&D collaboration can be beneficial both intra- and
extra-regionally. Both are
happening extensively for Northeast Asia.
• Methodological
• Challenges of connecting these results to other
variables in model
• Longitudinal concerns: Change in connectedness?
• Qualitative, quantitative, mixed?
362. Assessing Social Media Coverage in
Japan: Before and After March 11, 2011
Leslie M. Tkach-Kawasaki
University of Tsukuba
DISC 2013, December 11, 2013
365. Social Media in Japan 2010-2011
Have used the following at least once…..
Blogs 77.3%
Video-sharing websites 62.8%
SNS 53.6%
Microblogs (Twitter) 30.9%
Source: 2010 White Paper on Information and Communications in Japan
366. The Year in Social Media 2010-11
International diplomacy: YouTube and Chinese
fishing vessel (September 2010)
Entertainment: Release of The Social Network
(October 2010)
International conflicts: Role of Twitter and
Facebook in Tunisia and Egypt (January 2011)
Disasters: New Zealand Earthquake (February
2011)
369. Research question….
Are there perceivable differences
in the discourse (phrases) about
social media in Japan’s
newspaper media before and after
March 11, 2011?