3. Background
• Corporate Social Responsibility: integrating
sustainability into business strategy to gain
social benefits and create business value
(Dauvergne & Lister, 2012).
• Corporate Sustainability:
A firm's awareness of environmental
protection issues and its incorporation of
ecological concerns and sustainable
development for long-term growth (Lu & Li,
2009).
4. Background
• Pohl (2006) noted that corporate social
responsibility (CSR) represents the broad
spectrum of a company’s corporate culture,
including values, beliefs, attitudes, and norms.
• Culture has been regarded as a critical element in
ethical business decision-making and PR
strategies (Kim & Kim, 2010).
• However, there is little empirical data on the
relation between culture and PR approaches.
5. The purpose of this study
• The literature on CSR and global environmental
management has been biased toward developed, Western
countries such as those in North America and Europe
(Leonidou & Leonidou, 2011).
• Few studies have addressed corporate CER approaches in
new media settings, although new media are the main platform
for reaching the widest range of consumers.
• This study investigates how large firms in Korea and China
employ CER communication through their websites, in order to
better understand the major approaches these firms take to
disclose their CER principles. It also compares the two countries
in terms of their cultural similarities and differences, using a
mixed-methods approach.
6. Why Korea and China?
• Emerging consumer markets
• Increasing corporate power and economic
influence in the world
• Serious environmental problems but relatively low
concern about environmental issues
• Relatively low effort toward, and awareness of,
environmental sustainability
7. Hofstede’s cultural dimensions
Uncertainty Avoidance: “The extent to
which the members of a culture feel
threatened by ambiguous or unknown
situations and have created beliefs and
institutions that try to avoid these.”
Individualism: “the degree of
interdependence a society maintains among
its members.”
South Korea is a more uncertainty-avoiding
culture than China.
Both countries have collectivistic cultures.
8. Research Questions
• RQ1: What are the semantic patterns of CER approaches
taken by Korean and Chinese large companies on their
websites?
• RQ2: What are the key themes in CER communication of
Korean and Chinese large companies on their websites?
• RQ3: How do cultural differences shape these CER
communication strategies?
9. Mixed-Methods
• Data Collection
• The Korean and Chinese corporations examined were sampled from each
country’s list of the 50 largest corporations by revenue.
A total of 44 Korean firms provided CER-related information,
whereas 32 Chinese firms did.
• Semantic Network Analysis
• This study employed semantic network analyses based on the 100
most frequently used keywords. A co-word analysis and a cluster
analysis (CONCOR) were conducted to identify key themes in the
texts.
• “It examines the relationships among a system's components based
on the shared meanings of symbols (Doerfel & Barnett, 1999).”
FullText, a network analysis tool, was used
(http://www.leydesdorff.net/software/fulltext/).
• Qualitative content analysis
• Identified key CER principles and the relation between cultural values and
prominent themes.
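The co-word step of the semantic network analysis on this slide can be sketched roughly as follows. This is a minimal illustration only, not the FullText tool's implementation: the toy corpus, the tokenizer, and the function names `tokenize`, `top_keywords` and `coword_counts` are all hypothetical, and co-occurrence is assumed to be counted per document.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(docs, n=100):
    """Return the n most frequent tokens (ties broken alphabetically)."""
    counts = Counter(token for doc in docs for token in doc)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


def coword_counts(docs, keywords):
    """Count, for each keyword pair, the number of documents containing both."""
    keep = set(keywords)
    pairs = Counter()
    for doc in docs:
        present = sorted(set(doc) & keep)
        for pair in combinations(present, 2):
            pairs[pair] += 1
    return pairs


# Hypothetical toy corpus: one string per corporate web page.
pages = [
    "Green energy management system",
    "Energy management and green growth",
    "Management of our environment",
]
docs = [tokenize(page) for page in pages]
keywords = top_keywords(docs, n=3)       # the study used the top 100
cowords = coword_counts(docs, keywords)  # weighted edges of the semantic network
```

The resulting pair counts can be read as a weighted edge list, which is the kind of input a clustering step such as CONCOR would then work on.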
10. Findings
Centrality of semantic network
• The Korean CER network (54.80%) was more centralized than
the Chinese CER network (38.69%).
• Prominent words in Korean CER: management (58.715),
we (49.599), green (44.522), our (37.743), energy (35.827),
system (33.55), environment (33.045)
• Prominent words in Chinese CER: energy (45.691), company
(45.691), we (34.87), development (34.62), china (32.03), our
(26.859), management (26.859)
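The slide does not say which centralization index produced the percentages above. As one common possibility, Freeman's degree centralization for an undirected, unweighted network can be sketched as below; the function name and the toy graphs are hypothetical, and edges are assumed to be unique with no self-loops.

```python
def degree_centralization(edges, nodes):
    """Freeman degree centralization of an undirected, unweighted graph.

    Returns 1.0 for a star (one hub linked to every other node) and 0.0
    when all nodes have the same degree.
    """
    n = len(nodes)
    if n < 3:
        raise ValueError("centralization needs at least three nodes")
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    max_degree = max(degree.values())
    # Sum of gaps to the most central node, divided by the largest
    # possible sum, which is reached by a star graph: (n - 1)(n - 2).
    gap_sum = sum(max_degree - d for d in degree.values())
    return gap_sum / ((n - 1) * (n - 2))


# Hypothetical toy networks of four words each.
star_nodes = ["management", "green", "energy", "system"]
star_edges = [("management", "green"), ("management", "energy"),
              ("management", "system")]
ring_nodes = ["a", "b", "c", "d"]
ring_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

star_score = degree_centralization(star_edges, star_nodes)
ring_score = degree_centralization(ring_edges, ring_nodes)
```

A higher score means the network is organized around a few dominant words, which is how the Korean/Chinese comparison on this slide is read.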
11. Density of network
• Korean corporations had a denser semantic network of CER
than Chinese corporations.
• Korea (Mean: 446.510, SD: 865.747)
• China (Mean: 163.929, SD: 212.001)
16. Key themes in Korean CER Network
• Risk minimization of global environmental issues
• Commitment and responsibility for environmental change
• Improvement of eco-technology
• Efficient use of resources and suitability management system
• Endorsement of green facilities
• Employee education & workplace security
• Internal and international management system of hazardous substances
• Collective efforts to embrace environmentalism
17. Key themes in Chinese CER Network
• Commitment to economy and society
• Resource conservation and environmental responsibility
• Regulation on management of environmental protection & consumer rights
• Development of green products & support for government
• Improvement of local economies and awareness of the global environment
• Implementation of national policies & social value
• Environmental concerns
• Advanced facilities
18. Discussion & Conclusion
• The results imply that Korean corporations focus on presenting their
capability and pragmatic skills to resolve environmental problems as
economic powers, while Chinese corporations are more concerned
with their brand image as socially responsible firms and with
stakeholder engagement.
• Korean and Chinese corporations framed CER principles and
practices differently: while the Korean corporations focused on
promoting eco-friendly technologies as a competitive strategy and
frequently used performance-related terms, the Chinese
corporations employed more collectivistic appeals, such as
commitment to local and global communities and partnerships with
NGOs and stakeholders.
• Hofstede’s cultural dimension of collectivism explains the similar
approaches to CER in both countries.
19. Discussion & Conclusion
• Korean firms were more strategic than Chinese companies in
articulating their environmental initiatives, visions, performance,
activities, and environmental concerns, and provided more
detailed reports.
• This can be explained by Korea’s high uncertainty-avoidance
culture, which corresponds to institutions’ concerns about their
surrounding environment and future conditions.
20. This study contributes to...
• The results provide theoretically meaningful insights for
evaluating corporate practices in Asia in terms of
communicating environmental management strategies and
their principles in the context of new media. Given the lack of
environmental management research in the business
communication domain, this study contributes to the
literature by empirically analyzing East Asian firms' campaign
performance.
• In addition, the study provides important methodological
implications for the analysis of corporate websites and
demonstrates how mixed methods can be used to extract and
analyze a large volume of text in a systematic way.
21. Limitations
• The number of sampled firms differed between the two countries,
although this reflects discrepancies in their actual
performance. Another explanation may be that the data
included only English versions of websites.
• This study considers only websites for examining CER, but
many firms now make increasingly active use of social media
such as Twitter and Facebook for marketing purposes.
23. Dr. Nick Guldemond
The Micro Foundations of Triple Helix, Workshop
May 26-27 2014, Grenoble Ecole de Management
User Groups in Triple Helix Interaction:
The Case of Living Labs in Health Innovation
Marina van Geenhuizen* and Nick Guldemond**
* TU Delft **University Medical Centre Utrecht
24. Road map
•Introduction: Grand societal health
challenges, user involvement and Living labs
•Research question
•Methodology: literature study and six case
studies
•Preliminary list of critical factors
•Results of case studies
•Conclusions on critical factors and future
research steps
25. Grand Societal Health Challenges
To keep health care affordable and make it more
effective and person-oriented, in a situation of
an ageing population and shrinking budgets!
26. Population aged (>65) in proportion to working population (18-65)
2007 EU
• average ~ 1:4
• differences between countries
2050 EU
• 1:3 (NL)
• 1:1.5 (Italy and Spain)
• EU average ~ 1:2
2050 China
• average ~ 1:<1
Prof.dr. Marina van Geenhuizen
28. Figure: two care models
• Medical curative model: community care, local hospital, university hospital
• Social (interconnected) health perspective: community care, advanced local care centres, highly specialized cure
30. Users and Living Labs
In the medical sector, there is more than
one user group:
•Patients, elderly people, etc.
•Family doctors
•Medical staff in clinics
•Clinics
Why involvement of user groups/customers
in design (co-creation)?
The design process turns out to be quicker and
more effective, as in the design of artificial limbs
(patients) and of surgery room equipment.
Living Labs are one way to involve user
groups/customers in innovation
32. Challenges of Living Labs
• Involvement of the right user groups
(motivation, capabilities)
• Positioning them in the network, given the
dynamic stakeholder situation of which the
Triple Helix partners (academia, industry and
government) are only a few (also insurance
companies, registration authorities, venture
capitalists, NGOs, etc.)
33. Research questions and methodology
Research Questions
What are the characteristics of user-groups in Living
Labs? In which ways are Triple Helix partners active
in Living Labs, and can user-groups in interaction
with them contribute to bringing new technology to
market?
Methodology:
• Evaluation of the literature: critical factors in
founding and managing Living Labs
• 6 in-depth case studies of medical Living Labs
(multiple data sources)
34. Character of Living Labs
Two operational levels (Følstad 2008)
• Open innovation networks or platforms in a city/region
• Real-life physical setting used for co-creation and testing with strong
involvement of user groups
Despite differences in size, setting, organization, driving actors, etc.,
there are three common characteristics:
• An early involvement of user groups
• A physical and/or social environment representing real-life
• Open networks of stakeholders sharing the desire to support a
better/quicker take up of inventions
35. Preliminary set of LL critical factors (literature
Criterion: Details
1. Involvement of user groups:
- Adequate model of involvement
- Selection of users (motivation and capabilities needed)
2. Composition & management of the network:
- Involvement of all relevant actors to create vertical cooperation in the value chain and horizontal cooperation (scale economies)
- Avoiding too many partners, avoiding dominance of a powerful one and strong interdependency between powerful partners
- Increasing openness and neutrality, including trust, to prevent one powerful partner from playing a ‘key role’ that deters other partners from participating
3. Structured process:
- Working with a transparent ‘funnel’ or other innovation model
- Working with clear go/no-go decisions
4. Role of ICT:
- Sufficient use of ICT in monitoring and analysis of user response in the design processes
- ICT should not be the main driver, unless its adoption is subject of analysis, like in ambient assisted living
5. Operational management:
- Quality management of the networks is required, enabling the balancing of partners’ interests and managing expectations (and trust)
- Transparency of distribution of tasks and cost/benefits over the partners
6. Practical requirements:
- Ethics/law: sufficient attention for ethical/legal issues, like users’ privacy and legal liability in case of failure
- Intellectual property (IP): sufficient attention necessary in early stage
36. Case studies and user groups
1. Doornakkers (NL) real-life: Elderly of Turkish origin
2. Living Labs Amsterdam (NL) real-life: Elderly, housing foundation
3. i360 Royal College of Surgeons (Ireland) real-life: Medical staff (surgeons)
4. Medical Field Lab (NL) platform+real-life: Mix of users
5. Pontes Medical (NL) platform+real-life: Mix of users
6. Healthcare Innovation Lab (DK) real-life: Hospitals, clinicians, patients
37. Critical factors concerning users (1)
Doornakkers (Eindhoven-NL)
• eHealth/domotics, safety (maintain independent living)
• Users: elderly of Turkish origin (isolated community)
• Role of users: rather passive (sometimes active)
• Triple Helix: disconnected from university
• Success factor users: preparation study of needs; trust
creation (coach of Turkish origin); ICT well managed
Living lab Amsterdam-NL
• eHealth/domotics, safety (maintain independent living)
• Users: mixed elderly (also social housing foundation)
• Role of users: manifold (designers, subjects, storytellers)
• Triple Helix: business weakly involved; universities strongly
involved (co-design, broader research on needs)
• Success factors users: trust creation prior to project start;
mixed methods in learning, multidisciplinary; ICT well
managed; more attention needed for user values
38. Critical factors concerning users (2)
i360 Royal College of Surgeons (Dublin-IRE)
• Healthcare/surgical technology
• Users: medical staff hospitals (surgeons)
• Role of users: user-driven model
• Triple Helix: active role for university, but government
weakly involved; active reduction of TH gaps.
• Success factors users: trust between partners, flexibility of
users in shift from network to company
39. Case studies: larger scale platforms
Pontes Medical (Amsterdam-Utrecht, NL)
• Health care and medical technology (selected)
• Users: Medical staff, care professionals, hospitals, patients, firms
• Role of users: user-driven model (clinic driven)
• Triple Helix: strongly connected and active reduction of TH gaps
• Success factors users: protection of IO (clinicians, companies)
Healthcare Innovation Lab (Copenhagen, DK)
• New services, and organization and care concepts (e-health) and
a methodology of user driven innovation (using simulation lab)
• Users: Hospitals, clinicians, patients
• Role of users: highly interactive in simulation lab
• Triple Helix: strongly connected, but business weakly connected,
and active reduction of TH gaps
• Success factors users: selection of users on capabilities
(simulation), trust between partners, passionate leadership
40. Answers to questions
Characteristics of user-groups in medical Living Labs:
• User groups are mainly patient-oriented (care/
treatment) or hospital/clinicians-oriented (facilities)
• Their involvement may include various methods: co-
design, story-telling, scenario-thinking, co-simulation
Ways in which Triple Helix partners are active in Living
Labs:
• In Living Labs on e-health for the elderly, either the
university or industry tends to be weakly involved
• In Living Labs for broader medical care/cure and
hospital facilities, all three TH actors tend to be
actively involved.
41. Critical factors in having user-groups
involved (patient-oriented Living Labs)
1. Prior study of user needs
2. Trust creation (possibly prior to project) and
role models and coaches based on familiarity
3. Manifold inputs and multidisciplinary approach:
co-design, story-telling, scenario-thinking, etc.
4. Attention for user values: ICT dependency,
privacy, individuality
5. Moderate ‘dosage’ of new ICT
6. Passionate leadership for inspiration
42. Critical factors in having user-groups
involved (hospital/clinician-oriented
Living Labs)
1. Trust creation between users and other
partners
2. Flexibility in shift to new concepts, i.e. from
network to company
3. Protection of IO of users (clinicians,
companies)
4. Selection on capabilities of users
5. Passionate leadership
43. Future lines of research
• To validate the outcomes using expert opinion.
• To increase the number of Living Labs and to analyze
them quantitatively (fuzzy set analysis): pattern
recognition, causal structures, etc.
• To compare medical Living Labs with Living Labs in
other domains.
• To determine what success of Living Labs would mean
and how it can be measured (so far merely by process
variables).
46. TU Delft
• TU Delft is in the city of Delft in The
Netherlands (European Union).
• It is the largest University of Technology in the
Netherlands (10,500 bachelor students and
6,650 master students in 2012)
Founded in 1842 as a Royal
Academy for Engineering
47. TU Delft
Faculties
• Aerospace Engineering
• Applied Sciences
• Architecture and the Built Environment
• Civil Engineering and Geosciences
• Electrical Engineering, Mathematics and Computer Sciences
• Industrial Design
• 3-ME (Mechanical, Maritime and Material Engineering)
• Technology, Policy and Management
48. Why take on the challenge of coming to TU Delft
for a Master's study?
TU Delft has risen to 42nd place in the global
reputation ranking list of the World Reputation
Rankings of Times Higher Education magazine.
TU Delft is now the highest ranked Dutch
university and the third highest European
university of technology. Three other Dutch
universities are ranked in this top 100.
50. City of Delft
• Small city (95,000 inhabitants) but surrounded
by the larger cities The Hague (Peace and Justice)
and Rotterdam (World Port City)
• Historical city center (houses from late middle
ages and older)
• Amsterdam is a one-hour drive away, Brussels a
two-hour drive
• Paris and London are also fairly close (45-minute
flight).
51. Faculty of Technology, Policy and
Management
Four MSc Programs
- Engineering and Policy Analysis
- Systems Engineering, Policy Analysis and
Management
- Transport, Infrastructure and Logistics
- Management of Technology
52. Methods of Education
• Problem-oriented and geared towards design of problem
solutions (like in traffic and adoption of new technology)
• Much group work (except for MSc thesis)
• MSc often in internship (company, policy institute)
• Strongly multidisciplinary (e.g. sustainable energy,
health technology and medical care, water works)
53. Requirements (TPM)
- BSc degree in a technical domain
- Cumulative Grade Point Average (CGPA) of at least 75% of
the scale maximum
- Proof of English language proficiency
So welcome to TU Delft, TPM!
For further information:
www.admissions.tudelft.nl
E: Internationaloffice-tbm@tudelft.nl
64. Big Data and the Triple Helix - a bibliometric perspective
Martin Meyer*, Wolfgang Glanzel & Kevin Grant
*Kent Business School, University of Kent, Canterbury CT2 7PE, United
Kingdom, m.s.meyer@kent.ac.uk
65. Purpose
• Big data has become the buzz word in recent years:
• topic of interest to a multitude of players
• be it government or industry, academics or the public at large
• This presentation will offer a bibliometric perspective
• We analyze the emergent literature in the field
• Our analysis will offer a general overview of developments and then
zoom in focusing on areas of particular interest
• publication activity in certain domains is focused on particular themes.
• outlook as to what strongly emergent topics are
67. The Triple Helix Aspect
• Bibliometric study of TH indicators literature
• 110 papers, analysis of references cited
• 2 groups emerged:
• Neo-evolutionary (mostly Leydesdorff and colleagues)
• Neo-institutional (Etzkowitz, Leydesdorff)
68. Triple Helix from a bibliometric perspective
• Work at the heart of the TH
• Cluster 1
• located at the heart of the detailed network map of papers.
• reaches out to both groups almost equally.
69. Triple Helix from a bibliometric perspective
• The ‘neo-institutional’ side of the TH
• Science-technology linkage
• Cluster 8: mostly related to patent citation indicators to measure S&T linkage or discuss their usefulness.
• Cluster 3: very closely aligned to the work on patent citation analysis as described above.
• Entrepreneurial universities and university patenting
• Cluster 4 is focused on the entrepreneurial university and ways of capturing researchers’ entrepreneurial and collaborative activity.
70. Triple Helix from a bibliometric perspective
• The Neo-evolutionary Approach:
• Mutual information, entropy, and sub dynamics
• Cluster 2: approaches to capture triple helix relations in terms of information and communication flows and identify their knowledge bases.
• Evolutionary Thinking & Knowledge Spill-overs
• Cluster 7: evolutionary theorising as well as the geography of innovation, especially on regional innovation systems and knowledge spill-overs.
• Cluster 5 (closely related to Cluster 2 as well as 6) extends this perspective towards a framework for empirical research.
• Cluster 6: innovation as an interactive process, leading from user-producer interactions to a national system of innovation, work on the intellectual and social organisation of the sciences as increasingly an organised and controlled knowledge production system
71. ‘Big data’ – a bibliometric snapshot
• Based on 1,500 articles, letters, reviews and notes with BIG DATA as
topic or title
• Based on WoK SSCI/SCI indices
• No claim that study is exhaustive but opens up a view on what kind of
scholarly literature is currently associated with the Big Data label
• We will be zooming in even further and look at a subset of Social
Science / Information Science related works that could be potentially
linked to TH indicators works
72. Some Basic Stats
Research Areas Records %
COMPUTER SCIENCE 803 57.6
ENGINEERING 462 33.2
TELECOMMUNICATIONS 98 7.0
SCIENCE TECHNOLOGY OTHER TOPICS 66 4.7
BUSINESS ECONOMICS 64 4.6
INFORMATION SCIENCE LIBRARY SCIENCE 56 4.0
OPTICS 45 3.2
PHYSICS 33 2.4
BIOCHEMISTRY MOLECULAR BIOLOGY 31 2.2
MATHEMATICS 29 2.1
• Big data covered in obvious research areas
73. Some Basic Stats
Expected players visible
Countries/Territories Records %
USA 573 41.1
PEOPLES R CHINA 187 13.4
ENGLAND 91 6.5
GERMANY 66 4.7
AUSTRALIA 48 3.4
CANADA 47 3.4
JAPAN 45 3.2
SOUTH KOREA 38 2.7
NETHERLANDS 35 2.5
ITALY 33 2.4
FRANCE 29 2.1
SPAIN 29 2.1
INDIA 28 2.0
SWITZERLAND 25 1.8
TAIWAN 23 1.7
POLAND 16 1.1
SINGAPORE 15 1.1
74. Searching for big data
• Evolution of the number of publications (left) and citations (right) in WoS
Source: Thomson Reuters
Early study making reference to ‘big data sets’:
“DnaSP, DNA polymorphism analyses by the coalescent and other methods” by Rozas, J; Sanchez-DelBarrio, JC; Messeguer, X; Rozas, R in BIOINFORMATICS (2003, 10.1093/bioinformatics/btg359)
Strong effect: 3707 cites
Next most highly cited article: 120 citations
75. Removing ‘outliers’
• Strong effect: shows the rapidly growing field from 2011/12 onwards
• Still strong influence of early papers
• Evolution of the number of publications (left) and citations (right) in WoS
76. Zooming in
• TOPIC: "BIG DATA"
• Timespan: All years.
• Refined by RESEARCH AREAS:
• BUSINESS ECONOMICS
• INFORMATION SCIENCE LIBRARY SCIENCE
• OPERATIONS RESEARCH MANAGEMENT SCIENCE
• PSYCHOLOGY
• COMMUNICATION
• BEHAVIORAL SCIENCES
• GEOGRAPHY
• GOVERNMENT
• LAW
• SOCIAL SCIENCES OTHER TOPICS
• SOCIOLOGY
• SOCIAL ISSUES
• 187 papers from Web of Science Core Collection
77. Topics and Keywords
• Analysis based on DE and ID fields in WoS records
• Included all keywords/topics occurring more than once
• Total: 54 across the 136 papers that contained relevant fields
• Triple Helix occurred 6 times
• Normalised dataset (Jaccard)
• Mapped in Pajek (Kamada Kawai)
• Big data by far the most frequent term and by default at the centre of
field:
• Signs of emergent differentiation
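The Jaccard normalisation of keyword co-occurrences mentioned on this slide can be illustrated with a small sketch. It is not the authors' actual workflow: the function name `jaccard_links`, the `min_count` parameter and the toy keyword lists are hypothetical, and "occurring more than once" is assumed here to mean appearing in at least two papers.

```python
from itertools import combinations


def jaccard_links(papers, min_count=2):
    """Jaccard-normalised co-occurrence between keywords.

    papers is a list of keyword lists (e.g. the DE and ID fields of each
    record). For keywords a and b the link strength is
    |papers with both| / |papers with either|.
    """
    occurrences = {}
    for index, keywords in enumerate(papers):
        for keyword in set(keywords):
            occurrences.setdefault(keyword, set()).add(index)
    # Keep only keywords that appear in at least min_count papers.
    frequent = sorted(k for k, ids in occurrences.items() if len(ids) >= min_count)
    links = {}
    for a, b in combinations(frequent, 2):
        shared = occurrences[a] & occurrences[b]
        if shared:
            links[(a, b)] = len(shared) / len(occurrences[a] | occurrences[b])
    return links


# Hypothetical keyword lists for three papers.
papers = [
    ["big data", "triple helix"],
    ["big data", "cloud computing"],
    ["big data", "triple helix", "scientometrics"],
]
links = jaccard_links(papers)                   # keywords in >= 2 papers only
all_links = jaccard_links(papers, min_count=1)  # no frequency filter
```

The normalised link values could then be exported as a weighted network for layout in a tool such as Pajek.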
79. Mapping of Big Data Works
• Based on links of shared topics and references
• 187 papers
• 2881 references and terms
• 60 most linked papers mapped in Pajek
• Person’s cluster algorithm applied:
• 9 clusters
81. Clusters
• Cluster 1: ‘possibilities and
challenges’: Big data and social
research (Psychology, TFSC, etc)
• Cluster 2:
Informetrics/Scientometrics
• Cluster 3: Big Data and the Media
• Cluster 4: Big Data as a Driver of
Change: ‘Challenges and Solutions’
(Mkt, Transp, IT related services)
• Cluster 5: Big Data and Geography
• Cluster 6: Big Data in the cloud:
Information systems related
contributions
• Cluster 7: Techniques to analyse Big
Data
• Cluster 8: Big Data and Big Brother:
Cyber Surveillance
• Cluster 9: Big Data and Decision
Support Systems
82. Cluster 1: ‘possibilities and challenges’: Big data and social research (Psychology, TFSC, etc)
AU | TI | SO
Bentley RA; O'Brien MJ; Brock WA | Mapping collective behavior in the big-data era | BEHAVIORAL AND BRAIN SCIENCES
Boyd D; Crawford K | CRITICAL QUESTIONS FOR BIG DATA Provocations for a cultural, technological, and scholarly phenomenon | INFORMATION COMMUNICATION & SOCIETY
Tangherlini TR; Leonard P | Trawling in the Sea of the Great Unread: Sub-corpus topic modeling and Humanities research | POETICS
Huang TL; Van Mieghem JA | Clickstream Data and Inventory Management: Model and Empirical Analysis | PRODUCTION AND OPERATIONS MANAGEMENT
Ballings M; Van den Poel D | Customer event history for churn prediction: How long is long enough? | EXPERT SYSTEMS WITH APPLICATIONS
Enjolras B | Big Data and social research: New possibilities and ethical challenges | TIDSSKRIFT FOR SAMFUNNSFORSKNING
Miller AR; Tucker C | Health information exchange, system size and information silos | JOURNAL OF HEALTH ECONOMICS
Kern ML; Eichstaedt JC; Schwartz HA; Park G; Ungar LH; Stillwell DJ; Kosinski M; Dziurzynski L; Seligman MEP | From "Sooo Excited!!!" to "So Proud": Using Language to Study Development | DEVELOPMENTAL PSYCHOLOGY
Jun SP; Yeom J; Son JK | A study of the method using search traffic to analyze new technology adoption | TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE
Boyd D; Crawford K | Critical questions for big data - Provocations for a cultural, technological, and scholarly phenomenon | INFORMACIOS TARSADALOM
83. Cluster 2: Informetrics/Scientometrics
AU | TI | SO
Park HW; Leydesdorff L | Decomposing social and semantic networks in emerging "big data" research | JOURNAL OF INFORMETRICS
Park HW | An interview with Loet Leydesdorff: the past, present, and future of the triple helix in the age of big data | SCIENTOMETRICS
Skoric MM | The implications of big data for developing and transitional economies: Extending the Triple Helix? | SCIENTOMETRICS
Fairfield J; Shtein H | Big Data, Big Problems: Emerging Issues in the Ethics of Data Science and Journalism | JOURNAL OF MASS MEDIA ETHICS
Uprichard E | Being stuck in (live) time: the sticky sociological imagination | SOCIOLOGICAL REVIEW
84. Cluster 3: Big Data and the Media
AU | TI | SO
Bruns A; Highfield T; Burgess J | The Arab Spring and Social Media Audiences: English and Arabic Twitter Users and Their Networks | AMERICAN BEHAVIORAL SCIENTIST
Lewis SC; Zamith R; Hermida A | Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods | JOURNAL OF BROADCASTING & ELECTRONIC MEDIA
Mahrt M; Scharkow M | The Value of Big Data in Digital Media Research | JOURNAL OF BROADCASTING & ELECTRONIC MEDIA
Procter R; Vis F; Voss A | Reading the riots on Twitter: methodological innovation for the analysis of big data | INTERNATIONAL JOURNAL OF SOCIAL RESEARCH METHODOLOGY
85. Cluster 4: Big Data as a Driver of Change: ‘Challenges and IT Solutions’
AU | TI | SO
Rust RT; Huang MH | The Service Revolution and the Transformation of Marketing Science | MARKETING SCIENCE
Leeflang PSH; Verhoef PC; Dahlstrom P; Freundt T | Challenges and solutions for marketing in a digital era | EUROPEAN MANAGEMENT JOURNAL
Hilbert M | What Is the Content of the World's Technologically Mediated Information and Communication Capacity: How Much Text, Image, Audio, and Video? | INFORMATION SOCIETY
Huang MH; Rust RT | IT-Related Service: A Multidisciplinary Perspective | JOURNAL OF SERVICE RESEARCH
Miller HJ | Beyond sharing: cultivating cooperative transportation systems through geographic information science | JOURNAL OF TRANSPORT GEOGRAPHY
86. Cluster 5: Big Data and Geography
AU | TI | SO
DeLyser D; Sui D | Crossing the qualitative-quantitative divide II: Inventive approaches to big data, mobile methods, and rhythmanalysis | PROGRESS IN HUMAN GEOGRAPHY
Wright DJ | Theory and application in a post-GISystems world | INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE
Crampton JW; Graham M; Poorthuis A; Shelton T; Stephens M; Wilson MW; Zook M | Beyond the geotag: situating 'big data' and leveraging the potential of the geoweb | CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE
Wilson MW | Geospatial technologies in the location-aware future | JOURNAL OF TRANSPORT GEOGRAPHY
Longley PA | Geodemographics and the practices of geographic information science | INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE
Shah NH; Tenenbaum JD | The coming age of data-driven medicine: translational bioinformatics' next frontier | JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
Kwon O; Sim JM | Effects of data set features on the performances of classification algorithms | EXPERT SYSTEMS WITH APPLICATIONS
87. Cluster 6: Big Data in the cloud: Information systems related contributions
AU | TI | SO
Tien JM | Big Data: Unleashing information | JOURNAL OF SYSTEMS SCIENCE AND SYSTEMS ENGINEERING
Miller HE | Big-data in cloud computing: a taxonomy of risks | INFORMATION RESEARCH-AN INTERNATIONAL ELECTRONIC JOURNAL
Lee MY; Lee AS; Sohn SY | Behavior scoring model for coalition loyalty programs by using summary variables of transaction data | EXPERT SYSTEMS WITH APPLICATIONS
Waller MA; Fawcett SE | Data Science, Predictive Analytics, and Big Data: A Revolution That Will Transform Supply Chain Design and Management | JOURNAL OF BUSINESS LOGISTICS
Lycett M | 'Datafication': making sense of (big) data in a complex world | EUROPEAN JOURNAL OF INFORMATION SYSTEMS
Kim C; Lev B | Enterprise Analytics: Optimize Performance, Process, and Decisions Through Big Data | INTERFACES
Lee CH; Chien TF | Leveraging microblogging big data with a modified density-based clustering approach for event awareness and topic ranking | JOURNAL OF INFORMATION SCIENCE
Tien JM | The next industrial revolution: Integrated services and goods | JOURNAL OF SYSTEMS SCIENCE AND SYSTEMS ENGINEERING
Sahoo SS; Jayapandian C; Garg G; Kaffashi F; Chung S; Bozorgi A; Chen CH; Loparo K; Lhatoo SD; Zhang GQ | Heart beats in the cloud: distributed analysis of electrophysiological 'Big Data' using cloud computing for epilepsy clinical research | JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
88. Cluster 7: Techniques to analyse Big Data
AU | TI | SO
Janowicz K | Observation-Driven Geo-Ontology Engineering | TRANSACTIONS IN GIS
Chen HC; Chiang RHL; Storey VC | BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT | MIS QUARTERLY
Wiedemann G | Opening up to Big Data: Computer-Assisted Analysis of Textual Data in Social Sciences | HISTORICAL SOCIAL RESEARCH-HISTORISCHE SOZIALFORSCHUNG
Videla-Cavieres IF; Rios SA | Extending market basket analysis with graph mining techniques: A real case | EXPERT SYSTEMS WITH APPLICATIONS
Prathap G | Big data and false discovery: analyses of bibliometric indicators from large data sets | SCIENTOMETRICS
McKenzie G; Janowicz K; Adams B | A weighted multi-attribute method for matching user-generated Points of Interest | CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE
Gao S; Liu Y; Wang YL; Ma XJ | Discovering Spatial Interaction Communities from Mobile Phone Data | TRANSACTIONS IN GIS
89. Cluster 8: Big Data and Big Brother: Cyber Surveillance
AU | TI | SO
Hu M | Biometric ID Cybersurveillance | INDIANA LAW JOURNAL
Martinez MG; Walton B | Crowdsourcing: the potential of online communities as a tool for data analysis | OPEN INNOVATION IN THE FOOD AND BEVERAGE INDUSTRY
Sui D | Opportunities and Impediments for Open GIS | TRANSACTIONS IN GIS
Krasmann S; Kuhne S | Big Data and Big Brother - what if they met? On a neglected political dimension of technologies of control and surveillance in the research on acceptance | KRIMINOLOGISCHES JOURNAL
90. Cluster 9: Big Data and Decision Support Systems
AU | TI | SO
Demirkan H; Delen D | Leveraging the capabilities of service-oriented decision support systems: Putting analytics and big data in cloud | DECISION SUPPORT SYSTEMS
Cogean DI; Fotache M; Greavu-Serban V | NOSQL IN HIGHER EDUCATION. A CASE STUDY | INTERNATIONAL CONFERENCE ON INFORMATICS IN ECONOMY
Li T; Kauffman RJ | Adaptive learning in service operations | DECISION SUPPORT SYSTEMS
Julian CD | Do Relational Databases Finally Have a Real Competitor? The Struggle of a New Breed - NoSQL | INNOVATION AND SUSTAINABLE COMPETITIVE ADVANTAGE: FROM REGIONAL DEVELOPMENT TO WORLD ECONOMIES, VOLS 1-5
Walker S | Big Data: A Revolution That Will Transform How We Live, Work, and Think | INTERNATIONAL JOURNAL OF ADVERTISING
Lovric M; Li T; Vervest P | Sustainable revenue management: A smart card enabled agent-based modeling approach | DECISION SUPPORT SYSTEMS
91. Outlook
• New field, little work linking the various themes:
• BIG DATA the one key denominator
• Emerging differentiation
• Identified 8-9 clusters in SS/LIS ‘big data’ literature in WoS
• The Triple Helix and Big Data
• Plenty of space to leave a mark
• Very little ground covered
• Leydesdorff and Park notable exceptions
• Opportunities:
• TH occurs implicitly in most social science papers
• More conceptual work necessary
92. SPEECH ACTS IN TELEVISED PRESIDENTIAL
DEBATES AND FACEBOOK MESSAGES:
THE CASE OF THE 2012 SOUTH KOREAN
PRESIDENTIAL ELECTION
93. Purpose of the current study
With the advent of social networking sites (SNSs),
ordinary individuals have opportunities to participate in
communication on televised social events and issues.
The present study bridges theories of speech acts and
political representation
How did leading and trailing presidential candidates
incorporate speech acts into their rhetorical strategies in
three consecutive presidential debates during the 2012
presidential election in Korea?
How did their supporters employ speech acts when leaving
messages on Facebook fanpages?
94. Speech acts
Language use goes beyond the boundary of the
syntactic structure and its semantic meaning
Language is used to perform speech acts for certain
functions such as promising, asking, ordering, and
requesting, among others (Austin, 1976;
Habermas, 1981; Searle, 1969; Wittgenstein, 2009).
Every speech act has three components (Austin; Searle)
A locutionary component (a propositional content
component),
An illocutionary component (an action component),
A perlocutionary effect (a consequence of saying something).
95. Televised presidential debates and
speech acts
A few studies have attempted to understand how
debate participants use different argumentative styles,
linguistic devices, and speech acts.
Lee and Benoit (2005) reported that during the 2002
Korean presidential debates, the candidates used acclaims
(52%) more often than attacks (37%) and defenses (11%).
Benoit (2007) reviewed political debates in various countries
and concluded that presidential candidates most frequently
used acclaims, followed by attacks and defenses.
Bilmes (1992) analyzed the 1992 U.S. vice presidential
debate and found that, in addition to assertions, questions
were frequently addressed by the candidates.
96. Televised presidential debates and
speech acts
The use of interrogatives can be perceived as an
aggressive tactic used by trailing candidates
attempting to raise the public's suspicion about the
leading candidate's credibility, integrity, morality, and
expertise, among others (Wilson & Speder, 1988).
The candidates frequently and strategically asked
questions to one another to identify controversial issues
and raise the listener's suspicion about the opponent's
normative base (Bilmes, 1999).
97. Televised presidential debates and
speech acts
During the 2004 U.S. presidential debates, the
candidates frequently offered promises, using verbs
such as "promise," "swear," and "want"
(Marietta, 2009).
Al-Bantany (2013) analyzed a gubernatorial debate
and found guarantees and promises to be the two most
frequently employed commissive speech acts.
Edelsky and Adams (1990) examined six mixed-
gender state and local debates and verified
stereotypical differences in communication styles
between male and female candidates.
98. Suggested hypotheses (part one)
H1. Presidential candidates are more likely to use
constatives than any other type of speech act.
RQ1. Other than constatives, how frequently do presidential
candidates use various types of speech acts during
presidential debates?
H2. The trailing candidate is more likely to use directives
and interrogatives than the leading candidate during a
presidential debate.
H3. The leading candidate is more likely to use commissives
than the trailing candidate during presidential debates.
H4. Female candidates are more likely to use expressives
than male candidates during presidential debates.
99. Speech acts on candidates’ Facebook fanpages
With respect to CMC messages, assertives are the dominant
type of speech act, followed by expressives and
commissives (Hassel & Christensen, 1996; Nastri et al.,
2006).
With respect to SNS messages, expressives are the most
widely employed type of speech act, followed by assertives,
directives, and commissives; these studies suggest that SNS
users try to present themselves through the use of humor
(Carr et al., 2009, 2012; Ellison, Steinfield, & Lampe, 2011;
Ilyas & Khushi, 2012; Thelwall & Buckley, 2013).
Supporters of leading and trailing candidates may be
inclined to use different types of speech acts to actualize
the possibility of winning the presidential election.
100. Suggested hypotheses (part two)
H5. Visitors to presidential candidates’ Facebook pages
are more likely to use assertives than any other type of
speech act, followed by expressives.
H6. Moon’s Facebook page visitors are more likely to
use constatives than Park’s visitors.
H7. Moon’s Facebook page visitors are more likely to
use directives than Park’s visitors.
H8. Moon’s Facebook page visitors are more likely to
use commissives than Park’s visitors.
H9. Moon’s Facebook page visitors are more likely to
use quotations than Park’s visitors.
101. Method
Samples
The debate script was extracted for each candidate from
http://www.debates.go.kr: 609 sentences for Park and 776
sentences for Moon.
Facebook messages posted on these pages were extracted
from December 4, 2012, to December 17, 2012.
Postings were divided based on the debate schedule: six time
periods.
A total of 300 messages were randomly selected for each time
period for each candidate’s Facebook page.
If there were fewer than 300 messages during a certain period,
then all messages were included.
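The sampling rule above could be sketched as follows. This is a minimal illustration rather than the authors' actual procedure: the function name `sample_period`, the fixed seed, and the toy message lists are assumptions made for the example.

```python
import random

def sample_period(messages, cap=300, seed=0):
    """Return up to `cap` randomly chosen messages from one time period.

    If the period has no more than `cap` messages, all of them are kept,
    as described on the Method slide. `seed` is an assumption added here
    so that the draw is repeatable.
    """
    if len(messages) <= cap:
        return list(messages)
    rng = random.Random(seed)
    return rng.sample(messages, cap)

# Hypothetical toy data: one busy period and one quiet period.
busy = [f"msg-{i}" for i in range(500)]
quiet = [f"msg-{i}" for i in range(120)]

print(len(sample_period(busy)))   # 300
print(len(sample_period(quiet)))  # 120
```

Applying the function once per candidate page and per time period would give the six samples per candidate that the slide describes.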
102. Method
Coding
Code | Examples
Constatives | “She doesn’t have any idea about economic democratization,” “He was too gentle,” “He definitely won the debate,” and “Mr. Lee. Without national security we can’t achieve welfare either.”
Directives | “You have to be more aggressive next time,” “Just ignore his stupid accusation,” “Do not post this kind of stupid comment,” and “Tell me what your opinion is on the half-tuition policy.”
Commissives | “I’ll definitely vote in this election,” “We should vote for change,” and “Let’s vote and end this absurdity.”
Expressives | “I was so impressed^^,” “Fighting!” “I love all Korean mothers ^^~~^^♥♥♥♥♥.”
Interrogatives | “Do you agree with me?” and “I want to ask how you feel about those people who suffered under your father’s reign.”
Quotations* | “Lee is giving a speech for Moon http://news1.kr/articles/917472.”
Expectatives | “If you graduate from a university, I hope our country will be a livable place” and “I want to see president Moon.”
*The Quotations code was applied only to Facebook messages.
103. Results
Speech acts | Frequency | Percentage
Constatives | 933 | 67.4
Directives | 35 | 2.5
Commissives | 198 | 14.3
Expressives | 53 | 3.8
Interrogatives | 161 | 11.6
Expectatives | 5 | .3
Two candidates’ speech acts during three presidential debates
Speech acts | Frequency: Park | Frequency: Moon | Chi-square | p
Constatives | 380 | 551 | 2.73 | <.05
Directives | 11 | 28 | 3.72 | <.05
Commissives | 129 | 68 | 46.88 | <.01
Expressives | 31 | 22 | 5.17 | <.05
Interrogatives | 46 | 114 | 17.83 | <.01
Expectatives | 2 | 3 | .02 | n.s.
Differences in speech acts between Park and Moon during presidential debates
104. Results
Speech acts | Frequency: Park | Frequency: Moon | Chi-square | p
Constatives | 623 | 583 | .09 | n.s.
Directives | 113 | 164 | 9.92 | <.01
Commissives | 4 | 44 | 41.32 | <.01
Expressives | 521 | 413 | 25.09 | <.01
Interrogatives | 55 | 55 | .01 | n.s.
Quotations | 73 | 156 | 32.29 | <.01
Expectatives | 41 | 53 | 2.67 | n.s.
Total | 1430 | 1468 | |
Differences in speech acts of Facebook visitors between Park and Moon
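To make the kind of comparison in this table concrete, the sketch below runs a Pearson chi-square test on one row. The slides do not say how the reported statistics were computed, so the setup here is an assumption: a 2x2 table of one speech act versus all other acts, by candidate page, without continuity correction. The function name `chi_square_2x2` is hypothetical, and the result should not be expected to reproduce the reported value exactly.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts taken from the Facebook table: commissives and column totals.
park_commissives, park_total = 4, 1430
moon_commissives, moon_total = 44, 1468

stat, p = chi_square_2x2(
    park_commissives, park_total - park_commissives,
    moon_commissives, moon_total - moon_commissives,
)
print(f"chi-square = {stat:.2f}, p = {p:.2g}")
```

Under this assumed setup the difference in commissives is significant at the .01 level, which agrees in direction with the slide even if the statistic itself differs.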
105. Results
Both candidates uttered more acclaims than any
other speech acts, consistent with the findings of
previous research.
The leading candidate used more commissives,
whereas the trailing candidate used more aggressive
speech acts such as constatives, directives, and
interrogatives.
Moon was aggressive in that he used more
directives and interrogatives than Park. In
contrast, Park used more commissives and
expressives than Moon.
106. Results
Moon’s fanpage visitors used more commissives and
directives than Park’s visitors.
Moon’s visitors used more quotations than Park’s.
Park’s visitors used more expressives than Moon’s.
107. Concluding remarks
First, the candidates were most likely to employ claims
for truth (constatives), promises for the future
(commissives), revelations of subjective feelings
(expressives), attacks for regulating interpersonal
relationships (directives and interrogatives), and
expectatives, in that order.
Second, Moon was more likely to attack than Park, and
Park was more likely to promise than Moon.
Third, Moon’s Facebook page visitors engaged in
interactive relationships with others by using more
directives and commissives than Park’s visitors.
118. • Social sciences department at
the University of Oxford
• Undertaking rigorous multi-
disciplinary research and
teaching on the societal impact
of the Internet and ICTs (e.g.
law, economics, politics &
sociology)
• Developing methodologically
innovative tools and techniques
• Training the next generation of
Internet-literate researchers.
Since our inception we have sought to inform
and shape policy and practice.
119. Taught Courses
• 50+ graduate students from wide variety of disciplinary backgrounds, and
from industry or government
• DPhil Information, Communication and the Social Sciences:
supports single or multi-disciplinary research.
• MSc in Social Science of the Internet: 1 year Masters delivering
core training in social science methods and statistics, understanding of the
Internet’s technical architecture and regulatory framework, social
dynamics of Internet’s impact, in-depth disciplinary study e.g. Internet
Economics or Law plus cutting edge tools for digital social research.
• Annual Summer Doctoral Programme (2 weeks) for advanced PhD
students completing Internet-related theses across a variety of disciplines.
120. Michaelmas Hilary Trinity
Methods Social Research Methods and the
Internet Part I
Social Research Methods
and the Internet Part II
Core Survey
courses
Social Dynamics of the Internet
Internet Technologies and
Regulations
Options Two Option
Courses
Dissertation Dissertation
121. Two Options
• Digital Era Government and Politics
• Internet Economics
• Law and the Internet
• Online Social Networks
• Learning, the Internet and Society
• Big Data and Society
• Subversive Technologies
• ICTs and Development
• Digital Social Research
122. OII Research
• Topics covered across Governance and Democracy, Everyday
Life, Science & Learning, Network Economy, Shaping the
Internet
• Social science faculty with computer science skills
• Making major contributions to social science, e.g. addressing
the challenge of Big Data
• Field-leading methodological innovation e.g. Facebook &
NameGenWeb, OxLab.
• Biennial benchmarking and analysis of UK Internet use and
non-use (OxIS)
• Compelling presentation of data and findings to maximise
public engagement (e.g. iBook, Visualising Data).
123. Other relevant projects
• Future Home Networks & Services (Ian Brown & Joss Wright):
researching and developing security frameworks for sharing between
networks and devices, and cloud services;
• Oxford e-Social Science Project (Ralph Schroeder & Eric Meyer):
aims to understand how e-Research projects negotiate various social,
ethical, legal and organizational forces and constraints;
• The Learning Companion Project (Rebecca Eynon & Yorick Wilks):
evaluates the feasibility of a computer-based digital tool to help adults
whose engagement with learning is tentative make productive use of the
Internet for learning projects.
• Privacy Value Networks (Ian Brown): producing an empirical base for
developing concepts of privacy across contexts and timeframes,
addressing a current lack of clarity of what privacy is and what it means to
stakeholders in different usage scenarios
124. Research Examples
• People and Research
• Big Data: UK Government
• OxIS
• Political Science: Helen Margetts
• Geography: Mark Graham
• Social Network Analysis: Bernie Hogan
• Oxford e-Social Science Project: Dutton, Schroeder, Meyer
125. Big Data:
UK Government Online
• JISC UK Web Domain
Dataset (30 Tb) of .uk ccTLD
from 1996-2010
• The figure shows the link structure of
government (.gov.uk) in
2012
• Data can reveal change in
government relationships
and structure over time
126. Data
Internet Archive data of .uk back to 1996
Annual crawls of .uk websites since 2013
2.7 billion nodes, 40TB compressed
Features
Full text search (in progress, IHR)
Network analysis (OII)
N-gram analysis
Limitations
Page content data access limited
132. Use by Age
(QH14 by QD1)
OxIS 2005: N=2,185; OxIS 2007: N=2,350; OxIS 2009: N=2,013
133. Which is more Important: Age or
Income?
Internet Users in Each Age-Income Category
(percents)
Age Groups
Income 14-44 45-64 65+
Up to £20K/year 71.3 39.3 21.3
£20-40K/year 92.6 78.3 49.0
Over £40K/year 97.0 96.4 75.0
OxIS 2009: N=1,318 Internet Users
134. Use by Education (QH14 by QD14)
OxIS 2007: N=2,350; OxIS 2009: N=2,013 (Basic: N=901; Further: N=510; Higher: N=360).
Note: Students were excluded.
135. Web 2.0 User Creativity & Production Online (QC10 and
QC31)
Current users. OxIS 2005: N=1,309; OxIS 2007: N=1,578; OxIS 2009: N=1,401
Note. Social networking question changed in 2009.
136. Helen Margetts ESRC Professorial Fellowship 2011-2014
The Internet, Political Science And Public Policy
Re-examining Collective Action, Governance and Citizen-government Interactions in the Digital Era
• Using the internet to generate ‘real’ transactional data about political
behaviour (including webmetrics, datamining and experiments)
8,327 petitions
scraped from No 10 Downing Street site, all
new ones 2009-2010
95% of petitions fail
to reach 500 (number necessary for official
reply)
Number of signatures on
launch day crucial to whether it
reaches 500
137. Mapping Personal Networks
• Social network map of Bernie Hogan’s FB ties, Dec. 2008
• Proof-of-concept network that led to creation of NameGenWeb
143. •Mark Graham & Bernie
Hogan’s project
investigates inequalities
in the creation of
knowledge.
• Map reveals uneven
spread of geo-tagged
Wikipedia articles
2011-12.
144. Sandra Gonzalez-Bailon
USENET Political Discussions (1999-2005)
[Figure: time series from 09/1999 to 09/2004 (vertical axis 0 to 8, x10000) with a word cloud of frequent terms for each period. Recurring terms include war, gun, news, people, white, black, world, hate, death, free, truth, and terrorist.]
146. Oxford e-Social Science Project
• Social shaping and
implications of e-Research
• Collaborative project with:
• SBS / InSIS group
• OeRC
• ESRC: 6 years of funding +
multiple follow-on projects
147. Source: Schroeder, R., Meyer, E.T. (2009). Gauging the Impact of e-Research in the Social Sciences. Paper presented
at the 104th American Sociological Association Annual Meeting, August 8-11, San Francisco, California.
148. Source: Meyer, E.T., Park, H-W., Schroeder, R. (2009). Mapping Global e-Research: Scientometrics and Webometrics. Proceedings of the 5th
International Conference on e-Social Science, June 24-26, Cologne, Germany.
149. Source: Meyer, E.T., Schroeder, R. (2009). Untangling the Web of e-Research: Towards a Sociology of Online Knowledge. Journal of Informetrics 3(3):246-260.
150. Source: Meyer, E.T., Schroeder, R. (2009). Untangling the Web of e-Research: Towards a Sociology of Online Knowledge. Journal of Informetrics 3(3):246-260
151. Source: Schroeder, R., Meyer, E.T. (2009). Gauging the Impact of e-Research in the Social Sciences. Paper presented at the 104th American Sociological
Association Annual Meeting, August 8-11, San Francisco, California.
153. Big Data, Big Brother, and
Social Science
Ralph Schroeder
Collaborators:
Eric T. Meyer, Linnet Taylor, Josh Cowls, Greg Taylor, Monica Bulger
Asia Triple Helix Society, Daegu, 25th June, 2014
154. Overview
• Projects
• Questions
• Issues
• Definition
• How knowledge advances
• Examples
• Big Data Issues in Research and Beyond
• Policy Implications
• Conclusion
155. Accessing and Using Big Data to Advance
Social Science Knowledge
• Funded by Sloan Foundation
• Data sources
• 100+ interviews, mainly with social scientists
• Reports, workshops
• Publications, conferences
• No representative sample, but some patterns of
disciplinary and skills background and career
trajectory
157. Data-driven economic models: challenges
and opportunities of big data
• Funded by Research Councils UK (RCUK),
New Economic Models in the Digital
Economy (NEMODE) network
• Data Sources:
– 25+ interviews
– Case studies
– Issues include how models relate to national
contexts (e.g. privacy laws in Germany), where
skills are located (plus gaps), use of
public/private data, standardization
161. Twitter-bots
OII master’s students Alexander Furnas and Devin Gaffney saw a large spike in then-US
presidential candidate Mitt Romney’s Twitter followers, and decided to look at the new
followers:
Furnas, A. and Gaffney, D. (2012). ‘Statistical Probability That Mitt Romney's New Twitter Followers Are Just Normal Users: 0%’. The Atlantic, July 31,
http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).
163. Source: Hill, K. (Feb 16, 2012). Forbes.com. Available at: http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-
out-a-teen-girl-was-pregnant-before-her-father-did/
Based on Duhigg, C. (Feb 16, 2012). “How Companies Learn Your Secrets.” New York Times Magazine.
164. Number of News Articles on Big Data
Quarter | 2010 (n=998) | 2011 (n=5,641) | 2012 (n=27,033)
1st Q | 113 | 558 | 3,960
2nd Q | 240 | 1,195 | 6,787
3rd Q | 278 | 1,538 | 7,276
4th Q | 367 | 2,350 | 9,010
Source: Nexis data compiled by Meyer & Schroeder
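As a quick arithmetic check on the chart, the sketch below sums the quarterly data labels and compares them with the yearly n values. Assigning the labels to quarters in reading order is an assumption, and the year-over-year growth factor is a derived quantity that does not appear on the slide.

```python
# Quarterly article counts read from the chart's data labels
# (assumed to run 1st Q to 4th Q within each year).
quarterly = {
    2010: [113, 240, 278, 367],
    2011: [558, 1195, 1538, 2350],
    2012: [3960, 6787, 7276, 9010],
}

# Yearly totals, to compare with the n shown under each year.
yearly = {year: sum(counts) for year, counts in quarterly.items()}

# Year-over-year growth factor (illustrative, not shown on the slide).
growth = {
    year: yearly[year] / yearly[year - 1]
    for year in yearly
    if year - 1 in yearly
}

for year, total in yearly.items():
    print(year, total)
for year, factor in growth.items():
    print(f"{year - 1} -> {year}: x{factor:.1f}")
```

The totals come to 998, 5,641 and 27,033, matching the n values printed under the three years.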
165. Big data in the commercial world
• Commercial uses are: ‘in house’,
‘outsourced own data’, ‘data analysis as a
consultancy service’
• Careers in data analysis entail as a baseline
computer science/statistical expertise, plus
different domains of ‘sorting people’ and
being able to ‘manipulate’ them (i.e.
predict their behaviour)
166. Definition
• ‘Big data’
– the advance of knowledge via a leap in the scale
and scope in relation to a given object or
phenomenon
‘Data’
– Belongs to the object
– ‘taking…before interpreting’ (Ian Hacking)
• the view that ‘all data are of their nature interpreted’ is
misleading: ‘data are made, but as a good first
approximation, the making and taking come before
interpreting’
– The most atomizable useful unit of analysis
167. Computational Manipulability?
• ‘the distinctiveness of the network of mathematical
practitioners is that they focus their attention on the pure,
contentless form of human communicative operations: on
the gestures of marking items as equivalent and of ordering
them in series, and on the higher-order operations which
reflexively investigate the combinations of such operations’
• ‘mathematical rapid-discovery science…the lineage of
techniques for manipulating formal symbols representing
classes of communicative operations’
• Why is big data a big deal? Manipulability, plus new data
sources
169. Digital Objects and their Referents
Digital Object
(Examples: Twitter,
Tesco Loyalty card
information)
Real World
(People / Physical
Objects)
Represent / Manipulate
172. Uses and Limits
• Big data research uses (academic, commercial, government) are limited to
the exploitation of suitable objects, and the objects which ‘give off’ digital
data, and the phenomena they lay bare, are limited
• The knowledge produced is aimed at ‘sorting people’ and advancing
‘representing and intervening’ (but without ‘manipulating’, except where
this is warranted by practical economic and political objectives)
• The difference between the commercial and the academic world is that knowledge
provides competitive and practical advantage in the former, as against advancing
(high-consensus rapid-discovery) knowledge in the latter
– The limits in both cases are the objects (to which the data ‘belong’), and that need to
have available digitally manipulable data points
• How available these objects are differs, but also…
– Causation and theoretical embedding matters for academic social science
– For commercial (and non-academic uses), ‘predicting’ consumer choices and other
behaviours, for limited purposes and without increasing scientific knowledge, is good
enough
• There are many objects, for non-academics and scientists to humanities
scholars (physical, human, cultural), but they are not infinite
• This availability, not skills or other issues, determines the future of big data
research
174. Platform | Paper | Size of data in relation to phenomenon investigated | Theoretical question/practical aim | Key findings
Facebook | Backstrom et al. (2012) | 69 billion friendship links between 721 million Facebook users | Re-examine Milgram’s ‘six degrees of separation’ online | Four degrees of separation on Facebook
Facebook | Ugander et al. (2012) | 54 million invitation emails to Facebook users | How does structure of contacts affect invitation acceptance? | Not number of contacts, but number of distinct contexts, matters for acceptance
Facebook | Bond et al. (2012) | 600000 Facebook users | Facebook experiment about how to mobilize voters | Voters can be mobilized via Facebook friends more than via informational messages
Twitter | Kwak et al. (2010) | 1.47 billion directed Twitter relations | Is Twitter a broadcast medium or a social network? | Most use is for information, not as a social network
Twitter | Cha et al. (2010) | 1.7 billion tweets among 54 million users | Who influences whom? | Top influentials dominate, but some variation by topic
Twitter | Bakshy et al. (2011) | 1.6 million Twitter users | Who influences whom? | ‘Ordinary user’ influencers can sometimes be more effective than top influencers
Wikipedia | Loubser (2009) | All Wikipedia activity | How is editing organized? | Administrators can impact negatively on participation
Wikipedia | Yasseri, Kertesz (2012) | Editorial activity on Wikipedia, especially reverts | Understanding conflict and collaboration | Types of conflicts can be modelled
Wikipedia | West, Weber and Castillo (2012) | Wikipedia contributions related to Yahoo! browsing | What characterizes Wikipedia contributors’ information behaviour compared to Wikipedia readers and non-readers | Wikipedia contributors are more ‘information hungry’, especially about their topics
175. Example 1:
Search engine behaviour
Waller’s analysis of Australian Google Users
Key findings:
- Mainly leisure
- > 2% contemporary issues
- No perceptible ‘class’ differences
Novel advance:
- Unprecedented insight into what people search for
Challenge:
- Replicability
- Securing access to commercial data
176.
“Surprisingly, the distribution of
types of search query did not vary
significantly across the different
Lifestyle Groups (p>0.01).”
Source: Waller, V. (2011). “Not Just Information: Who Searches for What on the Search Engine Google?” Journal of the American Society for Information Science &
Technology 62(4): 761-775.
177. Example 2:
Large-scale text analysis
Michel et al. ‘culturomic’ analysis of 5 Million Digitized Google
Books and Heuser & Le-Khac of 2779 19th Century British
Novels
Key findings:
- Patterns of key terms
- Industrialization tied to shift from abstract to concrete
words
Novel advance:
- Replicability, extension to other areas, systematic
analysis of cultural materials
Challenge:
- Data quality
179. Example 3:
Social network or news?
Kwak et al.’s analysis of Twitter
Key findings:
- 1.47 billion social relations
- 2/3 of users are not followers or not followed by any of their
followings
- Celebrities, politicians and news are among top 20 being followed
Novel advance:
- Volume of relations and topics
Challenge:
- News or social network needs to be contextualized in media
ecology
- Securing access to commercial data
180. (Big) data definition enables
pinpointing impacts and threats
• ‘Google Plus may not be much of a competitor to Facebook as
a social network, but…some analysts…say that Google
understands more about people’s social activity than
Facebook does.’
– New York Times, 15.2. 2014, p. A1 ‘The Plus in Google Plus? It’s Mostly for Google’.
• Facebook Likes: ‘Predicting users’ individual attributes and
preferences can be used to improve numerous products and
services. For instance, digital systems and devices (such as
online stores or cars) could be designed to adjust their
behavior to best fit each user’s inferred profile…online
insurance…advertisements might emphasize security when
facing emotionally unstable (neurotic) users but stress
potential threats when dealing with emotionally stable ones’
– ‘Private traits and attributes are predictable from digital records of human behavior.’ Kosinski M,
Stillwell D, Graepel T., Proc Natl Acad Sci 2013 Apr 9;110(15):5802-5.
• More powerful knowledge will enable better services, and
more manipulation
181. ‘Big data‘ for understanding society
• Real-time transactional data (unlike survey
data, traditional staple of social science)
• Outside capability of normal desktop
computing environment (‘Too big to
handle’)
• Big potential for understanding
institutions and individual behaviour
182. Social Science and Big Data
Research
• Dominated by social media
• Issues of ‘whole universe’
– What population, offline and online, does it
represent
– Data quality and replicability
– How does ‘modality’ determine findings about
implications
• How to embed the research
– In existing theory (but also advance theory)
– In existing ecology of media uses in society
(including ones that extend existing ones)
183. Scientificity and Big Data: Pro and
Con
• Pro
– Replicability, extension to new domain
– ‘Total’ datasets, ‘whole universe’
– (Often) no sampling needed, data for all behaviour and over
whole existence
– Ready made manipulability
– Powerful relation of data to object
• Con
– Limited access to object, skills needed for manipulability
– (Often) not known who users are
– No or little knowledge of how (commercial) data were gathered
– Researcher does not ask what is of interest without ‘givenness’
– Datasets capture limited dimensions, and about one object
– Object in isolation, not framed for social change significance
184. Ethical and Social Issues in Big Data Research
• Objects with ‘total’ knowledge (universes)
– Danger is inferring behaviour not of individuals, but of classes of
people
• Asymmetry of knower and the subjects of knowledge is
greater than elsewhere
• Based not on individuals’ but on aggregate behaviour
– Hence only utilitarian, not Kantian justification?
• Why does prediction or uncovering laws of behaviour ‘grate’?
• Benefits: greater scientific power and more specific details
• Relation to smaller data? ‘Creep’
• Solution: ethical = greater researcher and public awareness,
regulatory (would apply to academic researchers?) = prevent
legal and specific harms
185. Other positions on Big Data
Implications 1
• Mayer-Schoenberger and Cukier, boyd and Crawford argue that not
all information can or should be captured
– No, need to create the legal and ethical social space which protects the
individual. The solution does not rely on denying the powerfulness of
knowledge, but harnessing it appropriately.
• Mayer-Schoenberger and Cukier’s solution: 1. more transparent
algorithms, 2. certifying the validity of algorithms, 3. allowing
disprovability of predictions (p. 176)
– Yes, but within social science, solution is to make knowledge more
scientific.
• Underlying all these problems is more powerful knowledge
– This goes against free, untrammelled behaviour
– Solution: Society becomes more self-aware and shapes knowledge to
constrain it
• Crawford, Marwick: big data is product of neoliberal capitalism? No,
uses by different societies, and for purposes apart from ‘neoliberal
capitalist’ ones, such as open government data and Wikipedia
analysis
186. Other Positions on Big Data Implications 2
• Savage and Burrows: ask are commercial data outpacing
social science?
• Boyd and Crawford: does big data raise epistemological
conundrums, and isn’t it always already (social) contextual ?
• Mayer-Schoenberger and Cukier: what are the political and
commercial harms of wrong knowledge, especially when it
changes ‘everything’?
... No ...
• Knowledge depends on the relation between research
technologies and the advance of knowledge
• The threats and opportunities are not contextual, but
depend on how more powerful knowledge is used
• Big data contributes to more ‘scientific’ (i.e. cumulative)
social sciences, but within limits, and there are limits to
commercial and political uses too
187. Consumer (and gov’t) Big Data
• Consumer data and privacy (ie. Target pregnancy case)
– Solution: data protection
• Consumer data and prediction and control (ie. click
behaviour): affects consumer without transparency, predictive
privacy harm
– Solution: transparency, ‘due process’ (Crawford and Schultz)
• Consumer data – and government data - and exclusion from
benefits thereof (ie. no or little use of digital devices) - if not
captured by data, left out
– Solution: Data antisubordination (Lerman)
– Solution: government may need more data about us (and
counteract the data invisibility of parts of the population)
• Consumer data from digital media (e.g. search engines) – manipulate
what is found without transparency, inappropriate personalization
(Pariser)
– Solution: transparency, consumer protection
188. Big Data and Policy
• Probabilistic rather than ‘causal’ commercial and
government uses of data (ie. profiling) - only probable,
not definite causal behaviour of data emitters
established (Mayer-Schoenberger and Cukier)
– Solution: more accurate knowledge
• Exposure of data emitters because of identifiers in large-
scale and linked data (Netflix, AOL, Google Streetview,
National Security Agency), such that
anonymization does not work
– Solution: data protection, better anonymization,
opting out, consent
• Social media used in authoritarian regimes for control
(Weibo in China)
– Solution: more commercial independence, more civil
society pushback, researcher non-cooperation
189. Future of Big Data Research
• The difference between the commercial and the academic world is that
knowledge provides competitive advantage in the former, as against
advancing (high-consensus rapid-discovery) knowledge in the latter
• The limits in both cases are the objects (to which the data
‘belong’), and that need to have available digitally
manipulable data points
• How available these objects are differs
• There are many objects, for non-academics and scientists to
humanities scholars (physical, human, cultural), but they are
not infinite
• This availability, not skills or other issues, determines the
future of big data research
• A Golden Age of Quantification and New Sources of Data…A
Dark Age (so far) of understanding new online phenomena
and their social significance
192. Outlook and Implications
• There is an overlap between real world research and
the world of academic research which is closer than
elsewhere
– because this is the research front in both
– because they share common objects
• For research
– Develop theoretical frame in which to embed big data (for
social media), including power/function, relation to
traditional media, and role in society
• For society
– Awareness of how research can generate transparency and
manipulability
• Big Brother?
– Yes, but also Brave New World of Omniscience, with Social
Science as Handmaiden
193. Additional readings and references
Bond, Robert et al. (2012). ‘A 61-million-person experiment in social influence and political mobilization’, Nature 489: 295–298.
Bruns, A. and Liang, Y.E. (2012). ‘Tools and methods for capturing Twitter data during natural disasters’, First Monday, 17 (4 – 2), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/3937/3193
Furnas, A. and Gaffney, D. (2012). ‘Statistical Probability That Mitt Romney's New Twitter Followers Are Just Normal Users: 0%’. The Atlantic, July 31, http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).
Giles, J. (2012). ‘Making the Links: From E-mails to Social Networks, the Digital Traces left by Life in the Modern World are Transforming Social Science’, Nature, 488: 448-50.
Kwak, H. et al. (2010). ‘What is Twitter, a Social Network or a News Media?’ Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC.
Manyika, J. et al. (2011). ‘Big data: the next frontier for innovation, competition and productivity’, McKinsey Global Institute, available at: http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation (last accessed August 29, 2012).
Silver, Nate. (2012). The Signal and the Noise: The Art and Science of Prediction. London: Allen Lane.
Tancer, B. (2009). Click: What Millions of People are Doing Online and Why It Matters. New York: Harper Collins.
Wu, S., J.M. Hofman, W.A. Mason, and D.J. Watts (2011). ‘Who says what to whom on Twitter’, Proceedings of the 20th International Conference on World Wide Web. (on Duncan Watts webpage, http://research.microsoft.com/en-us/people/duncan/, last accessed August 29, 2012).
194. Project Papers
Schroeder, Ralph (Forthcoming). ‘Big Data: Towards a More Scientific Social Science and Humanities’ in Mark Graham and William H
Dutton (eds.), Society and the Internet: How Networks of Information are Changing our Lives. Forthcoming.
Schroeder, Ralph, & Taylor, Linnet (Forthcoming). ‘Is bigger better? The emergence of big data as a tool for international development
policy.’ GeoJournal.
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, August). ‘Big Data in the Study of Twitter, Facebook and Wikipedia: On the Uses
and Disadvantages of Scientificity for Social Research.’ Paper presented at the proceedings of the Annual Meeting of the American
Sociological Association. (being submitted)
Schroeder, Ralph, & Taylor, Linnet. ‘Big Data and Wikipedia Research: Social Science Knowledge across Disciplinary Divides’. Submitted to
Information, Communication and Society.
Taylor, Linnet. ‘No place to hide? The ethics and analytics of tracking mobility using African mobile phone data.’ Submitted to Population,
Space and Place.
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet. ‘Big Data in the Social Sciences: Towards a New Research Paradigm?’ (being
submitted).
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, November). ‘The Boundaries of Big Data.’ Paper presented at SIG-SI Symposium,
ASIST 2013, November 1-6, 2013, Montreal, Quebec, Canada.
Schroeder, Ralph and Cowls, Josh. ‘Answering Questions and Questioning Answers in the Era of Big Data.’ In preparation.
Taylor, Linnet, Meyer, Eric T., & Schroeder, Ralph. ‘Bigger and better, or more of the same? Emerging practices and perspectives on big
data analysis in economics’. Forthcoming in Big Data & Society.
Cowls, Josh. ‘The Crowd in the Cloud?’, forthcoming presentation at IPP 2014.
Cowls, Josh ‘Big Data and Policy Implementation’, in preparation.
Schroeder, Ralph ‘Big Data and Policy Implications’, in preparation.
195. Oxford Internet Institute
With support from:
Ralph Schroeder
ralph.schroeder@oii.ox.ac.uk
http://www.oii.ox.ac.uk/people/?id=26
See http://www.oii.ox.ac.uk/research/projects/?id=98
196. Understanding “Wedge-Driving” Rumors
Online during a Political Crisis: Insights
from Twitter Analyses during Korean
Saber Rattling 2013
K. Hazel Kwon, PhD, ASU
C. Chris Bang, MA, Univ. at Buffalo
H. R. Rao, PhD, Univ. at Buffalo
197. Rumors Revisited
• Unofficial Information Sharing in Social
Media
• Unofficial Information = Rumors =
Representation of bottom-up, spontaneously
shaped public opinions (Knapp, 1944;
Peterson & Gist, 1951; Turner & Killian,
1987)
• Rumors have not been studied much until recently.
198. Goals of the Study
• Theoretically: Understanding social media
rumormongering as a contentious process of
collectively constructing meaning under
high uncertainty
• Methodologically: Demonstrating how a
semantic network analytic approach can
help textual discourse analysis of rumors.
199. Public Opinions
• Public Opinions: (1) citizen responses as
opposed to governing actors; (2) expressed
openly instead of privately reserved; (3)
relevant to social affairs with a potential
influence on political process
• In modern political system: Public Opinion
= Opinion Polling Results
200. Opinion Polling…
• A top-down, institutionalized construction
of public opinions
• Quantitative, limited conveyance of opinion
patterns
• Mainly for social control
• Overemphasis on a “rational” process of
opinion formations
201. Rumors: Improvised Public Opinions
• Alternative indicators of opinion climate
• Bottom-up, unstructured construction of
social affairs
• A less normative, less rational process of
public sense-making: “Affect-laden”
• Help qualitative, granular understanding of
opinion patterns
202. Textual Analysis of Rumors
• Social Psychology of Rumors
• Textual Analysis of Rumors
- Only a few studies due to the lack of text
data
- Advantage of utilizing social media data
(e.g., Twitter) for both theoretical and practical
reasons
203. Wedge-Driving (WD) Rumors
• 3 rumor types during a crisis: wish, dread, WD
• WD rumors: a moniker for unverified propositions
with a derogatory tone toward a specific target
group or individuals representative of the group
• Reflective of social structures for emotional
contagions; subconscious roots of intergroup
conflict; inverse indicator of social capital;
prevailing norms and way of thinking
204. Empirical Research Questions:
To what extent does rumoring happen
in social media when a society faces a
social/political crisis?
Do WD rumors reveal distinctive
narrative characteristics in comparison
to other types of informal public
discourses?
205. Case: Korean Saber Rattling
2013
• Rumormongering = uncertainty (ambiguous
situation) x anxiety (issue importance)
• The 2013 saber rattling between North and South
Korea was chosen as a suitable case to
explore social media rumoring
[North Korea = NK; South Korea = SK]
206. Small-Scale Content Analysis
• Quota sampling of 2,500 non-redundant,
unique tweet messages (2,352 after
filtering) from a total of 207,992 tweets
collected between Feb 18 and Mar 14, 2013
• 7 search keywords: 북한(North-Korea),
북핵(North-Korea-Nuclear), 북조선(North-
Chosun), 핵무기(Nuclear-Weapon),
핵폭탄(Nuclear-Bomb), 핵실험(Nuclear-
Experiment), 김정은(Kim-Jung-Un)
207. Content Analysis
• Dummy coding: (1) informational ambiguity
(84.5% agreement), (2) propositional statement
(88.9% agreement), (3) hostility towards others
than NK (and its politicians)
• 3 Groups categorized:
(1)&(2)&(3) = WD rumor
(1)&(2) = General rumors (GR)
The rest = Non-rumors (NR)
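The three-way grouping above can be read as a simple rule over the three dummy codes. The sketch below is illustrative only: the function name `classify_tweet`, the argument names, and the sample tuples are assumptions, not part of the study's materials. It assumes that a tweet must carry both codes (1) and (2) to count as a rumor at all.

```python
def classify_tweet(ambiguous: bool, propositional: bool, hostile: bool) -> str:
    """Map the three dummy codes to a group label: WD, GR, or NR."""
    is_rumor = ambiguous and propositional   # codes (1) and (2) both present
    if is_rumor and hostile:                 # code (3) present as well
        return "WD"
    if is_rumor:
        return "GR"
    return "NR"                              # everything else

# Hypothetical coded tweets: (ambiguous, propositional, hostile)
coded = [(True, True, True), (True, True, False), (False, True, True)]
labels = [classify_tweet(*codes) for codes in coded]
print(labels)
```

Note that under this reading a hostile tweet that is not both ambiguous and propositional still falls into the non-rumor group.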
208. Semantic Network Analysis
• Words selected based on Bonferroni-
adjusted z-tests of word frequency
comparisons among the 3 groups
• Co-occurrence matrix for each group
• Degree & Eigenvector centralities
• Clauset-Newman-Moore clustering
algorithm
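As a rough illustration of the middle two steps, the sketch below builds a word co-occurrence count for one group of messages and derives degree and eigenvector centrality from it. It is a pure-Python approximation under stated assumptions: the toy messages and vocabulary are invented, the function names are hypothetical, and the word-selection z-tests and the Clauset-Newman-Moore clustering step are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs, vocab):
    """Count how often each pair of selected words appears in the same message."""
    counts = Counter()
    for tokens in docs:
        present = sorted(set(tokens) & vocab)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts

def degree_centrality(counts, vocab):
    """Share of the other words that each word co-occurs with at least once."""
    neighbours = {w: set() for w in vocab}
    for a, b in counts:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {w: len(nb) / (len(vocab) - 1) for w, nb in neighbours.items()}

def eigenvector_centrality(counts, vocab, iterations=200):
    """Power iteration on the weighted co-occurrence matrix.

    Each step adds the previous score (i.e. iterates A + I), which keeps the
    same leading eigenvector but avoids oscillation on bipartite graphs.
    """
    score = {w: 1.0 for w in vocab}
    for _ in range(iterations):
        new = dict(score)
        for (a, b), weight in counts.items():
            new[a] += weight * score[b]
            new[b] += weight * score[a]
        norm = sum(v * v for v in new.values()) ** 0.5
        score = {w: v / norm for w, v in new.items()}
    return score

# Hypothetical tokenized messages and selected words
docs = [["nuclear", "test", "threat"], ["nuclear", "threat"], ["test", "war"]]
vocab = {"nuclear", "test", "threat", "war"}
counts = cooccurrence(docs, vocab)
degree = degree_centrality(counts, vocab)
eigen = eigenvector_centrality(counts, vocab)
print(counts, degree, eigen)
```

In this toy network, "test" touches every other word and therefore has the highest degree centrality, while "war" has a single weak tie and the lowest eigenvector score.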
209. General Results
• 25% NR messages (62 words), 36.4% WD
messages (99 words), and 38.6% GR
messages (41 words)
• Two centrality scores highly correlated:
Spearman’s ρ = .991 for NR, .946 for GR,
.943 for WD
• 4 semantic clusters in NR network; 5 in WD
network; 7 in GR network
215. NR network highlights…
• Formal, top-down responses to the threat,
in a broader geopolitical context.
SK’s political and military capability (C1)
Foreign diplomacy of both Koreas (C2&C3)
International responses to the threat (C4)
220. WD network highlights
• Derogatory themes:
Defaming historic or current politicians
(C1), even a public figure in a non-
political sector (C2)
Distorting a historical event not directly
related with the current threat (C3)
Evoking Cold-War rhetoric to attack
opposite political beliefs (C4&C5).
225. GR network highlights…
• Bottom-up reaction to the threat
The public’s curiosity about NK’s
readiness for kinetic warfare (C1&C2) and
its true motivations behind the threats
(C3).
Trivialization (C2&C5&C6)
Conveyance of hope (C4&C7)
226. Discussion & Conclusions
• A nontrivial portion of public responses to social or
political affairs, especially in times of crisis, is
spontaneous and less than rational: this calls
for understanding rumor publics
227. • Non-rumors: similar to institutional polling
(e.g. Gallup questionnaire)
• General-rumors: derivative of the news
agenda but mutated into the bottom-up
desires to cope with fears: In forms of
Guesswork, witticism, pipe-dreaming
• WD rumors: deviate considerably; mainly
ideological contention between pro-peace
and pro-constraint political factions,
intertwined with collective memory of
history
228. Limitation & Future Research
• Threw away a large amount of available
data due to limited methods
• Needs to incorporate a machine-learning
approach to scale up research
229. A social network framework to analyze the
cultural contents of Kpop across countries
Ji-Young Park & Ji-Young Kim
(PhD student, YeungNam University)
Wayne Weiai Xu
(PhD student, State University of New York at Buffalo)
Han Woo Park
(Professor, Ph.D.)
230. Contents
• Cultural phenomenon of the Korean wave
• Variety of Data procedure
- Data preparation
- Data process
• Social network analysis framework
- online cultural contents of Kpop
231. Cultural phenomenon of the Korean wave
• Hallyu(한류: Korean Wave) is a neologism referring to the
increase in the popularity of South Korean culture since the
late 1990s. The term was originally coined in mid-1999
by Beijing journalists who were surprised by China's growing
interest in South Korean cultural exports. They subsequently
referred to this new phenomenon as "Hánliú" (韓流), which
literally means "flow of Korea".
232. Cultural phenomenon of the Korean wave
• Cultural exports such as Hallyu (“Korean Wave”) embody the
global influence of local pop culture.
• The promotion of strategic cultural offerings can enhance the
national image and strengthen the country’s entertainment
industry (Maitland & Bauer, 2001).
• The global diffusion of cultural offerings has been increasingly
facilitated through social media, a phenomenon that has drawn
growing scholarly attention in recent years (see Kim, Heo, et
al., 2013).
233. Cultural phenomenon of the Korean wave
The Change of the Korean Wave
Category | Web 1.0 Korean Wave | Web 2.0 Korean Wave
Period | Early 2000s | 2010s
Genre | Mostly TV dramas | Multiple contents (e.g., K-pop, online games)
Location | Asia region centered | Globalization
Users’ main media platform | Websites | Social media (e.g., Twitter, YouTube)
Marketing strategy | Top-down (government) | Bottom-up (fans, market players)
Source: revised from SERI Quarterly, Oct. 2011.
234. • This study focuses on Kpop and the Korean
rapper Psy’s Gangnam Style (GS)
235. Research Questions
• What are the communication patterns among
international fans of Kpop across countries?
236. • Various kinds of online data are used in the current paper.
• Big data-based analysis programs, including
Webometric Analyst 2.0, Webonaver, and Webogoogle, are
employed to retrieve and parse data from the World Wide Web.
• The collected data are moved to SNA tools such as NodeXL,
UciNet, Pajek, and ConText for quantitative investigation.
237. Social network analysis framework
• (1) Web documents on Korean singers
• (2) Visibility of Korean singers at popular social
media sites
• (3) Communication patterns among international fans
of Kpop across countries
238. Social Network Analysis Framework
(1) Web documents on Korean singers
- Data procedure: scrape keyword (Korean singer) hit count in search result; scrape keyword (Korean singer) title, phrase & URL in search result
- Method: Webometrics Analysis
- SNA tool: NodeXL, UciNet, Pajek, and ConText
(2) Visibility of Korean singers at popular social media sites
- Data procedure: collect the keyword (Korean singer)’s social media activity, such as the singer's followers, following, and tweets on Twitter
- Method: Webometrics Analysis
(3) Communication patterns among international fans of Kpop across countries
- Data procedure: collect data using Webometric Analyst 2.0
  - video ID, published date, updated date, video title, video URL, author name, dislikes, likes, view count, favorite count
  - recent 1,000 comments
  - subscription
- Method: Network Analysis
239. Social network analysis framework
• (1) Web documents on Korean singers
- Webonaver, Webogoogle
240. Social network analysis framework
• (1) Web documents on Korean singers
- Webonaver as a scraper tool
- NaverScrapper: scraper tools for Naver, a search
engine and portal
- Uses the OpenAPI on Naver
- Scrapes keyword hit counts in search results
- Scrapes keyword titles, phrases & URLs in search results
- Park, Han Woo, Park, Se Jung, Stuart, David, & Lee, Seung-Wook (2009). Understanding and
application of WeboNaver, an API-based search program: An analysis of the web visibility of
18th National Assembly members and of associations among words related to the novel
influenza. Journal of the Korean Data Analysis Society, 11(6B), 3427-3440
- It can be downloaded from http://hanpark.net (authorized users only)
241. Social network analysis framework
• (1) Web documents on Korean singers
- WeboGoogle as a scraper tool
- WeboGoogle: scraper tools for the Google
search engine
- Uses the Custom Search API on Google
- Scrapes keyword hit counts in search results
- Scrapes keyword titles, phrases & URLs in search
results
- Keyword co-occurrence of the sites' domains
based on their symmetrical relationships by using
Boolean operators.
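One way to picture the Boolean co-occurrence step is to issue one AND query per keyword pair and place the returned hit counts in a symmetric matrix. The sketch below does not call any search API: the function names are hypothetical, the query syntax is a generic assumption, and the hit counts are made-up placeholders rather than results from the study.

```python
from itertools import combinations

def build_pair_queries(keywords):
    """Return one Boolean AND query string per unordered keyword pair."""
    return {(a, b): f'"{a}" AND "{b}"' for a, b in combinations(keywords, 2)}

def symmetric_matrix(keywords, hits):
    """Fill a symmetric co-occurrence matrix from per-pair hit counts."""
    index = {k: i for i, k in enumerate(keywords)}
    matrix = [[0] * len(keywords) for _ in keywords]
    for (a, b), count in hits.items():
        matrix[index[a]][index[b]] = count
        matrix[index[b]][index[a]] = count   # same value in both directions
    return matrix

keywords = ["Psy", "Gangnam Style", "Kpop"]
queries = build_pair_queries(keywords)

# Placeholder hit counts, as if returned for each query by a search tool
hits = {("Psy", "Gangnam Style"): 120, ("Psy", "Kpop"): 80, ("Gangnam Style", "Kpop"): 60}
matrix = symmetric_matrix(keywords, hits)
print(queries[("Psy", "Kpop")], matrix)
```

The resulting matrix can then be loaded into a network analysis tool as an undirected, weighted network.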
242. Social network analysis framework
• (1) Web documents on Korean singers
- WeboGoogle as a scraper tool
- The results, based on a total of 3,320,000 hit counts from
Google-indexed web documents for the search query
"Gangnam Style" on August 14, 2012,
indicate that 39.0% of all returned web documents came from
YouTube.com, followed by AllKpop.com (9.0%) and
blogs.wsj.com (3.0%).
243. Social network analysis framework
• (2) Visibility of Korean singers at popular
social media sites
-Twitter, Facebook
NodeXL, an open-source software tool, is used to collect
and analyze these tweets (Hansen, Shneiderman &
Smith, 2010).
It collects the keyword (singer)’s social activity, such as followers,
following, and tweets.
244. Social network analysis framework
• (3) Communication patterns among
international fans of Kpop across countries
Webometric Analyst analyses
the web impact of documents or
web sites and creates network
diagrams of collections of web
sites, as well as creating networks
and time series analysis of social
web sites (e.g., YouTube, Twitter)
and some specialist web sites
(e.g., Google Books).
It was employed to retrieve and parse data from YouTube.com (Thelwall, 2012).
245. Social network analysis framework
• (3) Communication patterns among
international fans of Kpop across countries
• Using Webometric Analyst, we collected data related to Psy's
Gangnam Style, including video ID, published date, updated
date, video title, video URL, author name, dislikes, likes,
view count, and favorite count.
• The most recent 1,000 comments posted to the GS video clip
uploaded on Psy's official YouTube account (“officialpsy”)
were identified.
246. Social network analysis framework
• A user-to-user network was constructed to reveal hidden
relationships between commenters, i.e., nodes. Three
networks of users were considered: a network of
commentaries, a network of subscriptions, and
subscriptions to a common network.
Nodes and ties for each type of user network
Type | Nodes refer to | Ties occur when
Commentary network | Users commenting on the GS video. | One user replies to a comment by another.
Subscription network | Same as above. | One user subscribes to the channel/account of another.
Subscriptions to a common network | Same as above. | Two users share common channel/account subscriptions on YouTube.
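The three tie definitions can be sketched as set operations over reply and subscription records. This is a minimal illustration, not the study's pipeline: the user names, channel names, and data layout are all invented, and it assumes that subscription ties are only counted between commenters in the sample.

```python
from itertools import combinations

# Hypothetical input data: user and channel names are made up.
replies = [("ann", "bob"), ("bob", "ann"), ("cy", "ann")]  # (replier, replied-to)
subscriptions = {
    "ann": {"officialpsy", "bob"},
    "bob": {"officialpsy"},
    "cy": {"kpopnews"},
}
commenters = set(subscriptions)

# Commentary network: directed tie when one user replies to another.
commentary_ties = set(replies)

# Subscription network: directed tie when one commenter subscribes
# to the channel/account of another commenter.
subscription_ties = {
    (user, channel)
    for user, channels in subscriptions.items()
    for channel in channels
    if channel in commenters
}

# Subscriptions to a common network: undirected tie when two users
# share at least one channel/account subscription.
common_ties = {
    (u, v)
    for u, v in combinations(sorted(commenters), 2)
    if subscriptions[u] & subscriptions[v]
}

print(commentary_ties, subscription_ties, common_ties)
```

Because shared subscriptions to a very popular channel link every pair of its subscribers, the third network tends to be far denser than the first two.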
247. Social network analysis framework
• In terms of the geographical distribution of
commenters, the U.S. had the largest number of
commenters (46.93%, 214, N=456), followed by the U.K.
(7.02%, 32), Canada (6.80%, 31), Korea (4.17%, 19),
the Netherlands (2.85%, 13), Brazil (2.19%, 10), and
Finland (2.19%, 10).
• This reveals that Western users were influential in
determining the flow of GS on YouTube. The sample was
compared to demographics for all YouTube users in the
U.S. According to Quantcast.com,
248. Results
• This structural difference between the NC and the NSCN can
be explained in part by the nature of YouTube.
• In the Web 2.0 social media era, participants in internet forums are more
synchronous: they are more engaged in seeking information and are selectively
exposed to congenial ideas through receiving information highly
personalized by their search and navigation patterns (Choi & Park, 2014).
Comparison of commentary networks and subscriptions to a common network in August
Types | Commentary network | Subscriptions to a common network
Nodes | 234 | 357
Ties | 325 | 47,944
Density (Directed) | 0.006 | 0.377
Density (Undirected) | 0.010 | 0.377
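The directed density figures in the table are consistent with the usual definition of density as observed ties divided by possible ties. The sketch below shows that arithmetic; the function name `density` is an assumption, and the undirected figures are not recomputed because the slide does not state how reciprocated ties were collapsed.

```python
def density(nodes: int, ties: int, directed: bool = True) -> float:
    """Observed ties divided by the number of possible ties."""
    possible = nodes * (nodes - 1)   # ordered pairs of distinct nodes
    if not directed:
        possible //= 2               # unordered pairs
    return ties / possible

commentary = density(234, 325)       # commentary network
common = density(357, 47_944)        # subscriptions to a common network
print(round(commentary, 3), round(common, 3))
```

The gap of roughly two orders of magnitude between the two values is the structural difference that the slide goes on to discuss.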
249. Gangnam Style Communication Networks on YouTube
Figure 1. Commentary network in August: chain shape reflecting a circle
250. Gangnam Style Communication Networks on YouTube
Figure 2. Subscriptions to a common network in August: hub-and-spoke topology
251. • The structural pattern of the NC
• Correlation analysis of common networks
• These results indicate that frequent replies of commenters attracted
some feedback from other commenters in the network because
there was ongoing mutual recognition between repliers and those
being replied to. Male users from the U.S.
 | Outdegree | IndegreeBinary | OutdegreeBinary
Indegree | .546** | .978** | .506**
Outdegree | | .487** | .979**
IndegreeBinary | | | .461**
252. In terms of the structural pattern of the NSCN,
• According to the independent-samples t-test, U.S. (N = 158) and
non-U.S. (N = 180) commenters showed no difference in their channel
co-subscription behaviors (undisclosed = 19)
• Male commenters shared their subscription channels with others
significantly more than female commenters. The average number of
shared subscription channels of male commenters was 58.40
(S.D. = 62.16), whereas that of female commenters was 43.00 (S.D. =
42.55).
253. Discussion & Implication
• Asian popular music has grown rapidly, particularly in the
U.S. and European countries, but such international
diversity is not well reflected in the central channel for
cultural discussions on music. The results have
important implications for open digital settings, providing
music firms with insights specifically focused on users'
approaches (with mixed motives) to information
dissemination.
• Perhaps more importantly, the results have
practical implications for the music industry.
254. An analysis of Twitter communication
on Organic products in Mexico and Korea
using webometrics method.
G.CD. Xanat V. Meza
Advisor: Prof. Han Woo Park
255. Introduction Objectives
• The present study compares social media resources for organic
products between Mexico and Korea in the Twitter sphere over a period
of six months.
• A social media resource is any comment within or URL linked from a
SNS page containing information on the production, consumption and
diffusion of organic products (The Internet Society, 2005).
256. Literature Review Cross cultural research and SNS.
• This study will apply a framework by Marcus & Gould (2001),
which is based on Hofstede’s theory.
• Several researchers (Ess & Sudweeks, 2005; Callahan, 2006; Würtz, 2006;
Gevorgyan & Manucharova, 2009; Snelders, Morel & Havermans, 2011) have
applied it to analyses of website features and users’ interaction.
257. Method Webometrics.
“The study of web-based content
with primarily quantitative methods
for social science research goals and
using techniques that are not specific
to one field of study.” (Thelwall, 2009, p.6).
“Hidden” and “relational” patterns can be discovered by extracting a
sizeable quantity of data from the social media sphere. Webometrics
could be particularly effective in identifying interrelationships
between businesses’ stakeholders (Kim and Nam, 2012) .
258. Method Semantic analysis.
• Analyses semantic relationships between concepts (Sowa, 1987).
• In the present study, the unit of analysis is keywords.
259. Method Data collection procedures.
• Hashtags for “Organic”:
• Organico (in Spanish)
• 유기농 (in Korean)
• The process:
• Collection of data by country
• Classification of data by region.
• Analysis of networks.
• Classification of network influencers.
• Analysis of TLDs.
• Analysis and classification of linked URLs
• Semantic analysis.
• Analysis of hashtags and keywords.
260. Results
RQ1. What is the diffusion path of social media resources for
organic products in Mexico and Korea through Twitter?
Country | MX | KOR
Vertices | 2382 | 7791
Total Edges | 4227 | 37864
Maximum Geodesic Distance (Diameter) | 20 | 15
Average Geodesic Distance | 5.75 | 4.23
Average Betweenness Centrality | 5848.87 | 23139.08
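For readers unfamiliar with the distance metrics in this table, the sketch below computes vertices, edges, maximum geodesic distance (diameter), and average geodesic distance for a toy undirected network using breadth-first search. The edge list and function names are invented for illustration, and the tool used to produce the slide's figures may treat edge direction, duplicate edges, or disconnected pairs differently; betweenness centrality is left out to keep the example short.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def geodesic_summary(edges):
    """Basic size and distance metrics for an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # Collect path lengths over all reachable ordered pairs of distinct vertices.
    lengths = [
        d
        for source in adj
        for target, d in bfs_distances(adj, source).items()
        if target != source
    ]
    return {
        "vertices": len(adj),
        "edges": len(edges),
        "diameter": max(lengths),
        "average_distance": sum(lengths) / len(lengths),
    }

# Hypothetical mention/reply chain between four accounts
summary = geodesic_summary([("a", "b"), ("b", "c"), ("c", "d")])
print(summary)
```

A shorter average geodesic distance, as reported for KOR relative to MX, means that information needs fewer hops on average to travel between two accounts.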
263. Results RQ1.1. How are the networks changing through time?
[Figure: monthly charts, January to June, comparing KOR and MX on edges, vertices, average geodesic distance, and maximum geodesic distance, plus average betweenness centrality (KOR)]
264. Results RQ1.1. How are the networks changing through time?
Correlations for Mexico
 | Vertices | Edges | Maximum Geodesic Distance | Average Geodesic Distance | Betweenness Centrality
Date | .116 | .203 | -.053 | -.019 | .146
Significance | .415 | .149 | .707 | .891 | .303
Correlations for Korea
 | Vertices | Edges | Maximum Geodesic Distance | Average Geodesic Distance | Betweenness Centrality
Date | .449** | .453** | .253 | .252 | .289*
Significance | .001 | .001 | .070 | .071 | .037
Pearson correlation
N = 52
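The correlations above relate each network metric to the date of observation. As a hedged illustration of the underlying arithmetic, the sketch below computes a Pearson coefficient for a toy series and the t statistic that a significance test for r is usually based on. The function names and toy data are assumptions; the p-value lookup itself is omitted because it needs a t distribution table or a statistics library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Toy example: a metric that grows linearly over four observation dates
r_toy = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])

# t statistic implied by the reported Korea result (r = .449, N = 52)
t_korea = t_statistic(0.449, 52)
print(r_toy, t_korea)
```

With 50 degrees of freedom, a t value of this size corresponds to a two-tailed p of roughly .001, which is in line with the significance reported for Korean vertices.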
267. Results RQ1.2. Who are influential players in
diffusing organic products on Twitter?
Indegree centrality
User | Value | Type of user | Location
KEN_QUOTES | 136 | General public | Mexico City
mx_df | 55 | Alternative media | Mexico City
En_laDelValle | 46 | Business | Mexico City
PublimetroMX | 40 | Mass media | Mexico City
tonygalifayad | 37 | Celebrity | Puebla
laorganizacion | 36 | Business | Oaxaca
Mean: 58; Standard Deviation: 38.691
Outdegree centrality
User | Value | Type of user | Location
ExpoOrganicos | 14 | Business | Mexico City
homeroblas | 13 | Celebrity | Undefined
laorganizacion | 13 | Business | Oaxaca
ChiczaMexico | 13 | Business | Undefined
HacklCondesa | 10 | Business | Mexico City
Tianguis_ | 19 | Business | Mexico City
Mean: 25; Standard Deviation: 6.022
Betweenness centrality
User | Value | Type of user | Location
KEN_QUOTES | 212381.479 | General public | Mexico City
ChiczaMexico | 111712.703 | Business | Undefined
mx_df | 98234.672 | Alternative media | Mexico City
ExpoOrganicos | 97670.240 | Business | Mexico City
laorganizacion | 86222.745 | Business | Oaxaca
anditagar | 316512.780 | General public | Undefined
Mean: 430754; Standard Deviation: 88351.077
Eigenvector centrality
User | Value | Type of user | Location
KEN_QUOTES | 0.020 | General public | Mexico City
ExpoOrganicos | 0.010 | Business | Mexico City
homeroblas | 0.008 | Celebrity | Undefined
laorganizacion | 0.0007 | Business | Oaxaca
mx_df | 0.0006 | Alternative media | Mexico City
ChiczaMexico | .00006 | Business | Undefined
Mean: 0.0066; Standard Deviation: 0.0058
268. Results RQ1.2. Who are influential players in
diffusing organic products on Twitter?
[Chart of user types: alternative media 1, politician 2, business 6, citizen 2, mass media 1]
269. Results RQ1.2. Who are influential players in
diffusing organic products on Twitter?
Indegree centrality
User | Value | Type of user | Location
cjtlj | 963 | Business | Undefined
StarbucksKorea | 368 | Business | Seoul
wikitree | 288 | Alternative media | Undefined
six2k | 245 | General public | Seoul
amazingkiss1104 | 237 | General public | Undefined
Mangosix_kr | 221 | Business |
Mean: 387; Standard Deviation: 287.109
Outdegree centrality
User | Value | Type of user | Location
cjtlj | 200 | Business | Undefined
GrouponKorea | 125 | Business | Seoul
doolbob | 104 | Alternative media | Undefined
erounnet | 84 | Mass media | Undefined
sunshine7892 | 80 | Business | Gyeonggi
elelohemh | 74 | Business | Gyeonggi
Mean: 111; Standard Deviation: 47.381
Betweenness centrality
User | Value | Type of user | Location
cjtlj | 9497927.968 | Business | Undefined
StarbucksKorea | 3418206.580 | Business | Seoul
amazingkiss1104 | 3385445.805 | General public | Undefined
wikitree | 3336795.105 | Alternative media | Undefined
six2k | 2954885.522 | General public | Seoul
Sunshine7892 | 2082136.391 | Business | Gyeonggi
Mean: 4112566; Standard Deviation: 2686173.906
Eigenvector centrality
User | Value | Type of user | Location
cjtlj | 0.015 | Business | Undefined
Mangosix_kr | 0.006 | Business | Undefined
StarbucksKorea | 0.005 | Business | Seoul
mosfkorea | 0.004 | Government | Sejong
melvita_korea | 0.004 | Business | Seoul
busanbank | 0.004 | Business | Busan
Mean: 0.0063; Standard Deviation: 0.0043
Standard Deviation 2686173.906 Standard Deviation 0.0043