ASIST Webinar 12/2013

Conducting
Twitter Research
Kim Holmberg, PhD
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
(e) kim.holmberg@abo.fi
(w3) http://kimholmberg.fi
Cascades, Islands, or Streams?
Time, Topic, and Scholarly Activities in
Humanities and Social Science Research
Indiana University, Bloomington, USA
University of Wolverhampton, UK
Université de Montréal, Canada
Cascades, Islands, or Streams?
Integrate several datasets representing a
broad range of scholarly activities
Use methodological and data triangulation
to explore the lifecycle of topics within and
across a range of scholarly activities

Develop transparent tools and techniques
to enable future predictive analyses
I’m preparing slides for an #ASIST #webinar
DATA COLLECTION
Webometric Analyst, for data
collection via Twitter’s API, data
cleaning and analysis
http://lexiurl.wlv.ac.uk/
For detailed instructions visit
http://lexiurl.wlv.ac.uk/searcher/twitter.htm
DATA COLLECTION
Other data collection tools
Twitter Archiving Google Spreadsheet (TAGS)
http://mashe.hawksey.info/2013/02/twitter-archive-tagsv5/

HootSuite
http://hootsuite.com/
Or you can write your own script:
https://dev.twitter.com/
http://140dev.com/free-twitter-api-source-code-library/twitter-database-server/
DATA COLLECTION
Information dissemination
Influence, popularity
Networks, communities
Content, trends
Time series, sentiment

Tweet
Retweet or RT
@username
#Hashtag
Tweeters
DATA EXTRACTION
Use Webometric Analyst to sort the data and
depending on your research goals, to extract
URLs, hashtags or usernames or to remove
stopwords from the tweets
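
For those writing their own script instead, the same extraction step can be sketched in a few lines of Python. This is only an illustration, not how Webometric Analyst works internally: the sample tweets, the tiny stopword list and the function names are all assumptions made for the example.

```python
import re

# Hypothetical sample tweets; in practice these come from your collected data.
tweets = [
    "I'm preparing slides for an #ASIST #webinar",
    "RT @kholmber: Slides on Twitter research http://kimholmberg.fi #ASIST",
]

# A tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"i'm", "for", "an", "on", "the", "a", "rt"}

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def extract_entities(text):
    """Return the URLs, hashtags and mentioned usernames found in one tweet."""
    return {
        "urls": URL_RE.findall(text),
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(text)],
        "mentions": [m.lower() for m in MENTION_RE.findall(text)],
    }


def remove_stopwords(text):
    """Drop URLs, hashtags, mentions and stopwords; return the remaining words."""
    text = URL_RE.sub(" ", text)
    text = re.sub(r"[#@]\w+", " ", text)
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]
```

Hashtags and usernames are lowercased here so that #ASIST and #asist are counted as the same tag; whether that is appropriate depends on your research goals.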
ETHICS
Data collected from social media sites is openly available on the web,
hence it is already fully public and does not raise any ethical concerns
(Wilkinson & Thelwall, 2011). However, in some cases the content of the
tweets, blog entries or comments collected may contain identifiable,
sensitive information. Although already public, publicizing such
information by discussing it in an academic article could potentially have
unwanted side-effects. Hence, one must consider anonymising all data
and treating it confidentially.

Wilkinson, D. & Thelwall, M. (2011). Researching personal information
on the public Web: Methods and ethics, Social Science Computer Review,
vol. 29, no. 4, pp. 387-401.
What can we research?
1. Networks (users, words, topics, …)
2. Content (tweets, RTs, hashtags, …)
FIRST STEPS
Step 1. What do you want to research?
Step 2. Collect tweets that are relevant for your research
questions
Step 3. Sort and clean the tweets (e.g. tweets vs.
retweets, remove tweets in other languages,
remove spam, remove false positives, ...)
Step 4. Extract the data that you need (e.g. tweeters,
usernames mentioned, hashtags, URLs, ...)
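
Step 3 could be sketched as follows for anyone working with a plain list of tweet texts. The "RT @" prefix test and the `spam_terms` parameter are assumptions made for this illustration; language filtering is left out because it needs a language-detection tool.

```python
def is_retweet(text):
    """Treat a tweet as a retweet if it starts with 'RT @' (a common convention)."""
    return text.strip().lower().startswith("rt @")


def clean_tweets(tweets, spam_terms=()):
    """Split tweets into originals and retweets, dropping duplicates and spam.

    `tweets` is a list of tweet texts; `spam_terms` is a hypothetical list of
    phrases that mark a tweet as spam or as a false positive for your query.
    """
    seen = set()
    originals, retweets = [], []
    for text in tweets:
        key = " ".join(text.lower().split())  # normalise case and whitespace
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        if any(term.lower() in key for term in spam_terms):
            continue  # skip spam / false positives
        (retweets if is_retweet(text) else originals).append(text)
    return originals, retweets
```

Keeping originals and retweets in separate lists makes it easy to analyse them separately later or to merge them again.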

1 NETWORK ANALYSIS
Possible research questions:
How are different communities related to A
connected to each other?
Who is most central/influential (has most
connections) in a certain network of tweeters?
How is information disseminated in the network?
Who are the actors involved in a certain network?
What kind of local communities are there in a
certain network and what do those communities
represent?
and many more...
TWITTER NETWORK DATA
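[Screenshot: a Twitter profile header showing 1,248 tweets together with the following and followers counts (111 and 290); the three counts are marked 1, 2 and 3]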
CREATE THE NETWORK
ALTERNATIVE 1
Webometric Analyst can create a network file (.net) based
on the connections between tweeters
and those they mention (@username) in
their tweets.
Detailed instructions on how to create
and analyze conversational networks on
Twitter are available at:
http://lexiurl.wlv.ac.uk/searcher/twitterConversationNetworks.html
CREATE THE NETWORK
ALTERNATIVE 2
Sort the data
Then convert the data
into a network file

Source      Target
Username1   Username2
Username1   Username3
Username2   Username3
Username3   Username1
Username3   Username2
Username3   Username4
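
The conversion can also be scripted. The sketch below writes the Source/Target pairs from this slide in the Pajek .net layout (a `*Vertices` section followed by `*Arcs`); the function name `to_pajek` and the choice to turn repeated pairs into arc weights are assumptions made for this illustration.

```python
from collections import Counter

# Edge list from the Source/Target pairs above (tweeter -> mentioned username).
edges = [
    ("Username1", "Username2"),
    ("Username1", "Username3"),
    ("Username2", "Username3"),
    ("Username3", "Username1"),
    ("Username3", "Username2"),
    ("Username3", "Username4"),
]


def to_pajek(edges):
    """Return the text of a Pajek .net file for a directed, weighted edge list."""
    # Number the vertices from 1 in order of first appearance.
    ids = {}
    for source, target in edges:
        for name in (source, target):
            ids.setdefault(name, len(ids) + 1)
    # Repeated source-target pairs become the weight of a single arc.
    weights = Counter(edges)
    lines = [f"*Vertices {len(ids)}"]
    lines += [f'{number} "{name}"' for name, number in ids.items()]
    lines.append("*Arcs")
    lines += [f"{ids[s]} {ids[t]} {w}" for (s, t), w in weights.items()]
    return "\n".join(lines) + "\n"


net_text = to_pajek(edges)
```

Saving `net_text` to a file with a .net extension should give a file that network tools which read the Pajek format can open.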
OBJECTS OF ANALYSIS
1. An actor's
(person, group,
organisation,
word, etc.)
position in the
network
2. Structure of the
network (in
relation to other
networks) or
subnetworks
(clusters)
AN ACTOR'S POSITION
Degree centrality
Used to locate actors with
influence in the network or
those that are in a position
where they can spread
information in the network.
Can be divided into in- and
outdegree.
How many other actors can
this actor reach directly?
Other commonly used centrality
measures: closeness,
betweenness, eigenvector
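
As a minimal sketch of in- and outdegree, the code below counts each actor's distinct neighbours in a directed mention network. The edge list is hypothetical, and the counts are raw (not normalised by network size, as some tools do).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {actor: (indegree, outdegree)} counting distinct neighbours."""
    incoming = defaultdict(set)
    outgoing = defaultdict(set)
    for source, target in edges:
        outgoing[source].add(target)
        incoming[target].add(source)
    actors = set(incoming) | set(outgoing)
    return {a: (len(incoming[a]), len(outgoing[a])) for a in sorted(actors)}


# Hypothetical mention network: (tweeter, mentioned username).
edges = [
    ("Username1", "Username2"),
    ("Username1", "Username3"),
    ("Username2", "Username3"),
    ("Username3", "Username1"),
    ("Username3", "Username2"),
    ("Username3", "Username4"),
]
degrees = degree_centrality(edges)
```

In a mention network, a high indegree means an actor is mentioned by many others, while a high outdegree means the actor mentions many others.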
NETWORK STRUCTURE
Communities in the
network
Tells something about the
structure of the network
and how the different
actors are spread and
connected to each other in
the network
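
A very rough first look at structure is to split the network into groups of actors that are connected to each other at all. The sketch below finds weakly connected components, which is a much simpler technique than the community detection offered by network analysis tools and is named here only as a stand-in; the function name is an assumption.

```python
from collections import defaultdict


def connected_groups(edges):
    """Group actors into weakly connected components (edge direction ignored)."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            actor = stack.pop()
            if actor in group:
                continue
            group.add(actor)
            stack.extend(neighbours[actor] - group)
        seen |= group
        groups.append(sorted(group))
    return groups
```

Actors in different groups never mention each other, directly or indirectly; communities inside one large group need a dedicated method.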
NETWORK ANALYSIS
- tools of the trade

Gephi (for network visualizations)
http://gephi.org/

Ucinet (for network analysis and visualization)
https://sites.google.com/site/ucinetsoftware/

Pajek (for network analysis and visualization)
http://pajek.imfm.si/doku.php
Analyzing astrophysicists’ conversational
connections on Twitter
Holmberg, Haustein, Bowman & Peters (work in progress)

Communities detected based on
the conversational connections
in astrophysicists’ tweets
Analyzing astrophysicists’ conversational
connections on Twitter
Holmberg, Haustein, Bowman & Peters (work in progress)
[Figure: stacked bar chart, 0-100 %. Percentage of people with different roles in the 7 communities: Mod0 (n=68), Mod1 (n=88), Mod2 (n=40), Mod3 (n=180), Mod4 (n=30), Mod5 (n=109), Mod6 (n=3). Roles: Students, Other astrophysicists, Researcher, Other researchers, Unknown, Other, Science communicator, Organization or association, Corporative, Teacher or educator, Amateur astronomer.]
Climate change on Twitter: topics, communities
and conversations about the IPCC
Pearce, Holmberg, Hellsten & Nerlich (under review).

Three groups coded
based on their stance
to climate change:
• Convinced
• Skeptic
• Neutral
1 NETWORK ANALYSIS
Summary
Step 4. Extract the data that you need (e.g. Tweeters and the
usernames they mentioned, following or followers
lists, ...)
Step 5. Convert your data into a network file
Step 6. Visualize and analyse the network
In addition you may want to run some social network
analysis on the network (e.g. centrality) or code the actors
according to suitable titles (e.g. work roles, opinion about
something, etc.)
2 CONTENT ANALYSIS
Possible research questions:
How is topic A discussed on Twitter?
How do certain activities on Twitter correlate with
offline activities?
How popular is A compared with B, based on
visibility on Twitter?
What is the public opinion (of tweeters) about A?
What are tweeters saying about A?

and many more...
15,672
Quantitative

Qualitative
CONTENT ANALYSIS
- manual coding

Positive-Neutral-Negative
Scientific-Not scientific-Not clear
Skeptic-Convinced-Neutral
Personal-Work related
Astrophysics-Biochemistry-Cheminformatics ...
Pro something-Against something
and many more depending on your research goals...
Holmberg, K. & Thelwall, M. (2013). Disciplinary differences in Twitter
scholarly communication. In the Proceedings of 14th International Society
for Scientometrics and Informetrics conference, 2013, Vienna, Austria.
Available at: http://issi2013.org/proceedings.html.
[Figure: stacked bar chart, 0-40%. Scientific content of the tweets by communication type (Retweets, Conversations, Links, Other) in Astrophysics, Biochemistry, Digital humanities, Economics and History of science.]
CONTENT ANALYSIS
- tools of the trade

VOSviewer (to extract noun-phrases from tweets)
http://www.vosviewer.com/

BibExcel (for co-word analysis)
http://www8.umu.se/inforsk/Bibexcel/

Notepad++ (to search and replace in your data)
http://notepad-plus-plus.org/

Screaming Frog SEO Spider (to decode short urls)
http://www.screamingfrog.co.uk/seo-spider/
Noun-phrases from one of the communities

Analyzing astrophysicists’ conversational connections on Twitter
Holmberg, Haustein, Bowman & Peters (work in progress)
TIME SERIES
- tools of the trade
Mozdeh (Persian for Good news)
Visit http://mozdeh.wlv.ac.uk/index.html
for free download and instructions
TIME SERIES

Pearce, Holmberg, Hellsten & Nerlich (under review). Climate change on
Twitter: topics, communities and conversations about the IPCC.
The Next Pope?
699,337 tweets collected
between February 12, 2013
and March 11, 2013.
Pope Francis - Jorge Mario Bergoglio
Was mentioned in 9 tweets...
ONLINE/OFFLINE
CORRELATIONS
Comparison of Twitter and publication activity and impact
• publications and tweets per day: ρ=−0.339*
• citation rate and tweets per day: ρ=−0.457**
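
For readers unfamiliar with ρ, the sketch below computes Spearman's rank correlation as the Pearson correlation of the two rank vectors, with tied values sharing their mean rank. The five data points are hypothetical and are not the study's data; significance testing (the asterisks above) is not covered.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for five researchers (not the study's data).
tweets_per_day = [0.1, 0.5, 1.2, 3.0, 8.4]
publications = [12, 9, 10, 4, 2]
rho = spearman_rho(tweets_per_day, publications)
```

A negative ρ, as in the results above, means that researchers who rank higher on one measure tend to rank lower on the other. The function divides by zero if all values in one list are equal, so such input needs to be handled separately.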

Haustein, Bowman, Holmberg, Larivière, & Peters (under review). Astrophysicists on
Twitter: An in-depth analysis of tweeting and scientific publication behavior.
ONLINE/OFFLINE
CORRELATIONS
Overall similarity between abstracts and tweets is low
• cosine=0.081
• 4.1% of 50,854 tweet NPs in abstracts
• 16.0% of 12,970 abstract NPs in tweets
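
The two kinds of measure above can be sketched as follows: cosine similarity between two noun-phrase frequency vectors, and the share of one set of noun phrases that also occurs in the other. The noun phrases are hypothetical, and counting distinct phrases for the overlap is an assumption made for this illustration; the study may have counted differently.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(freq_a, freq_b):
    """Cosine of the angle between two term-frequency vectors (as Counters)."""
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def share_found_in(terms_a, terms_b):
    """Percentage of the distinct terms in A that also occur in B."""
    a, b = set(terms_a), set(terms_b)
    return 100.0 * len(a & b) / len(a) if a else 0.0


# Hypothetical noun phrases (not the study's data).
tweet_nps = ["dark matter", "coffee", "dark matter", "conference", "telescope"]
abstract_nps = ["dark matter", "telescope", "redshift", "galaxy cluster"]

cosine = cosine_similarity(Counter(tweet_nps), Counter(abstract_nps))
tweets_in_abstracts = share_found_in(tweet_nps, abstract_nps)
abstracts_in_tweets = share_found_in(abstract_nps, tweet_nps)
```

The overlap is not symmetric: when the two vocabularies differ in size, as in the figures above, the same shared phrases make up a different percentage of each side.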

Haustein, Bowman, Holmberg, Larivière, & Peters (under review). Astrophysicists on
Twitter: An in-depth analysis of tweeting and scientific publication behavior.
2 CONTENT ANALYSIS
Summary
Step 4. Extract the data that you need (e.g. hashtags,
usernames, original tweets, ...)

And then, depending on your research goals:
Step 5A. Analyze frequencies (e.g. most used hashtags, etc.)
Step 5B. Classify the tweets manually
Step 5C. Extract the noun phrases and create a co-mention
network of them with VOSviewer
Step 5D. Analyze time series of certain word/hashtag
occurrences
Step 5E. Run sentiment analysis on the tweets
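
Steps 5A and 5D can be sketched with the Python standard library. The tweets, the timestamp format and the function names below are assumptions made for this illustration, not the output format of any particular collection tool.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical (timestamp, text) pairs; the timestamp format is an assumption.
tweets = [
    ("2013-03-10 09:15", "Who will be the next #pope? #conclave"),
    ("2013-03-10 18:40", "Waiting for the #conclave"),
    ("2013-03-11 07:05", "Smoke watch begins #conclave #Vatican"),
]


def hashtag_frequencies(tweets):
    """Count how often each hashtag occurs (case-insensitive)."""
    counts = Counter()
    for _, text in tweets:
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", text))
    return counts


def daily_counts(tweets, term):
    """Count, per calendar day, the tweets whose text contains `term`."""
    per_day = Counter()
    for stamp, text in tweets:
        if term.lower() in text.lower():
            day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date().isoformat()
            per_day[day] += 1
    return dict(sorted(per_day.items()))


top_hashtags = hashtag_frequencies(tweets).most_common(1)
series = daily_counts(tweets, "#conclave")
```

Days without a matching tweet do not appear in the result, so fill them in with zeros before plotting a time series.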
During this hour
over 20,820,000
tweets were sent
Thank you for your attention

Kim Holmberg
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
kim.holmberg@abo.fi
http://kimholmberg.fi
@kholmber
Acknowledgements
This presentation is based upon work supported by the international funding initiative Digging into Data. Specifically, funding comes
from the National Science Foundation in the United States (Grant No. 1208804), JISC in the United Kingdom, and the Social Sciences and
Humanities Research Council of Canada.

Contenu connexe

Tendances

A History Of First Search Engine S
A History Of First Search Engine SA History Of First Search Engine S
A History Of First Search Engine S
Earnestine336Prue
 
Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer Apps
Jie Bao
 

Tendances (20)

Open software and knowledge for MIOSS
Open software and knowledge for MIOSS Open software and knowledge for MIOSS
Open software and knowledge for MIOSS
 
Elsevier - Labs on Line
Elsevier - Labs on Line Elsevier - Labs on Line
Elsevier - Labs on Line
 
Open software and knowledge for MIOSS
Open software and knowledge for MIOSSOpen software and knowledge for MIOSS
Open software and knowledge for MIOSS
 
Professor Hendrik Speck - Information Mining in the Social Web. Empolis Execu...
Professor Hendrik Speck - Information Mining in the Social Web. Empolis Execu...Professor Hendrik Speck - Information Mining in the Social Web. Empolis Execu...
Professor Hendrik Speck - Information Mining in the Social Web. Empolis Execu...
 
Automatic Extraction of Knowledge from the Literature
Automatic Extraction of Knowledge from the LiteratureAutomatic Extraction of Knowledge from the Literature
Automatic Extraction of Knowledge from the Literature
 
Disseminating Research and Managing Your Online Reputation
Disseminating Research and Managing Your Online Reputation Disseminating Research and Managing Your Online Reputation
Disseminating Research and Managing Your Online Reputation
 
Data, data, data
Data, data, dataData, data, data
Data, data, data
 
Unknown Unknowns
Unknown UnknownsUnknown Unknowns
Unknown Unknowns
 
A History Of First Search Engine S
A History Of First Search Engine SA History Of First Search Engine S
A History Of First Search Engine S
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature
 
Professor Hendrik Speck - E*Lobbying. Elobbying.
Professor Hendrik Speck - E*Lobbying. Elobbying.Professor Hendrik Speck - E*Lobbying. Elobbying.
Professor Hendrik Speck - E*Lobbying. Elobbying.
 
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
 
Using Visualizations to Monitor Changes and Harvest Insights from a Global-sc...
Using Visualizations to Monitor Changes and Harvest Insights from a Global-sc...Using Visualizations to Monitor Changes and Harvest Insights from a Global-sc...
Using Visualizations to Monitor Changes and Harvest Insights from a Global-sc...
 
Challenges in-archiving-twitter
Challenges in-archiving-twitterChallenges in-archiving-twitter
Challenges in-archiving-twitter
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literatureAmanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature
 
Super Searcher
Super SearcherSuper Searcher
Super Searcher
 
Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer Apps
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData
 
Automatic Extraction of Knowledge from the Literature
Automatic Extraction of Knowledge from the LiteratureAutomatic Extraction of Knowledge from the Literature
Automatic Extraction of Knowledge from the Literature
 
What to expect when you are visualizing
What to expect when you are visualizingWhat to expect when you are visualizing
What to expect when you are visualizing
 

En vedette

En vedette (8)

Sosiaalinen media elinkeinopolitiikan toteuttamisessa
Sosiaalinen media elinkeinopolitiikan toteuttamisessaSosiaalinen media elinkeinopolitiikan toteuttamisessa
Sosiaalinen media elinkeinopolitiikan toteuttamisessa
 
From Library 2.0 to Library 3D
From Library 2.0 to Library 3DFrom Library 2.0 to Library 3D
From Library 2.0 to Library 3D
 
The impact of retweeting on altmetrics
The impact of retweeting on altmetricsThe impact of retweeting on altmetrics
The impact of retweeting on altmetrics
 
Sociala medier - användning, trender och analys
Sociala medier - användning, trender och analysSociala medier - användning, trender och analys
Sociala medier - användning, trender och analys
 
Hur IKT förändrar skolan
Hur IKT förändrar skolanHur IKT förändrar skolan
Hur IKT förändrar skolan
 
Analyzing the climate change debate on Twitter – content and differences bet...
Analyzing the climate change debate on Twitter – content and differences bet...Analyzing the climate change debate on Twitter – content and differences bet...
Analyzing the climate change debate on Twitter – content and differences bet...
 
Disciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly CommunicationDisciplinary Differences in Twitter Scholarly Communication
Disciplinary Differences in Twitter Scholarly Communication
 
Sociala medier och biblioteket
Sociala medier och biblioteketSociala medier och biblioteket
Sociala medier och biblioteket
 

Similaire à Conducting Twitter Reserch

Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platform
Fayan TAO
 
Disseminating Scientific Research via Twitter: Research Evidence and Practica...
Disseminating Scientific Research via Twitter: Research Evidence and Practica...Disseminating Scientific Research via Twitter: Research Evidence and Practica...
Disseminating Scientific Research via Twitter: Research Evidence and Practica...
Katja Reuter, PhD
 

Similaire à Conducting Twitter Reserch (20)

myExperiment @ Nettab
myExperiment @ NettabmyExperiment @ Nettab
myExperiment @ Nettab
 
Mike Thelwall: Introduction to Webometrics
Mike Thelwall: Introduction to WebometricsMike Thelwall: Introduction to Webometrics
Mike Thelwall: Introduction to Webometrics
 
Keeping up: strategic use of online social networks for librarian current awa...
Keeping up: strategic use of online social networks for librarian current awa...Keeping up: strategic use of online social networks for librarian current awa...
Keeping up: strategic use of online social networks for librarian current awa...
 
Science dissemination 2.0: Social media for researchers
Science dissemination 2.0: Social media for researchersScience dissemination 2.0: Social media for researchers
Science dissemination 2.0: Social media for researchers
 
Social media for researchers: Increase your research competitiveness using We...
Social media for researchers: Increase your research competitiveness using We...Social media for researchers: Increase your research competitiveness using We...
Social media for researchers: Increase your research competitiveness using We...
 
Stepping out of the echo chamber - Alternative indicators of scholarly commun...
Stepping out of the echo chamber - Alternative indicators of scholarly commun...Stepping out of the echo chamber - Alternative indicators of scholarly commun...
Stepping out of the echo chamber - Alternative indicators of scholarly commun...
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
 
Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platform
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
HIBERLINK: Reference Rot and Linked Data: Threat and Remedy
HIBERLINK: Reference Rot and Linked Data: Threat and RemedyHIBERLINK: Reference Rot and Linked Data: Threat and Remedy
HIBERLINK: Reference Rot and Linked Data: Threat and Remedy
 
Social Media and Scientific Research How Semantic Technologies Enhance Colla...
Social Media and Scientific ResearchHow Semantic Technologies Enhance Colla...Social Media and Scientific ResearchHow Semantic Technologies Enhance Colla...
Social Media and Scientific Research How Semantic Technologies Enhance Colla...
 
Twitter research overview
Twitter research overviewTwitter research overview
Twitter research overview
 
Roar Presentation To School Of Psychology
Roar Presentation To School Of PsychologyRoar Presentation To School Of Psychology
Roar Presentation To School Of Psychology
 
Information Skills: 3. Social Media (Natural Sciences, Bangor University)
Information Skills: 3. Social Media (Natural Sciences, Bangor University)    Information Skills: 3. Social Media (Natural Sciences, Bangor University)
Information Skills: 3. Social Media (Natural Sciences, Bangor University)
 
Condor3
Condor3 Condor3
Condor3
 
UW President's Summit 2011 - Social Media Workshop
UW President's Summit 2011 - Social Media WorkshopUW President's Summit 2011 - Social Media Workshop
UW President's Summit 2011 - Social Media Workshop
 
Big data in social sciences and IT developments (ethics considerations)
Big data in social sciences and IT developments (ethics considerations)Big data in social sciences and IT developments (ethics considerations)
Big data in social sciences and IT developments (ethics considerations)
 
What happen after crawling big data?
What happen after crawling big data?What happen after crawling big data?
What happen after crawling big data?
 
Twitter analytics
Twitter analyticsTwitter analytics
Twitter analytics
 
Disseminating Scientific Research via Twitter: Research Evidence and Practica...
Disseminating Scientific Research via Twitter: Research Evidence and Practica...Disseminating Scientific Research via Twitter: Research Evidence and Practica...
Disseminating Scientific Research via Twitter: Research Evidence and Practica...
 

Plus de Kim Holmberg

Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
Drivers of higher education institutions’ visibility: a study of UK HEIs soci...Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
Kim Holmberg
 
The conceptual landscape of iSchools: Examining current research interests of...
The conceptual landscape of iSchools: Examining current research interests of...The conceptual landscape of iSchools: Examining current research interests of...
The conceptual landscape of iSchools: Examining current research interests of...
Kim Holmberg
 
Sosiaalinen media yritysten käytössä
Sosiaalinen media yritysten käytössäSosiaalinen media yritysten käytössä
Sosiaalinen media yritysten käytössä
Kim Holmberg
 

Plus de Kim Holmberg (17)

Altmetrics and research profiles for 10 universities in Finland
Altmetrics and research profiles for 10 universities in FinlandAltmetrics and research profiles for 10 universities in Finland
Altmetrics and research profiles for 10 universities in Finland
 
Measuring the societal impact of open science
Measuring the societal impact of open scienceMeasuring the societal impact of open science
Measuring the societal impact of open science
 
Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
Drivers of higher education institutions’ visibility: a study of UK HEIs soci...Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
Drivers of higher education institutions’ visibility: a study of UK HEIs soci...
 
Altmetrics - Measuring the impact of scientific activities
Altmetrics - Measuring the impact of scientific activitiesAltmetrics - Measuring the impact of scientific activities
Altmetrics - Measuring the impact of scientific activities
 
Measuring the societal impact of open science (1st presentation of a research...
Measuring the societal impact of open science (1st presentation of a research...Measuring the societal impact of open science (1st presentation of a research...
Measuring the societal impact of open science (1st presentation of a research...
 
Hur IT förändrar skolan
Hur IT förändrar skolanHur IT förändrar skolan
Hur IT förändrar skolan
 
Identifying rumours on Twitter
Identifying rumours on TwitterIdentifying rumours on Twitter
Identifying rumours on Twitter
 
Combining network structures and meanings: Tweeting over the IPCC report
Combining network structures and meanings: Tweeting over the IPCC reportCombining network structures and meanings: Tweeting over the IPCC report
Combining network structures and meanings: Tweeting over the IPCC report
 
The conceptual landscape of iSchools: Examining current research interests of...
The conceptual landscape of iSchools: Examining current research interests of...The conceptual landscape of iSchools: Examining current research interests of...
The conceptual landscape of iSchools: Examining current research interests of...
 
Information Strategies
Information StrategiesInformation Strategies
Information Strategies
 
Sociala medier i undervisning
Sociala medier i undervisningSociala medier i undervisning
Sociala medier i undervisning
 
Co-inlinking to a municipal Web space
Co-inlinking to a municipal Web spaceCo-inlinking to a municipal Web space
Co-inlinking to a municipal Web space
 
Sosiaalinen media yritysten käytössä
Sosiaalinen media yritysten käytössäSosiaalinen media yritysten käytössä
Sosiaalinen media yritysten käytössä
 
From Library 2.0 To Library 3D
From Library 2.0 To Library 3DFrom Library 2.0 To Library 3D
From Library 2.0 To Library 3D
 
Library 2.0
Library 2.0Library 2.0
Library 2.0
 
Avatarit opiskelijoina
Avatarit opiskelijoinaAvatarit opiskelijoina
Avatarit opiskelijoina
 
Local government web sites in Finland: A geographic and webometric analysis
Local government web sites in Finland: A geographic and webometric analysisLocal government web sites in Finland: A geographic and webometric analysis
Local government web sites in Finland: A geographic and webometric analysis
 

Dernier

Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Dernier (20)

Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 

Conducting Twitter Reserch

  • 1. ASIST Webinar 12/2013 Conducting Twitter Research Kim Holmberg, PhD Statistical Cybermetrics Research Group University of Wolverhampton, UK (e) kim.holmberg@abo.fi (w3) http://kimholmberg.fi
  • 2. Cascades, Islands, or Streams? Time, Topic, and Scholarly Activities in Humanities and Social Science Research Indiana University, Bloomington, USA University of Wolverhampton, UK Université de Montréal, Canada
  • 3. Cascades, Islands, or Streams? Integrate several datasets representing a broad range of scholarly activities Use methodological and data triangulation to explore the lifecycle of topics within and across a range of scholarly activities Develop transparent tools and techniques to enable future predictive analyses
  • 4. I’m preparing slides for an #ASIST #webinar
  • 5. DATA COLLECTION Webometric Analyst, for data collection via Twitter’s API, data cleaning and analysis http://lexiurl.wlv.ac.uk/ For detailed instructions visit http://lexiurl.wlv.ac.uk/searcher/twitter.htm
  • 6. DATA COLLECTION Other data collection tools Twitter Archiving Google Spreadsheet (TAGS) http://mashe.hawksey.info/2013/02/twitter-archive-tagsv5/ HootSuite http://hootsuite.com/ Or you can write your own script: https://dev.twitter.com/ http://140dev.com/free-twitter-api-source-code-library/twitter-database-server/
  • 8. DATA EXTRACTION Use Webometric Analyst to sort the data and depending on your research goals, to extract URLs, hashtags or usernames or to remove stopwords from the tweets
  • 9. ETHICS Data collected from social media sites is openly available on the web, hence it is already fully public and does not raise any ethical concerns (Wilkinson & Thelwall, 2011). However, in some cases the content of the tweets, blog entries or comments collected may contain identifiable, sensitive information. Although already public, publicizing such information by discussing it in an academic article could potentially have unwanted side-effects. Hence, one must consider to anonymise all data and treate it confidentially. Wilkinson, D. & Thelwall, M. (2011). Researching personal information on the public Web: Methods and ethics, Social Science Computer Review, vol. 29, no. 4, pp. 387-401.
  • 10. What can we research? 1. Networks (users, words, topics, …) 2. Content (tweets, RTs, hashtags, …)
  • 11. FIRST STEPS Step 1. What do you want to research? Step 2. Collect tweets that are relevant for your research questions Step 3. Sort and clean the tweets (e.g. tweets vs. retweets, remove tweets in other languages, remove spam, remove false positives, ...) Step 4. Extract the data that you need (e.g. tweeters, usernames mentioned, hashtags, URLs, ...)
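As an illustration of Step 3, the sketch below separates original tweets from retweets. It assumes that retweets appear in the collected text in the classic "RT @username" form; the pattern, the function name `split_retweets` and the sample tweets are assumptions, and language, spam and false-positive filtering would still need separate handling.

```python
import re

# Assumption: a retweet starts with "RT @username" (case-insensitive).
RETWEET_RE = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)


def split_retweets(tweets):
    """Split a list of tweet texts into (originals, retweets)."""
    originals, retweets = [], []
    for text in tweets:
        if RETWEET_RE.match(text):
            retweets.append(text)
        else:
            originals.append(text)
    return originals, retweets


# Made-up sample data for illustration.
sample = [
    "I'm preparing slides for an #ASIST #webinar",
    "RT @kholmber: I'm preparing slides for an #ASIST #webinar",
    "Looking forward to the #webinar",
]
originals, retweets = split_retweets(sample)
print(len(originals), "original tweets,", len(retweets), "retweets")
```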
  • 12. 1 NETWORK ANALYSIS Possible research questions: How are different communities related to A connected to each other? Who is most central/influential (has most connections) in a certain network of tweeters? How is information disseminated in the network? Who are the actors involved in a certain network? What kind of local communities are there in a certain network and what do those communities represent? And many more...
  • 14. CREATE THE NETWORK ALTERNATIVE 1 This creates a network file (.net) based on the connections between tweeters and those they mention (@username) in their tweets. Detailed instructions on how to create and analyze conversational networks on Twitter are available at: http://lexiurl.wlv.ac.uk/searcher/twitterConversationNetworks.html
  • 15. CREATE THE NETWORK ALTERNATIVE 2 Sort the data. Then convert the data into a network file. Source → Target: Username1 → Username2; Username1 → Username3; Username2 → Username3; Username3 → Username1; Username3 → Username2; Username3 → Username4
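One way to do the conversion by hand is sketched below. It assumes that the Source and Target columns on the slide pair up row by row, and it writes the directed edges in Pajek's plain-text .net layout (a `*Vertices` section followed by `*Arcs`). The function name `to_pajek` is hypothetical and this is not the converter built into Webometric Analyst.

```python
def to_pajek(edges):
    """Build the text of a Pajek .net file from (source, target) pairs."""
    # Number the vertices from 1 in order of first appearance.
    ids = {}
    for source, target in edges:
        for name in (source, target):
            ids.setdefault(name, len(ids) + 1)
    lines = [f"*Vertices {len(ids)}"]
    lines += [f'{number} "{name}"' for name, number in ids.items()]
    lines.append("*Arcs")  # directed links: tweeter -> mentioned user
    lines += [f"{ids[source]} {ids[target]}" for source, target in edges]
    return "\n".join(lines) + "\n"


# Source/target pairs from the slide, read row by row (an assumption).
edges = [
    ("Username1", "Username2"),
    ("Username1", "Username3"),
    ("Username2", "Username3"),
    ("Username3", "Username1"),
    ("Username3", "Username2"),
    ("Username3", "Username4"),
]
net_text = to_pajek(edges)
print(net_text)
# To save it: open("network.net", "w", encoding="utf-8").write(net_text)
```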
  • 16. OBJECTS OF ANALYSIS 1. An actor's (person, group, organisation, word, etc.) position in the network 2. Structure of the network (in relation to other networks) or subnetworks (clusters)
  • 17. AN ACTOR'S POSITION Degree centrality: used to locate actors with influence in the network or those that are in a position where they can spread information in the network. Can be divided into in- and outdegree. How many other actors can this actor reach directly? Other often used centrality measures: closeness, betweenness, eigenvector
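In- and outdegree can be counted directly from a list of source/target pairs, as in the minimal sketch below. The edge list is illustrative and the function name `degree_counts` is hypothetical; tools such as Gephi, Ucinet or Pajek report the same measures along with closeness, betweenness and eigenvector centrality.

```python
from collections import Counter


def degree_counts(edges):
    """Return (in_degree, out_degree) counters for a directed edge list."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in set(edges):  # count each distinct link once
        out_degree[source] += 1  # actors this actor reaches directly
        in_degree[target] += 1   # actors that reach this actor directly
    return in_degree, out_degree


# Illustrative mention network: (tweeter, mentioned user).
edges = [
    ("Username1", "Username2"),
    ("Username1", "Username3"),
    ("Username2", "Username3"),
    ("Username3", "Username1"),
    ("Username3", "Username2"),
    ("Username3", "Username4"),
]
in_degree, out_degree = degree_counts(edges)
most_active = out_degree.most_common(1)[0]
print("Highest outdegree:", most_active)
```

Here outdegree counts how many distinct users a tweeter mentions, and indegree counts how many distinct tweeters mention a user.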
  • 18. NETWORK STRUCTURE Communities in the network Tells something about the structure of the network and how the different actors are spread and connected to each other in the network
  • 19. NETWORK ANALYSIS - tools of the trade Gephi (for network visualizations) http://gephi.org/ Ucinet (for network analysis and visualization) https://sites.google.com/site/ucinetsoftware/ Pajek (for network analysis and visualization) http://pajek.imfm.si/doku.php
  • 20. Analyzing astrophysicists’ conversational connections on Twitter Holmberg, Haustein, Bowman & Peters (work in progress) Communities detected based on the conversational connections in astrophysicists’ tweets
  • 21. Analyzing astrophysicists’ conversational connections on Twitter Holmberg, Haustein, Bowman & Peters (work in progress) [Stacked bar chart: Percentage of people with different roles in the 7 communities. Communities: Mod0 (n=68), Mod1 (n=88), Mod2 (n=40), Mod3 (n=180), Mod4 (n=30), Mod5 (n=109), Mod6 (n=3). Roles: Amateur astronomer, Teacher or educator, Corporative, Organization or association, Science communicator, Other, Unknown, Other researchers, Researcher, Other astrophysicists, Students.]
  • 22. Climate change on Twitter: topics, communities and conversations about the IPCC Pearce, Holmberg, Hellsten & Nerlich (under review). Three groups coded based on their stance to climate change: • Convinced • Skeptic • Neutral
  • 23. 1 NETWORK ANALYSIS Summary Step 4. Extract the data that you need (e.g. Tweeters and the usernames they mentioned, following or followers lists, ...) Step 5. Convert your data into a network file Step 6. Visualize the network and analyse In addition you may want to run some social network analysis on the network (e.g. centrality) or code the actors according to suitable titles (e.g. work roles, opinion about something, etc.)
  • 24. 2 CONTENT ANALYSIS Possible research questions: How is topic A discussed on Twitter? How do certain activities on Twitter correlate with offline activities? How popular is A compared with B, based on visibility on Twitter? What is the public opinion (of tweeters) about A? What are tweeters saying about A? And many more...
  • 28. CONTENT ANALYSIS - manual coding: Positive-Neutral-Negative; Scientific-Not scientific-Not clear; Skeptic-Convinced-Neutral; Personal-Work related; Astrophysics-Biochemistry-Cheminformatics ...; Pro something-Against something; and many more depending on your research goals...
  • 29. Holmberg, K. & Thelwall, M. (2013). Disciplinary differences in Twitter scholarly communication. In the Proceedings of 14th International Society for Scientometrics and Informetrics conference, 2013, Vienna, Austria. Available at: http://issi2013.org/proceedings.html. [Stacked bar chart: Scientific content of the tweets by communication type. Vertical axis: 0% to 40%. Disciplines: Astrophysics, Biochemistry, Digital humanities, Economics, History of science. Communication types: Retweets, Conversations, Links, Other.]
  • 30. CONTENT ANALYSIS - tools of the trade VOSviewer (to extract noun-phrases from tweets) http://www.vosviewer.com/ BibExcel (for co-word analysis) http://www8.umu.se/inforsk/Bibexcel/ Notepad++ (to search and replace in your data) http://notepad-plus-plus.org/ Screaming Frog SEO Spider (to decode short urls) http://www.screamingfrog.co.uk/seo-spider/
  • 31. Noun-phrases from one of the communities Analyzing astrophysicists’ conversational connections on Twitter Holmberg, Haustein, Bowman & Peters (work in progress)
  • 32. TIME SERIES - tools of the trade Mozdeh (Persian for Good news) Visit http://mozdeh.wlv.ac.uk/index.html for free download and instructions
  • 33. TIME SERIES Pearce, Holmberg, Hellsten & Nerlich (under review). Climate change on Twitter: topics, communities and conversations about the IPCC.
  • 34. The Next Pope? 699,337 tweets collected between February 12, 2013 and March 11, 2013.
  • 35. Pope Francis - Jorge Mario Bergoglio Was mentioned in 9 tweets...
  • 36. ONLINE/OFFLINE CORRELATIONS Comparison of Twitter and publication activity and impact • publications and tweets per day: ρ=−0.339* • citation rate and tweets per day: ρ=−0.457** Haustein, Bowman, Holmberg, Larivière & Peters (under review). Astrophysicists on Twitter: An in-depth analysis of tweeting and scientific publication behavior.
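The ρ values above are rank correlations. The sketch below shows how a Spearman coefficient can be computed as the Pearson correlation of average ranks. The numbers are made up for illustration and are not the data behind the slide, and the helper names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up values for five hypothetical researchers.
tweets_per_day = [0.1, 0.5, 2.0, 5.0, 9.0]
publications = [12, 9, 10, 4, 2]
rho = spearman_rho(tweets_per_day, publications)
print(round(rho, 3))
```

A negative ρ, as on the slide, means that researchers who tweet more tend to rank lower on the other measure; the sketch does not compute the significance levels marked by the asterisks.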
  • 37. ONLINE/OFFLINE CORRELATIONS Overall similarity between abstracts and tweets is low • cosine=0.081 • 4.1% of 50,854 tweet NPs in abstracts • 16.0% of 12,970 abstract NPs in tweets Haustein, Bowman, Holmberg, Larivière & Peters (under review). Astrophysicists on Twitter: An in-depth analysis of tweeting and scientific publication behavior.
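A cosine similarity between two sets of noun phrases (NPs) can be computed from their frequency counts, as in the sketch below. The slide does not say how the vectors were weighted, so plain counts are an assumption, and the noun phrases and function names are invented for illustration.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def share_found_in(a, b):
    """Share of distinct terms in a that also occur in b."""
    return len(a.keys() & b.keys()) / len(a) if a else 0.0


# Invented noun-phrase counts for one hypothetical researcher.
tweet_nps = Counter({"dark matter": 2, "telescope": 1})
abstract_nps = Counter({"dark matter": 1, "galaxy cluster": 2})

similarity = cosine_similarity(tweet_nps, abstract_nps)
overlap = share_found_in(tweet_nps, abstract_nps)
print(round(similarity, 3), overlap)
```

The second function corresponds to the "share of tweet NPs in abstracts" style of figure; swapping the arguments gives the share of abstract NPs found in tweets.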
  • 38. 2 CONTENT ANALYSIS Summary Step 4. Extract the data that you need (e.g. hashtags, usernames, original tweets, ...) And then, depending on your research goals: Step 5A. Analyze frequencies (e.g. most used hashtags, etc.) Step 5B. Classify the tweets manually Step 5C. Extract the noun phrases and create a co-mention network of them with VOSviewer Step 5D. Analyze time series of certain word/hashtag occurrences Step 5E. Run sentiment analysis on the tweets
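Steps 5A and 5D can be approximated with a few lines of scripting. The sketch below counts hashtag frequencies and the number of tweets per day that contain a given term. The tweets, the timestamp format and the function names are assumptions made for illustration; Mozdeh provides time series analysis without any scripting.

```python
import re
from collections import Counter
from datetime import datetime


def hashtag_frequencies(tweets):
    """Count how often each hashtag (lowercased) occurs in the tweets."""
    counts = Counter()
    for _, text in tweets:
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", text))
    return counts


def daily_counts(tweets, term):
    """Number of tweets per day whose text contains the term (case-insensitive)."""
    counts = Counter()
    for timestamp, text in tweets:
        if term.lower() in text.lower():
            day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date()
            counts[day.isoformat()] += 1
    return dict(sorted(counts.items()))


# Made-up (timestamp, text) pairs; the timestamp format is an assumption.
tweets = [
    ("2013-02-12 09:15:00", "Who will be the next #pope? #conclave"),
    ("2013-02-12 18:40:00", "Speculation about the next #Pope continues"),
    ("2013-03-11 07:05:00", "The #conclave starts tomorrow"),
]
frequencies = hashtag_frequencies(tweets)
series = daily_counts(tweets, "#conclave")
print(frequencies.most_common(), series)
```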
  • 40. During this hour over 20,820,000 tweets were sent
  • 41. Thank you for your attention Kim Holmberg Statistical Cybermetrics Research Group University of Wolverhampton, UK kim.holmberg@abo.fi http://kimholmberg.fi @kholmber Acknowledgements This presentation is based upon work supported by the international funding initiative Digging into Data. Specifically, funding comes from the National Science Foundation in the United States (Grant No. 1208804), JISC in the United Kingdom, and the Social Sciences and Humanities Research Council of Canada.