Relative Trends in Scientific Terms on Twitter

Victoria Uren, Aba-Sah Dadzie
The OAK Group, Dept. of Computer Science, The University of Sheffield

Introduction

  •  scientific research traditionally disseminated via journals, books, scientific conferences
  •  new form of discourse – online social media
           –  suitable forum for disseminating scientific research?
           –  do scientists engage with online social media?
           –  are there sufficient amounts of information on scientific topics?
  •  are there suitable metrics for measuring scientific impact online?
           –  between scientists?
           –  for public engagement?
  •  are these new measures comparable to formal metrics?

altmetrics11: Tracking scholarly impact on the social Web

Outline

  •  Aims/Introduction
  •  Related Work
  •  Experiment
           –  Data
           –  Analysis & Results
  •  Conclusions
  •  Next Steps
  •  Acknowledgements




Related Work

  •  Garfield, E. (from 1950s)
           –  father of scientometrics
  •  Priem et al. (2010)
           –  Scientometrics 2.0 as a new metric for measuring scholarly impact on social web
  •  Lane (2010)
           –  need to improve metrics used to measure scientific impact
  •  Michel et al. (2011)
           –  Google nGrams to analyse culture
           –  among others, recognised fame for scientists low…
  •  Cheong et al. (2009)
           –  H1N1 spike (trend) detected on Twitter during flu pandemic (May 2009)
  •  Rowe et al. (2011)
           –  influence of content and author features on prediction of active, long term discussions on social web
  •  Kinsella et al. (2011)
           –  using hyperlinked metadata to aid categorisation of topics discussed in online social media



Experiment

  •  exploratory experiment
           –  to determine frequency of occurrence of scientific term usage in online social media
  •  data set
           –  three sets of (scientific) terms selected from UNESCO thesaurus
           –  Google Books NGrams corpus used as a baseline
           –  300 tweets collected in each sample, using Twitter API, for selected terms
  •  frequency/usage analysis
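
The frequency analysis described here can be illustrated with a minimal sketch. It is not the authors' code: it assumes a sample is simply a list of tweet texts, and it uses case-insensitive substring matching, which is an assumption since the slides do not say how terms were matched. The function name term_shares is hypothetical; the terms are the nine listed on the UNESCO Thesaurus slide.

```python
from collections import Counter

# The nine 1-gram terms from the UNESCO Thesaurus slide, grouped by topic.
TERMS = {
    "Physical Sciences": ["ionization", "electromagnetism", "crystallography"],
    "Chemical Sciences": ["phosphorus", "alkalinity", "microchemistry"],
    "Earth Sciences": ["permafrost", "lithosphere", "glaciology"],
}


def term_shares(tweets):
    """Return each term's share of all term matches found in the tweets.

    Matching is a plain case-insensitive substring test (an assumption).
    A tweet that mentions two terms counts once for each of them.
    """
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for terms in TERMS.values():
            for term in terms:
                if term in lowered:
                    counts[term] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}
```

Called on one sample of collected tweets, this gives a per-term share that can be set beside the same term's share in the baseline corpus.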






UNESCO Thesaurus 1Gram Terms

       Topic                Terms
       Physical Sciences    Ionization, Electromagnetism, Crystallography
       Chemical Sciences    Phosphorus, Alkalinity, Microchemistry
       Earth Sciences       Permafrost, Lithosphere, Glaciology

  •  selection criteria
           –  minimisation of noise due to polysemy
           –  avoidance of scientific terms with other common/colloquial usage
           –  terms unique to a particular topic
           –  words with a single stem
           –  1Grams only





Baseline Dataset – Google 1Grams

  •  obtained from Google Books NGrams corpus [1]
  •  total NGrams by year for three sets of terms
           –  2006 – 116,029
           –  2007 – 126,206
           –  2008 – 111,417
  •  annual variation by topic (of total NGrams baseline dataset)
           –  Chemical Sciences    50-60%
           –  Physical Sciences    30-40%
           –  Earth Sciences       ~10%
  •  [1] http://ngrams.googlelabs.com/datasets
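
Yearly totals like those above could be derived from the 1-gram files along the following lines. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes each line is tab-separated with the n-gram, the year and the match count in the first three columns, and the function name yearly_totals is hypothetical.

```python
from collections import defaultdict


def yearly_totals(lines, terms, years):
    """Sum match counts per year for the selected terms.

    Assumed line layout (tab-separated): ngram, year, match_count, ...
    Terms are compared case-insensitively; other years are ignored.
    """
    wanted = {term.lower() for term in terms}
    totals = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        ngram, year, match_count = fields[0], int(fields[1]), int(fields[2])
        if ngram.lower() in wanted and year in years:
            totals[year] += match_count
    return dict(totals)
```

Running this once per topic and dividing by the all-topic total for the same year would give per-topic percentages of the kind quoted on the slide.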





Twitter Dataset

       Sample ID   Collection Period                                              Elapsed Time (h)
       T-300-1     Tue Mar 01 20:56:43 GMT 2011 – Thu Mar 03 14:22:18 GMT 2011    41
       T-300-2     Fri Mar 04 02:35:55 GMT 2011 – Sun Mar 06 18:38:05 GMT 2011    64
       T-300-3     Mon Mar 07 20:31:11 GMT 2011 – Wed Mar 09 16:21:36 GMT 2011    44

  •  three samples collected, containing 300 consecutive tweets each
  •  ~0.003% of total tweets over collection period
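
The elapsed times in the table can be checked from the collection timestamps. The sketch below is illustrative only: it parses the timestamp format shown on the slide, treats "GMT" as a literal, and rounds to the nearest hour, which is an assumption about how the slide's figures were rounded. The names STAMP, SAMPLES and elapsed_hours are hypothetical.

```python
from datetime import datetime

# Timestamp layout used on the slides, e.g. "Tue Mar 01 20:56:43 GMT 2011".
STAMP = "%a %b %d %H:%M:%S GMT %Y"

# Start and end of each collection period, copied from the table.
SAMPLES = {
    "T-300-1": ("Tue Mar 01 20:56:43 GMT 2011", "Thu Mar 03 14:22:18 GMT 2011"),
    "T-300-2": ("Fri Mar 04 02:35:55 GMT 2011", "Sun Mar 06 18:38:05 GMT 2011"),
    "T-300-3": ("Mon Mar 07 20:31:11 GMT 2011", "Wed Mar 09 16:21:36 GMT 2011"),
}


def elapsed_hours(start, end):
    """Return the time between two slide-style timestamps, rounded to whole hours."""
    delta = datetime.strptime(end, STAMP) - datetime.strptime(start, STAMP)
    return round(delta.total_seconds() / 3600)
```

With these inputs the three periods come out at 41, 64 and 44 hours, matching the Elapsed Time column.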






Twitter c.f. Google NGrams

  •  higher variation in distribution for Twitter sample
           –  however largely in line with Google NGrams
  •  can Google NGrams serve as a suitable baseline?
           –  need to more closely examine variation…
  •  notable peaks in Twitter sample for three terms
           –  Permafrost (Earth Sciences)
           –  Alkalinity (Chemical Sciences)
           –  Phosphorus (Chemical Sciences)
  •  are these potential trends?




Twitter c.f. Google NGrams

  •  Permafrost
           –  17% and 15% in Twitter samples (T-300-1 & 2) – c.f. 5% in G-2006-2008
           –  41 out of 113 tweets (36%) used in scientific context
           –  large number of tweets referred to
                    •  online game server [1]
                    •  designer case for iPhone
  •  Alkalinity
           –  none found to have scientific content
           –  mostly used in pseudo-scientific health advice
           –  peak in T-300-2 (31 out of 60 tweets – ~50%)
                    •  dominated by pH measures in swimming pools & fish tanks
                    •  influence probably due to collection period – weekend – engagement in leisure activities
  •  [1] http://www.everquest2.com/Permafrost




Example Tweets – Permafrost

  •  advert/chat
           –  @HDNinjacp go to Permafrost Its never full : Fri Mar 04 05:21:02 GMT 2011
           –  @Riffy8888 hey Could you Come to my Party birthday Party on CP March 13 Server Permafrost Dock 6:00PST : Sun Mar 06 04:37:13 GMT 2011
           –  Party Server Permafrost Dock Please Go It's An Early Birthday Party For me : Thu Mar 03 01:38:28 GMT 2011
  •  cold
           –  36 inches of permafrost still, I want to stake my bird condo b4 the squirrals knock it down again..bastids..all of'm : Sat Mar 05 01:51:48 GMT 2011
  •  science
           –  Fire and Ice: Permafrost Melt Spews Combustible Methane http://tiny.ly/be8q : Fri Mar 04 16:43:10 GMT 2011
           –  (retweeted) - Experts Monitor Methane Release from Permafrost: Over the past few years, methane levels around the world have b... http://bit.ly/hvVEJX : Wed Mar 02 12:27:25 GMT 2011
           –  RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels : Thu Mar 03 02:57:48 GMT 2011





Example Tweets – Alkalinity T-300-2

  •  Chemistry Help Needed! pH, concentration of carbonate species and alkalinity... just got published: http://bit.ly/hUCpz7
           –  URL points to the question on “My Chemistry Tutor” – homework?
  •  retweeted
           –  The proper total alkalinity for your pool is 100 ppm. http://su.pr/8hrxCE : Fri Mar 04 19:02:20 GMT 2011
           –  If the Total Alkalinity in your swimming pool is low, your pH will be low. http://su.pr/8hrxCE : Fri Mar 04 20:34:11 GMT 2011
  •  spam/adverts (including retweets)
           –  @Poet_Carl_Watts: some foods create acidity or alkalinity after they're metabolized...http://ping.fm/GQTvA #KnowledgeIsPower! : Sat Mar 05 02:38:55 GMT 2011
           –  RT @CourtneyPool: Green juice, oh Liquid Emerald Elixir of Life and Alkalinity! Course through my BODY! #juicing : Sun Mar 06 18:34:29 GMT 2011






Twitter c.f. Google NGrams: Phosphorus

       Sample ID   Total   Legislation   Nutrition   Other Sciences   Industry   White Phosphorus
       T-300-1     129     46            16          29               4          5
       T-300-2     119     4             26          35               9          5
       T-300-3     171     12            23          37               42         19

  •  Twitter trends for Phosphorus in sample T-300-3
           –  Industry
                    •  takeover of a Brazilian company by the Indian firm United Phosphorus
           –  White Phosphorus
                    •  17 retweets of an emotive message (relation to Middle East wars)
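
To make the T-300-3 peaks easier to compare across samples, the category counts can be expressed as shares of each sample's Phosphorus total. The sketch below uses the counts as read from the table above. The five categories do not add up to the Total column, and treating the remainder as uncategorised tweets is an assumption, since the slides do not explain it. The function names are hypothetical.

```python
# Phosphorus tweet counts per sample and category, as read from the table.
PHOSPHORUS = {
    "T-300-1": {"Total": 129, "Legislation": 46, "Nutrition": 16,
                "Other Sciences": 29, "Industry": 4, "White Phosphorus": 5},
    "T-300-2": {"Total": 119, "Legislation": 4, "Nutrition": 26,
                "Other Sciences": 35, "Industry": 9, "White Phosphorus": 5},
    "T-300-3": {"Total": 171, "Legislation": 12, "Nutrition": 23,
                "Other Sciences": 37, "Industry": 42, "White Phosphorus": 19},
}


def category_shares(sample_id):
    """Return each category's percentage of the sample's Phosphorus tweets."""
    row = PHOSPHORUS[sample_id]
    total = row["Total"]
    return {name: round(100 * count / total, 1)
            for name, count in row.items() if name != "Total"}


def uncategorised(sample_id):
    """Return the number of tweets not assigned to any of the five categories."""
    row = PHOSPHORUS[sample_id]
    return row["Total"] - sum(n for name, n in row.items() if name != "Total")
```

On these figures Industry accounts for roughly a quarter of the T-300-3 Phosphorus tweets, against a few percent in the first sample, which is consistent with the takeover story being singled out as a trend.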





Twitter c.f. Google NGrams: Phosphorus

  •  usage largely with scientific content
           –  with relationships, among others, to legal, nutritional & economic context
           –  five main categories identified
  •  Legislation
           –  limits to use in fertiliser, soap
  •  Nutrition
           –  phosphorus content
  •  Other Science
           –  peak phosphorus, pollution
           –  discovery of arsenic replacing phosphorus in a microbe
           –  tweets about new paper on Redfield ratio in organisms
  •  Industry
           –  mergers, prices of Phosphorus-containing goods
  •  White Phosphorus
           –  use in Middle East wars





Example Tweets – Phosphorus

  •  Legislation
           –  RT @YarnPlayCafe: The fact that he wants to repeal the phosphorus ban and kill the Madison lakes is, by itself, enough to #killthisbill ... : Tue Mar 08 02:04:49 GMT 2011
  •  Nutrition
           –  Big, wet snowflakes drift over the farm. To warm up, I try some Horlicks, a wheat/barley/whey drink with lots of calcium & phosphorus. Mmmm. : Tue Mar 01 20:56:43 GMT 2011
           –  Vitamin D acts as an hormone and plays a controlling role in the metabolism of calcium and phosphorus : Sun Mar 06 12:12:36 GMT 2011
  •  Other Science
           –  [java] 129 : Greater Phosphorus Efficiency http://bit.ly/iehsmK #agriculture : Wed Mar 02 14:36:21 GMT 2011
  •  Industry
           –  #stocks #bse #nse Buy United Phosphorus - positive move to tap largest Latin American market; Edelweiss http://dlvr.it/JdSpV : Tue Mar 08 17:22:55 GMT 2011
           –  Enshi : Wugang develops technique to handle high-phosphorus iron ore - Steel Business Briefing (subscri http://uxp.in/30538045 : Tue Mar 08 09:33:05 GMT 2011
  •  White Phosphorus
           –  Dear America, your white phosphorus and depleted uranium can not stop the growth of Iraq's future. Iraq Will Rise. : Wed Mar 02 07:49:21 GMT 2011
           –  @Remroum so first they steal our land, now they want our "tactics" i.e. poetry? i guess the white phosphorus just isn't cutting it anymore. : Sat Mar 05 03:42:44 GMT 2011
  •  ???
           –  @p_kojo - Phosphorus Potassium - Pinocchio , I'm so glad we found each other nw we can hav lots of fun :) : Sun Mar 06 13:43:10 GMT 2011





Conclusions – Experiment

  •  recognised challenges
           –  baseline corpus for online social media difficult to obtain
                    •  very small (relatively) samples found in Twitter stream
                    •  difficult to obtain representative samples
                       more effective methods required to extract lower frequency terms
           –  difficulty reproducing experiments
           –  reliability, ethical & privacy issues – due to user-created content
  •  what is a suitable, publicly available baseline corpus?
           –  Google NGrams?
                    •  different information collection methods from online social media
                             –  coverage of topics may see large variation between corpora
           –  any others?
                    •  Wikipedia/DBpedia? TREC?



Engagement with the Web?

  •  why do scientists not tweet (or engage much in other social media)?
           –  is the web not seen to enforce sufficient scientific rigour?
           –  do scientists not view the web as a potential audience?
  •  is the web audience a suitable peer reviewer?
  •  why do scientists hesitate to disseminate information online?
           –  potential for ideas to be stolen?
           –  trust – how to differentiate between valid science and pseudo-science, spam and adverts?
  •  social media largely driven by personal interest, sentiment, opinion
           –  may explain low scientific content
           –  more colloquial use of what is traditionally scientific terminology



Implications for Altmetrics

  •  however - some level of scientific discourse on Twitter
           –  e.g., Phosphorus identified as a potential Twitter trend
  •  online social media may still have potential to serve as an altmetric for measuring impact of science
  •  starting from scientometrics - which looks at author features, e.g.,
           –  co-citation
           –  affiliation – relationship to reputation
  •  corresponding features in online social media
           –  followers
           –  retweets – relationship to trust?






Next Steps

  •  replicate experiments with larger samples over longer period
           –  more detailed analysis
                    •  e.g., hashtag analysis; URLs within tweets
                    •  focus on terms with more trending potential, e.g., nanostructures, nanosilver
  •  consider specific tweets
           –  from scientific media and journals
           –  posted during scientific conferences, congresses
  •  comparison with other independent baseline data sets
  •  compare Twitter use within different disciplines
           –  influence of interdisciplinary collaboration on use of online social media?
  •  create new benchmark data & experiments
     define alt-metric for scientific term usage in online social media
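
The hashtag and URL analysis proposed above could start from something as simple as the sketch below. It is an illustration, not part of the reported experiment: the regular expressions are deliberately naive (a "#" inside a URL fragment would be picked up as a hashtag, for instance), and the function name extract_features is hypothetical.

```python
import re

HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def extract_features(tweet):
    """Return the hashtags and URLs found in a tweet's text."""
    return {
        "hashtags": HASHTAG.findall(tweet),
        "urls": URL.findall(tweet),
    }


# One of the Permafrost example tweets, with the timestamp dropped.
example = ("RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major "
           "Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels")
features = extract_features(example)
```

Counting hashtags across a sample would show which wider conversations a term appears in, and resolving the URLs would help separate links to scientific sources from adverts and spam.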



Acknowledgements

  •  Elizabeth Cano for discussions on collection and use of data from Twitter streams
  •  V.S. Uren & A.-S. Dadzie funded by:
           –  European Commission 7th Framework Programme project SmartProducts (grant no. 231204)






References

  •  Garfield bib - http://garfield.library.upenn.edu/pub.html
  •  Matthew Rowe, Sofia Angeletou and Harith Alani. (2011) Predicting Discussions on the Social Semantic Web, Proc., ESWC (2) 2011: 405-420
  •  Sheila Kinsella, Mengjiao Wang, John Breslin and Conor Hayes. (2011) Improving Categorisation in Social Media using Hyperlinks to Structured Data Sources, Proc., ESWC (2) 2011: 390-404
  •  others in paper references – see http://altmetrics.org/altmetrics11/uren-v0





Relative Trends in Scientific Terms on Twitter

  • 1. Rela%ve
Trends
in
Scien%fic
 Terms
on
Twi4er

 Victoria
Uren,
Aba‐Sah
Dadzie
 
The
OAK
Group,
Dept.
of
Computer
Science,
The
University
of
Sheffield

  • 2. Introduc%on
 •  scien%fic
research
tradi%onally
disseminated
via
journals,
books,
 scien%fic
conferences
 •  new
form
of
discourse
–
online
social
media

 –  suitable
forum
for
dissemina%ng
scien%fic
research?
 –  do
scien%sts
engage
with
online
social
media?
 –  are
there
sufficient
amounts
of
informa%on
on
scien%fic
topics?
 •  are
there
suitable
metrics
for
measuring
scien%fic
impact
online?
 –  between
scien%sts?
 –  for
public
engagement?
 •  are
these
new
measures
comparable
to
formal
metrics?
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 3. Outline
 •  Aims/Introduc%on
 •  Related
Work
 •  Experiment
 –  Data
 –  Analysis
&
Results
 •  Conclusions

 •  Next
Steps
 •  Acknowledgements
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 4. Outline
 •  Aims/Introduc%on
 •  Related
Work
 •  Experiment
 –  Data
 –  Analysis
&
Results
 •  Conclusions

 •  Next
Steps
 •  Acknowledgements
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 5. Related
Work
 •  Garfield,
E.
(from
1950s)
 –  father
of
scientometrics

 •  Priem
et
al.
(2010)

 –  Scientometrics
2.0
as
a
new
metric
for
measuring
scholarly
impact
on
social
web
 
 •  Lane
(2010)

 –  need
to
improve
metrics
used
to
measure
scien%fic
impact

 •  Michel
et
al.
(2011)
 –  Google
nGrams
to
analyse
culture
 –  a.o.,
recognised
fame
for
scien%sts
low…
 •  Cheong
et
al.
(2009)

 –  H1N1
spike
(trend)
detected
on
Twi4er
during
flu
pandemic
(May
2009)
 •  Rowe
et
al.
(2011)
 –  influence
of
content
and
author
features
on
predic%on
of
ac%ve,
long
term
 discussions
on
social
web
 •  Kinsella
et
al.
(2011)
 –  using
hyperlinked
metadata
to
aid
categorisa%on
of
topics
discussed
in
online
 social
media
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 6. Outline
 •  Aims/Introduc%on
 •  Related
Work
 •  Experiment
 –  Data
 –  Analysis
&
Results
 •  Conclusions

 •  Next
Steps
 •  Acknowledgements
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 7. Experiment 
 •  exploratory
experiment

 –  to
determine
frequency
of
occurrence
of
scien%fic
term
usage
in
 online
social
media
 •  data
set
 –  three
sets
of
(scien%fic)
terms
selected
from
UNESCO
thesaurus
 –  Google
Books
NGrams
corpus
used
as
a
baseline
 –  300
tweets
collected
in
each
sample,
using
Twi4er
API,
for
selected
 terms
 •  frequency/usage
analysis
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 8. Outline
 •  Aims/Introduc%on
 •  Related
Work
 •  Experiment
 – Data
 –  Analysis
&
Results
 •  Conclusions

 •  Next
Steps
 •  Acknowledgements
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 9. UNESCO
Thesaurus
1Gram
Terms
 
 Topic
 Terms

 Physical
Sciences

 Ioniza%on,
Electromagne%sm,
Crystallography

 Chemical
Sciences

 Phosphorus,
Alkalinity,
Microchemistry

 Earth
Sciences

 Permafrost,
Lithosphere,
Glaciology
 •  selec%on
criteria
 –  minimisa%on
of
noise
due
to
polysemy
 –  avoidance
of
scien%fic
terms
with
other
common/colloquial
usage
 –  terms
unique
to
a
par%cular
topic
 –  words
with
a
single
stem
 –  1Grams
only
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 10. Baseline
Dataset
–
Google
1Grams 
 •  obtained
from
Google
Books
NGrams
corpus1

 •  total
NGrams
by
year
for
three
sets
of
terms
 –  2006
–
116,029

 –  2007
–
126,206

 –  2008
–
111,417
 •  annual
varia%on
by
topic
(of
total
NGrams
baseline
dataset)
 –  Chemical
Sciences

50‐60%

 –  Physical
Sciences



30‐40%

 –  Earth
Sciences








~
10%
 •  [1]
h4p://ngrams.googlelabs.com/datasets 

 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 11. Baseline
Dataset
–
Google
1Grams 
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 12. Twi4er
Dataset 
 Sample
ID
 CollecAon
Period

 Elapsed
Time
(h)
 T‐300‐1

 Tue


Mar
01
20:56:43
GMT
2011
–

 41
 Thu


Mar
03
14:22:18
GMT
2011

 T‐300‐2
 Fri




Mar
04
02:35:55
GMT
2011
–
 64
 
 Sun


Mar
06
18:38:05
GMT
2011

 T‐300‐3
 Mon
Mar
07
20:31:11
GMT
2011
–
 44
 
 Wed
Mar
09
16:21:36
GMT
2011
 •  three
samples
collected,
containing
300
consecu%ve
tweets
each
 •  ~
0.003%
of
total
tweets
over
collec%on
period

 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 13. Outline
 •  Aims/Introduc%on
 •  Related
Work
 •  Experiment
 –  Data
 – Analysis
&
Results
 •  Conclusions

 •  Next
Steps
 •  Acknowledgements
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 14. Twi4er
c.f.
Google
NGrams 
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 15. Twi4er
c.f.
Google
NGrams 
 •  higher
varia%on
in
distribu%on
for
Twi4er
sample
 –  however
largely
in
line
with
Google
NGrams
 •  can
Google
NGrams
serve
as
a
suitable
baseline?
 –  need
to
more
closely
examine
varia%on…
 •  notable
peaks
in
Twi4er
sample
for
three
terms
 –  Permafrost
(Earth
Sciences)
 –  Alkalinity
(Chemical
Sciences)

 –  Phosphorus
(Chemical
Sciences)

 •  are
these
poten%al
trends?
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 16. Twi4er
c.f.
Google
NGrams 
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 17. Twi4er
c.f.
Google
NGrams 
 •  Permafrost

 –  17%
and
15%
in
Twi4er
samples
(T‐300‐1
&
2)
–
c.f.
5%
in
G‐2006‐2008
 
 –  41
out
of
113
tweets
(36%)
used
in
scien%fic
context

 –  large
number
of
tweets
referred
to
 •  online
game
server1

 •  designer
case
for
iPhone
 •  Alkalinity
 –  none
found
to
have
scien%fic
content

 –  mostly
used
in
pseudo‐scien%fic
health
advice
 –  peak
in
T‐300‐2
(31
out
of
60
tweets
–
~50%)

 •  dominated
by

pH
measures
in
swimming
pools
&
fish
tanks

 •  influence
probably
due
to
collec%on
period
–
weekend
–
engagement
in
 leisure
ac%vi%es

 •  [1]
h4p://www.everquest2.com/Permafrost
 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 18. Example
Tweets
–
Permafrost

 •  advert/chat
 –  @HDNinjacp
go
to
Permafrost
Its
never
full
:
Fri
Mar
04
05:21:02
GMT
2011

 –  @Riffy8888
hey
Could
you
Come
to
my
Party
birthday
Party
on
CP
March
13
Server
 Permafrost
Dock
6:00PST
:
Sun
Mar
06
04:37:13
GMT
2011

 –  Party
Server
Permafrost
Dock
Please
Go
It's
An
Early
Birthday
Party
For
me
:
Thu
Mar
03
 01:38:28
GMT
2011

 •  cold
 –  36
inches
of
permafrost
s%ll,
I
want
to
stake
my
bird
condo
b4
the
squirrals
knock
it
 down
again..bas%ds..all
of'm
:
Sat
Mar
05
01:51:48
GMT
2011

 •  science
 –  Fire
and
Ice:
Permafrost
Melt
Spews
Combus%ble
Methane
h4p://%ny.ly/be8q
:
Fri
Mar
 04
16:43:10
GMT
2011

 –  (retweeted)
‐
Experts
Monitor
Methane
Release
from
Permafrost:
Over
the
past
few
 years,
methane
levels
around
the
world
have
b...
h4p://bit.ly/hvVEJX
:
Wed
Mar
02
 12:27:25
GMT
2011

 –  RT
@NetNewsBuzz:
Permafrost
Melt
Soon
Irreversible
Without
Major
Fossil
Fuel
Cuts
 h4p://%nyurl.com/5w8w2oh
#oil
#climate
#CO2
#fossilfuels
:
Thu
Mar
03
02:57:48
GMT
 2011

 altmetrics11:
Tracking
scholarly
impact
on
the
social
Web

  • 19. Example Tweets – Alkalinity T‐300‐2
 •  Chemistry Help Needed! pH, concentration of carbonate species and alkalinity... just got published: http://bit.ly/hUCpz7
     –  URL points to the question on “My Chemistry Tutor” – homework?
 •  retweeted
     –  The proper total alkalinity for your pool is 100 ppm. http://su.pr/8hrxCE : Fri Mar 04 19:02:20 GMT 2011
     –  If the Total Alkalinity in your swimming pool is low, your pH will be low. http://su.pr/8hrxCE : Fri Mar 04 20:34:11 GMT 2011
 •  spam/adverts (including retweets)
     –  @Poet_Carl_Watts: some foods create acidity or alkalinity after they’re metabolized...http://ping.fm/GQTvA #KnowledgeIsPower! : Sat Mar 05 02:38:55 GMT 2011
     –  RT @CourtneyPool: Green juice, oh Liquid Emerald Elixir of Life and Alkalinity! Course through my BODY! #juicing : Sun Mar 06 18:34:29 GMT 2011


  • 20. Twitter c.f. Google NGrams: Phosphorus
 Sample ID   Total   Legislation   Nutrition   Other Sciences   Industry   White Phosphorus
 T‐300‐1     129     46            16          29               4          5
 T‐300‐2     119     4             26          35               9          5
 T‐300‐3     171     12            23          37               42         19
 •  Twitter trends for Phosphorus in sample T‐300‐3
     –  Industry
         •  takeover of a Brazilian company by the Indian firm United Phosphorus
     –  White Phosphorus
         •  17 retweets of an emotive message (relation to Middle East wars)
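As a rough illustration of how the category counts in the Phosphorus table can be read, the Python sketch below (not part of the original study) converts each count to a share of its sample total and reports the largest category per sample. The column order Legislation, Nutrition, Other Sciences, Industry, White Phosphorus is an assumption recovered from the slide layout, and all function and variable names are hypothetical. The category counts do not sum to the sample totals; the slides do not say how the remaining tweets were classified.

```python
# Counts transcribed from the Phosphorus table; the column order is an
# assumption reconstructed from the slide layout.
CATEGORIES = ["Legislation", "Nutrition", "Other Sciences",
              "Industry", "White Phosphorus"]

# sample id -> (total tweets in sample, per-category counts)
COUNTS = {
    "T-300-1": (129, [46, 16, 29, 4, 5]),
    "T-300-2": (119, [4, 26, 35, 9, 5]),
    "T-300-3": (171, [12, 23, 37, 42, 19]),
}


def category_shares(total, counts):
    """Return each category's share of the sample total, in percent."""
    return {cat: round(100.0 * n / total, 1)
            for cat, n in zip(CATEGORIES, counts)}


def dominant_category(counts):
    """Return the name of the category with the highest count."""
    return CATEGORIES[counts.index(max(counts))]


SHARES = {sample: category_shares(total, counts)
          for sample, (total, counts) in COUNTS.items()}

if __name__ == "__main__":
    for sample, (total, counts) in COUNTS.items():
        top = dominant_category(counts)
        print(sample, top, SHARES[sample][top])
```

Under these assumptions the largest category in T-300-3 is Industry (42 of 171 tweets, about 24.6%), which agrees with the Industry trend noted for that sample.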

  • 21. Twitter c.f. Google NGrams: Phosphorus
 •  usage largely with scientific content
     –  with relationships, a.o., to legal, nutritional & economic context
     –  five main categories identified
         •  Legislation
             –  limits to use in fertiliser, soap
         •  Nutrition
             –  phosphorus content
         •  Other Science
             –  peak phosphorus, pollution
             –  discovery of arsenic replacing phosphorus in a microbe
             –  tweets about new paper on Redfield ratio in organisms
         •  Industry
             –  mergers, prices of Phosphorus‐containing goods
         •  White Phosphorus
             –  use in Middle East wars


  • 22. Example Tweets – Phosphorus
 •  Legislation
     –  RT @YarnPlayCafe: The fact that he wants to repeal the phosphorus ban and kill the Madison lakes is, by itself, enough to #killthisbill ... : Tue Mar 08 02:04:49 GMT 2011
 •  Nutrition
     –  Big, wet snowflakes drift over the farm. To warm up, I try some Horlicks, a wheat/barley/whey drink with lots of calcium & phosphorus. Mmmm. : Tue Mar 01 20:56:43 GMT 2011
     –  Vitamin D acts as an hormone and plays a controlling role in the metabolism of calcium and phosphorus : Sun Mar 06 12:12:36 GMT 2011
 •  Other Science
     –  [java] 129 : Greater Phosphorus Efficiency http://bit.ly/iehsmK #agriculture : Wed Mar 02 14:36:21 GMT 2011
 •  Industry
     –  #stocks #bse #nse Buy United Phosphorus ‐ positive move to tap largest Latin American market; Edelweiss http://dlvr.it/JdSpV : Tue Mar 08 17:22:55 GMT 2011
     –  Enshi : Wugang develops technique to handle high‐phosphorus iron ore ‐ Steel Business Briefing (subscri http://uxp.in/30538045 : Tue Mar 08 09:33:05 GMT 2011
 •  White Phosphorus
     –  Dear America, your white phosphorus and depleted uranium can not stop the growth of Iraq's future. Iraq Will Rise. : Wed Mar 02 07:49:21 GMT 2011
     –  @Remroum so first they steal our land, now they want our "tactics" i.e. poetry? i guess the white phosphorus just isn't cutting it anymore. : Sat Mar 05 03:42:44 GMT 2011
 •  ???
     –  @p_kojo ‐ Phosphorus Potassium ‐ Pinocchio , I'm so glad we found each other nw we can hav lots of fun :) : Sun Mar 06 13:43:10 GMT 2011


  • 23. Outline
 •  Aims/Introduction
 •  Related Work
 •  Experiment
     –  Data
     –  Analysis & Results
 •  Conclusions
 •  Next Steps
 •  Acknowledgements

  • 24. Conclusions – Experiment
 •  recognised challenges
     –  baseline corpus for online social media difficult to obtain
         •  very small (relatively) samples found in Twitter stream
         •  difficult to obtain representative samples
            more effective methods required to extract lower frequency terms
     –  difficulty reproducing experiments
     –  reliability, ethical & privacy issues – due to user‐created content
 •  what is a suitable, publicly available baseline corpus?
     –  Google NGrams?
         •  different information collection methods from online social media
             –  coverage of topics may see large variation between corpora
     –  any others?
         •  Wikipedia/DBpedia? TREC?

  • 25. Engagement with the Web?
 •  why do scientists not tweet (or engage much in other social media)?
     –  is the web not seen to enforce sufficient scientific rigour?
     –  do scientists not view the web as a potential audience?
         •  is the web audience a suitable peer reviewer?
 •  why do scientists hesitate to disseminate information online?
     –  potential for ideas to be stolen?
     –  trust – how to differentiate between valid science and pseudo‐science, spam and adverts?
 •  social media largely driven by personal interest, sentiment, opinion
     –  may explain low scientific content
     –  more colloquial use of what is traditionally scientific terminology

  • 26. Implications for Altmetrics
 •  however ‐ some level of scientific discourse on Twitter
     –  e.g., Phosphorus identified as a potential Twitter trend
 •  online social media may still have potential to serve as an altmetric for measuring impact of science
 •  starting from scientometrics ‐ which looks at author features, e.g.,
     –  co‐citation
     –  affiliation – relationship to reputation
 •  corresponding features in online social media
     –  followers
     –  retweets – relationship to trust?

  • 27. Outline
 •  Aims/Introduction
 •  Related Work
 •  Experiment
     –  Data
     –  Analysis & Results
 •  Conclusions
 •  Next Steps
 •  Acknowledgements

  • 28. Next Steps
 •  replicate experiments with larger samples over longer period
     –  more detailed analysis
         •  e.g., hashtag analysis; URLs within tweets
 •  focus on terms with more trending potential, e.g., nanostructures, nanosilver
 •  consider specific tweets
     –  from scientific media and journals
     –  posted during scientific conferences, congresses
 •  comparison with other independent baseline data sets
 •  compare Twitter use within different disciplines
     –  influence of interdisciplinary collaboration on use of online social media?
 •  create new benchmark data & experiments
    define alt‐metric for scientific term usage in online social media
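The hashtag and URL analysis proposed as a next step could start from something like the following Python sketch. It is an illustration only, not the authors' method: the regular expressions are deliberately simple, the function names are hypothetical, and the two sample tweets are taken from the example tweets quoted earlier in the deck (with their URLs written as plain http links).

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Deliberately simple patterns; real tweets may need more careful handling.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_features(tweet):
    """Return (hashtags, urls) found in a single tweet's text."""
    urls = URL_RE.findall(tweet)
    # Strip URLs first so a '#' fragment inside a link is not
    # mistaken for a hashtag.
    text = URL_RE.sub(" ", tweet)
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    return hashtags, urls


def hashtag_counts(tweets):
    """Count how often each hashtag occurs across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        tags, _ = extract_features(tweet)
        counts.update(tags)
    return counts


def url_domain_counts(tweets):
    """Count linked domains (often URL shorteners) across tweets."""
    counts = Counter()
    for tweet in tweets:
        _, urls = extract_features(tweet)
        counts.update(urlparse(u).netloc for u in urls)
    return counts


SAMPLE_TWEETS = [
    "RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major "
    "Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 "
    "#fossilfuels",
    "#stocks #bse #nse Buy United Phosphorus - positive move to tap "
    "largest Latin American market; Edelweiss http://dlvr.it/JdSpV",
]

if __name__ == "__main__":
    print(hashtag_counts(SAMPLE_TWEETS).most_common())
    print(url_domain_counts(SAMPLE_TWEETS).most_common())
```

Because most links in the sample tweets go through shorteners, counting domains as above says little about the target sites; resolving the shortened links would be a separate step that this sketch does not attempt.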

  • 29. Acknowledgements
 •  Elizabeth Cano for discussions on collection and use of data from Twitter streams
 •  V.S. Uren & A.‐S. Dadzie funded by:
     –  European Commission 7th Framework Programme project SmartProducts (grant no. 231204)

  • 30. References
 •  Garfield bib ‐ http://garfield.library.upenn.edu/pub.html
 •  Matthew Rowe, Sofia Angeletou and Harith Alani. (2011) Predicting Discussions on the Social Semantic Web, Proc., ESWC (2) 2011: 405–420
 •  Sheila Kinsella, Mengjiao Wang, John Breslin and Conor Hayes. (2011) Improving Categorisation in Social Media using Hyperlinks to Structured Data Sources, Proc., ESWC (2) 2011: 390–404
 •  others in paper references – see http://altmetrics.org/altmetrics11/uren-v0