1. Ageing Factor: a Potential
Altmetric for Observing Events
and Attention Spans in
Microblogs
Victoria Uren & Aba-Sah Dadzie
2. Why look at science discussion on Twitter?
• Public engagement with science matters:
• Enthuse kids to learn science,
• Inform people about fascinating stuff,
• Build consensus for social and economic change,
• The public paid for our research!
• Social media present a great opportunity to “talk nerdy” to the
public (on.ted.com/Marshall)
3. Challenges
1. Level of tweeting low for science
• Of 32 astronomy terms from UNESCO Thesaurus 6 occurred
at a usable level
• Earth, Moon, Sun, Stars, Universe, Space
2. Scientific tweets lost in noise
• 0.04 of tweets in the UNESCO terms sample were on topic
• ~0.4 for popular culture tweets (Mejova and Srinivasan 2012)
4. Meteor Showers – coming to a sky near you!
• Debris from comets streams
to Earth on parallel paths
• Shooting stars appear to
radiate from a point
• Predictable time and place
• Fun to observe with friends!
• Geminid 13-14 Dec 2011
• Quadrantid 3 Jan 2012
Images from Wikipedia
5. Ageing Factor
$$ AF = \sqrt[i]{\frac{k}{k + l}} $$
Where:
i is the cut-off time in hours,
k is the number of retweets originating at least i hours ago,
l is the number of retweets originating less than i hours ago,
k + l is therefore all the retweets in the sample
Based on Avramescu’s metric from Egghe & Rousseau (1990) – using hours
instead of years.
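The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the i-th root form of the metric, and it assumes each retweet is represented only by the age in hours of the tweet it repeats. The name `ageing_factor` and the input format are hypothetical.

```python
def ageing_factor(ages_hours, cutoff_hours):
    """Ageing factor for one sample of retweets (illustrative sketch).

    ages_hours: for each retweet, how many hours ago the retweeted
        tweet originated (hypothetical input format).
    cutoff_hours: the cut-off i, e.g. 1 for 1hAF or 24 for 24hAF.
    """
    if cutoff_hours <= 0:
        raise ValueError("cutoff_hours must be positive")
    if not ages_hours:
        raise ValueError("need at least one retweet")
    # k: retweets originating at least i hours ago
    k = sum(1 for age in ages_hours if age >= cutoff_hours)
    # l: retweets originating less than i hours ago
    l = len(ages_hours) - k
    # Assumed form: i-th root of k / (k + l)
    return (k / (k + l)) ** (1.0 / cutoff_hours)


# Made-up ages, for illustration only
sample = [0.1, 0.5, 0.2, 3.0, 30.0]
print(ageing_factor(sample, 1))   # 2 of 5 retweets are at least 1 hour old
print(ageing_factor(sample, 24))  # 1 of 5 is at least 24 hours old, then 24th root
```

With i = 1 the root has no effect, so 1hAF is simply the share of retweets whose source tweet is at least an hour old. Under the assumed root form, larger cut-offs pull the value towards 1.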
6. Assumptions
• Assumption 1: ageing factors for topics which concern special
events will be lower than suitable baselines.
• Assumption 2: ageing factors which are higher than suitable
baselines are associated with topics in which interest is sustained
over time.
8. Experiment 1
• Dec 13-14th 2011 – Geminid meteor shower
• Training set:
• 8980 tweets
• Dec 14th 2011 22:36 GMT - Dec 14th 2011 23:18 GMT
• Test set:
• 81891 tweets
• Dec 14th 2011 23:18 GMT - Dec 15th 2011 03:30 GMT
• Human categorization by reading of tweets in the training data
• 9 composite searches
• 3 baseline searches
• 1hAF & 24hAF
10. Experiment 2
• Jan 3 2012 - Quadrantid meteor shower
• > 2 weeks later
• Four time windows:
• 0:00-5:59 GMT (6)
• 6:00-11:59 GMT (12)
• 12:00-17:59 GMT (18)
• 18:00-23:59 GMT (24)
• Are 1hAF values low for this event?
• Does the time of day matter (it must be dark to see meteors)?
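The four time windows can be assigned mechanically from each retweet's timestamp. The sketch below is only an illustration: it assumes timezone-aware timestamps, and the function name `window_label` is hypothetical. It returns the same labels used on the slide (6, 12, 18, 24).

```python
from datetime import datetime, timezone


def window_label(ts):
    """Label of the six-hour GMT window containing ts: 6, 12, 18 or 24."""
    if ts.tzinfo is None:
        # A naive timestamp would be interpreted as local time, so reject it.
        raise ValueError("timestamp must be timezone-aware")
    hour_gmt = ts.astimezone(timezone.utc).hour
    # Hours 0-5 -> 6, 6-11 -> 12, 12-17 -> 18, 18-23 -> 24
    return (hour_gmt // 6 + 1) * 6


# Example: a retweet at 19:30 GMT on the day of the Quadrantid shower
print(window_label(datetime(2012, 1, 3, 19, 30, tzinfo=timezone.utc)))
```

Computing one ageing factor per query and per window label then lets the day be compared with the hours of darkness.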
11. Results – Queries from Training Data
[Figure: ageing factor (y-axis, 0 to 1) for each training-data query (Batch, Space, Astro and Meteor queries and their composites), one series per time window: 6, 12, 18, 24]
• Astro AND events @12 – of 275 total retweets, 18 contain the term quadrantid while 213 contain the term wish – these are NOT about meteor showers!
• Astro AND events and Meteor both contain “shooting star” and are low c.f. Astro in 12
12. Modified Queries
• Background Knowledge - 3 astronomical events:
• Quadrantid meteor shower night of 3-4 Jan.
• 2nd of the twin Grail spacecraft moving into orbit around the
Moon on the 2nd of Jan.
• Proximity of the Moon and the planet Jupiter in the night sky on
the 2nd of Jan
13. Results – Modified Queries
Space AND grail @18 lies within the
expected variance of the population
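The slides do not say how the expected variance was computed. One common way to draw funnel-plot limits for a proportion is the normal approximation to the binomial, and since 1hAF is the share k / (k + l) of older retweets, that approach can be sketched as follows. The function names, the `z` value of 1.96 and the population value in the usage lines are all assumptions made for illustration.

```python
import math


def funnel_limits(p_pop, n, z=1.96):
    """Approximate limits for a proportion observed in n retweets.

    p_pop: ageing factor of the whole population (assumed known).
    n: number of retweets in the query sample.
    z: normal quantile; 1.96 gives roughly 95% limits (an assumption).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_pop <= 1.0:
        raise ValueError("p_pop must lie in [0, 1]")
    se = math.sqrt(p_pop * (1.0 - p_pop) / n)
    return max(0.0, p_pop - z * se), min(1.0, p_pop + z * se)


def outside_funnel(af, n, p_pop, z=1.96):
    """True if an observed 1hAF falls outside the limits for its sample size."""
    lower, upper = funnel_limits(p_pop, n, z)
    return af < lower or af > upper


# p_pop = 0.5 is a placeholder, not a value from the slides
print(funnel_limits(0.5, 100))
print(outside_funnel(0.15, 182, 0.5))
```

The limits narrow as the number of retweets grows, which gives the plot its funnel shape: small samples need a more extreme ageing factor before they stand out.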
14. Results – 3 “Interesting” Sets
• 2 Astro AND quad points
• @18: 1hAF 0.15, 182 retweets; @24: 1hAF 0.22, 330 retweets
• Inference: retweeting activity around the Quadrantid meteor
shower was significant in the hours of darkness for the UK and
USA
• 1 Space NOT grail
• @6: 1hAF 0.71, 274 retweets
• 216 of the retweets contained the phrase “join NASA”
• “Oh really? You need space? You might as well join NASA.”
• Inference: this is a funny joke (apparently)!
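Checking what dominates an unusual set, as done above for the phrase “join NASA”, amounts to counting the retweets that contain a phrase. A minimal sketch, assuming the retweets are available as plain strings (the function name `phrase_share` and the toy texts are hypothetical):

```python
def phrase_share(retweet_texts, phrase):
    """Return (count, share) of retweets containing phrase, ignoring case."""
    if not retweet_texts:
        raise ValueError("need at least one retweet")
    needle = phrase.lower()
    count = sum(1 for text in retweet_texts if needle in text.lower())
    return count, count / len(retweet_texts)


# Toy texts for illustration only, not real data
texts = [
    "RT You need space? You might as well join NASA.",
    "RT might as well join nasa",
    "RT Grail spacecraft enters lunar orbit",
    "RT Moon and Jupiter close together tonight",
]
print(phrase_share(texts, "join NASA"))
```

Applied to the counts on the slide, 216 of 274 retweets is a share of roughly 0.79.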
16. Conclusions
• 1hAF does support analysis of the smaller datasets typical of
scientific posts
• 24hAF was not a sensitive measure
• 24h time window too long for Twitter
• Funnel plot suggests some observations are significant
• Both low and high 1hAF were observed
• High interest = low 1hAF
• Long attention span = high 1hAF
17. Future Directions
• NLP/ML approach for identifying scientific posts?
• Clustering better than categorisation because data changes
rapidly with time?
• (after >2 weeks our training data queries were outdated)
• Different cutoffs for the AF?
• E.g. 6hAF for quarter days
• Episodic vs steady tweets (Hu et al. 2012)
• Episodic -> low AF ?
• Steady -> high AF?
• Different types of participant?
18. References
Y. Mejova and P. Srinivasan, “Crossing Media Streams with
Sentiment: Domain Adaptation in Blogs, Reviews and Twitter,” in
Sixth International AAAI Conference on Weblogs and Social Media,
2012, pp. 234-241.
L. Egghe and R. Rousseau, Introduction to Informetrics. Elsevier,
1990.
Y. Hu et al., “What Were the Tweets About? Topical Associations
between Public Events and Twitter Feeds,” in Sixth International
AAAI Conference on Weblogs and Social Media, 2012, pp.
154-161.