1. How Useful Is A Tweet?
iHub Research’s 3Vs of Crowdsourcing
Framework
Angela Crandall, Nanjira Sambuli, Chris Orwa
This research was funded by Canada’s International Development Research Centre.
3. Twitter: Some Facts and Figures
• Launched in 2006
• Approximately 550 million active users
worldwide
• About 200 million monthly active users
• An average of 400 million tweets are sent
every day globally
• 60% of the monthly active users log on using
a mobile device at least once every month
4. Twitter and the Tweeps
• Twitter…can be more of a news media than
even a social network (Kwak et al., 2010)
• Breaking news and coverage of real-time
events are all shared under the 140-character
limit
• Twitter users search for up-to-the-second
information and updates on unfolding events
5. Twitter for Crowdsourcing.
That Is…
Collecting information from the “crowd”
• Allows for a wide reach of people in inexpensive ways
• Large amounts of data can be obtained quickly, and
often in real time
• Does not necessarily require technology, but today
most crowdsourcing happens online or via mobile phone
• Crowdsourcing fosters citizen engagement with the
information—to dispute, confirm, or acknowledge its
existence.
7. What is there to (Twitter) crowdsourcing?
Viability: In what situations or events is
crowdsourcing a viable venture, likely to offer
worthwhile results?
Validity: Does crowd-sourced information offer a true
reflection of the reality on the ground?
Verification: Is there a way in which we can verify
that the information provided through crowdsourcing
is indeed valid? If so, can the verification process be
automated?
8. Crowdsourcing during an Election
• What, if any, particular conditions should be in place
for crowdsourcing of information to be viable during
an election period?
• Can crowd-sourced information be validated during
an election period? If so, how can this be done in
practice?
• How do different crowdsourcing methods contribute
to the quality of information collected?
9. Why Elections?
o Elections in Kenya have been noted to spark many
online conversations, especially with the continued
uptake of social media;
o Citizens have an important role to play to contribute
information from the ground;
o Election crowdsourcing initiatives exist (such as
Uchaguzi), but none use passive crowdsourcing;
o Research exists on crowdsourcing during disasters,
but not yet on crowdsourcing during elections.
10. Why Crowdsourcing, Kenyan
Elections and #KoT
• #KoT (Kenyans on Twitter) have taken part in
crowdsourcing activities on several occasions, under
hashtags such as #CarPoolKE, #findfuel and
#SomeoneTellCNN
• Approximately 90,000 tweets were generated during
the first Kenyan Presidential Debates (as monitored
using popular hashtags)
• Election campaigning was also digital
11. (Online) Passive Crowdsourcing
vs. Active Crowdsourcing
• Active – Open call made for participation
(e.g. Ushahidi’s Crowdmap).
• Passive – Sifting through content already
being generated (e.g. on Twitter/
Facebook) to capture relevant
information.
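Passive crowdsourcing, as described above, can be sketched as a filter over tweets that already exist. This is a minimal illustration only: the keyword and hashtag watch lists, the sample tweets and the function names are all assumptions, not the terms or tools used in the study.

```python
import re

# Hypothetical watch lists; the slides do not give the terms actually used.
KEYWORDS = {"polling", "ballot", "queue", "violence"}
HASHTAGS = {"#debate", "#election"}


def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())


def is_relevant(text, keywords=KEYWORDS, hashtags=HASHTAGS):
    """Return True if the tweet contains any watched keyword or hashtag."""
    tokens = set(tokenize(text))
    return bool(tokens & keywords or tokens & hashtags)


def sift(tweets):
    """Passive crowdsourcing step: keep only tweets that look relevant."""
    return [tweet for tweet in tweets if is_relevant(tweet)]


# Invented sample tweets, for illustration only.
sample = [
    "Long queue at the polling station this morning",
    "Lunch was great today",
    "Watching the #debate tonight",
]
print(sift(sample))
```

No open call is made here: the filter only reads what people have already posted, which is the difference from an active platform such as Ushahidi's Crowdmap.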
12. What we did
Cross-comparison of different media
sources:
o Traditional Media
o Data mining from Twitter
o Uchaguzi Crowdsourcing
o Fieldwork
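One way to picture a cross-comparison of the four sources is as set overlap between the incidents each one reported. The sketch below assumes incidents have already been matched to shared identifiers; the identifiers, source keys and the compare_sources helper are hypothetical and do not represent the study's actual procedure or data.

```python
def compare_sources(reports):
    """Summarise each source's incidents against all other sources.

    reports maps a source name to a set of incident identifiers.
    """
    all_incidents = set().union(*reports.values())
    summary = {}
    for source, incidents in reports.items():
        # Incidents reported by every source except the current one.
        others = set().union(
            *(found for name, found in reports.items() if name != source)
        )
        summary[source] = {
            "total": len(incidents),
            "unique": sorted(incidents - others),
            "coverage": (
                len(incidents) / len(all_incidents) if all_incidents else 0.0
            ),
        }
    return summary


# Invented incident identifiers, for illustration only.
reports = {
    "traditional_media": {"A", "B"},
    "twitter": {"A", "B", "C", "D"},
    "uchaguzi": {"B", "D"},
    "fieldwork": {"A", "E"},
}
for source, stats in compare_sources(reports).items():
    print(source, stats)
```

The "unique" entry shows what a source adds that no other source captured, and "coverage" shows its share of everything reported.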
19. First tweet by a government institution about
the attack
20. Mining Of Twitter Data without
Machine Learning is Not Feasible
| Search method | Time taken | Number of newsworthy tweets | Search time for whole data set | Viable for real-time analysis | Viable for post-data analysis |
|---|---|---|---|---|---|
| Linear search | 90 hrs | 100 | 270 days | No | No |
| Keyword search | 4.5 hrs | 400 | 27 days | No | In a very limited way |
| ML, supervised learning | Less than 6 mins, 1.5 hrs labeling | 12,208 | Less than 1 sec | Yes | Yes |
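The supervised-learning row can be illustrated with a small text classifier: a person labels a sample of tweets, and the model then labels the rest. The slides do not say which algorithm was used, so the multinomial naive Bayes below is an assumption, and the training tweets are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the class
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented, hand-labelled training tweets (the manual labelling step).
texts = [
    "Gunshots heard near the mall, police arriving",
    "Explosion reported at the shopping centre",
    "Police have cordoned off the area around the mall",
    "Praying for everyone, stay safe",
    "So sad about the news today",
    "Can't believe this is happening",
]
labels = ["newsworthy"] * 3 + ["other"] * 3

model = NaiveBayes().fit(texts, labels)
print(model.predict("Police report gunshots at the mall"))
```

Once trained, the model labels each new tweet with a handful of arithmetic operations, which is why this approach can keep up with a live stream where manual reading cannot.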
21. From the Westgate Incident…
Manually mining tweets from the Westgate attack
has been labour-intensive, so we could adequately
analyse only the first half hour
(12:38 PM – 1:18 PM GMT+3)
Further analysis into Twitter data from the
incident will require machine learning
techniques.
22. In Summary:
o Kenyan social media content is rich with real-time
updates of happenings that might not be present in
mainstream media reports.
o Mining of crowd-sourced data appears to be of high
value when one is looking for timely, local information.
o The 3Vs (viability, validity, verification) offer useful
considerations for assessing and running an
election-based crowdsourcing activity.
24. Next Steps
• Test the 3Vs Framework on other election-
related crowdsourcing opportunities
• Move to real-time analysis of tweets
• Provide tools for verifying crowdsourced
information
• Integrate research into media practices
• Work with local media organizations to build a
usable tool for collecting real-time
newsworthy incidents from the crowd