Guess the Country - Playing with Twitter Streaming API

Chris Birchall
#m3dev Tech Talk 2014/7/11

It started with an idle tweet...
https://twitter.com/cbirchall/status/466197512143912961

Let’s use Twitter for something (slightly) useful!
The plan:
● Collect geo-tagged tweets from the Twitter Streaming API
● Use them to build a name⇔country DB
● Build a simple search UI as a proof of concept
● (crowbar Spark in there somewhere because it’s cool)

Implementation
https://github.com/cb372/guess-the-country
Twitter Streaming API → Twitter4j (EC2) → .log → Fluentd → S3 → Spark (EC2) → Postgres (RDS) → Rails (Heroku)
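
A rough sketch of the "keep only names and countries" step at the collector. The real collector is built on Twitter4j, so this Python version is only an illustration of the idea, not the project's code. It assumes each tweet arrives as a JSON string using the Streaming API v1.1 field names `user.id`, `user.name` and `place.country_code`; the function name `extract_name_and_country` is made up for this example.

```python
import json
from typing import Optional, Tuple


def extract_name_and_country(raw_tweet: str) -> Optional[Tuple[int, str, str]]:
    """Return (user_id, name, country_code), or None if the tweet is unusable.

    Assumption: v1.1-style tweet JSON with a `user` object and a `place`
    object. Tweets without a place (not geo-tagged) are skipped.
    """
    tweet = json.loads(raw_tweet)
    user = tweet.get("user")
    place = tweet.get("place")
    if not user or not place:
        return None

    user_id = user.get("id")
    name = (user.get("name") or "").strip()
    country = place.get("country_code")
    if user_id is None or not name or not country:
        return None

    # Everything else in the tweet is deliberately thrown away.
    return user_id, name, country
```

Keeping the user ID alongside the name is a choice made for this sketch, so that repeated tweets from the same user can be collapsed later.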

Collecting tweets
● Ran the collector for 13 days
● Collected 285,340 geo-tagged tweets
● 205,798 distinct users
● Only collected names and countries, threw everything else away
Processing
● Used Spark to filter out duplicate users
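
The duplicate-user filter can be sketched in a few lines. This is plain Python rather than Spark, and it assumes records shaped as `(user_id, name, country)` tuples, which is a hypothetical format chosen for the example. Going by the figures above, keeping one record per user would take 285,340 tweets down to 205,798 records.

```python
from typing import Iterable, List, Tuple

# Hypothetical record shape: (user_id, name, country_code)
Record = Tuple[int, str, str]


def dedupe_by_user(records: Iterable[Record]) -> List[Record]:
    """Keep the first record seen for each user ID, preserving input order."""
    seen_user_ids = set()
    unique_records = []
    for user_id, name, country in records:
        if user_id in seen_user_ids:
            continue  # this user has already contributed a record
        seen_user_ids.add(user_id)
        unique_records.append((user_id, name, country))
    return unique_records
```

Keeping the first record is an arbitrary rule for the sketch. A user who tweets from several countries would need a more considered policy.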

Results
It works surprisingly well!
(well, it worked for my name, anyway)
Note for the pedantic:
Since the original data is geo-tagged tweets, strictly speaking we only know where a user is, not where they come from.
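
For illustration, a name lookup of this kind could be as simple as counting countries per name token. The sketch below is an assumption about how such a search might work, not a description of the demo's actual query: `build_index` and `guess_countries` are hypothetical names, and matching on lowercased whitespace-separated tokens is a simplification.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_index(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Map each lowercased name token to a Counter of country codes."""
    index: Dict[str, Counter] = defaultdict(Counter)
    for name, country in pairs:
        # set() so a repeated token in one name is only counted once
        for token in set(name.lower().split()):
            index[token][country] += 1
    return index


def guess_countries(index: Dict[str, Counter], query: str, top_n: int = 3) -> List[Tuple[str, int]]:
    """Return up to top_n (country, count) pairs, most frequent first."""
    totals: Counter = Counter()
    for token in set(query.lower().split()):
        totals.update(index.get(token, Counter()))
    return totals.most_common(top_n)
```

With a toy index built from `("Chris Birchall", "GB")`, `("Chris Smith", "US")` and `("Chris Jones", "GB")`, the query `"chris"` ranks GB ahead of US. As the note above says, the counts reflect where users tweet from, not where they come from.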

Try for yourself
Demo
http://guess-the-country.herokuapp.com/
Code
https://github.com/cb372/guess-the-country
