NYC Data Science Academy, Data Science by R Intensive (beginner level): R003 student Jiten presented how he scraped a data set and analyzed the popularity of South Park episodes.
2. HIGH LEVEL STEPS
Retrieve IMDB Data (Season 17 Episodes)
Episode Rating
# of Votes
Retrieve Twitter Statuses (@SouthPark)
# of Retweets
# of Favorites
Join Data Sets on Episode Title
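The join step above can be sketched as follows. This is a minimal illustration in Python, not the presenter's R code: the field names (`title`, `rating`, `votes`, `text`, `retweets`, `favorites`), the helper names, and the rule that a tweet belongs to an episode when its text contains the episode title are all assumptions, and the sample rows are placeholders rather than real IMDB or Twitter values.

```python
from collections import defaultdict


def normalize(title):
    """Lowercase and collapse whitespace so titles compare reliably."""
    return " ".join(title.lower().split())


def join_on_title(episodes, tweets):
    """Inner-join episode rows with per-episode tweet totals.

    episodes: list of dicts with keys title, rating, votes
    tweets:   list of dicts with keys text, retweets, favorites
    Assumption: a tweet is attributed to an episode if the tweet text
    contains the normalized episode title.
    """
    totals = defaultdict(lambda: {"retweets": 0, "favorites": 0})
    for ep in episodes:
        key = normalize(ep["title"])
        for tw in tweets:
            if key in normalize(tw["text"]):
                totals[key]["retweets"] += tw["retweets"]
                totals[key]["favorites"] += tw["favorites"]

    # Keep only episodes that matched at least one tweet (inner join).
    joined = []
    for ep in episodes:
        key = normalize(ep["title"])
        if key in totals:
            joined.append({**ep, **totals[key]})
    return joined


# Placeholder rows for illustration only.
episodes = [
    {"title": "Episode A", "rating": 8.0, "votes": 100},
    {"title": "Episode B", "rating": 7.0, "votes": 50},
]
tweets = [
    {"text": "Tonight: episode a!", "retweets": 10, "favorites": 5},
    {"text": "Watch Episode A  again", "retweets": 3, "favorites": 2},
    {"text": "unrelated status", "retweets": 1, "favorites": 1},
]
joined = join_on_title(episodes, tweets)
```

Matching on substring is fragile for short or common titles; a real pipeline would likely need a stricter matching rule or a manually curated mapping.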
4. WHAT NEXT?
Add Other Social Network Data Sets (Facebook, Google+, etc.)
Identify the Network with the Strongest Correlation to Rating
A/B Test to Establish Causality
Add TV Re-Air Dates, Social Network Activity
Hashtags
Retweets
Expand to Other Shows
# of Retweets
# of Favorites
Find Most Valuable Social Network Users
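The "network with the strongest correlation to rating" idea above could be approached along these lines. This is a hedged Python sketch, not the presenter's method: it assumes Pearson correlation as the measure, one activity value per episode per network, and hypothetical network names and placeholder numbers. Correlation alone does not show causality, which is why the slide pairs it with an A/B test.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def strongest_network(ratings, activity_by_network):
    """Return (network name, all scores), picking the largest |r|."""
    scores = {
        name: pearson(values, ratings)
        for name, values in activity_by_network.items()
    }
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores


# Placeholder values for illustration only; one entry per episode.
ratings = [1, 2, 3, 4]
activity = {
    "network_a": [2, 4, 6, 8],
    "network_b": [1, 3, 2, 4],
}
best, scores = strongest_network(ratings, activity)
```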