2. Collecting social media from API vs. website
Advantages of API:
● Structured data (typically JSON).
● Tend to be stable.
● Some provide more metadata than is
available from the website.
● Data can be collected efficiently.
Disadvantages of API:
● Not all social media platforms have
complete, public APIs.
● Data not readily human-viewable.
● Each API is different.
● Some platforms (notably Twitter) limit
data sharing.
● There may be greater limitations on
the data available from an API,
especially historical data.
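The contrast between structured API data and a human-viewable web page can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the JSON field names, the HTML markup, and the helper names `post_from_api` and `post_from_web` are hypothetical and do not reflect any particular platform's schema or API.

```python
import json
from html.parser import HTMLParser

# Hypothetical API response. Field names are illustrative only.
api_response = (
    '{"id": "123", "user": "example_user", "text": "Hello world", '
    '"created_at": "2017-01-21T15:00:00Z"}'
)

# Hypothetical web rendering of the same post. Note the timestamp is absent.
web_page = (
    '<div class="post"><span class="user">example_user</span>'
    '<p class="text">Hello world</p></div>'
)


def post_from_api(raw):
    """Structured data: one call to json.loads yields named fields."""
    record = json.loads(raw)
    return {
        "user": record["user"],
        "text": record["text"],
        "created_at": record.get("created_at"),
    }


class PostParser(HTMLParser):
    """Website data: fields must be recovered from presentation markup."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # class name of the element being read, if wanted

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("user", "text"):
            self._current = css_class

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = data
            self._current = None


def post_from_web(html):
    parser = PostParser()
    parser.feed(html)
    return {
        "user": parser.fields.get("user"),
        "text": parser.fields.get("text"),
        "created_at": None,  # not present in this hypothetical page
    }


if __name__ == "__main__":
    print(post_from_api(api_response))
    print(post_from_web(web_page))
```

In this sketch the API path is a few lines and survives a site redesign, while the web path depends on class names that could change at any time and recovers less metadata.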
4. Social Feed Manager (SFM)
● Open-source software.
● Developed by GW Libraries with a grant from the National Historical Publications &
Records Commission.
● Collects social media from the APIs of Twitter, Tumblr, Flickr & Sina Weibo. Also
collects web resources.
● Supports requirements of researchers and archivists/librarians.
○ Collect to answer specific research questions.
○ Proactively collect to support future research.
● Intended to be provided as a service to the members of a community (as
opposed to individual use).
8. Example collections
● 2016 U.S. election
○ 280 million tweets
○ Separate collections for
candidates, debates,
conventions
○ Published to Harvard Dataverse
● End of Term (EOT) collection
○ 3000 Twitter accounts
○ 70 Tumblr blogs
○ Continuing as 2017-2020
Federal Term collection
● Women’s March
● Trump Administration officials
● 115th U.S. Congress
● News outlets
● Chinese anti-corruption
○ Sina Weibo and Twitter
● ISIS-related Twitter users
● Latin American political leaders
● George Washington University
○ Official accounts and student
groups