Five steps to search and store tweets by keywords

This episode of the tutorial teaches you how to download tweets that include a set of keywords.

Published in: Education, Technology
  • Can anyone help me?
  • processing id 1/2
    /home/python/global/set/local/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
    InsecurePlatformWarning
    Error reading id %g uncontrol, exception: Twitter API returned a 401 (Unauthorized), Timestamp out of bounds.
    processing id 2/2
    Error reading id %g unviolence, exception: Twitter API returned a 401 (Unauthorized), Timestamp out of bounds.
  • @cosmopolitanvan sure...it will be really helpful if you can send a test mail to me from your id...my email id is pallab.sarkar59@gmail.com
  • @Pallab Sarkar Pallab, can you send me the script you edited? Sometimes, nuanced tweaking in a script can cause problems.
  • First of all, awesome tutorial. But when I run the embedded code to extract data for a keyword, it throws an error at the getData function.

  1. Five Steps to Search and Store Tweets by Keywords
     • Created by The Curiosity Bits Blog (curiositybits.com)
     • With support from Dr. Gregory D. Saxton (http://social-metrics.org/)
  2. The output you will get…
     Let’s say I want to study Twitter discussions of the missing Malaysian airliner MH370. I plan to gather all tweets that include the keywords MH370 or Malaysian. You will get an ample amount of metadata for each tweet. Here is a breakdown of each metadata type:

     | Name | Definition |
     | --- | --- |
     | tweet_id | The unique identifier for a tweet |
     | inserted_date | When the tweet is downloaded into your database |
     | language | Language |
     | retweeted_status | Is the tweet a retweet? |
     | content | The content of the tweet |
     | from_user_screen_name | The screen name of the tweet sender |
  3. Metadata breakdown, continued:

     | Name | Definition |
     | --- | --- |
     | from_user_followers_count | The number of followers the sender has |
     | from_user_friends_count | The number of users the sender is following |
     | from_user_listed_count | How many times the sender is listed |
     | from_user_statuses_count | The number of tweets sent by the sender |
     | from_user_description | The profile bio of the sender |
     | from_user_location | The location of the sender |
     | from_user_created_at | When the Twitter account is created |
     | retweet_count | How many times the tweet is retweeted |
     | entities_urls | The URLs included in the tweet |
     | entities_urls_count | The number of URLs included in the tweet |
     | entities_hashtags | The hashtags included in the tweet |
     | entities_hashtags_count | The number of hashtags in the tweet |
     | entities_mentions | The screen names mentioned in a tweet |
  4. Metadata breakdown, continued:

     | Name | Definition |
     | --- | --- |
     | in_reply_to_screen_name | The screen name of the user who is replied to by the sender |
     | in_reply_to_status_id | The unique identifier of a reply |
     | entities_expanded_urls | Complete URLs extracted from short URLs |
     | json_output | The ENTIRE metadata in JSON format, including metadata not parsed into columns |
     | entities_media_count | NA |
     | media_expanded_url | NA |
     | media_url | NA |
     | media_type | NA |
     | video_link | NA |
     | photo_link | NA |
     | twitpic | NA |
  5. Step 1: Checklist
     • Do you know how to install the necessary Python libraries? If not, please review p. 8 of http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
     • Do you know how to browse and edit a SQLite database with SQLite Database Browser? If not, please review pp. 10-14 of http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/
     • Download the code: https://drive.google.com/file/d/0Bwwg6GLCW_IPdm1mcHNXeU85Nkk/edit?usp=sharing
  6. Step 1: Checklist
     Have you installed these necessary Python libraries?
  7. Step 1: Checklist
     Most importantly, we need to install a Twitter mining library called Twython (https://twython.readthedocs.org/en/latest/index.html).
  8. Step 2: enter the search terms
     You can enter multiple search terms, separated by commas. Note that the last search term also ends with a comma. You can enter non-English search terms, but make sure the Python script starts with the block of code shown on the slide.
  9. Step 3: enter your API keys
     The script needs four values: API key, API secret, access token, and access token secret. Enter each key inside the quotation marks.
  10. Step 3: enter your API keys
      • Set up your API keys - 1
        First, go to https://dev.twitter.com/ and sign in to your Twitter account. Then go to the My applications page to create an application.
  11. Step 3: enter your API keys
      • Set up your API keys - 2
        Enter any name that makes sense to you. Enter any text that makes sense to you. You can enter any legitimate URL; here, I put in the URL of my institution. Same as above: you can enter any legitimate URL; here, I put in the URL of my institution.
  12. Step 4: change the parameter
      The parameter result_type is defined by the Twitter API documentation. Here we set it to recent; it can also be set to mixed or popular.
  13. Step 4: change the parameter
      Here is a list of parameters you can tweak or add: https://dev.twitter.com/docs/api/1.1/get/search/tweets
      For example, if you want to limit the search to Chinese, you can add lang = 'zh'.
  14. Step 4: change the parameter
      As another example, if you want to limit the search to tweets sent up to April 1, 2014, you can add until = '2014-04-01'.
  15. Step 5: set up SQLite database
      • When you type in just a file name, the database is saved in the same folder as the Python script. You can also use a full file path such as sqlite:///C:/xxxx/xxx/MH370.sqlite.
  16. Hit RUN!
  17. Are we getting all the tweets?
      If you run the script daily or twice a day, that should be enough to cover all tweets generated that day, plus tweets that are a few days old. But historical tweets are EXPENSIVE! Tweets older than a week can be purchased through http://gnip.com/
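
Steps 2 to 4 can be sketched roughly as follows. This is not the tutorial's downloadable script, only a minimal illustration of the same idea. The helper names build_search_params and search_tweets are hypothetical; the only library calls assumed are Twython's constructor and its search method, which passes its keyword arguments to the search/tweets endpoint. The parameter names q, result_type, lang, until, and count come from the Twitter API documentation linked in Step 4.

```python
# Hypothetical helpers for illustration; not taken from the tutorial's script.
SEARCH_TERMS = ["MH370", "Malaysian"]  # Step 2: the keywords to search for

ALLOWED_RESULT_TYPES = ("recent", "mixed", "popular")


def build_search_params(term, result_type="recent", lang=None, until=None, count=100):
    """Step 4: assemble the parameters for one search/tweets request."""
    if result_type not in ALLOWED_RESULT_TYPES:
        raise ValueError("result_type must be one of %s" % (ALLOWED_RESULT_TYPES,))
    params = {"q": term, "result_type": result_type, "count": count}
    if lang is not None:
        params["lang"] = lang    # e.g. 'zh' to limit the search to Chinese
    if until is not None:
        params["until"] = until  # e.g. '2014-04-01', in YYYY-MM-DD form
    return params


def search_tweets(term, api_key, api_secret, access_token, access_token_secret, **options):
    """Step 3: authenticate with the four keys, then run one search."""
    # Imported here so the parameter helper above works without Twython installed.
    from twython import Twython

    twitter = Twython(api_key, api_secret, access_token, access_token_secret)
    response = twitter.search(**build_search_params(term, **options))
    return response["statuses"]  # a list of tweet dictionaries
```

To cover every keyword, one would call search_tweets once for each entry in SEARCH_TERMS, passing the four keys from Step 3.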

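Step 5, together with the advice to rerun the script daily, could look something like the sketch below. It uses Python's built-in sqlite3 module rather than whatever database layer the original script uses, so it takes a plain file name or path instead of a sqlite:/// URL. The function names and the reduced set of columns are assumptions made for illustration; the column names follow the metadata breakdown above, and the tweet fields (id_str, text, user, entities) follow the Twitter API's JSON format. Using tweet_id as the primary key means that tweets already stored by an earlier run are skipped instead of duplicated.

```python
import json
import sqlite3


def open_database(path="MH370.sqlite"):
    """Open (or create) the database file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets (
               tweet_id TEXT PRIMARY KEY,
               content TEXT,
               from_user_screen_name TEXT,
               retweet_count INTEGER,
               entities_hashtags TEXT,
               entities_hashtags_count INTEGER,
               json_output TEXT
           )"""
    )
    return conn


def tweet_to_row(tweet):
    """Parse one tweet dictionary into the columns defined above."""
    hashtags = [h["text"] for h in tweet.get("entities", {}).get("hashtags", [])]
    return (
        tweet["id_str"],
        tweet.get("text"),
        tweet.get("user", {}).get("screen_name"),
        tweet.get("retweet_count", 0),
        ", ".join(hashtags),
        len(hashtags),
        json.dumps(tweet),  # keep the entire metadata, parsed or not
    )


def store_tweets(conn, tweets):
    """Insert tweets, ignoring ones already stored; return how many were new."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?, ?, ?, ?)",
        [tweet_to_row(t) for t in tweets],
    )
    conn.commit()
    return conn.total_changes - before
```

Passing just a file name saves the database in the current working directory; a full path works the same way.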