
News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions

We offer free and open access to two core resources for social media and journalism research:

1) A data collection and enrichment pipeline that generates custom datasets containing both news content and the social interactions around it, starting from a predefined set of news sources.

2) A dataset that includes news articles from major U.S. news outlets and the associated sharing activity on Twitter, covering tweet content and author profiles.

First, we build a robust pipeline for collecting datasets that describe news sharing. The pipeline takes a list of news sources as input and generates a large collection of articles, of the accounts that share them on social media (either directly or by retweeting), and of the social activities performed by these accounts.
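The relationship between articles, tweets, and sharing accounts described above can be sketched as a small data model. This is only an illustrative sketch: the class names, field names, and the `link_shares` helper are hypothetical and are not taken from the NewsAnalyzer code, and the sample records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Hypothetical fields, loosely based on "source, category" from the slides.
    url: str
    source: str
    category: str


@dataclass(frozen=True)
class Tweet:
    # Hypothetical fields; a real tweet record carries far more metadata.
    tweet_id: int
    user: str
    article_url: str
    is_retweet: bool


def link_shares(articles, tweets):
    """Group tweets by the collected article they link to and gather the sharers."""
    known_urls = {a.url for a in articles}
    shares = defaultdict(list)
    sharers = set()
    for t in tweets:
        if t.article_url in known_urls:  # skip tweets linking to other pages
            shares[t.article_url].append(t)
            sharers.add(t.user)
    return dict(shares), sharers


# Invented sample data, for illustration only.
articles = [Article("https://example.com/a1", "example.com", "politics")]
tweets = [
    Tweet(1, "alice", "https://example.com/a1", False),  # direct share
    Tweet(2, "bob", "https://example.com/a1", True),     # share via retweet
    Tweet(3, "carol", "https://other.example/x", False), # unrelated link
]
shares, sharers = link_shares(articles, tweets)
```

Both direct shares and retweets count as sharing here, matching the "either directly or by retweeting" wording; the `is_retweet` flag is kept so the two can be separated later if needed.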

The dataset is published on Harvard Dataverse:

https://doi.org/10.7910/DVN/5XRZLH
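As a minimal sketch of how the dataset record might be located programmatically, the snippet below builds a metadata URL for the DOI above. It assumes the dataset is served by the dataverse.harvard.edu installation and that the Dataverse native API's persistent-identifier lookup is available there; the function name is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed host of the Harvard Dataverse installation.
DATAVERSE_HOST = "https://dataverse.harvard.edu"
# DOI taken from the link above, in Dataverse persistent-ID form.
DATASET_DOI = "doi:10.7910/DVN/5XRZLH"


def dataset_metadata_url(host, persistent_id):
    """Build the native-API URL that looks a dataset up by persistent ID."""
    query = urlencode({"persistentId": persistent_id})
    return f"{host}/api/datasets/:persistentId/?{query}"


url = dataset_metadata_url(DATAVERSE_HOST, DATASET_DOI)
print(url)
```

The URL can then be fetched with any HTTP client; the shape of the response and any access restrictions on the files should be checked against the Dataverse documentation.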

Second, we also provide a large-scale dataset that can be used to study the social behavior of Twitter users and their involvement in the dissemination of news items. Finally we show an application of our data collection in the context of political stance classification and we suggest other potential usages of the presented resources.

The code is published on GitHub:

https://github.com/DataSciencePolimi/NewsAnalyzer

The details of our approach are published in an ICWSM 2019 paper, available online in the AAAI library:
https://aaai.org/ojs/index.php/ICWSM/article/view/3256

You can cite the paper as:

Giovanni Brena, Marco Brambilla, Stefano Ceri, Marco Di Giovanni, Francesco Pierri, Giorgia Ramponi. News Sharing User Behaviour on Twitter: A Comprehensive Data Collection of News Articles and Social Interactions. AAAI ICWSM 2019, pp. 592-597.

Published in: Data & Analytics

News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions

  1. News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions
     Giovanni Brena, Marco Brambilla, Stefano Ceri, Marco Di Giovanni, Francesco Pierri, Giorgia Ramponi
     Politecnico di Milano
     @marcobrambi, marco.brambilla@polimi.it, http://datascience.deib.polimi.it/
  2. The Dataset
     • 69 news sources (online newspapers and news agencies)
     • 8 news categories
     • 300K news articles
     • 1 million tweets sharing the news
     • 40K user profiles authoring the tweets
     • On dataverse.harvard.edu: https://doi.org/10.7910/DVN/5XRZLH
  3. The Dataset
     • Article details: source, category, topic
  4. The Dataset
     • Enriched user profile: age, sex, ethnicity, geolocation and Twitter profile
     • Twitter network of the user: followers, friends, mentions, and retweets
  5. The Dataset
     • Sharing behaviour analysis: sharing and reaction stats, vocabulary and most frequent hashtags
  6. The Data Collection & Enrichment Pipeline
     Implementation available on GitHub: https://github.com/DataSciencePolimi/NewsAnalyzer
  7. What do I use it for? … Anything you want
     • Prediction of U.S. 2018 Midterm election results by state, based on classification of news sharing behaviour
     • Republican vs. Democratic affiliation classification reached 90% accuracy at the user level
     • Results wrongly predicted for only 5 states
  8. Marco Brambilla
     @marcobrambi, marco.brambilla@polimi.it
     Politecnico di Milano, Italy
     http://datascience.deib.polimi.it/
     News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions
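Slide 7 mentions going from user-level affiliation labels to state-level predictions. The slides do not say how the aggregation was done, so the following is only a sketch under the assumption of a simple per-state majority vote; the function name, the label strings, and the sample data are all invented for illustration.

```python
from collections import Counter, defaultdict


def predict_states(user_predictions):
    """Aggregate (state, label) pairs into one majority label per state.

    Assumption: a plain majority vote. On a tie, the label seen first
    for that state wins, because Counter.most_common keeps insertion order
    among equal counts.
    """
    votes = defaultdict(Counter)
    for state, label in user_predictions:
        votes[state][label] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}


# Invented toy data, not results from the paper.
toy_predictions = [
    ("TX", "Rep"),
    ("TX", "Rep"),
    ("TX", "Dem"),
    ("CA", "Dem"),
]
state_labels = predict_states(toy_predictions)
print(state_labels)
```

A real analysis would also need to handle users without a known state and states with too few users to support a prediction.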
