SlideShare a Scribd company logo
1 of 24
The Wisdom of the Audience: An Empirical Study of
Social Semantics in Twitter Streams
Claudia Wagner, Philipp Singer, Lisa Posch and Markus Strohmaier
10th Extended Semantic Web Conference, Montpellier, 29.5.2013
Problem
#music
#fashion
Authors make their messages as informative as required but do not provide
more information than necessary (Maxim of Quantity by Grice (1975))
[src: http://www.techweekeurope.co.uk/wp-content/uploads/2012/07/Twitter.jpg]
Research Questions3
RQ 1: To what extent is the background knowledge of audiences useful for
analyzing the semantics of social media messages?
RQ 2: What are the characteristics of an audience which possesses useful
background knowledge for interpreting the meaning of a stream's messages
and which types of streams tend to have useful audiences?
[scr: http://www.teachthought.com/twitter-hashtags-for-teacher/]
Methodology
Message Classification Task
Use hashtags as ground truth
Laniado and Mika (2010) showed that around half of all hashtags can
be associated with Freebase concepts
Compare real audience with random audience - how well can an
audience predict the hashtag of a tweet?
The audience which is better in guessing the hashtag of a Twitter
message is better in interpreting the meaning of the message
Null hypothesis: If the audience of a stream does not possess
more knowledge about the semantics of the stream's messages
than a randomly selected baseline audience, the results from
both classification models should not differ significantly
4
Methodology
Train different multiclass classifiers on the background
knowledge of the audience
Logistic Regression, Stochastic Gradient Descent, Multinomial Naive
Bayes and Linear SVM
Compare different approaches for estimating the
background knowledge
Different audience and content selection approaches
Different methods for estimating the background knowledge
Test how well each model can predict the hashtag of
future messages
Weighted Macro F1
5
Dataset
Diverse sample of hashtags
Romero et al. (2011) identified eight categories of
hashtags on a large data sample
celebrity, games, idioms, movies/TV, music, political, sports, and
technology
We randomly draw from each category ten
hashtags which were still in use
6
Dataset7
Technology Idioms Sports Politics
#blackbery,
#iphone, #google
#omgfacts,
#factsaboutme,
#iwish
#football, #nfl,
#yankees
#climate, #iran,
#teaparty
Games Music Celebrity Movies
#gaming,
#mafiawars,
#wow
#lastfm,
#eurovision,
#nowplaying
#bsb,
#michaeljackson,
#rogis
#avatar, #tv,
#glennbeck
Dataset8
t0 t1 t2
3/4/2012 4/1/2012 4/29/2012
stream
tweets
crawl of
social
structure
stream
tweets
crawl of
social
structure
stream
tweets
crawl of
social
structure
1 week
crawl of audience
tweets
crawl of audience
tweets
crawl of audience
tweets
t1 t2 t3
Stream Tweets 94,634 94,984 95,105
Stream Authors 53,593 54,099 53,750
Friends 7,312,792 7,896,758 8,390,143
Audience Tweets 29,144,641 29,126,487 28,513,876
Audience Selection
A
B
C
AuthorsAudienceRank
1
2
3
Stream
Team bc tryouts tomo
#football
What we learned this
week: Chelsea are
working in reverse
and Avram is coming
#football #soccer
Weekend pleeeease
hurrrrry #sanmarcos
#football
Holy #ProBowl I'm
spent for the rest of
the day. #football
Fifa warns Indonesia
to clean up its football
or face sanctions
#Indonesia #Football
Background Knowledge
Content Selection
Recent
The most recent messages authored by the
audience users
Top Links (plain and enriched)
the messages authored by the audience which
contain one of the top links of that audience
Top Tags
the messages authored by the audience which
contain one of the top hashtags of that audience
10
Background Knowlegde
Representation
Preprocessing: remove stopwords, twitter
syntax, stemming
Represent background knowledge of the audience
via the most likely topics or most important words
of their messages
Bag of Words: TF and TFIDF
Topic Models: LDA
11
Empirical Evaluation
RQ 1: To what extent does the background
knowledge of the audience support the semantic
annotation of individual messages?
Combine audience selection and background
knowledge estimation approaches to generate
semantic features of the messages authored by an
audience
Training data on audience’s messages crawled at t0
Test model using messages of the hashtag streams
crawled at t1
12
Results13
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
Audience - recent 0.25 0.23
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
Audience – recent 0.25 0.23
Audience – top links enriched 0.13 0.10
Audience – top links plain 0.12 0.10
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
Audience – recent 0.25 0.23
Audience – top links enriched 0.13 0.10
Audience – top links plain 0.12 0.10
Audience – top tags 0.24 0.21
The audience of a hashtag stream contains knowledge
which is useful for predicting the hashtags of future
messages
Results14
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
Audience - recent 0.25 0.23
F1 (TF-IDF) F1 (LDA)
Random Guessing 1/78 1/78
Baseline (random audience) 0.01 0.01
Audience – recent 0.25 0.23
Audience – top links enriched 0.13 0.10
Audience – top links plain 0.12 0.10
F1 (TF-IDF) F1 (LDA)
celebrity 0.17 0.15
games 0.25 0.22
idioms 0.09 0.05
movies 0.22 0.18
music 0.23 0.18
political 0.36 0.33
sports 0.45 0.42
technology 0.22 0.22
Empirical Evaluation
RQ 2: What are the characteristics of an
audience which possesses useful background
knowledge for interpreting the meaning of a
stream's messages and which types of streams
tend to have useful audiences?
Correlation analysis between the ability of an
audience to interpret the meaning of
messages and structural properties of the
stream
15
Structural Stream Properties
Static Measures
Coverage: informational, hashtag, retweet and
conversational extent of a stream
Entropy: randomness of a stream's authors and their
followers, followees and friends
Overlap: overlap between authors and followers,
authors and followees and authors and friends
Dynamic Measures
KL divergence between the author-, the follower-, and
the friend-distributions of a stream at different time
points
16
Stat. Significant Spearman
Rank Correlation (p<0.05)17
F1 (TF-IDF) F1 (LDA)
Overlap Author-Follower 0.675 0.655
Overlap Author-Followee 0.642 0.628
Overlap Author-Friend 0.612 0.602
Streams which are produced and consumed by a
community of users who are tightly interconnected tend to
have a useful audience.
A useful audience possesses background knowledge which
helps interpreting the meaning of messages.
Stat. Significant Spearman
Rank Correlation (p<0.05)18
F1 (TF-IDF) F1 (LDA)
Conversation Coverage 0.256 0.256
Conversational streams tend to have a useful
audience.
Stat. Significant Spearman
Rank Correlation (p<0.05)19
F1 (TF-IDF) F1 (LDA)
Entropy Author Distribution -0.270 -0.400
Entropy Friend Distribution -0.307 -
Entropy Follower Distribution -0.400 -0.319
Entropy Followee Distribution -0.401 -0.368
Streams which are produced and consumed by a
focused set of authors, followers, followees and
friends tend to have a useful audience.
Stat. Significant Spearman
Rank Correlation (p<0.05)20
F1 (TF-IDF) F1 (LDA)
KL Follower Distribution -0.281 -
KL Followee Distribution -0.343 -0.302
KL Author Distribution -0.359 -0.307
Socially stable streams tend to have an audience
which is good in interpreting the meaning of a
stream's messages.
Summary & Conclusions
The audience of a social stream possesses knowledge which
may indeed help to interpret the meaning of a stream's
messages
But not all streams have similar useful audiences
The audience of a social stream seems to be most useful if
the stream is created and consumed by a stable, focused and
communicative community – i.e., a group of users who are
interconnected and have few core users to whom almost
everyone is connected
We do not know if those relations are causal but we got
similar results when repeating our experiments on t1 and t2
21
Current and Future Work
Compare the utility of ontological knowledge with
audience background knowledge for the hashtag
prediction task
Algorithmic exploitation of our results
Hybrid hashtag recommendation algorithm
Structural stream measures may inform weighting (how much
can we count on the audience?)
Differentiate between social and topical hashtags
User-centric algorithms work only for active users who used
hashtags before
An audience-integrated approach only requires an active audience
22
References
Grice, H. P. (1975). Logic and conversation. In Speech acts, 3, 41–58. New
York: Academic Press.
Laniado, D., & Mika, P. (2010). Making sense of twitter. In Proceedings of
the 9th international semantic web conference (pp. 470-485). Shanghai,
China.
Romero, D. M., Meeder, B., & Kleinberg, J. (2011). Differences in the
me-chanics of information diffusion across topics: idioms, political hashtags,
and complex contagion on twitter. In Proceedings of the 20th international
conference on world wide web (pp. 695–704). Hyderabad, India.
24
Experimental Setup
src: http://adobeairstream.com/green/a-natural-predicament-sustainability-in-the-21st-century/
THANK YOU
claudia.wagner@joanneum.at
http://claudiawagner.info
[src: http://www.crowdscience.com/2008/06/tips_and_more/]

More Related Content

What's hot

Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatraman
Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatramanOdsc 2018 detection_classification_of_fake_news_using_cnn_venkatraman
Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatramanvenkatramanJ4
 
Analyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-TweetsAnalyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-TweetsRESHAN FARAZ
 
Evolving Swings (topics) from Social Streams using Probability Model
Evolving Swings (topics) from Social Streams using Probability ModelEvolving Swings (topics) from Social Streams using Probability Model
Evolving Swings (topics) from Social Streams using Probability ModelIJERA Editor
 
Fake News Detection using Machine Learning
Fake News Detection using Machine LearningFake News Detection using Machine Learning
Fake News Detection using Machine Learningijtsrd
 
Modeling Causal Reasoning in Complex Networks through NLP: an Introduction
Modeling Causal Reasoning in Complex Networks through NLP: an IntroductionModeling Causal Reasoning in Complex Networks through NLP: an Introduction
Modeling Causal Reasoning in Complex Networks through NLP: an IntroductionLuca Nannini
 
Insights into the Twitterverse: Benchmarking and analysis twitter content
Insights into the Twitterverse: Benchmarking and analysis twitter contentInsights into the Twitterverse: Benchmarking and analysis twitter content
Insights into the Twitterverse: Benchmarking and analysis twitter contentStephen Dann
 
IRJET - Fake News Detection using Machine Learning
IRJET -  	  Fake News Detection using Machine LearningIRJET -  	  Fake News Detection using Machine Learning
IRJET - Fake News Detection using Machine LearningIRJET Journal
 
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...COMRADES project
 
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...IRJET Journal
 
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...caijjournal
 
Tags as tools for social classification
Tags as tools for social classificationTags as tools for social classification
Tags as tools for social classificationIsabella Peters
 
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...Labic Ufes
 
Self-disclosure in twitter conversations - talk in QCRI
Self-disclosure in twitter conversations - talk in QCRISelf-disclosure in twitter conversations - talk in QCRI
Self-disclosure in twitter conversations - talk in QCRIJinYeong Bak
 
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MINING
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGFAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MINING
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGijnlc
 
These slides cover the final defense presentation for my Doctorate degree. Th...
These slides cover the final defense presentation for my Doctorate degree. Th...These slides cover the final defense presentation for my Doctorate degree. Th...
These slides cover the final defense presentation for my Doctorate degree. Th...Eric Brown
 
Politycheck - Political Ideology Detection (Natural Language Processing Chall...
Politycheck - Political Ideology Detection (Natural Language Processing Chall...Politycheck - Political Ideology Detection (Natural Language Processing Chall...
Politycheck - Political Ideology Detection (Natural Language Processing Chall...Hamza Mahmood
 
What makes a tweet relevant for a topic?
What makes a tweet relevant for a topic?What makes a tweet relevant for a topic?
What makes a tweet relevant for a topic?Ke Tao
 

What's hot (19)

Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatraman
Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatramanOdsc 2018 detection_classification_of_fake_news_using_cnn_venkatraman
Odsc 2018 detection_classification_of_fake_news_using_cnn_venkatraman
 
Analyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-TweetsAnalyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-Tweets
 
Evolving Swings (topics) from Social Streams using Probability Model
Evolving Swings (topics) from Social Streams using Probability ModelEvolving Swings (topics) from Social Streams using Probability Model
Evolving Swings (topics) from Social Streams using Probability Model
 
Fake News Detection using Machine Learning
Fake News Detection using Machine LearningFake News Detection using Machine Learning
Fake News Detection using Machine Learning
 
Nannobloging pr
Nannobloging prNannobloging pr
Nannobloging pr
 
Modeling Causal Reasoning in Complex Networks through NLP: an Introduction
Modeling Causal Reasoning in Complex Networks through NLP: an IntroductionModeling Causal Reasoning in Complex Networks through NLP: an Introduction
Modeling Causal Reasoning in Complex Networks through NLP: an Introduction
 
Insights into the Twitterverse: Benchmarking and analysis twitter content
Insights into the Twitterverse: Benchmarking and analysis twitter contentInsights into the Twitterverse: Benchmarking and analysis twitter content
Insights into the Twitterverse: Benchmarking and analysis twitter content
 
IRJET - Fake News Detection using Machine Learning
IRJET -  	  Fake News Detection using Machine LearningIRJET -  	  Fake News Detection using Machine Learning
IRJET - Fake News Detection using Machine Learning
 
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for ...
 
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
 
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
 
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...
A RELIABLE ARTIFICIAL INTELLIGENCE MODEL FOR FALSE NEWS DETECTION MADE BY PUB...
 
Tags as tools for social classification
Tags as tools for social classificationTags as tools for social classification
Tags as tools for social classification
 
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...
Multiple points of view in #VemPraRua Retweets: the perspectival method of ne...
 
Self-disclosure in twitter conversations - talk in QCRI
Self-disclosure in twitter conversations - talk in QCRISelf-disclosure in twitter conversations - talk in QCRI
Self-disclosure in twitter conversations - talk in QCRI
 
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MINING
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGFAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MINING
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MINING
 
These slides cover the final defense presentation for my Doctorate degree. Th...
These slides cover the final defense presentation for my Doctorate degree. Th...These slides cover the final defense presentation for my Doctorate degree. Th...
These slides cover the final defense presentation for my Doctorate degree. Th...
 
Politycheck - Political Ideology Detection (Natural Language Processing Chall...
Politycheck - Political Ideology Detection (Natural Language Processing Chall...Politycheck - Political Ideology Detection (Natural Language Processing Chall...
Politycheck - Political Ideology Detection (Natural Language Processing Chall...
 
What makes a tweet relevant for a topic?
What makes a tweet relevant for a topic?What makes a tweet relevant for a topic?
What makes a tweet relevant for a topic?
 

Viewers also liked

Slam about "Discrimination and Inequalities in socio-computational systems"
Slam about "Discrimination and Inequalities in socio-computational systems"Slam about "Discrimination and Inequalities in socio-computational systems"
Slam about "Discrimination and Inequalities in socio-computational systems"Claudia Wagner
 
Towards Mining Semantic Maturity in Social Bookmarking Systems
Towards Mining Semantic Maturity in Social Bookmarking SystemsTowards Mining Semantic Maturity in Social Bookmarking Systems
Towards Mining Semantic Maturity in Social Bookmarking SystemsInovex GmbH
 
Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Claudia Wagner
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...John Breslin
 
Measuring Gender Inequality in Wikipedia
Measuring Gender Inequality in WikipediaMeasuring Gender Inequality in Wikipedia
Measuring Gender Inequality in WikipediaClaudia Wagner
 
Open Badges for Migrant Academics
Open Badges for Migrant AcademicsOpen Badges for Migrant Academics
Open Badges for Migrant AcademicsIlona Buchem
 

Viewers also liked (6)

Slam about "Discrimination and Inequalities in socio-computational systems"
Slam about "Discrimination and Inequalities in socio-computational systems"Slam about "Discrimination and Inequalities in socio-computational systems"
Slam about "Discrimination and Inequalities in socio-computational systems"
 
Towards Mining Semantic Maturity in Social Bookmarking Systems
Towards Mining Semantic Maturity in Social Bookmarking SystemsTowards Mining Semantic Maturity in Social Bookmarking Systems
Towards Mining Semantic Maturity in Social Bookmarking Systems
 
Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...
 
Measuring Gender Inequality in Wikipedia
Measuring Gender Inequality in WikipediaMeasuring Gender Inequality in Wikipedia
Measuring Gender Inequality in Wikipedia
 
Open Badges for Migrant Academics
Open Badges for Migrant AcademicsOpen Badges for Migrant Academics
Open Badges for Migrant Academics
 

Similar to Eswc2013 audience short

Social Media and Text Analytics
Social Media and Text AnalyticsSocial Media and Text Analytics
Social Media and Text AnalyticsRushikeshChikane2
 
Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Alfonso Crisci
 
Press Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectPress Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectLiMoSINe Project
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIRuchika Sharma
 
Meaning as Collective Use: Predicting Semantic Hashtag Categories on Twitter
Meaning as Collective Use: Predicting Semantic Hashtag Categories on TwitterMeaning as Collective Use: Predicting Semantic Hashtag Categories on Twitter
Meaning as Collective Use: Predicting Semantic Hashtag Categories on Twitterph_singer
 
Eavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteEavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteShalin Hai-Jew
 
Data Science: 2018 Media & Influencer Analysis
Data Science: 2018 Media & Influencer AnalysisData Science: 2018 Media & Influencer Analysis
Data Science: 2018 Media & Influencer AnalysisZeno Group
 
Cloud Access Security Broker CASB: 2018 Media & Influencer Analysis
Cloud Access Security Broker CASB: 2018 Media & Influencer AnalysisCloud Access Security Broker CASB: 2018 Media & Influencer Analysis
Cloud Access Security Broker CASB: 2018 Media & Influencer AnalysisZeno Group
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfBalasundaramSr
 
Artificial Intelligence: 2018 Media & Influencer Analysis
Artificial Intelligence: 2018 Media & Influencer AnalysisArtificial Intelligence: 2018 Media & Influencer Analysis
Artificial Intelligence: 2018 Media & Influencer AnalysisZeno Group
 
Strategic perspectives 3
Strategic perspectives 3Strategic perspectives 3
Strategic perspectives 3archiejones4
 
Twitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentTwitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentIoannis Katakis
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...IJCSEA Journal
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...IJCSEA Journal
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...IJCSEA Journal
 
Knime social media_white_paper
Knime social media_white_paperKnime social media_white_paper
Knime social media_white_paperFiras Husseini
 
Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...Shalin Hai-Jew
 
From Data to Insights: how to build accurate customer insights from online co...
From Data to Insights: how to build accurate customer insights from online co...From Data to Insights: how to build accurate customer insights from online co...
From Data to Insights: how to build accurate customer insights from online co...Pulsar
 

Similar to Eswc2013 audience short (20)

Social Media and Text Analytics
Social Media and Text AnalyticsSocial Media and Text Analytics
Social Media and Text Analytics
 
Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...Weather events identification in social media streams: tools to detect their ...
Weather events identification in social media streams: tools to detect their ...
 
Press Kit -LiMoSINe Project
Press Kit -LiMoSINe ProjectPress Kit -LiMoSINe Project
Press Kit -LiMoSINe Project
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
 
Meaning as Collective Use: Predicting Semantic Hashtag Categories on Twitter
Meaning as Collective Use: Predicting Semantic Hashtag Categories on TwitterMeaning as Collective Use: Predicting Semantic Hashtag Categories on Twitter
Meaning as Collective Use: Predicting Semantic Hashtag Categories on Twitter
 
Eavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteEavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging Site
 
Data Science: 2018 Media & Influencer Analysis
Data Science: 2018 Media & Influencer AnalysisData Science: 2018 Media & Influencer Analysis
Data Science: 2018 Media & Influencer Analysis
 
Cloud Access Security Broker CASB: 2018 Media & Influencer Analysis
Cloud Access Security Broker CASB: 2018 Media & Influencer AnalysisCloud Access Security Broker CASB: 2018 Media & Influencer Analysis
Cloud Access Security Broker CASB: 2018 Media & Influencer Analysis
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdf
 
Artificial Intelligence: 2018 Media & Influencer Analysis
Artificial Intelligence: 2018 Media & Influencer AnalysisArtificial Intelligence: 2018 Media & Influencer Analysis
Artificial Intelligence: 2018 Media & Influencer Analysis
 
Strategic perspectives 3
Strategic perspectives 3Strategic perspectives 3
Strategic perspectives 3
 
Twitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentTwitter Intelligent Sensor Agent
Twitter Intelligent Sensor Agent
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
 
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...
 
s00146-014-0549-4.pdf
s00146-014-0549-4.pdfs00146-014-0549-4.pdf
s00146-014-0549-4.pdf
 
Knime social media_white_paper
Knime social media_white_paperKnime social media_white_paper
Knime social media_white_paper
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
 
Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...Extracting Social Network Data and Multimedia Communications from Social Medi...
Extracting Social Network Data and Multimedia Communications from Social Medi...
 
From Data to Insights: how to build accurate customer insights from online co...
From Data to Insights: how to build accurate customer insights from online co...From Data to Insights: how to build accurate customer insights from online co...
From Data to Insights: how to build accurate customer insights from online co...
 

More from Claudia Wagner

It's a Man's Wikipedia?
It's a Man's Wikipedia? It's a Man's Wikipedia?
It's a Man's Wikipedia? Claudia Wagner
 
When politicians talk: Assessing online conversational practices of political...
When politicians talk: Assessing online conversational practices of political...When politicians talk: Assessing online conversational practices of political...
When politicians talk: Assessing online conversational practices of political...Claudia Wagner
 
WWW2014 Semantic Stability in Social Tagging Streams
WWW2014 Semantic Stability in Social Tagging StreamsWWW2014 Semantic Stability in Social Tagging Streams
WWW2014 Semantic Stability in Social Tagging StreamsClaudia Wagner
 
Welcome 1st Computational Social Science Workshop 2013 at GESIS
Welcome 1st Computational Social Science Workshop 2013 at GESISWelcome 1st Computational Social Science Workshop 2013 at GESIS
Welcome 1st Computational Social Science Workshop 2013 at GESISClaudia Wagner
 
Spatio and Temporal Dietary Patterns
Spatio and Temporal Dietary PatternsSpatio and Temporal Dietary Patterns
Spatio and Temporal Dietary PatternsClaudia Wagner
 
The Impact of Socialbots in Online Social Networks
The Impact of Socialbots in Online Social NetworksThe Impact of Socialbots in Online Social Networks
The Impact of Socialbots in Online Social NetworksClaudia Wagner
 
It’s not in their tweets: Modeling topical expertise of Twitter users
It’s not in their tweets: Modeling topical expertise of Twitter users It’s not in their tweets: Modeling topical expertise of Twitter users
It’s not in their tweets: Modeling topical expertise of Twitter users Claudia Wagner
 
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...Claudia Wagner
 
Topic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsTopic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsClaudia Wagner
 
Knowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness StreamsKnowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness StreamsClaudia Wagner
 
The wisdom in Tweetonomies
The wisdom in TweetonomiesThe wisdom in Tweetonomies
The wisdom in TweetonomiesClaudia Wagner
 

More from Claudia Wagner (15)

It's a Man's Wikipedia?
It's a Man's Wikipedia? It's a Man's Wikipedia?
It's a Man's Wikipedia?
 
Food and Culture
Food and CultureFood and Culture
Food and Culture
 
When politicians talk: Assessing online conversational practices of political...
When politicians talk: Assessing online conversational practices of political...When politicians talk: Assessing online conversational practices of political...
When politicians talk: Assessing online conversational practices of political...
 
WWW2014 Semantic Stability in Social Tagging Streams
WWW2014 Semantic Stability in Social Tagging StreamsWWW2014 Semantic Stability in Social Tagging Streams
WWW2014 Semantic Stability in Social Tagging Streams
 
Welcome 1st Computational Social Science Workshop 2013 at GESIS
Welcome 1st Computational Social Science Workshop 2013 at GESISWelcome 1st Computational Social Science Workshop 2013 at GESIS
Welcome 1st Computational Social Science Workshop 2013 at GESIS
 
Spatio and Temporal Dietary Patterns
Spatio and Temporal Dietary PatternsSpatio and Temporal Dietary Patterns
Spatio and Temporal Dietary Patterns
 
The Impact of Socialbots in Online Social Networks
The Impact of Socialbots in Online Social NetworksThe Impact of Socialbots in Online Social Networks
The Impact of Socialbots in Online Social Networks
 
It’s not in their tweets: Modeling topical expertise of Twitter users
It’s not in their tweets: Modeling topical expertise of Twitter users It’s not in their tweets: Modeling topical expertise of Twitter users
It’s not in their tweets: Modeling topical expertise of Twitter users
 
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...
Ignorance isn't Bliss: An Empirical Analysis of Attention Patterns in Online ...
 
Socialbots www2012
Socialbots www2012Socialbots www2012
Socialbots www2012
 
SDOW (ISWC2011)
SDOW (ISWC2011)SDOW (ISWC2011)
SDOW (ISWC2011)
 
Topic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsTopic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic Models
 
Topic Models
Topic ModelsTopic Models
Topic Models
 
Knowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness StreamsKnowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness Streams
 
The wisdom in Tweetonomies
The wisdom in TweetonomiesThe wisdom in Tweetonomies
The wisdom in Tweetonomies
 

Recently uploaded

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 

Recently uploaded (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Eswc2013 audience short

  • 1. The Wisdom of the Audience: An Empirical Study of Social Semantics in Twitter Streams Claudia Wagner, Philipp Singer, Lisa Posch and Markus Strohmaier 10th Extended Semantic Web Conference, Montpellier, 29.5.2013
  • 2. Problem #music #fashion Authors make their messages as informative as required but do not provide more information than necessary (Maxim of Quantity by Grice (1975)) [src: http://www.techweekeurope.co.uk/wp-content/uploads/2012/07/Twitter.jpg]
  • 3. Research Questions3 RQ 1: To what extent is the background knowledge of audiences useful for analyzing the semantics of social media messages? RQ 2: What are the characteristics of an audience which possesses useful background knowledge for interpreting the meaning of a stream's messages and which types of streams tend to have useful audiences? [scr: http://www.teachthought.com/twitter-hashtags-for-teacher/]
  • 4. Methodology Message Classification Task Use hashtags as ground truth Laniado and Mika (2010) showed that around half of all hashtags can be associated with Freebase concepts Compare real audience with random audience - how well can an audience predict the hashtag of a tweet? The audience which is better in guessing the hashtag of a Twitter message is better in interpreting the meaning of the message Null hypothesis: If the audience of a stream does not possess more knowledge about the semantics of the stream's messages than a randomly selected baseline audience, the results from both classification models should not differ significantly 4
  • 5. Methodology Train different multiclass classifiers on the background knowledge of the audience Logistic Regression, Stochastic Gradient Descent, Multinomial Naive Bayes and Linear SVM Compare different approaches for estimating the background knowledge Different audience and content selection approaches Different methods for estimating the background knowledge Test how well each model can predict the hashtag of future messages Weighted Macro F1 5
  • 6. Dataset Diverse sample of hashtags Romero et al. (2011) identified eight categories of hashtags on a large data sample celebrity, games, idioms, movies/TV, music, political, sports, and technology We randomly draw from each category ten hashtags which were still in use 6
  • 7. Dataset7 Technology Idioms Sports Politics #blackbery, #iphone, #google #omgfacts, #factsaboutme, #iwish #football, #nfl, #yankees #climate, #iran, #teaparty Games Music Celebrity Movies #gaming, #mafiawars, #wow #lastfm, #eurovision, #nowplaying #bsb, #michaeljackson, #rogis #avatar, #tv, #glennbeck
  • 8. Dataset8 t0 t1 t2 3/4/2012 4/1/2012 4/29/2012 stream tweets crawl of social structure stream tweets crawl of social structure stream tweets crawl of social structure 1 week crawl of audience tweets crawl of audience tweets crawl of audience tweets t1 t2 t3 Stream Tweets 94,634 94,984 95,105 Stream Authors 53,593 54,099 53,750 Friends 7,312,792 7,896,758 8,390,143 Audience Tweets 29,144,641 29,126,487 28,513,876
  • 9. Audience Selection A B C AuthorsAudienceRank 1 2 3 Stream Team bc tryouts tomo #football What we learned this week: Chelsea are working in reverse and Avram is coming #football #soccer Weekend pleeeease hurrrrry #sanmarcos #football Holy #ProBowl I'm spent for the rest of the day. #football Fifa warns Indonesia to clean up its football or face sanctions #Indonesia #Football
  • 10. Background Knowledge Content Selection Recent The most recent messages authored by the audience users Top Links (plain and enriched) the messages authored by the audience which contain one of the top links of that audience Top Tags the messages authored by the audience which contain one of the top hashtags of that audience 10
  • 11. Background Knowlegde Representation Preprocessing: remove stopwords, twitter syntax, stemming Represent background knowledge of the audience via the most likely topics or most important words of their messages Bag of Words: TF and TFIDF Topic Models: LDA 11
  • 12. Empirical Evaluation RQ 1: To what extent does the background knowledge of the audience support the semantic annotation of individual messages? Combine audience selection and background knowledge estimation approaches to generate semantic features of the messages authored by an audience Training data on audience’s messages crawled at t0 Test model using messages of the hashtag streams crawled at t1 12
  • 13. Results13 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 Audience - recent 0.25 0.23 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 Audience – recent 0.25 0.23 Audience – top links enriched 0.13 0.10 Audience – top links plain 0.12 0.10 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 Audience – recent 0.25 0.23 Audience – top links enriched 0.13 0.10 Audience – top links plain 0.12 0.10 Audience – top tags 0.24 0.21 The audience of a hashtag stream contains knowledge which is useful for predicting the hashtags of future messages
  • 14. Results14 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 Audience - recent 0.25 0.23 F1 (TF-IDF) F1 (LDA) Random Guessing 1/78 1/78 Baseline (random audience) 0.01 0.01 Audience – recent 0.25 0.23 Audience – top links enriched 0.13 0.10 Audience – top links plain 0.12 0.10 F1 (TF-IDF) F1 (LDA) celebrity 0.17 0.15 games 0.25 0.22 idioms 0.09 0.05 movies 0.22 0.18 music 0.23 0.18 political 0.36 0.33 sports 0.45 0.42 technology 0.22 0.22
  • 15. Empirical Evaluation RQ 2: What are the characteristics of an audience which possesses useful background knowledge for interpreting the meaning of a stream's messages and which types of streams tend to have useful audiences? Correlation analysis between the ability of an audience to interpret the meaning of messages and structural properties of the stream 15
  • 16. Structural Stream Properties Static Measures Coverage: informational, hashtag, retweet and conversational extent of a stream Entropy: randomness of a stream's authors and their followers, followees and friends Overlap: overlap between authors and followers, authors and followees and authors and friends Dynamic Measures KL divergence between the author-, the follower-, and the friend-distributions of a stream at different time points 16
  • 17. Stat. Significant Spearman Rank Correlation (p<0.05)17 F1 (TF-IDF) F1 (LDA) Overlap Author-Follower 0.675 0.655 Overlap Author-Followee 0.642 0.628 Overlap Author-Friend 0.612 0.602 Streams which are produced and consumed by a community of users who are tightly interconnected tend to have a useful audience. A useful audience possesses background knowledge which helps interpreting the meaning of messages.
  • 18. Stat. Significant Spearman Rank Correlation (p<0.05)18 F1 (TF-IDF) F1 (LDA) Conversation Coverage 0.256 0.256 Conversational streams tend to have a useful audience.
  • 19. Stat. Significant Spearman Rank Correlation (p<0.05)19 F1 (TF-IDF) F1 (LDA) Entropy Author Distribution -0.270 -0.400 Entropy Friend Distribution -0.307 - Entropy Follower Distribution -0.400 -0.319 Entropy Followee Distribution -0.401 -0.368 Streams which are produced and consumed by a focused set of authors, followers, followees and friends tend to have a useful audience.
  • 20. Stat. Significant Spearman Rank Correlation (p<0.05)20 F1 (TF-IDF) F1 (LDA) KL Follower Distribution -0.281 - KL Followee Distribution -0.343 -0.302 KL Author Distribution -0.359 -0.307 Socially stable streams tend to have an audience which is good in interpreting the meaning of a stream's messages.
  • 21. Summary & Conclusions The audience of a social stream possesses knowledge which may indeed help to interpret the meaning of a stream's messages But not all streams have similar useful audiences The audience of a social stream seems to be most useful if the stream is created and consumed by a stable, focused and communicative community – i.e., a group of users who are interconnected and have few core users to whom almost everyone is connected We do not know if those relations are causal but we got similar results when repeating our experiments on t1 and t2 21
  • 22. Current and Future Work Compare the utility of ontological knowledge with audience background knowledge for the hashtag prediction task Algorithmic exploitation of our results Hybrid hashtag recommendation algorithm Structural stream measures may inform weighting (how much can we count on the audience?) Differentiate between social and topical hashtags User-centric algorithms work only for active users who used hashtags before An audience-integrated approach only requires an active audience 22
  • 23. References Grice, H. P. (1975). Logic and conversation. In Speech acts, 3, 41–58. New York: Academic Press. Laniado, D., & Mika, P. (2010). Making sense of twitter. In Proceedings of the 9th international semantic web conference (pp. 470-485). Shanghai, China. Romero, D. M., Meeder, B., & Kleinberg, J. (2011). Differences in the me-chanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter. In Proceedings of the 20th international conference on world wide web (pp. 695–704). Hyderabad, India. 24
  • 24. Experimental Setup src: http://adobeairstream.com/green/a-natural-predicament-sustainability-in-the-21st-century/ THANK YOU claudia.wagner@joanneum.at http://claudiawagner.info [src: http://www.crowdscience.com/2008/06/tips_and_more/]

Editor's Notes

  1. Solet‘sstartbyexploringwhatistheproblemwearetryingtoadresswiththiswork? So take a minutetoreadthose 2 sample tweetsandthinkabouthowyouwould tag them. Whataretopicsofthosetweets? You will noticeitisoftenprettyhardtopredictthetopicsof a tweetwithouthavingfurthercontextualinformation. So in thisexamplethesolutionismusicandfashion. Why do theauthors not providemorecontext? Makethemessageasshortaspossibleandasinformtiveasrequiredandtheyknow/estimatetheiraudience. But whatdoesthatmeaniftheymaketheirinformationaas informative asrequired? This impliesthatthey must estimatewhatthereaudienceknows.Theaudienceofthesecondauthorhere will knowwhois Rafael Cennamo. The audienceofthefirstauthor will knowthatsheis a singer. This examplenicelyshowsthattherearecertainpeoplewhohavetherightbackgroundknowledgetomake sense out ofthesparseinformation.
  2. So thisworkbuilds on thisintuitionthatifonehasaccesstothebackgroundknowledgeoftheaudiencethiswouldhelptointerpretthemeaningoftweetsandannotatethemwithsemantics. Andtherefore, we hypothesize that semantic technologies which aim to enrich social media with semantics would benefit from incorporating the background knowledge of the audience.Concretlyweexplorethefollowing 2 researchquestions
  3. Toadresstheseresearchquestionsweconduct a messageclassificationtofigure out howwellcantheaudiencepredictthehashtagsofstream‘smessages. Thatmeansweusedtweetsoftheaudienceofhashtagstreamsastrainingsamplesandthehashtagitselfaslabels. Wetestedhowwelltheaudience-informedclassifiercanpredictthehashtagoffuturemessagescomparedto a classifiertrained on messagesofrandomaudiences.
  4. Wetrained different classificationmodelsandfoundthat linear SVM worksbestandthereforereportthisresultshere. Toestimatethe BK foreachhashtagstreamwecompare different audienceandcontentselectionapproachesand different methodsforestimatingthe BK.
  5. As a datasetwecreated a diverse sample ofhahstags. Weusedthe 8 categoriesofhashtagswhichhavebeenidentifiedby Romero and Kleinberg in theirpreviousworkanddrawfromeachcategory 10 hashtags (weonlyusedhashtagswhichare still active).We bias our random sample towards active hashtag streams by re-sampling hashtags for which we found less than 1,000 messages when crawling (4. March 2012). For those categories for which we could not find 10 hashtags which had more than 1,000 messages (games and celebrity), we select the most active hashtags per category (i.e., the hashtags for which we found the most messages). Since two hashtags #bsb and #mj appeared in the sample of two different categories, we ended up having a sample of 78 different hashtags (see Table 1).
  6. Hereyoucanseesome sample hashtagsforeachcategorywhichareinluded in our sample. Weendeduphaving 78 hashtagcatgeories.
  7. After having selected a diverse sample of hashtags we started the crawling process. We crawled the data for each hastag at 3 different time points. First we retrieved hashtag-stream tweets that were authored within the last week. Next we extracted the authors of those tweets and crawled their SN. Finally, we also crawled the most recent 3,200 tweets (or less if less were available) of all users who belong either to the top hundred authors or audience users of each hashtag stream. We ranked authors by the number of tweets they published within the hashtag stream and audience users by the number of authors they are friends with. We repeated this process three times.
  8. To estimate the audience of a hashtag stream, we rank the friends (or followers) of the stream&apos;s authors by the number of authors they are related with. In this example, the hashtag stream #football has four authors. User B is a friend of all four authors of the stream and is therefore most likely to be exposed to the messages of the stream and to be able to interpret them. Consequently, user B receives the highest rank. User C is a friend of two authors and receives the second highest rank. The user with the lowest rank (user A) is only the friend of one author of the stream.
  9. After havingselectedtheaudienceusers per hashtagstream, we also need a waytoselectthecontentsourcewhichwewanttouselaterforestimatingtheaudience‘s BK. Wetested 4 different approacheshere.
  10. Finnaly, after havingselectedthe top audienceusersandtheircontentof a hashtagstreamwe also need a waytoabstractthecontentandrepresentit in form offeatures
  11. Toanswerour 1st RQ wecombine different audienceselectionand BK estimationstrategiesandtrain a linear SVM usingmessagesauthoredbytheaudienceat t0 orbeforeastrainingsamples. Weextractfeaturesfromthosemessagesbyusingtfidfandtopicmodel. Wetesttheperformanceoftheclassifiersusingfuturehashtagstreammessages.
  12. Ok let‘slookattheresultsfromthisstudy. First notethatsincewehave 78 classes a randomguesserwouldachieve a verylowperformance. Weusedas a baseline a classifiertrainedwith a randomaudience. Thatmeanswerandomlyswappedaudiencesandhashtagstreamsbeforetraining. Thereforewedistroyedtherelationbetweentrainingdataandlabels. A baselineclassifiercanachieve an F1 score of 0.01. Whenusingthe real audienceof a hashtagstream (the top 10 friends) andtheirmostrecentlypublishedmessagesweachive a performanceof 0.25. This issignificvantlybetter.
  13. So I showedyoutheoverallperformance so far, but we also lookedattheperformanceof individual categoriesofhashtagsusingthebestmethod (mostrecentmessagesauthoredbythe top audience). Onecanseefromthisthatthereisone negative outlier–idioms. Sincethosearethehashtagswhichare not representingsemanticconceptsweare not so worriedaboutthat.
  14. Since we observed that the audience of certain hashtag streams is more useful than for others we wanted to know what types of streams tend to have useful audiences. Therefore we performed a correlation analysis btw… we don’t want to claim that there is a causal relation but we repeated our exp at different time points and observed the same correlations.
  15. Toquantifystructuralpropertiesofsocialstreamsweuse a coupleofstaticanddynamicstreammeasures.
  16. Across all categories, the overlapmeasures show the highest positive correlation with the F1-scores. This indicates that streams which..
  17. The only significant coverage measure is the conversational coverage measure. It indicates that the audiences of conversational streams are better in interpreting the meaning of a stream&apos;s messages. This suggests that it is not only important that a community exists around a stream, but also that the community is communicative.
  18. All entropy measures show significant negative correlations with the F1-Scores. This shows that the more focused the author-, follower-, followee- and/or friend-distribution of a stream is (i.e., lower entropy), the higher the F1-Scores of an audience-based classification model are. The entropy measures the randomness of a random variable. For example, the author-entropy describes how random the set of authors around a hashtag stream is formed – i.e., how well one can predict who will author the next message in this hashtag stream. The friend-entropy describes how random the friends of hashtag stream&apos;s authors are – i.e., how well one can predict who will be a friend of most hashtag stream&apos;s authors. Our results suggest that streams tend to have a better audience if their authors and author&apos;s followers, followees and friends are less random.
  19. The KL divergences of the author-, follower-, and followee-distributions show a significant negative correlation with the F1-Scores. The author distribution of a hashtag stream reveals how many messages each author contributed. The follower distribution reveals how many authors a follower is followiing. The Followee distribution reveals how many authors are following a folllowee.This indicates that the more stable the author, follower and followee distribution is over time, the better the audience of a stream. If for example the followee distribution of a stream changes heavily over time, authors are shifting their social focus. If the author distribution of a stream has a high KL divergence, this indicates that the sets of authors of streams are changing over time.
  20. To summerize our work provides initial empirical evidence for the fact that… And we also found that certain stream characteritiscs are strongly correlated with the usefulness of the audience. We do not know if this is a causal relation but we repeated our experiment on a later time point and found compareable results.
  21. Currentlyand in thenearfuturewework on comparingtheutilityof .... Andwe also wanttoexploitourresultsfordeveloping a newhashtagrecommendationalgorithm.
  22. To understand whether the structure of a stream has an effect on the usefulness of its audience for interpreting the meaning of its messages, we perform a correlation analysis and investigate to what extent the ability of an audience to interpret the meaning of messages correlates with structural stream properties introduced in Section Structural Stream Measures.