This document describes a project to analyze sentiment and detect topics in posts on Twitter and Facebook. It uses natural language processing and sentiment analysis techniques to classify each post into one or more of four DNA themes (Creativity, Spirit of Enterprise, Freedom of Thought, Civic Virtue) and to determine whether its sentiment is positive, negative, or neutral. The results will be shared with other work packages to power a story engine and location-based data visualizations. Challenges include incorrect grammar in social media posts and detecting sentiment when sarcasm is used.
1. Topic & Sentiment Detection on Twitter and Facebook
Sylvia van Schie Wouter Stuifmeel Vincent Velthuis
Knowledge Based Media Systems | March 2013
2. Index
1. Project
2. Processing
3. Method
4. Sentiment Analysis
5. Topic Detection
6. Keywords
7. Example
8. Collaboration
9. Difficulties
Knowledge Based Media Systems | Topic & Sentiment Detection on Twitter and Facebook | S. van Schie, W. Stuifmeel and V. Velthuis | March 2013
3. Project
DNA Themes
– Freedom of Thought
– Creativity
– Spirit of Enterprise
– Civic Virtue
• Access information using APIs (Facebook, Twitter, Google Places)
• Classify posts/tweets to themes
• Classify user’s sentiment
4. Processing: General opinion of core themes within Amsterdam
5. Method
• [Twitter only] Insert hashtags into the “term table” and replace hashtags with regular words
• Delete URLs and links from the text
• Natural Language Processing
• Sentiment Detection
• Delete irrelevant terms using NLP
• Insert remaining terms into the “term table”
• Look up the contents of the “term table” in WordNet
• Classify to themes
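The first two steps of this method could be sketched as follows. This is a minimal illustration in Python, not the project's implementation: the function name `preprocess`, the two regular expressions, and the use of a plain list as the “term table” are all assumptions.

```python
import re

# Assumed patterns; the slides do not specify how URLs or hashtags are matched.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def preprocess(text):
    """Return (cleaned_text, term_table) for one post or tweet."""
    # Delete URLs and links from the text.
    text = URL_RE.sub("", text)
    # Insert hashtags into the term table (lower-cased, without '#').
    term_table = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    # Replace each hashtag with its regular word.
    text = HASHTAG_RE.sub(r"\1", text)
    # Collapse whitespace left behind by the deletions.
    return " ".join(text.split()), term_table


cleaned, terms = preprocess(
    "The exhibit about slavery was really great! #Amsterdam #YOLO http://example.com/x"
)
print(cleaned)  # The exhibit about slavery was really great! Amsterdam YOLO
print(terms)    # ['amsterdam', 'yolo']
```

In this sketch URLs are removed before hashtags are collected, so that a URL fragment such as `#section` is not mistaken for a hashtag. The slides list the two steps in the opposite order.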
6. Sentiment Analysis
OpinionFinder (Wiebe & Mihalcea)
Sentiment Polarity
Adjectives, Adverbs and Verbs (NLP)
Positive [+], Negative [-] or Neutral [0]
Weak (+/- 1) or strong (+/- 5) Sentiment
For example:
• “Great” is strong positive (+5); “dull” is weak negative (-1)
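A minimal sketch of how such a weak/strong scoring scheme might be applied. It uses a tiny hand-made lexicon rather than OpinionFinder itself; only the scores for “great” and “dull” come from the slide, and the other entries, the function names, and the choice to sum the word scores are assumptions.

```python
# Hypothetical mini-lexicon: +/-5 for strong, +/-1 for weak sentiment.
# Only "great" (+5) and "dull" (-1) are taken from the slides.
LEXICON = {"great": 5, "dull": -1, "terrible": -5, "nice": 1}


def score_words(words):
    """Sum the lexicon scores; words not in the lexicon count as neutral (0)."""
    return sum(LEXICON.get(w.lower(), 0) for w in words)


def polarity(score):
    """Map a numeric score to the labels '+', '-' or '0'."""
    if score > 0:
        return "+"
    if score < 0:
        return "-"
    return "0"


print(polarity(score_words(["great"])))           # +
print(polarity(score_words(["dull"])))            # -
print(polarity(score_words(["exhibit", "was"])))  # 0
```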
7. Topic Detection
Topics are the 4 DNA themes
• Semantic deduction via WordNet
• Compare WordNet meaning with list of keywords related to DNA themes
• Classify to themes according to overlap
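The overlap-based classification could look roughly like the sketch below. It assumes the WordNet lookup has already produced a set of terms for the post, and it uses abbreviated keyword sets drawn from the appendix of core themes. The function name `classify_themes` and the rule "keep every theme with at least `min_overlap` shared terms" are assumptions, since the slides do not state a threshold.

```python
# Abbreviated keyword lists taken from the appendix of core themes.
THEME_KEYWORDS = {
    "Creativity": {"museum", "exhibition", "gallery", "painting", "music"},
    "Spirit of Enterprise": {"trade", "commerce", "slavery", "shipping", "profit"},
    "Freedom of Thought": {"tolerance", "freedom", "censorship", "opinion"},
    "Civic Virtue": {"government", "society", "community", "union", "nation"},
}


def classify_themes(terms, min_overlap=1):
    """Return {theme: overlap_count} for themes sharing enough terms."""
    terms = {t.lower() for t in terms}
    result = {}
    for theme, keywords in THEME_KEYWORDS.items():
        overlap = len(terms & keywords)
        if overlap >= min_overlap:
            result[theme] = overlap
    return result


# Terms as they might look after WordNet expansion (assumed input).
print(classify_themes({"exhibit", "museum", "slavery"}))
# {'Creativity': 1, 'Spirit of Enterprise': 1}
```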
8. Core Themes of DNA (#keywords)
Appendix A: Core Themes of DNA
Civic Virtue (Burgerschap)
Church / Cathedral / Basilica / Chapel / Mosque / Synagogue
Catholic / Protestant / Jewish / Muslim / Semitic / Religion / God
Politics / Politicians / Government / Civic government / Society
Social support / Social Security / Payment / Community / Clan /
Alliance / Union / Nation
Freedom of Thought (Vrijdenken)
Church / Cathedral / Basilica / Chapel / Mosque / Synagogue
Catholic / Protestant / Jewish / Muslim / Semitic / Religion / God
Religious / Tolerance / Free Thinking / Open-mindedness /
Understanding / Sympathy / Freedom / Humanity / Mercy /
Kindness / Sympathy / Charity / Compassion / Drugs / Weed /
Doges / Coffeeshop / Marijuana / Dope / Legalisation /
Prostitution / Red Light District / Hookers / Whores / Prostitutes /
Legalisation / Squatters / Squat / Homesteader / Settler /
Homosexuality / Gay Marriage / Homophile / Free Speech /
Uncensored / Censorship / Opinion / Liberty / Education /
Teaching / School / University / Instruction
Creativity (Creativiteit)
Artists / Painters / Sculptors / Designers / Composer /
Inventor / Creator / Painting / Building / Statue /
Inventions / Inventive / Music / Sonnet / Composition /
Concert / Symphony / Musician / Song
Museum / Exhibition / Library / Gallery / Concert hall
Academics / University / Researchers / Science
Creativity / Culture / Original / Vision / Inspiration /
Innovation / Creative Minds / Rembrandt / van Gogh /
Vermeer / Brood / Multicultural / Cultural
Spirit of Enterprise (Ondernemerschap)
Settlement / City / Town / Harbour
Commerce / Trade / Economics / World Trade
Agriculture / Farming / Fishery / Cultivation
Transportation / Import / Export / Shipping / Shipment
Spirit of enterprise / Commercial enterprise / Golden Age /
Ship / VOC / EAC / Finance / Money / Profit / Capital /
Resources / Wealth / Slavery / Exploitation /
Enslavement / Servitude / Innovation / Industry /
Business / Organization / Entrepreneurship
9. Example (1)
Wouter (@myshuno), 1 day ago:
The exhibit about slavery was really great! #Amsterdam #YOLO
• “The exhibit about slavery was really great.”
• S( NP(( DT NN ) Adj ( NN )) VP( VN( Adv Adj )))
• Sentiment analysis (SA) looks at Adj, Adv and VN
– about = neutral (0)
– was = neutral (0)
– really = ‘booster’ for great
– great = strong positive (+5)
– Overall sentiment is positive (+)
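The worked example can be traced in a few lines of Python. The tokens are tagged by hand with the tags shown on the slide, so no parser is involved. The names `TAGGED`, `SCORES`, `BOOSTERS` and `analyse` are hypothetical, and the booster rule (a booster promotes a weak ±1 score on the next scored word to a strong ±5 and leaves strong scores unchanged) is an assumption, because the slides do not say how a booster changes the score.

```python
# Hand-tagged tokens for the example sentence, using the tags from the slide.
TAGGED = [
    ("The", "DT"), ("exhibit", "NN"), ("about", "Adj"), ("slavery", "NN"),
    ("was", "VN"), ("really", "Adv"), ("great", "Adj"),
]
SENTIMENT_TAGS = {"Adj", "Adv", "VN"}  # tags the sentiment analysis inspects
SCORES = {"great": 5}                  # unlisted words are neutral (0)
BOOSTERS = {"really"}                  # boosters carry no score of their own


def analyse(tagged):
    """Return (per-word scores, overall label) for one tagged sentence."""
    per_word = {}
    boosted = False
    for word, tag in tagged:
        if tag not in SENTIMENT_TAGS:
            continue
        w = word.lower()
        if w in BOOSTERS:
            boosted = True  # applies to the next scored word
            continue
        score = SCORES.get(w, 0)
        # Assumption: a booster promotes a weak score (+/-1) to strong (+/-5).
        if boosted and abs(score) == 1:
            score *= 5
        boosted = False
        per_word[w] = score
    total = sum(per_word.values())
    label = "+" if total > 0 else "-" if total < 0 else "0"
    return per_word, label


print(analyse(TAGGED))  # ({'about': 0, 'was': 0, 'great': 5}, '+')
```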
10. Example (2)
• Insert nouns (NN) into the “term table”
– ‘exhibit’ and ‘slavery’
• WordNet semantics
– S: (n) display, exhibit, showing (something shown to the public)
"the museum had many exhibits of oriental art"
– S: (n) slavery, slaveholding (the practice of owning slaves)
• Keyword comparison gives us two themes:
– Creativity and Spirit of Enterprise
11. Collaboration (1)
WP2: Story Engine
– Input: DNA Themes
– Input: User’s sentiment
Sentiment per DNA theme:
DNA theme            | Positive | Negative | Neutral
Creativity           | CR[+]    | CR[-]    | CR[0]
Spirit of Enterprise | EN[+]    | EN[-]    | EN[0]
Freedom of Thought   | FT[+]    | FT[-]    | FT[0]
Civic Virtue         | CV[+]    | CV[-]    | CV[0]
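The codes in this table could be produced by joining a theme abbreviation with the sentiment label. The sketch below is illustrative only: the function name `output_codes` is hypothetical, and it assumes that the overall sentiment of a post applies to every theme detected in it, which the slides do not state explicitly.

```python
# Theme abbreviations as used in the table.
THEME_CODES = {
    "Creativity": "CR",
    "Spirit of Enterprise": "EN",
    "Freedom of Thought": "FT",
    "Civic Virtue": "CV",
}


def output_codes(themes, polarity):
    """Combine each detected theme with the sentiment label, e.g. 'CR[+]'."""
    if polarity not in {"+", "-", "0"}:
        raise ValueError("polarity must be '+', '-' or '0'")
    return [f"{THEME_CODES[theme]}[{polarity}]" for theme in themes]


print(output_codes(["Creativity", "Spirit of Enterprise"], "+"))
# ['CR[+]', 'EN[+]']
```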
12. Collaboration (2)
WP4: Presentation
– Input: Timestamp
– Input: GPS
• Easy to extract via Facebook and Twitter APIs
• Coordinates by Google Places
13. Collaboration (2)
14. Difficulties
• Grammar is not always correct on Facebook and Twitter
– Ignore posts that do not yield results
• Sentiment is not always easy to classify (for example, because of sarcasm)
– Default sentiment is neutral [0]
– Still work in progress in the field