A Study on the Spatio-Temporal Trend of Brand Index using Twitter Messages Sentiment Analysis
Abstract
Twitter Data
Fields: Social Science, Human, Art, Medical, Economy
Sentiment Analysis
Introduction
• Twitter Crawling
• Data Pre-processing
• Korean Morphology Analysis
• Twitter Opinion Mining
• Sentiment Dictionary
• Evaluating performance of candidate classifiers
• Sentiment Classification
• Visualize Associative Relationship of Terms
• Relationship with Brand Index
Twitter Crawling
Twitter API
• Streaming API: gets 1% of all Twitter data in real time
• REST API (Search API): gets Twitter data matching a keyword
Collection period: 2013.9.9 (Mon) 9:35 pm ~ now
About 10,000 ~ 15,000 tweets per day
Total 1,220,000 tweets (as of 2013.11.2, Sat)
Data Pre-Processing
• Keep only tweets that contain more than 3 Korean characters and were posted within a 500 km radius of Seoul, Korea.
• Remove foreign languages and special characters.
• Remove tweets that contain only location information.
• Remove retweets.
Examples of removed tweets:
‫ويتكلم‬ ‫نهائيا‬ ‫السمع‬ ‫فقد‬ ‫متعب‬ ‫ابو‬ ‫الملك‬ ‫ان‬ ‫خبر‬ ‫اكد‬ ‫المستوى‬ ‫رفيع‬ ‫وامير‬ ‫موثوق‬ ‫صدر‬
‫مفهوم‬ ‫وغير‬ ‫مترابط‬ ‫غير‬ ‫كالم‬((‫تخريف‬::)) Sat Oct 12 00:06:37 KST 2013
I'm at Club ELLUI - @ellui_seoul (서울특별시) w/ 2
others http://t.co/zhcrncosKH::Sat Oct 12 00:02:06 KST 2013
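The filtering rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the crawler used in the study: the Seoul centre point, the "RT @" retweet check, the "I'm at " check-in prefix, and the choice to reject tweets without coordinates are all hypothetical heuristics introduced here for illustration.

```python
import math
import re

SEOUL = (37.5665, 126.9780)  # assumed centre point (latitude, longitude)
HANGUL_SYLLABLE = re.compile(r"[\uac00-\ud7a3]")  # precomposed Hangul syllables
# Anything that is not Hangul (syllables or compatibility jamo), a digit, or whitespace.
NOT_KOREAN = re.compile(r"[^\uac00-\ud7a3\u3131-\u318e0-9\s]")
NOISE = re.compile(r"https?://\S+|@\w+")  # URLs and @mentions


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def is_retweet(text):
    # Hypothetical heuristic: classic retweets start with "RT @".
    return text.startswith("RT @")


def is_location_only(text):
    # Hypothetical heuristic: check-in tweets like the example start with "I'm at ".
    return text.startswith("I'm at ")


def keep_tweet(text, coords, min_hangul=4, radius_km=500.0):
    """Return True if the tweet passes every filter; min_hangul=4 means 'more than 3'."""
    if coords is None or haversine_km(SEOUL, coords) > radius_km:
        return False
    if is_retweet(text) or is_location_only(text):
        return False
    return len(HANGUL_SYLLABLE.findall(text)) >= min_hangul


def clean_text(text):
    """Drop URLs, mentions, foreign letters, and special characters."""
    text = NOISE.sub(" ", text)
    return " ".join(NOT_KOREAN.sub(" ", text).split())


if __name__ == "__main__":
    print(keep_tweet("오늘 날씨 정말 좋다", (37.5, 127.0)))
    print(clean_text("좋아요!!! lol ㅋㅋ http://t.co/abc"))
```

Note that the check-in example contains five Korean characters (서울특별시), so the character-count rule alone would not remove it; a separate location-only rule is needed.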
Korean Morpheme Analyzer
• 꼬꼬마 (Kkma) Korean Morpheme Analyzer
• 한나눔 (Hannanum) Korean Morpheme Analyzer
• Komoran Korean Morpheme Analyzer
• Lucene Korean Analyzer
• 은전한닢 (Eunjeon Hannip) Korean Morpheme Analyzer
• Performance of the analyzer
• Foreign language and slang tagging
• Sentiment-related word tagging (slang, verb, emoticon)
• It has a good dictionary.
• No need to worry about word spacing.
• However, it cannot recognize many emoticons, or metaphor, sarcasm, and irony.
Korean Morpheme Analyzer
> 배가 아파서 병원에 갔다. ("I went to the hospital because my stomach hurt.")
배 NN,F,배,*,*,*,*,*
가 JKS,F,가,*,*,*,*,*
아파서 VA+EC,F,아파서,Inflect,VA,EC,아프/VA+ㅏ서/EC,*
병원 NN,T,병원,*,*,*,*,*
에 JKB,F,에,*,*,*,*,*
갔 VV+EP,T,갔,Inflect,VV,EP,가/VV+ㅏㅆ/EP,*
다 EF,F,다,*,*,*,*,*
. SF,*,*,*,*,*,*,*
EOS
Word categories: Noun, Verb, Adjective, Adverb, Root
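A small Python sketch of how analyzer output in the format shown above could be reduced to content words. It assumes each line is "surface, whitespace, comma-separated features" with the tag first and the decomposed expression in the seventh field, as in the sample. The tags MAG (adverb) and XR (root) do not appear in the sample and are assumed from the Sejong tag set; the function names are hypothetical.

```python
# Map of leading tags to the word categories listed on the slide.
# NN, VV, VA appear in the sample output; MAG and XR are assumptions.
CATEGORIES = {"NN": "noun", "VV": "verb", "VA": "adjective",
              "MAG": "adverb", "XR": "root"}


def parse_analysis(output):
    """Turn analyzer output into a list of (surface, feature-list) pairs."""
    tokens = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line == "EOS":  # EOS marks the end of a sentence
            continue
        surface, features = line.split(None, 1)
        tokens.append((surface, features.split(",")))
    return tokens


def content_words(tokens):
    """Keep nouns, verbs, adjectives, adverbs, and roots, using base forms."""
    words = []
    for surface, fields in tokens:
        tag = fields[0].split("+")[0]  # "VA+EC" -> "VA"
        if tag not in CATEGORIES:
            continue
        lemma = surface
        # Inflected forms carry a decomposition such as "아프/VA+ㅏ서/EC".
        if len(fields) > 6 and fields[6] != "*":
            lemma = fields[6].split("+")[0].split("/")[0]
        words.append((lemma, CATEGORIES[tag]))
    return words


SAMPLE = """배 NN,F,배,*,*,*,*,*
가 JKS,F,가,*,*,*,*,*
아파서 VA+EC,F,아파서,Inflect,VA,EC,아프/VA+ㅏ서/EC,*
병원 NN,T,병원,*,*,*,*,*
에 JKB,F,에,*,*,*,*,*
갔 VV+EP,T,갔,Inflect,VV,EP,가/VV+ㅏㅆ/EP,*
다 EF,F,다,*,*,*,*,*
. SF,*,*,*,*,*,*,*
EOS"""

if __name__ == "__main__":
    print(content_words(parse_analysis(SAMPLE)))
```

Particles (JKS, JKB), endings (EF), and punctuation (SF) are dropped, which leaves the words that can carry sentiment or name an attribute.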
Building Sentiment Dictionary
1. Manually labeled Twitter data
• 6 days of Twitter data (2013.9.9, 9.16, 9.23, 9.30, 10.7, 10.14)
• Labeled positive and negative sets of Noun, Adjective, Verb, Root (8 sets in total)
• Labeled by 4 people
2. Naver Movie review data with ratings
• 20,000 reviews from 2 movies
• 545 positive, 545 negative, 545 neutral
[Figure: rating distributions (ratings 1 to 10) for Movie 1 and Movie 2; legend: Positive, Negative]
Sentiment Classification
• SVM Classifier
  1. Training set: 150 positive, 150 negative (Twitter data)
  2. Test set: 545 positive, 545 negative (movie review data)
  Accuracy = 70.64220183486239% (770/1090) (classification)
  Mean squared error = 1.1743119266055047 (regression)
  Squared correlation coefficient = 0.18400994471523438 (regression)
• Naïve Bayes Classifier
• SO-PMI Classifier
Building Sentiment Dictionary
Unlabeled & labeled data set -> Ternary classifier (Naïve Bayes, SO-PMI, SVM)
• SO-PMI -> Positive set, Negative set, Neutral set
• SVM -> Positive set, Negative set, Neutral set
• Naïve Bayes -> Positive set, Negative set, Neutral set
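To illustrate one of the ternary classifiers named above, here is a minimal multinomial Naïve Bayes sketch in Python with add-one smoothing. It is an illustration under assumptions, not the classifier used in the study: the class name, the smoothing constant, and the toy English training documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # additive smoothing constant
        self.class_counts = Counter()           # documents per class
        self.word_counts = defaultdict(Counter) # token counts per class
        self.vocab = set()

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens unseen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example.
DOCS = [["good", "nice"], ["pretty", "lovely", "good"],
        ["bad", "terrible"], ["terrible", "bad", "bad"],
        ["battery", "price"], ["price", "lcd"]]
LABELS = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

if __name__ == "__main__":
    model = NaiveBayes().fit(DOCS, LABELS)
    for doc in (["nice", "good"], ["terrible"], ["price", "lcd"]):
        print(doc, "->", model.predict(doc))
```

In practice the token lists would be the content words produced by the morpheme analyzer rather than English glosses.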
Sentiment of Brand Index
Brand (keyword): Samsung Galaxy S2
Related nouns (attributes): Battery, LCD, Price, ...
Words correlated with each attribute: Adjective, Verb, Noun, Adverb, ... (e.g. good, nice)
Positive words: nice, pretty, lovely, ...
Negative words: bad, terrible, ...
Determining objectivity: PMI(word, pword) + PMI(word, nword)
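The PMI quantities above can be estimated from co-occurrence counts. The sketch below uses the common definition PMI(w1, w2) = log2( p(w1 & w2) / (p(w1) p(w2)) ) over tweet-level co-occurrence, and reports both the difference of the positive and negative associations (the usual SO-PMI orientation, an assumption here since the slide does not state it) and their sum, as written on the slide. The 0.01 smoothing constant, the function names, and the toy documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count in how many documents each word and each word pair occurs."""
    word_df, pair_df = Counter(), Counter()
    for tokens in docs:
        uniq = sorted(set(tokens))
        word_df.update(uniq)
        pair_df.update(combinations(uniq, 2))  # pairs are stored in sorted order
    return word_df, pair_df, len(docs)


def pmi(w1, w2, word_df, pair_df, n_docs, eps=0.01):
    """log2 of p(w1 & w2) / (p(w1) * p(w2)); eps avoids log(0) and division by zero."""
    pair = tuple(sorted((w1, w2)))
    p_joint = (pair_df[pair] + eps) / n_docs
    p1 = (word_df[w1] + eps) / n_docs
    p2 = (word_df[w2] + eps) / n_docs
    return math.log2(p_joint / (p1 * p2))


def associations(word, pos_seeds, neg_seeds, counts):
    """Return (association with positive seeds, association with negative seeds)."""
    pos = sum(pmi(word, s, *counts) for s in pos_seeds)
    neg = sum(pmi(word, s, *counts) for s in neg_seeds)
    return pos, neg


# Toy tokenized tweets, invented for this example.
DOCS = [["battery", "good"], ["battery", "nice", "good"],
        ["price", "bad"], ["price", "terrible", "bad"],
        ["lcd", "good"], ["lcd", "bad"]]
POS_SEEDS = ["good", "nice"]
NEG_SEEDS = ["bad", "terrible"]

if __name__ == "__main__":
    counts = build_counts(DOCS)
    for word in ("battery", "price", "lcd"):
        pos, neg = associations(word, POS_SEEDS, NEG_SEEDS, counts)
        print(word, "orientation:", round(pos - neg, 3), "sum:", round(pos + neg, 3))
```

In this toy corpus "battery" leans positive, "price" leans negative, and "lcd" co-occurs equally with both seed sets, so its orientation is zero.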
Scenario


Editor's Notes

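  1. The start and spread of SNS (social network services) led to the emergence of personal big data, and data mining on big data came to the fore. Twitterology, a new field of study, has emerged: a coinage meaning "the study of Twitter", formed from Twitter, a social network service (SNS), and the suffix -logy, which denotes a field of study. Real-time information from Twitter is used in research in sociology, economics, medicine, linguistics, and other fields.
  2. Implemented the Streaming API (real time) and the REST API (420 requests per 15 minutes; requesting every 15 minutes returns 420 items) using the Twitter4J library. Only 1% of all data can be received (Seungwoo presents). Collection started on September 9 at 9:35 pm and is still running, with 10,000 to 15,000 tweets per day on average. As of 2013.11.2, 1.22 million tweets have been accumulated.
  3. Tweets with 3 or fewer Korean characters are not collected (special characters, English, and Japanese are all stripped out). Location information such as imap is removed. Only data within a 500 km radius of Seoul is collected (otherwise tweets from all over the world come in; this restricts collection to tweets from Korea).
  4. 은전한닢 (Eunjeon Hannip) morpheme analyzer, integrated with Java on Linux.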
  5. 1. Training set: positive = results of a DB search for '좋' ("good"), 150 of them; negative = results of a DB search for '싫' ("dislike"), 150 of them.
     2. Test set: positive = 545 movie reviews; negative = 545 movie reviews.
     Result when movie reviews that match nothing in the dictionary are also included:
     optimization finished, #iter = 73
     nu = 0.16326140616206591
     obj = -32.23746306073249, rho = 0.11723225832508417
     nSV = 61, nBSV = 38
     Total nSV = 61
     Accuracy = 70.64220183486239% (770/1090) (classification)
     Mean squared error = 1.1743119266055047 (regression)
     Squared correlation coefficient = 0.18400994471523438 (regression)
  6. p(word1 & word2) is the probability that word1 and word2 co-occur. The ratio of p(word1 & word2) to p(word1) p(word2) is a measure of the degree of statistical dependence between the words. The log of the ratio corresponds to a form of correlation.
  7. Scenario: a company that published a clarification article after damaging press coverage.