Sentiment and Affect Analysis of Dark Web Forums: Measuring Radicalization on the Internet
Hsinchun Chen, Fellow, IEEE
Introduction Web forums offer participants a medium to express their opinions and emotions freely in discussion. Extremist and terrorist groups also use web forums to build community and to express and disseminate their ideologies and propaganda. Such forums are often referred to as being part of the Dark Web.
Introduction Information contained within Dark Web forums represents a significant source of knowledge for security and intelligence organizations. The opinions and emotions expressed within these forums provide valuable insights into the nature and position of the online community and help characterize individual participants. Manual analysis of the vast quantities of messages to measure the opinions and emotions expressed is often infeasible.
Introduction This paper presents an automated approach to sentiment and affect analysis of two Dark Web forums related to the Iraqi insurgency and Al-Qaeda. The automated approach utilizes a rich set of textual features and machine learning techniques.
Related Work Sentiment and affect analysis are related tasks in text mining that focus on directional text, that is, text containing opinions, emotions, and biases. [5] M. A. Hearst, “Direction-based text interpretation as an information access refinement,” in Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Lawrence Erlbaum Associates, 1992. [6] J. Wiebe, “Tracking point of view in narrative,” Computational Linguistics, vol. 20 (2), pg. 233-287, 1994.
Related Work Sentiment analysis attempts to identify, analyze, and measure opinions expressed in text. Affect analysis focuses on the emotional content of the communication. R. Agrawal, S. Rajagopalan, R. Srikant, and Y. Xu, “Mining newsgroups using networks arising from social behavior,” Proc. of the 12th Int’l WWW Conf., 2003. P. Subasic and A. Huettner, “Affect analysis of text using fuzzy semantic typing,” IEEE Trans. Fuzzy Systems, vol. 9 (4), pg. 483-496.
Related Work There are some important distinctions between the two. Affect analysis evaluates the intensity of a number of potential emotions, including happiness, sadness, anger, and fear. Sentiment analysis considers the polarity of opinions along a positive-neutral-negative continuum. The words and phrases associated with sentiments are mutually exclusive, whereas segments of text can convey multiple affects.
Related Work Researchers have used various machine learning approaches to perform automated sentiment and affect analysis. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” Proc. Empirical Methods in Natural Language Processing, pg. 79-86, 2002. R. W. Picard, E. Vyzas, and J. Healey, “Toward machine emotional intelligence: analysis of affective physiological state,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23 (10), pg. 1179-1191, 2001.
Related Work In particular, the support vector machine (SVM) learning approach has been shown to be effective in determining whether a text segment contains an expression of a particular affect class. However, SVM classification handles only discrete labels. Y. H. Cho and K. J. Lee, “Automatic affect recognition using natural language processing techniques and manually built affect lexicon,” IEICE Trans. Information Systems, vol. E89 (12), pg. 2964-2971, 2006.
Related Work Support vector regression (SVR) is an alternative approach that is capable of predicting continuous sentiment and affect intensities while benefiting from the robustness of SVM. A. Webb, Statistical Pattern Recognition. John Wiley & Sons, 2002.
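The contrast between discrete and continuous targets can be seen in the loss functions themselves. The sketch below uses the standard textbook forms of the hinge loss (SVM classification) and the epsilon-insensitive loss (SVR). The function names and the default epsilon of 0.1 are illustrative assumptions, not values taken from the paper.

```python
def hinge_loss(label: int, score: float) -> float:
    """Standard SVM hinge loss; the label must be the discrete value -1 or +1."""
    if label not in (-1, 1):
        raise ValueError("SVM classification expects a label of -1 or +1")
    return max(0.0, 1.0 - label * score)


def epsilon_insensitive_loss(target: float, prediction: float,
                             epsilon: float = 0.1) -> float:
    """Standard SVR loss; any real-valued target is allowed.

    Errors smaller than epsilon (an assumed value here) cost nothing,
    larger errors are penalized linearly.
    """
    return max(0.0, abs(target - prediction) - epsilon)
```

With the regression loss, an intensity such as 0.8 can be used directly as the training target, whereas the classification loss would first require it to be collapsed into a present/absent label.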
Research Questions In a recent book, Ryan highlights the critical role that web forums play in militant Islamic radicalization on the Internet. Marc Sageman, an internationally renowned terrorism study consultant, also emphasizes the importance of the Internet, especially forums. This paper presents our web mining research on sentiment and affect analysis of two large-scale, international Jihadist forums.
Research Questions This study seeks to answer the following research questions: (1) How effective are automated methods of sentiment and affect analysis in measuring the polarities of opinions and intensities of emotions in Dark Web forums? (2) What insights into the Dark Web forums are gained by performing sentiment and affect analysis?
Data Two Dark Web forums were selected for sentiment and affect analysis: Al-Firdaws (www.alfirdaws.org/vb) and Montada (www.montada.com). Al-Firdaws is a more radical forum, with considerable content dedicated to support of the Iraqi insurgency and Al-Qaeda. Montada is a general discussion forum with content pertaining to a variety of social and religious issues. Domain experts consider Montada to be more moderate than Al-Firdaws, with less radical content.
Data Spidering programs were used to collect the content from the two web forums. A summary of the collection statistics is presented in Table I. Data set is larger. An older forum. Al-Firdaws is too radical.
Data Both Al-Firdaws and Montada are major forums for their respective purposes and communities, with relatively high membership levels and numerous authors.
Data In both cases postings are more evenly distributed across web forum threads. Although the Montada forum has a larger average number of posts per thread compared to Al-Firdaws, the median number of posts per thread is nearly equal.
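A larger mean with a nearly equal median usually points to a few very long threads pulling the average up. The sketch below illustrates that effect with made-up thread sizes. The two lists are hypothetical and are not the forums' actual statistics.

```python
from statistics import mean, median


def summarize(posts_per_thread):
    """Return (mean, median) of the number of posts per thread."""
    return mean(posts_per_thread), median(posts_per_thread)


# Hypothetical thread sizes, invented purely for illustration.
forum_a = [2, 3, 3, 4, 5]
forum_b = [2, 3, 3, 4, 60]   # same as forum_a except for one very long thread

mean_a, median_a = summarize(forum_a)
mean_b, median_b = summarize(forum_b)
print(f"forum_a: mean={mean_a:.1f}, median={median_a}")
print(f"forum_b: mean={mean_b:.1f}, median={median_b}")
```

The single long thread changes the mean substantially while leaving the median untouched, which is the pattern described for the two forums.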
Data 500 sentences were selected from each web forum and scored for the intensities of the sentiments and affects expressed. The affects examined were those of most interest to security and intelligence organizations: violence, anger, hate, and racism. These affects were measured on a continuous scale ranging from 0 to 1. Sentiment was measured on a continuous scale from -1 to 1.
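The scoring scheme can be pictured as one record per sentence: a single sentiment polarity plus one intensity per affect. The sketch below is one possible representation that enforces the stated ranges. The class name, field names, and example values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict

# The four affects named in the study.
AFFECTS = ("violence", "anger", "hate", "racism")


@dataclass(frozen=True)
class ScoredSentence:
    """Hypothetical record for one manually scored sentence."""
    text: str
    sentiment: float            # polarity on the scale -1 to 1
    affects: Dict[str, float]   # affect name -> intensity on the scale 0 to 1

    def __post_init__(self):
        if not -1.0 <= self.sentiment <= 1.0:
            raise ValueError("sentiment must lie in [-1, 1]")
        for name in AFFECTS:
            value = self.affects.get(name)
            if value is None or not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} intensity must lie in [0, 1]")


# Example values are invented, not taken from the annotated test bed.
example = ScoredSentence(
    text="(sentence text)",
    sentiment=-0.4,
    affects={"violence": 0.7, "anger": 0.2, "hate": 0.0, "racism": 0.0},
)
```

A sentence carries exactly one sentiment value but several affect intensities at once, which mirrors the distinction drawn between the two kinds of analysis.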
Methods Annotation step: the feature set comprised character, word, root, and collocation n-grams. Character and word n-grams are commonly used in text mining applications. To derive root-level n-grams, Arabic words were converted to their roots using a clustering algorithm. Collocation n-grams included the hapax legomena and dis legomena collocations. Features with fewer than four occurrences in the test bed were excluded.
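The character and word n-gram extraction, together with the four-occurrence cutoff, can be sketched as follows. This is a minimal illustration only: whitespace tokenization, the n values, and all function names are assumptions, and the root and collocation n-grams are left out because the clustering algorithm is not described here.

```python
from collections import Counter


def char_ngrams(text, n):
    """All contiguous character sequences of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """All contiguous word sequences of length n (whitespace tokens assumed)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(sentences, n_values=(1, 2), min_count=4):
    """Count n-gram features over all sentences and keep the frequent ones.

    Features occurring fewer than min_count times are dropped, matching
    the four-occurrence threshold described above.
    """
    counts = Counter()
    for sentence in sentences:
        for n in n_values:
            counts.update(("char", n, g) for g in char_ngrams(sentence, n))
            counts.update(("word", n, g) for g in word_ngrams(sentence, n))
    return {feature for feature, count in counts.items() if count >= min_count}
```

Each feature is keyed by its type and length as well as its text, so the character bigram "ab" and the word unigram "ab" are counted separately.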
Methods The machine learning approach for identifying the presence and intensities of sentiments and affects in Dark Web forum sentences used an SVR ensemble. SVR was used to leverage the robustness of SVM while accommodating the continuous intensities of sentiments and affects. Ensemble classifiers aggregate multiple independent classifiers built using different techniques or feature subsets, improving performance over a single classifier.
Methods For the analysis of the Al-Firdaws and Montada web forums, a separate classifier was developed for each of the five sentiment and affect classes.
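The one-model-per-class arrangement can be sketched structurally as below. The `LinearScorer` class is a stand-in for a trained regressor and is not SVR; the weights, feature names, and the clipping of outputs to each class's scale are all assumptions made for illustration.

```python
from typing import Dict

# Output scale for each of the five classes (sentiment plus four affects).
CLASS_RANGES = {
    "sentiment": (-1.0, 1.0),
    "violence": (0.0, 1.0),
    "anger": (0.0, 1.0),
    "hate": (0.0, 1.0),
    "racism": (0.0, 1.0),
}


class LinearScorer:
    """Stand-in model: a weighted sum over its own feature subset."""

    def __init__(self, weights: Dict[str, float], bias: float = 0.0):
        self.weights = weights
        self.bias = bias

    def predict(self, features: Dict[str, float]) -> float:
        return self.bias + sum(
            weight * features.get(name, 0.0)
            for name, weight in self.weights.items()
        )


def predict_all(models: Dict[str, LinearScorer],
                features: Dict[str, float]) -> Dict[str, float]:
    """Run every per-class model and clip each score to its class's scale."""
    scores = {}
    for label, (low, high) in CLASS_RANGES.items():
        raw = models[label].predict(features)
        scores[label] = min(high, max(low, raw))
    return scores
```

Because each model holds only its own weights, every class can rely on a different feature subset, which is what the per-class feature selection step provides.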
Methods Feature selection used the information gain (IG) heuristic. Because IG requires discrete classes, the intensities were discretized before IG could be applied and the relevant features selected. To compensate for the discretization, multiple iterations were performed, varying the number of class bins for intensity between 2 and 10. The IG heuristic was used recursively to select relevant features in these iterations using recursive feature elimination (RFE).
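The interaction between discretization and information gain can be sketched as below. Equal-width bins, binary feature presence, and averaging the gain across the bin counts 2 through 10 are assumptions; the source does not say how the iterations were combined, and the recursive elimination step is not shown.

```python
import math
from collections import Counter


def discretize(intensity, bins, low=0.0, high=1.0):
    """Map an intensity to an equal-width bin index in 0..bins-1."""
    index = int((intensity - low) / (high - low) * bins)
    return min(bins - 1, max(0, index))


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_present, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [l for f, l in zip(feature_present, labels) if f == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def mean_gain_over_bins(feature_present, intensities, bin_counts=range(2, 11)):
    """Average IG over several discretizations (assumed aggregation rule)."""
    gains = []
    for bins in bin_counts:
        labels = [discretize(x, bins) for x in intensities]
        gains.append(information_gain(feature_present, labels))
    return sum(gains) / len(gains)
```

Scoring a feature under several bin counts reduces the chance that one arbitrary cut point decides whether the feature looks informative.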
Methods The feature selection phase produced, for each of the 5 classifiers in the ensemble, a subset of the features identified in the test bed. The test bed originally contained 7556 features; only 22% were selected.
Methods Evaluation was performed using 10-fold cross-validation.
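In 10-fold cross-validation, the data is split into ten parts and each part serves once as the held-out test set. The sketch below generates such splits with the standard library only. The function name, the shuffling seed, and the use of 500 items in the example are assumptions.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)           # fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]      # k near-equal slices
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 500 scored sentences give ten folds of 50 test items each.
for train, test in k_fold_indices(500):
    assert len(train) == 450 and len(test) == 50
```

Every item is tested exactly once and is never part of the training set in the fold where it is tested.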
Results A sample of messages and their sentiment and affect intensities, as determined through automated analysis, is presented in Table VII.
Results The results confirm the assessment of the forums by domain experts. The Al-Firdaws forum contained higher intensities of the violence and hate affects, with a more negative sentiment polarity.
Results The percentage of postings containing intense levels of the four affects is greater in the Al-Firdaws forum than in the Montada forum, as shown in Figs. 8 and 9.
Results The violence and hate affects were used by a relatively large percentage of Al-Firdaws authors.
Results A time series analysis was performed to understand how forum affect intensities progressed over time.
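One simple way to follow affect intensity over time is to average the predicted intensities per calendar month. The sketch below shows that aggregation. The monthly granularity, the function name, and the example values are assumptions, since the slides do not state how the series was built.

```python
from collections import defaultdict
from datetime import date


def monthly_mean_intensity(postings):
    """Average intensity per (year, month).

    postings: iterable of (posting_date, intensity) pairs.
    Returns a dict ordered chronologically by (year, month).
    """
    buckets = defaultdict(list)
    for posted_on, intensity in postings:
        buckets[(posted_on.year, posted_on.month)].append(intensity)
    return {month: sum(values) / len(values)
            for month, values in sorted(buckets.items())}


# Invented example values, not results from either forum.
series = monthly_mean_intensity([
    (date(2006, 1, 5), 0.2),
    (date(2006, 1, 20), 0.4),
    (date(2006, 2, 1), 0.9),
])
```

Running the same aggregation once per affect and per forum would give comparable series for each class.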

  • 1. Sentiment and Affect analysis of Dark Web Forums: Measuring Radicalization on the Internet Hsinchun Chen, Fellow, IEEE
  • 2. Introduction Web forums offer participants a medium to express their opinions and emotions freely in discussion. Extremist and terrorist groups also use web forums for community. Expression and dissemination of their ideologies and propaganda Such forums are often referred to as being part of Dark Web
  • 3. Introduction Information contained within Dark Web forums represent asignificant source of knowledge for security and intelligence organizations. Theopinions and emotions expressed within these forums provide valuable insights: the nature and position of the online community Characterizing individual participants Manual analysis of the vast quantities of messages to measure the opinions and emotions expressed is often infeasible.
  • 4. Introduction This paper presents an automated approach to sentiment and affect analysis of two Dark Web forums related to the Iraqi insurgency and Al-Qaeda. The automated approach utilizes a rich set of textual features and machine learning techniques.
  • 5. Related Work Sentiment and affect analysis are related tasks in text mining that focus on directional text, containing opinions, emotions, and biases. [5] M. A. Hearst, “Direction-based text interpretation as an information access refinement,” In Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Lawrence Erlbaum Associates, 1992. [6] J. Wiebe, “Tracking point of view in narrative,” Computational Linguistics, vol. 20 (2), pg. 233-287, 1994.
  • 6. Related Work Sentiment analysis attempts to identify, analyze, and measure opinions expressed in text. Affect analysis focuses on the emotional content of the communication. R. Agrawal, S. Rajagopalan, R. Srikant, and Y. Xu, “Mining newsgroups using networks arising from social behavior,” Proc. of the 12th Int’l WWW Conf., 2003. P. Subasic and A. Huettner, “Affect analysis of text using fuzzy semantic typing,” IEEE Trans. Fuzzy Systems, vol. 9 (4), pg. 483-496.
  • 7. Related Work There are some important distinctions between the two. Affect analysis evaluates the intensity of a number of potential emotions, including happiness, sadness, anger, fear, etc. Sentiment analysis considers the polarity of opinions along a positive-neutral-negative continuum. The words and phrases associated with sentiments are mutually exclusive, whereas segments of text can convey multiple affects.
  • 8. Related Work Researchers have utilized various machine learning approaches to perform automated sentiment and affect analysis. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” Proc. Empirical Methods in Natural Language Processing, pg. 79-86, 2002. R. W. Picard, E. Vyzas, and J. Healey, “Toward machine emotional intelligence: analysis of affective physiological state,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23 (10), pg. 1179-1191, 2001.
  • 9. Related Work In particular, the SVM learning approach has been shown to be effective in determining whether a text segment contains an expression of a particular affect class. However, it applies only to discrete labels. Y. H. Cho and K. J. Lee, “Automatic affect recognition using natural language processing techniques and manually built affect lexicon,” IEICE Trans. Information Systems, vol. E89 (12), pg. 2964-2971, 2006.
  • 10. Related Work Support vector regression (SVR) is an alternative approach that is capable of predicting continuous sentiment and affect intensities while benefiting from the robustness of SVM. A. Webb, Statistical Pattern Recognition. John Wiley & Sons, 2002.
  • 11. Research Questions In a recent book, Ryan highlights the critical role that Web forums play in militant Islamic radicalization on the Internet. Marc Sageman, an internationally renowned terrorism study consultant, also emphasizes the importance of the Internet, especially forums. This paper presents our web mining research on sentiment and affect analysis of two large-scale, internal Jihadist forums.
  • 12. Research Questions This study seeks to answer the following research questions: How effective are automated methods of sentiment and affect analysis in measuring the polarities of opinions and intensities of emotions in Dark Web forums? What insights into the Dark Web forums are gained by performing sentiment and affect analysis?
  • 13. Data Two Dark Web forums were selected for sentiment and affect analysis: Al-Firdaws (www.alfirdaws.org/vb) and Montada (www.montada.com). Al-Firdaws is a more radical forum with considerable content dedicated to support of the Iraqi insurgency and Al-Qaeda. Montada is a general discussion forum with content pertaining to a variety of social and religious issues. Domain experts consider Montada to be more moderate compared to Al-Firdaws, with less radical content.
  • 14. Data Spidering programs were used to collect the content from the two web forums. A summary of the collection statistics is presented in Table I. Data set is larger. An older forum Al-Firdaws is too radical
  • 15. Data Both Al-Firdaws and Montada are major forums for their respective purposes and communities, with relatively high membership levels and numerous authors.
  • 16. Data In both cases postings are more evenly distributed across web forum threads. Although the Montada forum has a larger average number of posts per thread compared to Al-Firdaws, the median number of posts per thread is nearly equal.
  • 17. Data 500 sentences were selected from each web forum and scored for the intensities of the sentiments and affects expressed. The affects studied were those of most interest to security and intelligence organizations: violence, anger, hate, and racism. These affects were measured on a continuous scale ranging from 0 to 1. Sentiment was measured on a continuous scale from -1 to 1.
  • 18. Data
  • 20. Methods Annotation step: character, word, root, and collocation n-grams. Character and word n-grams are commonly used in text mining applications. To derive root-level n-grams, Arabic words were converted to their roots using a clustering algorithm. Collocation n-grams included the Hapax and Dis collocations. Features with fewer than four occurrences in the test bed were excluded.
  • 22. Methods The machine learning approach for identifying the presence and intensities of sentiments and affects in Dark Web forum sentences utilized an SVR ensemble. SVR was utilized to leverage the robustness of SVM while accommodating the continuous intensities of sentiments and affects. Ensemble classifiers aggregate multiple independent classifiers built using different techniques or feature subsets, improving performance over a single classifier.
  • 23. Methods For the analysis of the Al-Firdaws and Montada web forums, a separate classifier was developed for each of the five sentiment and affect classes
  • 24. Methods Feature selection: information gain (IG) heuristic. The intensities had to be discretized before IG could be applied and the relevant features selected. To compensate for the discretization, multiple iterations were performed, varying the number of class bins for intensity between 2 and 10. The IG heuristic was used recursively to select relevant features in these iterations using recursive feature elimination (RFE).
  • 26. Methods The feature selection phase resulted in a subset of the features identified in the test bed being selected for each of the 5 classifiers in the ensemble. Originally there were 7556 features; only 22% were selected.
  • 27. Methods Evaluation was performed using 10-fold cross validation
  • 28. Results A sample of messages and their sentiment and affect intensities determined through automated analysis is presented in Table VII.
  • 29. Results Results confirm the assessment of the forums by domain experts. The Al-Firdaws forum contained higher intensities of violence and hate affects with a more negative sentiment polarity.
  • 30. Results The percentage of postings containing intense levels of the four affects is greater in the Al-Firdaws forum than in the Montada forum, as shown in Figs. 8 and 9.
  • 31. Results The violence and hate affects were used by a relatively large percentage of Al-Firdaws authors.
  • 32. Results A time series analysis was performed to understand how forum affect intensities progressed over time.
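
The feature selection step on slide 24 (discretize the continuous intensities, then score features by information gain while varying the number of bins from 2 to 10) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the equal-width binning, the binary presence/absence view of each feature, and the averaging of IG across bin counts are all assumptions made for illustration, and the recursive feature elimination loop is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def discretize(intensities, n_bins, lo=0.0, hi=1.0):
    """Map continuous intensities in [lo, hi] to equal-width bin indices.

    Equal-width binning is an assumption; the slides do not say how
    the intensities were binned.
    """
    width = (hi - lo) / n_bins
    return [min(int((x - lo) / width), n_bins - 1) for x in intensities]


def information_gain(feature_present, labels):
    """IG = H(labels) - H(labels | feature) for a binary feature."""
    total = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [l for f, l in zip(feature_present, labels) if f == value]
        if subset:
            gain -= len(subset) / total * entropy(subset)
    return gain


def mean_ig_over_bins(feature_present, intensities,
                      bin_counts=range(2, 11), lo=0.0, hi=1.0):
    """Average IG over several discretizations (2 to 10 bins by default).

    Averaging is a hypothetical way to combine the iterations.
    """
    gains = [
        information_gain(feature_present, discretize(intensities, b, lo, hi))
        for b in bin_counts
    ]
    return sum(gains) / len(gains)


if __name__ == "__main__":
    # Toy data: four sentences with made-up affect intensities in [0, 1].
    intensities = [0.9, 0.8, 0.1, 0.2]
    informative = [True, True, False, False]  # tracks high intensity
    print(mean_ig_over_bins(informative, intensities))
```

For the sentiment class, which the slides measure on a scale from -1 to 1, the same functions would be called with `lo=-1.0` and `hi=1.0`. Features could then be ranked by score and the lowest-ranked ones dropped in each round.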