A review on sentiment analysis and emotion detection.pptx
1. A review on sentiment analysis and emotion detection from text
Adnan Nawaz
MSCS-II
FA21-RCS-002
Advanced Data Mining
Nandwani, P., & Verma, R. (2021). A review on sentiment analysis and emotion detection
from text. Social Network Analysis and Mining, 11(1), 1-19.
2. Table of contents
Abstract
Introduction
Use of Social Media
Review of Techniques of S & E Analysis
Levels of Sentiment Analysis
Emotion Models
Basic Steps in Sentiment / Emotion Detection
Overview of Datasets Used
Techniques for sentiment analysis and emotion detection
Challenges in sentiment analysis and emotion detection
Conclusion
3. Abstract
Social networking platforms are used to communicate feelings.
Users express their feelings through text, pictures, audio, and video.
A massive amount of data is generated.
This data can be processed rapidly through sentiment analysis (SA).
SA recognizes polarity in text.
It assesses whether the author's attitude toward an item, administration, location, individual, etc. is positive, negative, or neutral.
Emotion detection identifies an individual's precise emotional/mental state.
4. Topics
Levels of sentiment analysis.
Various emotion models.
The process of sentiment analysis and emotion detection from text.
Challenges during sentiment and emotion analysis.
5. Introduction
Sentiment analysis (SA) and emotion recognition (ER) are critical areas of NLP.
SA determines whether data is positive, negative, or neutral.
ER identifies specific emotions such as furious, cheerful, or depressed.
People use social media to communicate their feelings, arguments, and opinions.
They post feedback and reviews on various products and services.
Ratings and reviews encourage vendors and service providers to improve.
SA transforms unstructured data into meaningful insights for decision making.
6. Use of Social Media
Businesses broadcast information about their products and collect client feedback.
Feedback is valuable, and not just to business marketers measuring customer satisfaction.
Sentiment analysis helps marketers understand their customers' perspectives.
The rise of social media has made this easier and faster.
7. Healthcare Sector
Social media have become essential sources of health-related information.
Health practitioners must use automated sentiment and emotion analysis to save patients.
8. Education Sector
Sentiment analysis plays a critical role for both students and teachers.
Enthusiasm, talent, and dedication decide a teacher's efficiency.
Timely feedback from students helps improve teaching approaches.
Sentiment and emotion analysis can be applied to textual feedback.
Institutes use social media for advertising and marketing purposes.
Students and guardians conduct online research about institutes and courses.
Sentiment and emotion analysis can help the student select the best institute or teacher.
9. Techniques of S & E Analysis
Three techniques for sentiment and emotion analysis:
1) Lexicon based,
2) Machine learning based, and
3) Deep learning based.
Researchers face significant challenges, including:
1) Dealing with context,
2) Ridicule,
3) Statements conveying several emotions,
4) Spreading Web slang, and
5) Lexical and syntactical ambiguity.
10. Sentiment Analysis
A process of obtaining meaningful information and semantics from text using natural language processing techniques.
Big data is generated through social media.
Sentiment analysis is used to analyze it effectively and efficiently.
Polarity is not restricted to just positive or negative.
It can also be agree or disagree, good or bad.
It can be quantified on a 5-point scale: strongly disagree, disagree, neutral, agree, or strongly agree.
11. Example
Reviews of European and US destinations were labeled on a scale of 1 to 5.
E.g., 1 or 2 stars indicate negative polarity.
Gräbner et al. (2012) built a domain-specific lexicon:
It consists of tokens with their sentiment values.
It was built from customer reviews in the tourism domain.
Reviews carry 5-star ratings from terrible to excellent.
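The star-based labeling in the first example can be sketched as a small mapping function. This is only an illustration: the slide states that 1 or 2 stars mean negative polarity, while treating 3 to 5 stars as positive is an assumption made here, and the function name `star_to_polarity` is hypothetical.

```python
def star_to_polarity(stars: int) -> str:
    """Map a 1-5 star rating to a polarity label."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    # 1 or 2 stars -> negative (from the slide).
    # 3 to 5 stars -> positive (assumption for this sketch).
    return "negative" if stars <= 2 else "positive"


labels = [star_to_polarity(s) for s in (1, 2, 4, 5)]
```

A three-way variant could map 3 stars to neutral instead.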
12. Levels of Sentiment Analysis
Sentiment analysis is possible at three levels:
Sentence level:
The text is broken down into sentences.
Document level:
Sentiment is detected for the entire document.
The goal is to extract the global sentiment.
Documents contain redundant local patterns and lots of noise.
Link between words and phrases.
Aspect level:
The opinion about a specific aspect or feature is determined.
E.g., "The speed of the processor is high, but this product is overpriced."
Here, speed and cost are two aspects.
13. Aspect level sentiment analysis
Devi Sri Nandhini and Pradeep (2020) proposed an algorithm to extract implicit aspects from documents
by exploiting the relation between opinionated words (adjectives) and explicit aspects (nouns).
Ma et al. (2019) addressed two issues:
Different polarities of various aspects in a single sentence.
Explicit position of context in an opinionated sentence.
They built a two-stage model based on LSTM.
Context words near the aspect are more relevant and need greater attention than farther context words.
14. Stages
At stage one:
The model exploits multiple aspects in a sentence one by one with a position attention mechanism.
At stage two:
It identifies (aspect, sentence) pairs according to the position of the aspect and the context around it, and
calculates the polarity of each pair simultaneously.
15. Emotion Detection
The process of identifying a person's various feelings or emotions.
For example, joy, sadness, or fury.
Physical signs such as heart rate, shivering of hands, sweating, and voice pitch also convey emotion.
Emotion detection from text is difficult.
New slang and terminology are constantly being introduced, e.g., LOL.
This makes emotion detection challenging.
16. Emotion Models
Dimensional emotion model:
Represents emotions based on three parameters: valence, arousal, and power.
Valence means polarity.
Arousal means how exciting a feeling is.
E.g., delighted is more exciting than happy.
Power signifies restriction over emotion.
18. Emotion Models
Categorical emotion model:
Emotions are defined discretely, such as anger, happiness, sadness, and fear.
Depending on the model, emotions are categorized into four, six, or eight categories.
22. Models used by Authors
Authors              Model                                Emotions      Purpose
Batbaatar & Becker   Ekman's Model                        Six           -
Sailunaz & Alhajj    Ekman's Model                        Six           Tweets
Roberts              Ekman with "Love" state              Seven         Tweets
Ahmad                Wheel of Emotion Model by Plutchik   Nine states   Labeling Hindi Sentences
Laubert & Parlamis   Shaver                               Three         -
25. Pre-processing of text
Posts, audits, comments, remarks, and criticisms on social media platforms are highly unstructured.
Data cleaning is therefore necessary.
It includes tokenization, stop word removal, POS tagging, etc.
27. Tokenization
Tokenization breaks the text into individual tokens.
E.g., "this place is so beautiful" becomes "this", "place", "is", "so", "beautiful" after tokenization.
The text is then normalized by converting it into a standard form, correcting the spelling of words, etc.
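A minimal sketch of the tokenization step, assuming plain English text and a simple regular expression rather than any particular NLP library. The function name `tokenize` is hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    # Keep runs of letters, digits, and apostrophes; drop everything else.
    return re.findall(r"[a-z0-9']+", text.lower())


tokens = tokenize("This place is so beautiful")
```

Lowercasing here is a simple form of the normalization mentioned on the slide; spelling correction would need a separate step.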
28. Removal of Stop Words
Stop words like "is", "at", "an", "the" have nothing to do with sentiment.
Removing them avoids unnecessary computations.
POS tagging helps in finding various aspects in a sentence.
Nouns or noun phrases describe the aspects,
while sentiments and emotions are conveyed by adjectives.
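Stop word removal can be sketched as a simple filter over the token list. The stop word set below contains only the four words named on the slide; a real list would be much longer, and the helper name `remove_stop_words` is hypothetical.

```python
# Only the stop words named on the slide; real lists are far larger.
STOP_WORDS = {"is", "at", "an", "the"}


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop word set."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


filtered = remove_stop_words(["the", "hotel", "is", "at", "the", "beach"])
```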
29. Stemming and lemmatization
Two crucial steps of pre-processing.
Stemming:
Words are converted to their root form.
E.g., the terms "argued" and "argue" become "argue".
Lemmatization:
Turns a word into its base word.
E.g., the term "caught" is converted into "catch".
Removing numbers and lemmatization enhanced accuracy.
Removing punctuation did not affect accuracy.
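The difference between the two steps can be illustrated with a toy sketch. Both functions are deliberately crude assumptions: `stem` strips a handful of suffixes, so "argued" and "argue" end up with the same stem "argu", which need not be a dictionary word, and `lemmatize` is just a lookup in a tiny hand-made table. Real stemmers and lemmatizers use far richer rules and dictionaries.

```python
# Suffixes are tried in order; this list is a toy assumption.
SUFFIXES = ("ing", "ed", "es", "s", "e")

# Tiny hand-made lemma table for irregular forms (illustrative only).
LEMMAS = {"caught": "catch", "went": "go", "better": "good"}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look up the base word; fall back to the word itself."""
    return LEMMAS.get(word.lower(), word.lower())


same_stem = stem("argued") == stem("argue")
lemma = lemmatize("caught")
```

Suffix stripping cannot map "caught" to "catch", which is why lemmatization relies on vocabulary knowledge.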
30. Feature extraction
The process of converting or mapping the text or words to real-valued vectors is called word vectorization.
A document is broken down into sentences and then into words.
In the resulting matrix, each row represents a sentence or document,
while each feature column represents a word.
31. Feature extraction
One of the most straightforward methods is "Bag of Words" (BOW).
A fixed-length vector of counts is defined.
Each entry corresponds to a word in a pre-defined dictionary.
An entry has a count of 0 if its word does not appear in the text, otherwise >= 1.
The vector length is always equal to the number of words in the dictionary.
Easy implementation.
Drawbacks:
Sparse matrix.
Loses the order of words in the sentence.
Does not capture the meaning of a sentence.
E.g., with the dictionary (I, Hope, you, are, enjoying, reading), the text "Are you enjoying reading" is represented as (0,0,1,1,1,1).
BOW can be improved by:
Pre-processing the text, and
Utilizing n-grams and TF-IDF.
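The slide's BOW example can be reproduced with a short sketch. It assumes whitespace tokenization and case-insensitive matching; the function name `bag_of_words` is hypothetical.

```python
def bag_of_words(text: str, dictionary: list[str]) -> list[int]:
    """Count how often each dictionary word occurs in the text."""
    tokens = text.lower().split()
    # One entry per dictionary word, in dictionary order.
    return [tokens.count(word.lower()) for word in dictionary]


dictionary = ["I", "Hope", "you", "are", "enjoying", "reading"]
vector = bag_of_words("Are you enjoying reading", dictionary)
```

The vector has one entry per dictionary word regardless of the sentence length, which is why large dictionaries lead to sparse matrices.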
32. N-Gram
N-grams are an excellent option for preserving the order of words in a sentence's vector representation.
The value of n can be any natural number.
E.g., for "To teach is to touch a life forever" and n = 3 (called a trigram),
the result is "to teach is", "teach is to", "is to touch", "to touch a", "touch a life", "a life forever".
N-grams perform better than BOW.
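The trigram example can be generated with a sliding window over the tokens. This sketch assumes whitespace tokenization and lowercasing; the function name `ngrams` is hypothetical.

```python
def ngrams(text: str, n: int) -> list[str]:
    """Return all runs of n consecutive words in the text."""
    if n < 1:
        raise ValueError("n must be a natural number")
    tokens = text.lower().split()
    # Slide a window of width n across the token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


trigrams = ngrams("To teach is to touch a life forever", 3)
```

With n = 1 the function returns the individual words, which corresponds to the plain BOW vocabulary.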
33. Term frequency-inverse document frequency
TF-IDF is used for feature extraction.
It represents text in matrix form.
Ahuja et al. (2019) implemented six pre-processing techniques and
compared two feature extraction techniques to identify the best approach.
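A minimal sketch of how a TF-IDF matrix can be built. It uses one common formulation, tf = count / document length and idf = log(N / document frequency); other variants add smoothing, and this is not necessarily the formulation used in the cited study. The function name `tf_idf` and the sample documents are made up for illustration, and documents are assumed to be non-empty.

```python
import math
from collections import Counter


def tf_idf(documents: list[str]) -> tuple[list[str], list[list[float]]]:
    """Return the vocabulary and a documents x vocabulary TF-IDF matrix."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vocabulary = sorted(doc_freq)

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = [
            (counts[word] / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word in vocabulary
        ]
        matrix.append(row)
    return vocabulary, matrix


docs = ["the food was great", "the service was slow", "the room was great value"]
vocabulary, matrix = tf_idf(docs)
```

Words that occur in every document, such as "the" here, get a weight of zero, while rarer words such as "slow" get a positive weight.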
35. Lexicon based approach
This method maintains a word dictionary.
Each positive and negative word is assigned a sentiment value.
The mean value is used to calculate the sentiment of the entire sentence or document.
Two approaches:
1. Dictionary based:
A dictionary is a collection of words of some language.
Less efficient.
Applied to multiple domains with a data-driven approach.
2. Corpus based:
A corpus is a random sample of text in some language.
Provides domain-specific sentiment words.
Poor generalization.
Excellent performance within a particular domain.
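The scoring idea behind the lexicon-based approach can be sketched as follows. The lexicon and its values are invented for illustration and do not come from any published resource; the function names are hypothetical, and treating a mean of exactly zero as neutral is an assumption.

```python
# Toy lexicon with made-up sentiment values (illustrative only).
LEXICON = {
    "great": 1.0,
    "excellent": 1.0,
    "bad": -1.0,
    "terrible": -1.0,
    "slow": -0.5,
}


def sentence_score(text: str) -> float:
    """Mean sentiment value of the lexicon words found in the text."""
    values = [LEXICON[t] for t in text.lower().split() if t in LEXICON]
    return sum(values) / len(values) if values else 0.0


def polarity(text: str) -> str:
    """Turn the mean score into a polarity label."""
    score = sentence_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


score = sentence_score("terrible service and slow")
label = polarity("The food was great")
```

Words missing from the lexicon are simply ignored, which is one reason a lexicon built for one domain may generalize poorly to another.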
36. Machine Learning based approach
The dataset is divided into two parts, for training and testing purposes.
Supervised classification is used:
Naive Bayes, support vector machine (SVM), decision trees, etc.
Gamon (2004) applied an SVM:
Accuracy of up to 85.47%.
Ye et al. (2009) worked with SVM, an N-gram model, and Naive Bayes:
Sentiment of reviews on seven popular destinations of Europe and the USA.
Accuracy of up to 87.17%.
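The train/test workflow can be sketched with a small multinomial Naive Bayes classifier using add-one (Laplace) smoothing. This is a from-scratch illustration, not the setup of the cited studies; the class name and the tiny review dataset are made up, and the reported accuracy only reflects this toy data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus log likelihood of each token (add-one smoothing).
            score = math.log(count / total_docs)
            n_words = sum(self.word_counts[label].values())
            for token in tokens:
                numerator = self.word_counts[label][token] + 1
                score += math.log(numerator / (n_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up data, split into a training part and a testing part.
train = [
    ("great hotel and friendly staff", "positive"),
    ("excellent location great view", "positive"),
    ("terrible room and rude staff", "negative"),
    ("dirty room terrible food", "negative"),
]
test = [("great view", "positive"), ("terrible food", "negative")]

model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
correct = sum(model.predict(text) == label for text, label in test)
accuracy = correct / len(test)
```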
37. Deep Learning based Approach
These algorithms detect sentiment from text without feature engineering.
Multiple deep learning algorithms are available:
RNN, CNN.
Authors applied such a model to the Cornell movie review data:
It was more accurate than SVM.
Pasupa and Ayutthaya (2019) used CNN, LSTM, and Bi-LSTM
on a children's tale (Thai) dataset,
with or without the following features:
POS-tagging,
Thai2Vec (word embedding trained from Thai Wikipedia), and
Sentic (to understand the sentiment of the word).
CNN gave the best performance.
38. Transfer Learning Approach and Hybrid Approach
Transfer learning is a part of machine learning.
A model trained on large datasets to resolve one problem can be applied to other related problems.
Re-using a pre-trained model on related domains as a starting point
can save time and produce more efficient results.
Zhang et al. (2012) proposed a novel instance learning method:
It models the distribution between different domains.
The authors classified two datasets,
Amazon product reviews and a Twitter dataset,
into positive and negative sentiments.
41. Challenges in sentiment analysis and emotion detection
A lot of the data is in the form of informal text.
It contains spelling mistakes, new slang, and incorrect use of grammar.
Sometimes individuals do not express their emotions clearly.
E.g., "Y have u been soooo late?"
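One way to handle such informal text is a normalization pass before analysis. The sketch below is an assumption about how this could be done, not a method taken from the reviewed paper: it collapses characters repeated three or more times and expands slang through a tiny hand-made table (`SLANG` and `normalize` are hypothetical names).

```python
import re

# Tiny slang table for illustration; real tables are much larger.
SLANG = {"y": "why", "u": "you"}


def normalize(text: str) -> str:
    """Lowercase, collapse elongated characters, and expand slang words."""
    text = text.lower()
    # "soooo" -> "so": a character repeated 3+ times becomes one character.
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # Replace whole words that appear in the slang table.
    return re.sub(r"\b\w+\b", lambda m: SLANG.get(m.group(0), m.group(0)), text)


cleaned = normalize("Y have u been soooo late?")
```

Collapsing to a single character is a simplification, and the elongation itself can carry emotional emphasis that a sentiment model may want to keep as a feature.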
43. Conclusion
A review of the existing techniques for both emotion and sentiment detection is presented.
The lexicon-based technique performs well in both.
The dictionary-based approach is quite adaptable and straightforward to apply.
The corpus-based method is built on rules.
The performance of machine learning and deep learning algorithms depends on dataset size and pre-processing.
The LSTM model can capture long-term dependencies and extract features very well.
The performance of the various approaches depends on pre-processing and feature extraction.