1. SPECIAL SESSION ON: Intelligent System with Blockchain, Security and Wireless Communication
Classification of Sentiment reviews for Indian Railways
Using Machine Learning Methods
Manju Bagga (*), Ritu Aggarwal (b), Nitika Arora (c)
(*) MMICT&BM, MMDU, Mullana, Ambala, Haryana, India, rathi.ancester@gmail.com
(b) MMICT&BM, MMDU, Mullana, Ambala, Haryana, India, errituaggarwal@gmail.com
(c) MMICT&BM, MMDU, Mullana, Ambala, Haryana, India, nitika.arora@mmumullana.org
Abstract
Machine learning, a branch of AI, helps automate decision making by analysing input data. It
trains a machine on sample data and thereby makes the system intelligent, which is useful for
real-world AI applications. Machine learning algorithms are applied to social feedback data to
extract useful information that gives enterprises a competitive edge. Many machine learning
techniques for sentiment analysis exist in the literature; however, they still need optimization
to support better decision making in enterprises. In this paper, we propose a scheme for
determining sentiments about Indian Railways from Facebook. It is a more specific scheme that
leverages business intelligence over several classifiers, viz. SVM, NB, RF, Decision Tree, and
K-NN. The proposed scheme is evaluated with metrics such as F-measure, recall, precision,
logarithmic loss, and accuracy. The first section of this paper introduces sentiment analysis;
the next section presents related work and the motivation for sentiment analysis; the third
describes the methodology adopted for better decision making through machine learning, which
brings out in-depth knowledge for future marketing plans; the fourth discusses the experimental
results; and finally, the paper concludes and outlines future scope in the area of sentiment
analysis.
Keywords: Sentiment Analysis, Machine Learning, DT, NB, SVM, RF, logarithmic-loss
1 Introduction
In recent years, applications for sharing content and opinions have grown in many fields. Social
sites such as Facebook and Twitter allow users to post comments and reviews on any topic, issue,
or project. Through them, we can read the reviews shared by other people and judge their
importance for us. Many commercial enterprises, such as business firms, airways, railways,
Twitter, Facebook, and Instagram, can use sentiment analysis to examine the satisfaction level of
users with respect to services, issues, and products. Moreover, a large volume of complex data is
readily available from social networking hubs and serves as a base for analysis with AI.
Sentiment analysis is concerned with the user's attitude. It revolves around making decisions,
which requires acquiring and refining adequate information; such investigations are made possible
by different applications of AI. Machine learning, a well-established and standardized process,
is well suited to making decisions in established areas.
A sentiment can be defined as a view or opinion that someone holds or expresses, and analysis
means examining what people write according to those opinions. The framework provides a training
phase and a testing phase to guide the operations involved. The methodology behind this sentiment
analysis (SA) is evaluated with the following metrics: accuracy, precision, recall, F-measure,
and logarithmic loss, most of which are derived from a confusion matrix with TruePositive
(T_Pos), FalsePositive (F_Pos), TrueNegative (T_Neg), and FalseNegative (F_Neg) counts. In this
paper, we examine the efficiency of supervised learning techniques, namely Support Vector
Machine, Naïve Bayes, Random Forest, Decision Tree, and K-NN, for sentiment analysis of Indian
Railways from Facebook. K-NN can perform both classification and regression, can be effective in
high-dimensional spaces, and is widely used in data mining experimentation.
2 Related Work
This section briefly surveys related work on sentiment classification techniques using machine
learning. Twitter sentiment classification using machine learning techniques is analyzed in [1].
A recurrent neural network (RNN) is studied in [2] for document-level sentiment analysis. The
authors of [3] argue against traditional neural network word embeddings and propose new
sentiment-specific word embeddings for sentiment classification. The n-gram machine learning
method for sentiment analysis is considered in [4]. Related work on text classification using
deep learning methods is presented in [5]. A novel method for merging lexicon-based and
learning-based techniques for sentiment classification is proposed in [6]. An approach to
sentiment analysis using the lifelong learning (LL) method is proposed in [7]. The sentiment
analysis of movie reviews using K-Nearest Neighbor and Information Gain is discussed in [8].
Twitter and Reddit posts are studied in [9] for real-time prediction of bitcoin prices using
sentiment analysis, with various machine learning algorithms analysed for variations in the
bitcoin price. Different machine learning algorithms for sentiment analysis of the Czech language
are evaluated in [10]. For the categorization of ideas, a material analysis approach is used.
3 Methodologies for Sentiment Analysis
The methodology is applied to Facebook accounts, from which the dataset for sentiment
classification of Indian Railways is collected. The Facebook API provides live connectivity to
the reviews, and the preprocessed datasets are used for training and testing the classifiers. The
following subsections describe the methodology in detail.
3.1 Problem Formulation
Social networking sites such as Twitter, Facebook, and Instagram show how frequently information
is changed, dispersed, and shared. These platforms are used for social gatherings, meetings,
chats, and the exchange and transfer of information. They provide a better opportunity to connect
with people and to deliver valuable feedback on products and services.
3.2 Framework for Proposed Methodology
The proposed work employs a framework that uses sentiment classification techniques to guide the
research. Sentiments are the opinions and reviews that people express on social media. Comments
about Indian Railways are collected from the Facebook web site, and these comments carry the
sentiments of the people. In this framework, different machine learning algorithms are used for
sentiment analysis, such as Naïve Bayes, SVM, K-NN, Random Forest, and Decision Tree.
Figure 1: Framework for Sentiment Classification
Fig. 1 shows the framework. Sentiment classification has two phases, training and testing. In the
training phase, the datasets are preprocessed and used to train the classifiers; in the testing
phase, the trained models are tested to analyse the system. The results are reported with
classification metrics for the positive, negative, and mixed sentiment classes. It is therefore
necessary to know how accurately each algorithm performs the classification.
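The two phases can be illustrated with a minimal sketch of one of the listed classifiers, K-NN,
written in plain Python. This is not the implementation used in the paper: the bag-of-words
features, the cosine similarity, the value k = 3, and the example comments are all assumptions
chosen only for illustration.

```python
from collections import Counter
import math


def vectorize(text):
    """Turn a comment into a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Label a comment by majority vote among its k most similar training comments."""
    query = vectorize(text)
    ranked = sorted(train, key=lambda item: cosine_similarity(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Training phase: hypothetical labelled comments (not taken from the paper's dataset).
train_comments = [
    ("train was clean and on time", "positive"),
    ("staff was helpful and food was good", "positive"),
    ("comfortable journey and clean coach", "positive"),
    ("train was late and dirty", "negative"),
    ("rude staff and bad food", "negative"),
    ("dirty coach and late arrival", "negative"),
]
train = [(vectorize(text), label) for text, label in train_comments]

# Testing phase: held-out hypothetical comments with known labels.
test_comments = [
    ("clean coach and helpful staff", "positive"),
    ("late train and bad food", "negative"),
]
predictions = [knn_predict(train, text) for text, _ in test_comments]
correct = sum(pred == label for pred, (_, label) in zip(predictions, test_comments))
print(predictions, f"{correct}/{len(test_comments)} correct")
```

K-NN needs no explicit model fitting, so the training phase here only stores the vectorized
comments; the other classifiers in the framework would replace knn_predict with a fitted model.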
3.3 Evaluation Method
In this proposed work, sentiment analysis is evaluated with machine learning techniques and
algorithms that extract useful information from the various comments and reviews about Indian
Railways.
Figure 2: Procedure for Evaluation
Figure 2 shows the evaluation procedure and metrics. Table 1 describes the confusion matrix,
which organizes the results according to the predicted values on which the classification metrics
are based. Classification accuracy is the number of correct predictions divided by the total
number of predictions. The correct and incorrect prediction outcomes are T_Pos (TruePositive),
F_Pos (FalsePositive), F_Neg (FalseNegative), and T_Neg (TrueNegative). The calculation of the
metrics is shown in Table 1.
Table 1: Metrics Calculation
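As a hedged illustration of how these metrics follow from the four confusion-matrix counts, the
sketch below uses the standard textbook definitions, which are assumed here to match the ones
intended in Table 1. The function names and the example counts are hypothetical and are not taken
from the paper. Logarithmic loss is computed from predicted probabilities rather than from the
confusion matrix, so it is shown separately for the binary case.

```python
import math


def classification_metrics(t_pos, f_pos, t_neg, f_neg):
    """Standard metrics computed from the four confusion-matrix counts."""
    total = t_pos + f_pos + t_neg + f_neg
    accuracy = (t_pos + t_neg) / total  # correct predictions / all predictions
    precision = t_pos / (t_pos + f_pos) if (t_pos + f_pos) else 0.0
    recall = t_pos / (t_pos + f_neg) if (t_pos + f_neg) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    f_pos_rate = f_pos / (f_pos + t_neg) if (f_pos + t_neg) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "f_pos_rate": f_pos_rate,
    }


def logarithmic_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss; y_true holds 0/1 labels, y_prob holds P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(t_pos=40, f_pos=10, t_neg=45, f_neg=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"log loss: {logarithmic_loss([1, 0, 1], [0.9, 0.2, 0.6]):.3f}")
```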
4 Experimental Results
The experiments are carried out using ML approaches. The classifiers, including SVM and K-NN, are
applied to the Indian Railways dataset collected from Facebook accounts. The observed F_Pos rate
(FPR) of each technique is given below. Table 2 presents the FPR for all machine learning
algorithms.
Table 2: Results for False Positives
Machine Learning Algorithm    F_Pos Rate (%)
Decision tree                 8.8
KNN                           6.1
Naïve Bayes                   9.4
SVM                           7.1
Random forest                 7.9

The results show that K-NN performs best, with the lowest percentage of false positives, while
Naïve Bayes shows the highest. A higher FPR indicates lower performance in terms of false
positives. The computed accuracy, precision, recall, F-measure, and logarithmic loss of each
machine learning technique are presented in Table 3.

Table 3: Performance Using Machine Learning Approaches for Sentiment Analysis
Algorithm        Accuracy   Precision   F-Measure   Recall   Logarithmic-Loss
Decision tree    88.5       83.2        82          82.8     79.8
Naïve Bayes      87         82.3        80.5        79       74
SVM              90         85.5        86.5        84       80.2
Random forest    89.5       88.2        86          83.5     80.5
KNN              90.2       86          87          84.3     80

Figure 3: Performance Using Machine Learning Approaches for Sentiment Analysis
(Legend: Decision tree, Naïve Bayes, SVM, Random forest, KNN)

In Figure 3, the y-axis represents the supervised machine learning experiments and the x-axis
shows the performance in terms of the classification metrics. In this model, K-NN, SVM, and RF
show high precision and recall for SA, and K-NN has the highest F-measure of all the algorithms.
K-NN therefore shows the best performance for SA.
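The comparison can be checked directly from the reported numbers. The short sketch below
transcribes the values of Table 2 and Table 3 into dictionaries (the variable names are
hypothetical) and finds the algorithm with the lowest F_Pos rate and the best algorithm for each
metric. Logarithmic loss is left out of the ranking because the text does not state whether a
higher or lower reported value is better.

```python
# Values transcribed from Table 2 (F_Pos rate) and Table 3 (performance).
f_pos_rate = {
    "Decision tree": 8.8,
    "KNN": 6.1,
    "Naive Bayes": 9.4,
    "SVM": 7.1,
    "Random forest": 7.9,
}
performance = {
    "Decision tree": {"accuracy": 88.5, "precision": 83.2, "f_measure": 82.0, "recall": 82.8},
    "Naive Bayes": {"accuracy": 87.0, "precision": 82.3, "f_measure": 80.5, "recall": 79.0},
    "SVM": {"accuracy": 90.0, "precision": 85.5, "f_measure": 86.5, "recall": 84.0},
    "Random forest": {"accuracy": 89.5, "precision": 88.2, "f_measure": 86.0, "recall": 83.5},
    "KNN": {"accuracy": 90.2, "precision": 86.0, "f_measure": 87.0, "recall": 84.3},
}

# Lower F_Pos rate is better; higher accuracy, precision, F-measure, and recall are better.
lowest_fpr = min(f_pos_rate, key=f_pos_rate.get)
highest_fpr = max(f_pos_rate, key=f_pos_rate.get)
best_by_metric = {
    metric: max(performance, key=lambda algo: performance[algo][metric])
    for metric in ("accuracy", "precision", "f_measure", "recall")
}

print("Lowest F_Pos rate:", lowest_fpr)
print("Highest F_Pos rate:", highest_fpr)
for metric, algo in best_by_metric.items():
    print(f"Best {metric}: {algo}")
```

On these values K-NN leads on accuracy, F-measure, and recall and has the lowest F_Pos rate,
while Random forest has the highest precision.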
5 Conclusion and Future Scope
In this paper, we presented a methodology for sentiment classification of Indian Railways from
Facebook that has two stages, training and testing. In the training stage, different models are
built with the classifiers. In the testing stage, these models are used to label unlabelled
online comments. The results show that K-NN outperforms the Decision Tree, NB, and RF in the
presented methodology. Possible extensions of this work include a wider variety of annotations to
simplify sentiment classification and retrieval. The presented methodology is also flexible
enough to be applied in developing areas of sentiment classification.
References
[1] Madhuri, D. K. (2019). A machine learning-based framework for sentiment classification: Indian
railways case study. Int. J. Innov. Technol. Explore., 8(4), 441-445.
[2] Tang et al. (2015). Document modeling with gated recurrent neural network for sentiment
classification. In Proceedings of the 2015 Conference on EMNLP (pp. 1422-1432).
[3] Tang et al. (2014, June). Learning sentiment-specific word embedding for Twitter sentiment
classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational
Linguistics (Vol. 1: Long Papers), 1555-1565.
[4] Tripathy, A., Agrawal, A., & Rath, S. K. (2016). Classification of sentiment reviews using the n-gram
machine learning approach. Expert Systems with Applications, 57, 117-126.
[5] Zhang, X., Zhao, J., & LeCun, Y. (2015). Character-level convolutional networks for text
classification. In Advances in Neural Information Processing Systems, 649-657.
[6] Zhang, L., Ghosh, R., Dekhil, M., Hsu, M., & Liu, B. (2011). Combining lexicon-based and
learning-based methods for Twitter sentiment analysis. Technical Report HPL-2011, 89.
[7] Chen, Z., Ma, N., & Liu, B. (2018). Lifelong learning for sentiment classification. arXiv preprint
arXiv:1801.02808.
[8] Daeli, N. O. F., & Adiwijaya, A. (2020). Sentiment Analysis on Movie Reviews using Information
Gain and K-Nearest Neighbor. Journal of Data Science and Its Applications, 3(1), 1–7.
[9] Raju, S. M., & Tarif, A. M. (2020). Real-Time Prediction of BITCOIN Price using Machine
Learning Techniques and Public Sentiment Analysis.
[10] Arote Rutuja S., Gaikwad Ruchika P., Late Samidha S., & Gadekar, G. B. (2020). Online
Shopping with Sentimental Analysis for Furniture Shop. International Research Journal of
Modernization in Engineering Technology and Science, 02(05), 1-8.