2. Contents
• Social Media Data
• Big Data
• Steps of social media analytics
• Sentiment Analysis
• Social Media Analytics tools
• Big data Analytics Software
• Applications of Social Media Analytics
• Opportunities & Challenges
Aug 28, 2018 2
7. Social Media Data
• The amount of data we produce every day is truly mind-boggling:
about 2.5 quintillion bytes of data (2.5 EB) are created each day
• About 90 percent of the data in the world was generated
over the last two years alone.
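As a quick check of the units, the sketch below converts the daily figure into exabytes and a rough yearly total in zettabytes. It assumes decimal (SI) units and the short-scale meaning of "quintillion" (10^18); the helper name is made up for illustration.

```python
# Decimal (SI) byte units, as powers of ten.
EXABYTE = 10**18    # 1 EB
ZETTABYTE = 10**21  # 1 ZB = 1000 EB

def to_unit(n_bytes: int, unit: int) -> float:
    """Convert a raw byte count to the given unit (hypothetical helper)."""
    return n_bytes / unit

daily_bytes = 25 * 10**17           # 2.5 quintillion bytes per day
yearly_bytes = daily_bytes * 365    # assumes a constant daily rate

print(to_unit(daily_bytes, EXABYTE))     # daily volume in EB
print(to_unit(yearly_bytes, ZETTABYTE))  # yearly volume in ZB
```

At a constant rate, the yearly total comes out just under one zettabyte.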
8. Big Data
• Big data is the term for a
collection of data sets so large
and complex that it becomes
difficult to process using on-
hand database management
tools or traditional data
processing applications
• Systems / Enterprises
generate huge amounts of data,
from terabytes to even
petabytes of information
• Such huge volumes of data are
very difficult to manage
9. Big Data
• 2009: 800,000 petabytes; 2020: 35 zettabytes. That is roughly 44 times
as much data and content over the coming decade (35 ZB / 0.8 ZB ≈ 44)
• 1 in 3 business leaders frequently make decisions based on information
they don’t trust, or don’t have
• 1 in 2 business leaders say they don’t have access to the information
they need to do their jobs
• 83% of CIOs cited “Business intelligence and analytics” as part of their
visionary plans to enhance competitiveness
• 60% of CEOs need to do a better job capturing and understanding
information rapidly in order to make swift business decisions
• 90% of the world’s data is unstructured
… And organizations need deeper insights
10. Big Data
Extracting insight from an immense volume, variety and velocity of data, in
context, beyond what was previously possible.
12. The Challenge: Bring Together a Large Volume and Variety of Data
to Find New Insights
• Identify criminals and threats from disparate video, audio, and data
feeds
• Make risk decisions based on real-time transactional data
• Predict weather patterns to plan optimal wind turbine usage, and
optimize capital expenditure on asset placement
• Detect life-threatening conditions at hospitals in time to intervene
• Multi-channel customer sentiment and experience analysis
14. The New Customer Influence Path
Stages: Awareness, Consideration, Purchase
Source: Evans et al. (2010), Social Media Marketing: The Next Generation of Business Engagement
15. Steps of social media analytics
• The social media analytics framework is built around
four critical steps (listen, analyze, engage and
integrate) to use social media effectively for
intelligent decision making
• Listen - identifying and collecting relevant
social media data. Data-gathering tools (free
or subscription-based) can help organizations
collect customers’ tweets, blog posts, status
updates, etc.
16. Steps of social media analytics - Analyze
• Analyze - analyzing the collected data to understand
customer sentiment.
• Removing the “noise” around the data will help
improve the accuracy of the analysis.
• Semantic analysis is an advanced data-cleansing
method that groups large amounts of data based
on the relationship between words and/or
phrases.
• Semantic analysis goes beyond classifying
customer comments into positive, negative and
neutral, and provides insights into what
customers think about products, including what
they like and what improvements they would like
to see.
17. Steps of social media analytics -
Engage
• Engage - Customers who are engaged with
companies through social media spend 20% to
40% more than other customers, according to a
Bain & Co. study of more than 3,000 customers.
• Analyzing social media posts provides a deeper
perspective on trending topics, hot brands and
the type of content that is being shared.
• Predictive analytics can also be used to
understand what would interest customers, and
the ideal time to publish content.
18. Steps of social media analytics -
Integrate
• Integrate - this stage involves integrating unstructured
data across the organization with enterprise structured
data to obtain a 360-degree view of customers. To
achieve this, organizations must integrate their social
media platforms with their existing master data
management (MDM) systems.
• Once integrated, the MDM system can automatically add relevant
social media data to the master customer file. It can also update
customer profiles whenever changes are made in source systems
to reflect the latest customer information.
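The integrate step can be illustrated with a minimal sketch that attaches social media attributes to matching master customer records. The record layout, the field names, and matching on e-mail address are all assumptions made for illustration; a real MDM system would use its own matching rules and interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class CustomerRecord:
    """Hypothetical master customer record (structured enterprise data)."""
    customer_id: str
    email: str
    name: str
    social: dict = field(default_factory=dict)  # social attributes per platform

def integrate(master: list, social_profiles: list) -> int:
    """Add social profile data to matching master records.

    Matching is done on a case-insensitive e-mail address (an assumption).
    Returns the number of profiles that were attached.
    """
    by_email = {rec.email.lower(): rec for rec in master}
    attached = 0
    for profile in social_profiles:
        rec = by_email.get(profile.get("email", "").lower())
        if rec is None:
            continue  # no matching customer; leave the master file unchanged
        rec.social[profile["platform"]] = {
            "handle": profile["handle"],
            "last_sentiment": profile.get("last_sentiment"),
        }
        attached += 1
    return attached

master_file = [CustomerRecord("C001", "ana@example.com", "Ana Silva")]
profiles = [
    {"platform": "twitter", "handle": "@ana_s", "email": "Ana@Example.com",
     "last_sentiment": "positive"},
    {"platform": "twitter", "handle": "@unknown", "email": "nobody@example.com"},
]
attached_count = integrate(master_file, profiles)
print(attached_count, master_file[0].social)
```

Re-running `integrate` with fresh profile data overwrites the stored attributes for the same platform, which mirrors the idea of keeping the customer profile up to date.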
19. Sentiment Analysis of Social Media Data
• Sentiment
– A thought, view, or attitude, especially one based
mainly on emotion instead of reason
• Sentiment Analysis
– also known as opinion mining
– use of natural language processing (NLP) and
computational techniques to automate the
extraction or classification of sentiment from
typically unstructured text
20. Emotions
• Love
• Joy
• Surprise
• Anger
• Sadness
• Fear
Source: Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” Springer, 2nd Edition.
21. Sentiment Analysis and
Opinion Mining
• Computational study of opinions, sentiments, subjectivity,
evaluations, attitudes, appraisal, affects, views, emotions,
etc., expressed in text.
– Reviews, blogs, discussions, news, comments, feedback, or any other
documents
Source: Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” Springer, 2nd Edition.
22. Applications of Sentiment Analysis
• Consumer information
– Product reviews
• Marketing
– Consumer attitudes
– Trends
• Politics
– Politicians want to know voters’ views
– Voters want to know politicians’ stances and who
else supports them
• Social
– Find like-minded individuals or communities
23. Classification Based on
Supervised Learning
• Sentiment classification
– Supervised learning problem
– Three classes
• Positive
• Negative
• Neutral
Source: Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” Springer, 2nd Edition.
24. Opinion words in
Sentiment classification
• Topic-based classification
– topic-related words are important
• e.g., politics, sciences, sports
• Sentiment classification
– topic-related words are unimportant
– opinion words (also called sentiment words) that indicate
positive or negative opinions are important
• e.g., great, excellent, amazing, horrible, bad, worst
Source: Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” Springer, 2nd Edition.
25. Sentiment Analysis Architecture
[Figure: architecture diagram. Labeled components: training set (positive tweets, negative tweets); word features; features extractor; classifier; input tweet; output label (positive / negative)]
Source: Vishal Kharde and Sheetal Sonawane (2016), "Sentiment Analysis of Twitter Data: A Survey of Techniques,"
International Journal of Computer Applications, Vol 139, No. 11, 2016. pp.5-15
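The components named on this slide (training set, features extractor, classifier) can be sketched with a small Naive Bayes classifier. This is only an illustration under stated assumptions: the four training tweets are invented, the features are plain lowercase words, and Naive Bayes with add-one smoothing is one possible classifier, not necessarily the one used in the cited survey's figure.

```python
import math
from collections import Counter

def extract_features(tweet: str) -> list:
    """Features extractor: lowercase word tokens (bag of words)."""
    return tweet.lower().split()

def train(training_set):
    """Count word and tweet frequencies per label from (tweet, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training tweets
    vocab = set()
    for tweet, label in training_set:
        words = extract_features(tweet)
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def classify(tweet: str, model) -> str:
    """Return the label with the highest log-probability for the tweet."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in extract_features(tweet):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training set, for illustration only.
TRAINING_SET = [
    ("great service and friendly staff", "positive"),
    ("amazing excellent response", "positive"),
    ("horrible delay and bad support", "negative"),
    ("worst experience bad service", "negative"),
]
model = train(TRAINING_SET)
print(classify("excellent staff", model))
print(classify("bad delay", model))
```

With real data the training set would hold thousands of labeled tweets, and the feature extractor would usually include the cleaning steps applied before classification.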
26. Sentiment Classification Based on Emotions
[Figure: flow diagram. Labeled components: Twitter Streaming API 1.1; tweet preprocessing; generate training dataset for tweets (based on positive emotions: positive tweets; based on negative emotions: negative tweets); training dataset; test dataset; feature extraction; classifier; output label (positive / negative)]
Source: Vishal Kharde and Sheetal Sonawane (2016), "Sentiment Analysis of Twitter Data: A Survey of Techniques,"
International Journal of Computer Applications, Vol 139, No. 11, 2016. pp.5-15
27. Sentiment Classification Techniques
• Sentiment Analysis
  – Machine Learning Approach
    • Supervised Learning
      – Decision Tree Classifiers
      – Linear Classifiers: Support Vector Machine (SVM), Neural Network (NN), Deep Learning (DL)
      – Rule-based Classifiers
      – Probabilistic Classifiers: Naïve Bayes (NB), Bayesian Network (BN), Maximum Entropy (ME)
    • Unsupervised Learning
  – Lexicon-based Approach
    • Dictionary-based Approach
    • Corpus-based Approach: Statistical, Semantic
Source: Jesus Serrano-Guerrero, Jose A. Olivas, Francisco P. Romero, and Enrique Herrera-Viedma (2015),
"Sentiment analysis: A review and comparative analysis of web services," Information Sciences, 311, pp. 18-38.
28. SJSU Washington Square
Research Project
Twitter Sentiment Analysis for Understanding
Citizens’ Trust in Government
• Collected over 1 million tweets, dating from January 2013,
from 60 accounts
– 20 cities, 20 mayors, 20 police departments
• Analysis was done using R (for data retrieval,
preparation, and computation) and Excel (for
plotting)
• Use topsy.com as an alternative: lists top 1000
tweets from historical data
30. Methodology: Data Collection
• Topsy API was used to retrieve the tweets
• An API URL example:
http://otter.topsy.com/search.js?q=@hfxgov&offset=0&mintime=1356978601&maxtime=1408905001&type=tweet&nohidden=0&perpage=100&page=1&apikey=09C43A9B270A470B8EB8F2946A9369F3
• A batch script in R was executed to retrieve these tweets
• The API response: a JSON data file (a tree-structured format, similar to XML)
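The structure of the example URL can be sketched as follows. The original retrieval was done with an R batch script; this Python version is only an illustration, and it builds the paged URLs without sending any request, since the service may no longer be reachable. The function name and the `YOUR_API_KEY` placeholder are assumptions; the parameter names are copied from the example URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://otter.topsy.com/search.js"  # endpoint from the example URL

def build_search_url(query, mintime, maxtime, page=1, perpage=100,
                     apikey="YOUR_API_KEY"):
    """Build one search URL; mintime/maxtime are Unix timestamps in seconds."""
    params = {
        "q": query,
        "offset": 0,
        "mintime": mintime,
        "maxtime": maxtime,
        "type": "tweet",
        "nohidden": 0,
        "perpage": perpage,
        "page": page,
        "apikey": apikey,
    }
    return f"{BASE_URL}?{urlencode(params)}"

# A batch run would request consecutive pages for each account.
page_urls = [build_search_url("@hfxgov", 1356978601, 1408905001, page=p)
             for p in range(1, 4)]

# Round trip: parse the first URL back into its query parameters.
parsed = parse_qs(urlsplit(page_urls[0]).query)
print(parsed["q"], parsed["perpage"], parsed["page"])
```

Note that `urlencode` percent-encodes the `@` in the account name, and `parse_qs` decodes it again.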
31. Methodology: Data Preparation
• The retrieved data was cleansed by removing:
– symbols
– punctuation
– special characters
– URLs
– numbers
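The cleansing steps listed above can be sketched with regular expressions. The patterns are assumptions chosen for illustration (English-only text, so every character other than A-Z, a-z and whitespace is treated as a symbol, punctuation mark, special character, or number); the original work did this step in R.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^A-Za-z\s]")  # symbols, punctuation, digits
WHITESPACE_RE = re.compile(r"\s+")

def clean_tweet(text: str) -> str:
    """Remove URLs, then all non-letter characters, then extra whitespace."""
    text = URL_RE.sub(" ", text)         # URLs first, before their punctuation is stripped
    text = NON_LETTER_RE.sub(" ", text)  # symbols, punctuation, special characters, numbers
    return WHITESPACE_RE.sub(" ", text).strip()

raw = "Road closed on 5th Ave!! Details: http://example.com/x?id=42 #traffic @hfxgov"
cleaned = clean_tweet(raw)
print(cleaned)
```

Removing URLs before the other characters matters: otherwise the letters inside a URL would survive as stray words.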
32. Methodology: Sentiment Analysis
• A bag-of-words approach was used for sentiment analysis.
• Stemming: each tweet was broken down into its stemmed English words
• Matching: each word was looked up in the lexicon database
(6,135 words in total: 2,230 positive and 3,905 negative)
• Scoring: positive and negative matches were summed to define a score
for each tweet
• Polarity: (P-N)/(P+N), where P = total number of positive sentiment words;
N = total number of negative sentiment words
• Results were grouped and combined.
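The matching, scoring and polarity steps can be sketched as below. The two tiny word sets stand in for the 6,135-word lexicon and are invented for illustration; the stemming step is skipped here, and returning `None` when no sentiment word matches (P + N = 0) is an assumption, since the slide does not say how that case was handled.

```python
# Stand-in lexicon; the study used 2,230 positive and 3,905 negative words.
POSITIVE_WORDS = {"great", "excellent", "amazing", "good"}
NEGATIVE_WORDS = {"horrible", "bad", "worst", "poor"}

def polarity(tweet: str):
    """Return (P - N) / (P + N) for one tweet, or None if nothing matched."""
    words = tweet.lower().split()
    p = sum(word in POSITIVE_WORDS for word in words)  # positive matches
    n = sum(word in NEGATIVE_WORDS for word in words)  # negative matches
    if p + n == 0:
        return None  # polarity is undefined without any sentiment words
    return (p - n) / (p + n)

print(polarity("great response but bad traffic and horrible parking"))
print(polarity("excellent work great team"))
print(polarity("council meeting at noon"))
```

The score ranges from -1 (only negative matches) to +1 (only positive matches), with 0 meaning an even split.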
35. Word-of-mouth
Voice of the Customer
• 1. Attensity
– Track social sentiment across brands and
competitors
– http://www.attensity.com/home/
• 2. Clarabridge
– Sentiment and Text Analytics Software
– http://www.clarabridge.com/
36. Attensity: Track social sentiment across brands and competitors
http://www.attensity.com/
http://www.youtube.com/watch?v=4goxmBEg2Iw#!
37. Clarabridge: Sentiment and Text Analytics Software
http://www.clarabridge.com/
http://www.youtube.com/watch?v=IDHudt8M9P0
38. Purpose of Social Media analytics tools
• With analytics tools for social media you are able
to quickly and easily see the most important
metrics of your brand performance.
• Audience growth graph - number of new
likes/follows on a social media profile on a
day-to-day basis
• Total engagement chart - information about how
your audience interacts with your content.
• Demographics - paint a better picture of
your current audience
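The first two metrics can be computed from raw counts, as in this minimal sketch. The input layout (end-of-day follower totals, per-post interaction counts) and the definition of total engagement as likes plus comments plus shares are assumptions for illustration; individual tools define these metrics in their own ways.

```python
def daily_growth(follower_totals: list) -> list:
    """New follows per day, from consecutive end-of-day follower totals."""
    return [today - yesterday
            for yesterday, today in zip(follower_totals, follower_totals[1:])]

def total_engagement(posts: list) -> int:
    """Sum of all interactions (likes, comments, shares) across posts."""
    return sum(p["likes"] + p["comments"] + p["shares"] for p in posts)

# Invented sample data.
totals = [1000, 1012, 1030, 1025]
posts = [
    {"likes": 40, "comments": 5, "shares": 3},
    {"likes": 12, "comments": 1, "shares": 0},
]
growth = daily_growth(totals)
engagement = total_engagement(posts)
print(growth, engagement)
```

A negative value in the growth series means more people unfollowed than followed on that day.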
40. E-Popular Tools
(“Social Media Monitoring/Analysis")
• Radian 6
• Social Mention
• Overtone OpenMic
• Microsoft Dynamics Social Networking
Accelerator
• SAS Social Media Analytics
• Lithium Social Media Monitoring
• RightNow Cloud Monitor
Source: Wiltrud Kessler (2012), Introduction to Sentiment Analysis
47. Opinion Spamming
• Opinion Spamming
– "illegal" activities
• e.g., writing fake reviews, also called shilling
– try to mislead readers or automated opinion mining
and sentiment analysis systems by giving
undeserving positive opinions to some target entities
in order to promote the entities and/or by giving
false negative opinions to some other entities in
order to damage their reputations.
Source: http://www.cs.uic.edu/~liub/FBS/fake-reviews.html
48. Forms of Opinion spam
• fake reviews (also called bogus reviews)
• fake comments
• fake blogs
• fake social network postings
• deceptions
• deceptive messages
Source: http://www.cs.uic.edu/~liub/FBS/fake-reviews.html
50. Professional Fake Review Writing Services
(some Reputation Management companies)
• Post positive reviews
• Sponsored reviews
• Pay per post
• Need someone to write positive reviews about our
company (budget: $250-$750 USD)
• Fake review writer
• Product review writer for hire
• Hire a content writer
• Fake Amazon book reviews (hiring book reviewers)
• People are just having fun (not serious)
Source: http://www.cs.uic.edu/~liub/FBS/fake-reviews.html
54. Opinion Spamming – eg.
• Big data analytics can accumulate the wisdom of
crowds, reveal patterns, and yield best practices.
• For a real-world example, in events related to the
2013 Boston Marathon bombings, social
networks of marathon participants and general
high-performance computational techniques
were combined to cluster and analyze large sets
of candid photos and video shots — ultimately
leading to the discovery of the perpetrators.
55. Impact of Data and Analytics on
Social Media in 2018
Targeted Advertising
• According to a Nielsen survey of 28,000 global
Internet users, 92% of consumers trust
recommendations from friends and family
more than any other form of advertising.
• Seventy percent of customers place their
trust in online consumer reviews – making this
medium the second most trusted form of
advertising.
56. Impact of Data and Analytics on
Social Media in 2018
Converting unstructured data into Knowledge
• According to Gartner, 80% of enterprise data –
documents, e-mails, call logs, corporate blogs and
the like – is unstructured (i.e., it does not fit into
any traditional database).
• Advanced social analytics can help organizations
analyze and quickly draw inferences from
burgeoning unstructured social media and
enterprise data, and convert it into actionable
insights.
57. Impact of Data and Analytics on
Social Media in 2018
• “Search Engine Optimized” marketing - a
technique used to boost webpages to the top
of the returned search results
• Predictive analytics
• Personalized marketing communication - the
relevance of advertisements to you will be
determined by what you post online, what you
watch, what you share, etc.
• In 2018, many companies are going to invest in
hiring digital marketers and data analysts, so they
can take advantage of all that data lying out there
on the internet and create better and more
efficient marketing strategies.
59. Teradata Aster Analytics platform
• The Teradata Aster Analytics platform includes the Aster
Database, Aster SNAP Framework, Aster R, SQL-MapReduce
framework, SQL-GR and the Aster Analytics Portfolio.
• The suite provides business users with a set of tools and
modules that enable them to efficiently uncover data insights
for the entire data discovery lifecycle, using advanced data
analytic functions.
• The tools address a range of business analytics scenarios,
including customer churn, path to purchase, fraud analysis,
manufacturing optimization and product affinity.
• Aster SQL-GR is a Graph processing engine for performing
Graph analytics on big data sets in the Aster Database.
60. Apache Hadoop
• Apache Hadoop is a framework that allows the distributed
processing of large data sets across clusters of commodity
computers using a simple programming model
• MapReduce scenario
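The MapReduce programming model can be illustrated with the classic word-count scenario. This sketch runs in a single Python process and does not use the Hadoop API; in Hadoop the map and reduce phases would run in parallel across the cluster, with the framework handling the shuffle between them. The function names and sample documents are invented for illustration.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document: str) -> list:
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs) -> dict:
    """Shuffle: group all emitted values by key before reduction."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key: str, values: list) -> tuple:
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)

documents = ["big data big insights", "social data"]  # one string per input split
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each mapper only sees its own split and each reducer only sees one key, both phases can be distributed without shared state.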
62. Image Recognition Analytics
• Social networking sites such as Facebook, Pinterest, Instagram and Flickr
receive and host billions of photos, with thousands added every minute.
Some of the images can be of brands, company logos and products,
without any text to reference them.
• Since traditional social media monitoring tools can only track text (such as
user comments and posts mentioning a brand), marketers often do not
know what customers are referring to, who is using their company’s
products, or if counterfeit versions of those products exist.
• Analytics with image recognition capabilities can help companies
overcome this challenge and leverage images to enhance their market
knowledge and extend their reach. Advanced image analytics with pixel-
level analysis is gradually gaining acceptance among large retailers and
advertising agencies.
• Companies such as Piqora and Curalate have developed image recognition
technologies for social media sites such as Facebook, Pinterest and
Instagram – allowing them to identify the most popular shared images
from their Web sites, the most influential individual visitors, and the traffic
that an image diverts to a target Web site.
63. Is this ethical — what about data
protection?
• Some social media platforms offer some
form of open access to user data (for example,
Twitter and Facebook)
• Some sell their data to companies (for
example, Instagram)
• Some platforms keep their user data entirely
confidential (for example, Snapchat).
64. Effects of Social Media Analytics
• Analysis of social media data collected by a retailer could for
instance reveal that unmarried females between 25 and 35
are suitable candidates for a discount offer on gym
equipment.
• The Future: Big Data Will Continue to Accelerate the Intrusion
of Social Media Companies into People’s Privacy
• A study published by researchers from Cambridge and
Stanford Universities shows that Facebook can use its data to
predict people’s personality with more accuracy than close
friends and family.
• Any action you take on browsers and search engines today will
most likely link back to your social media profile, leaving
behind a long digital footprint that can be used to
predict your next moves.
65. Effects of Social Media Analytics - Contd.
• Data collection and analytics are probably going to be
around for as long as we users keep giving out our
data on the internet.
• Regulations such as the General Data Protection
Regulation (GDPR) provide hope for some semblance
of data protection and privacy. This doesn’t mean that
we should openly publish all our personal information
on our social media accounts, however.
• It’s best to follow this rule: if it’s not something you’re
comfortable with the entire world knowing, don’t post
it on the internet.
67. Opportunities
• “…data is useless without the skill to analyze it.”
• A McKinsey Global Institute study states that the US will
face a shortage of about 190,000 data scientists and 1.5
million managers and analysts who can understand and
make decisions using Big Data by 2018.
68. Issues & Challenges
Dozens of questions must still be addressed…
• What is the best architecture for the physical
data storage infrastructure?
• How should data workers be situated within a
managerial hierarchy?
• What security protocols should be introduced
to protect the integrity of the data?
• What is the appropriate ethical stance on
handling personal data?