An overview of text mining and sentiment analysis for Decision Support System

This presentation gives an overview of how simple text mining and sentiment analysis techniques can be used in a decision support system.

Published in: Data & Analytics

  1. An Overview of Text Mining and Sentiment Analysis for Decision Support System
     Gan Keng Hoon, School of Computer Sciences, Universiti Sains Malaysia, 12 May 2015
  2. Outline
     1. Decision Support Systems
     2. Overview of Text Mining & Sentiment Analysis
        - Techniques in Text Mining
        - Techniques in Sentiment Analysis
     3. Applications and Challenges Ahead
  3. Decision Support System
     As end users, we need to make decisions every day: What to eat for lunch? Which subject to choose? Which hotel to stay at?
  4. Decision Support System
     A business provider needs to make crucial decisions every hour, minute, and second.
     Source: http://attunelive.com/blog/how-a-screening-prompted-by-clinical-decision-support-system-helped-save-a-patients-life/
  5. Decision Support System
     A decision maker in a company checks the sales figures before deciding which product to promote.
     Source: http://www.informationbuilders.com/decision-support-systems-dss
  6. Decision Support System
     A hotelier wants to know why ... "If the location is good, how can I take advantage ..."
  7. Why Are They/We Using a Decision Support System?
     Business provider:
     - Improve customer experience
     - Improve products and services
     - More returns ...
     End user:
     - Better purchasing choices
     - Better value
     - Happier ...
  8. Sample Decision Support System
     "Looks good, 155 people say Very Good ..."
     "Not bad, customers rated 4 stars and above for location, cleanliness ..."
     http://www.tripadvisor.com.my
  9. The Truth?
     http://www.tripadvisor.com.my
  10. Many Questions ...
      - Mr X: How is the condition of the Wifi?
      - Miss Y: Is the toilet really dirty?
      - Family Z: Is there a convenience store nearby?
      - Hotel manager: I want to know all the complaints about the toilet!
  11. Harnessing Web and Social Texts
      Very influential. Latest and most up to date. The truth (but sometimes not). Free (most of the time).
      Source: Hotel Review Sites: What’s the ‘Truth’ About Fairness? http://www.hospitalitynet.org/news/4056065.html
  12. However, With No Automation Methods
      - It is impossible to scan through every one of them.
      - Important details could be missed.
      - It is hard to visualize or summarize all the texts manually.
      - It is impossible to digest the new reviews generated each day.
      * There are 344 reviews (as of 10 May 2015) for the mentioned hotel.
  13. Overview of Text Mining & Sentiment Analysis
      Question: Is the toilet really dirty?
      Text Mining - Let’s mine some texts to answer the question.
      1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
      2. dirty sink, and very very dirty shower glass wall.
      3. the shower, it's clean...
      Sentiment Analysis - Let’s find some sentiments about these texts.
  14. Techniques in Text Mining
      What is text mining?
      - To exploit information contained in textual documents in various ways.
      Techniques: Natural Language Processing, Information Retrieval
  15. Information Retrieval - Find relevant sentences.
      Document Collection Processing
      1. Text Preprocessing
         - Sentence Tokenizer
         - Stop Word Removal
      2. Feature Selection
         - Bag-of-Words Approach
         - Term Frequency-Inverse Document Frequency (TF-IDF)
      3. Inverted Index Creation
         - Term-Document Postings
  16. Information Retrieval - Find relevant sentences.
      Query Processing
      1. Intention as Query
      2. Query Preprocessing
         - Tokenization
         - Expansion Using Synonyms
      3. Query-Document Matching
         - Ranking
  17. Information Retrieval - Find relevant sentences.
      - Simple and fast.
      - Quickly retrieves all relevant sentences or documents given some keywords.
      - But loses details such as sentence structure and word order.
      - Context is not captured. E.g. the term “cold” may mean that the air conditioning is cold or that the receptionist is cold.
  18. Natural Language Processing
      Source: Cheng Xiang Zhai, Text Retrieval and Search Engine, Coursera Slide.
  19. Natural Language Processing
      - Difficult because we assume the hearer has some background knowledge.
      - Surface analysis of the text alone is not enough.
      - Common-sense analysis is needed.
      - E.g. "I can write words on that dusty table top."
  20. Techniques in Sentiment Analysis
      [Architecture diagram. Component labels: Sentence Extractor, Tokenization, Boundary Detection, Sentence Selector, Entity Dictionary, Sentence Categorization, Sentiment Dictionary, Sentiment Extraction, Pre-processing, Entity Detection, Post-processing, MySQL Database, Browser, Entity Extraction, Prediction, Rating]
      Part of the Summarev Framework for Entity’s Text Processing and Sentiment Analysis: http://ir.cs.usm.my/siir/project_summarev.php
  21. Entity Detection (or Aspect Selection)
      Texts:
      1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
      2. dirty sink, and very very dirty shower glass wall.
      3. the shower, it's clean...
      ...
      Aspects: 1. Bathroom 2. Toiletries 3. Shower area 4. Sink 5. Shower 6. Hair dryer 7. Wifi 8. Bed ...
      Techniques:
      - POS Tagging
      - Noun Phrase Selection
      - Term Weighting
  22. Sentiment Extraction
      Texts:
      1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
      2. dirty sink, and very very dirty shower glass wall.
      3. the shower, it's clean...
      ...
      Aspect - Sentiment:
      1. Sink - dirty
      2. Shower - clean
      3. Shower glass wall - dirty
      Techniques:
      - POS Tagging
      - Adjective Phrase Selection
  23. Sentiment Scoring
      Texts:
      1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
      2. dirty sink, and very very dirty shower glass wall.
      3. the shower, it's clean...
      ...
      Aspect - Sentiment:
      1. Sink - dirty (N:0.75)
      2. Shower - clean (P:0.5)
      3. Shower glass wall - dirty (N:0.75)
      Source: sentiwordnet.isti.cnr.it
  24. Applications
      Source: http://www.twtbase.com/twitrratr/
  25. Challenges Ahead
      - How to detect more in-depth sentiment.
      - How to differentiate spam from credible content.
      - Language problems:
        - Usage of mixed languages.
        - Usage of non-standard languages.
  26. Challenges Ahead
      Last but not least, the challenge is to put the research and solutions into real use.
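
The retrieval steps listed on slides 15 and 16 (preprocessing, TF-IDF weighting, an inverted index, synonym expansion, ranking) can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenter's implementation: the stop-word list, the synonym table, and the helper names (tokenize, build_index, search) are assumptions, the IDF variant is one common choice among several, and the three review fragments from slide 13 are treated as already split into sentences.

```python
import math
import re
from collections import Counter, defaultdict

# Toy collection: the three review fragments quoted on slide 13.
REVIEWS = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown "
    "and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]

# Assumed, deliberately tiny resources; a real system would use fuller lists.
STOP_WORDS = {"in", "the", "and", "were", "not", "it", "s", "is", "very", "a"}
SYNONYMS = {"toilet": ["bathroom", "shower", "sink"]}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_index(docs):
    """Build the inverted index (term -> {doc_id: term frequency}) and IDF weights."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    n_docs = len(docs)
    # Smoothed IDF (log(N / df) + 1), an assumed variant.
    idf = {term: math.log(n_docs / len(postings)) + 1.0
           for term, postings in index.items()}
    return index, idf


def search(query, index, idf):
    """Expand the query with synonyms, then rank documents by summed TF-IDF."""
    terms = []
    for term in tokenize(query):
        terms.append(term)
        terms.extend(SYNONYMS.get(term, []))
    scores = defaultdict(float)
    for term in terms:
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf * idf[term]
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index, idf = build_index(REVIEWS)
ranking = search("Is the toilet really dirty?", index, idf)
for doc_id, score in ranking:
    print(f"{score:.2f}  {REVIEWS[doc_id]}")
```

With these toy inputs the "dirty sink" fragment ranks first, because "dirty" occurs twice in it and appears in no other fragment. As slide 17 notes, this bag-of-words scoring ignores word order and context.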

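The aspect-sentiment pairs and scores shown on slides 22 and 23 can be approximated with a dictionary lookup. The slides name POS tagging and noun/adjective phrase selection as the techniques; the sketch below swaps in a simpler stand-in: it matches aspects against an assumed aspect dictionary and pairs each one with the nearest sentiment word in the same fragment. The aspect list is taken from slides 21 and 22, the two scores are copied from slide 23, and the function names are hypothetical.

```python
import re

# The three review fragments quoted on the slides.
TEXTS = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown "
    "and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]

# Assumed aspect dictionary, built from the aspect lists on slides 21 and 22.
ASPECTS = ["shower glass wall", "shower area", "hair dryer", "bathroom",
           "toiletries", "sink", "shower", "wifi", "bed"]

# Polarity and score per sentiment word, copied from slide 23.
SENTIMENT = {"dirty": ("N", 0.75), "clean": ("P", 0.5)}


def find_aspects(tokens):
    """Return (position, aspect) pairs, preferring the longest phrase at each position."""
    phrases = sorted((a.split() for a in ASPECTS), key=len, reverse=True)
    found, i = [], 0
    while i < len(tokens):
        for phrase in phrases:
            if tokens[i:i + len(phrase)] == phrase:
                found.append((i, " ".join(phrase)))
                i += len(phrase)
                break
        else:
            i += 1  # no aspect starts at this token
    return found


def extract(text):
    """Pair each aspect with the nearest sentiment word in the same fragment."""
    tokens = re.findall(r"[a-z]+", text.lower())
    opinions = [(i, t) for i, t in enumerate(tokens) if t in SENTIMENT]
    results = []
    for pos, aspect in find_aspects(tokens):
        if not opinions:
            continue  # fragment mentions aspects but carries no sentiment word
        _, word = min(opinions, key=lambda o: abs(o[0] - pos))
        polarity, score = SENTIMENT[word]
        results.append((aspect, word, polarity, score))
    return results


pairs = [pair for text in TEXTS for pair in extract(text)]
for aspect, word, polarity, score in pairs:
    print(f"{aspect} - {word} ({polarity}:{score})")
```

The first fragment yields no pairs because it contains no word from the sentiment dictionary, even though a human reader would judge it negative. This is the kind of case where the common-sense analysis mentioned on slide 19 is needed.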