Intro to Sentiment Analysis

Bernard Goldbach

A brief introduction for business students in the Limerick Institute of Technology, Ireland.

Intro to Sentiment Analysis
“FAST, NEAT, AVERAGE, FRIENDLY, GOOD, GOOD” was the author’s first sentiment.

aka Opinion Mining
 Sentiment analysis is opinion mining.
 Uses Natural Language Processing.
 Dives deep into text analysis.
 Leverages computational linguistics.
 Develops meta data with business intelligence.

Basic Opinion Mining
 Construct a range of polarity for opinion markers.
 Classify statements by their polarity.
 Analyse several levels deep.
 Websites are one level.
 Authors are another level.
 Web page is a third level.
 A sentence is a fourth level.

Ranges of Polarity
 Classify emotional states.
 “Angry” can be codified as “upset” or “cross”.
 “Sad” may be “disappointed” or “confused”.
 “Happy” may be “amazing” or “gorgeous”.

Scaling Systems
 Some words are negative and can score as low as minus 10.
 Some words are neutral and should score zero.
 Some words are positive and can score as high as plus 10.

Subjectivity and Objectivity
 Starts with classifying a given text (no more than a paragraph).
 Mark the media text as objective or subjective.
 The challenge lies in the subtlety of expression or the compound effect of multiple authors.
 Proper analysis normally means removing objective statements from the given text.

Aspect-Based Sentiment Analysis
 Determine opinions based on features.
 Mark the media text as objective or subjective.
 The challenge lies in the subtlety of expression or the compound effect of multiple authors.
 Proper analysis normally means removing objective statements from the given text.

When Something is Ambiguous
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”

Disambiguation
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”

Entity-Level
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”

Keyword-Level Sentiment
 Gleans sentiment for every detected keyword.
 Much more detailed than a document-level view.
 BMW can determine which positive comments about its cars mention quality of handling.

User-Specified Sentiment
 You, the analyst, target specific words or phrases.
 So you specify a restaurant’s name and return sentiment scores based on that name.
 You cull various media texts for sentiment about a specific hotel.

Directional Sentiment
 Identifies the commentator and emotional range.
 First, discover the incident where emotion is expressed.
 Second, determine the degree of positive or negative response.
 Third, conclude who is mentioning the product and how negatively.

Disambiguation by Location
 Identifies the exact point on the earth.
 Use contextual cues.
 Perhaps where something is posted or where commentator is based.

Entity Subtypes
 Author is a real person.
 Author is a man.
 Man’s name is Paul O’Connell.
 This Paul O’Connell plays rugby for Munster.

Exact Quotations
 What was said.
 Who said what.
 When it was said.
 Where it was said.
 This exactness provides context.

References
 Turney and Pang applied methods for detecting polarity at the document level.
 Pang and Snyder classified documents on a multi-way scale, such as “five stars”.
 Katie Paine wrote “Measure What Matters”

Useful Links
 For Immediate Release G+ Community
 Marketing Over Coffee Podcast
 KD Paine’s Blog
 The Alchemy Blog

Continue the Discussion
 Use the Google Doc.
 Consult Moodle.
 Shout to @topgold

Disambiguation: Meta Data
 Meta data provides data about data.
 Links can remove ambiguity.
 Past geographical movements clarify reach of commentators.
 Simple internet searches can provide accurate profile data.

Author Profile
 Analyse the text.
 Validate the context.
 Extract the concept.
 Extract the keywords.
 Apply to author profile.
 Determine what authors write about.

Editor's Notes

This is the first look at sentiment analysis during a discussion with business students in the Limerick Institute of Technology in October 2013. It is based on professional experience shared by Bernard @topgold Goldbach, Katie @kdpaine Paine, Neville @jangles Hobson, Christopher @cspenn Penn and The Alchemy Group. The author of this deck lives at http://www.insideview.ie.
Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).
A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level — whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as "angry," "sad," and "happy."
A different method for determining sentiment is a scaling system. Words commonly associated with a negative, neutral or positive sentiment are each given a number on a -10 to +10 scale (most negative up to most positive). When a piece of unstructured text is analyzed using natural language processing, the concepts it contains are examined to see how these words relate to each concept. Each concept is then given a score based on the way sentiment words relate to it and on their associated scores. This allows movement to a more sophisticated understanding of sentiment based on a 21-point scale. Alternatively, texts can be given separate positive and negative sentiment strength scores if the goal is to determine the sentiment in a text rather than its overall polarity and strength.
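The scaling idea can be sketched in a few lines of Python. This is only an illustration: the word list and every score in it are invented for this example and do not come from a published lexicon, and the averaging rule is an assumption, not the method of any particular tool.

```python
# Hypothetical lexicon on a -10 to +10 scale; words and scores are made up.
LEXICON = {
    "terrible": -9, "slow": -4, "average": 0,
    "neat": 4, "fast": 5, "good": 5, "friendly": 6,
}

def score_text(text: str) -> float:
    """Return the mean score of the known words, or 0.0 if none are found."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

first_sentiment = score_text("FAST, NEAT, AVERAGE, FRIENDLY, GOOD, GOOD")
```

With these invented scores, the six words from the opening slide average out at 25/6, or roughly +4.2, a mildly positive result.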
Another research direction is subjectivity/objectivity identification. According to Wikipedia, this task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification: the subjectivity of words and phrases may depend on their context, and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Results are largely dependent on the definition of subjectivity used when annotating texts (Su). As Pang’s research shows, removing objective sentences from a document before classifying its polarity helps improve performance.
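A rough sketch of the "drop the objective sentences first" step follows. It uses a simple cue-word test as a stand-in for a real subjectivity classifier, so it is not the method used in Pang's research; the cue words, polarity values and function names are all assumptions made for illustration.

```python
import re

# Hypothetical cue words and polarity values, invented for this sketch.
SUBJECTIVE_CUES = {"love", "hate", "great", "awful", "think", "feel"}
POLARITY = {"love": 1, "great": 1, "hate": -1, "awful": -1}

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(sentence):
    return re.findall(r"[a-z']+", sentence.lower())

def is_subjective(sentence):
    """Treat a sentence as subjective if it contains any cue word."""
    return any(t in SUBJECTIVE_CUES for t in tokens(sentence))

def polarity_of_subjective_part(text):
    """Keep only subjective sentences, then sum their word polarities."""
    kept = [s for s in split_sentences(text) if is_subjective(s)]
    total = sum(POLARITY.get(t, 0) for s in kept for t in tokens(s))
    return kept, total

kept, total = polarity_of_subjective_part(
    "The hotel has 40 rooms. I love the breakfast. The bar closes at midnight."
)
```

Here only "I love the breakfast." survives the filter, so the two factual sentences never reach the polarity step.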
The more fine-grained analysis model is called feature/aspect-based sentiment analysis. It refers to determining the opinions or sentiments expressed on different features or aspects of entities, e.g., of a cell phone, a digital camera, or a bank. A feature or aspect is an attribute or component of an entity, e.g., the screen of a cell phone, or the picture quality of a camera. This problem involves several sub-problems, e.g., identifying relevant entities, extracting their features/aspects, and determining whether an opinion expressed on each feature/aspect is positive, negative or neutral. More detailed discussions about this level of sentiment analysis can be found in Liu's NLP Handbook chapter, "Sentiment Analysis and Subjectivity".
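The sub-problems named above (find the aspect, then attach an opinion to it) can be sketched as follows. The aspect list, the opinion words and the rule of splitting clauses on "but" are all simplifying assumptions for a cell phone review, not a description of a production system.

```python
import re

# Hypothetical aspect terms (mapped to a canonical aspect) and opinion words.
ASPECTS = {"screen": "screen", "display": "screen",
           "battery": "battery", "camera": "camera"}
OPINIONS = {"great": 1, "sharp": 1, "poor": -1, "weak": -1}

def aspect_sentiment(review):
    """Score each aspect using the opinion words found in the same clause."""
    results = {}
    # Split on sentence punctuation and on the contrastive word "but".
    for clause in re.split(r"[.;!?]|\bbut\b", review.lower()):
        words = re.findall(r"[a-z]+", clause)
        score = sum(OPINIONS.get(w, 0) for w in words)
        for w in words:
            if w in ASPECTS:
                aspect = ASPECTS[w]
                results[aspect] = results.get(aspect, 0) + score
    return results

phone = aspect_sentiment("The screen is sharp but the battery is weak.")
```

For that review the sketch returns a positive score for the screen and a negative score for the battery, which a single document-level score would have cancelled out.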
Ambiguous: open to more than one interpretation. Disambiguation: clarification that follows from the removal of ambiguity.
AMBIGUOUS. You need to provide sentiment data for every detected entity within text, such as person, place, organization. You need to give clients a more detailed view than document-level sentiment analysis.
REMOVE AMBIGUITY WITH DISAMBIGUATION TACTICS.
Entity-Level Sentiment Analysis provides sentiment data for every detected entity within text, such as person, place, organization. Alchemy algorithms do this kind of work.
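To see why the entity level matters, here is a small sketch built around the slide's example sentence. It is not how Alchemy works; the entity list, the two polarity words and the clause-splitting rule are assumptions chosen to keep the example short.

```python
import re

# Hypothetical entity list and polarity words for this one example.
ENTITIES = ["Ireland", "Irish roads"]
POLARITY = {"love": 1, "hate": -1}

def clause_score(clause):
    """Sum the polarity of the known words in a piece of text."""
    return sum(POLARITY.get(w, 0) for w in re.findall(r"[a-z]+", clause.lower()))

def document_sentiment(text):
    """One score for the whole text."""
    return clause_score(text)

def entity_sentiment(text):
    """One score per entity, taken from the clause that mentions it."""
    scores = {}
    for clause in re.split(r"\bbut\b|[.;!?]", text):
        for entity in ENTITIES:
            if entity.lower() in clause.lower():
                scores[entity] = scores.get(entity, 0) + clause_score(clause)
    return scores

SENTENCE = "I love Ireland but I hate traveling on Irish roads."
whole = document_sentiment(SENTENCE)
per_entity = entity_sentiment(SENTENCE)
```

The document-level score comes out at zero because "love" and "hate" cancel, while the entity-level view reports Ireland as positive and Irish roads as negative.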
Keyword-Level Sentiment Analysis provides sentiment data for every detected keyword so that instead of generating sentiment by document, it’s possible to generate sentiment for keywords within the document. For example, when analyzing car posts, determine that of the 70% of posts that were positive, 80% mentioned road handling and 30% complained about the road tax.
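The percentages in that example are simple shares once each post carries a sentiment label. A minimal sketch, assuming the posts have already been labelled; the four sample posts and the function name are invented for illustration.

```python
# Hypothetical, already-labelled car posts.
POSTS = [
    {"text": "Superb road handling on wet bends", "sentiment": "positive"},
    {"text": "Road handling is a joy", "sentiment": "positive"},
    {"text": "Lovely interior", "sentiment": "positive"},
    {"text": "The road tax is far too high", "sentiment": "negative"},
]

def keyword_share(posts, keyword, sentiment="positive"):
    """Fraction of posts with the given sentiment that mention the keyword."""
    matching = [p for p in posts if p["sentiment"] == sentiment]
    if not matching:
        return 0.0
    hits = sum(1 for p in matching if keyword.lower() in p["text"].lower())
    return hits / len(matching)

handling_share = keyword_share(POSTS, "road handling")
```

In this tiny sample, two of the three positive posts mention road handling, so the share is two thirds.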
User-Specified Sentiment Analysis allows the user to target specific words or phrases. For instance, specifying a movie title returns sentiment scores based on that phrase. This can be done by hand or by Alchemy API.
Directional Sentiment Analysis reveals who is emitting the sentiment. For example, if a person spoke negatively about a product, determine not only that the product was mentioned negatively, but who mentioned the product negatively.
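A sketch of the "who said it" step: keep the author attached to each comment, then report only the authors whose comments mention the product with a negative score. The product, the handles, the comments and the polarity words are all hypothetical.

```python
import re

# Hypothetical polarity words.
POLARITY = {"love": 1, "great": 1, "hate": -1, "awful": -1, "broken": -1}

def negative_mentions(comments, product):
    """Return (author, score) for comments that mention the product negatively."""
    found = []
    for author, text in comments:
        lowered = text.lower()
        if product.lower() not in lowered:
            continue  # the comment is about something else
        score = sum(POLARITY.get(w, 0) for w in re.findall(r"[a-z]+", lowered))
        if score < 0:
            found.append((author, score))
    return found

COMMENTS = [
    ("@anna", "The Acme kettle arrived broken and the support was awful"),
    ("@ben", "I love my Acme kettle"),
    ("@cara", "Awful weather today"),
]
critics = negative_mentions(COMMENTS, "Acme kettle")
```

Only @anna is returned: @ben is positive about the product, and @cara is negative about something unrelated.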
Disambiguation: Dominos in Limerick or Dominos all across Ireland? Since one business can have multiple locations, you need to be able to distinguish by location. This effectively means you are using a disambiguation technique to ferret out the various locations. You can often locate contextual cues within the text or use the geolocation attached to a Foursquare tip.
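The two cues mentioned here (a place name in the text, or the location attached to the post) can be checked in that order. This is a sketch only: the branch table and the `poster_location` parameter are assumptions, and a real service would need a proper gazetteer.

```python
# Hypothetical table of branches keyed by place name.
BRANCHES = {"limerick": "Domino's Limerick", "galway": "Domino's Galway"}

def resolve_branch(text, poster_location=None):
    """Pick a branch from a place name in the text, else from post metadata."""
    lowered = text.lower()
    for place, branch in BRANCHES.items():
        if place in lowered:
            return branch  # contextual cue inside the text itself
    if poster_location and poster_location.lower() in BRANCHES:
        return BRANCHES[poster_location.lower()]  # cue from where it was posted
    return None  # still ambiguous

from_text = resolve_branch("Cold pizza from Dominos in Limerick again")
from_metadata = resolve_branch("Cold pizza again", poster_location="Galway")
```

Returning `None` when neither cue is present keeps an ambiguous mention from being pinned on the wrong branch.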
Disambiguation: Additional Information. Disambiguation provides additional information for the people, places and things mentioned in a document such as links to their official websites, Wikipedia pages, geographical coordinates and more.
Entity Subtypes: Paul O’Connell, a Person and an Athlete. In addition to the most common entity types, such as person or organization, you should seek to identify subtypes. For example, your basic text analysis services will identify Paul O’Connell as a man but you need to know he is a prominent rugby player for Munster. That way, you know he is an influencer.
Quotations Extraction: What Was Said and Who Said It. Entity extraction determines what was said, but quotations extraction tells you who said what by extracting a quote and attributing it back to the person or organization responsible. Knowing that a company was mentioned in a piece of text is important. However, finding out who mentioned the company gives a fuller story. For example, entity extraction can provide you with a list of news articles where a topic and Willie O’Dea were both mentioned, but quotations extraction can provide you with a list of news articles where Willie O’Dea was quoted mentioning that topic.
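A very small sketch of quotation extraction using a regular expression. It only handles the pattern of a straight-quoted passage followed by "said" or "says" and a capitalised name, which is an assumption about how the text is written. The speaker and the quote in the sample are invented.

```python
import re

# Matches: "quoted words" said Firstname Lastname
QUOTE_PATTERN = re.compile(
    r'"([^"]+)"\s*(?:said|says)\s+'
    r"([A-Z][\w']+(?:\s+[A-Z][\w']+)*)"
)

def extract_quotes(text):
    """Return (speaker, quote) pairs found in the text."""
    return [
        (m.group(2), m.group(1).rstrip(",."))
        for m in QUOTE_PATTERN.finditer(text)
    ]

ARTICLE = (
    '"The new bypass will cut journey times," said Mary O\'Brien. '
    "The council met on Monday."
)
quotes = extract_quotes(ARTICLE)
```

The second sentence mentions the council but contains no quotation, so only the attributed quote is returned.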
Author Extraction. For data to be meaningful, your text analysis service must be able to contribute to building an author profile. Comments on web pages, tweets, image collections, and site critiques provide excellent data sets. Author extraction combined with concept extraction, keyword extraction, and entity extraction provides information on what topics specific authors write about.
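Once each text has an author attached, a basic profile is just a count of the keywords that author uses most. A minimal sketch, assuming keyword extraction can be approximated by dropping common stopwords; the authors, posts and stopword list are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Small hypothetical stopword list.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "on", "is", "to", "for", "my", "i"}

def build_profiles(posts, top_n=3):
    """Map each author to the words they use most often."""
    counts = defaultdict(Counter)
    for author, text in posts:
        words = [w for w in re.findall(r"[a-z]+", text.lower())
                 if w not in STOPWORDS]
        counts[author].update(words)
    return {author: [word for word, _ in counter.most_common(top_n)]
            for author, counter in counts.items()}

SAMPLE_POSTS = [
    ("author_a", "Podcasting in education"),
    ("author_a", "Education and podcasting workflow"),
    ("author_b", "Measurement of PR measurement"),
]
profiles = build_profiles(SAMPLE_POSTS, top_n=2)
```

In this sample, author_a's profile is led by podcasting and education, and author_b's by measurement.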
Early work in sentiment analysis includes Turney and Pang, who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder. Pang expanded the basic task of classifying a movie review as either positive or negative to predicting star ratings on either a 3 or a 4 star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale).
Peter Turney (2002). "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews". Proceedings of the Association for Computational Linguistics. pp. 417–424.
Bo Pang; Lillian Lee and Shivakumar Vaithyanathan (2002). "Thumbs up? Sentiment Classification using Machine Learning Techniques". Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). pp. 79–86.
Bo Pang; Lillian Lee (2005). "Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales". Proceedings of the Association for Computational Linguistics (ACL). pp. 115–124.
Benjamin Snyder; Regina Barzilay (2007). "Multiple Aspect Ranking using the Good Grief Algorithm". Proceedings of the Joint Human Language Technology/North American Chapter of the ACL Conference (HLT-NAACL). pp. 300–307.
The FIR Community is at https://plus.google.com/communities/112349929544876511942
MOC (Marketing Over Coffee) is at http://marketingovercoffee.com
KD Paine blogs at http://kdpaine.blogs.com/
Alchemy’s blog is at http://www.alchemyapi.com/blog/
The Moodle Document concerning sentiment analysis is at http://bit.ly/crm-document04 but that might change as the years go on. You can contact the author by using the nick “topgold” on all good social networks. This document was written to support the business curriculum in LIT.ie on 11 October 2013.
