This talk outlines how to analyze natural-language restaurant feedback using Python. It is accompanied by a Jupyter notebook that shows how to use spaCy to split long texts into sentences and tokens and to access the lemma of each token. Next, a lexicon is used to match the tokens and assign a topic and rating to each sentence. While the presented algorithm is quite simple to implement and understand, it can resolve constructs like "not very tasty" to a sentiment of "somewhat bad" despite the positive word "tasty".
3. About me
●
Thomas Aglassinger
●
Software developer (e-commerce, finance, health)
●
Master of Science in Information Processing
●
Co-organizer Python user group Graz: https://pygraz.org
●
Homepage: http://www.roskakori.at
4. What is spaCy?
●
Natural language processing in Python
●
Simple to use
●
Pragmatic algorithms
●
Fast
●
However: does not (yet) support sentiment detection
●
More information: https://spacy.io/
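The basics the talk relies on can be sketched in a few lines. This is a minimal, illustrative example using a blank German pipeline so it runs without downloading a trained model; the actual notebook presumably uses a trained model such as `de_core_news_sm`, which additionally provides lemmas and part-of-speech tags.

```python
import spacy

# Blank German pipeline: tokenization works out of the box, and the
# rule-based "sentencizer" component adds sentence boundaries.
# (A trained model like de_core_news_sm would also fill token.lemma_.)
nlp = spacy.blank("de")
nlp.add_pipe("sentencizer")

doc = nlp("Das Essen war lecker. Der Service war leider langsam.")
for sentence in doc.sents:
    print([token.text for token in sentence])
```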
5. What is sentiment detection?
●
„systematically identify, extract, quantify, and study affective states
and subjective information“
https://en.wikipedia.org/wiki/Sentiment_analysis
●
Collects opinions from text written in natural language and stores
them in a structured way
●
Different levels:
– Document
– Sentence (possibly multiple per document)
– Aspect (possibly multiple per sentence)
6. Opinion (de luxe edition)
●
Example: “The Schnitzel is too small for a hungry student”
(Hans Meier, 2018-04-28, 13:12 UTC)
●
Consists of:
– Target entity: Schnitzel (popular Austrian food)
– Aspect: size
– Sentiment: bad
– Opinion holder: Hans Meier
– Posting time: 2018-04-28, 13:12 UTC
– Reason: “too small”
– Qualifier: “for a hungry student” → might be fine for others
●
Reference: Bing Liu, “Sentiment Analysis”, Cambridge Press, 2015, p. 22f
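These components map naturally onto a small data structure. The sketch below is illustrative only; the field names are my own, not taken from the talk's notebook.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Opinion:
    """Components of a full opinion as on the slide above."""
    target_entity: str          # e.g. "Schnitzel"
    aspect: str                 # e.g. "size"
    sentiment: str              # e.g. "bad"
    holder: str                 # who stated the opinion
    posted_at: str              # when it was posted
    reason: Optional[str] = None      # why, if stated
    qualifier: Optional[str] = None   # limits the scope, if stated

schnitzel_opinion = Opinion(
    target_entity="Schnitzel",
    aspect="size",
    sentiment="bad",
    holder="Hans Meier",
    posted_at="2018-04-28 13:12 UTC",
    reason="too small",
    qualifier="for a hungry student",
)
```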
7. Opinion (simplified)
●
Example: “The Schnitzel is too small for a hungry student”
(Hans Meier, 2018-04-28, 13:12 UTC)
●
Consists of:
– Topic: food
– Sentiment: bad
– Opinion holder: Hans Meier
– Posting time: 2018-04-28, 13:12 UTC
●
Enough to get a grip on
– Pain points
– Unique sales propositions (USP)
8. Where to get feedback from?
●
TeLLers mobile web application
●
Stores feedback in database
●
Accessible only by the restaurateur
●
No public publishing on the internet
●
Austrian startup
●
https://tellers.co.at/
9. About the feedback
●
German language
●
Textual answers to questions like
– What did you like about your visit?
– What can we improve to make your next visit even more pleasant?
– Anything else you want to tell us?
10. Challenges
●
Nowadays NLP is mostly about English and Chinese
●
Limited data
– Region: Graz, Austria
– Time: 6 months
– Amount: about 1,000 feedback entries
●
Requires an old-school, carefully handcrafted algorithm
●
No magic pixie dust of machine learning
11. Algorithm
●
Distributed as Jupyter notebook.
Yeah, hardcore!
●
Fully executable code and example data
●
Play around and reuse!
●
By-product of master's thesis
12. Algorithm – basic pipeline
1. Replace abbreviations that confuse spaCy's sentence detection
2. Unify smiley codes and emojis
3. Replace Austrian slang terms with proper German (surprisingly few)
4. Split feedback into sentences and tokens (spaCy)
5. Extend tokens with information about topic and rating (using a lexicon)
6. Combine related words, e.g. „nicht besonders gut“ (= “not particularly good” = somewhat bad)
7. Reduce each sentence to a single topic and rating
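Steps 5 to 7 can be sketched as a toy lexicon matcher. The lexicon entries, the numeric rating scale, and the negation rule below are illustrative assumptions, not the talk's actual implementation; the input is assumed to be the lemmas produced by spaCy in step 4.

```python
# Toy lexicon: lemma -> (topic, rating on a scale from -2 to +2).
LEXICON = {
    "gut": ("general", 2),       # "good"
    "lecker": ("food", 2),       # "tasty"
    "langsam": ("service", -1),  # "slow"
}
NEGATIONS = {"nicht"}                            # "not"
INTENSIFIERS = {"besonders": 1.5, "sehr": 1.5}   # "particularly", "very"

def rate_sentence(lemmas):
    """Reduce a lemmatized sentence to a single (topic, rating)."""
    topic, rating = None, 0.0
    negated, factor = False, 1.0
    for lemma in lemmas:
        if lemma in NEGATIONS:
            negated = True
        elif lemma in INTENSIFIERS:
            factor *= INTENSIFIERS[lemma]
        elif lemma in LEXICON:
            topic, score = LEXICON[lemma]
            score *= factor
            if negated:
                # "nicht besonders gut" becomes "somewhat bad":
                # flip the sign and dampen the intensified score.
                score = -score / 2
            rating = score
            negated, factor = False, 1.0
    return topic, rating
```

For example, `rate_sentence(["nicht", "besonders", "gut"])` yields a mildly negative rating rather than the strongly positive one that the word „gut“ alone would suggest.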
13. Summary
●
Lexicon-based sentiment detection on a sentence level can
be implemented comparatively easily with spaCy as the base
●
Manual pre-analysis of existing data required
●
„Good enough“ results to identify areas of interest