Analyzing natural language feedback using Python
Thomas Aglassinger
V1.0.0
PyDays Vienna
2018
Agenda
●
About me
●
What is SpaCy?
●
What is sentiment detection?
●
Collecting restaurant feedback using TeLLers
●
Jupyter notebook with example analysis
About me
●
Thomas Aglassinger
●
Software developer (e-commerce, finance, health)
●
Master of science in information processing
●
Co-organizer Python user group Graz: https://pygraz.org
●
Homepage: http://www.roskakori.at
What is SpaCy?
●
Natural language processing in Python
●
Simple to use
●
Pragmatic algorithms
●
Fast
●
However: does not (yet) support sentiment detection
●
More information: https://spacy.io/
What is sentiment detection?
●
“systematically identify, extract, quantify, and study affective states
and subjective information”
https://en.wikipedia.org/wiki/Sentiment_analysis
●
Collects opinions from text written in natural language and stores
them in a structured way
●
Different levels:
– Document
– Sentence (possibly multiple per document)
– Aspect (possibly multiple per sentence)
Opinion (de luxe edition)
●
Example: “The Schnitzel is too small for a hungry student”
(Hans Meier, 2018-04-28, 13:12 UTC)
●
Consists of:
– Target entity: Schnitzel (popular Austrian food)
– Aspect: size
– Sentiment: bad
– Opinion holder: Hans Meier
– Posting time: 2018-04-28, 13:12 UTC
– Reason: “too small”
– Qualifier: “for a hungry student” → might be fine for others
●
Reference: Bing Liu, “Sentiment Analysis”, Cambridge University Press, 2015, p. 22f
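The parts listed above map naturally onto a small record type. The following is a minimal sketch, assuming Python dataclasses; the class name `Opinion` and all field names are illustrative and are not taken from the talk or from Liu's book. Reason and qualifier are modeled as optional because not every opinion states them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    """One opinion with the parts from the slide (field names are hypothetical)."""

    target_entity: str
    aspect: str
    sentiment: str
    opinion_holder: str
    posting_time: datetime
    reason: Optional[str] = None
    qualifier: Optional[str] = None


# The example from the slide, expressed as a record.
schnitzel_opinion = Opinion(
    target_entity="Schnitzel",
    aspect="size",
    sentiment="bad",
    opinion_holder="Hans Meier",
    posting_time=datetime(2018, 4, 28, 13, 12, tzinfo=timezone.utc),
    reason="too small",
    qualifier="for a hungry student",
)
print(schnitzel_opinion)
```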
Opinion (simplified)
●
Example: “The Schnitzel is too small for a hungry student”
(Hans Meier, 2018-04-28, 13:12 UTC)
●
Consists of:
– Topic: food
– Sentiment: bad
– Opinion holder: Hans Meier
– Posting time: 2018-04-28, 13:12 UTC
●
Enough to get a grip on
– Pain points
– Unique selling propositions (USP)
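To illustrate why the simplified form is enough, here is a hedged sketch that counts topics per sentiment: frequent "bad" topics hint at pain points, frequent "good" topics at possible USPs. The names `SimpleOpinion` and `count_by_topic` are hypothetical, and all sample data except the first entry is made up.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SimpleOpinion:
    """Simplified opinion from the slide (field names are hypothetical)."""

    topic: str
    sentiment: str
    opinion_holder: str
    posting_time: datetime


def count_by_topic(opinions, sentiment):
    """Count how often each topic occurs with the given sentiment."""
    return Counter(o.topic for o in opinions if o.sentiment == sentiment)


# Made-up sample data; only the first entry comes from the slide.
when = datetime(2018, 4, 28, 13, 12, tzinfo=timezone.utc)
opinions = [
    SimpleOpinion("food", "bad", "Hans Meier", when),
    SimpleOpinion("food", "bad", "guest 2", when),
    SimpleOpinion("service", "good", "guest 3", when),
    SimpleOpinion("ambience", "bad", "guest 4", when),
]

pain_points = count_by_topic(opinions, "bad")
strengths = count_by_topic(opinions, "good")
print(pain_points.most_common())
print(strengths.most_common())
```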
Where to get feedback from?
●
TeLLers mobile web application
●
Stores feedback in database
●
Accessible only by restaurateur
●
No public publishing on the internet
●
Austrian startup
●
https://tellers.co.at/
About the feedback
●
German language
●
Textual answers to questions like
– What did you like about your visit?
– What can we improve to make your next visit even more pleasant?
– Anything else you want to tell us?
Challenges
●
Nowadays NLP is mostly about English and Chinese
●
Limited data
– Region: Graz, Austria
– Time: 6 months
– Amount: about 1000 feedback entries
●
Needs an old-school, carefully handcrafted algorithm
●
No magic pixie dust of machine learning
Algorithm
●
Distributed as Jupyter notebook.
Yeah, hardcore!
●
Fully executable code and example data
●
Play around and reuse!
●
By-product of a master's thesis
Algorithm – basic pipeline
1. Replace abbreviations that confuse SpaCy's sentence detection
2. Unify smiley codes and emojis
3. Replace Austrian slang terms with standard German (surprisingly few)
4. Split feedback into sentences and tokens (SpaCy)
5. Extend tokens with information about topic and rating (using a lexicon)
6. Combine related words, e.g. “nicht besonders gut” (= “not particularly good” = somewhat bad)
7. Reduce each sentence to a single topic and rating
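The seven steps can be sketched with the standard library alone. This is not the code from the notebook: a naive regular expression stands in for SpaCy's sentence and token splitting, every lookup table is a tiny made-up example, and the rating scale (-3 for very bad to +3 for very good) and the modifier rules in `apply_modifier` are assumptions chosen so that “nicht besonders gut” comes out as somewhat bad.

```python
import re

# Tiny, made-up tables for illustration; not the lexicon from the talk.
ABBREVIATIONS = {"z.B.": "zum Beispiel", "bzw.": "beziehungsweise"}
SMILEYS = {":-)": "🙂", ":)": "🙂", ":-(": "🙁", ":(": "🙁"}
SLANG = {"leiwand": "gut"}  # Austrian slang -> standard German
TOPICS = {"schnitzel": "food", "essen": "food", "kellner": "service"}
# Assumed rating scale: -3 (very bad) to +3 (very good).
RATINGS = {"gut": 2, "schlecht": -2, "🙂": 2, "🙁": -2}
INTENSIFIERS = {"besonders", "sehr"}
NEGATIONS = {"nicht"}


def normalize(text):
    """Steps 1 and 2: expand abbreviations and unify smiley codes."""
    for table in (ABBREVIATIONS, SMILEYS):
        for old, new in table.items():
            text = text.replace(old, new)
    return text


def sentences_and_tokens(text):
    """Step 4 stand-in: naive regex splitting instead of SpaCy."""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
        # Step 3: replace slang tokens with standard German.
        yield [SLANG.get(token, token) for token in tokens]


def apply_modifier(rating, modifier):
    """Step 6 helper: adjust a rating by one preceding modifier."""
    sign = 1 if rating > 0 else -1
    if modifier in INTENSIFIERS:
        return sign * min(abs(rating) + 1, 3)
    # Negation: flip the sign; negating an extreme rating weakens it.
    return -sign * (1 if abs(rating) == 3 else abs(rating))


def analyze_sentence(tokens):
    """Steps 5 to 7: look up tokens, combine modifiers, keep one topic and rating."""
    topic, rating, pending = None, None, []
    for token in tokens:
        if token in INTENSIFIERS or token in NEGATIONS:
            pending.append(token)
            continue
        if token in RATINGS and rating is None:
            rating = RATINGS[token]
            for modifier in reversed(pending):  # closest modifier first
                rating = apply_modifier(rating, modifier)
        elif token in TOPICS and topic is None:
            topic = TOPICS[token]
        pending = []
    return topic, rating


def analyze(feedback):
    """Run all steps and return one (topic, rating) pair per sentence."""
    return [analyze_sentence(t) for t in sentences_and_tokens(normalize(feedback))]


results = analyze("Das Schnitzel war nicht besonders gut. Der Kellner war leiwand :-)")
print(results)
```

Keeping only the first topic and the first rating per sentence is the simplest possible reduction for step 7; a real implementation would need a rule for sentences that mention several topics.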
Summary
●
Lexicon-based sentiment detection on a sentence level can
be implemented comparatively easily using SpaCy as a base
●
Manual pre-analysis of existing data required
●
“Good enough” results to identify areas of interest
