Language Intelligence
Why Sentiment Analysis is a Market for Lemons … and How to Fix it
Robert Munro
New technical and economic models for improving sentiment analysis and other machine learning services.
Presented at "Data Day Texas", January 2016.


  1. Language Intelligence
     Why Sentiment Analysis is a Market for Lemons … and How to Fix it
     Robert Munro
  2. With thanks!
     Gary King & Jana Thompson. Other Idibon people here: Michelle Casbon & Nick Gaylord
  3. What is a market for lemons?
       • Information asymmetry between buyers and sellers, leaving only "lemons" behind (George Akerlof)
       • Buyers cannot distinguish good from bad products
       • Prices are equally low for all products
       • This adverse selection on price drives the high-quality products from the market
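
The unravelling described in slide 3 can be sketched with a deliberately simplified model. The assumptions here are not from the slides: buyers pay the average quality of whatever is still on offer, sellers withdraw once that price falls below their own quality, and the quality values (300 to 900) are made up for illustration.

```python
def unravel(qualities):
    """Simplified lemons dynamic (illustrative assumption, not Akerlof's full model).

    Buyers cannot tell products apart, so they pay the average quality
    of everything still on offer. Sellers whose quality exceeds that
    price leave. Repeat until nobody else leaves.
    """
    market = sorted(qualities)
    prices = []
    while True:
        price = sum(market) / len(market)
        prices.append(price)
        staying = [q for q in market if q <= price]
        if staying == market:
            return market, prices
        market = staying


# Hypothetical product qualities, expressed as what each is truly worth.
survivors, price_history = unravel([300, 500, 700, 900])
print(survivors)      # only the lowest-quality product remains
print(price_history)  # the price falls each round
```

In this toy run the price drops round by round and only the lowest-quality product is left, which is the outcome the slide warns about.
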
  4. Competition is not increasing accuracy
       • 100+ companies offering some form of sentiment analysis
       • Accuracy hovering around 70% for real-world applications for almost a decade
  5. The most honest sentiment analysis results you will see

       System         Accuracy  F-Score  Recall            Precision         F-Score
                                         Pos   Neg   Neu   Pos   Neg   Neu   Pos   Neg   Neu
       Semantria      0.59      0.59     0.56  0.47  0.78  0.68  0.80  0.45  0.62  0.59  0.57
       MonkeyLearn    0.50      0.38*    0.84  0.54  0.00  0.45  0.60  0.00  0.59  0.57  0.00
       MetaMind       0.66      0.66     0.68  0.46  0.88  0.78  0.88  0.50  0.73  0.60  0.64
       Idibon Public  0.68      0.67     0.76  0.75  0.49  0.66  0.69  0.72  0.71  0.72  0.58

       • Even within the best results for one domain, there is no clear leader when broken down by category
       • All systems could have best results in other domains
       • All could adapt here: MonkeyLearn had errors with the ‘Neutral’ category, but we are sure they could update their models
     Source: Sentiment 140 corpus, 3-way sentiment on social data: http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip
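
The columns in slide 5 are linked by the standard F-score definition (the harmonic mean of precision and recall). The overall F-score appears to be the unweighted average of the three per-class F-scores; that reading is an assumption, not something the slide states, but it matches the reported numbers to within rounding. A minimal Python sketch using the MonkeyLearn row:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean over classes (assumed to be how the overall F-score was built)."""
    return sum(values) / len(values)


# MonkeyLearn row from slide 5, as (precision, recall) per class.
monkeylearn = {
    "positive": (0.45, 0.84),
    "negative": (0.60, 0.54),
    "neutral": (0.00, 0.00),
}

per_class = {label: f_score(p, r) for label, (p, r) in monkeylearn.items()}
overall = macro_average(list(per_class.values()))

for label, value in per_class.items():
    print(f"{label}: {value:.2f}")
print(f"overall: {overall:.2f}")
```

A class with zero recall contributes an F-score of zero to the average, which is why one failing category pulls the overall figure down so sharply.
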
  6. Data beats algorithms; feedback beats data
     [Chart: precision, recall and F-value for distinguishing “Ford” the company from people called “Ford”. Conditions: linear model, deep learning, in-domain training, 10 mins analyst feedback. Data labels: 0.457, 0.473, 0.615, 0.948. Axis: 0% to 100%.]
  7. Consumers are uncertain
       • When consumers try out-of-domain analysis, they lose confidence from the poor results.
       • Domain-dependence means that even bad models will be accurate in some areas
       • Consumers can only evaluate anecdotally or by precision, not recall
       • Uncertainty prevails
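
Slide 7's point about precision versus recall can be made concrete. Precision only needs someone to check the items a system flagged; recall needs a label for every item, including the ones the system missed. The sketch below uses made-up data and two hypothetical systems to show how a buyer who only inspects flagged items cannot tell them apart.

```python
def precision(gold, flagged):
    """Share of flagged items that are truly negative.

    Only the labels of the flagged items are needed.
    """
    return sum(gold[i] == "negative" for i in flagged) / len(flagged)


def recall(gold, flagged):
    """Share of truly negative items that were flagged.

    Needs a gold label for every item, flagged or not.
    """
    total_negative = sum(label == "negative" for label in gold)
    return sum(gold[i] == "negative" for i in flagged) / total_negative


# Hypothetical corpus: 10 negative posts followed by 10 positive posts.
gold = ["negative"] * 10 + ["positive"] * 10

# Hypothetical systems, described by the item indices they flag as negative.
system_a = {0, 1}           # cautious: flags very little
system_b = set(range(8))    # flags most of the negative posts

for name, flagged in (("A", system_a), ("B", system_b)):
    print(name, precision(gold, flagged), recall(gold, flagged))
```

Both systems look perfect to someone spot-checking their output, yet one of them misses most of the negative posts.
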
  8. Market forces are not breeding innovation
       • Can’t innovate through code alone
       • More training data!
       • But low price points mean low margins
       • Lack of capital to find & label enough training data
  9. The Solution
       • A different economic model for useful sentiment analysis:
       • Data-sharing for more accurate training data
       • Protecting sensitive data from public release
  10. [Diagram]
       • Components: Machine learning, Optimization, Human annotation, Cloud prediction engine, On-site prediction engine, Actionable intelligence
       • Flow labels: Copy & Sync Models; App Requests; Ambiguous, Novel & Interesting Items
       • Legend: Internal Data Flow, Hybrid Model Data Flow, Application Data Flow, firewall
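
One plausible reading of the slide 10 diagram is that the on-site prediction engine answers app requests locally and forwards only ambiguous items for human annotation. The sketch below illustrates that reading only. The class names, the confidence threshold of 0.7, and the use of low confidence as the sole test for "ambiguous" are all assumptions; the slides do not describe how items are selected.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    text: str
    label: str
    confidence: float


@dataclass
class OnSiteEngine:
    """Hypothetical stand-in for the on-site prediction engine."""

    threshold: float = 0.7  # assumed cut-off for "ambiguous"
    annotation_queue: list = field(default_factory=list)

    def handle(self, prediction):
        """Answer the app request; queue low-confidence items for annotation."""
        if prediction.confidence < self.threshold:
            # Only these items would leave the site for human labeling.
            self.annotation_queue.append(prediction.text)
        return prediction.label


engine = OnSiteEngine()
first = engine.handle(Prediction("love the new model", "positive", 0.95))
second = engine.handle(Prediction("well that was something", "neutral", 0.40))
print(first, second, engine.annotation_queue)
```

Under this reading, confident predictions never leave the site, which fits the deck's aim of sharing training data while protecting sensitive data.
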
  11. The Benefits
       • Multiple organizations can share in the benefits of better sentiment analysis, without sacrificing privacy
       • Single point of human-contact: no expensive duplicate manual labeling of data
       • Keeps lemons out of the market
  12. Idibon Public: our implementation
       • Free product, offered in addition to our enterprise Idibon Studio and Idibon Terminal solutions
  13. Applies to NLP and Machine Learning more broadly
      Every human communication
       • Any task can be bundled this way
       • Allows margins for use cases that were not otherwise viable
       • … including the full diversity of languages, priced out when everyone started in English
  14. QUESTIONS?
      Robert Munro
