User-Operated Model-Building Systems - Data Science: Inconvenient Truths

Event: https://www.eventbrite.com/e/data-science-inconvenient-truths-tickets-15582414421

Audio: https://soundcloud.com/clarealaska/user-operated-model-building-systems-data-science-inconvenient-truths

  1. PROOF OF FAILURE Clare Corthell Machine Learning Engineer & Data Scientist @clarecorthell www.datasciencemasters.org
  2. Deal Intelligence Platform: find and evaluate private companies
  3. Machine Learning Need • Very little structured information • Disaggregated data • Need for categorization => Data Structuring & Creation
  4. WHAT DOES THE COMPANY DO? “industry” dimension
  5. INDUSTRIES AS BINARY CATEGORIES you’re in or out
  6. inputs outputs model decision: • reinforce • ship
  7. PERFECTION! UTOPIA! HUMAN INFERENCE WITHOUT HUMANS! and 60 fewer people on payroll - $4.2m / yr
  8. ANALYSTS the user is not the database
  9. EXAMPLE 1: WIND TURBINES WindTurbines. definition
  10. EXAMPLE 2: WEARABLES on your body? electronic? new materials? what are they? definition
  11. REINFORCEMENT doesn’t always work
  12. REINFORCEMENT PITFALLS - (Technical) Overfitting - Humans have to question their own assumptions - Dimensional encoding issues (is this expressible in features?) - Human definitions are inadequate
  13. SOOOOOOOOOOOOOOO… SVM > neural nets things I’ve heard recently
  14. WHY IS SVM BETTER? Feature inspectability • sometimes for debugging • mostly for humans Humans don’t know what transformation the black box exerts on inputs. But sometimes, they need to know. Their investors, their customers, their data analysts, their operators, their CEO: all want to know.
  15. MONSTERS IN THE BLACK BOX because
  16. HUMANS SHOULD BE HUMANS COMPUTERS SHOULD BE COMPUTERS. Sometimes, our identities get a little mixed up.
  17. 1. Set Expectations: make sure the organization understands failures. 2. Reduce the “Trickery”*: We build systems for humans. They need to understand how the levers and knobs affect the outcome. *h/t Sean Taylor
  18. datasciencemasters.org github@clarecorthell.com @clarecorthell mattermark.com
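
Slides 5 and 14 describe an "in or out" industry label and argue for models whose features humans can inspect. The slides do not show how the model was built, so the following is only a minimal sketch under stated assumptions: a linear SVM trained by subgradient descent on hinge loss, in plain Python, on made-up company descriptions. The data, the function names, and the hyperparameters (`epochs`, `lr`, `reg`) are all hypothetical. The point is that each learned weight maps to one word, so an analyst can read off what pushes a company in or out.

```python
# Toy, made-up data: (company description, +1 if "in" the wearables industry, -1 if "out").
EXAMPLES = [
    ("smart watch tracks heart rate on your wrist", 1),
    ("fitness band worn on your body", 1),
    ("electronic ring monitors sleep", 1),
    ("offshore wind turbine blades", -1),
    ("payroll software for small business", -1),
    ("turbine maintenance for wind farms", -1),
]


def build_vocab(texts):
    """Sorted list of every distinct word; one feature per word."""
    return sorted({word for text in texts for word in text.split()})


def featurize(text, vocab):
    """Binary bag-of-words vector: 1.0 if the word occurs in the text."""
    words = set(text.split())
    return [1.0 if word in words else 0.0 for word in vocab]


def score(w, b, x):
    """Signed distance-like score; the sign is the in/out decision."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(rows, labels, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on L2-regularized hinge loss (assumed settings)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            violated = y * score(w, b, x) < 1  # inside the margin or misclassified
            for i, xi in enumerate(x):
                grad = reg * w[i] - (y * xi if violated else 0.0)
                w[i] -= lr * grad
            if violated:
                b += lr * y
    return w, b


texts = [text for text, _ in EXAMPLES]
labels = [label for _, label in EXAMPLES]
vocab = build_vocab(texts)
rows = [featurize(text, vocab) for text in texts]
w, b = train_linear_svm(rows, labels)


def predict(text):
    """Return +1 ("in") or -1 ("out") for a description."""
    return 1 if score(w, b, featurize(text, vocab)) >= 0 else -1


# Inspectability: sort words by learned weight and show both extremes.
ranked = sorted(zip(vocab, w), key=lambda pair: pair[1])
print("pushes toward 'out':", ranked[:3])
print("pushes toward 'in': ", ranked[-3:])
```

A kernelized SVM or a neural net would not offer this one-weight-per-word reading, which is the trade-off slide 14 points at.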
