Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.
Debugging Machine Learning.
Mostly for profit but with a bit of fun too!
Michał Łopuszyński
PyData Warsaw, 19.10.2017
About me
In my previous professional life, I was a theoretical physicist.
I got a PhD in solid state physics
•
For the las...
Agenda
4 failure modes of ML systems•
9 simple hints what to do•
Hey, my
ML system
does not
work at all...
Hint #1
Check your code
AKA it is engineering, stupid!
Write tests
Do not strive for 100% coverage, partial coverage is infinitely better
then none!
•
In Python, doctests are yo...
Test your code with Monte Carlo / synthetic data
Try to generate a trivial case for your
ML system
•
This tests the whole ...
Code style
Single responsibility principle•
Do not repeat yourself (DRY!)•
Have and apply coding standard (PEP8)•
Consider...
Naming
Minimal requirement - be consistent!•
Interesting software development books, offering chapters on naming•
Freely a...
Hint #2
Check your data
Data quality audits are difficult
Happy families are all alike; every unhappy family
is unhappy in its own way.
Leo Tolsto...
Data quality
Beware, your data providers usually
overestimate the data quality
•
Think of outliers, missing values (and ho...
OK, my ML system
works, but I think it
should perform
better...
Hint #3
Examine your features
Features
Features make a difference!•
Be creative with your features•
Try meaningful transformations,
combinations (produc...
Hint #4
Examine your data points
Data points
Find difficult data points! (DDP)•
DDP = notoriously misclassified (or high error) cases
in your cross-validat...
Influence functions
Influence functions
Best Paper Award
ICML 2017
Data points
Get more data!• ID
P1
P2
P3
P4
P5
Trick 1. Extend your set with artificial data
E.g., data augmentation in ima...
Hint #5
Examine your model
Why my model predicts what it predicts? (philosophical slide)
How do you answer why questions?•
Inspiring homework: watch ...
Model introspection
You can answer thy why question, only for very simple models
(e.g., linear model, basic decision trees...
Complex model introspection
LIME algorithm = Local Interpretable Model-agnostic Explanations
ACM SIGKDD International Conf...
Complex model introspection – practical issues
Lime authors provided open source Python implementation as lime package•
ht...
Visualizing models
Display model in
a data space
Look at collection of
models at once
Explore the process of
model fitting...
So my ML system
works on test data,
but you tell me it
fails in production?
Hint #6
Watch out for overfitting
Overfitting
If you torture the data long enough,
it will confess.
Roald Coase
Hint #7
Watch out for data leakage
Data leakage
Some time ago, I used to thing data leaks are trivial to avoid•
They are not! (Look at number of Kaggle compe...
Hint #8
Watch out for covariate shift
What is covariate shift?
Training data
y
X
What is covariate shift?
Model
Training data
X
y
What is covariate shift?
Model
Noiseless reality
Training data
y
X
What is covariate shift?
Model
Noiseless reality
Training data
Production data
(Test data)
y
X
Covariate shift
Unlike overfitting and data leakage, it is easier to detect•
Method: Try to build classifier differentiati...
The quality of my
super ML system
deteriorates with
time, really?
Really really?
ML system in production
2009: Hooray we can predict flu
epidemics from Google query data!
2014: Hmm... Can we?
ML system in production
NIPS 2015 paper
Hint #9
Remember monitoring & maintenance
AKA it is engineering again, stupid!
5
Thank you!
Questions?
@lopusz
Prochain SlideShare
Chargement dans…5
×

Debugging machine-learning

4 129 vues

Publié le

Debugging machine-learning

My presentation from PyDataWarsaw 2017

Publié dans : Données & analyses

Debugging machine-learning

  1. 1. Debugging Machine Learning. Mostly for profit but with a bit of fun too! Michał Łopuszyński PyData Warsaw, 19.10.2017
  2. 2. About me In my previous professional life, I was a theoretical physicist. I got a PhD in solid state physics • For the last 5 years I work as a Data Scientist in ICM, University of Warsaw•
  3. 3. Agenda 4 failure modes of ML systems• 9 simple hints what to do•
  4. 4. Hey, my ML system does not work at all...
  5. 5. Hint #1 Check your code AKA it is engineering, stupid!
  6. 6. Write tests Do not strive for 100% coverage, partial coverage is infinitely better then none! • In Python, doctests are your friends• "Hidden" benefits of tests• Better code structure• Executable documentation• Test the fragile parts first•
  7. 7. Test your code with Monte Carlo / synthetic data Try to generate a trivial case for your ML system • This tests the whole pipeline (transforming/training) and allows for exploration of your system performance under unusual conditions • If you have generative model, prepare Monte Carlo data from assumed distributions • Perturb the original data, by generating the output with known and learnable prescription •
  8. 8. Code style Single responsibility principle• Do not repeat yourself (DRY!)• Have and apply coding standard (PEP8)• Consider using a linter• Short functions and methods• https://xkcd.com/844/ pycodestyle checker (formerly pep8)• yapf formater• pylint, pyflakes, pychecker
  9. 9. Naming Minimal requirement - be consistent!• Interesting software development books, offering chapters on naming• Freely available chapter on names
  10. 10. Hint #2 Check your data
  11. 11. Data quality audits are difficult Happy families are all alike; every unhappy family is unhappy in its own way. Leo Tolstoy Like families, tidy datasets are all alike but every messy dataset is messy in its own way. Hadley Wickham H. Wickham, Tidy Data, JSS 59 (2014), doi: 10.18637/jss.v059.i10 Images credit - Wikipedia
  12. 12. Data quality Beware, your data providers usually overestimate the data quality • Think of outliers, missing values (and how the are represented)• Understand your data• Do exploratory data analysis• Visualize, visualize, visualize• Talk to the domain expert• Is your data correct, complete, coherent, stationary (seasonality!), deduplicated, up-to-date, representative (unbiased) •
  13. 13. OK, my ML system works, but I think it should perform better...
  14. 14. Hint #3 Examine your features
  15. 15. Features Features make a difference!• Be creative with your features• Try meaningful transformations, combinations (products/ratios), decorrelation... • Think of good representations for non-tabular data (text, time-series) • Make conscious decision about missing values• Understand what features are important for your model• Use ML models offering feature ranking• Use feature selection methods• ID F1 F2 F3
  16. 16. Hint #4 Examine your data points
  17. 17. Data points Find difficult data points! (DDP)• DDP = notoriously misclassified (or high error) cases in your cross-validation loop for large variety of models • ID P1 P2 P3 P4 P5 Examine DDPs, understand them!• In the easiest case, remove DDPs from the dataset (think outliers, mislabeled examples) •
  18. 18. Influence functions
  19. 19. Influence functions Best Paper Award ICML 2017
  20. 20. Data points Get more data!• ID P1 P2 P3 P4 P5 Trick 1. Extend your set with artificial data E.g., data augmentation in image processing, SMOTE algorithm for imbalanced datasets • Good performance booster, rarely applicable• Trick 2. Generate automatically noisy labeled data set by heuristics, e.g. distant supervision in NLP (requires unlabeled data!) • Trick 3. Semisupervised learning methods self-training and co-training (requires unlabeled data!) •
  21. 21. Hint #5 Examine your model
  22. 22. Why my model predicts what it predicts? (philosophical slide) How do you answer why questions?• Inspiring homework: watch Richard Feynman, Fun to imagine on magnets (youtube)•
  23. 23. Model introspection You can answer thy why question, only for very simple models (e.g., linear model, basic decision trees) • Sometimes, it is instructive to run such a simple model on your dataset, even though it does not provide top-level performance • You can boost your simple model by feeding it with more advanced (non-linearly transformed) features •
  24. 24. Complex model introspection LIME algorithm = Local Interpretable Model-agnostic Explanations ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)
  25. 25. Complex model introspection – practical issues Lime authors provided open source Python implementation as lime package• http://eli5.readthedocs.io/en/latest/tutorials• Another option eli5 package, aimed more generally at model introspection and explanation (includes lime implementation) •
  26. 26. Visualizing models Display model in a data space Look at collection of models at once Explore the process of model fitting The ASA Data Science Journal 8, 203 (2015), doi: 10.1002/sam.11271
  27. 27. So my ML system works on test data, but you tell me it fails in production?
  28. 28. Hint #6 Watch out for overfitting
  29. 29. Overfitting If you torture the data long enough, it will confess. Roald Coase
  30. 30. Hint #7 Watch out for data leakage
  31. 31. Data leakage Some time ago, I used to thing data leaks are trivial to avoid• They are not! (Look at number of Kaggle competitions flawed by Data Leakage)• You may introduce them yourself E.g. meaningful identifiers, past & future separation in time series • You may receive them in the data from your provider• Good paper•
  32. 32. Hint #8 Watch out for covariate shift
  33. 33. What is covariate shift? Training data y X
  34. 34. What is covariate shift? Model Training data X y
  35. 35. What is covariate shift? Model Noiseless reality Training data y X
  36. 36. What is covariate shift? Model Noiseless reality Training data Production data (Test data) y X
  37. 37. Covariate shift Unlike overfitting and data leakage, it is easier to detect• Method: Try to build classifier differentiating train from production (test). If you succeed, you very likely have a problem • Basic remedy – reweighting data points. Give production-like data higher impact on your model •
  38. 38. The quality of my super ML system deteriorates with time, really? Really really?
  39. 39. ML system in production 2009: Hooray we can predict flu epidemics from Google query data! 2014: Hmm... Can we?
  40. 40. ML system in production NIPS 2015 paper
  41. 41. Hint #9 Remember monitoring & maintenance AKA it is engineering again, stupid! 5
  42. 42. Thank you! Questions? @lopusz

×