
VSSML17 L6. Time Series and Deepnets

Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 6: Time Series and Deepnets. By Charles Parker (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017



  1. Valencian Summer School in Machine Learning, 3rd edition, September 14-15, 2017
  2. Time Series Analysis (BigML, Inc.)
  3. Beyond Supervision
     • Traditional machine learning data is assumed to be IID:
       • Independent: points carry no information about each other's class
       • Identically distributed: all points come from the same distribution
     • But what if you want to predict just the next value in a sequence? Is all lost?
     • Applications:
       • Predicting battery life from charge-discharge cycles
       • Predicting sales for the next day/week/month
  4. Machine Learning Data

        Color    Mass  Type
        red      11    pen
        green    45    apple
        red      53    apple
        yellow   0     pen
        blue     2     pen
        green    422   pineapple
        yellow   555   pineapple
        blue     7     pen

     Discovering patterns within data:
     • Color = "red" ⇒ Mass < 100
     • Type = "pineapple" ⇒ Color ≠ "blue"
     • Color = "blue" ⇒ PPAP = "pen"
  5. Machine Learning Data (reshuffled)

        Color    Mass  Type
        red      53    apple
        blue     2     pen
        red      11    pen
        blue     7     pen
        green    45    apple
        yellow   555   pineapple
        green    422   pineapple
        yellow   0     pen

     Patterns remain valid despite reshuffling:
     • Color = "red" ⇒ Mass < 100
     • Type = "pineapple" ⇒ Color ≠ "blue"
     • Color = "blue" ⇒ PPAP = "pen"
  6. Time Series Data

        Year  Pineapple Harvest (tons)
        1986   50.74
        1987   22.03
        1988   50.69
        1989   40.38
        1990   29.80
        1991    9.90
        1992   73.93
        1993   22.95
        1994  139.09
        1995  115.17
        1996  193.88
        1997  175.31
        1998  223.41
        1999  295.03
        2000  450.53

     [Chart: harvest by year, 1986-2000, showing a clear upward trend]
  7. Time Series Data (shuffled)

        Year  Pineapple Harvest (tons)
        1986  139.09
        1987  175.31
        1988    9.91
        1989   22.95
        1990  450.53
        1991   73.93
        1992   40.38
        1993   22.03
        1994  295.03
        1995   50.74
        1996   29.80
        1997  223.41
        1998  115.17
        1999  193.88
        2000   50.69

     [Chart: the same values in shuffled order; the trend disappears]
     Patterns are invalid after shuffling.
  8. Prediction: use the data from the past to predict the future.
  9. Exponential Smoothing
  10. Exponential Smoothing [Chart: the weight given to each past observation decays exponentially with its lag, from about 0.2 at lag 1 toward 0 by lag 13]
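The decaying weights in that chart follow from the recursive form of simple exponential smoothing: the level is a running weighted average in which the observation at lag k receives weight roughly alpha * (1 - alpha)^k. A minimal sketch, with alpha = 0.2 chosen to match the chart's weight scale and the series taken from the pineapple-harvest slide:

```python
def exp_smooth(series, alpha):
    """Simple exponential smoothing: level update l_t = alpha*y_t + (1-alpha)*l_{t-1}.
    The final level serves as the one-step-ahead forecast."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def lag_weights(alpha, n):
    """The same averaging written as explicit per-lag weights alpha*(1-alpha)^k."""
    return [alpha * (1 - alpha) ** k for k in range(n)]

series = [50.74, 22.03, 50.69, 40.38, 29.80]
print(exp_smooth(series, alpha=0.2))
print(lag_weights(0.2, 5))  # geometrically decaying, as in the chart
```

Note that the weights sum to (almost) one over a long enough history, which is what makes the smoothed value a proper weighted average.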
  11. Trend [Charts: additive trend (roughly constant growth per step) vs. multiplicative trend (growth proportional to the current level)]
  12. Seasonality [Charts: additive vs. multiplicative seasonal patterns]
  13. Error [Charts: additive vs. multiplicative error]
  14. Model Types (Error, Trend, Seasonality)

        Trend \ Seasonality       None            Additive        Multiplicative
        None                      A,N,N   M,N,N   A,N,A   M,N,A   A,N,M   M,N,M
        Additive                  A,A,N   M,A,N   A,A,A   M,A,A   A,A,M   M,A,M
        Additive + Damped         A,Ad,N  M,Ad,N  A,Ad,A  M,Ad,A  A,Ad,M  M,Ad,M
        Multiplicative            A,M,N   M,M,N   A,M,A   M,M,A   A,M,M   M,M,M
        Multiplicative + Damped   A,Md,N  M,Md,N  A,Md,A  M,Md,A  A,Md,M  M,Md,M

     Each cell lists the additive-error (first) and multiplicative-error (second) variant.
     Example: M,N,A = Multiplicative error, No trend, Additive seasonality.
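The three-letter names in the table encode the (Error, Trend, Seasonality) choices. A tiny decoder makes the naming concrete; the dictionaries below just restate the table's row and column labels:

```python
ERROR = {"A": "additive", "M": "multiplicative"}
TREND = {"N": "none", "A": "additive", "Ad": "additive damped",
         "M": "multiplicative", "Md": "multiplicative damped"}
SEASON = {"N": "none", "A": "additive", "M": "multiplicative"}

def decode_ets(name):
    """Split an ETS model name like 'M,N,A' into its three components."""
    e, t, s = name.split(",")
    return (ERROR[e], TREND[t], SEASON[s])

print(decode_ets("M,N,A"))  # the slide's example: multiplicative error, no trend, additive seasonality
```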
  15. Evaluating Model Fit
     • AIC: Akaike Information Criterion; trades off accuracy against model complexity
     • AICc: like the AIC, but with a correction for small sample sizes
     • BIC: Bayesian Information Criterion; like the AIC, but penalizes large numbers of parameters more harshly
     • R-squared: raw performance; the number of model parameters isn't considered
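The first three criteria all start from the model's log-likelihood and differ only in how they penalize the parameter count k. A sketch of the standard formulas; the example log-likelihoods below are made up purely for illustration:

```python
import math

def aic(log_likelihood, k):
    """AIC = 2k - 2 ln L: lower is better; penalizes the parameter count k."""
    return 2 * k - 2 * log_likelihood

def aicc(log_likelihood, k, n):
    """AICc adds a correction term that matters when the sample size n is small."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)

def bic(log_likelihood, k, n):
    """BIC = k ln n - 2 ln L: penalizes extra parameters more harshly than AIC
    once n exceeds about e^2 observations."""
    return k * math.log(n) - 2 * log_likelihood

# Two hypothetical fits on n = 15 yearly observations:
ll_simple, k_simple = -52.0, 2    # e.g. a small A,N,N-style model
ll_complex, k_complex = -50.5, 5  # e.g. a richer A,A,A-style model
print(bic(ll_simple, k_simple, 15) < bic(ll_complex, k_complex, 15))
```

Here the complex model's slightly better likelihood does not pay for its extra parameters under the BIC, so the simpler model wins.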
  16. Linear Splitting [Tables: the same 1986-1995 harvest data split two ways for evaluation. A random split scatters held-out rows throughout the series; a linear split holds out the final years, so the model is tested only on data that comes after its training data]
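The linear split is easy to state in code: train on the earliest fraction of rows and test on the rest, never shuffling. A minimal sketch:

```python
def linear_split(rows, train_fraction=0.8):
    """Chronological split: train on the earliest rows, test on the latest.
    A random split would leak future values into the training set, which is
    exactly what time series evaluation must avoid."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

years = list(range(1986, 1996))
train, test = linear_split(years)
print(train[-1], test)  # trains through 1993, tests on 1994-1995
```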
  17. Deep Neural Networks
  18. BigML Deepnets
     • Not done yet!
       • I'm the tech lead, so I'm the reason we don't have a demo for this (sorry).
       • Check out our next release webinar!
     • Let's still have a chat:
       • Deep learning is regarded in the media as some sort of strange robot messiah, destined to either save or destroy us all
       • What's good about deep learning, and why is it so popular now?
       • How much is hype, and what are some of the major issues with it?
  19. Going Further
     • Trees
       • Pro: massive representational power that expands as the data gets larger; efficient search through this space
       • Con: difficult to represent smooth functions and functions of many variables
       • Ensembles mitigate some of these difficulties
     • Logistic Regression
       • Pro: some smooth, multivariate functions are not a problem; fast optimization of the chosen objective
       • Con: parametric; if the decision boundary is nonlinear, tough luck
     • Can these be mitigated?
  20. LR Level Up [Diagram: logistic regression drawn as a network, inputs wired directly to outputs]
  21. LR Level Up [Diagram: each output class is computed as logistic(w, b) over the inputs]
  22. LR Level Up [Diagram: a hidden layer inserted between inputs and outputs]
  23. LR Level Up [Diagram: each hidden unit is itself a logistic(w, b) over the inputs]
  24. LR Level Up [Diagram: how many nodes per hidden layer?]
  25. LR Level Up [Diagram: how many hidden layers?]
  26. LR Level Up [Diagram: the resulting multi-layer network of logistic units]
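The "level up" progression in slides 20-26 condenses to a few lines: a logistic regression is one layer of logistic units, and stacking a hidden layer of the same units in front of it yields a small neural network whose decision boundary need no longer be linear. A sketch with arbitrary illustrative weights:

```python
import math

def logistic(z):
    """The logistic (sigmoid) activation used by every unit in the diagrams."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each unit computes logistic(w . x + b)."""
    return [logistic(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Logistic regression is the degenerate case: inputs feed the output directly.
# Inserting a hidden layer of the same units "levels up" the model.
x = [1.0, -2.0]
hidden = layer(x, weights=[[0.5, -0.5], [1.0, 1.0]], biases=[0.0, 0.1])
output = layer(hidden, weights=[[1.5, -1.5]], biases=[-0.2])
print(output)  # a single class probability in (0, 1)
```

Choosing how many units per hidden layer and how many hidden layers to stack is exactly the structure-search problem the later slides address.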
  27. Why?
     • This isn't new. Why the sudden interest?
     • Scale
       • Massive parameter space <=> massive data
       • Abundance of compute power + GPUs
     • Frameworks for computational graph composition (TensorFlow, Theano, Torch, Caffe)
       • "Compiles" the network architecture into a highly optimized set of commands that run quickly and with maximum parallelism
       • Symbolically differentiates the objective for gradient descent
  28. Deep Networks
     • Like trees/ensembles, we get arbitrary representational power by modifying the structure
     • Like logistic regression, smooth, multivariate objectives aren't a problem (provided we have the right structure)
     • So what have we lost?
  29. Deep Network Cons
     • Efficiency
       • The right structure for given data is not easily found, and most structures are bad
       • Solution: try a bunch of them, and be clever about how you do it
     • Interpretability
       • We've gotten quite far away from the interpretability of trees
       • Solution: use sampling and tree induction to create decision-tree-like explanations for predictions
  30.-35. Bayesian Parameter Optimization [Diagram sequence: candidate structures 1-6 are modeled and evaluated one at a time, accumulating performance scores (0.75, 0.48, 0.91, ...); a model of structure -> performance is then fit to those scores and used to pick the next structure to evaluate]
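The diagrams above describe a loop: evaluate a few structures, model the structure -> performance relationship, and let that model choose what to try next. The sketch below substitutes a nearest-neighbour surrogate for the Gaussian process a real Bayesian optimizer would use, and omits the exploration term of an acquisition function; all names and numbers are illustrative:

```python
def bayesian_search(candidates, evaluate, n_initial=3, budget=6):
    """Sketch of the loop in the slides: evaluate a few structures, fit a
    cheap surrogate of structure -> performance, then spend the remaining
    budget on whichever untried structure the surrogate likes best."""
    tried = {s: evaluate(s) for s in candidates[:n_initial]}  # seed evaluations
    while len(tried) < budget:
        def predict(s):
            # Nearest-neighbour surrogate: predict the score of the closest
            # structure we have already evaluated.
            return tried[min(tried, key=lambda t: abs(t - s))]
        untried = [s for s in candidates if s not in tried]
        best_guess = max(untried, key=predict)   # exploit the surrogate
        tried[best_guess] = evaluate(best_guess)  # the expensive step
    return max(tried, key=tried.get)

# Toy example: "structures" are hidden-layer sizes; true quality peaks at 16.
quality = lambda size: 1.0 - abs(size - 16) / 16.0
print(bayesian_search([2, 4, 8, 16, 32, 64, 128], quality))  # -> 16
```

The point of the surrogate is that evaluating it is nearly free, while each real evaluation means training and scoring a whole network, so the budget of expensive evaluations is spent where the surrogate expects the most payoff.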
  36. Should I use it?
     • Things that make deep learning less useful:
       • Small data (where "small" could still mean thousands of instances)
       • Problems where you could benefit by iterating quickly (better features always beat better models)
       • Problems that are easy, or for which top-of-the-line performance isn't absolutely critical
     • Remember: deep learning is just another sort of classifier
     "…deep learning has existed in the neural network community for over 20 years. Recent advances are driven by some relatively minor improvements in algorithms and models and by the availability of large data sets and much more powerful collections of computers." — Stuart Russell
     https://people.eecs.berkeley.edu/~russell/research/future/
