6. Plan
1. My path in ML
2. AI big picture (expert systems to ML)
3. ML trends over time (1980-2008+)
4. Types of ML (supervised vs. unsupervised)
5. Relationship between data mining and ML
6. Training process
7. Regularization techniques
8. ML research big picture
9. What is this Deep Learning revolution?
10. ML in practice -> feature engineering
11. Importance of the cost function
12. Data importance -> NIPS 2009
13. The tagging nightmare
14. ML & optimization
15. Adversarial examples
7. Francis's evolution in ML
• 1999 – Decision tree expert (Samy Bengio)
• 2001-2003 – Research with Bengio (huge networks) -> flayers
• 2003 – Idilia -> importance of a well-tagged dataset, good features & overfitting
• 2005-2006 – Dakis -> KISS (ML not always required; importance of comprehensive knowledge) – expert system
• 2006-2009 – Data mining (understand first & feature extraction)… MLboost
• 2010-2013 – QMining -> big-data mining
• 2013-2016 – Nuance -> data maturity & data-driven design
17. Regularization techniques
• Regularization is a technique used to address the overfitting problem [1] in statistical models.
• Examples (early stopping and weight decay are sketched below):
– Early stopping
– Weight decay (decrease constant)
– Dropout
– Mini-batch training
– A better cost function (e.g., margin vs. MSE)
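A minimal sketch of two of these techniques, weight decay (an L2 penalty on the weights) and early stopping on a held-out validation set, applied to plain linear regression; the synthetic data, lambda, learning rate, and patience values are all illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(scale=0.5, size=200)

X_tr, y_tr = X[:150], y[:150]      # training split
X_va, y_va = X[150:], y[150:]      # validation split (for early stopping)

w = np.zeros(10)
lam, lr = 0.1, 0.01                # weight-decay strength, learning rate
best_w, best_val, patience = w.copy(), np.inf, 0

for step in range(1000):
    # Gradient of the MSE plus the L2 weight-decay term lam * ||w||^2
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr) + 2 * lam * w
    w -= lr * grad

    val = np.mean((X_va @ w - y_va) ** 2)
    if val < best_val:
        best_val, best_w, patience = val, w.copy(), 0
    else:
        patience += 1
    if patience >= 20:             # early stopping: no improvement for 20 steps
        break

print(f"stopped at step {step}, best validation MSE {best_val:.3f}")
```

Both mechanisms constrain the fit: weight decay shrinks the parameters continuously, while early stopping halts optimization before the model starts memorizing the training set.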
18. What is tough about ML
• More parameters = more training examples required
• Tagged data is hard to create compared to untagged data
• There is no magic -> feature engineering
• Better features -> fewer examples needed -> smaller capacity problem
• Getting a good sampling of examples without introducing bias (see the sketch below)
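One common instance of the sampling point: a naive random split can misrepresent a rare class, while a stratified split preserves class proportions by construction. A minimal sketch, assuming scikit-learn and a made-up 95/5 class imbalance:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.where(rng.random(1000) < 0.05, "rare", "common")  # ~5% rare class

# Naive split: the rare-class fraction in the test set can drift.
_, _, _, y_te_naive = train_test_split(X, y, test_size=0.2, random_state=1)

# Stratified split: the rare-class fraction is preserved by design.
_, _, _, y_te_strat = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y)

print("rare fraction, naive:     ", np.mean(y_te_naive == "rare"))
print("rare fraction, stratified:", np.mean(y_te_strat == "rare"))
```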
21. What is this Deep Learning revolution?
• Deep architectures are more powerful than shallow architectures
• Before 2006, we couldn't train deep architectures
• Revolution
– Convolutional NNs
– Training generative models (auto-encoders) -> learn the data's constraints… unsupervised learning… better parameter initialization (see the sketch below)
– Standard training
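A minimal sketch of the auto-encoder idea: train a one-hidden-layer network to reconstruct its own input from unlabeled data, then reuse the learned encoder weights to initialize a supervised net. The sizes, learning rate, and random data are assumptions for illustration; the historical pre-training recipe used deep stacks and real corpora:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))          # unlabeled data (no tags needed)

d_in, d_hid = 20, 8
W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))
lr = 0.01

for epoch in range(200):
    H = np.tanh(X @ W_enc)              # encode
    X_hat = H @ W_dec                   # decode (reconstruct the input)
    err = X_hat - X                     # reconstruction error
    # Backprop of the mean squared reconstruction error
    g_dec = H.T @ err / len(X)
    g_enc = X.T @ ((err @ W_dec.T) * (1 - H**2)) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

mse = np.mean((np.tanh(X @ W_enc) @ W_dec - X) ** 2)
print("final reconstruction MSE:", mse)
# W_enc can now initialize the first layer of a supervised network,
# which is the "better parameter initialization" point above.
```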
25. ML in practice
• A black box = a recipe for disaster
• 90% of the work is feature engineering (see the sketch below)
• ML = automatic tuning
• Garbage in = garbage out
• Tagging is a pain… manual work
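A minimal illustration of what feature engineering means here: a raw timestamp string is nearly useless to most models, while simple hand-derived features often carry the actual signal. The log lines and feature choices are made up for illustration:

```python
from datetime import datetime

raw_events = ["2016-03-12 23:55:10", "2016-03-14 09:02:44"]

def engineer(ts: str) -> dict:
    """Turn a raw timestamp string into model-ready features."""
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": t.hour,                  # captures time-of-day patterns
        "weekday": t.weekday(),          # 0 = Monday ... 6 = Sunday
        "is_weekend": t.weekday() >= 5,  # weekend behavior often differs
    }

for ev in raw_events:
    print(ev, "->", engineer(ev))
```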
26. Importance of the cost function
• Neural network cost functions (backprop)
– MSE & log-softmax
– Example: Netflix & recommendation
• Optimization
– SVM = maximize the margin (the losses are compared below)
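A minimal sketch comparing the cost functions named above on one made-up score vector: MSE against a one-hot target, log-softmax (cross-entropy), and the multiclass hinge loss behind the SVM's margin maximization:

```python
import numpy as np

scores = np.array([2.0, 1.0, -1.0])   # model outputs for 3 classes
target = 0                            # index of the true class
one_hot = np.eye(3)[target]

# MSE treats classification as regression onto the one-hot vector.
mse = np.mean((scores - one_hot) ** 2)

# Log-softmax / cross-entropy penalizes low probability on the true class.
log_probs = scores - np.log(np.sum(np.exp(scores)))
cross_entropy = -log_probs[target]

# Multiclass hinge loss: the true score must beat the others by margin 1.
margins = np.maximum(0, scores - scores[target] + 1)
margins[target] = 0
hinge = np.sum(margins)

print(f"MSE={mse:.3f}  cross-entropy={cross_entropy:.3f}  hinge={hinge:.3f}")
```

The three functions reward the same correct prediction very differently, which is why swapping the cost function can change what the model learns.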
27. Data Importance NIPS 2009
• Google -> that is enough
– Parameter optimization; tweaking kernels (SVM)
– More parameters than # of examples
– Simpler model + more data = what works (a way to test this is sketched below)
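One way to check the "more data" claim on your own problem is a learning curve: train a simple model on growing subsets and watch the cross-validated score. A minimal sketch, assuming scikit-learn and using its bundled digits dataset purely for illustration:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_digits(return_X_y=True)

# Evaluate a simple model at 5 training-set sizes with 5-fold CV.
sizes, train_sc, val_sc = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, v in zip(sizes, val_sc.mean(axis=1)):
    print(f"{n:5d} examples -> mean CV accuracy {v:.3f}")
```

If the validation score is still climbing at the largest size, collecting more data is likely to beat further model tweaking.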
28. The tagging nightmare
• You still need tagged data
• Tagging is hard to automate and error prone (garbage in, garbage out)
– Idilia use case
– Nuance use case
29. The lie about ML
• Machine learning != optimization
• Machine learning != statistics
• Machine learning = an optimization problem with constraints that make the solution generalize (regularization); in symbols below
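In symbols, a standard way to state that last bullet (the notation is mine, not from the slides): minimize the empirical loss plus a regularization penalty Ω weighted by λ.

```latex
% Learning = optimization with a generalization constraint:
% fit the training data, but penalize overly complex parameters.
\min_{\theta}\;
\underbrace{\frac{1}{n}\sum_{i=1}^{n} \ell\big(f_\theta(x_i),\,y_i\big)}_{\text{empirical loss (optimization)}}
\;+\;
\underbrace{\lambda\,\Omega(\theta)}_{\text{regularization (generalize)}}
```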
31. Conclusion
• ML is quite a mature field
• ML != Deep Learning
– Deep Learning = a major breakthrough, still in its hype phase, not yet mature
• NN = an optimization problem with constraints
• SP operates more like expert systems
• An algorithm is only as good as its inputs -> feature engineering