
Introduction to Deep Learning in Python and Matlab

Training a deep regression network in Lasagne (Python)
Training a deep classification network in MatConvNet (Matlab)



  1. Introduction to Hands-on Deep Learning. Imry Kissos, Algorithm Researcher
  2. Outline ● Problem Definition ● Motivation ● Training a Regression DNN ● Training a Classification DNN ● Open Source Packages ● Summary + Questions
  3. Problem Definition: Deep Convolutional Network
  4. Tutorial ● Goal: detect facial landmarks on (normal) face images ● Data set provided by Dr. Yoshua Bengio ● Tutorial code available: https://github.com/dnouri/kfkd-tutorial/blob/master/kfkd.py
  5. Flow: Train General Model → Train “Nose Tip” Model → Train “Mouth Corners” Model → Predict Points on Test Set
  6. Flow: Train Images + Train Points → Fit → Trained Net
  7. Flow: Test Images → Predict → Predicted Points
  8. Python DL Framework stack (high level → low level): wrapper over Lasagne → Lasagne (Theano extension for deep learning) → Theano (define, optimize, and evaluate mathematical expressions) → efficient CUDA GPU kernels for DNNs. Supports GPU & CPU; OS: Linux, OS X, Windows.
  9. Training a Deep Neural Network: 1. Data Analysis 2. Architecture Engineering 3. Optimization 4. Training the DNN
  10. Training a Deep Neural Network: 1. Data Analysis (a. Exploration + Validation, b. Pre-Processing, c. Batch and Split) 2. Architecture Engineering 3. Optimization 4. Training the DNN
  11. Data Exploration + Validation. Data: ● 7K gray-scale images of detected faces ● 96x96 pixels per image ● 15 landmarks per image (?). Data validation: ● some landmarks are missing.
  12. Pre-Processing: data normalization, shuffle the train data.
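     A minimal sketch of this step, assuming numpy arrays X of 96x96 pixel values in [0, 255] and y of landmark coordinates in pixel units (names and scaling constants are illustrative, not necessarily the tutorial's exact values):

     ```python
     import numpy as np

     def preprocess(X, y, seed=42):
         """Normalize inputs/targets and shuffle the training data."""
         X = X.astype(np.float32) / 255.0          # scale pixels to [0, 1]
         y = (y.astype(np.float32) - 48.0) / 48.0  # scale coordinates to roughly [-1, 1]

         rng = np.random.RandomState(seed)
         perm = rng.permutation(len(X))            # shuffle the train data
         return X[perm], y[perm]
     ```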
  13. Batch: one epoch's data consists of train batches, a validation batch and a test batch; the train/valid/test splits are constant.
  14. Train / Validation Split: for classification, the train/validation split preserves the class proportions.
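     For the classification case, a stratified split keeps the class proportions identical in train and validation. A sketch using scikit-learn (the slides do not name a tool, so this library choice and the dummy data are assumptions):

     ```python
     import numpy as np
     from sklearn.model_selection import train_test_split

     X = np.random.rand(100, 96 * 96)        # dummy features
     y = np.random.randint(0, 26, size=100)  # dummy class labels

     # Stratified split: train and validation keep the same class proportions.
     X_train, X_valid, y_train, y_valid = train_test_split(
         X, y, test_size=0.2, stratify=y, random_state=0)
     ```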
  15. Training a Deep Neural Network: 1. Data Analysis 2. Architecture Engineering (a. Layers Definition, b. Layers Implementation) 3. Optimization 4. Training
  16. Architecture: X → Conv → Pool → Dense → Output → Y
  17. Layers Definition
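     The layer table on the slide is an image; below is a hedged Lasagne sketch of a comparable stack (Conv → Pool → Dense → Output with ReLU activations). The filter counts and layer sizes are illustrative, not the tutorial's exact values:

     ```python
     from lasagne.layers import InputLayer, Conv2DLayer, MaxPool2DLayer, DenseLayer
     from lasagne.nonlinearities import rectify

     # 96x96 gray-scale input, 30 regression outputs (15 landmarks * x,y).
     net = InputLayer(shape=(None, 1, 96, 96))
     net = Conv2DLayer(net, num_filters=32, filter_size=(3, 3), nonlinearity=rectify)
     net = MaxPool2DLayer(net, pool_size=(2, 2))
     net = DenseLayer(net, num_units=500, nonlinearity=rectify)
     net = DenseLayer(net, num_units=30, nonlinearity=None)  # linear output for regression
     ```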
  18. Activation Function: ReLU
  19. Dense Layer
  20. Dropout
  21. Dropout (continued)
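     In Lasagne, dropout is itself a layer inserted between the layers it regularizes. A minimal sketch (the placement and rate p are illustrative):

     ```python
     from lasagne.layers import InputLayer, DenseLayer, DropoutLayer
     from lasagne.nonlinearities import rectify

     net = InputLayer(shape=(None, 1, 96, 96))
     net = DenseLayer(net, num_units=500, nonlinearity=rectify)
     net = DropoutLayer(net, p=0.5)  # randomly zeroes 50% of the activations at train time
     net = DenseLayer(net, num_units=30, nonlinearity=None)

     # At prediction time dropout is disabled, e.g.:
     # lasagne.layers.get_output(net, deterministic=True)
     ```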
  22. Training a Deep Neural Network: 1. Data Analysis 2. Architecture Engineering 3. Optimization (a. Back Propagation, b. Objective, c. SGD, d. Updates, e. Convergence Tuning) 4. Training the DNN
  23. Back Propagation, Forward Path: X → Conv → Dense → Output Points
  24. Back Propagation, Forward Path: X → Conv → Dense → Output Points, compared against the Training Points Y
  25. Back Propagation, Backward Path: the error flows back from the output through the Dense and Conv layers.
  26. Back Propagation, Update: the weight update is applied for all layers (Conv, Dense).
  27. Objective
  28. SGD: updates the network after each batch.
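     A plain-numpy sketch of the per-batch SGD-with-momentum update described on slides 26 to 28 (the function and variable names are illustrative):

     ```python
     import numpy as np

     def sgd_momentum_step(w, grad, velocity, learning_rate=0.01, momentum=0.9):
         """One SGD update, applied after each batch, for a single weight array."""
         velocity = momentum * velocity - learning_rate * grad
         return w + velocity, velocity

     # Usage: loop over batches, compute grad by back-propagation, then
     # w, v = sgd_momentum_step(w, grad, v)
     ```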
  29. Optimization: Updates (visualization by Alec Radford).
  30. Adjusting Learning Rate & Momentum: linear in epoch.
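     One way to read "linear in epoch" is a schedule that linearly decreases the learning rate and increases the momentum over the epochs. A sketch with illustrative start and stop values:

     ```python
     import numpy as np

     num_epochs = 400
     learning_rates = np.linspace(0.03, 0.0001, num_epochs)  # decays linearly per epoch
     momentums = np.linspace(0.9, 0.999, num_epochs)         # grows linearly per epoch

     for epoch in range(num_epochs):
         lr, mom = learning_rates[epoch], momentums[epoch]
         # ... run one epoch of SGD with (lr, mom) ...
     ```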
  31. Convergence Tuning: training stops according to the validation loss and returns the best weights.
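     The slide describes early stopping: stop when the validation loss stops improving and restore the best weights. A small sketch of such a monitor (the class name, interface and patience value are illustrative, not the tutorial's exact code):

     ```python
     class EarlyStopping(object):
         """Stop when the validation loss has not improved for `patience` epochs,
         keeping the weights from the best epoch."""

         def __init__(self, patience=100):
             self.patience = patience
             self.best_loss = float("inf")
             self.best_epoch = 0
             self.best_weights = None

         def should_stop(self, epoch, valid_loss, current_weights):
             if valid_loss < self.best_loss:
                 self.best_loss = valid_loss
                 self.best_epoch = epoch
                 self.best_weights = current_weights  # e.g. copies of all parameter arrays
                 return False
             return epoch - self.best_epoch > self.patience
     ```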
  32. Training a Deep Neural Network: 1. Data Analysis 2. Architecture Engineering 3. Optimization 4. Training the DNN (a. Fit, b. Fine-Tune Pre-Trained, c. Learning Curves)
  33. Fit: loop over the train batches (forward + backprop), then loop over the test batches (forward only).
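     A schematic of the fit loop on the slide: each epoch runs forward + backprop over the train batches and forward only over the validation batches. Here train_fn and valid_fn stand in for compiled Theano functions and are assumptions:

     ```python
     def fit(train_batches, valid_batches, train_fn, valid_fn, num_epochs):
         for epoch in range(num_epochs):
             train_loss = [train_fn(Xb, yb) for Xb, yb in train_batches]  # forward + backprop
             valid_loss = [valid_fn(Xb, yb) for Xb, yb in valid_batches]  # forward only
             print(epoch,
                   sum(train_loss) / len(train_loss),
                   sum(valid_loss) / len(valid_loss))
     ```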
  34. Fine-Tune Pre-Trained: load the pre-trained weights, change the output layer, fine-tune the specialist net.
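     A hedged Lasagne sketch of that recipe (not the tutorial's exact code): copy the pre-trained weights into the shared body of the network, then attach a fresh output layer sized for the specialist task. It assumes the pre-trained output layer's only parameters are its weight and bias:

     ```python
     from lasagne.layers import DenseLayer, get_all_param_values, set_all_param_values

     def make_specialist(pretrained_net, shared_body, num_outputs):
         """Load pre-trained weights into the shared body and add a new output layer."""
         # Drop the last two parameter arrays (output W and b) of the pre-trained net.
         set_all_param_values(shared_body, get_all_param_values(pretrained_net)[:-2])
         # New linear output, e.g. num_outputs=8 for a 4-landmark specialist.
         return DenseLayer(shared_body, num_units=num_outputs, nonlinearity=None)
     ```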
  35. Learning Curves: loop over the 6 nets, plotting loss against epochs.
  36. Learning Curves Analysis: RMSE vs. epochs for Net 1 and Net 2, showing convergence, overfitting and jittering.
  37. Part 1 Summary: training a DNN.
  38. Python ● Rich eco-system ● State-of-the-art ● Easy to port from prototype to production. https://github.com/yoavram/Py4Eng
  39. Python DL Framework: Theano-based packages.
  40. Part 1 End. Break.
  41. Part 2
  42. Outline ● Problem Definition ● Motivation ● Training a regression DNN ● Training a classification DNN ● Improving the DNN ● Open Source Packages ● Summary
  43. Matlab DL Framework stack (high level → low level): MatConvNet (open source CNN toolbox) → numerical computing in Matlab with the Parallel Computing Toolbox → efficient CUDA GPU kernels for DNNs. Supports GPU & CPU; OS: Linux, OS X, Windows.
  44. Problem Statement: classify a, b, …, z images into 26 classes. Bonus: OCR. http://www.robots.ox.ac.uk/~vgg/practicals/cnn/
  45. Training a Deep Neural Network: 1. Data Analysis 2. Training the DNN 3. Architecture Engineering 4. Optimization
  46. Data Analysis: the data set defines training vs. validation; the class label is a uint per class in [1, 26].
  47. Data Pre-Processing: each image is normalized by a scalar.
  48. Training Flow
  49. Customized Batch Loading: how would you add data augmentation?
  50. trainOpts: start from the last iteration if interrupted.
  51. initializeCharCnn() Net Architecture Layers: ● Conv ● Pool ● Conv ● Pool ● Conv ● ReLU ● Conv ● SoftMaxLoss. (%f is the initial std of W.)
  52. Optimization, SoftMax: maps scores in (-∞, ∞) to probabilities in [0, 1]. https://classroom.udacity.com/courses/ud730
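     A numerically stable numpy softmax matching the slide's description, namely scores in (-∞, ∞) mapped to probabilities in [0, 1] that sum to 1:

     ```python
     import numpy as np

     def softmax(scores):
         """Map a vector of scores to probabilities that sum to 1."""
         shifted = scores - np.max(scores)  # subtract the max for numerical stability
         exp = np.exp(shifted)
         return exp / np.sum(exp)

     print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.66, 0.24, 0.10]
     ```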
  53. One-Hot Encoding: encode the class labels.
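     A small numpy sketch of one-hot encoding the 26 class labels, assuming labels in [1, 26] as on slide 46 (mapped to columns 0..25):

     ```python
     import numpy as np

     def one_hot(labels, num_classes=26):
         """Encode integer class labels in [1, num_classes] as one-hot rows."""
         labels = np.asarray(labels)
         encoded = np.zeros((len(labels), num_classes))
         encoded[np.arange(len(labels)), labels - 1] = 1.0
         return encoded

     print(one_hot([1, 3, 26]))  # each row has a single 1, at positions 0, 2 and 25
     ```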
  54. Cross Entropy: a distance measure between S(Y) and the labels.
  55. Cross Entropy: a distance measure between S(Y) and the labels; D(S, L) is a positive scalar and t is the index of the ground-truth class.
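     With a one-hot label vector L and softmax output S(Y), the distance on the slide reduces to the negative log-probability of the ground-truth class t (standard formulation, written out here rather than copied from the slide image):

     ```latex
     D(S, L) = -\sum_{i=1}^{26} L_i \log S_i = -\log S_t
     ```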
  56. Cross Entropy: a distance measure between S(Y) and the labels; implemented in vl_nnloss.m.
  57. Training Goal (CNN)
  58. Minimize Loss: the loss is the average cross entropy.
  59. Minimize Loss: learning rate.
  60. Error Rate: Top-K means the target label is among the top K predictions; the error rate counts the samples where it is not.
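     A numpy sketch of the Top-K error rate: a sample counts as correct if its target label is among the K highest-scoring predictions (the function and variable names are illustrative):

     ```python
     import numpy as np

     def topk_error_rate(scores, labels, k=5):
         """scores: (N, num_classes) array, labels: (N,) integer class indices."""
         topk = np.argsort(scores, axis=1)[:, -k:]       # indices of the K best scores
         hits = np.any(topk == labels[:, None], axis=1)  # is the target among the top K?
         return 1.0 - np.mean(hits)                      # error rate
     ```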
  61. Loss & Error Convergence: plots of the loss and the error rate.
  62. Learned Filters
  63. OCR Evaluation
  64. OCR Evaluation (continued)
  65. Beyond Training: 1. Training a classification DNN 2. Improving the DNN (a. Analysis Capabilities, b. Augmentation) 3. Open Source Packages 4. Summary
  66. Basic vs. Advanced Mode
  67. Improving the DNN is very tempting: ● >1M images ● >1M parameters ● large gap between theory and practice ⇒ brute-force experiments?!
  68. Analysis Capabilities: 1. Theoretical explanation (e.g. dropout/augmentation reduce overfitting) 2. Empirical claims about a phenomenon (e.g. normalization helps convergence) 3. Numerical understanding (e.g. exploding / vanishing updates)
  69. Reduce Overfitting. Solution: data augmentation. (Learning curves of Net 1 and Net 2 over epochs, showing overfitting.)
  70. Data Augmentation: horizontal flip, perturbation.
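     A sketch of the two augmentations named on the slide, for the landmark-regression task: a horizontal flip (which also mirrors the x-coordinates of the targets) and a small random perturbation of the pixels. The left/right landmark re-ordering that a flip also requires is omitted here, and the perturbation scale is illustrative:

     ```python
     import numpy as np

     def augment(X, y, rng=np.random):
         """X: (N, 1, 96, 96) images, y: (N, 30) landmark coords scaled to [-1, 1]."""
         X, y = X.astype(np.float32), y.copy()

         # Horizontal flip for a random half of the batch.
         flip = rng.rand(len(X)) < 0.5
         X[flip] = X[flip, :, :, ::-1]
         y[flip, 0::2] = -y[flip, 0::2]  # mirror the x-coordinates (even indices)

         # Small additive perturbation of the pixels.
         X += rng.normal(scale=0.01, size=X.shape)
         return X, y
     ```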
  71. Convergence Challenges: the forward + backward path needs to be monitored. (RMSE vs. epochs, illustrating the effect of a data error and of normalization.)
  72. Deal with NaN: 1. In the first 100 iterations: a. the learning rate is too high. 2. Beyond 100 iterations: a. gradient explosion (consider gradient clipping) b. an illegal math operation (SoftMax: inf/inf; division by zero in one of your customized layers). http://russellsstewart.com//notes/0.html
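     For item 2a, gradient clipping caps the update magnitude so a single exploding gradient cannot poison the weights with NaNs. A minimal numpy sketch (the threshold is illustrative):

     ```python
     import numpy as np

     def clip_gradient(grad, max_norm=10.0):
         """Rescale the gradient if its L2 norm exceeds max_norm."""
         norm = np.sqrt(np.sum(grad ** 2))
         if norm > max_norm:
             grad = grad * (max_norm / norm)
         return grad
     ```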
  73. The Net Doesn't Learn Anything: if the training loss does not decrease after the first 100 iterations: a. reduce the training set to 10 instances (images) and overfit it, aiming for 100% training accuracy on that small portion of the data; b. change the batch size to 1 and monitor the error per batch; c. solve the simplest version of your problem. http://russellsstewart.com//notes/0.html
  74. Beyond Training: 1. Training a classification DNN 2. Improving the DNN 3. Open Source Packages (a. DL Open Source Packages, b. Effort Estimation) 4. Summary
  75. Tips from Other Packages: Torch's code organization; Caffe's separation of configuration and code; NeuralNet → a YAML text format defining the experiment's configuration.
  76. DL Open Source Packages: Caffe & MatConvNet for applications (simple DNNs); Torch, TensorFlow and Theano for research on DL (complex DNNs). http://fastml.com/torch-vs-theano/
  77. Disruptive Effort Estimation: feature engineering needs modest SW infrastructure, deep learning needs huge SW infrastructure.
  78. Summary ● Dove into training a DNN ● Presented analysis capabilities ● Reviewed open source packages
  79. References ● Hinton's Coursera Neural Networks course https://www.coursera.org/course/neuralnets ● Udacity TensorFlow course https://classroom.udacity.com/courses/ud730 ● Technion Deep Learning course http://moodle.technion.ac.il/course/view.php?id=4128 ● Oxford Deep Learning course https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu ● CS231n CNN for Visual Recognition http://cs231n.github.io/ ● Deep Learning Book http://www.deeplearningbook.org/
  80. Questions?
