XebiCon'17: Heat the Neurons of Your Smartphone with On-Device Deep Learning - Qian Jin, Yoann Benoit and Sylvain Lequeux

Today we hear about Deep Learning everywhere: image recognition, sound recognition, text generation, and so on. Following the recent announcements of the Android Neural Networks API and TensorFlow Lite, and the release of Apple's Core ML framework, everything is pushing us toward "on-device intelligence".
Although the techniques and frameworks are becoming mainstream, concrete applications in companies remain hard to find, let alone in mobile apps. We therefore decided to build a proof of concept to take on the challenges of the field.
Through an educational mobile application that uses Deep Learning for object recognition, we will cover the impact of this type of model on smartphones, the architecture for training and deploying models on a cloud service, and the construction of the mobile app with the latest announced features.


  1. 1. Heat the Neurons of Your Smartphone with Deep Learning Qian Jin | @bonbonking | qjin@xebia.fr Yoann Benoit | @YoannBENOIT | ybenoit@xebia.fr Sylvain Lequeux | @slequeux | slequeux@xebia.fr
  2. 2. On-Device Intelligence
  3. 3. Android Wear 2.0 Smart Reply 4 Learned Projection Model https://research.googleblog.com/2017/02/on-device-machine-intelligence.html
  4. 4. https://en.wikipedia.org/wiki/Moore%27s_law
  5. 5. Source: https://www.qualcomm.com/news/snapdragon/2017/01/09/tensorflow-machine-learning-now-optimized-snapdragon-835-and-hexagon-682
  6. 6. Credit: 7 Source: https://9to5google.com/2017/01/10/qualcomm-snapdragon-835-machine-learning-tensorflow/
  7. 7. On-device Intelligence enables edge devices to provide reliable execution with or without a network connection. 8
  8. 8. 9 Image credit: https://www.andertoons.com
  9. 9. #datamobile Chat History of the Slack channel
  10. 10. Magritte — Ceci n’est pas une pomme. (“This is not an apple.”)
  11. 11. 14
  12. 12. Build TensorFlow Android Example With Bazel
  13. 13. 16
  14. 14. Android Developer Deep Learning Noob
  15. 15. WE CAN RECOGNIZE ALL THE THINGS!
  16. 16. NEURONS NEURONS EVERYWHERE
  17. 17. I THOUGHT THERE WERE MODELS FOR EVERYTHING...
  18. 18. Neural Networks in a Nutshell
  19. 19. Here’s a Neural Network 22
  20. 20. Prediction on an image - Inference 23
  21. 21. Prediction on an image - Inference 24
  22. 22. Prediction on an image - Inference 25 Apple: 0.98 Banana: 0.02
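The inference slides above end with the network turning raw scores into class confidences (Apple: 0.98, Banana: 0.02). A minimal sketch of that last step, assuming hypothetical logit values for the two classes:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability, then normalize exponentials.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw network outputs (logits) for [apple, banana].
logits = [3.9, 0.0]
probs = softmax(logits)
print(probs)  # roughly [0.98, 0.02], matching the slide's confidences
```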
  23. 23. How to train a model?
  24. 24. 27
  25. 25. Back Propagation 28
  26. 26. Back Propagation 29 Apple: 0.34 Banana: 0.66
  27. 27. Apple: 0.34 Banana: 0.66 Back Propagation 30 Prediction Error
  28. 28. Apple: 0.34 Banana: 0.66 Back Propagation 31 Prediction Error
  29. 29. Apple: 0.34 Banana: 0.66 Back Propagation 32 Prediction Error
  30. 30. Back Propagation 33 Apple: 0.87 Banana: 0.13
  31. 31. Back Propagation 34 Banana: 0.93 Apple: 0.07
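The back-propagation slides show the prediction error shrinking as weights are adjusted. The mechanism can be sketched on a toy one-weight network: compute the error, take its gradient with respect to the weight, and nudge the weight against the gradient.

```python
# Minimal gradient-descent sketch: one weight, squared error.
# The "network" predicts y = w * x; back propagation computes dE/dw and
# moves w against the gradient, so the prediction error shrinks each step.
def train(w, x, target, lr=0.1, steps=20):
    for _ in range(steps):
        y = w * x                 # forward pass (prediction)
        error = y - target        # prediction error
        grad = 2 * error * x      # dE/dw for E = (y - target)^2
        w -= lr * grad            # weight update
    return w

w = train(w=0.1, x=1.0, target=1.0)
print(round(w, 2))  # converges toward 1.0
```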
  32. 32. Deep Convolutional Neural Network & Inception Architecture Credit: http://nicolovaligi.com/history-inception-deep-learning-architecture.html
  33. 33. Deep Convolutional Neural Network 36 Image Credit: https://github.com/tensorflow/models/tree/master/research/inception Visualisation of Inception v3 Model Architecture Edges Shapes High Level Features Classifiers
  34. 34. Source: CS231n Convolutional Neural Networks for Visual Recognition http://cs231n.stanford.edu/
  35. 35. Source: https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design/
  36. 36. Transfer Learning
  37. 37. Transfer Learning • Use a pre-trained Deep Neural Network • Keep all operations but the last one • Re-train only the last operation to specialize your network to your classes Keep all weights identical except these ones 40
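The transfer-learning recipe above (freeze the pre-trained layers, retrain only the last one) can be sketched in plain Python: a frozen stand-in for the bottleneck features, plus a logistic-regression last layer trained on them. The features and class labels here are hypothetical, not from the real retrained model.

```python
import math

# Transfer-learning sketch: the pre-trained network is frozen and acts as a
# fixed feature extractor ("bottlenecks"); only the final layer is retrained
# on the new classes.
def frozen_feature_extractor(image):
    # Stand-in for the frozen Inception layers: returns a fixed feature vector.
    return image  # in reality: hundreds of bottleneck values

def train_last_layer(features, labels, lr=0.5, steps=200):
    # Logistic regression = retraining just the final layer's weights.
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(steps):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(len(w)):
                w[i] -= lr * (p - y) * x[i]
            b -= lr * (p - y)
    return w, b

# Hypothetical bottleneck features for two classes (apple=0, banana=1).
features = [frozen_feature_extractor(x)
            for x in [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]]
labels = [0, 0, 1, 1]
w, b = train_last_layer(features, labels)

z = sum(wi * xi for wi, xi in zip(w, [0.15, 0.85])) + b
print(1.0 / (1.0 + math.exp(-z)) > 0.5)  # classified as banana
```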
  38. 38. Gather Training Data 41
  39. 39. 42 Retrain a Model Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python tensorflow/examples/image_retraining/retrain.py --bottleneck_dir=tf_files/bottlenecks --how_many_training_steps=500 --model_dir=tf_files/models/ --summaries_dir=tf_files/training_summaries/ --output_graph=tf_files/retrained_graph.pb --output_labels=tf_files/retrained_labels.txt --image_dir=tf_files/fruit_photos
  40. 40. Obtain the Retrained Model •2 outputs: • Model as protobuf file: contains a version of the selected network with a final layer retrained on your categories • Labels as text file 43 model.pb label.txt
  41. 41. 44 public class ClassifierActivity extends CameraActivity implements OnImageAvailableListener { private static final int INPUT_SIZE = 224; private static final int IMAGE_MEAN = 117; private static final float IMAGE_STD = 1; private static final String INPUT_NAME = "input"; private static final String OUTPUT_NAME = "output"; private static final String MODEL_FILE = "file:///android_asset/tensorflow_inception_graph.pb"; private static final String LABEL_FILE = "file:///android_asset/imagenet_comp_graph_label_strings.txt"; } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java
  42. 42. java.lang.UnsupportedOperationException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 21. 45
  43. 43. Unsupported Operations • Only keep the operations dedicated to the inference step • Remove decoding, training, loss and evaluation operations 46
  44. 44. 47 Optimize for Inference Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow.python.tools.optimize_for_inference --input=tf_files/retrained_graph.pb --output=tf_files/optimized_graph.pb --input_names="input" --output_names="final_result"
  45. 45. Data Scientist Android Development Noob
  46. 46. CLICK 7 TIMES ON BUILD NUMBER 49
  47. 47. Build Standalone App
  48. 48. Pre-Google I/O 2017 • Use nightly build • Library .so • Java API jar android { //… sourceSets { main { jniLibs.srcDirs = ['libs'] } } } 51
  49. 49. Post-Google I/O 2017 Source: Android Meets TensorFlow: How to Accelerate Your App with AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys 52 Currently: 1.4.0
  50. 50. App size ~80MB 54
  51. 51. Reducing Model Size
  52. 52. WHO CARES? MODEL SIZE 56
  53. 53. Model Size All weights are stored as they are (32-bit floats) => 80MB 57
  54. 54. ~80MB -> ~20MB 58 Weights Quantization 6.372638493746383 => 6.4 Source: https://www.tensorflow.org/performance/quantization
  55. 55. 59 Quantize Graph Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python tensorflow/tools/quantization/quantize_graph.py --input=tf_files/optimized_graph.pb --output=tf_files/rounded_graph.pb --output_node_names=final_result --mode=weights_rounded
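The idea behind weight quantization (slide 54's "6.372638493746383 => 6.4") can be sketched without TensorFlow: snap each float weight onto one of 256 evenly spaced levels so the graph compresses far better while remaining a valid float graph. The sample weight values are hypothetical.

```python
# Weight-quantization sketch: map each 32-bit float weight onto one of
# 256 evenly spaced levels between the min and max weight. The values keep
# their float type but become highly compressible (few distinct values).
def quantize_weights(weights, levels=256):
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (levels - 1)
    return [lo + round((w - lo) / step) * step for w in weights]

weights = [6.372638493746383, -1.234567, 0.5, 2.718281828]
print(quantize_weights(weights))  # each value moved by at most half a step
```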
  56. 56. MobileNet Mobile-first computer vision models for TensorFlow 60 Image credit : https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
  57. 57. Inception V3 vs. MobileNet 61 Inception V3 78% Accuracy* 85MB MobileNet (Largest configuration) 70.5% Accuracy* 19MB *: accuracy on ImageNet images
  58. 58. ~80MB => ~20MB => ~1-5MB Source: https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html
  59. 59. Underneath the Android App
  60. 60. Android SDK (Java) Android NDK (C++) Classifier Implementation TensorFlow JNI wrapper Image (Bitmap) Trained Model top_resultsClassifications + Confidence input_tensor1 2 34 Camera Preview Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/ Overlay Display
  61. 61. Image Sampling Get Image from Camera Preview Crop the center square Resize Sample Image 67
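The image-sampling pipeline above (grab the camera preview frame, crop the center square, resize to the network input) can be sketched in plain Python, with nested lists standing in for Bitmap pixels. The 640x480 frame and nearest-neighbor resize are illustrative assumptions, not the app's exact implementation.

```python
# Image-sampling sketch: crop the center square of a preview frame, then
# resize it to the network's input size (224x224) with nearest-neighbor
# sampling.
def center_crop_square(pixels, width, height):
    side = min(width, height)
    x0, y0 = (width - side) // 2, (height - side) // 2
    return [[pixels[y][x] for x in range(x0, x0 + side)]
            for y in range(y0, y0 + side)], side

def resize(square, side, target):
    # Nearest-neighbor downsampling from side x side to target x target.
    return [[square[y * side // target][x * side // target]
             for x in range(target)] for y in range(target)]

# Hypothetical 640x480 grayscale frame filled with a gradient.
frame = [[(x + y) % 256 for x in range(640)] for y in range(480)]
square, side = center_crop_square(frame, 640, 480)
small = resize(square, side, 224)
print(len(small), len(small[0]))  # 224 224
```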
  62. 62. Converts YUV420 (NV21) to ARGB8888 68 public static native void convertYUV420ToARGB8888( byte[] y, byte[] u, byte[] v, int[] output, int width, int height, int yRowStride, int uvRowStride, int uvPixelStride, boolean halfSize );
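What the native convertYUV420ToARGB8888 helper does per pixel can be sketched with the common BT.601 conversion formulas; the exact coefficients and fixed-point arithmetic in the native code may differ.

```python
# YUV -> RGB sketch for a single pixel, using the common BT.601 formulas.
def yuv_to_rgb(y, u, v):
    c, d, e = y - 16, u - 128, v - 128
    clamp = lambda x: max(0, min(255, int(round(x))))
    r = clamp(1.164 * c + 1.596 * e)
    g = clamp(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp(1.164 * c + 2.017 * d)
    return r, g, b

print(yuv_to_rgb(255, 128, 128))  # pure white -> (255, 255, 255)
print(yuv_to_rgb(16, 128, 128))   # pure black -> (0, 0, 0)
```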
  63. 63. 69 /** * Initializes a native TensorFlow session for classifying images. * * @param assetManager The asset manager to be used to load assets. * @param modelFilename The filepath of the model GraphDef protocol buffer. * @param labels The list of labels * @param inputSize The input size. A square image of inputSize x inputSize is assumed. * @param imageMean The assumed mean of the image values. * @param imageStd The assumed std of the image values. * @param inputName The label of the image input node. * @param outputName The label of the output node. * @throws IOException */ public static Classifier create( AssetManager assetManager, String modelFilename, List<String> labels, int inputSize, int imageMean, float imageStd, String inputName, String outputName) { }
  64. 64. 70 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Preprocess Bitmap / Create Tensor
  65. 65. 71 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Feed Input Data to TensorFlow
  66. 66. 72 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Run the Inference Call
  67. 67. 73 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Fetch the Output Tensor
  68. 68. 74 (continue..) // Find the best classifications. PriorityQueue<Recognition> pq = new PriorityQueue<>( 3, (lhs, rhs) -> { // Intentionally reversed to put high confidence at the head of the queue. return Float.compare(rhs.getConfidence(), lhs.getConfidence()); }); for (int i = 0; i < outputs.length; ++i) { if (outputs[i] > THRESHOLD) { pq.add( new Recognition( "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null)); } } //... return recognitions; } Find the Best Classification
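The "find the best classifications" step above (a PriorityQueue keeping high-confidence results at the head) has a compact Python analogue using heapq. The outputs, labels, and threshold below are hypothetical stand-ins for the real model's values.

```python
import heapq

# Top-k sketch: keep only the classifications above a confidence threshold,
# highest confidence first -- the same idea as the Java PriorityQueue whose
# comparator is intentionally reversed.
THRESHOLD = 0.1

def best_classifications(outputs, labels, k=3):
    scored = [(conf, labels[i] if i < len(labels) else "unknown")
              for i, conf in enumerate(outputs)]
    kept = [(c, l) for c, l in scored if c > THRESHOLD]
    return heapq.nlargest(k, kept)  # highest confidence at the head

outputs = [0.02, 0.71, 0.05, 0.22]  # hypothetical network outputs
labels = ["apple", "banana", "orange", "kiwi"]
print(best_classifications(outputs, labels))
# [(0.71, 'banana'), (0.22, 'kiwi')]
```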
  69. 69. Adding New Models
  70. 70. Adding a New Model 77 2 * 20 MB = 40 MB
  71. 71. Model Fusion • Start from previous model to keep all specific operations in the graph • Specify all operations to keep when optimizing for inference 78 graph_util.convert_variables_to_constants(sess, graph.as_graph_def(), ["final_result_fruits", "final_result_vegetables"])
  72. 72. Android Makers Paris 2017
  73. 73. Continuous Training Pipeline 81 Source: https://www.tensorflow.org/serving/
  74. 74. TensorFlow Serving Hosts the model and provides remote access to it 82 Source: https://www.tensorflow.org/serving/
  75. 75. model.pb label.txt Continuous Training Pipeline 83
  76. 76. model.pb label.txt Dispensing Model 84
  77. 77. Currently with Project Magritte… Training • Model debugging done by overheating a laptop • Model built on a personal GPU • Files uploaded manually Model dispensing • API available • Deployment on AWS, currently migrating to Google Cloud 85
  78. 78. Android App Evolves
  79. 79. Android FilesDir model.pb Labels model.pb label.txt
  80. 80. public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model; this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e2); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  83. 83. 92 Optimize for Mobile > IMAGE_SIZE=224 > ARCHITECTURE="mobilenet_0.50_${IMAGE_SIZE}"
  84. 84. 93 Optimize for Mobile Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python tensorflow/examples/image_retraining/retrain.py --bottleneck_dir=tf_files/bottlenecks --how_many_training_steps=500 --model_dir=tf_files/models/ --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" --output_graph=tf_files/retrained_graph.pb --output_labels=tf_files/retrained_labels.txt --architecture="${ARCHITECTURE}" --image_dir=tf_files/fruit_photos
  85. 85. Model Inception V3 Optimized & Quantized
  86. 86. Model MobileNets_1.0_224
  87. 87. Model MobileNets_0.5_224
  88. 88. Model MobileNets_0.25_224
  89. 89. Demo Time
  90. 90. 100
  91. 91. What’s next?
  92. 92. Source: https://developer.android.com/ndk/guides/neuralnetworks/index.html • Android C API (NDK) • Functionalities for high-level machine learning frameworks (TensorFlow Lite, Caffe2, and others) • Available on Android 8.1 and higher (API 27+) Android Neural Networks API
  93. 93. On-device Inferencing • Latency: You don’t need to send a request over a network connection and wait for a response. This can be critical for video applications that process successive frames coming from a camera. • Availability: The application runs even outside of network coverage. • Speed: New hardware dedicated to neural-network processing provides significantly faster computation than a general-purpose CPU alone. • Privacy: The data does not leave the device. • Cost: No server farm is needed when all the computations are performed on the device. 103 Source: https://developer.android.com/ndk/guides/neuralnetworks/index.html
  94. 94. Source: https://www.tensorflow.org/mobile/tflite/ TensorFlow Lite • New model file format: based on FlatBuffers, no parsing/unpacking step & much smaller footprint • Mobile-optimized interpreter: uses a static graph ordering and a custom (less-dynamic) memory allocator  • Hardware acceleration
  95. 95. Federated Learning Collaborative Machine Learning without Centralized Training Data 105 Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html
  96. 96. Resources
  97. 97. Resources • Artificial neural network: https://en.wikipedia.org/wiki/Artificial_neural_network • Deep Learning: https://en.wikipedia.org/wiki/Deep_learning • Convolutional Neural Network: https://en.wikipedia.org/wiki/Convolutional_neural_network • TensorFlow for Poets: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ • TensorFlow for Poets 2: Optimize for Mobile: https://codelabs.developers.google.com/ codelabs/tensorflow-for-poets-2/ • TensorFlow Glossary: https://www.tensorflow.org/versions/r0.12/resources/glossary • Magritte project blog: http://blog.xebia.fr/2017/07/24/on-device-intelligence-integrez-du- deep-learning-sur-vos-smartphones/ 107
  98. 98. Thank you! Github: https://github.com/xebia-france/magritte Qian Jin | @bonbonking | qjin@xebia.fr Yoann Benoit | @YoannBENOIT | ybenoit@xebia.fr Sylvain Lequeux | @slequeux | slequeux@xebia.fr
