This document discusses implementing machine learning at the IoT edge using neural networks like GRU. It covers neural network types and cells, training tools, numeric formats for embedded implementations, and references for CMSIS-NN, HDF5, TensorFlow, Python and more. Code examples demonstrate converting models to Q format and optimizing matrix storage for efficient embedded inference.
4. Sigmoid Function
• A sigmoid function is a mathematical function having a characteristic "S"-shaped curve, or sigmoid curve. Often, "sigmoid function" refers to the special case of the logistic function, defined by the formula S(x) = 1/(1 + e⁻ˣ) = eˣ/(eˣ + 1).
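As a minimal sketch, the logistic sigmoid above can be written directly from the formula (the function name is our own choice):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: S(x) = 1 / (1 + e^-x) = e^x / (e^x + 1)."""
    return 1.0 / (1.0 + math.exp(-x))
```

Note that S(0) = 0.5 and S(x) + S(-x) = 1, which follows from the two equivalent forms of the formula.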
7. Implementation
• Tools
• MBED OS5
• CMSIS-NN Package - GRU Example
• HDF5 Extractor
• Float to Q format Converter
• VS Code
• GNU Tools ARM Embedded
• Issues
• GRU Example Problems
• Weight and Bias Value Transfer
• Q format
• Code
8. Numeric Formats
Name          Bits   Value bits   Exponent bits   Min value    Max value
Half float    16     10           5               ≈ -2^15      ≈ 2^15
Float         32     23           8               ≈ -2^127     ≈ 2^127
Double        64     52           11              ≈ -2^1023    ≈ 2^1023
Q0.15 (Q15)   16     15           0               -1           1 - 2^-15
Q0.7 (Q7)     8      7            0               -1           1 - 2^-7
(Float min/max values are order-of-magnitude approximations based on the exponent range.)
Q15 Format (bit layout)
Bit     15     14    13    12    ...   0
Value   Sign   1/2   1/4   1/8   ...   1/2^15
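The "Float to Q format Converter" listed under Tools can be sketched as follows. This is a minimal illustration, not the actual tool: a Qm.n value with n fractional bits is obtained by scaling the float by 2^n, rounding, and saturating to the signed integer range.

```python
def float_to_q(x, frac_bits):
    """Convert a float in roughly [-1, 1) to a fixed-point Q value.

    frac_bits=15 gives Q15 (int16 range), frac_bits=7 gives Q7 (int8 range).
    Values outside the representable range saturate rather than wrap.
    """
    scale = 1 << frac_bits          # 2^n
    q = int(round(x * scale))       # scale and round to nearest integer
    lo, hi = -scale, scale - 1      # e.g. [-32768, 32767] for Q15
    return max(lo, min(hi, q))
```

For example, 0.5 maps to 16384 in Q15, while 1.0 is not representable and saturates to 32767 (i.e. 1 - 2^-15).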
9. Matrix ordering with Optimization
• Weights are in q7_t and Activations are in q15_t
• Limitation: x4 version requires weight reordering to work
• Here we use only one pointer to read 4 rows of the weight matrix. So if the original q7_t matrix looks like this:
• | a11 | a12 | a13 | a14 | a15 | a16 | a17 |
• | a21 | a22 | a23 | a24 | a25 | a26 | a27 |
• | a31 | a32 | a33 | a34 | a35 | a36 | a37 |
• | a41 | a42 | a43 | a44 | a45 | a46 | a47 |
• | a51 | a52 | a53 | a54 | a55 | a56 | a57 |
• | a61 | a62 | a63 | a64 | a65 | a66 | a67 |
• We operate on blocks of 4 rows, so the first four rows become:
• | a11 | a21 | a12 | a22 | a31 | a41 | a32 | a42 |
• | a13 | a23 | a14 | a24 | a33 | a43 | a34 | a44 |
• | a15 | a25 | a16 | a26 | a35 | a45 | a36 | a46 |
10. Matrix Reordering with Optimization (Cont.)
• The leftover column stays in order, which is: | a17 | a27 | a37 | a47 |
• For the leftover rows, we do 1x1 computation, so the data remains in its original order.
• So the stored weight matrix looks like this:
• | a11 | a21 | a12 | a22 | a31 | a41 |
• | a32 | a42 | a13 | a23 | a14 | a24 |
• | a33 | a43 | a34 | a44 | a15 | a25 |
• | a16 | a26 | a35 | a45 | a36 | a46 |
• | a17 | a27 | a37 | a47 | a51 | a52 |
• | a53 | a54 | a55 | a56 | a57 | a61 |
• | a62 | a63 | a64 | a65 | a66 | a67 |
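The reordering on the two slides above can be sketched as a small reference implementation. The function name is our own; the logic follows the interleaving pattern shown: within each 4-row block, two columns at a time are read across the two row pairs, the leftover column is emitted top to bottom, and leftover rows stay row-major.

```python
def reorder_q7_weights(w):
    """Flatten a row-major weight matrix into the x4-kernel storage order.

    w is a list of equal-length rows. Within each block of 4 rows, columns
    are consumed 2 at a time over row pairs (r, r+1) and (r+2, r+3); the
    leftover column is stored in row order; leftover rows stay as-is.
    """
    rows, cols = len(w), len(w[0])
    out = []
    r = 0
    while r + 4 <= rows:                      # full 4-row blocks
        c = 0
        while c + 2 <= cols:                  # full 2-column groups
            for p in (r, r + 2):              # two row pairs in the block
                out += [w[p][c], w[p + 1][c], w[p][c + 1], w[p + 1][c + 1]]
            c += 2
        for cc in range(c, cols):             # leftover column(s), in order
            out += [w[r + k][cc] for k in range(4)]
        r += 4
    for rr in range(r, rows):                 # leftover rows: 1x1 path, row-major
        out += w[rr]
    return out
```

Running this on the 6x7 example matrix (entries aij) reproduces exactly the stored sequence listed above: a11, a21, a12, a22, a31, a41, a32, a42, a13, a23, ... , a17, a27, a37, a47, a51, ..., a67.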