3. CMSIS 5.3.0
• http://www2.keil.com/mdk5/cmsis/
• https://developer.arm.com/embedded/cmsis
• Cortex Microcontroller Software Interface Standard
• CMSIS-NN first appeared in 5.2.1 dev 3
• Components: CMSIS-CORE, CMSIS-RTOS, CMSIS-DSP, CMSIS-Driver, CMSIS-SVD, CMSIS-DAP, CMSIS-Pack, CMSIS-NN, CMSIS-Zone (planned)
5. CMSIS-NN
• DSP extension: Cortex-M0 (no) / Cortex-M3 (no) / Cortex-M4 (yes) / Cortex-M7 (yes) / Cortex-M33 (optional)
• Inference only, on devices with limited compute power
• CPU: tens of MHz up to 192 MHz (Cortex-M4) or 400 MHz (Cortex-M7)
• Memory: tens of KB to a few MB
• Kernel support: q7_t and q15_t fractional data types, range [-1.0, 1.0)
• Functions
• Neural Network Convolution Functions
• Neural Network Activation Functions
• Fully-connected Layer Functions
• Neural Network Pooling Functions
• Softmax Functions
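The fractional range [-1.0, 1.0) follows from storing a value as a signed integer scaled by a power of two: 2^7 for q7_t and 2^15 for q15_t. The sketch below illustrates that mapping in plain C. The helper names (`float_to_q7`, `q7_to_float`, `float_to_q15`, `q15_to_float`, `round_to_int`) are hypothetical and are not CMSIS-NN functions; the typedefs simply mirror the CMSIS definitions of q7_t and q15_t as 8-bit and 16-bit signed integers.

```c
#include <stdint.h>

/* Local typedefs mirroring the CMSIS fixed-point types. */
typedef int8_t  q7_t;   /* 1 sign bit + 7 fractional bits  */
typedef int16_t q15_t;  /* 1 sign bit + 15 fractional bits */

/* Round half away from zero without needing libm. */
int32_t round_to_int(float v)
{
    return (int32_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Scale by 2^7 and saturate to the int8_t range. */
q7_t float_to_q7(float x)
{
    int32_t v = round_to_int(x * 128.0f);
    if (v > 127)  v = 127;    /* +1.0 is not representable */
    if (v < -128) v = -128;
    return (q7_t)v;
}

float q7_to_float(q7_t q)
{
    return (float)q / 128.0f;
}

/* Scale by 2^15 and saturate to the int16_t range. */
q15_t float_to_q15(float x)
{
    int32_t v = round_to_int(x * 32768.0f);
    if (v > 32767)  v = 32767;
    if (v < -32768) v = -32768;
    return (q15_t)v;
}

float q15_to_float(q15_t q)
{
    return (float)q / 32768.0f;
}
```

Under these assumptions, 0.5 maps to 64 in q7 and 16384 in q15, -1.0 maps to the most negative integer, and +1.0 saturates to the largest positive integer, which is why the upper bound of the range is open.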
12. FOOTPRINT - 9,306 bytes (sum of the text column)
   text    data     bss     dec     hex filename
    132       0       0     132      84 ./SoftmaxFunctions/arm_softmax_q15.o
    154       0       0     154      9a ./SoftmaxFunctions/arm_softmax_q7.o
    544       0       0     544     220 ./PoolingFunctions/arm_pool_q7_HWC.o
   2816       0       0    2816     b00 ./NNSupportFunctions/arm_nntables.o
     84       0       0      84      54 ./NNSupportFunctions/arm_q7_to_q15_no_shift.o
     72       0       0      72      48 ./NNSupportFunctions/arm_q7_to_q15_reordered_no_shift.o
    102       0       0     102      66 ./FullyConnectedFunctions/arm_fully_connected_q15.o
     88       0       0      88      58 ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15.o
    476       0       0     476     1dc ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15_opt.o
    486       0       0     486     1e6 ./FullyConnectedFunctions/arm_fully_connected_q15_opt.o
     86       0       0      86      56 ./FullyConnectedFunctions/arm_fully_connected_q7.o
    532       0       0     532     214 ./FullyConnectedFunctions/arm_fully_connected_q7_opt.o
    266       0       0     266     10a ./ConvolutionFunctions/arm_convolve_1x1_HWC_q7_fast_nonsquare.o
    404       0       0     404     194 ./ConvolutionFunctions/arm_convolve_HWC_q15_basic.o
    450       0       0     450     1c2 ./ConvolutionFunctions/arm_convolve_HWC_q15_fast.o
    426       0       0     426     1aa ./ConvolutionFunctions/arm_convolve_HWC_q7_basic.o
    434       0       0     434     1b2 ./ConvolutionFunctions/arm_convolve_HWC_q7_fast.o
    434       0       0     434     1b2 ./ConvolutionFunctions/arm_convolve_HWC_q7_fast_nonsquare.o
    428       0       0     428     1ac ./ConvolutionFunctions/arm_convolve_HWC_q7_RGB.o
    298       0       0     298     12a ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7.o
    378       0       0     378     17a ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7_nonsquare.o
      4       0       0       4       4 ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15.o
      4       0       0       4       4 ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15_reordered.o
    104       0       0     104      68 ./ActivationFunctions/arm_nn_activations_q15.o
     48       0       0      48      30 ./ActivationFunctions/arm_nn_activations_q7.o
     28       0       0      28      1c ./ActivationFunctions/arm_relu_q15.o
     28       0       0      28      1c ./ActivationFunctions/arm_relu_q7.o
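The headline figure of 9,306 can be checked by summing the text column of the listing above. The short C sketch below does that arithmetic; the array simply copies the 27 text sizes in listing order, and the function name `total_text_bytes` is hypothetical.

```c
#include <stddef.h>

/* Text-section sizes in bytes, copied from the listing in order. */
static const unsigned text_bytes[] = {
    132, 154,                       /* softmax                  */
    544,                            /* pooling                  */
    2816, 84, 72,                   /* support: tables, q7->q15 */
    102, 88, 476, 486, 86, 532,     /* fully connected          */
    266, 404, 450, 426, 434, 434,   /* convolution              */
    428, 298, 378, 4, 4,            /* convolution (continued)  */
    104, 48, 28, 28                 /* activations              */
};

/* Sum every entry of the table. */
unsigned total_text_bytes(void)
{
    unsigned total = 0;
    size_t n = sizeof text_bytes / sizeof text_bytes[0];
    for (size_t i = 0; i < n; i++) {
        total += text_bytes[i];
    }
    return total;
}
```

Of that total, the lookup tables in arm_nntables.o account for 2,816 bytes, the largest single object in the listing.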
13. EXAMPLE - CIFAR-10
• arm_convolve_HWC_q7_RGB()
• arm_relu_q7()
• arm_maxpool_q7_HWC()
• arm_convolve_HWC_q7_fast()
• arm_relu_q7()
• arm_avepool_q7_HWC()
• arm_convolve_HWC_q7_fast()
• arm_relu_q7()
• arm_avepool_q7_HWC()
• arm_fully_connected_q7()
• arm_softmax_q7()
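To make the activation and pooling steps in the sequence above concrete, here is a plain-C reference sketch of an in-place ReLU and a 2x2 max pool on q7 data in HWC (height, width, channel) layout. It is not the CMSIS-NN implementation and does not reproduce its signatures: the function names `relu_q7_ref` and `maxpool2x2_q7_hwc_ref`, the fixed 2x2 window, the stride of 2, and the absence of padding are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* In-place ReLU: every negative q7 value becomes zero. */
void relu_q7_ref(int8_t *data, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (data[i] < 0) {
            data[i] = 0;
        }
    }
}

/*
 * 2x2 max pooling with stride 2 and no padding on a square image.
 * HWC layout: element (y, x, c) lives at index (y * dim + x) * ch + c.
 * dim_in is assumed to be even; out must hold (dim_in/2)^2 * ch values.
 */
void maxpool2x2_q7_hwc_ref(const int8_t *in, int dim_in, int ch, int8_t *out)
{
    int dim_out = dim_in / 2;
    for (int y = 0; y < dim_out; y++) {
        for (int x = 0; x < dim_out; x++) {
            for (int c = 0; c < ch; c++) {
                int8_t best = INT8_MIN;
                for (int ky = 0; ky < 2; ky++) {
                    for (int kx = 0; kx < 2; kx++) {
                        int iy = 2 * y + ky;
                        int ix = 2 * x + kx;
                        int8_t v = in[(iy * dim_in + ix) * ch + c];
                        if (v > best) {
                            best = v;
                        }
                    }
                }
                out[(y * dim_out + x) * ch + c] = best;
            }
        }
    }
}
```

Because channels are interleaved per pixel in HWC, each output channel is pooled independently while the inner loop walks memory with a stride of `ch`.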
Weight and buffer sizes (q7_t elements, 1 byte each):
• conv1_wt: 2,400
• conv1_bias: 32
• conv2_wt: 12,800
• conv2_bias: 16
• conv3_wt: 12,800
• conv3_bias: 32
• ip1_wt: 10
• ip1_bias: 10
• input_data: 3K
• output_data: 10
• col_buffer: 3,200
• scratch_buffer: 40K
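Several of the sizes listed above can be reproduced with simple arithmetic. The sketch below assumes 5x5 kernels, a 32x32 RGB input, and 32, 16, and 32 output channels for the three convolution layers; these shapes are not stated in the slides and were chosen because they match the listed weight and bias counts. The col_buffer formula (two columns of 16-bit values) is likewise an assumption, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layer shapes, chosen to reproduce the listed sizes. */
enum {
    IN_DIM       = 32,  /* input image is IN_DIM x IN_DIM */
    IN_CH        = 3,   /* RGB                            */
    KERNEL_DIM   = 5,   /* 5x5 kernels in every conv      */
    CONV1_OUT_CH = 32,
    CONV2_OUT_CH = 16,
    CONV3_OUT_CH = 32
};

/* One q7 weight per (kernel row, kernel col, input ch, output ch). */
size_t conv_weight_bytes(size_t k, size_t ch_in, size_t ch_out)
{
    return k * k * ch_in * ch_out;
}

/* One q7 value per input pixel and channel. */
size_t input_bytes(size_t dim, size_t ch)
{
    return dim * dim * ch;
}

/* Assumption: two im2col columns of k*k*ch_in 16-bit values. */
size_t col_buffer_bytes(size_t k, size_t ch_in)
{
    return 2 * k * k * ch_in * sizeof(int16_t);
}
```

With these shapes, `conv_weight_bytes` gives 2,400, 12,800, and 12,800 for the three layers, `input_bytes(32, 3)` gives 3,072 (the "3K" above), and `col_buffer_bytes(5, 32)` gives 3,200.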