LUT-Network
~ FOR REAL-TIME COMPUTING~
REVISION2
Ryuji Fuchikami
渕上 竜司
• This document is an updated version of my presentation at fpgax, February 2, 2019
• https://www.slideshare.net/ryuz88/lut-network-fpgx201902
• This is the English translation
History of LUT-Network publishing
• BinaryBrain Version 1 (August 1, 2018 ~)
• I named it “LUT-Network”
• Flat programming
• Binary-LUT model (SIMD AVX2)
• Brute-force learning model
• Binary modulation model
• BinaryBrain Version 2 (September 2, 2018 ~)
• Layer-model programming
• CNN support
• Verilog-RTL export support
• Added back-propagation learning model
• Sparse-Affine model
• micro-MLP model
• BinaryBrain Version 3 (March 19, 2019 ~)
• Data objects on GPU (CUDA)
• Added Stochastic-LUT model
• Added regression sample
https://github.com/ryuz/BinaryBrain
What is Real-Time Computing?
• Technology that matches the computer to real-world dynamics.
• Computing in the living space
[Diagram: real-world computing in the living space, connecting humans and things: digital mirror, video conference, remote controller, exploration robot, care robot, AR glasses, with offline HPC and autonomous control alongside.]
YouTube movie : https://www.youtube.com/watch?v=wGRhw9bbiik&t=2s
Real-Time Binary-DNN architecture for FPGA
• von Neumann architecture: data enters memory first; high throughput, but long latency; best effort (variable fps).
• Dataflow programming for real-time: input device → processor → output device, with memory used only to refer to past data; hard real-time and low latency.
I invented the LUT-Network for this kind of real-time processing.
LUT-Network Overview
• Conventional DNN
1. Construct with perceptron nodes.
2. Do training.
3. Get perceptron’s weight parameter.
• LUT-Network
1. Construct with LUT nodes.
2. Do training.
3. Get table parameters
[Diagram: a perceptron node computing y = f(Σᵢ wᵢxᵢ − θ) from inputs x1..xn and weights w1..wn, contrasted with a LUT node whose table entries are the learned parameters.]
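As a minimal sketch of the difference (hypothetical helper functions, not the BinaryBrain API): a perceptron node's learned parameters are its weights and threshold, while a LUT node's learned parameter is the truth table itself.

#include <array>
#include <bitset>

// Conventional perceptron node: y = f(sum_i(w_i * x_i) - theta).
// The learned parameters are the weights w and the threshold theta.
float perceptron_node(const float* x, const float* w, int n, float theta) {
    float acc = -theta;
    for (int i = 0; i < n; ++i) acc += w[i] * x[i];
    return acc >= 0.0f ? 1.0f : 0.0f;  // binary activation
}

// LUT node: a 6-input LUT holds 2^6 = 64 table bits, and the learned
// parameter is the table. The inputs simply form the table index.
bool lut_node(const std::bitset<64>& table, const std::array<bool, 6>& x) {
    int index = 0;
    for (int i = 0; i < 6; ++i) index |= (x[i] ? 1 : 0) << i;
    return table[index];  // one lookup = one physical FPGA LUT
}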
LUT-Network Performance (xc7z020clg400-1)
• Very few resources
• Very low-delay real-time recognition: MNIST MLP classification at 318,877 fps
• 1 ms delay, 1000 fps throughput
Design Flow for LUT-Network
【Conventional】 Network Design → Learning (e.g. TensorFlow) → Convert to C++ (network parameters → C++ source code) → High-Level Synthesis (e.g. Vivado HLS) → RTL (behavioral) → Synthesis (e.g. Vivado) → Complete (many LUTs, 100~200 MHz)
【LUT-Network】 Network Circuit Design (the network is the FPGA circuit) → Learning (BinaryBrain) → RTL (net-list) → Synthesis (e.g. Vivado) → Complete (few LUTs, 300~400 MHz)
Features of LUT-Network
• Binary network for prediction on edge devices
• Classification and regression
• High density and high speed (300~400 MHz)
• Circuit size is determined prior to learning
• Real-time guarantees can be kept

                     | Conventional DNN      | LUT-Network
Recognition rate     | decided when learning | best effort
System performance   | best effort           | decided when learning (real-time guarantee)
How do you learn the LUT? (3 ideas)
1. Brute-force learning
• Directly optimize the LUT tables to minimize the loss function over the training data.
• MLP (multi-layer perceptron) only; cannot be applied to CNNs.
• Learning large networks is difficult.
• Does not use gradients for learning (possibly resistant to “Adversarial Examples”).
2. Learning with the micro-MLP model
• Applies the method of BDNNs.
• Forward: binary; backward: FP32.
• Low-speed learning on GPU, high-speed prediction on FPGA.
3. Learning with the Stochastic-LUT model
• Forward: FP32; backward: FP32.
• High-speed, high-accuracy learning on GPU, and high-speed prediction on FPGA.
Brute force learning
1. Initialize the LUT with random numbers.
2. Fix the output to 0 and to 1 in turn, and pass all the training data through the network.
3. Accumulate the sum of the loss function for each LUT input value, and update each table entry in the direction that reduces the loss (see the sketch after the table below).
input   frequency   loss with 0   loss with 1   new table value
0       37932       47813.7       48233.9       0
1       39482       50001.3       49692.9       1
2       37028       44698.9       44845.7       0
3       40640       49257.1       49331.0       0
4       27156       33998.4       33891.0       1
5       23930       29538.6       29495.2       1
6       29002       35197.3       35451.4       0
7       27786       33390.9       33466.9       0
8       43532       52741.1       52993.5       0
9       41628       49985.9       50388.5       0
10      49176       56521.4       56026.1       1
11      46542       54215.4       54284.9       0
・・・・
59      34268       41152.9       41215.8       0
60      22872       28852.4       29000.0       0
61      17930       22068.9       22112.9       0
62      24156       28213.2       28227.1       0
63      24194       28367.0       28450.4       0
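A minimal sketch of steps 2-3 for a single 6-input LUT node, assuming a caller-supplied eval_loss that runs the network on one sample with this node's output forced to a fixed value (hypothetical helpers, not the BinaryBrain implementation):

#include <bitset>
#include <vector>

// For each table entry, accumulate the loss with the node's output forced
// to 0 and to 1, then keep whichever value gives the smaller total loss.
template <typename Sample, typename EvalLoss, typename LutIndexOf>
std::bitset<64> brute_force_update(const std::vector<Sample>& train_data,
                                   EvalLoss eval_loss,       // loss(sample, forced_output)
                                   LutIndexOf lut_index_of)  // table index this sample hits
{
    double loss0[64] = {}, loss1[64] = {};
    for (const auto& s : train_data) {
        int idx = lut_index_of(s);
        loss0[idx] += eval_loss(s, 0);
        loss1[idx] += eval_loss(s, 1);
    }
    std::bitset<64> table;
    for (int i = 0; i < 64; ++i)
        table[i] = (loss1[i] < loss0[i]);  // e.g. row 1 above: 49692.9 < 50001.3 -> 1
    return table;
}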
Micro-MLP learning

Dense-Affine (fully connected): every output is connected to every input, so the weight matrix is fully populated:

$$W_{dense} = \begin{pmatrix} a & b & c & d & e \\ f & g & h & i & j \\ k & l & m & n & o \\ p & q & r & s & t \\ u & v & w & x & y \end{pmatrix}$$

• BatchNormalization + Binary-Activation, then synthesis → deep logic (low speed, middle performance), 100 MHz~200 MHz.

Sparse-Affine (my 1st idea): each output is connected to only a few inputs, so most entries are zero (zero placement below is illustrative):

$$W_{sparse} = \begin{pmatrix} 0 & b & 0 & d & 0 \\ f & g & 0 & 0 & 0 \\ k & 0 & 0 & 0 & o \\ 0 & 0 & 0 & s & t \\ 0 & v & w & 0 & 0 \end{pmatrix}$$

• BatchNormalization + Binary-Activation, then LUT mapping → simple logic, high speed (300 MHz~400 MHz) but low performance.
• A single node is still one linear threshold function, so it cannot learn XOR (true for the Dense-Affine node as well).

micro-MLP stack (my 2nd idea):
• Each unit is a tiny MLP with a hidden layer; BatchNormalization + Binary-Activation, then LUT mapping → simple logic, high speed (300 MHz~400 MHz) and high performance.
• The LUT absorbs the BN and the hidden layer, so the unit can learn XOR.
• This unit is the “micro-MLP” (a conversion sketch follows below).
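Why a micro-MLP unit maps to a single 6-input LUT: after training, all 2^6 input patterns are enumerated once and the binarized outputs are frozen into a table, folding the hidden layer and BatchNormalization away. A minimal sketch with stand-in weights (in the real flow they come from BinaryBrain training; this is not the BinaryBrain API):

#include <bitset>

// Toy trained micro-MLP: 6 binary inputs -> 16 binary hidden units -> 1 output.
constexpr int H = 16;
float W1[H][6], b1[H];  // hidden-layer weights/biases (stand-ins for learned values)
float W2[H], b2;        // output-layer weights/bias

bool micro_mlp_forward(const bool (&x)[6]) {
    float acc = b2;
    for (int h = 0; h < H; ++h) {
        float a = b1[h];
        for (int i = 0; i < 6; ++i) a += W1[h][i] * (x[i] ? 1.0f : -1.0f);
        acc += W2[h] * (a >= 0.0f ? 1.0f : -1.0f);  // binary hidden activation
    }
    return acc >= 0.0f;
}

// Enumerate all 64 input patterns and freeze the unit into a LUT table.
std::bitset<64> micro_mlp_to_lut() {
    std::bitset<64> table;
    for (int idx = 0; idx < 64; ++idx) {
        bool x[6];
        for (int i = 0; i < 6; ++i) x[i] = ((idx >> i) & 1) != 0;
        table[idx] = micro_mlp_forward(x);  // BN and hidden layer fold into the table
    }
    return table;
}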
Binary activation layer for the micro-MLP
• forward: Sign()

$$y = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

• backward: hard-tanh() as the surrogate gradient

$$g(x) = 1_{|x| \le 1}$$

• Same as the BinaryConnect method.
• Batch Normalization uses the conventional implementation.

[Diagram: BatchNormalization → Binary-Activation stack being mapped onto a LUT.]
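A minimal sketch of this forward/backward pair (the straight-through estimator used by BinaryConnect; illustrative, not the BinaryBrain API):

#include <cmath>

// forward: Sign()-style binarization
float binary_act_forward(float x) {
    return (x >= 0.0f) ? 1.0f : 0.0f;
}

// backward: hard-tanh surrogate gradient; the incoming gradient passes
// unchanged where |x| <= 1 and is cut off elsewhere.
float binary_act_backward(float x, float grad_out) {
    return (std::fabs(x) <= 1.0f) ? grad_out : 0.0f;
}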
Stochastic-LUT model learning

[Diagram: computation graph of the 2-input Stochastic-LUT; inputs x0, x1 and table values W0..W3 feed multipliers and an adder producing y, with the W values binarized for the prediction side.]

e.g.) 2-input LUT model: x0 and x1 are input stochastic variables, W0..W3 are the table values.
Probability that W0 is selected : (1 - x1) * (1 - x0)
Probability that W1 is selected : (1 - x1) * x0
Probability that W2 is selected : x1 * (1 - x0)
Probability that W3 is selected : x1 * x0
y = W0 * (1 - x1) * (1 - x0)
  + W1 * (1 - x1) * x0
  + W2 * x1 * (1 - x0)
  + W3 * x1 * x0
Because this calculation tree is differentiable, back-propagation can be computed through it (see the sketch below). The formula for a 6-input LUT is larger, but it is computed in the same way.
Using the Stochastic-LUT model, learning is much faster and more accurate than with the micro-MLP model.
• No need for Batch-Normalization
• No need for Activation
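A minimal sketch of the 2-input case above with the partial derivatives written out (illustrative, not the BinaryBrain API); x0 and x1 are probabilities in [0,1] and W[0..3] are the learned table values:

struct StochasticLut2 {
    float W[4];  // table values being learned

    // y = sum over table entries of (selection probability * table value)
    float forward(float x0, float x1) const {
        return W[0] * (1 - x1) * (1 - x0)
             + W[1] * (1 - x1) * x0
             + W[2] * x1 * (1 - x0)
             + W[3] * x1 * x0;
    }

    // Every term is a product, so the partial derivatives fall out directly.
    void backward(float x0, float x1, float dy,
                  float dW[4], float& dx0, float& dx1) const {
        dW[0] = dy * (1 - x1) * (1 - x0);
        dW[1] = dy * (1 - x1) * x0;
        dW[2] = dy * x1 * (1 - x0);
        dW[3] = dy * x1 * x0;
        dx0 = dy * ((1 - x1) * (W[1] - W[0]) + x1 * (W[3] - W[2]));
        dx1 = dy * ((1 - x0) * (W[2] - W[0]) + x0 * (W[3] - W[1]));
    }
};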
Stochastic-LUT model
• input[n-1:0] → output
• The table holds probability values as an n-dimensional continuum.
Benchmark against other binary networks

Network                  | Matrix | Weight      | Activation  | Convolution / Deep Network | Learning (CPUs/GPUs) | Prediction (FPGA)
BinaryConnect            | Dense  | Binary      | Real (FP32) | OK                         | good                 | 1 node → many adders
Binarized Neural Network | Dense  | Binary      | Binary      | OK                         | excellent            | 1 node → many XNOR
XNOR-Network             | Dense  | Binary      | Binary      | OK                         | excellent            | 1 node → many XNOR
LUT-Network              | Sparse | Real (FP32) | none        | OK                         | good                 | 1 node → 1 LUT (excellent)
Demonstration 1 [MNIST MLP 1000fps]

[Block diagram: Raspberry Pi Camera V2 (Sony IMX219) → MIPI-CSI RX → SERDES TX/RX (FIN1216) → DNN (LUT-Net) in the PL (Jelly) of a ZYBO Z7-20, running at 640x132 / 1000fps; PS (Linux) side with DMA, DDR3 SDRAM, and I2C control; output to an OLED (UG-9664HDDAG01) at 1000fps plus a debug view on a PC X-Window over Ethernet; the RTL comes from offline learning with BinaryBrain on the PC.]

YouTube movie: https://www.youtube.com/watch?v=NJa77PZlQMI
MNIST MLP 1000fps
• Network: input 784 → 256 → 256 → 128 → 128 → 128 → 128 → 128 → 30 (layer0..layer7)
• LUT count of the DNN part: 1182
• [Utilization tables: DNN part alone, and total including Camera/OLED control]
• Throughput: 250 MHz / (28×28 pixels) = 318,877 fps
Demonstration 2 [MNIST CNN 1000fps]

[Block diagram: the same camera-to-OLED system as Demonstration 1, with an added OSD (frame memory) stage.]

YouTube movie : https://www.youtube.com/watch?v=aYuYrYxztBU
MNIST CNN (DNN part)
Structure: CNV3x3 → CNV3x3 → MaxPool → CNV3x3 → CNV3x3 → MaxPool → Affine → Affine
// sub-networks for convolution(3x3): a pair of SparseMicroMlp layers
// (template arguments: 6 LUT inputs, 16 hidden units; constructor: input nodes, output nodes)
bb::NeuralNetSparseMicroMlp<6, 16> sub0_smm0(1 * 3 * 3, 192);
bb::NeuralNetSparseMicroMlp<6, 16> sub0_smm1(192, 32);
bb::NeuralNetGroup<> sub0_net;
sub0_net.AddLayer(&sub0_smm0);
sub0_net.AddLayer(&sub0_smm1);

// sub-networks for convolution(3x3)
bb::NeuralNetSparseMicroMlp<6, 16> sub1_smm0(32 * 3 * 3, 192);
bb::NeuralNetSparseMicroMlp<6, 16> sub1_smm1(192, 32);
bb::NeuralNetGroup<> sub1_net;
sub1_net.AddLayer(&sub1_smm0);
sub1_net.AddLayer(&sub1_smm1);

// sub-networks for convolution(3x3)
bb::NeuralNetSparseMicroMlp<6, 16> sub3_smm0(32 * 3 * 3, 192);
bb::NeuralNetSparseMicroMlp<6, 16> sub3_smm1(192, 32);
bb::NeuralNetGroup<> sub3_net;
sub3_net.AddLayer(&sub3_smm0);
sub3_net.AddLayer(&sub3_smm1);

// sub-networks for convolution(3x3)
bb::NeuralNetSparseMicroMlp<6, 16> sub4_smm0(32 * 3 * 3, 192);
bb::NeuralNetSparseMicroMlp<6, 16> sub4_smm1(192, 32);
bb::NeuralNetGroup<> sub4_net;
sub4_net.AddLayer(&sub4_smm0);
sub4_net.AddLayer(&sub4_smm1);

// main-network: each LoweringConvolution applies its sub-network to every 3x3
// window (arguments: sub-net, in channels, in height, in width, out channels, filter h, filter w)
bb::NeuralNetRealToBinary<float> input_real2bin(28 * 28, 28 * 28);
bb::NeuralNetLoweringConvolution<> layer0_conv(&sub0_net, 1, 28, 28, 32, 3, 3);
bb::NeuralNetLoweringConvolution<> layer1_conv(&sub1_net, 32, 26, 26, 32, 3, 3);
bb::NeuralNetMaxPooling<> layer2_maxpol(32, 24, 24, 2, 2);
bb::NeuralNetLoweringConvolution<> layer3_conv(&sub3_net, 32, 12, 12, 32, 3, 3);
bb::NeuralNetLoweringConvolution<> layer4_conv(&sub4_net, 32, 10, 10, 32, 3, 3);
bb::NeuralNetMaxPooling<> layer5_maxpol(32, 8, 8, 2, 2);
bb::NeuralNetSparseMicroMlp<6, 16> layer6_smm(32 * 4 * 4, 480);
bb::NeuralNetSparseMicroMlp<6, 16> layer7_smm(480, 80);
bb::NeuralNetBinaryToReal<float> output_bin2real(80, 10);
xc7z020clg400-1
MNIST CNN (system total)
• Includes Camera and OLED control
• Result of RTL simulation
MNIST CNN Learning log [micro-MLP]
fitting start : MnistCnnBin
initial test_accuracy : 0.1518
[save] MnistCnnBin_net_1.json
[load] MnistCnnBin_net.json
fitting start : MnistCnnBin
[initial] test_accuracy : 0.6778 train_accuracy : 0.6694
695.31s epoch[ 2] test_accuracy : 0.7661 train_accuracy : 0.7473
1464.13s epoch[ 3] test_accuracy : 0.8042 train_accuracy : 0.7914
2206.67s epoch[ 4] test_accuracy : 0.8445 train_accuracy : 0.8213
2913.12s epoch[ 5] test_accuracy : 0.8511 train_accuracy : 0.8460
3621.61s epoch[ 6] test_accuracy : 0.8755 train_accuracy : 0.8616
4325.83s epoch[ 7] test_accuracy : 0.8713 train_accuracy : 0.8730
5022.86s epoch[ 8] test_accuracy : 0.9086 train_accuracy : 0.8863
5724.22s epoch[ 9] test_accuracy : 0.9126 train_accuracy : 0.8930
6436.04s epoch[ 10] test_accuracy : 0.9213 train_accuracy : 0.8986
7128.01s epoch[ 11] test_accuracy : 0.9115 train_accuracy : 0.9034
7814.35s epoch[ 12] test_accuracy : 0.9078 train_accuracy : 0.9061
8531.97s epoch[ 13] test_accuracy : 0.9089 train_accuracy : 0.9082
9229.73s epoch[ 14] test_accuracy : 0.9276 train_accuracy : 0.9098
9950.20s epoch[ 15] test_accuracy : 0.9161 train_accuracy : 0.9105
10663.83s epoch[ 16] test_accuracy : 0.9243 train_accuracy : 0.9146
11337.86s epoch[ 17] test_accuracy : 0.9280 train_accuracy : 0.9121
fitting end
micro-MLP model on BinaryBrain version 2
MNIST CNN Learning log [Stochastic-LUT]
fitting start : MnistStochasticLut6Cnn
72.35s epoch[ 1] test accuracy : 0.9508 train accuracy : 0.9529
153.70s epoch[ 2] test accuracy : 0.9581 train accuracy : 0.9638
235.33s epoch[ 3] test accuracy : 0.9615 train accuracy : 0.9676
316.71s epoch[ 4] test accuracy : 0.9647 train accuracy : 0.9701
398.33s epoch[ 5] test accuracy : 0.9642 train accuracy : 0.9718
479.71s epoch[ 6] test accuracy : 0.9676 train accuracy : 0.9731
・
・
・
2111.04s epoch[ 26] test accuracy : 0.9699 train accuracy : 0.9786
2192.82s epoch[ 27] test accuracy : 0.9701 train accuracy : 0.9788
2274.26s epoch[ 28] test accuracy : 0.9699 train accuracy : 0.9789
2355.97s epoch[ 29] test accuracy : 0.9699 train accuracy : 0.9789
2437.39s epoch[ 30] test accuracy : 0.9696 train accuracy : 0.9791
2519.13s epoch[ 31] test accuracy : 0.9698 train accuracy : 0.9793
2600.71s epoch[ 32] test accuracy : 0.9695 train accuracy : 0.9792
fitting end
parameter copy to LUT-Network
lut_accuracy : 0.9641
export : verilog/MnistStochasticLut6Cnn.v
Stochastic-LUT model on BinaryBrain version 3
Linear Regression [Stochastic-LUT]
(diabetes data from scikit-learn)
fitting start : DiabetesRegressionStochasticLut6
[initial] test MSE : 0.0571 train MSE : 0.0581
0.97s epoch[ 1] test MSE : 0.0307 train MSE : 0.0344
1.42s epoch[ 2] test MSE : 0.0209 train MSE : 0.0284
1.87s epoch[ 3] test MSE : 0.0162 train MSE : 0.0270
2.32s epoch[ 4] test MSE : 0.0160 train MSE : 0.0261
・
・
・
27.11s epoch[ 59] test MSE : 0.0146 train MSE : 0.0245
27.55s epoch[ 60] test MSE : 0.0195 train MSE : 0.0256
27.99s epoch[ 61] test MSE : 0.0145 train MSE : 0.0231
28.43s epoch[ 62] test MSE : 0.0133 train MSE : 0.0232
28.87s epoch[ 63] test MSE : 0.0940 train MSE : 0.0903
29.30s epoch[ 64] test MSE : 0.0146 train MSE : 0.0233
fitting end
parameter copy to LUT-Network
LUT-Network accuracy : 0.0340518
export : DiabetesRegressionBinaryLut.v
Stochastic-LUT model on BinaryBrain version 3
Resource estimate

For each operator: learning cost on a CPU (1 core), prediction cost on a CPU (1 core, one weight-calculation instruction), and prediction resources on FPGA (Xilinx 7-Series) and ASIC, each for a multi-cycle and a pipeline implementation. “left × node count” means the multi-cycle figure multiplied by the number of nodes.

Affine (Float)
• Learning (CPU): multiplier + adder, 0.25 cycle
• Prediction (CPU): multiplier + adder, 0.125 cycle (8 parallel [FMA])
• FPGA: multi-cycle [MUL] DSP:2, LUT:133, [ADD] LUT:413; pipeline: left × node count
• ASIC: multi-cycle gate: over 10k; pipeline gate: over 10M

Affine (INT16)
• Learning (CPU): multiplier + adder, 0.125 cycle
• Prediction (CPU): multiplier + adder, 0.0625 cycle (16 parallel)
• FPGA: multi-cycle [MAC] DSP:1; pipeline: left × node count
• ASIC: multi-cycle gate: 0.5k~1k; pipeline gate: over 1M

Binary Connect
• Learning (CPU): multiplier + adder, 0.25 cycle
• Prediction (CPU): adder + adder, 0.125 cycle (8 parallel)
• FPGA: multi-cycle [MAC] DSP:1; pipeline: left × node count
• ASIC: multi-cycle gate: 100~200; pipeline: left × node count

BNN / XNOR-Net
• Learning (CPU): multiplier + adder, 0.25 cycle
• Prediction (CPU): XNOR + popcnt, 0.0039 + 0.0156 cycle (256 parallel)
• FPGA: multi-cycle LUT: 6~12; pipeline LUT: 400~10000 (depends on connection count)
• ASIC: multi-cycle gate: 20~60; pipeline: left × node count

6-LUT-Net
• Learning (CPU): multiplier + adder, 23.8 cycles
• Prediction (CPU): LUT, 1.16 cycles ((6 input loads + 1 table load) / 6, 256 parallel)
• FPGA: multi-cycle LUT: 1 (over spec); pipeline LUT: 1 (fit)
• ASIC: multi-cycle gate: 10~30 (over spec); pipeline gate: 10~30

2-LUT-Net
• Learning (CPU): multiplier + adder, 1.37 cycles
• Prediction (CPU): logic gate, 1.5 cycles ((2 input loads + 1 table load) / 2)
• FPGA: multi-cycle LUT: 1 (over spec); pipeline LUT: 1 (over spec)
• ASIC: multi-cycle gate: 1 (over spec); pipeline gate: 1 (fit)
Oversampling and binary modulation
• Oversampling with quantization modulation: PWM (pulse-width modulation), delta-sigma modulation, random dither, etc.
• For example, high-speed camera images naturally contain noise.
• An LPF (low-pass filter) removes the noise and dequantizes the signal, so regression analysis becomes possible.
• e.g.) the LPF can be built from an IIR/FIR/Kalman filter.

[Diagram: input → modulation / quantization (driven by random noise or a local oscillator) → DNN → LPF → output.]

Human senses include an LPF. A sketch of the idea follows.
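A minimal sketch under a random-dither assumption (illustrative only): a real value in [0,1] becomes an oversampled binary stream, and a plain moving average plays the role of the LPF that dequantizes it; in the real system the binary DNN sits between the two stages.

#include <random>

// Encode a real value as n random-dithered binary samples, then recover
// it with a moving average (the LPF). More oversampling -> less noise.
float modulate_and_lowpass(float value, int n, std::mt19937& rng) {
    std::uniform_real_distribution<float> dither(0.0f, 1.0f);
    int ones = 0;
    for (int i = 0; i < n; ++i)
        ones += (dither(rng) < value) ? 1 : 0;  // binary modulation
    return static_cast<float>(ones) / n;        // LPF output ~ value
}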
Architecture proposal for real-time
[Diagram: Video-In → DNN → Video-Out, with ME (motion estimation) / MC (motion compensation) and a frame memory feeding past frames back into the DNN; similar to an IIR filter.]
Next approach
• Improving the sparse connection rules
• Currently, connections are selected at random; but real data has locality, which CNNs exploit.
• One method is to choose each connection destination probabilistically by node distance, e.g. with a Gaussian function (a sketch follows below).
• I want to build stacked connections in a pyramid structure.
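A minimal sketch of this connection rule as I would interpret it (not yet implemented in BinaryBrain): pick each of a node's inputs from the previous layer with probability decaying as a Gaussian of the distance between node indices.

#include <cmath>
#include <random>
#include <vector>

// Choose fan_in input sources for destination node dst, preferring nearby
// nodes (1-D distance here; 2-D distance would suit image-like layers).
std::vector<int> pick_inputs(int dst, int prev_size, int fan_in,
                             float sigma, std::mt19937& rng) {
    std::vector<double> w(prev_size);
    for (int src = 0; src < prev_size; ++src) {
        double d = src - dst;
        w[src] = std::exp(-d * d / (2.0 * sigma * sigma));  // Gaussian locality
    }
    std::discrete_distribution<int> dist(w.begin(), w.end());
    std::vector<int> inputs(fan_in);
    for (int k = 0; k < fan_in; ++k)
        inputs[k] = dist(rng);  // may repeat; a real rule would deduplicate
    return inputs;
}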
reference
• BinaryConnect: Training Deep Neural Networks with binary weights during propagations
https://arxiv.org/pdf/1511.00363.pdf
• Binarized Neural Networks
https://arxiv.org/abs/1602.02505
• Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations
Constrained to +1 or -1
https://arxiv.org/abs/1602.02830
• XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
https://arxiv.org/abs/1603.05279
• Xilinx UltraScale Architecture Configurable Logic Block User Guide
https://japan.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
My Profile
• Open-source programmer (hobbyist)
• Born in 1976, living in Fukuoka City, Japan
• 1998~ publishing HOS (a Real-Time OS [uITRON])
• https://ja.osdn.net/projects/hos/
(ARM, H8, SH, MIPS, x86, Z80, AM, V850, MicroBlaze, etc.)
• 2008~ publishing Jelly (soft-core CPU for FPGA)
• https://github.com/ryuz/jelly
• http://ryuz.my.coocan.jp/jelly/toppage.html
• 2018~ publishing LUT-Network
• https://github.com/ryuz/BinaryBrain
• Real-Time AR-glasses project (my current hobby)
• Real-Time glasses (camera [IMX219] & OLED, 1000fps)
https://www.youtube.com/watch?v=wGRhw9bbiik
• Real-Time GPU (no-frame-buffer architecture)
https://www.youtube.com/watch?v=vl-lhSOOlSk
• Real-Time DNN (LUT-Network)
https://www.youtube.com/watch?v=aYuYrYxztBU
Contact me
• Ryuji Fuchikami (渕上 竜司)
• e-mail : ryuji.fuchikami@nifty.com
• Web site : http://ryuz.my.coocan.jp/
• Blog : http://ryuz.txt-nifty.com/
• GitHub : https://github.com/ryuz/
• Twitter : https://twitter.com/Ryuz88
• Facebook : https://www.facebook.com/ryuji.fuchikami
• YouTube : https://www.youtube.com/user/nekoneko1024