PRMU201902 Presentation document

Masayuki Tanaka
PRMU Workshop, 2019/02
An introduction to my (personal) work
on deep learning

Overview
1. mgq (Minimal Gram task Queue): a (handy) simple task queue
   https://github.com/likesilkto/mgqueue
2. train1000 (Train with small samples): small-sample training (for practice)
   http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/train1000/
3. WiG (Weighted Sigmoid Gate Unit): a new activation function
   http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/WiG/

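Task queues and deep learning
[Figure: timeline (time axis from 18:00 to 10:00 the next day) showing Task0 to Task3 scheduled on GPU0 and GPU1]
We want to make efficient use of (scarce) GPU resources.
As soon as one job finishes, we want the next one to start!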
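The idea of starting the next job the moment the previous one exits can be sketched in a few lines of Python. This is only an illustration of the queueing idea, not how mgq is implemented; `run_queue` and the stand-in jobs are hypothetical names introduced here.

```python
import subprocess
import sys
from collections import deque


def run_queue(commands):
    """Run commands one after another; return (command, exit code) pairs."""
    pending = deque(commands)
    finished = []
    while pending:
        cmd = pending.popleft()
        # subprocess.run blocks until the job exits, so the loop starts
        # the next job immediately afterwards, with no idle time.
        result = subprocess.run(cmd)
        finished.append((cmd, result.returncode))
    return finished


# Hypothetical stand-ins for training jobs such as "python train1.py".
jobs = [[sys.executable, "-c", f"print('task {i} done')"] for i in range(3)]
results = run_queue(jobs)
```

One such loop per GPU would keep each device busy overnight without anyone having to launch jobs by hand.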

mgq (Minimal Gram task Queue)
A Python application.
Install:
% pip install git+https://github.com/likesilkto/mgqueue
Add a task:
% mgq queue_name ad 'python train1.py'
Start:
% mgq queue_name start
% mgq queue_name start –gmail [the part of your Gmail address before the @]

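Train1000 project
CIFAR-10, CIFAR-100
Training images: 50,000
Test images: 10,000
Training takes several hours even on a GPU.
Trying out different ideas takes a long time, which is hard on beginners who want to practice.
Can a network be trained from a small number of samples?
How much accuracy can be reached with only 1,000 training samples?
#train1000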

Train1000 project
#train1000
mnist
100 samples x 10 classes = 1,000 samples Test Acc.: 0.9786
fashion_mnist
cifar-10
cifar-100
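Building a 1,000-sample training set with the same number of samples in every class can be sketched as below. The slides do not show how the project draws its subset, so the function name `sample_per_class`, the fixed seed, and the synthetic labels are assumptions made for illustration.

```python
import numpy as np


def sample_per_class(labels, n_per_class, seed=0):
    """Return sorted indices selecting n_per_class samples from every class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    picked = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)  # positions of all samples of class c
        picked.append(rng.choice(idx, size=n_per_class, replace=False))
    return np.sort(np.concatenate(picked))


# Synthetic labels standing in for a 50,000-image, 10-class training set.
labels = np.repeat(np.arange(10), 5000)
subset = sample_per_class(labels, n_per_class=100)  # 100 x 10 = 1,000 indices
```

For a 100-class dataset, the same 1,000-sample budget would correspond to `n_per_class=10`.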


Activation Functions for DNNs
[Figure: two block diagrams, one with Input x, Weight, Activation function, Output y, and one with Input x, Conv., Activation function, Output y]
Activation functions:
Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
tanh: $\tanh(x)$
ReLU: $\max(x, 0)$
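As a quick illustration, the sigmoid and ReLU above can be written element-wise with NumPy (tanh is available directly as `np.tanh`). The sigmoid uses the identity $\sigma(x) = \frac{1}{2}(1 + \tanh(x/2))$, chosen here only to avoid overflow for large negative inputs.

```python
import numpy as np


def sigmoid(x):
    # Same function as 1 / (1 + exp(-x)), written via tanh for numerical stability.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def relu(x):
    return np.maximum(x, 0.0)


x = np.array([-2.0, 0.0, 2.0])
s = sigmoid(x)   # values in (0, 1)
t = np.tanh(x)   # values in (-1, 1)
r = relu(x)      # negative inputs clipped to zero
```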

Advanced Activation Functions
ReLU: $\max(x, 0)$
Leaky ReLU, Parametric ReLU: $\begin{cases} x & (x \ge 0) \\ \alpha x & (x < 0) \end{cases}$
swish, SiL: $x\,\sigma(wx + b)$
Existing activation functions are element-wise functions.
Dying ReLU: dead ReLU units always return zero.
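A minimal NumPy sketch of these variants follows. The slope `alpha=0.01` and the defaults `w=1.0`, `b=0.0` are illustrative values, not taken from the slides (in the Parametric ReLU, $\alpha$ is learned). The last lines show a "dead" ReLU unit: its pre-activation is negative for every input, so it always returns zero.

```python
import numpy as np


def leaky_relu(x, alpha=0.01):
    # x for x >= 0, alpha * x otherwise.
    return np.where(x >= 0, x, alpha * x)


def swish(x, w=1.0, b=0.0):
    # x * sigmoid(w * x + b), with a numerically stable sigmoid.
    z = w * x + b
    return x * 0.5 * (1.0 + np.tanh(0.5 * z))


x = np.array([-3.0, -1.0, 0.0, 2.0])
leaky_out = leaky_relu(x)
swish_out = swish(x)

# A unit with a large negative bias (hypothetical value -5) is dead on these inputs.
dead_out = np.maximum(x - 5.0, 0.0)
```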

WiG: Weighted Sigmoid Gate (Proposed)
Existing activation functions are element-wise functions.
A sigmoid-gated network can be used as an activation function.
[Figure: a conventional unit (Weight, Activation function) next to the proposed unit (Weight, Activation network unit)]
Proposed WiG (Weighted sigmoid gate unit)
[Figure: WiG activation unit, built from W, Wg, a Sigmoid, and a multiplication (×)]
It is compatible with existing activation functions.
It includes the ReLU.
My recommendation:
You can improve network performance just by replacing the ReLU with the proposed WiG.

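WiG: Three-state
$\mathbf{y} = \sigma(\mathbf{W}_g \mathbf{x} + \mathbf{b}_g) \otimes (\mathbf{W}\mathbf{x} + \mathbf{b})$
Human retinal cells:
ON-center receptive field: the brighter the center, the larger the output
OFF-center receptive field: the darker the center, the larger the output
No response
In the equation, $\sigma(\mathbf{W}_g \mathbf{x} + \mathbf{b}_g)$ controls the threshold and $(\mathbf{W}\mathbf{x} + \mathbf{b})$ controls the (signed) magnitude of the response.
The two can (perhaps) be controlled independently.
Uchida, Coupled convolution layer for convolutional neural network, 2018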
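The WiG equation on this slide can be sketched for a fully connected layer as follows, reading $\otimes$ as the element-wise product. The shapes and random weights are placeholders chosen for illustration; the author's reproduction code may be organized differently.

```python
import numpy as np


def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))  # stable form of 1 / (1 + exp(-z))


def wig(x, W, b, Wg, bg):
    """y = sigmoid(Wg x + bg) * (W x + b), with * taken element-wise."""
    gate = sigmoid(Wg @ x + bg)   # in [0, 1]: decides whether a unit responds
    response = W @ x + b          # signed magnitude of the response
    return gate * response


rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W, Wg = rng.standard_normal((3, 4)), rng.standard_normal((3, 4))
b, bg = np.zeros(3), np.zeros(3)
y = wig(x, W, b, Wg, bg)
```

Because the gate and the response have separate weights, a unit can be positive, negative, or silent: the three states in the slide title.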

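WiG: Lateral inhibition
Lateral inhibition in the brain
WiG can realize lateral inhibition!
Lateral inhibition: in the spatial arrangement of neurons, the responses of neurons surrounding a strongly responding neuron are suppressed.
[Figure: the WiG equation, annotated with response magnitude and threshold control]
Element-wise activation functions: lateral inhibition is impossible.
A $\mathbf{W}_g$ that realizes lateral inhibition can be designed easily.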
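One way such a gate matrix might look is sketched below: each unit's gate is excited by its own input and inhibited by its two neighbours. The specific weights (+2 on the diagonal, -1 on the off-diagonals) and the identity response weights are hypothetical values chosen for illustration, not a design taken from the slides.

```python
import numpy as np


def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))


n = 5
# Hypothetical gate weights: self-excitation (+2), neighbour inhibition (-1 each).
Wg = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
W, b, bg = np.eye(n), np.zeros(n), np.zeros(n)

x = np.array([1.0, 1.0, 4.0, 1.0, 1.0])  # one strong response in the middle
gate = sigmoid(Wg @ x + bg)
y = gate * (W @ x + b)
```

Units 1 and 3 receive the same input as units 0 and 4, yet their gates are nearly closed because they sit next to the strong response. An element-wise activation cannot do this, since it never sees neighbouring inputs.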

WiG with sparseness constraint
Sparseness: $\mathbf{y}$ has few non-zero elements
Sparseness constraint: $\|\sigma(\mathbf{W}_g \mathbf{x} + \mathbf{b}_g)\|_1$
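Assuming the constraint is the L1 norm of the gate, the penalty can be computed as in the sketch below. How the term is weighted and added to the training loss is not stated on the slide, so only the norm itself is shown; `gate_l1` and the example biases are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def gate_l1(x, Wg, bg):
    """L1 norm of the gate sigmoid(Wg x + bg)."""
    # The gate is non-negative, so the L1 norm equals the plain sum.
    return float(np.sum(np.abs(sigmoid(Wg @ x + bg))))


x = np.ones(4)
Wg = np.eye(4)
open_penalty = gate_l1(x, Wg, bg=np.full(4, 5.0))     # gates mostly open
closed_penalty = gate_l1(x, Wg, bg=np.full(4, -5.0))  # gates mostly closed
```

Minimizing this quantity pushes gates toward zero, which in turn leaves few non-zero elements in the output.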

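WiG and ReLU
Scalar WiG: $y = \sigma(\alpha x) \times x$
Gate: $y = \sigma(\alpha x)$; as $\alpha \to \infty$, $y = \begin{cases} 0 & (x < 0) \\ 1 & (x \ge 0) \end{cases}$
Unit: $y = \sigma(\alpha x) \times x$; as $\alpha \to \infty$, $y = \max(0, x)$
WiG can reproduce the ReLU!
Replacing the ReLU in an existing network with WiG can improve performance! (maybe)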
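The limit can be checked numerically: as $\alpha$ grows, $\sigma(\alpha x) \times x$ gets closer to $\max(0, x)$. The grid of inputs and the values of $\alpha$ below are arbitrary choices for illustration.

```python
import numpy as np


def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))  # stable even for very large |z|


x = np.linspace(-3.0, 3.0, 13)
relu = np.maximum(0.0, x)

# Largest gap between sigmoid(alpha * x) * x and the ReLU, for growing alpha.
alphas = (1.0, 10.0, 100.0)
gaps = [float(np.max(np.abs(sigmoid(a * x) * x - relu))) for a in alphas]
```

The entries of `gaps` shrink toward zero as $\alpha$ increases, which is why a trained WiG can fall back to ReLU-like behaviour when that is what the data favours.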

Experimental Validations
Object recognition
Average accuracy
Image denoising
The reproduction code is available

Summary
