Tutorial of GANs in Gifu Univ

Tutorial of GANs
2018/1/15
岐阜大学加藤研究室中塚俊介

 名前：中塚俊介
 所属：岐阜大学大学院加藤研究室
 研究
深層学習を用いた外観検査
 少数不良品サンプル下における正常モデル生成と異常度判定手法
 回帰型CNNを用いた外観検査手法
 DAEGANを用いた欠損復元による外観検査手法
2
About me
My Mail Address : nakatsuka@cv.info.gifu-u.ac.jp
My github : https://github.com/salty-vanilla
My Qiita : https://qiita.com/salty-vanilla
Input
From distribution?
From AutoEncoder?
Input OutputEncoder Decoder
Discriminator
正常品画像正常品画像

 Introduction
 Standard GAN
 Wasserstein GAN
 Applications of GAN
3
Agenda

 特徴抽出層
 Convolutional Layer
 Pooling Layer
 識別部
 Fully Connected Layer
CNN: Convolutional Neural Networks
5
cat
leopard
dog
human
bird

 フィルタを畳み込み演算するLayer
 フィルタの係数を学習する
Convolutional Layer
学習されたフィルタ
6

What is Convolution ?
120 100 135 50 60 150
125 95 150 40 30 100
55 60 60 25 50 150
40 75 30 150 250 230
60 100 50 60 75 245
40 75 30 150 250 230
0.8 0.0 0.4
0.9 0.5 0.2
0.0 0.1 0.1
Input Image
Filter
7

120 100 135 50 60 150
125 95 150 40 30 100
55 60 60 25 50 150
40 75 30 150 250 230
60 100 50 60 75 245
40 75 30 150 250 230
× 0.8 × 0.0 × 0.4
× 0.9 × 0.5 × 0.2
× 0.0 × 0.1 × 0.1
Input Image
460.2
Output Image
8

120 100 135 50 60 150
125 95 150 40 30 100
55 60 60 25 50 150
40 75 30 150 250 230
60 100 50 60 75 245
40 75 30 150 250 230
× 0.8 × 0.0 × 0.4
× 0.9 × 0.5 × 0.2
× 0.0 × 0.1 × 0.1
Input Image Output Image
9
460.2 277

 位置不変性の獲得
 様々な特徴を抽出
Convolutional Layer
Input Image Filters of Conv1 Output of Conv1
10

 Conv Layerの直後に存在するLayer
 低解像度化
 多少の位置ずれ、特徴量の変化に鈍感に
 計算量の削減
Pooling Layer
60 25 50 150
30 150 250 230
50 60 75 245
30 150 250 230
150 250
150 250
11
Input
Output

 Conv Layerの直後に存在するLayer
 低解像度化
 多少の位置ずれ、特徴量の変化に鈍感に
 計算量の削減
Pooling Layer
Input Image Output of Conv1 Output of Pool1
12

 最後段のFeature MapをVectorに変換して入力
 出力は各クラスの確率
 出力層のユニットはクラス数と等しい
Fully Connected Layer
cat
leopard
dog
human
bird
bWxy +=
13

 特徴抽出層
 Convolutional Layer
 Pooling Layer
 識別部
 Fully Connected Layer
CNN: Convolutional Neural Networks
14
cat
leopard
dog
human
bird

Generative Adversarial Networks
敵対的生成ネットワーク
 GeneratorとDiscriminatorという２つのネットワークモデルが存在
 Generator : ランダムノイズzをLatent Spaceとして、 realなデータを生成（fake）
 Discriminator : 入力がreal or fake を判定する
15
What is GAN?
Z Generator 0.0 – 1.0
Discriminator
Pg: Generated Images
Pdata: Real Images

敵対的学習のイメージ
Ｇｅｎｅｒａｔoｒ Discriminator
偽札士偽札鑑定士
上手く偽造して騙す
偽造を見破る
16

 3D-GAN —Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling(github)
 3D-IWGAN —Improved Adversarial Systems for 3D Object Generation and Reconstruction (github)
 3D-RecGAN —3D Object Reconstruction from a Single Depth View with Adversarial Learning (github)
 ABC-GAN —ABC-GAN: Adaptive Blur and Control for improved training stability of Generative Adversarial Networks (github)
 AC-GAN —Conditional Image Synthesis With Auxiliary Classifier GANs
 acGAN—Face Aging With Conditional Generative Adversarial Networks
 AdaGAN—AdaGAN: Boosting Generative Models
 AE-GAN —AE-GAN: adversarial eliminating with GAN
 AEGAN —Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets
 AffGAN—Amortised MAP Inference for Image Super-resolution
 AL-CGAN —Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
 ALI —Adversarially Learned Inference
 AlignGAN—AlignGAN: Learning to Align Cross-Domain Images with Conditional Generative Adversarial Networks
 AM-GAN —Activation Maximization Generative Adversarial Nets
 AnoGAN—Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery
 ARAE —Adversarially Regularized Autoencoders for Generating Discrete Structures (github)
 ARDA —Adversarial Representation Learning for Domain Adaptation
 ARIGAN —ARIGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network
 ArtGAN—ArtGAN: Artwork Synthesis with Conditional Categorial GANs
 b-GAN —Generative Adversarial Nets from a Density Ratio Estimation Perspective
 Bayesian GAN —Deep and Hierarchical Implicit Models
 Bayesian GAN —Bayesian GAN
 BCGAN —Bayesian Conditional Generative Adverserial Networks
 BEGAN —BEGAN: Boundary Equilibrium Generative Adversarial Networks
 BGAN —Binary Generative Adversarial Networks for Image Retrieval(github)
 BiGAN—Adversarial Feature Learning
 BS-GAN —Boundary-Seeking Generative Adversarial Networks
 C-RNN-GAN —C-RNN-GAN: Continuous recurrent neural networks with adversarial training (github)
 CaloGAN—CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative
Adversarial Networks (github)
 CAN —CAN: Creative Adversarial Networks, Generating Art by Learning About Styles and Deviating from Style Norms
 CatGAN—Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
 CausalGAN—CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training
 CC-GAN —Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks (github)
 CDcGAN—Simultaneously Color-Depth Super-Resolution with Conditional Generative Adversarial Network
 CGAN —Conditional Generative Adversarial Nets
 CGAN —Controllable Generative Adversarial Network
 Chekhov GAN —An Online Learning Approach to Generative Adversarial Networks
 CoGAN—Coupled Generative Adversarial Networks
 Conditional cycleGAN—Conditional CycleGAN for Attribute Guided Face Image Generation
 constrast-GAN —Generative Semantic Manipulation with Contrasting GAN
 Context-RNN-GAN —Contextual RNN-GANs for Abstract Reasoning Diagram Generation
 Coulomb GAN —Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields
 Cramèr GAN —The Cramer Distance as a Solution to Biased Wasserstein Gradients
 crVAE-GAN —Channel-Recurrent Variational Autoencoders
 CS-GAN —Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
 CVAE-GAN —CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training
 CycleGAN—Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (github)
 D2GAN —Dual Discriminator Generative Adversarial Nets
 DAN —Distributional Adversarial Networks
 DCGAN —Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks(github)
 DeliGAN—DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data (github)
 DiscoGAN—Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
 DistanceGAN—One-Sided Unsupervised Domain Mapping
 DM-GAN —Dual Motion GAN for Future-Flow Embedded Video Prediction
 DR-GAN —Representation Learning by Rotating Your Faces
 DRAGAN —How to Train Your DRAGAN (github)
 DSP-GAN —Depth Structure Preserving Scene Image Generation
 DTN —Unsupervised Cross-Domain Image Generation
 DualGAN—DualGAN: Unsupervised Dual Learning for Image-to-Image Translation
 Dualing GAN —Dualing GANs
 EBGAN —Energy-based Generative Adversarial Network
 ED//GAN —Stabilizing Training of Generative Adversarial Networks through Regularization
 EGAN —Enhanced Experience Replay Generation for Efficient Reinforcement Learning
 ExprGAN—ExprGAN: Facial Expression Editing with Controllable Expression Intensity
 f-GAN —f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
 FF-GAN —Towards Large-Pose Face Frontalization in the Wild
 Fila-GAN —Synthesizing Filamentary Structured Images with GANs
 Fisher GAN —Fisher GAN
 Flow-GAN —Flow-GAN: Bridging implicit and prescribed learning in generative models
 GAMN —Generative Adversarial Mapping Networks
 GAN —Generative Adversarial Networks (github)
 GAN-CLS —Generative Adversarial Text to Image Synthesis (github)
 GAN-sep—GANs for Biological Image Synthesis (github)
 GAN-VFS —Generative Adversarial Network-based Synthesis of Visible Faces from Polarimetric Thermal Faces
 GANCS —Deep Generative Adversarial Networks for Compressed Sensing Automates MRI
 GAWWN —Learning What and Where to Draw (github)
 GeneGAN—GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data (github)
 Geometric GAN —Geometric GAN
 GMAN —Generative Multi-Adversarial Networks
 GMM-GAN —Towards Understanding the Dynamics of Generative Adversarial Networks
 GoGAN—Gang of GANs: Generative Adversarial Networks with Maximum Margin Ranking
 GP-GAN —GP-GAN: Towards Realistic High-Resolution Image Blending(github)
 GRAN —Generating images with recurrent adversarial networks (github)
 IAN —Neural Photo Editing with Introspective Adversarial Networks(github)
 IcGAN—Invertible Conditional GANs for image editing (github)
 ID-CGAN —Image De-raining Using a Conditional Generative Adversarial Network
 iGAN—Generative Visual Manipulation on the Natural Image Manifold(github)
 Improved GAN —Improved Techniques for Training GANs (github)
 InfoGAN—InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (github)
 IRGAN —IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval models
 IWGAN —On Unifying Deep Generative Models
 l-GAN —Representation Learning and Adversarial Generation of 3D Point Clouds
 LAGAN —Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis
 LAPGAN —Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (github)
 LD-GAN —Linear Discriminant Generative Adversarial Networks
 LDAN —Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images
 LeakGAN—Long Text Generation via Adversarial Training with Leaked Information
 LeGAN—Likelihood Estimation for Generative Adversarial Networks
 LR-GAN —LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation
 LS-GAN —Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities
 LSGAN —Least Squares Generative Adversarial Networks
 MAD-GAN —Multi-Agent Diverse Generative Adversarial Networks
 MAGAN —MAGAN: Margin Adaptation for Generative Adversarial Networks
 MalGAN—Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN
 MaliGAN—Maximum-Likelihood Augmented Discrete Generative Adversarial Networks
 MARTA-GAN —Deep Unsupervised Representation Learning for Remote Sensing Images
 McGAN—McGan: Mean and Covariance Feature Matching GAN
 MD-GAN —Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
 MDGAN —Mode Regularized Generative Adversarial Networks
 MedGAN—Generating Multi-label Discrete Electronic Health Records using Generative Adversarial Networks
 MGAN —Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks (github)
 MGGAN —Multi-Generator Generative Adversarial Nets
 MIX+GAN —Generalization and Equilibrium in Generative Adversarial Nets (GANs)
 MMD-GAN —MMD GAN: Towards Deeper Understanding of Moment Matching Network (github)
 MMGAN —MMGAN: Manifold Matching Generative Adversarial Network for Generating Images
 MoCoGAN—MoCoGAN: Decomposing Motion and Content for Video Generation (github)
 MPM-GAN —Message Passing Multi-Agent GANs
 MuseGAN—MuseGAN: Symbolic-domain Music Generation and Accompaniment with Multi-track Sequential Generative Adversarial
Networks
 MV-BiGAN—Multi-view Generative Adversarial Networks
 OptionGAN—OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
 ORGAN —Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
 PAN —Perceptual Adversarial Networks for Image-to-Image Transformation
 PassGAN—PassGAN: A Deep Learning Approach for Password Guessing
 Perceptual GAN —Perceptual Generative Adversarial Networks for Small Object Detection
 PGAN —Probabilistic Generative Adversarial Networks
 pix2pix —Image-to-Image Translation with Conditional Adversarial Networks (github)
 PixelGAN—PixelGAN Autoencoders
 Pose-GAN —The Pose Knows: Video Forecasting by Generating Pose Futures
 PPGN —Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
 PrGAN—3D Shape Induction from 2D Views of Multiple Objects
 PSGAN —Learning Texture Manifolds with the Periodic Spatial GAN
 RankGAN—Adversarial Ranking for Language Generation
 RCGAN —Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs
 RefineGAN—Compressed Sensing MRI Reconstruction with Cyclic Loss in Generative Adversarial Networks
 RenderGAN—RenderGAN: Generating Realistic Labeled Data
 ResGAN—Generative Adversarial Network based on Resnet for Conditional Image Restoration
 RNN-WGAN —Language Generation with Recurrent Generative Adversarial Networks without Pre-training(github)
 RPGAN —Stabilizing GAN Training with Multiple Random Projections(github)
 RTT-GAN —Recurrent Topic-Transition GAN for Visual Paragraph Generation
 RWGAN —Relaxed Wasserstein with Applications to GANs
 SAD-GAN —SAD-GAN: Synthetic Autonomous Driving using Generative Adversarial Networks
 SalGAN—SalGAN: Visual Saliency Prediction with Generative Adversarial Networks (github)
 SBADA-GAN —From source to target and back: symmetric bi-directional adaptive GAN
 SD-GAN —Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
 SEGAN —SEGAN: Speech Enhancement Generative Adversarial Network
 SeGAN—SeGAN: Segmenting and Generating the Invisible
 SegAN—SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation
 SeqGAN—SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient (github)
 SGAN —Texture Synthesis with Spatial Generative Adversarial Networks
 SGAN —Stacked Generative Adversarial Networks (github)
 SGAN —Steganographic Generative Adversarial Networks
 SimGAN—Learning from Simulated and Unsupervised Images through Adversarial Training
 SketchGAN—Adversarial Training For Sketch Retrieval
 SL-GAN —Semi-Latent GAN: Learning to generate and modify facial images from attributes
 SN-GAN —Spectral Normalization for Generative Adversarial Networks(github)
 Softmax-GAN —Softmax GAN
 Splitting GAN —Class-Splitting Generative Adversarial Networks
 SRGAN —Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
 SS-GAN —Semi-supervised Conditional GANs
 ss-InfoGAN—Guiding InfoGAN with Semi-Supervision
 SSGAN —SSGAN: Secure Steganography Based on Generative Adversarial Networks
 SSL-GAN —Semi-SupervisedLearning with Context-Conditional Generative Adversarial Networks
 ST-GAN —Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently
 StackGAN—StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
 SteinGAN—Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE
 S²GAN —Generative Image Modeling using Style and Structure Adversarial Networks
 TAC-GAN —TAC-GAN — Text Conditioned Auxiliary Classifier Generative Adversarial Network (github)
 TAN —Outline Colorization through Tandem Adversarial Networks
 TextureGAN—TextureGAN: Controlling Deep Image Synthesis with Texture Patches
 TGAN —Temporal Generative Adversarial Nets
 TP-GAN —Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis
 Triple-GAN —Triple Generative Adversarial Nets
 Unrolled GAN —Unrolled Generative Adversarial Networks (github)
 VAE-GAN —Autoencoding beyond pixels using a learned similarity metric
 VariGAN—Multi-View Image Generation from a Single-View
 VAW-GAN —Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks
 VEEGAN —VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning (github)
 VGAN —Generating Videos with Scene Dynamics (github)
 VGAN —Generative Adversarial Networks as Variational Training of Energy Based Models (github)
 ViGAN—Image Generation and Editing with Variational Info Generative Adversarial Networks
 VIGAN —VIGAN: Missing View Imputation with Generative Adversarial Networks
 VRAL —Variance Regularizing Adversarial Learning
 WaterGAN—WaterGAN: Unsupervised Generative Network to Enable Real-time Color Correction of Monocular Underwater Images
 WGAN —Wasserstein GAN (github)
 WGAN-GP —Improved Training of Wasserstein GANs (github)
 WS-GAN —Weakly Supervised Generative Adversarial Networks for 3D Reconstruction
 α-GAN —Variational Approaches for Auto-Encoding Generative Adversarial Networks (github)
 Δ-GAN —Triangle Generative Adversarial Networks
17
A Variety of GAN The GAN Zoo https://deephunt.in/the-gan-zoo-79597dc8c347

Tricks
 Label Smoothing
 Add noise to inputs, decay o
 Use Dropouts in G in both train and test
phase
18
A Variety of GAN
Surrogate or auxiliary objective
 Unrolled GAN
 WGAN-GP
 DRAGAN
New Objectives
 EBGAN
 LSGAN
 WGAN
 BEGAN
Network Architecture
 LAPGAN
 Stacked GAN

 NIPS 2016 Tutorial: Generative Adversarial Networks
https://arxiv.org/abs/1701.00160
https://www.youtube.com/watch?v=AJVyzd0rqdc
 Theory and Application of Generative Adversarial Network(CVPR2017)
https://www.slideshare.net/mlreview/tutorial-on-theory-and-application-of-generative-
adversarial-networks
 ICCV 2017 Tutorial on GANs
https://sites.google.com/view/iccv-2017-gans
 Yann LeCun曰く
“This, and the variations that are now being proposed is
the most interesting idea in the last 10 years in ML, in my opinion. “
19
GAN is Hot Topic

20

敵対的生成ネットワーク
 GeneratorとDiscriminatorという２つのネットワークモデルが存在
 Generator : ランダムノイズzをLatent Spaceとして、 realなデータを生成（fake）
 Discriminator : 入力がreal or fake を判定する
21
What is GAN?
Z Generator 0.0 – 1.0
Discriminator
Pdata: Real Images

 多次元のノイズから画像を生成する
 生成分布pgをReal分布pr に近づけることで、本物に近い画像を出力する
 Noise z を線形補完することで視覚的に連続な画像が得られる
Generator
22
Z Generator

 入力された画像が、Generatorが生成した画像 or 本物の画像かを判定
 出力層のユニットが本物である確率を表す
 実際には、生成分布pgとReal分布pr の距離を推定している
Discriminator
23
0.0 – 1.0
Discriminator
Pdata: Real Images

分布間の距離をどう定義するかでGANのObjectiveは変わる
 JSD(Jensen Shannon Divergence) → Standard GAN
 EMD(Earth Mover’s Distance) → WGAN
24
Distance, Divergence
 
      mpKLDmpKLDppJSD
xpxp
m
xp
xp
EppKLD
grgr
gr
g
r
pxgr r
||||
2
1
||
2
)()(
)(
)(
log|| ~
+=
+
=








=
   
yxEppW yx
pp
gr
gr
=



~),(
,
inf,

DiscriminatorとGeneratorのMinMax
 x~pr : real分布からサンプリングされたデータ
 D(*) : 入力されたデータのreal / fake 判定 (Discriminator)
 z~pz : ノイズ生成分布からサンプリングされたノイズベクター
 G(*) : 入力されたノイズからデータを生成 (Generator)
25
Objectives
   )))((log(1))(log(),(maxmin ~~ zGDExDEGDV zrDG
pzpx +=

a. Dは部分的には正確
b. DがD∗ x =
pr x
pr x +pg x
に収束するように訓練
c. pr がpg に近づいてくる
d. pr とpg が一致このときD x =
1
2
26
Objectives
 Discrimination Distribution
 Real Distribution
 Generative Distribution
 Dが正確になってきてから、Gが正確になる
 しかし、Dが正確になりすぎてもだめ

27
Algorithm
最大化
最小化
 学習初期段階では、Gが弱く、Dが強い
 Dが1を頻繁に出力してしまう
 勾配消失
))((log zGD
最大化
実際の実装では、最大化問題に置き換える

 Discriminator
最大化問題なので、符号を反転して最小化問題に
 Generator
置き換えた最大化問題を最小化問題に
28
Objectives (implementation)






+  ==
m
i
i
m
i
i zGD
m
xD
m 11
)))((1log(
1
)(log
1
 





 =
m
i
izGD
m 1
)))((log(
1


 Discriminatorの精度が高くないと、Generatorは分布間の距離を最小化できない
 Dだけ先にたくさん学習させた後に、Gを学習すればよい？ ※うまくいきません
 DとGのバランス調整をする必要がある
 Mode Collapseしてしまう
30
Drawbacks of Standard GAN
 DiscriminatorのLossの最大値は↓のはずなのに、実際には誤差が0になってしまう
  4log)(||)(2 + xpxpJSD gr
 pr とpg が低次元多様体上で不連続
 pr とpg が互いに素なSupportを含んでいる

pr とpg が低次元多様体上で不連続
 画像の情報は高次元だが、その分布はもっと低次元な空間にあるはず（仮にdreal 次元とする）
 dreal > dz ならpg は不連続になる
pr とpg が互いに素なSupportを含んでいる
 2つの集合が共通の元を持たない状態
31
uncontinuous Disjoint support
 Optimal Discriminatorが存在
 さらにこのとき、
 完全に分離できる
 勾配0
0)(
2log
=
=
 xD
JSD
x

もし、Optimal Discriminatorを回避しても・・・
Discriminatorが正しくJSDを近似できるようになると、gradientのnormは小さくなっていく
つまるところ・・・
 Discriminatorの学習が不十分 → Generatorは正しくないJSDを最小化
 Discriminatorの学習が十分 → Generator更新のためのgradientが小さい、勾配消失
32
 
*
2
~
1
)))((1log(
DD
MzgDE zpz







詳しくは
Arjovsky, Martin, and Léon Bottou. "Towards principled methods for training generative adversarial networks." arXiv preprint arXiv:1701.04862 (2017).

 DiscriminatorとGeneratorのバランス調整が必要ない
 Mode Collapseが起きにくい
 学習時の際に、Lossを評価することができる
Advantages of WGAN
pr
Standard GAN WGAN
Standard GAN
WGAN

 Earth Mover’s Distanceを分布間の距離尺度に使用
 Πは同時分布
 直感的には、最適な輸送コスト
“pgをprに変えるために、地点xから地点yまでどれだけの質量を輸送にしなければいけないか？”
35
Objectives
   
yxEppW yx
pp
gr
gr
=



~),(
,
inf,

直感的に考えて・・・
 Pgが縦軸に近づくほど、距離は小さくなる
 Pgが縦軸から遠ざかるほど、距離は大きくなる
36
JSD vs EMD
)1,1(~ Uz
θ

 実際は
 JSDは不連続かつ勾配が消失
 EMDは連続かつ勾配がある
37
JSD vs EMD
θ
)1,1(~ Uz
 


 =
=
otherwise0
0if2log
||

gr ppJSD   =gr ppEMD ||
JSD EMD

 Standard GAN
 不連続なpg を生成していたのは、JSD自体が不連続だから
 また、勾配消失問題もJSDのせい
 WGAN
 Gがθにおいて連続なら、EMDも連続
 DがK-Lipschitz連続性(関数の傾きがある定数Kで抑えられるような一様連続性)を保つなら、
EMDはどこでも連続かつ微分可能
38
JSD vs EMD

 EMDの良さはわかったけど、これどうやって計算するの？
 Kantorovich-Rubinstein Duality を使うと、
39
Objectives
   
yxEppW yx
pp
gr
gr
=



~),(
,
inf,
     
   
   ))(()(max
))(()(max
)()(sup,
~~
~~
~~
1
zGDExDE
zGfExfE
xfExfEppW
wpzwpx
Ww
wpzwpx
Ww
pxpx
f
gr
zr
zr
gr
L


=
=
=



1-Lipschitz性を保つDiscriminator

40
Algorithm
最大化
最小化
 やっていることはGANと同じ
 ObjectiveとWeight Clippingのみ違う

 Discriminator (Critic)
最大化問題なので、符号を反転して最小化問題に
 Generator
W-Distanceを最小化する
41






  ==
m
i
iw
m
i
iww
zGD
m
xD
m 11
))((
1
)(
1
 





  =
m
i
iw zGD
m 1
))((
1


Weight Clipping
 1-Lipschitz連続性を保つために、重みに制約を加える
 Discriminatorのweightをupdate後に毎回 (–c, c) にclipする
(paper内では c=0.01)
 cが大きければ、制約の力が弱くなかなか1-Lipschitz連続性にたどり着かないため、
Discriminatorが正しく収束しない
 cが小さければ、勾配消失を起こしやすくなる
“Weight clipping is a clearly terrible way to enforce a Lipschitz constraint. [……] However, we do
leave the topic of enforcing Lipschitz constraints in a neural network setting for further
investigation, and we actively encourage interested researchers to improve on this method.”
42
1-Lipschitz

43
生成結果
W-Distanceが小さくなるごとに良い画像ができている
これでLossの評価ができる

 LayerのWeightが-0.01 もしくは0.01のような偏った値になってしまう
 Weight Decayも考案されたが、勾配消失
44
Weight Clippingの罠

WGANのDiscriminatorのLossにGP項を加算
 λ : GP項の強さを決めるパラメータ(paper内では10)
 ε : U(0, 1)からサンプリングされた値
45
Gradient Penalty
  








 +  ==
2
2ˆ~ˆ
11
1)ˆ()(
1
))((
1
min
ˆ
xDxf
m
zGD
m xpx
m
i
iw
m
i
iw
w E x

Gradient Penalty
xzGx  += )()1(ˆ
𝐺(𝑧)
ො𝑥
𝑥
𝜖
1 − 𝜖
Discrimination Loss

Optimal Critic D*は
という直線上のどこにおいても、以下の勾配を持つ
つまり、勾配のnormは1
だから、GP項はnorm=1からどれだけ離れているかを表している
これで、1-Lipschitz連続性が担保できる！
46
Gradient Penalty
  


 
2
2ˆ
1)ˆ(xDx
xzGx  += )()1(ˆ
xx
xx
xDx
ˆ
ˆ
)ˆ(*
ˆ


= 正規化されているのでnormは1

47
Algorithm
 やっていることはWGANと同じ
 Weight Clippingを削除
 DiscriminatorのLossにGP項を加算

 Discriminator (Critic)
WGANのLossにGP項を加算
 Generator
W-Distanceを最小化する
48






  =
m
i
iw zGD
m 1
))((
1

  








 +  ==
2
2ˆ~ˆ
11
1)ˆ()(
1
))((
1
ˆ
xDxD
m
zGD
m xpx
m
i
iw
m
i
iww E x


 WGANはDCGANの欠点をいくつか克服した
 学習の安定性
 意味のあるLoss
 Mode Collapse
 しかし、WGANにも問題がある
 画像がぼやける
 収束が遅い
 表現力が低い
 これらに向けた新しいGAN
 DRAGAN
 Crammer GAN
50
GANのこれから

 Auxiliary Classifier GAN
自分の狙った画像を生成
 Super Resolution GAN
超解像画像の解像度を綺麗にアップサンプリングする
 pix2pix
汎用的なImage Translation(e.g. 地図↔航空写真, スケッチ↔写真, 白黒↔カラー)
 Cycle GAN
アンペアのImage Translation
52
Applications of GAN

 ランダムノイズz に加え，condition vector c を結合して入力とする
 Discriminatorの出力は通常のGANの出力と，入力から予測されるcondition c
 e.g. MNIST: 10次元のone-hot
53
ACGAN(Auxiliary Classifier GAN)
Generator
Discriminator
G(z, c)
x
c : condition
z : random noise
c : condition
fake or real

54
ACGAN(Auxiliary Classifier GAN)

 C92にて発表
 Condition + DRAGANを用いたアニメ顔生成
55
MakeGirls
http://make.girls.moe/#/
https://github.com/makegirlsmoe/makegirlsmoe.githu

 2016年 FaceBook Research
 入力はnoiseではなく，低解像画像
56
Super Resolution GAN

 低解像度画像から高解像度画像を作成する際に，答えは1つではない
 単なるMSEでは，平均的な画像を生成しぼやけてしまう
 GANの構造を加えることで，実際のデータにありそうな画像に近づける
57
Super Resolution GAN

 画像のペアを学習させることで、その関係性を獲得するネットワーク
 1つのタスクに特化するのではなく，汎用的なImage Translation
 地図↔航空写真, スケッチ↔写真, 白黒↔カラー
 画像のペアが必要なので，データセット作成のコスト大
58
pix2pix
https://affinelayer.com/pixsrv/
https://github.com/phillipi/pix2p

 入力は画像
 Discriminatorには，画像のペアが入力される
 Lossは，Adversarial LossとタスクごとのLossを組み合わせ
 e.g. MSE, VGG Loss, CCE
59
pix2pix
Discriminator
fake pair
real pair
fake or real
Generator
x G(x)
x
x
y

 線画を着色するネットワーク
 おそらくpix2pixベース
60
PaintsChainer
http://paintschainer.preferred.tech/index_ja.html
https://qiita.com/taizan/items/cf77fd37ec3a0bef5d9d

 Pix2pixとOptical Flowを利用した異常シーン検出
 Best Paper / Student Paper Award Finalist,
IEEE International Conference on Image Processing (ICIP), 2017
61
ABNORMAL EVENT DETECTION
https://arxiv.org/abs/1708.09644

 ペアが必要ないImage Transfer
 双方向の変換を1度に学習する
 e.g. 馬⇔シマウマ，リンゴ⇔みかん，写真⇔絵画
62
Cycle GAN

63
Cycle GAN
Dy
fake or real
Gxy
Gyx
Dx
fake or real
 11
))(())(()))((log(
),,,(),,(
yGGyxGGxxGD
yxGGLxGDLL
yxxyxyyxxyy
xyyxcyclexyyGANGxy
++=
+=



 Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing
systems. 2014.
 Arjovsky, Martin, and Léon Bottou. "Towards principled methods for training generative
adversarial networks." arXiv preprint arXiv:1701.04862 (2017).
 Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein gan." arXiv preprint
arXiv:1701.07875 (2017).
 Gulrajani, Ishaan, et al. "Improved training of wasserstein gans." arXiv preprint
arXiv:1704.00028 (2017).
 https://makegirlsmoe.github.io/assets/pdf/technical_report.pdf
64
参考文献

Questions ?
65
My Mail Address : nakatsuka@cv.info.gifu-u.ac.jp
My implementations : https://github.com/salty-vanilla
My Qiita : https://qiita.com/salty-vanilla

Tutorial of GANs in Gifu Univ

Recommended

Recommended

More Related Content

Similar to Tutorial of GANs in Gifu Univ

Similar to Tutorial of GANs in Gifu Univ (20)

More from Shunsuke NAKATSUKA

More from Shunsuke NAKATSUKA (10)

Recently uploaded

Recently uploaded (20)

Tutorial of GANs in Gifu Univ