Paper Summary:
beta-VAE: Learning Basic Visual Concepts with a
Constrained Variational Framework
Jun-sik Choi
Department of Brain and Cognitive Engineering,
Korea University
November 9, 2019
Overview of beta-VAE [1]
β-VAE is an unsupervised framework for learning disentangled
representations of the independent generative factors of visual data.
β-VAE adds an extra hyperparameter β to the VAE objective,
which constricts the encoding capacity of the latent bottleneck
and encourages factorized latent representation.
A protocol that can quantitatively compare the degree of
disentanglement learnt by different models is proposed.
Derivation of beta-VAE framework I
Assumption
Let $\mathcal{D} = \{X, V, W\}$, where
$x \in \mathbb{R}^N$: images,
$v \in \mathbb{R}^K$: conditionally independent factors,
$w \in \mathbb{R}^H$: conditionally dependent factors.
$p(x \mid v, w) = \mathrm{Sim}(v, w)$: true world simulator using ground truth
generative factors.
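An unsupervised deep generative model $p_\theta(x \mid z)$ can learn a joint
distribution between $x$ and $z \in \mathbb{R}^M$ ($M \geq K$) by maximizing
$$\max_{\theta} \; \mathbb{E}_{p_\theta(z)}\left[p_\theta(x \mid z)\right],$$
so that $p^*_\theta(x \mid z) \approx p(x \mid v, w) = \mathrm{Sim}(v, w)$.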
The aim is to ensure that the inference model qφ(z|x) captures the
independent generative factors v in a disentangled manner, while the
conditionally dependent generative factors w remain entangled in a
separate subset of z.
Derivation of beta-VAE framework II
To encourage disentangling property of qφ(z|x),
1. the prior p(z) is set to an isotropic unit Gaussian N(0, I).
2. qφ(z|x) is constrained to match the prior p(z).
This constrained optimisation problem can be expressed as:
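$$\max_{\phi, \theta} \; \mathbb{E}_{x \sim \mathcal{D}}\left[\mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]\right] \quad \text{subject to} \quad D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right) < \epsilon$$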
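After applying the Lagrangian transformation under the KKT conditions,
$$\mathcal{F}(\theta, \phi, \beta; x, z) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \beta \left(D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right) - \epsilon\right)$$
$$\geq \mathcal{L}(\theta, \phi; x, z, \beta) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \beta \, D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$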
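The inequality between the Lagrangian and the β-VAE objective can be made explicit in one step. The worked derivation below is a sketch that assumes β ≥ 0 (it is a KKT multiplier) and that the constraint threshold, written ε here, is non-negative (the KL divergence it bounds is itself non-negative):

```latex
\begin{aligned}
\mathcal{F}(\theta, \phi, \beta; x, z)
  &= \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]
     - \beta \, D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right) + \beta \epsilon \\
  &= \mathcal{L}(\theta, \phi; x, z, \beta) + \beta \epsilon \\
  &\geq \mathcal{L}(\theta, \phi; x, z, \beta),
     \qquad \text{since } \beta \geq 0 \text{ and } \epsilon \geq 0 .
\end{aligned}
```

The term βε does not depend on θ or φ, so dropping it leaves the maximizer over θ and φ unchanged.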
Derivation of beta-VAE framework III
Meaning of β
1. β changes the degree of applied learning pressure during
training, thus encouraging different learnt representations.
2. When β = 1, β-VAE corresponds to the original VAE
formulation.
3. Setting β > 1 puts a stronger constraint on the latent
bottleneck than in the original VAE formulation.
4. The pressure to keep the KL divergence to the prior small limits
the capacity of z, encouraging the model to learn the most efficient
representation of the data (the disentangled representation in terms
of the conditionally independent factors v).
5. There is a trade-off between reconstruction fidelity and the
quality of disentanglement.
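The role of β can be illustrated with a minimal sketch of the per-sample objective. The choices here are assumptions, not taken from the slides: a diagonal Gaussian posterior parameterised by `mu` and `log_var`, a Bernoulli decoder, and a single-sample estimate of the reconstruction term. The function names are hypothetical.

```python
import math

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector (closed form)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def bernoulli_log_likelihood(x, x_recon, eps=1e-7):
    """log p(x|z) under an assumed Bernoulli decoder with pixel means x_recon."""
    total = 0.0
    for xi, ri in zip(x, x_recon):
        ri = min(max(ri, eps), 1.0 - eps)  # clamp to keep the logs finite
        total += xi * math.log(ri) + (1.0 - xi) * math.log(1.0 - ri)
    return total

def beta_vae_objective(x, x_recon, mu, log_var, beta=4.0):
    """Single-sample estimate of E_q[log p(x|z)] - beta * KL(q(z|x) || p(z))."""
    return bernoulli_log_likelihood(x, x_recon) - beta * gaussian_kl(mu, log_var)

# Toy values (made up for illustration): a 4-pixel image and a 2-d latent.
x = [0.0, 1.0, 1.0, 0.0]
x_recon = [0.1, 0.9, 0.8, 0.2]
mu, log_var = [1.0, -1.0], [0.0, 0.0]
for beta in (1.0, 4.0):
    print(beta, beta_vae_objective(x, x_recon, mu, log_var, beta=beta))
```

With β = 1 this reduces to the standard VAE objective; a larger β penalises the same KL term more heavily, which is the stronger bottleneck constraint described in the list above.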
Disentanglement Metric I
The paper's description of how the disentanglement metric is
computed [1] is complex, so I summarize it here as pseudocode.
Data: D = {V ∈ R^K, C ∈ R^H, X ∈ R^N}
lclf: linear classifier; q(z|x) ∼ N(µ(x), σ(x))
for b in Batch do
    Sample y_b from Unif[1 ... K];
    for l in 1 ... L do
        Sample v_1 from p(v) and sample v_2 from p(v);
        [v_2]_{y_b} ← [v_1]_{y_b};
        Sample c_1 and c_2 from p(c);
        x_1 ← Sim(v_1, c_1) and x_2 ← Sim(v_2, c_2);
        z_1 ← µ(x_1) and z_2 ← µ(x_2);
        z^l_diff ← |z_1 − z_2|;
    end
    z^b_diff = (1/L) Σ_{l=1}^{L} z^l_diff;
    Pred_b = lclf(z^b_diff);
end
Loss = Σ_{b=1}^{B} CrossEntropy(Pred_b, y_b);
Update lclf;
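The pseudocode above can be sketched as runnable Python on a toy problem. Everything concrete here is an assumption made for illustration: `sim` and `encoder_mean` are hypothetical stand-ins for Sim and µ (the "image" is just the factor vector, and the encoder is perfectly disentangled), K = 3 and L = 8 are arbitrary, and the linear classifier is a plain softmax regression trained by SGD.

```python
import math
import random

K = 3   # number of generative factors (toy value, assumed)
L = 8   # image pairs averaged per training point (assumed)

def sim(v, c):
    """Hypothetical stand-in for Sim(v, c): the toy 'image' is the factor vector."""
    return list(v)  # the nuisance factor c is ignored in this toy world

def encoder_mean(x):
    """Hypothetical stand-in for mu(x): a perfectly disentangled encoder."""
    return list(x)

def make_point(rng):
    """Build one (z_diff^b, y_b) training point, following the inner loop."""
    y = rng.randrange(K)                      # index of the factor shared by each pair
    z_diff = [0.0] * K
    for _ in range(L):
        v1 = [rng.random() for _ in range(K)]
        v2 = [rng.random() for _ in range(K)]
        v2[y] = v1[y]                         # [v2]_y <- [v1]_y
        c1, c2 = rng.random(), rng.random()   # nuisance factors
        z1 = encoder_mean(sim(v1, c1))
        z2 = encoder_mean(sim(v2, c2))
        for k in range(K):
            z_diff[k] += abs(z1[k] - z2[k]) / L   # running mean of |z1 - z2|
    return z_diff, y

def predict(W, b, z):
    """Class probabilities of the linear classifier: softmax(W z + b)."""
    logits = [sum(W[k][j] * z[j] for j in range(K)) + b[k] for k in range(K)]
    m = max(logits)
    e = [math.exp(t - m) for t in logits]
    s = sum(e)
    return [t / s for t in e]

def train_linear_classifier(data, epochs=200, lr=0.5):
    """Softmax regression trained by SGD on the cross-entropy loss."""
    W = [[0.0] * K for _ in range(K)]
    b = [0.0] * K
    for _ in range(epochs):
        for z, y in data:
            p = predict(W, b, z)
            for k in range(K):
                g = p[k] - (1.0 if k == y else 0.0)   # d(loss)/d(logit_k)
                b[k] -= lr * g
                for j in range(K):
                    W[k][j] -= lr * g * z[j]
    return W, b

def accuracy(W, b, data):
    """Fraction of points whose shared factor is predicted correctly."""
    hits = 0
    for z, y in data:
        p = predict(W, b, z)
        hits += p.index(max(p)) == y
    return hits / len(data)

rng = random.Random(0)
train_set = [make_point(rng) for _ in range(300)]
held_out = [make_point(rng) for _ in range(100)]
W, b = train_linear_classifier(train_set)
print("disentanglement metric score:", accuracy(W, b, held_out))
```

In this toy setting the shared factor's latent difference is exactly zero, so a linear classifier can identify it; replacing `encoder_mean` with one that mixes the factors would be one way to see the score drop.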
Disentanglement Metric II
The linear classifier predicts which generative factor [v]i is shared
by the pair of images.
The more disentangled the representation learnt by q(z|x), the
higher the classifier's accuracy.
The linear classifier should be very simple and have a low
VC-dimension in order to ensure that it has no capacity to perform
nonlinear disentangling itself.
Qualitative Results - 3D chairs
Figure: Qualitative results comparing the disentangling performance of
beta-VAE (beta = 5) with that of the other methods.
Qualitative Results - 3D faces
Figure: Qualitative results comparing the disentangling performance of
beta-VAE (beta = 20) with that of the other methods.
Qualitative Results - CelebA
Figure: Traversal of individual latents demonstrates that beta-VAE
discovered.
Quantitative Results
Figure: (Left) Disentanglement metric classification accuracy for 2D
shapes dataset. (Right) Positive correlation between normalized beta and
size of latent variable for disentangled factor learning for a fixed
beta-VAE architecture.
Quantitative Results
Figure: Representations learnt by
beta-VAE (beta=4)
Figure: Representations learnt by
InfoGAN
References
[1] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot,
M. Botvinick, S. Mohamed, and A. Lerchner, “beta-VAE:
Learning basic visual concepts with a constrained variational
framework,” ICLR, 2017.
