Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
1. Multi-Domain Image Completion
for Random Missing Input Data
Stanford University, NVIDIA, and National Institutes of Health
Yonsei University Severance Hospital CCIDS
Choi Dongmin
2. Introduction
• Multi-domain images could provide complementary knowledge
- ex. Four MRI modalities (T1, T1CE, T2, FLAIR) provide distinct features to locate tumor
boundaries from different diagnostic perspectives
- ex. Person re-identification across different cameras or times
• However, some image domains might be missing in practice
- Solution 1. Nearest neighbor approach : lacks semantic consistency
- Solution 2. Generative models
• ReMIC (Representational disentanglement schemes for Multi-domain
Image Completion)
- n-to-n image completion framework
- can be utilized for a high-level task (ex. segmentation) via joint training
- completes the missing domains given randomly distributed numbers of visible domains
- consistent performance improvement on three datasets
3. Related Works
Image-to-Image
Translation
J.Y Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV 2017
- Impressive performance via cycle-consistency loss
- only 1-to-1 mapping
• CycleGAN
4. Related Works
Image-to-Image
Translation
Y Choi et al. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. CVPR 2018
J Yoon et al. RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using GANs. ICML 2018
- Multi-domain image generation
- only 1-to-n mapping (generation is always conditioned on the single input
image as the only source domain)
• StarGAN & RadialGAN
StarGAN RadialGAN
5. Related Works
Image-to-Image
Translation
D Lee et al. CollaGAN: Collaborative GAN for Missing Image Data Imputation. CVPR 2019
- Collaborative model to incorporate multiple domains for generating one
missing domain
- only n-to-1 mapping
• CollaGAN
7. Related Works
Learning
Disentangled
Representations
• Learning Disentangled Representations
- to capture the full distribution of possible outputs by introducing a random style code
- to transfer information across domains for adaptation
- InfoGAN and β-VAE learn disentangled representations in an unsupervised manner
Xi Chen et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. NIPS 2016
I Higgins et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. ICLR 2017
https://www.slideshare.net/NaverEngineering/ss-96581209
8. Related Works
Learning
Disentangled
Representations
• DRIT and MUNIT
- disentangle content and attribute features in image translation
- However, only 1-to-1 translation
H.Y Lee et al. Diverse Image-to-Image Translation via Disentangled Representations. ECCV 2018
X Huang et al. Multimodal Unsupervised Image-to-Image Translation. ECCV 2018
9. Related Works
Learning
Disentangled
Representations
• Liu et al.
- tackle multi-domain learning with a cross-domain latent code
- but offer less discussion of the domain-specific style code
Liu et al. A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation. NIPS 2018
10. Related Works Medical
Image Synthesis
• Previous works also discuss how to extract representations from multiple
modalities, especially for segmentation with missing modalities
- However, they fuse the features from multiple modalities, not from the perspective of
representation disentanglement
V Nguyen et al. Cross-domain synthesis of medical images using efficient location-sensitive deep network. MICCAI 2015
M Havaei et al. HeMIS: Hetero-Modal Image Segmentation. MICCAI 2016
A Chartsias et al. Multimodal MR synthesis via modality-invariant latent representation. IEEE Transactions on Medical Imaging 2017
12. Method
- Image decomposition
: Shared content structure (skeleton) + Unique characteristics (flesh)
- Missing image reconstruction at test time
: Shared skeleton from available domains + Sampled flesh from the learned model
Style code (domain-specific)
- Style encoder : E^s_i(x_i) = s_i (1 ≤ i ≤ N)
Content code (shared)
- Content encoder : E^c(x_1, x_2, …, x_N) = c
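The skeleton/flesh decomposition above can be illustrated with a deliberately simplified toy model (not the paper's networks): assume each domain image is the shared content plus a scalar domain-specific offset. Under that linear assumption, a "content encoder" that works from any subset of visible domains, and a "style encoder" for the residual, are a few lines of numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
content = rng.normal(size=(8, 8))
content -= content.mean()               # shared skeleton c (zero-mean by construction)
styles = [1.5, -0.7, 0.3, 2.0]          # scalar style offset per domain, s_i
images = [content + s for s in styles]  # toy domain images x_i = c + s_i

def content_encode(visible_images):
    """E^c sketch: recover the shared content from any visible subset."""
    stack = np.stack(visible_images)
    # per-pixel mean is c plus the average visible style offset; subtracting
    # the global scalar mean removes that offset (c itself is zero-mean)
    return stack.mean(axis=0) - stack.mean()

def style_encode(image, content_est):
    """E^s_i sketch: the domain-specific residual after removing content."""
    return (image - content_est).mean()

# Completion of a "missing" domain 3 from three visible domains:
c_est = content_encode(images[:3])
completed = c_est + styles[3]
```

This is only meant to convey the intuition that the shared part is recoverable from whichever domains happen to be visible; in ReMIC both encoders are learned convolutional networks.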
13. Method
- Content code visualization (randomly selected 8 of 256 channels) on BraTS
: Different channel-wise feature maps focus on different anatomical structures (ex. tumor, brain, skull)
(Figure: input images and channel-wise content code feature maps)
14. Method
- Generation : style codes s_i sampled from a prior distribution + content code c
- Generator : G_i(c, s_i) = x̃_i
(Figure: image generation process)
15. Method
Segmentation Branch
- Segmentation generator G_S applied after the content codes
- Assumption : the content codes contain essential image structure information
- Joint training (generation loss + segmentation Dice loss)
: adaptively learns how to generate missing images
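The Dice loss in the joint objective can be sketched in numpy as the standard soft Dice formulation (the paper's exact implementation may differ in smoothing and multi-class handling):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a binary mask: 1 - 2|P∩T| / (|P| + |T|).

    pred and target are arrays of the same shape; pred may be soft
    probabilities in [0, 1]. eps avoids division by zero on empty masks.
    """
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Perfect overlap gives a loss near 0; disjoint masks give a loss near 1.
mask = np.array([1, 1, 0, 0])
perfect = dice_loss(mask, mask)
disjoint = dice_loss(np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]))
```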
17. Method
Total Loss
- Training loss = weighted sum of six terms :
- Adversarial (λ_adv = 1)
- Image consistency (λ^x_cyc = 10)
- Style latent consistency (λ^s_cyc = 1)
- Content latent consistency (λ^c_cyc = 1)
- Reconstruction (λ_rec = 20)
- Segmentation (λ_seg = 1)
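A minimal sketch of how the six terms combine into the total objective, using the λ weights from this slide; the per-term loss values below are placeholders, since each term is computed from network outputs in practice:

```python
# Loss weights reported for ReMIC's training objective.
LAMBDAS = {
    "adv": 1.0,     # adversarial loss
    "x_cyc": 10.0,  # image consistency
    "s_cyc": 1.0,   # style latent consistency
    "c_cyc": 1.0,   # content latent consistency
    "rec": 20.0,    # reconstruction
    "seg": 1.0,     # segmentation Dice loss
}

def total_loss(losses):
    """Weighted sum of the six ReMIC loss terms."""
    return sum(LAMBDAS[name] * value for name, value in losses.items())

# Dummy per-term values purely to show the weighting:
dummy = {"adv": 0.5, "x_cyc": 0.1, "s_cyc": 0.2,
         "c_cyc": 0.2, "rec": 0.05, "seg": 0.3}
print(total_loss(dummy))  # ≈ 3.2: 0.5 + 1.0 + 0.2 + 0.2 + 1.0 + 0.3
```

Note how the image-consistency and reconstruction terms dominate (weights 10 and 20), pushing the model toward faithful completion rather than merely realistic samples.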
18. Experiments
• BraTS 2018 dataset
- Multi-modal brain MRI with four modalities : T1, T1Gd, T2, FLAIR
- Following CollaGAN, 218 training and 28 testing samples randomly selected
- A set of 2D slices (40,148 training / 5,340 test) extracted from 3D volumes
- Resized to 256 × 256
- Three tumor categories
: Enhancing tumor (ET), tumor core (TC), and whole tumor (WT)
D Lee et al. CollaGAN: Collaborative GAN for Missing Image Data Imputation. CVPR 2019
B.H Menze et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE TMI
19. Experiments
• ProstateX dataset
- Multi-parametric prostate MR scans for 98 subjects : T2, ADC, HighB
- 78 training and 20 testing samples randomly selected
- A set of 2D slices (3,540 training / 840 test) extracted from 3D volumes
- Resized to 256 × 256
- Prostate regions are manually labeled as the whole prostate (WP)
G Litjens et al. Computer-aided detection of prostate cancer in MRI. IEEE TMI
20. Experiments
• RaFD (Radboud Faces Database)
- Eight facial expressions
: neutral, angry, contemptuous, disgusted, fearful, happy, sad, and surprised
- Following StarGAN, adopt images from three camera angles with three gaze
directions
- 3,888 training (54 participants) / 936 test (13 participants)
- Cropped with the face in the center and resized to 128 × 128
Y Choi et al. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. CVPR 2018
http://www.socsci.ru.nl:8180/RaFD2/RaFD
21. Results
• Multi-Domain Image Completion on N Domains
- Only One Missing Domain (n-to-1)
* Training : the one missing domain is randomly distributed
* Testing : fix the one missing domain and generate outputs only for it
- More than One Missing Domain (n-to-n)
* Training : k randomly selected visible domains (k ∈ {1, …, N − 1})
* Testing : fix k while these k visible domains are randomly selected;
evaluate all the generated images
- Evaluation metrics
* NRMSE (Normalized Root Mean Squared Error)
* SSIM (Structural Similarity)
* PSNR (Peak Signal-to-Noise Ratio)
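Two of these metrics are simple enough to sketch directly in numpy. NRMSE's normalization convention varies between implementations; the version below uses the Euclidean-norm convention (scikit-image's default). SSIM requires windowed local statistics and is usually taken from a library such as scikit-image, so it is omitted:

```python
import numpy as np

def nrmse(ref, pred):
    """Normalized RMSE: ||ref - pred||_2 / ||ref||_2 (Euclidean convention)."""
    ref = np.asarray(ref, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return np.linalg.norm(ref - pred) / np.linalg.norm(ref)

def psnr(ref, pred, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given data range."""
    ref = np.asarray(ref, dtype=float)
    pred = np.asarray(pred, dtype=float)
    mse = np.mean((ref - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform 10% error on a unit-valued image:
ref = np.ones((4, 4))
pred = 0.9 * ref
err_nrmse = nrmse(ref, pred)   # 0.1
err_psnr = psnr(ref, pred)     # 20 dB (mse = 0.01, range = 1)
```

Lower NRMSE and higher SSIM/PSNR indicate better completion quality, which is how the comparison tables on the following slides should be read.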
22. Results
• Multi-Domain Image Completion
- Comparison with MUNIT, StarGAN, and CollaGAN
- ReMIC w/o Recon : ReMIC without reconstruction loss (single missing domain)
- ReMIC-Random : k random visible domains (multiple missing domains, k = *)
28. Results
• Multi-Domain Segmentation
- Oracle : fully supervised 2D U-Net variant trained without missing images
- Oracle+* : the missing images are generated by the “*” method, then evaluated
with the pre-trained “Oracle” model (All : without any missing domains)
- ReMIC+Seg : separate content encoders for image generation and
segmentation tasks
- ReMIC+Joint : sharing the weights of content encoder for the two tasks
29. Conclusion
• A general framework for multi-domain image completion, given
that one or more input domains are missing
• Learning shared content and domain-specific style encoding
across multiple domains
• Well generalized to both natural and medical images
• Extended for a unified image generation and segmentation
framework for missing-domain segmentation task
30. Question
• According to this paper, “different modalities provide distinct
features to locate tumor boundaries from differential diagnosis
perspectives”.
But ReMIC uses a content code, which encodes the shared
skeleton, as an input for the connected segmentation generator.
Isn’t it a contradiction?
31. ICLR 2020 Reviews
• The main contribution is representational disentanglement,
namely the content and style separation, but there is no explicit
evidence that this separation actually happens
• Evaluation on a high-resolution dataset such as CelebA-HQ and with
other conventional metrics such as FID is requested
https://openreview.net/forum?id=rkg_wREYDS
32. Appendix. A : Implementation Details
• A.1 Hyperparameters
- Adam optimizer (β1 = 0.5, β2 = 0.999)
- Batch size 1 and 100,000 iterations
- Style code dimension : 8
- During testing, a fixed style code of 0.5 in each dimension
• A.2 Network Architectures (check details in the paper)
- ReMIC is developed on the backbone of MUNIT
- Unified Content Encoder : Down-sampling module + Residual Blocks (IN)
- Style Encoder : Down-sampling module + Residual Blocks + GAP + FC
- Generator : Four residual blocks + Up-sampling + AdaIN*
- Discriminator : Four convolutional blocks
- Segmentor : U-Net shaped network
X Huang et al. Arbitrary style transfer in real-time with adaptive instance normalization. ICCV 2017
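The AdaIN operation cited above can be sketched in numpy: normalize each channel of the content feature map, then rescale and shift with style-derived statistics. In MUNIT/ReMIC those statistics are produced by an MLP on the style code; here they are passed in directly as arrays for illustration:

```python
import numpy as np

def adain(content_feat, style_mean, style_std, eps=1e-5):
    """Adaptive Instance Normalization on a (C, H, W) feature map.

    Each channel is normalized to zero mean / unit std over its spatial
    dimensions, then rescaled by style_std and shifted by style_mean
    (both of shape (C,)), injecting the style statistics.
    """
    mu = content_feat.mean(axis=(1, 2), keepdims=True)
    sigma = content_feat.std(axis=(1, 2), keepdims=True)
    normalized = (content_feat - mu) / (sigma + eps)
    return style_std[:, None, None] * normalized + style_mean[:, None, None]

# After AdaIN, each channel carries the style's mean and std:
rng = np.random.default_rng(1)
feat = rng.normal(size=(3, 8, 8))
out = adain(feat, np.array([1.0, 2.0, 3.0]), np.array([2.0, 2.0, 2.0]))
```

This is why the generator can produce different domains from one shared content code: only the per-channel statistics change with the style input.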
33. Appendix. C : Extended Ablation Study and
Results for Multi-domain Segmentation
• C.4 Analysis of missing-domain segmentation results