Review : Prototype Mixture Models for Few-shot Semantic Segmentation
1. Prototype Mixture Models
for Few-shot Semantic Segmentation
University of Chinese Academy of Sciences, Beijing, China
Yonsei University Severance Hospital CCIDS
Choi Dongmin
2. Abstract
• Few-shot segmentation
- a challenging task
- a single prototype extracted from the support image causes semantic ambiguity
• Prototype mixture models (PMMs)
- correlate diverse image regions with multiple prototypes
- leverage the semantics to activate objects in the query image
- state-of-the-art results on PASCAL VOC and MS-COCO
3. Introduction
Nguyen et al. Feature Weighting and Boosting for Few-Shot Segmentation. ICCV 2019
Few-shot Segmentation
Segmenting the query image based on a feature representation learned on training images,
given support images and their corresponding support masks
4. Introduction
Single Prototype Model vs Prototype Mixture Model
A single prototype causes "semantic ambiguity" and deteriorates the distribution of features.
PMMs focus on solving the semantic ambiguity problem.
6. Related Works
Semantic Segmentation
Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. TPAMI 2017
State-of-the-art methods : U-Net, PSPNet, DeepLab
7. Related Works
Few-shot learning
• Metric Learning
- train networks to predict whether two images/regions belong to the
same category
• Meta-learning
- specify optimization or loss functions which force faster adaptation
of the parameters to new categories with few examples
• Data Augmentation
- generate additional examples for unseen categories
8. Related Works
Few-shot learning
• Metric Learning
Chen et al. A Closer Look at Few-Shot Classification. ICLR 2019
Simple prototypes for each class, which capture representative and discriminative features
9. Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework
- Feed learned knowledge to a metric module to segment query images
Shaban et al. One-Shot Learning for Semantic Segmentation. BMVC 2017
OSLSM (two-branch network)
Support branch
Query branch
10. Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework
- Feed learned knowledge to a metric module to segment query images
Zhang et al. SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation. CoRR abs/1810.09091 (2018)
SG-One, which uses a prototype vector
Prototype vector
11. Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework
- Feed learned knowledge to a metric module to segment query images
Wang et al. PANet: Few-Shot Image Segmentation with Prototype Alignment. ICCV 2019
PANet w/ a prototype alignment regularization between support and query branches
12. Related Works
Few-shot Segmentation
• Metric Learning in few-shot segmentation
- The core is the prototype vector, which is commonly computed by
global average pooling (GAP)
- However, GAP typically disregards the spatial extent of objects and
tends to mix semantics from various parts
- With a single prototype representing object regions, the semantic
ambiguity problem remains unsolved
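As a reference for what PMMs replace, the single-prototype baseline can be sketched as masked global average pooling over the support features (an illustrative sketch, not the paper's code; shapes and names are assumptions):

```python
import numpy as np

def masked_average_prototype(feat, mask):
    """Single prototype via masked global average pooling (GAP), the
    baseline that PMMs improve on.
    feat: (C, H, W) support features; mask: (H, W) binary foreground mask."""
    w = mask.astype(feat.dtype)
    # Average only the feature vectors that fall inside the support mask.
    return (feat * w[None]).sum(axis=(1, 2)) / max(w.sum(), 1.0)

# Toy example: 4-channel features on an 8x8 grid with a 4x4 foreground mask.
feat = np.arange(4 * 8 * 8, dtype=float).reshape(4, 8, 8)
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
proto = masked_average_prototype(feat, mask)  # shape (4,)
```

Because the mask is binary, the result is exactly the mean of the feature vectors inside the masked region; all spatial layout within the mask is discarded, which is the ambiguity PMMs address.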
14. The Proposed Approach
Overview
Support branch
Query branch
Negative sample set S−
Positive sample set S+
Activate query features in a duplex way (P-Match and P-Conv)
15. The Proposed Approach
Prototype Mixture Models
Support features S ∈ R^{W×H×C} are spatially partitioned into
a foreground sample set S+ and a background sample set S−
( S+ : feature vectors within the mask of the support image )
16. The Proposed Approach
Prototype Mixture Models
PMMs : a probability mixture model
p(s_i | θ) = Σ_{k=1}^{K} w_k p_k(s_i | θ)

- w_k : the mixing weights, 0 ≤ w_k ≤ 1, Σ_{k=1}^{K} w_k = 1
- θ = {μ, κ} : the model parameters
- s_i ∈ S : the i-th feature sample
- p_k(s_i | θ) : the k-th base model, a probability model based on
  a kernel distance function (vector distance)

p_k(s_i | θ) = β(θ) e^{Kernel(s_i, μ_k)} = β_c(κ) e^{κ μ_k^T s_i}

where μ_k ∈ θ is one of the mean parameters and the normalization constant is
β_c(κ) = κ^{c/2−1} / ( (2π)^{c/2} I_{c/2−1}(κ) )
(the von Mises-Fisher form)
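The base model above is the von Mises-Fisher density on the unit sphere; a sketch of evaluating the mixture, using SciPy's modified Bessel function for the normalization constant (function and variable names are illustrative):

```python
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind

def vmf_mixture_density(s, mus, kappa, w):
    """Mixture density p(s | theta) = sum_k w_k * beta_c(kappa) * exp(kappa mu_k^T s)
    with von Mises-Fisher base models; all vectors are L2-normalized.
    s: (c,) unit sample; mus: (K, c) unit prototypes; w: (K,) mixing weights."""
    c = s.shape[0]
    # beta_c(kappa) = kappa^(c/2-1) / ((2 pi)^(c/2) * I_{c/2-1}(kappa))
    beta = kappa ** (c / 2 - 1) / ((2 * np.pi) ** (c / 2) * iv(c / 2 - 1, kappa))
    return float(np.sum(w * beta * np.exp(kappa * (mus @ s))))

mus = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # two unit prototypes
w = np.array([0.5, 0.5])
p_near = vmf_mixture_density(np.array([1.0, 0.0, 0.0]), mus, 2.0, w)
p_far = vmf_mixture_density(np.array([-1.0, 0.0, 0.0]), mus, 2.0, w)
```

Samples aligned with a prototype direction get higher density than samples pointing away from all prototypes, which is what lets the mixture separate object parts.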
17. The Proposed Approach
Prototype Mixture Models
Model Learning using EM algorithm
E-step : given the model parameters and the extracted sample features,
calculate the expectation (responsibility) of sample s_i :

E_ik = p_k(s_i | θ) / Σ_{k=1}^{K} p_k(s_i | θ)
     = e^{κ μ_k^T s_i} / Σ_{k=1}^{K} e^{κ μ_k^T s_i}

M-step : the expectations are used to update the mean vectors of PMMs :

μ_k = Σ_{i=1}^{N} E_ik s_i / Σ_{i=1}^{N} E_ik

( N = W × H is the number of samples )
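The two steps above can be sketched as follows (a minimal NumPy version; keeping κ fixed and re-normalizing the means to unit length are assumptions consistent with the vMF kernel, and the 10-iteration default follows the experimental setup):

```python
import numpy as np

def pmm_em(samples, K=3, kappa=20.0, iters=10, seed=0):
    """EM for the prototype means (kappa kept fixed; 10 rounds as in the setup).
    samples: (N, C) L2-normalized feature vectors; returns (K, C) prototypes."""
    rng = np.random.default_rng(seed)
    N, C = samples.shape
    mus = samples[rng.choice(N, size=K, replace=False)]  # init from samples
    for _ in range(iters):
        # E-step: E_ik = exp(kappa mu_k^T s_i) / sum_k exp(kappa mu_k^T s_i)
        logits = kappa * samples @ mus.T              # (N, K)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        E = np.exp(logits)
        E /= E.sum(axis=1, keepdims=True)
        # M-step: mu_k = sum_i E_ik s_i / sum_i E_ik, re-projected to unit length
        mus = (E.T @ samples) / E.sum(axis=0)[:, None]
        mus /= np.linalg.norm(mus, axis=1, keepdims=True) + 1e-12
    return mus

# Toy run: 40 unit feature vectors in C=5.
x = np.random.default_rng(1).normal(size=(40, 5))
x /= np.linalg.norm(x, axis=1, keepdims=True)
protos = pmm_em(x, K=3)  # shape (3, 5)
```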
18. The Proposed Approach
Prototype Mixture Models
Model Learning using EM algorithm
The mean vectors μ+ = {μ+_k, k = 1, …, K} and μ− = {μ−_k, k = 1, …, K}
are used as prototype vectors to extract convolution features
for the query image.
Such a prototype vector can represent a region around an object part.
19. The Proposed Approach
Prototype Mixture Models
PMMs as Representation (P-Match)
μ+ squeezes representation information about an object part
and can be used to match and activate the query features Q :

Q′ = P-Match(μ+_k, Q), k = 1, …, K
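One common instantiation of such prototype matching (in the style of CANet's dense comparison, which this work builds on) tiles the prototype over the spatial grid and concatenates it with the query features; a hedged sketch with illustrative shapes:

```python
import numpy as np

def p_match(proto, query_feat):
    """Sketch of P-Match: tile the foreground prototype mu+ over the query
    feature map and concatenate along channels, so a subsequent conv head
    can compare prototype and query at every spatial location.
    proto: (C,); query_feat: (C, H, W) -> (2C, H, W)."""
    C, H, W = query_feat.shape
    tiled = np.broadcast_to(proto[:, None, None], (C, H, W))
    return np.concatenate([query_feat, tiled], axis=0)

q = np.random.randn(8, 16, 16)
out = p_match(np.random.randn(8), q)  # shape (16, 16, 16)
```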
20. The Proposed Approach
Prototype Mixture Models
PMMs as Classifiers (P-Conv)
Each prototype vector, incorporating discriminative information
across feature channels, can be seen as a classifier
which produces probability maps M_k = {M+_k, M−_k} :

M_k = P-Conv(μ+_k, μ−_k, Q), k = 1, …, K
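Interpreting prototypes as 1×1 convolution kernels can be sketched as a per-pixel dot product followed by a softmax over the 2K maps (an illustrative sketch; the exact normalization used in the paper may differ):

```python
import numpy as np

def p_conv(pos_protos, neg_protos, query_feat):
    """Sketch of P-Conv: treat the 2K prototypes as 1x1 convolution kernels
    over the query features, then softmax-normalize the activation maps
    into probability maps M_k.
    pos_protos, neg_protos: (K, C); query_feat: (C, H, W) -> (2K, H, W)."""
    protos = np.concatenate([pos_protos, neg_protos], axis=0)  # (2K, C)
    # A 1x1 convolution is a per-pixel dot product with each prototype.
    maps = np.einsum('kc,chw->khw', protos, query_feat)        # (2K, H, W)
    maps -= maps.max(axis=0, keepdims=True)                    # stability
    e = np.exp(maps)
    return e / e.sum(axis=0, keepdims=True)

m = p_conv(np.random.randn(3, 8), np.random.randn(3, 8),
           np.random.randn(8, 5, 5))  # shape (6, 5, 5)
```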
21. The Proposed Approach
Prototype Mixture Models
P-Match and P-Conv
The semantic information across channels and discriminative information related to object
parts are collected from the support features to activate the query features Q
23. The Proposed Approach
Residual Prototype Mixture Models
Ensemble by stacking multiple PMMs
to further enhance the model's representational capacity
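As a simplified illustration of combining several PMMs' outputs (the paper's residual stacking is more elaborate, with stacked PMMs refining one another's activations), one can fuse their probability maps:

```python
import numpy as np

def rpmm_ensemble(prob_maps):
    """Simplified fusion of probability maps from several PMMs (e.g. run with
    different initializations): average, then re-normalize per pixel.
    prob_maps: list of (2K, H, W) arrays -> fused (2K, H, W)."""
    fused = np.mean(prob_maps, axis=0)
    return fused / fused.sum(axis=0, keepdims=True)

# Two toy probability maps, each normalized over the prototype axis.
a = np.random.rand(4, 2, 2); a /= a.sum(axis=0, keepdims=True)
b = np.random.rand(4, 2, 2); b /= b.sum(axis=0, keepdims=True)
fused = rpmm_ensemble([a, b])
```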
24. Experiments
• Baseline : CANet w/o iterative optimization
• Data Augmentation
: normalization, horizontal flipping, random cropping and random resizing
• PyTorch 1.0 & NVIDIA 2080Ti GPUs
• The EM algorithm iterates 10 rounds
• Optimization
: Cross-entropy Loss with SGD (init lr = 0.0035, momentum 0.9,
200,000 iterations, 8 pairs of support-query images per batch),
LR decay following DeepLab’s policy
• For each training step, the categories in the train split are randomly selected
and then the support-query pairs are randomly sampled in the selected
categories.
Zhang et al. CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning. CVPR 2019
Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. TPAMI 2018
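The episodic sampling described above can be sketched as follows (the `images_by_class` structure and function names are illustrative, not the paper's code):

```python
import random

def sample_episode(images_by_class, train_classes, rng=random):
    """Sketch of one training step's sampling: pick a category at random
    from the train split, then a support-query image pair from it.
    images_by_class: dict class_id -> list of image ids (illustrative)."""
    cls = rng.choice(train_classes)
    support, query = rng.sample(images_by_class[cls], 2)  # distinct pair
    return cls, support, query

data = {1: ['a', 'b', 'c'], 2: ['d', 'e']}
cls, s, q = sample_episode(data, [1, 2], random.Random(0))
```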
25. Experiments
• Dataset
- Pascal-5i : 20 object categories are partitioned into 4 splits,
with 3 splits for training and 1 for testing
- COCO-20i : 80 classes are divided into 4 splits, each containing
20 classes; the val dataset is used for evaluation
• Evaluation Metric : mIoU
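The Pascal-5i partitioning into folds of 5 consecutive class ids (following Shaban et al.) can be sketched as:

```python
def pascal5i_split(fold):
    """Sketch of the Pascal-5i partitioning: 20 VOC class ids (1..20) in
    4 folds of 5 consecutive classes; the chosen fold is held out for
    testing and the remaining 15 classes are used for training."""
    assert fold in (0, 1, 2, 3)
    test_classes = list(range(fold * 5 + 1, fold * 5 + 6))
    train_classes = [c for c in range(1, 21) if c not in test_classes]
    return train_classes, test_classes

train, test = pascal5i_split(0)  # test = [1, 2, 3, 4, 5]
```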
33. Conclusion
• PMMs
- correlate diverse image regions with multiple prototypes to solve the
semantic ambiguity problem
- During training, PMMs incorporate rich channel-wise and spatial
semantics from limited support images
- During inference, PMMs are matched with query features in a duplex
manner to perform accurate semantic segmentation
- State-of-the-art performance on few-shot segmentation
- Capture the diverse semantics of object parts given few support
examples