Learning Loss for Active Learning
Donggeun Yoo, In So Kweon
CVPR 2019 (oral presentation)
Lunit / KAIST
Introduction
• Data is very important for deep learning
• It is unquestionable that more data still improves network performance
  [Mahajan et al., ECCV’18] (trained on 10 million to 1 billion images)
Introduction
• Problem: limited budget for annotation
(Figure: annotation cost grows with label complexity, from a class label “Horse=1” at $ to richer annotations at $$ and $$$)
Introduction
• Problem: limited budget for annotation
• Disease-level annotations for medical images are super-expensive
(Figure: a class label “Horse=1” costs $, a medical annotation $$$$$)
Active Learning
(Figure: the active learning cycle)
• Train a model on the labeled set
• Run inference on the unlabeled pool
• If uncertain, send a data point to labeling
• Add the newly labeled data to the labeled set and retrain
Active Learning
• The key to active learning is how to measure uncertainty.
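To make the loop concrete, here is a minimal, framework-agnostic sketch of one active learning run. Every name in it (train_fn, score_fn, annotate_fn) is a hypothetical stand-in supplied by the caller, not the authors' API; the uncertainty measure is deliberately left as a plug-in.

```python
def active_learning_loop(train_fn, score_fn, annotate_fn,
                         labeled, unlabeled, cycles=10, k=1000):
    """Generic loop: retrain, score the pool, query top-K, annotate, repeat."""
    model = None
    for _ in range(cycles):
        model = train_fn(labeled)                 # retrain on current labels
        queries = sorted(unlabeled,
                         key=lambda x: score_fn(model, x),
                         reverse=True)[:k]        # K most uncertain points
        labeled = labeled + annotate_fn(queries)  # human oracle labels top-K
        picked = set(queries)                     # assumes pool items are hashable
        unlabeled = [x for x in unlabeled if x not in picked]
    return model
```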
Active Learning: Limitations
• Heuristic approach
  • Highest entropy [Joshi et al., CVPR’09]
  • Distance to decision boundaries [Tong & Koller, JMLR’01]
  (−) Task-specific design
• Ensemble approach [Freund et al., ML’97], [Beluch et al., CVPR’18]
  (−) Does not scale to large CNNs and data
• Bayesian approach
  • Expected error [Roy & McCallum, ICML’01] / expected model [Kapoor et al., ICCV’07]
  • Bayesian inference by dropout [Gal & Ghahramani, ICML’17]
  (−) Does not scale to large CNNs and data [Sener & Savarese, ICLR’18]
• Distribution approach
  • Density-based [Liu & Ferrari, ICCV’17], diversity-based [Sener & Savarese, ICLR’18]
  (−) Task-specific design (density-based); does not consider hard examples (diversity-based)
*Entropy
• An information-theoretic measure of the amount of information needed to “encode” a distribution
• The use of entropy in active learning:
  • Uniform prediction (0.33, 0.33, 0.33) → maximum entropy (most uncertain)
  • Peaked prediction (1.00, 0.00, 0.00) → minimum entropy (most confident)
(+) Very simple but works well (also in deep networks)
(−) Specific to classification problems
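As a concrete illustration (not the authors' code), a minimal PyTorch sketch of entropy-based acquisition: score each unlabeled example by the entropy of its softmax prediction and query the top-K most uncertain ones.

```python
import torch
import torch.nn.functional as F

def entropy_scores(logits: torch.Tensor) -> torch.Tensor:
    """logits: (N, C) class scores for N unlabeled examples."""
    probs = F.softmax(logits, dim=1)
    log_probs = F.log_softmax(logits, dim=1)
    return -(probs * log_probs).sum(dim=1)  # entropy per example, shape (N,)

def select_top_k(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Return indices of the K highest-entropy (most uncertain) examples."""
    return torch.topk(entropy_scores(logits), k).indices
```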
*Bayesian Inference
• Training
  • A dropout layer is inserted after every convolution layer
  (−) Super slow convergence → impractical for current deep nets
• Inference
  • N forward passes → N predictions
  • Uncertainty = variance across the predictions
  (−) Computationally expensive
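A rough sketch of this Monte-Carlo dropout procedure [Gal & Ghahramani, ICML’17], assuming `model` already contains dropout layers. Note that `model.train()` also switches BatchNorm to training mode, so a careful implementation would enable only the dropout layers.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mc_dropout_uncertainty(model, x, n_passes: int = 25) -> torch.Tensor:
    """Variance of softmax outputs across N stochastic forward passes."""
    model.train()  # keeps dropout active at inference; see BatchNorm caveat above
    preds = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_passes)])
    # preds: (n_passes, batch, classes) -> per-example uncertainty score
    return preds.var(dim=0).mean(dim=1)
```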
*Diversity: Core-set
(Figure: distribution of the unlabeled pool; the selected subset is a δ-cover of the pool, i.e. every pool point lies within distance δ of some selected point)
• Optimization problem: choose the subset $\{x\}$ that minimizes the cover radius,
  $\{x\}^{*} = \arg\min_{\{x\}} \delta$
(+) Can be task-agnostic, as it only depends on the feature space
(−) Does not consider “hard” examples near the decision boundaries
(−) Expensive optimization for a large pool
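For reference, a hedged sketch of the greedy k-center approximation commonly used to make core-set selection tractable [Sener & Savarese, ICLR’18]: repeatedly pick the pool point farthest from the current centers, which greedily shrinks the cover radius δ. Function and variable names are illustrative.

```python
import torch

def k_center_greedy(features: torch.Tensor, labeled_idx: list, budget: int) -> list:
    """features: (N, D) pool embeddings; returns `budget` new indices to label."""
    min_dist = torch.cdist(features, features[labeled_idx]).min(dim=1).values
    picked = []
    for _ in range(budget):
        i = int(min_dist.argmax())  # farthest point from the current centers
        picked.append(i)
        new_dist = torch.cdist(features, features[i : i + 1]).squeeze(1)
        min_dist = torch.minimum(min_dist, new_dist)  # cover radius shrinks
    return picked
```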
Active Learning: Our approach
• Active learning by learning loss
  • Attach a “loss prediction module” to a target network
  • Learn the module to predict the target loss
(Figure: predicted losses are computed over the unlabeled pool; human oracles annotate the top-$K$ data points, which join the labeled training set)
Active Learning: Our approach
• Requirements
  • Task-agnostic method
  • Not heuristic but learning-based
  • Scalable to state-of-the-art networks and large data
Active Learning by Learning Loss
(Figure: a loss prediction module attached to the target model. The input $x$ passes through the model to produce the target prediction $\hat{y}$, while the module taps intermediate features to produce the loss prediction $\hat{l}$. The target GT $y$ yields the target loss $l$, and $(\hat{l}, l)$ yields the loss-prediction loss $L_{\text{loss}}(\hat{l}, l)$.)
• The target task and the loss prediction are trained jointly (multi-task learning)
(+) Applicable to any network, any data, and any task
(+) Nearly zero additional cost
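A minimal sketch of this two-headed training step, assuming a hypothetical `backbone` that returns both the prediction and the tapped intermediate features, and a `loss_module` mapping those features to one predicted loss per example. `pairwise_ranking_loss` is sketched after the total-loss slide below; the real loss $l$ is detached so it acts as a ground-truth target.

```python
import torch

def training_step(backbone, loss_module, task_criterion, x, y, lmbda=1.0):
    """One joint step; task_criterion must return per-example losses (reduction='none')."""
    pred, feats = backbone(x)       # target prediction + tapped features
    l_hat = loss_module(feats)      # predicted loss per example, shape (B,)
    l = task_criterion(pred, y)     # real loss per example, shape (B,)
    rank_loss = pairwise_ranking_loss(l_hat, l.detach())  # l treated as ground truth
    return l.mean() + lmbda * rank_loss
```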
Active Learning by Learning Loss
• The loss for loss prediction, $L_{\text{loss}}(\hat{l}, l)$
• Mean squared error? $L_{\text{loss}}(\hat{l}, l) = (\hat{l} - l)^{2}$
  → The target task loss $l$ is reduced as training progresses, so its scale keeps changing; MSE would force the module to chase this moving scale
Active Learning by Learning Loss
• The loss for loss prediction, $L_{\text{loss}}(\hat{l}, l)$
• To ignore the scale changes of $l$, we use a ranking loss over a pair of predicted losses $(\hat{l}_{i}, \hat{l}_{j})$ and a pair of real losses $(l_{i}, l_{j})$:
  $L_{\text{loss}}(\hat{l}_{i}, \hat{l}_{j}, l_{i}, l_{j}) = \max\bigl(0,\; -\mathbb{1}(l_{i}, l_{j}) \cdot (\hat{l}_{i} - \hat{l}_{j}) + \xi\bigr)$
  where $\mathbb{1}(l_{i}, l_{j}) = +1$ if $l_{i} > l_{j}$ and $-1$ otherwise, and $\xi$ is the margin (set to 1).
Active Learning by Learning Loss
• Given a mini-batch $B$, the total loss is defined as
  $\frac{1}{|B|} \sum_{(x, y) \in B} L_{\text{task}}(\hat{y}, y) \;+\; \lambda \cdot \frac{1}{|B|} \sum_{(x_{i}, y_{i}, x_{j}, y_{j}) \in B} L_{\text{loss}}(\hat{l}_{i}, \hat{l}_{j}, l_{i}, l_{j})$
  where $l_{i} = L_{\text{task}}(\hat{y}_{i}, y_{i})$. The first term is the target task loss and the second is the loss-prediction loss, summed over pairs $(i, j)$ within the mini-batch $B$.
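A sketch of the pair term, following the formula above with margin $\xi = 1$. Splitting the batch into two halves to form pairs is one simple pairing scheme assumed here; `torch.sign` plays the role of $\mathbb{1}(l_{i}, l_{j})$ (it is 0 on exact ties, which then contribute a constant margin and no gradient).

```python
import torch

def pairwise_ranking_loss(l_hat: torch.Tensor, l: torch.Tensor, margin: float = 1.0):
    """l_hat, l: (B,) predicted and real per-example losses; B assumed even."""
    half = l_hat.size(0) // 2
    li_hat, lj_hat = l_hat[:half], l_hat[half:]
    li, lj = l[:half], l[half:]
    sign = torch.sign(li - lj)  # +1 if l_i > l_j, -1 if smaller
    return torch.clamp(-sign * (li_hat - lj_hat) + margin, min=0).mean()
```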
Active Learning by Learning Loss
• MSE loss vs. ranking loss
(Figure: active learning curves with ResNet-18 on CIFAR-10 comparing the MSE and ranking variants)
Active Learning by Learning Loss
• Loss prediction module
(Figure: the target model consists of mid-level blocks and an out-block producing the target prediction. Convolved features from each block are concatenated and passed through an FC layer to give the loss prediction; the loss-prediction gradient also backpropagates into the convolutions.)
Active Learning by Learning Loss
• Loss prediction module: the target model already has enough convolutions
  • The convolutions are learned by the loss-prediction loss as well as the target loss
  • The receptive field is already sufficiently large
  → No need for more convolutions; we just focus on merging the multiple features
Active Learning by Learning Loss
• Loss prediction module
(Figure: each tapped feature map goes through GAP → FC → ReLU; the branch outputs are concatenated and a final FC produces the loss prediction)
(+) Very efficient, as GAP reduces the feature dimension
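A hedged PyTorch sketch of this module: one GAP → FC → ReLU branch per tapped feature map, concatenation, then a single FC to a scalar. The channel widths default to the CIFAR-10 / ResNet-18 configuration shown later in the deck (64/128/256/512 in, 128 per branch); the class name is illustrative.

```python
import torch
import torch.nn as nn

class LossPredictionModule(nn.Module):
    def __init__(self, in_channels=(64, 128, 256, 512), width=128):
        super().__init__()
        # one GAP -> FC -> ReLU branch per tapped feature map
        self.branches = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, width), nn.ReLU())
            for c in in_channels
        )
        self.fc = nn.Linear(width * len(in_channels), 1)  # scalar predicted loss

    def forward(self, feature_maps):
        # feature_maps: list of (B, C_i, H_i, W_i) tensors from the target model
        z = torch.cat([b(f) for b, f in zip(self.branches, feature_maps)], dim=1)
        return self.fc(z).squeeze(1)  # shape (B,)
```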
Active Learning by Learning Loss
• Loss prediction module: a variant with added convolutions
(Figure: each tapped feature map first passes through an added Conv → BN → ReLU layer, then GAP → FC → ReLU, before concatenation and the final FC for loss prediction)
Active Learning by Learning Loss
• Loss prediction module: more convolutions vs. just FC
(Figure: comparison of the two designs with ResNet-18 on CIFAR-10; the simpler FC-only module is the one used subsequently)
Experiments (1)
• To validate “task-agnostic” + “state-of-the-art architectures”

       Classification         Classification + regression   Regression
Task   Image classification   Object detection              Human pose estimation
Data   CIFAR-10               PASCAL VOC 2007+2012          MPII
Net    ResNet-18              SSD                           Stacked Hourglass Networks
       [He et al., CVPR’16]   [Liu et al., ECCV’16]         [Newell et al., ECCV’16]
Results
• Image classification on CIFAR-10 with ResNet-18 [He et al., CVPR’16]
(Figure: the loss prediction module taps four feature maps of size 64×32×32, 128×16×16, 256×8×8, and 512×4×4; each passes through GAP → FC → ReLU to 128 dimensions, the four outputs are concatenated to 512, and a final FC gives the loss prediction)
Results
• Image classification on CIFAR-10
(Figure: loss prediction performance)
Results
• Image classification on CIFAR-10 (mean of 5 trials)
(Figure: performance vs. number of labeled images; ours vs. entropy [Joshi et al., CVPR’09] and core-set [Sener & Savarese, ICLR’18]; ours gains +3.37%)
• Data selection vs. architecture:
  • Data selection by active learning → +3.37%
  • DenseNet-121 [Huang et al.] − ResNet-18 → +2.02%
Results
• Object detection with SSD (ImageNet pre-trained) [Liu et al., ECCV’16]
(Figure: the loss prediction module taps six feature maps of size 512×38×38, 1024×19×19, 512×10×10, 256×5×5, 256×3×3, and 256×1×1; each passes through GAP → FC → ReLU to 128 dimensions, the six outputs are concatenated to 768, and a final FC gives the loss prediction)
Results
• Object detection on PASCAL VOC 07+12
(Figure: loss prediction performance)
Results
• Object detection on PASCAL VOC 07+12 (mean of 3 trials)
(Figure: performance vs. number of labeled images; ours vs. entropy [Joshi et al., CVPR’09] and core-set [Sener & Savarese, ICLR’18]; ours gains +2.21%)
• Data selection vs. architecture:
  • Data selection by active learning → +2.21%
  • YOLOv2 [Redmon et al.] − SSD → +1.80%
Results
• Human pose estimation on the MPII dataset with a Stacked Hourglass Network [Newell et al., ECCV’16]
(Figure: the loss prediction module taps four 256×64×64 feature maps from an hourglass; each passes through GAP → FC → ReLU to 128 dimensions, and the concatenated outputs pass through a final FC for the loss prediction)
Results
• Human pose estimation on the MPII dataset
(Figure: loss prediction performance)
Results
• Human pose estimation on the MPII dataset (mean of 3 trials)
(Figure: performance vs. number of labeled images; ours vs. entropy [Joshi et al., CVPR’09] and core-set [Sener & Savarese, ICLR’18]; ours gains +1.84%)
• Data selection vs. number of stacks:
  • Data selection by active learning → +1.84%
  • 8-stacked − 2-stacked → +0.25%
Results
• Entropy vs. predicted loss on the MPII dataset
(Figure: entropy and predicted loss each plotted against the actual MSE loss of examples)
Experiments (2)
• To validate “active domain adaptation”

                 Dataset              Data stats                 Active learning
Source domain    MNIST                #train: 60k, #test: 10k    Use 60k as the initial labeled pool
Target domain    MNIST + background   #train: 12k, #test: 50k    Add 1k for each cycle
Results
• Image classification on MNIST with the PyTorch MNIST model*
  *https://github.com/pytorch/examples/tree/master/mnist
(Figure: the model is Conv → ReLU → Conv → ReLU → FC → ReLU → FC; the loss prediction module taps three features of size 10×12×12, 20×4×4, and 50, reduces each through GAP → FC → ReLU to 64 dimensions, concatenates them to 192, and a final FC gives the loss prediction)
Results
• Domain adaptation from MNIST to MNIST+background
(Figure: loss prediction performance)
Results
• Domain adaptation from MNIST to MNIST+background: target domain performance
(Figure: performance vs. labeled target data; ours vs. entropy [Joshi et al., CVPR’09] and core-set [Sener & Savarese, ICLR’18]; ours gains +1.20%)
• Core-set underperforms here, as the feature space is overfitted to the source domain
• Data selection vs. architecture:
  • Data selection by active learning → +1.20%
  • WideResNet-14 − PyTorch MNIST model (4 layers) → +2.85%
Conclusion
• Introduced a novel active learning method that
  • works well with current deep networks
  • is task-agnostic
• Verified with
  • three major visual recognition tasks
  • three popular network architectures

“Pick more important data, and get better performance!”