MOODv2 : Masked Image Modeling for Out-of-Distribution Detection

Computer Vision Lab 김다은
2024.03.29

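Introduction
✓ In OOD detection, the representation of the ID feature space is what matters
• With a well-formed ID feature representation, good performance is achievable regardless of which OOD detection score function is used
✓ This work studies an improved feature representation for OOD detection
Problem with existing methods
✓ Classification-based methods suffer from shortcut learning
• the model takes shortcuts, relying on localized, stereotypical features instead of a comprehensive representation
Research goal
✓ A well-formed ID feature representation
• Captures the overall characteristics of the image
• More representative than the features learned by existing classification-based methods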

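Introduction
[Figure: OOD score functions grouped by the information source they use. Column labels: Feature space, logits, Softmax prob, Features & logits. Methods shown: Residual, Energy + ReAct, Mahalanobis, Energy, MaxLogit, MSP, KL-Matching, ViM.]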

Introduction - Background
Based on probability
• ResNet-based models suffer from overconfidence
• ReLU, the activation function in ResNet, outputs max(0, z), so its values lie in [0, ∞)
If even one class weight $w_i$ yields a positive value for image x, that class ends up with high confidence, because the other values become 0 even where they were negative
• There have been efforts to improve calibration, such as scaling the logits by a temperature T at test time, i.e. taking the softmax over $\exp(f_i(x) / T)$
• For OOD detection, class probabilities alone are not enough; the differences in feature space also need to be understood
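The temperature scaling mentioned above can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the function names `softmax_with_temperature` and `msp_score` and the example logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Return softmax(logits / T) as a list of probabilities."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits, T=1.0):
    """Maximum softmax probability; higher means more ID-like."""
    return max(softmax_with_temperature(logits, T))

logits = [8.0, 1.0, -2.0]         # hypothetical logits f_i(x)
print(msp_score(logits, T=1.0))   # sharply peaked distribution
print(msp_score(logits, T=10.0))  # flatter distribution after scaling
```

A larger T flattens the distribution, which lowers the maximum probability for the same logits.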

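logit
• The logit layer transforms the feature of the input data into a new coordinate system
• This fixes an origin o for the new feature space
• Positions of x in the feature space can then be compared
• The feature space contains class-agnostic information as well as class information
• The class-agnostic information cannot be recovered from the logits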
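One way to make the origin o concrete is the convention used in the ViM paper, where o = -(W^T)^+ b so that the logits become bias-free in the shifted coordinates. The sketch below assumes that convention applies to this slide; the toy sizes and random weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 8, 3                     # feature dimension, number of classes (toy sizes)
W = rng.normal(size=(N, C))     # classifier weight, logits = W^T x + b
b = rng.normal(size=C)
x = rng.normal(size=N)

# Origin of the shifted coordinate system: o = -(W^T)^+ b (assumed convention)
o = -np.linalg.pinv(W.T) @ b

logits_original = W.T @ x + b
logits_shifted = W.T @ (x - o)  # bias-free form in the new coordinates
print(np.allclose(logits_original, logits_shifted))
```

The identity holds here because W^T has full row rank, so W^T (W^T)^+ is the identity on the class dimension.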

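OOD Score Based on Null Space
• $x^{W}$ is the component of x in the column space of W
• $x^{W^\perp}$ is the component of x in the null space of $W^T$ (the orthogonal complement of that column space)
• The closer $x^{W^\perp}$ is to 0, the more in-distribution the sample
• Because information in the feature space is taken into account, this works especially well on data where class-agnostic information such as texture matters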
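The decomposition above can be checked numerically. The following sketch assumes logits of the form W^T x with the bias already folded into the origin; the toy sizes, random weights, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 8, 3
W = rng.normal(size=(N, C))       # logits = W^T x (bias assumed folded into the origin)
x = rng.normal(size=N)

# Orthogonal projector onto the column space of W
P_col = W @ np.linalg.pinv(W)
x_col = P_col @ x                 # component in the column space of W
x_null = x - x_col                # component in the null space of W^T

print(np.allclose(W.T @ x_null, 0))       # the null-space part does not change the logits
print(np.allclose(W.T @ x, W.T @ x_col))  # the logits only see the column-space part
print(np.linalg.norm(x_null))             # norm used as the OOD cue
```

Since the null-space component has no effect on the logits, any score built only from logits cannot see it, which is why it is measured separately.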

OOD Score Based on Principal Space
• P: the principal space is defined as the D-dimensional subspace spanned by the eigenvectors that correspond to the D largest eigenvalues of the in-distribution data x
* Non-linear manifold learning is performed to carry out this dimensionality reduction
• That is, projecting x onto the null space of P, rather than onto the principal space itself, can give better performance
• The smaller Residual(x) is, the closer the sample is to in-distribution
• The problem is that P projects onto a lower-dimensional space than W, so the result is distorted
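A residual score of this kind can be sketched with a plain eigen-decomposition. This is an illustration under assumptions: it uses the linear construction from the ViM paper (eigenvectors of X^T X), synthetic features, and hypothetical names such as `residual`.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 16, 4                      # feature dim, principal dim (toy sizes)

# Hypothetical ID features that mostly live in a D-dimensional subspace
basis = rng.normal(size=(D, N))
X_id = rng.normal(size=(500, D)) @ basis + 0.05 * rng.normal(size=(500, N))

# Eigen-decomposition of X^T X; eigh returns eigenvalues in ascending order,
# so the first N - D eigenvectors span the complement of the principal space
eigvals, eigvecs = np.linalg.eigh(X_id.T @ X_id)
R = eigvecs[:, : N - D]

def residual(x):
    """Norm of the projection of x onto the complement of the principal space."""
    return np.linalg.norm(R.T @ x)

x_id = rng.normal(size=D) @ basis  # lies on the ID subspace
x_ood = rng.normal(size=N)         # generic direction, mostly off the subspace
print(residual(x_id), residual(x_ood))
```

The ID-like sample has a residual close to zero, while the generic sample keeps most of its norm outside the principal space.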

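Virtual logit matching
• ViM combines the residual score with the class probabilities
• A virtual logit $l_0$ is added
• $x^{P^\perp}$ is distorted by the dimensionality reduction
To compensate, it is rescaled by a factor α (a value that depends on the data)
• The residual score, which carries the feature information, is treated as a virtual logit and passed through the softmax together with the per-class logits
• If the virtual logit exceeds a certain threshold, the input is judged to be OOD
• The score takes both class probability and class-agnostic information into account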
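The steps above can be put together in a short sketch. It assumes the scaling rule from the ViM paper, where α is the ratio of the summed maximum logits to the summed residual norms on training data, and it uses synthetic features and a random classifier; all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, D = 16, 5, 4

# Hypothetical training features and classifier (origin assumed already shifted)
basis = rng.normal(size=(D, N))
X_train = rng.normal(size=(500, D)) @ basis + 0.05 * rng.normal(size=(500, N))
W = rng.normal(size=(N, C))       # logits = x @ W

_, eigvecs = np.linalg.eigh(X_train.T @ X_train)
R = eigvecs[:, : N - D]           # complement of the principal space

def residual_norm(X):
    return np.linalg.norm(X @ R, axis=-1)

# alpha matches the average virtual logit to the average max logit on training data
alpha = (X_train @ W).max(axis=1).sum() / residual_norm(X_train).sum()

def vim_score(x):
    """Softmax probability of the virtual logit l0; larger means more OOD-like."""
    logits = x @ W
    l0 = alpha * residual_norm(x)
    all_logits = np.concatenate([[l0], logits])
    all_logits -= all_logits.max()  # numerical stability
    p = np.exp(all_logits) / np.exp(all_logits).sum()
    return p[0]

id_scores = [vim_score(x) for x in X_train[:50]]
x_ood = rng.normal(size=N)
print(np.mean(id_scores), vim_score(x_ood))
```

Thresholding `vim_score` then gives the OOD decision described in the bullets; the threshold itself is a choice left to the user.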

Method
Motivation
✓ classification vs reconstruction
• Instead of learning patterns for classification, the network is forced to reconstruct the ID image representation at the pixel level
• This lets the network learn a more representative representation of the ID data
✓ ID and OOD datasets are constructed to check OOD detection performance
• The original image and the reconstructed image are each passed through the score function, and the difference between the scores is computed
[Figure labels: ID dataset; the largest domain gap]

Method
Masked Image Modeling for OOD
• Using a model pre-trained with MIM was found to improve OOD detection performance substantially (MOODv1)
• Fine-tuning the MIM pre-trained model on the in-distribution dataset makes it capture the in-distribution information sufficiently
• A MIM pre-trained model based on BEiT is used
• The ViM score is computed from the encoder's logits
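For readers unfamiliar with MIM, the input corruption step can be illustrated as follows. This is a simplified sketch, not BEiT's procedure: BEiT uses blockwise masking and predicts discrete visual tokens, whereas this example zeroes out uniformly random patches. The function name `random_patch_mask` and its parameters are hypothetical.

```python
import numpy as np

def random_patch_mask(image, patch=4, mask_ratio=0.4, seed=0):
    """Zero out a random subset of non-overlapping patches.

    Assumes the image height and width are divisible by `patch`.
    Returns the masked image and a boolean grid marking masked patches.
    """
    H, W = image.shape[:2]
    gh, gw = H // patch, W // patch
    n_mask = int(round(mask_ratio * gh * gw))
    rng = np.random.default_rng(seed)
    flat = np.zeros(gh * gw, dtype=bool)
    flat[rng.choice(gh * gw, size=n_mask, replace=False)] = True
    grid = flat.reshape(gh, gw)
    # Expand the patch grid to pixel resolution
    pixel_mask = grid.repeat(patch, axis=0).repeat(patch, axis=1)
    masked = image.copy()
    masked[pixel_mask] = 0
    return masked, grid

image = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
masked, grid = random_patch_mask(image)
print(grid.sum(), "of", grid.size, "patches masked")
```

During pre-training the model sees only the corrupted input and is asked to recover the missing content, which is what pushes it toward a holistic representation.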

Experiments
ID/OOD Datasets
✓ ID Dataset
• CIFAR-10
• ImageNet-1K
✓ OOD Dataset
• OpenImage-O
• Texture
• iNaturalist
• ImageNet-O
Evaluation Metrics
✓ AUROC
✓ FPR95
• the false positive rate when the true positive rate is 95%
[Figure labels: ID dataset (cifar, imagenet); OOD dataset (OpenImage-O, Texture, ImageNet-O, I-naturalist)]
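Both metrics can be computed directly from two lists of scores. The sketch below assumes ID is the positive class and that a higher score means more ID-like; the function names and the example scores are hypothetical.

```python
import math

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores higher than a random OOD sample.

    Ties count as half a win.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID at the threshold that keeps
    at least 95% of the ID samples."""
    ranked = sorted(id_scores, reverse=True)
    k = math.ceil(0.95 * len(ranked))  # number of ID samples that must be accepted
    threshold = ranked[k - 1]
    return sum(s >= threshold for s in ood_scores) / len(ood_scores)

id_scores = [0.9, 0.8, 0.75, 0.7, 0.6, 0.55, 0.5, 0.45, 0.4, 0.3]
ood_scores = [0.35, 0.2, 0.1, 0.65, 0.05]
print(auroc(id_scores, ood_scores), fpr_at_95_tpr(id_scores, ood_scores))
```

The pairwise AUROC loop is quadratic and meant for illustration only; a rank-based implementation is preferable for large test sets.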
