Visualizing and Understanding
Convolutional Networks
๊น€์€์žฌ
November 21, 2017
Contents
▪ Abstract
▪ Introduction
▪ Approach
▪ Training Details
▪ Convnet Visualization
▪ Experiments
▪ Discussion
Abstract
▪ Large convolutional network models have shown excellent classification performance,
  but there is no clear understanding of why they perform so well or how they could be improved.
▪ To address this, a visualization technique is introduced.
  • It gives insight into the function of intermediate feature layers and the operation of the classifier.
  • Used diagnostically, it leads to a model architecture that outperforms the classifier of (Krizhevsky et al. 2012).
▪ An ablation study is performed to discover the performance contribution of the different model layers.
▪ The resulting ImageNet model is shown to generalize well to other datasets.
Introduction
▪ Why Convolutional Network classifiers have been able to succeed
  • 1) Large training sets
  • 2) Powerful GPUs
  • 3) Dropout
▪ Despite the success of convnet models, it is hard to understand how such performance is achieved,
  or what the behaviour and internal operation of these complex models are.
▪ This paper introduces a visualization technique that reveals the inputs that activate each feature map
  at any layer of the model.
  • It lets us watch the features evolve during training.
  • It lets us spot potential problems with the model.
  • It uses a multi-layered Deconvolutional Network (Zeiler et al. 2011).
▪ By occluding portions of the input image, the paper also studies which parts of the scene matter
  to the classifier.
▪ Using the visualization technique, other architectures are explored and a model that outperforms
  (Krizhevsky et al. 2012) is found; this model's generalization ability is then studied on other datasets.
Approach (1)
▪ Convnet model
  • Uses the standard fully supervised convnet models of (Krizhevsky et al. 2012).
  • The front of the network is a stack of convolutional layers, the top few layers are conventional
    fully-connected layers, the final layer is a softmax classifier, and the loss function is
    cross-entropy. (A minimal sketch of this layout follows.)
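The sketch below only illustrates this overall layout (conv layers, then fully-connected layers, then a softmax head trained with cross-entropy). The filter counts, kernel sizes, and the SimpleConvnet name are assumptions of this sketch, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class SimpleConvnet(nn.Module):
        """Conv front end -> fully-connected top -> softmax classifier (via cross-entropy)."""
        def __init__(self, num_classes=1000):
            super().__init__()
            self.features = nn.Sequential(                       # convolutional layers
                nn.Conv2d(3, 96, kernel_size=7, stride=2), nn.ReLU(),
                nn.MaxPool2d(3, stride=2),
                nn.Conv2d(96, 256, kernel_size=5, stride=2), nn.ReLU(),
                nn.MaxPool2d(3, stride=2),
            )
            self.classifier = nn.Sequential(                     # conventional fully-connected top
                nn.Flatten(),
                nn.LazyLinear(4096), nn.ReLU(), nn.Dropout(0.5),
                nn.Linear(4096, num_classes),                    # logits; softmax is folded into the loss
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    model = SimpleConvnet()
    loss_fn = nn.CrossEntropyLoss()                              # cross-entropy loss, as on the slide
    logits = model(torch.randn(2, 3, 64, 64))                    # small toy input; real crops are 224x224
    print(loss_fn(logits, torch.randint(0, 1000, (2,))))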
Approach (2)
▪ 2.1 Visualization with a Deconvnet
  • Shows how feature activities in the intermediate layers are mapped back to pixel space.
  • A feature map is passed as input and the following three operations are applied
    (a small sketch of one such stage follows):
    ➢ Unpooling
       - Using variables called switches, the map is restored to its shape before pooling.
       - The switches record the location of the local max within each pooling region.
    ➢ Rectification
       - A ReLU ensures the reconstructed feature maps are always positive.
    ➢ Filtering
       - Uses transposed versions of the filters used in the convnet.
       - Applied to the rectified maps.
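Below is a minimal sketch of one deconvnet stage (unpool, rectify, transposed filtering) in PyTorch. The filter shapes and the choice to keep a single feature map are illustrative assumptions, not the paper's exact setup.

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 3, 32, 32)                    # toy input image
    w = torch.randn(16, 3, 5, 5)                     # filters of the convnet layer being examined

    # forward (convnet) pass: convolve, rectify, max-pool while recording the switches
    feat = F.relu(F.conv2d(x, w, padding=2))
    pooled, switches = F.max_pool2d(feat, 2, return_indices=True)

    # backward (deconvnet) pass for one chosen feature map
    recon = torch.zeros_like(pooled)
    recon[0, 0] = pooled[0, 0]                       # keep one feature map, zero all others

    recon = F.max_unpool2d(recon, switches, 2)       # Unpooling: place values at the recorded max locations
    recon = F.relu(recon)                            # Rectification: keep the reconstruction positive
    recon = F.conv_transpose2d(recon, w, padding=2)  # Filtering: transposed versions of the same filters
    print(recon.shape)                               # back in pixel space: (1, 3, 32, 32)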
Training Details
▪ Differences from (Krizhevsky et al. 2012)
  • (Krizhevsky et al. 2012) used sparse connections in layers 3, 4 and 5 (a consequence of splitting
    the model across two GPUs), whereas the model here uses dense connections.
  • Layers 1 and 2 differ. (Explained later.)
▪ Training
  • Uses the ImageNet 2012 training set.
  • Preprocessing
    ➢ The central 256×256 region of each image is cropped and the per-pixel mean is subtracted.
      Ten 224×224 sub-crops are then produced from it.
  • Parameters are updated with stochastic gradient descent using mini-batches of size 128.
  • Dropout is used in layers 6 and 7.
  • All weights are initialized to 10^-2 and all biases to 0.
  (A short sketch of this setup follows.)
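A minimal sketch of the preprocessing and initialization above. The stand-in layers, the momentum value, and the use of torchvision transforms are assumptions; the slide does not give a learning-rate schedule.

    import torch
    import torch.nn as nn
    from torchvision import transforms

    # 256x256 centre crop followed by ten 224x224 sub-crops; the per-pixel mean
    # would be subtracted from each crop before it is fed to the network.
    preprocess = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.TenCrop(224)])

    def init_weights(m):
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.constant_(m.weight, 1e-2)        # all weights initialized to 10^-2
            nn.init.zeros_(m.bias)                   # all biases initialized to 0

    layer6 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5))  # dropout on layers 6/7
    layer6.apply(init_weights)

    # mini-batch SGD with 128-image batches (the momentum value here is an assumption)
    optimizer = torch.optim.SGD(layer6.parameters(), lr=1e-2, momentum=0.9)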
Convnet Visualization (1)
▪ Feature Visualization
  • The figure (in the slides) shows the features after training is complete; next to each
    visualization the corresponding image patches are shown.
  • Instead of only the single strongest activation, the top 9 activations are shown.
    (A sketch of collecting the top-9 activations follows.)
  • Projecting each one back to pixel space reveals the different structures that excite
    a given feature map.
  • The image patches show greater variation than the visualizations; the visualizations focus
    on the discriminant structure within each patch. (e.g. Layer 5, row 1, column 2)
  • The features in the network are hierarchical:
    ➢ Layer 2: corners and other edge/color conjunctions
    ➢ Layer 3: similar textures
    ➢ Layer 4: significant variation, more class-specific
    ➢ Layer 5: entire objects with significant pose variation
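A hedged sketch of ranking the nine strongest activations of one feature map over a set of images. The filters and images are random stand-ins; projecting each activation back to pixel space is the deconvnet stage sketched in the Approach section.

    import torch
    import torch.nn.functional as F

    w = torch.randn(16, 3, 5, 5)                               # stand-in filters for one layer
    images = [torch.randn(1, 3, 64, 64) for _ in range(50)]    # stand-in image set
    channel = 0                                                # feature map under inspection

    records = []
    for idx, img in enumerate(images):
        fmap = F.relu(F.conv2d(img, w, padding=2))[0, channel]
        val, flat = fmap.max().item(), fmap.argmax().item()
        records.append((val, idx, divmod(flat, fmap.shape[1])))  # (activation, image index, (row, col))

    top9 = sorted(records, reverse=True)[:9]                     # the nine strongest activations
    for val, idx, loc in top9:
        print(f"image {idx}: activation {val:.3f} at {loc}")     # each would then be projected by the deconvnet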
Convnet Visualization (2)
▪ Feature Evolution during Training
  • The figure (in the slides) visualizes, over the course of training, the progression of the
    strongest activation of selected feature maps.
  • The lower layers converge within a few epochs, whereas the upper layers only develop after
    a considerable number of epochs (40-50).
Convnet Visualization (3)
▪ Feature invariance
  • Five sample images are transformed (translated, rotated, scaled) and the relative change of
    the transformed feature vector with respect to the untransformed feature vector is examined.
    (A sketch of this measurement follows.)
  • Small transformations have a large effect in the first layer, but only a minor effect at the
    last layer.
  • The network output is stable to translation and scaling, but not to rotation
    (except for objects with rotational symmetry).
  • Figure on the next slide:
    ➢ A: translation, B: scale, C: rotation
    ➢ Col 1: the five transformed images
    ➢ Col 2 & 3: Euclidean distance between the original and transformed feature vectors
      (Col 2: Layer 1, Col 3: Layer 7)
    ➢ Col 4: probability of the true label for each image
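A minimal sketch of the measurement: apply a transformation (rotation here) and compute the Euclidean distance between the original and transformed feature vectors. The feature extractor below is a stand-in, not a particular layer of the paper's model.

    import torch
    import torchvision.transforms.functional as TF

    def features(img):
        # stand-in feature extractor; in the paper this would be the output of a chosen layer
        return img.mean(dim=(2, 3)).flatten()

    img = torch.rand(1, 3, 224, 224)
    base = features(img)
    for angle in (0, 45, 90, 180):
        rotated = TF.rotate(img, angle)
        dist = torch.dist(base, features(rotated))      # Euclidean distance between feature vectors
        print(f"rotation {angle:3d} deg: distance {dist.item():.4f}")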
Convnet Visualization (4)
(figure: the translation / scale / rotation invariance plots described on the previous slide)
Convnet Visualization (5)
▪ Architecture Selection
  • Visualization helps in selecting a good architecture.
  • (a): the first layer without feature scale clipping;
    (b) and (d): the 1st and 2nd layers of Krizhevsky et al.;
    (c) and (e): the corresponding layers as changed in this paper.
  • The first-layer filter size is reduced from 11×11 to 7×7 and the convolution stride from 4 to 2.
    (The change is sketched below.)
    ➢ Result of the change:
      The mix of extremely high and low frequency information present in (b) is no longer visible in (c).
      The aliasing artifacts visible in (d) are greatly reduced in (e).
      In addition, classification performance improves.
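For concreteness, the first-layer change can be written as below; the channel count (96) follows Krizhevsky et al. and is otherwise an assumption of this sketch.

    import torch.nn as nn

    layer1_krizhevsky = nn.Conv2d(3, 96, kernel_size=11, stride=4)  # original: 11x11 filters, stride 4
    layer1_this_paper = nn.Conv2d(3, 96, kernel_size=7, stride=2)   # changed:  7x7 filters, stride 2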
Convnet Visualization (6)
(figure: the layer 1 and layer 2 visualizations (a)-(e) described above)
Convnet Visualization (7)
▪ Occlusion Sensitivity
  • Tests whether the model truly identifies the location of the object in the image,
    or merely exploits the surrounding context.
    ➢ Tested by occluding (covering up) portions of the image. (A sketch follows.)
  • Experimental results
    ➢ The model clearly localizes the object within the scene.
      - The probability of the correct class drops when the object is occluded.
    ➢ The visualization matches the image structure, which also validates the other visualizations.
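A hedged sketch of the occlusion test: slide a gray square over the image and record the correct-class probability at every occluder position. The model, image size, and patch/stride values here are stand-ins.

    import torch
    import torch.nn.functional as F

    def occlusion_map(model, img, label, patch=16, stride=8, fill=0.5):
        _, _, H, W = img.shape
        rows = (H - patch) // stride + 1
        cols = (W - patch) // stride + 1
        heat = torch.zeros(rows, cols)
        with torch.no_grad():
            for i in range(rows):
                for j in range(cols):
                    occluded = img.clone()
                    y, x = i * stride, j * stride
                    occluded[:, :, y:y + patch, x:x + patch] = fill       # gray occluder
                    prob = F.softmax(model(occluded), dim=1)[0, label]
                    heat[i, j] = prob                                      # low value => this region matters
        return heat

    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 10))  # stand-in classifier
    img = torch.rand(1, 3, 64, 64)
    print(occlusion_map(model, img, label=0))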
Convnet Visualization (8)
(figure: the occlusion sensitivity maps described above)
Convnet Visualization (9)
▪ Correspondence Analysis
  • Unlike existing recognition approaches, deep models have no explicit mechanism for establishing
    correspondence between specific object parts (e.g. the eyes and nose of a face).
    ➢ The experiment checks whether the deep model computes such correspondence implicitly.
  • Method
    ➢ Take 5 dog images; in each image occlude the left eye, the right eye, the nose, or a random
      location, and compute the difference between the feature vectors of the original and the
      occluded image:
      - \epsilon_i^l = x_i^l - \tilde{x}_i^l
        where i is the image index and x_i^l, \tilde{x}_i^l are the feature vectors at layer l for the
        original and occluded images.
    ➢ Consistency across images is then measured as
      - \Delta_l = \sum_{i,j=1,\, i \neq j}^{5} H(\mathrm{sign}(\epsilon_i^l), \mathrm{sign}(\epsilon_j^l))
        where H is the Hamming distance and sign(·) maps positive values to 1, negative values to -1,
        and 0 to 0.
    ➢ Features from Layer 5 and Layer 7 are used.
  • A lower value means the occlusion changes the features more consistently across the images,
    i.e. the parts correspond more strongly. (A small sketch follows.)
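A minimal numeric sketch of Delta_l with random stand-in features; real features would come from Layer 5 or Layer 7 of the trained model.

    import numpy as np

    rng = np.random.default_rng(0)
    orig = rng.normal(size=(5, 256))            # x_i^l : layer-l features of the 5 original dog images
    occl = rng.normal(size=(5, 256))            # x~_i^l: features of the same images with one part occluded

    eps = np.sign(orig - occl)                  # sign(epsilon_i^l) for each image
    delta = sum(np.sum(eps[i] != eps[j])        # Hamming distance between the sign vectors
                for i in range(5) for j in range(5) if i != j)
    print(delta)                                # lower => the occlusion changes the features consistently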
Convnet Visualization (10)
(figure: the five dog images and occluded parts used in the correspondence experiment)
Convnet Visualization (11)
▪ Experimental results
  • In Layer 5, the scores for the right eye, the left eye, and the nose are clearly lower than
    for the random occlusion.
  • In Layer 7, on the other hand, there is little difference from random.
    ➢ A likely reason is that Layer 7 is the layer that tries to discriminate between dog breeds.
5. Experiments (1)
▪ 5.1 ImageNet 2012
  • The (Krizhevsky et al. 2012) architecture is replicated almost exactly (within 0.1% error).
  • The modifications suggested by the earlier visualizations are then applied to fix the
    problems that were found.
  • Combining several models finally yields the lowest error rate of 14.8% (test top-5).
5. Experiments (2)
▪ Varying ImageNet Model Sizes
  • The (Krizhevsky et al. 2012) architecture is studied by resizing or removing layers.
  • The conclusion is that the depth of the model is important for obtaining good performance.
5. Experiments (3)
▪ 5.2 Feature Generalization
  • Tests whether the features generalize well to other datasets.
    ➢ Datasets
      - Caltech-101 (Fei-Fei et al. 2006), Caltech-256 (Griffin et al. 2006)
      - PASCAL VOC 2012
  • Layers 1-7 of the existing model are kept fixed, and only the final softmax classifier is
    retrained on the new dataset. (A sketch of this transfer setup follows.)
  • Because the new training sets may contain images that also appear in ImageNet,
    such images are removed using normalized correlation.
  • Accuracy is measured while varying the number of training images per class:
    ➢ Caltech-101: 15 or 30 training images per class, tested on 50 images per class
    ➢ Caltech-256: 15, 30, 45, or 60 images per class
  • For PASCAL 2012, images contain multiple objects, so the authors note that this does not fit
    their model, which makes a single prediction per image, particularly well.
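A hedged sketch of that transfer setup: freeze the pretrained layers and train only a new softmax classifier on the target dataset. The stand-in network, feature size, and class count (101, as for Caltech-101) are assumptions.

    import torch
    import torch.nn as nn

    pretrained = nn.Sequential(                 # stand-in for layers 1-7 of the ImageNet-trained model
        nn.Conv2d(3, 16, 5), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, 4096))
    for p in pretrained.parameters():
        p.requires_grad = False                 # layers 1-7 stay fixed

    classifier = nn.Linear(4096, 101)           # new softmax head for the target dataset
    optimizer = torch.optim.SGD(classifier.parameters(), lr=1e-2)

    x = torch.rand(8, 3, 64, 64)                # toy batch from the new dataset
    y = torch.randint(0, 101, (8,))
    loss = nn.CrossEntropyLoss()(classifier(pretrained(x)), y)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    print(loss.item())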
5. Experiments (4)
▪ Experimental results (1)
  (tables: Caltech-101 classification accuracy, Caltech-256 classification accuracy;
   plot: Caltech-256 classification performance)
5. Experiments (5)
▪ Experimental results (2)
  • [A]: (Sande et al. 2012), [B]: (Yan et al. 2012)
  • The mean performance is 3.2% lower than the best previously reported model,
    but the model performs better on 5 classes.
5. Experiments (6)
▪ Feature Analysis
  • Examines how discriminative the features of the ImageNet-pretrained model are at each layer.
  • The number of layers used is varied, and either a linear SVM or a softmax classifier is
    trained on top of the features.
  • The datasets used are Caltech-101 and Caltech-256.
  • Experimental results: see the tables in the slides. (A small sketch of this per-layer
    evaluation follows.)
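A hedged sketch of the per-layer evaluation: train a linear SVM on the features taken from each layer and compare accuracies. The features below are random stand-ins rather than real layer activations.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    labels = rng.integers(0, 10, size=200)
    for layer in range(1, 8):
        feats = rng.normal(size=(200, 64))                       # stand-in for this layer's features
        Xtr, Xte, ytr, yte = train_test_split(feats, labels, random_state=0)
        acc = LinearSVC().fit(Xtr, ytr).score(Xte, yte)          # linear SVM on top of the features
        print(f"layer {layer}: accuracy {acc:.2f}")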
6. Discussion
▪ Proposed a way to visualize convolutional neural network models.
▪ Showed that debugging the model with these visualizations leads to better architectures.
▪ The occlusion experiments showed that the model, although trained for classification,
  is highly sensitive to local structure in the image.
▪ The layer-removal experiments showed that having a minimum depth is important for
  model performance.
▪ Showed that the model generalizes well to other datasets.
Thank you
Reference
▪ M. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," 2014.
▪ https://qiita.com/knao124/items/fdb47674ada389e70c6e
▪ http://ferguson.tistory.com/5