20. NBI (Narrow-Band Imaging)
(Reference: a 2006 Japanese overview of NBI, AFI, and IRI endoscopy; the citation details were lost in extraction.)
(Diagram: in the light source unit, a xenon lamp shines through an RGB rotary filter; with the NBI filter ON, only narrow bands at 415 nm (B) and 540 nm (G) illuminate the mucosa. The CCD signal goes to the video processor, where a color transform produces the NBI image shown on the monitor; with the filter OFF, normal light is used.)
Source: http://www.olympus.co.jp/jp/technology/technology/luceraelite/
25. NBI Magnifying Classification (NBI: Narrow-Band Imaging)
• Classifies lesions by microvessel architecture and surface (pit) appearance under NBI magnification:
– Type A
– Type B
– Type C, subdivided into C1, C2, and C3 by the visibility of pits and the irregularity of vessels; Type C3 shows avascular areas (AVA)
The NBI classification of [H. Kanao et al., '09]; Type C3 is associated with deep submucosal (sm) invasion.
29. Texture analysis approach
Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047–1051 (2010).
30. Bag-of-Visual-Words Approach
Learning: local features are described for training images of Type A, Type B, and Type C3, vector-quantized in feature space, and accumulated into visual-word histograms that train a classifier.
Classification: the test image goes through the same local-feature description, vector quantization, and histogram construction, and its histogram is passed to the classifier to obtain the classification result.
(Diagram: description of local features + bag-of-features; the example feature-vector values shown on the slide are omitted.)
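The pipeline above — describe local features, vector-quantize them against a codebook, and classify the resulting histogram — can be sketched in a few lines. The codebook and feature values below are toy numbers, not those of the actual system.

```python
# A minimal sketch of the Bag-of-Visual-Words histogram construction.

def quantize(feature, codebook):
    """Return the index of the nearest visual word (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(feature, codebook[i]))

def bovw_histogram(features, codebook):
    """Vector-quantize local features and build a normalized visual-word histogram."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[quantize(f, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Toy example: a 3-word codebook in 2-D feature space and four local features.
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
features = [(0.5, 0.1), (9.0, 1.0), (0.2, 9.5), (0.1, 0.3)]
hist = bovw_histogram(features, codebook)
```

In the learning stage these histograms (one per training image) would be fed to a classifier such as an SVM; in the classification stage the same function is applied to the test image.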
31. Object → Bag of "words"
Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
32. Analogy to documents
Of all the sensory impressions proceeding to
the brain, the visual experiences are the
dominant ones. Our perception of the world
around us is based essentially on the
messages that reach the brain from our eyes.
For a long time it was thought that the retinal
image was transmitted point by point to visual
centers in the brain; the cerebral cortex was a
movie screen, so to speak, upon which the
image in the eye was projected. Through the
discoveries of Hubel and Wiesel we now
know that behind the origin of the visual
perception in the brain there is a considerably
more complicated course of events. By
following the visual impulses along their path
to the various cell layers of the optical cortex,
Hubel and Wiesel have been able to
demonstrate that the message about the
image falling on the retina undergoes a step-
wise analysis in a system of nerve cells
stored in columns. In this system each cell
has its specific function and is responsible for
a specific detail in the pattern of the retinal
image.
sensory, brain,
visual, perception,
retinal, cerebral cortex,
eye, cell, optical
nerve, image
Hubel, Wiesel
China is forecasting a trade surplus of $90bn
(£51bn) to $100bn this year, a threefold
increase on 2004's $32bn. The Commerce
Ministry said the surplus would be created by
a predicted 30% jump in exports to $750bn,
compared with a 18% rise in imports to
$660bn. The figures are likely to further
annoy the US, which has long argued that
China's exports are unfairly helped by a
deliberately undervalued yuan. Beijing
agrees the surplus is too high, but says the
yuan is only one factor. Bank of China
governor Zhou Xiaochuan said the country
also needed to do more to boost domestic
demand so more goods stayed within the
country. China increased the value of the
yuan against the dollar by 2.1% in July and
permitted it to trade within a narrow band, but
the US wants the yuan to be allowed to trade
freely. However, Beijing has made it clear that
it will take its time and tread carefully before
allowing the yuan to rise further in value.
China, trade,
surplus, commerce,
exports, imports, US,
yuan, bank, domestic,
foreign, increase,
trade, value
Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
33. Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
34. Histograms
Bag-of-Visual-Words approach: classifying image patches of lesions [Tamaki et al., 2013]
• Trained on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87).
• Type C1 and Type C2 are excluded because many of their regions are indistinct.
• Best recognition rate: 96%.
Recognition flow (the Bag-of-Visual-Words framework):
1. Extract features from the training images (Type A / Type B / Type C3) and cluster them; the representative values become the Visual Words.
2. Build a Visual-Words histogram for each training image and train an SVM on the histograms.
3. For a test image (Type ?), build its Visual-Words histogram and classify it with the SVM.
46. (Recap of slide 34: the Bag-of-Visual-Words framework for classifying lesion image patches [Tamaki et al., 2013].)
56. Related Work
Adapting Visual Category Models to New Domains [Saenko et al., ECCV2010]
• Problem setting: recognize Source and Target jointly (Source: x, Target: y).
• Learns a matrix A from pairwise constraints; for each class:
(x_i − y_j)^T A (x_i − y_j) ≤ upper bound  (same class)
(x_i − y_j)^T A (x_i − y_j) ≥ lower bound  (different class)
(A^(1/2) maps the two domains so that same-class pairs move closer and different-class pairs move apart.)
➢ This method has hyperparameters that must be tuned.
Our Approach
• Problem setting: recognize the Target only.
• Finds the matrix W that converts Target into Source:
W = arg min_W ||x − W y||²_F
➢ Our method has no hyperparameters.
57. Convert Histogram
1. Treat the Visual-Words histograms as vectors and stack them into matrices X = [x_1, ..., x_N] (Source) and Y = [y_1, ..., y_N] (Target).
2. Find the transformation matrix W that minimizes the histogram-to-histogram error with ADMM*:
arg min_W ||X − W Y||²_F   subject to W_ij ≥ 0
*ADMM solution (for each row W_n): the constrained problem is split as
arg min_{W_n} Σ_m (x_{n,m} − W_n y_m)² + (ρ/2) ||W_n − z_n + u_n||²_2
and solved by iterating the updates
W_n^(k+1) = (Σ_m y_m y_m^T + ρE)^(−1) (Σ_m x_{n,m} y_m + ρ (z_n^k − u_n^k))
z_n^(k+1) = Π_C (W_n^(k+1) + u_n^k)
u_n^(k+1) = u_n^k + W_n^(k+1) − z_n^(k+1)
where ρ is the penalty parameter, E the identity matrix, and Π_C the projection onto the nonnegative constraint set.
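The row-wise ADMM scheme above can be sketched as follows; `rho` and the iteration count are illustrative choices, and the slide's exact stopping rule is not reproduced.

```python
# Nonnegative least squares, argmin_W ||X - W Y||_F^2 s.t. W_ij >= 0,
# solved one row of W at a time by ADMM, as outlined on the slide.
import numpy as np

def nnls_admm(X, Y, rho=1.0, iters=500):
    """X: (D, N) source histograms, Y: (d, N) target histograms; returns W (D, d)."""
    D, N = X.shape
    d = Y.shape[0]
    G = Y @ Y.T                                   # shared Gram matrix sum_m y_m y_m^T
    A_inv = np.linalg.inv(G + rho * np.eye(d))    # factor once, reuse for every row
    W = np.zeros((D, d))
    for n in range(D):                            # each row W_n is an independent subproblem
        b = Y @ X[n]                              # sum_m x_{n,m} y_m
        z = np.zeros(d)
        u = np.zeros(d)
        for _ in range(iters):
            w = A_inv @ (b + rho * (z - u))       # quadratic step
            z = np.maximum(w + u, 0.0)            # projection onto W_ij >= 0
            u = u + w - z                         # dual update
        W[n] = z
    return W
```

Because the quadratic term is shared across rows, the matrix inverse is computed only once.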
58. How to Make Pseudo Dataset
• Because the new endoscope is expected to produce crisper, more vivid images, we apply to the Source images:
① contrast enhancement: a tone curve that maps the input range 42–213 linearly to the output range 0–255
② a sharpening filter: a 3×3 kernel built from the 1/9 box filter, with center weight 25/9
(Figure: Source and pseudo-Target example images, the tone curve, and the filter kernels.)
• This method cannot be used without correspondences between training images.
➢ In practice, obtaining corresponding image pairs is difficult.
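The two pseudo-target transforms might be sketched as below. The 42–213 tone-curve range and the 25/9 center weight follow the slide; treating the sharpening as I + 2(I − box filter), which gives off-center weights −2/9, and the edge-replication padding are assumptions about details the slide does not spell out.

```python
# Pseudo-target generation: tone-curve contrast stretch + 3x3 sharpening.
import numpy as np

def contrast_stretch(img, lo=42, hi=213):
    """Map the pixel range [lo, hi] linearly to [0, 255], clipping outside it."""
    out = (img.astype(float) - lo) * 255.0 / (hi - lo)
    return np.clip(out, 0, 255)

def sharpen(img):
    """3x3 sharpening I + 2*(I - box): center 25/9, off-center -2/9 (sums to 1)."""
    k = -2.0 * np.ones((3, 3)) / 9.0
    k[1, 1] = 25.0 / 9.0                      # center weight shown on the slide
    p = np.pad(img.astype(float), 1, mode="edge")
    H, W = img.shape
    out = np.zeros((H, W))
    for dy in range(3):                        # plain convolution, no dependencies
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy + H, dx:dx + W]
    return np.clip(out, 0, 255)
```

Since the kernel weights sum to 1, flat regions are preserved while edges are boosted.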
60. Related Works
Cross-Domain Transform [Saenko et al., ECCV2010]:
min tr(W) − log det W
s.t. W ⪰ 0
||x^s_i − x^t_j||_W ≤ upper bound,  (x^s_i, x^t_j) ∈ the same class
||x^s_i − x^t_j||_W ≥ lower bound,  (x^s_i, x^t_j) ∈ different classes
➢ Estimates a transformation matrix that minimizes a Mahalanobis distance.
➢ Considers only the transformed feature distributions.
➢ Does not ensure the classification result.
Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR2013]:
min_{W,θ,b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k}
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
➢ Optimizes the transformation matrix and the SVM parameters at the same time.
➢ Ensures the classification result.
➢ Does not guarantee the transformed feature distributions.
W: transformation matrix; θ_k: SVM parameters; ξ^s, ξ^t: slack variables; y_{i,k}: indicator function.
61. Proposed Method
Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2)
➢ Adds L2 distance constraints to MMDT.
➢ Our method ensures both the classification result and the transformed feature distributions.
min_{W,θ,b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ||W x^t_i − x^s_j||²_2
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
The added D term is the constraint that keeps the transformed target close to the source.
62. Decomposition into Sub-problems
Hoffman et al. decompose the MMDT objective function into two sub-problems; our method likewise decomposes its objective as below, and the objective is optimized by iterating (1) and (2).
(1) Objective function for optimizing the SVM parameters:
min_{θ,ξ^s,ξ^t} (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^N Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k}
(2) Objective function for optimizing the transformation matrix (the D term keeps the transformed target close to the source):
min_{W,ξ^t} (1/2)||W||²_F + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ||W x^t_i − x^s_j||²_2
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
63. Primal Problem
Derived from the objective function for the transformation matrix, using the vectorizations
U(x) = blockdiag(x x^T, ..., x x^T),  v_{i,j} = vec(x^s_j (x^t_i)^T),  w = vec(W),  φ(x) = vec(θ x^T):
(2) min_{w,ξ^t} (1/2)||w||²_2 + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N ( w^T U(x^t_i) w − 2 v_{i,j}^T w + (x^s_j)^T x^s_j )
s.t. ξ^t_i ≥ 0,  y^t_{i,k} φ_k^T(x^t_i) w ≥ 1 − ξ^t_{i,k}
This is a standard quadratic program, but:
□ High computational cost.
□ Requires a huge amount of memory.
□ Depends on the dimensionality of the data.
→ Derive the dual problem.
64. Dual Problem
(2) max_a −(1/2) Σ_{k1=1}^K Σ_{k2=1}^K Σ_{i=1}^M Σ_{j=1}^M a_i a_j y^t_{i,k1} y^t_{j,k2} φ_{k1}^T(x^t_i) V^(−1) φ_{k2}(x^t_j) + Σ_{k=1}^K Σ_{i=1}^M a_i ( 1 − D φ_k^T(x^t_i) V^(−1) Σ_{m=1}^M Σ_{n=1}^N y_{m,n} v_{m,n} )
s.t. 0 ≤ a_i ≤ C_t,  Σ_{i=1}^M a_i y^t_{i,k} = 0
where the a_i are Lagrange multipliers and
V = I + D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} U(x^t_i)
The dual problem has many advantages:
□ Low computational cost.
□ The problem is sparse.
□ Depends on the number of target data, not on the feature dimensionality.
65. Comparison of Primal and Dual Computation Time
SetupTime: time to compute the coefficients (e.g., U(x) and v_{i,j}).
OptimizationTime: time to solve the quadratic program for w (primal) or a (dual).
CalculationTime: time to recover w from a (dual only).
(Bar chart, Visual Words = 128: total computation time of the primal vs. the dual; the dual is about 14 times faster.)
66. Result
MMDTL2 achieves performance equivalent to the baseline, but "not transfer" gives the best performance.
(Plot: recognition rate (0.4–1.0) vs. number of Visual Words (8–1024) for Baseline, Source only, Not transfer, MMDT, and MMDTL2.)
73. (Plot: classification probability (0–1) vs. frame number (0–200) for Type A and Type B: the per-frame outputs for an NBI video classified among Type A / Type B / Type C3.)
74. Smoothing with an MRF/HMM
f(x | y) ∝ exp( Σ_i A(x_i, y_i) ) · exp( Σ_i Σ_{j∈N_i} I(x_i, x_j) )
x: sequence of frame labels; y: per-frame SVM outputs.
(Diagram: a chain of hidden labels x_1, ..., x_200, taking values such as B and C3, each connected to its per-frame observation y_1, ..., y_200.)
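A minimal sketch of this kind of chain smoothing, using Viterbi decoding: A(x_i, y_i) is taken as the log of the per-frame SVM probability and I(x_i, x_j) as a constant reward `stay` for keeping the same label on neighboring frames. These are illustrative choices, not the slides' exact potentials.

```python
# MAP label sequence under a chain MRF with unary log-probabilities and a
# constant same-label pairwise reward, via Viterbi dynamic programming.
import math

def viterbi_smooth(probs, stay=2.0):
    """probs: list of per-frame class-probability lists. Returns the MAP label path."""
    K = len(probs[0])
    eps = 1e-12                          # guard against log(0)
    score = [math.log(p + eps) for p in probs[0]]
    back = []
    for frame in probs[1:]:
        new, ptr = [], []
        for k in range(K):
            # best predecessor: staying on the same label earns the +stay reward
            best = max(range(K), key=lambda j: score[j] + (stay if j == k else 0.0))
            ptr.append(best)
            new.append(score[best] + (stay if best == k else 0.0)
                       + math.log(frame[k] + eps))
        score, back = new, back + [ptr]
    path = [max(range(K), key=lambda k: score[k])]
    for ptr in reversed(back):           # backtrack to recover the full path
        path.append(ptr[path[-1]])
    return path[::-1]
```

With `stay=0` the result reduces to the per-frame argmax; larger values suppress isolated label flips, which is the intended smoothing effect.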
75. (Plots, frame numbers 0–200: smoothing results for a Type B sequence, comparing the original output with DP smoothing at thresholds 0.8, 0.9, 0.99, and 0.999, and with Gibbs sampling at p4 = 0.6, 0.7, 0.8, and 0.9; and for a Type A_1 sequence, comparing the original with DP_0.99 and Gibbs_p4 = 0.9. MAP labels over Type A / Type B / Type C3.)
76. (Plot: MRF-smoothed probability (0–1) vs. frame number (0–200) for Type A, Type B, and Type C3.)
77. (Example images labeled Type A / Type B / Type C3.)
79. Possible Cause of Instability
□ Classification results would be affected by out-of-focus frames.
Recognition results for out-of-focus images:
□ Test images: 1191
➢ Test images have Gaussian blur added with different SDs (no defocus, SD = 0.5, 1, 2, 3, 5, 7, 9, 11).
□ Training images: 480
➢ 160 images per class.
(Plot: recognition rate [%] vs. number of visual words (10–10000), one curve per blur SD from smaller to larger.)
80. Particle Filter (Online Bayesian Filtering)
State vector: x_t = (x_t^(A), x_t^(B), x_t^(C3)),  x_t^(A) + x_t^(B) + x_t^(C3) = 1
Observation vector: y_t = (y_t^(A), y_t^(B), y_t^(C3)),  y_t^(A) + y_t^(B) + y_t^(C3) = 1
t: time
Prediction (with state transition p(x_t | x_{t−1}, θ_1)):
p(x_t | y_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}) dx_{t−1}
Update (with likelihood p(y_t | x_t, θ_2)):
p(x_t | y_{1:t}) ∝ p(y_t | x_t, θ_2) p(x_t | y_{1:t−1})
We use Dirichlet distributions for both the state transition and the likelihood.
81. Dirichlet Distribution
Dir_x[α] = Γ(Σ_{i=1}^N α_i) / Π_{i=1}^N Γ(α_i) · Π_{i=1}^N x_i^(α_i − 1)
Parameter of the distribution: α(x) = a x + b
(Ternary density plots for α = (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00), (1.00, 1.76, 2.35), (4.00, 4.00, 4.00), and (3.40, 6.00, 8.00), ordered from low to high concentration.)
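The Dirichlet density above can be written out directly with the standard-library Gamma function; the evaluation points below are purely illustrative.

```python
# Dir_x[alpha] = Gamma(sum a_i) / prod Gamma(a_i) * prod x_i^(a_i - 1)
import math

def dirichlet_pdf(x, alpha):
    """Density of the Dirichlet distribution at point x on the simplex."""
    norm = math.gamma(sum(alpha))          # Gamma(sum a_i)
    for a in alpha:
        norm /= math.gamma(a)              # divide by each Gamma(a_i)
    p = norm
    for xi, a in zip(x, alpha):
        p *= xi ** (a - 1.0)
    return p
```

For α = (1, 1, 1) the density is constant (= 2) over the whole 3-class simplex, which matches the flat panel in the plots above.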
82. Problem & Our Approach
Dirichlet Particle Filter (DPF): graphical model with states x_{t−1}, x_t, x_{t+1} and observations y_{t−1}, y_t, y_{t+1} (likelihood parameter θ_2).
Defocus-aware Dirichlet Particle Filter (D-DPF): adds per-frame defocus variables γ_t and z_t alongside y_t.
Prediction (state transition):
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
Update (likelihood):
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with p(y_t, γ_t, z_t | x_t) = p(y_t | x_t, γ_t) p(z_t | γ_t).
83. Isolated Pixel Ratio (IPR) [Oh et al., MedIA2007]
(Figure: an endoscopic image and its edge pixels from a Canny edge detector; clear regions yield connected edges, while defocused regions yield isolated edge pixels.)
IPR: the percentage of isolated pixels among all edge pixels.
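A sketch of the IPR computation on a binary edge map (e.g., the output of a Canny detector): an edge pixel counts as isolated when none of its 8 neighbors is an edge pixel. The toy edge map below is illustrative.

```python
# Isolated Pixel Ratio: (# isolated edge pixels) / (# edge pixels).

def isolated_pixel_ratio(edges):
    """edges: 2-D list of 0/1 values from an edge detector."""
    H, W = len(edges), len(edges[0])
    total = isolated = 0
    for y in range(H):
        for x in range(W):
            if not edges[y][x]:
                continue
            total += 1
            # check the 8-connected neighborhood for another edge pixel
            has_neighbor = any(
                edges[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0) and 0 <= y + dy < H and 0 <= x + dx < W
            )
            isolated += 0 if has_neighbor else 1
    return isolated / total if total else 0.0

# Toy map: a 3-pixel connected edge segment plus one isolated pixel -> IPR = 0.25
edge_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
```

In a defocused frame the detector produces many scattered, disconnected edge responses, so the ratio rises.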
84. Modeling with the Rayleigh Distribution and IPR
(Histogram: frequency of IPR values in 0–0.015; fitted Rayleigh densities over γ_t for sigma = 0.5, 1, 2, 3, 4; Dirichlet distribution panel. A plot of σ(z_t) over z_t ∈ [0, 0.015], labeled from Defocus to Clear.)
Ray_x[σ] = (x / σ²) exp( −x² / (2σ²) )
σ(z_t) = 4 exp( 100 log(0.25) z_t )
p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)]
85. Sequential Filtering
Prediction:
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
with state transition p(x_t | x_{t−1}, θ_1) = Dir_{x_t}[α_1(x_{t−1}, θ_1)].
Update:
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with likelihood p(y_t, γ_t, z_t | x_t) = p(y_t | x_t, γ_t) p(z_t | γ_t), where
p(y_t | x_t, γ_t) = Dir_{x_t}[α_2(y_t, γ_t)] and p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)].
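A compact sketch of the Dirichlet particle filter built from these densities (without the defocus terms): particles live on the 3-class simplex, are propagated through a Dirichlet transition with parameter a1·x_{t−1} + b1, and are reweighted by the Dirichlet likelihood evaluated at each particle with parameter a2·y_t + b2. The constants a1, b1, a2, b2 and the particle count are illustrative, not the slides' fitted values.

```python
# One predict/update/resample step of a Dirichlet particle filter.
import math
import numpy as np

def dir_pdf(x, alpha):
    """Dirichlet density at point x with parameter vector alpha."""
    norm = math.gamma(alpha.sum())
    for a in alpha:
        norm /= math.gamma(a)
    p = norm
    for xi, a in zip(x, alpha):
        p *= xi ** (a - 1.0)
    return p

def dpf_step(particles, y, rng, a1=20.0, b1=1.0, a2=10.0, b2=1.0):
    """particles: (P, 3) points on the simplex; y: observed class probabilities."""
    # Prediction: draw each new particle from Dir[a1 * x_{t-1} + b1]
    pred = np.array([rng.dirichlet(a1 * p + b1) for p in particles])
    # Update: weight by the likelihood Dir_x[a2 * y + b2]
    w = np.array([dir_pdf(p, a2 * y + b2) for p in pred])
    w /= w.sum()
    # Resample in proportion to the weights
    idx = rng.choice(len(pred), size=len(pred), p=w)
    return pred[idx]
```

Repeating the step over a sequence of observations and averaging the particles gives the smoothed per-frame class probabilities.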
86. The Performance for Defocus Frames
(Plots over 600 frames: ground truth, raw observations, IPR, result by DPF, and result by D-DPF.)
87. Smoothing Result for an Actual NBI Video
(Video frames comparing the result without smoothing and with smoothing, over the classes Type A, Type B, and Type C3.)