20. NBI (Narrow-Band Imaging)
(Reference: a 2006 Japanese overview of NBI, AFI, and IRI endoscopy; the citation details were lost in extraction.)
(Diagram: in the light source unit, a xenon lamp shines through an RGB rotary filter; with the NBI filter ON, only narrow bands at 415 nm (B) and 540 nm (G) illuminate the mucosa. The CCD signal goes to the video processor, where a color transform produces the NBI image shown on the monitor; with the filter OFF, normal light is used.)
Source: http://www.olympus.co.jp/jp/technology/technology/luceraelite/
25. NBI Magnifying Classification (NBI: Narrow-Band Imaging)
• Classifies lesions by microvessel architecture and surface (pit) appearance under NBI magnification:
– Type A
– Type B
– Type C, subdivided into C1, C2, and C3 by the visibility of pits and the irregularity of vessels; Type C3 shows avascular areas (AVA)
The NBI classification of [H. Kanao et al., '09]; Type C3 is associated with deep submucosal (sm) invasion.
29. Texture analysis approach
Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047–1051 (2010).
30. Bag-of-Visual-Words Approach
Learning: local features are described for training images of Type A, Type B, and Type C3, vector-quantized in feature space, and accumulated into visual-word histograms that train a classifier.
Classification: the test image goes through the same local-feature description, vector quantization, and histogram construction, and its histogram is passed to the classifier to obtain the classification result.
(Diagram: description of local features + bag-of-features; the example feature-vector values shown on the slide are omitted.)
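The pipeline above — describe local features, vector-quantize them against a codebook, and classify the resulting histogram — can be sketched in a few lines. The codebook and feature values below are toy numbers, not those of the actual system.

```python
# A minimal sketch of the Bag-of-Visual-Words histogram construction.

def quantize(feature, codebook):
    """Return the index of the nearest visual word (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(feature, codebook[i]))

def bovw_histogram(features, codebook):
    """Vector-quantize local features and build a normalized visual-word histogram."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[quantize(f, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Toy example: a 3-word codebook in 2-D feature space and four local features.
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
features = [(0.5, 0.1), (9.0, 1.0), (0.2, 9.5), (0.1, 0.3)]
hist = bovw_histogram(features, codebook)
```

In the learning stage these histograms (one per training image) would be fed to a classifier such as an SVM; in the classification stage the same function is applied to the test image.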
31. Object → Bag of "words"
Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
32. Analogy to documents
Of all the sensory impressions proceeding to
the brain, the visual experiences are the
dominant ones. Our perception of the world
around us is based essentially on the
messages that reach the brain from our eyes.
For a long time it was thought that the retinal
image was transmitted point by point to visual
centers in the brain; the cerebral cortex was a
movie screen, so to speak, upon which the
image in the eye was projected. Through the
discoveries of Hubel and Wiesel we now
know that behind the origin of the visual
perception in the brain there is a considerably
more complicated course of events. By
following the visual impulses along their path
to the various cell layers of the optical cortex,
Hubel and Wiesel have been able to
demonstrate that the message about the
image falling on the retina undergoes a step-
wise analysis in a system of nerve cells
stored in columns. In this system each cell
has its specific function and is responsible for
a specific detail in the pattern of the retinal
image.
sensory, brain,
visual, perception,
retinal, cerebral cortex,
eye, cell, optical
nerve, image
Hubel, Wiesel
China is forecasting a trade surplus of $90bn
(£51bn) to $100bn this year, a threefold
increase on 2004's $32bn. The Commerce
Ministry said the surplus would be created by
a predicted 30% jump in exports to $750bn,
compared with a 18% rise in imports to
$660bn. The figures are likely to further
annoy the US, which has long argued that
China's exports are unfairly helped by a
deliberately undervalued yuan. Beijing
agrees the surplus is too high, but says the
yuan is only one factor. Bank of China
governor Zhou Xiaochuan said the country
also needed to do more to boost domestic
demand so more goods stayed within the
country. China increased the value of the
yuan against the dollar by 2.1% in July and
permitted it to trade within a narrow band, but
the US wants the yuan to be allowed to trade
freely. However, Beijing has made it clear that
it will take its time and tread carefully before
allowing the yuan to rise further in value.
China, trade,
surplus, commerce,
exports, imports, US,
yuan, bank, domestic,
foreign, increase,
trade, value
Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
33. Slide by Li Fei-Fei at CVPR2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
34. Histograms
Bag-of-Visual-Words approach: classifying image patches of lesions [Tamaki et al., 2013]
• Trained on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87).
• Type C1 and Type C2 are excluded because many of their regions are indistinct.
• Best recognition rate: 96%.
Recognition flow (the Bag-of-Visual-Words framework):
1. Extract features from the training images (Type A / Type B / Type C3) and cluster them; the representative values become the Visual Words.
2. Build a Visual-Words histogram for each training image and train an SVM on the histograms.
3. For a test image (Type ?), build its Visual-Words histogram and classify it with the SVM.
46. (Recap of slide 34: the Bag-of-Visual-Words framework for classifying lesion image patches [Tamaki et al., 2013].)
56. Related Work
Adapting Visual Category Models to New Domains [Saenko et al., ECCV2010]
• Problem setting: recognize Source and Target jointly (Source: x, Target: y).
• Learns a matrix A from pairwise constraints; for each class:
(x_i − y_j)^T A (x_i − y_j) ≤ upper bound  (same class)
(x_i − y_j)^T A (x_i − y_j) ≥ lower bound  (different class)
(A^(1/2) maps the two domains so that same-class pairs move closer and different-class pairs move apart.)
➢ This method has hyperparameters that must be tuned.
Our Approach
• Problem setting: recognize the Target only.
• Finds the matrix W that converts Target into Source:
W = arg min_W ||x − W y||²_F
➢ Our method has no hyperparameters.
57. Convert Histogram
1. Treat the Visual-Words histograms as vectors and stack them into matrices X = [x_1, ..., x_N] (Source) and Y = [y_1, ..., y_N] (Target).
2. Find the transformation matrix W that minimizes the histogram-to-histogram error with ADMM*:
arg min_W ||X − W Y||²_F   subject to W_ij ≥ 0
*ADMM solution (for each row W_n): the constrained problem is split as
arg min_{W_n} Σ_m (x_{n,m} − W_n y_m)² + (ρ/2) ||W_n − z_n + u_n||²_2
and solved by iterating the updates
W_n^(k+1) = (Σ_m y_m y_m^T + ρE)^(−1) (Σ_m x_{n,m} y_m + ρ (z_n^k − u_n^k))
z_n^(k+1) = Π_C (W_n^(k+1) + u_n^k)
u_n^(k+1) = u_n^k + W_n^(k+1) − z_n^(k+1)
where ρ is the penalty parameter, E the identity matrix, and Π_C the projection onto the nonnegative constraint set.
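The row-wise ADMM scheme above can be sketched as follows; `rho` and the iteration count are illustrative choices, and the slide's exact stopping rule is not reproduced.

```python
# Nonnegative least squares, argmin_W ||X - W Y||_F^2 s.t. W_ij >= 0,
# solved one row of W at a time by ADMM, as outlined on the slide.
import numpy as np

def nnls_admm(X, Y, rho=1.0, iters=500):
    """X: (D, N) source histograms, Y: (d, N) target histograms; returns W (D, d)."""
    D, N = X.shape
    d = Y.shape[0]
    G = Y @ Y.T                                   # shared Gram matrix sum_m y_m y_m^T
    A_inv = np.linalg.inv(G + rho * np.eye(d))    # factor once, reuse for every row
    W = np.zeros((D, d))
    for n in range(D):                            # each row W_n is an independent subproblem
        b = Y @ X[n]                              # sum_m x_{n,m} y_m
        z = np.zeros(d)
        u = np.zeros(d)
        for _ in range(iters):
            w = A_inv @ (b + rho * (z - u))       # quadratic step
            z = np.maximum(w + u, 0.0)            # projection onto W_ij >= 0
            u = u + w - z                         # dual update
        W[n] = z
    return W
```

Because the quadratic term is shared across rows, the matrix inverse is computed only once.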
58. How to Make Pseudo Dataset
• Because the new endoscope is expected to produce crisper, more vivid images, we apply to the Source images:
① contrast enhancement: a tone curve that maps the input range 42–213 linearly to the output range 0–255
② a sharpening filter: a 3×3 kernel built from the 1/9 box filter, with center weight 25/9
(Figure: Source and pseudo-Target example images, the tone curve, and the filter kernels.)
• This method cannot be used without correspondences between training images.
➢ In practice, obtaining corresponding image pairs is difficult.
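The two pseudo-target transforms might be sketched as below. The 42–213 tone-curve range and the 25/9 center weight follow the slide; treating the sharpening as I + 2(I − box filter), which gives off-center weights −2/9, and the edge-replication padding are assumptions about details the slide does not spell out.

```python
# Pseudo-target generation: tone-curve contrast stretch + 3x3 sharpening.
import numpy as np

def contrast_stretch(img, lo=42, hi=213):
    """Map the pixel range [lo, hi] linearly to [0, 255], clipping outside it."""
    out = (img.astype(float) - lo) * 255.0 / (hi - lo)
    return np.clip(out, 0, 255)

def sharpen(img):
    """3x3 sharpening I + 2*(I - box): center 25/9, off-center -2/9 (sums to 1)."""
    k = -2.0 * np.ones((3, 3)) / 9.0
    k[1, 1] = 25.0 / 9.0                      # center weight shown on the slide
    p = np.pad(img.astype(float), 1, mode="edge")
    H, W = img.shape
    out = np.zeros((H, W))
    for dy in range(3):                        # plain convolution, no dependencies
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy + H, dx:dx + W]
    return np.clip(out, 0, 255)
```

Since the kernel weights sum to 1, flat regions are preserved while edges are boosted.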
60. Related Works
Cross-Domain Transform [Saenko et al., ECCV2010]:
min tr(W) − log det W
s.t. W ⪰ 0
||x^s_i − x^t_j||_W ≤ upper bound,  (x^s_i, x^t_j) ∈ the same class
||x^s_i − x^t_j||_W ≥ lower bound,  (x^s_i, x^t_j) ∈ different classes
➢ Estimates a transformation matrix that minimizes a Mahalanobis distance.
➢ Considers only the transformed feature distributions.
➢ Does not ensure the classification result.
Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR2013]:
min_{W,θ,b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k}
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
➢ Optimizes the transformation matrix and the SVM parameters at the same time.
➢ Ensures the classification result.
➢ Does not guarantee the transformed feature distributions.
W: transformation matrix; θ_k: SVM parameters; ξ^s, ξ^t: slack variables; y_{i,k}: indicator function.
61. Proposed Method
Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2)
➢ Adds L2 distance constraints to MMDT.
➢ Our method ensures both the classification result and the transformed feature distributions.
min_{W,θ,b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ||W x^t_i − x^s_j||²_2
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
The added D term is the constraint that keeps the transformed target close to the source.
62. Decomposition into Sub-problems
Hoffman et al. decompose the MMDT objective function into two sub-problems; our method likewise decomposes its objective as below, and the objective is optimized by iterating (1) and (2).
(1) Objective function for optimizing the SVM parameters:
min_{θ,ξ^s,ξ^t} (1/2) Σ_{k=1}^K ||θ_k||²_2 + C_s Σ_{i=1}^N Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k}
(2) Objective function for optimizing the transformation matrix (the D term keeps the transformed target close to the source):
min_{W,ξ^t} (1/2)||W||²_F + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ||W x^t_i − x^s_j||²_2
s.t. y^s_{i,k} (θ_k^T x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
y^t_{j,k} θ_k^T W x^t_j ≥ 1 − ξ^t_{j,k}
ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
63. Primal Problem
Derived from the objective function for the transformation matrix, using the vectorizations
U(x) = blockdiag(x x^T, ..., x x^T),  v_{i,j} = vec(x^s_j (x^t_i)^T),  w = vec(W),  φ(x) = vec(θ x^T):
(2) min_{w,ξ^t} (1/2)||w||²_2 + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N ( w^T U(x^t_i) w − 2 v_{i,j}^T w + (x^s_j)^T x^s_j )
s.t. ξ^t_i ≥ 0,  y^t_{i,k} φ_k^T(x^t_i) w ≥ 1 − ξ^t_{i,k}
This is a standard quadratic program, but:
□ High computational cost.
□ Requires a huge amount of memory.
□ Depends on the dimensionality of the data.
→ Derive the dual problem.
64. Dual Problem
(2) max_a −(1/2) Σ_{k1=1}^K Σ_{k2=1}^K Σ_{i=1}^M Σ_{j=1}^M a_i a_j y^t_{i,k1} y^t_{j,k2} φ_{k1}^T(x^t_i) V^(−1) φ_{k2}(x^t_j) + Σ_{k=1}^K Σ_{i=1}^M a_i ( 1 − D φ_k^T(x^t_i) V^(−1) Σ_{m=1}^M Σ_{n=1}^N y_{m,n} v_{m,n} )
s.t. 0 ≤ a_i ≤ C_t,  Σ_{i=1}^M a_i y^t_{i,k} = 0
where the a_i are Lagrange multipliers and
V = I + D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} U(x^t_i)
The dual problem has many advantages:
□ Low computational cost.
□ The problem is sparse.
□ Depends on the number of target data, not on the feature dimensionality.
65. Comparison of Primal and Dual Computation Time
SetupTime: time to compute the coefficients (e.g., U(x) and v_{i,j}).
OptimizationTime: time to solve the quadratic program for w (primal) or a (dual).
CalculationTime: time to recover w from a (dual only).
(Bar chart, Visual Words = 128: total computation time of the primal vs. the dual; the dual is about 14 times faster.)
66. Result
MMDTL2 achieves performance equivalent to the baseline, but "not transfer" gives the best performance.
(Plot: recognition rate (0.4–1.0) vs. number of Visual Words (8–1024) for Baseline, Source only, Not transfer, MMDT, and MMDTL2.)
73. (Plot: classification probability (0–1) vs. frame number (0–200) for Type A and Type B: the per-frame outputs for an NBI video classified among Type A / Type B / Type C3.)
74. Smoothing with an MRF/HMM
f(x | y) ∝ exp( Σ_i A(x_i, y_i) ) · exp( Σ_i Σ_{j∈N_i} I(x_i, x_j) )
x: sequence of frame labels; y: per-frame SVM outputs.
(Diagram: a chain of hidden labels x_1, ..., x_200, taking values such as B and C3, each connected to its per-frame observation y_1, ..., y_200.)
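A minimal sketch of this kind of chain smoothing, using Viterbi decoding: A(x_i, y_i) is taken as the log of the per-frame SVM probability and I(x_i, x_j) as a constant reward `stay` for keeping the same label on neighboring frames. These are illustrative choices, not the slides' exact potentials.

```python
# MAP label sequence under a chain MRF with unary log-probabilities and a
# constant same-label pairwise reward, via Viterbi dynamic programming.
import math

def viterbi_smooth(probs, stay=2.0):
    """probs: list of per-frame class-probability lists. Returns the MAP label path."""
    K = len(probs[0])
    eps = 1e-12                          # guard against log(0)
    score = [math.log(p + eps) for p in probs[0]]
    back = []
    for frame in probs[1:]:
        new, ptr = [], []
        for k in range(K):
            # best predecessor: staying on the same label earns the +stay reward
            best = max(range(K), key=lambda j: score[j] + (stay if j == k else 0.0))
            ptr.append(best)
            new.append(score[best] + (stay if best == k else 0.0)
                       + math.log(frame[k] + eps))
        score, back = new, back + [ptr]
    path = [max(range(K), key=lambda k: score[k])]
    for ptr in reversed(back):           # backtrack to recover the full path
        path.append(ptr[path[-1]])
    return path[::-1]
```

With `stay=0` the result reduces to the per-frame argmax; larger values suppress isolated label flips, which is the intended smoothing effect.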
75. (Plots, frame numbers 0–200: smoothing results for a Type B sequence, comparing the original output with DP smoothing at thresholds 0.8, 0.9, 0.99, and 0.999, and with Gibbs sampling at p4 = 0.6, 0.7, 0.8, and 0.9; and for a Type A_1 sequence, comparing the original with DP_0.99 and Gibbs_p4 = 0.9. MAP labels over Type A / Type B / Type C3.)
76. (Plot: MRF-smoothed probability (0–1) vs. frame number (0–200) for Type A, Type B, and Type C3.)
77. (Example images labeled Type A / Type B / Type C3.)
79. Possible Cause of Instability
□ Classification results would be affected by out-of-focus frames.
Recognition results for out-of-focus images:
□ Test images: 1191
➢ Test images have Gaussian blur added with different SDs (no defocus, SD = 0.5, 1, 2, 3, 5, 7, 9, 11).
□ Training images: 480
➢ 160 images per class.
(Plot: recognition rate [%] vs. number of visual words (10–10000), one curve per blur SD from smaller to larger.)
80. Particle Filter (Online Bayesian Filtering)
State vector: x_t = (x_t^(A), x_t^(B), x_t^(C3)),  x_t^(A) + x_t^(B) + x_t^(C3) = 1
Observation vector: y_t = (y_t^(A), y_t^(B), y_t^(C3)),  y_t^(A) + y_t^(B) + y_t^(C3) = 1
t: time
Prediction (with state transition p(x_t | x_{t−1}, θ_1)):
p(x_t | y_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}) dx_{t−1}
Update (with likelihood p(y_t | x_t, θ_2)):
p(x_t | y_{1:t}) ∝ p(y_t | x_t, θ_2) p(x_t | y_{1:t−1})
We use Dirichlet distributions for both the state transition and the likelihood.
81. Dirichlet Distribution
Dir_x[α] = Γ(Σ_{i=1}^N α_i) / Π_{i=1}^N Γ(α_i) · Π_{i=1}^N x_i^(α_i − 1)
Parameter of the distribution: α(x) = a x + b
(Ternary density plots for α = (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00), (1.00, 1.76, 2.35), (4.00, 4.00, 4.00), and (3.40, 6.00, 8.00), ordered from low to high concentration.)
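The Dirichlet density above can be written out directly with the standard-library Gamma function; the evaluation points below are purely illustrative.

```python
# Dir_x[alpha] = Gamma(sum a_i) / prod Gamma(a_i) * prod x_i^(a_i - 1)
import math

def dirichlet_pdf(x, alpha):
    """Density of the Dirichlet distribution at point x on the simplex."""
    norm = math.gamma(sum(alpha))          # Gamma(sum a_i)
    for a in alpha:
        norm /= math.gamma(a)              # divide by each Gamma(a_i)
    p = norm
    for xi, a in zip(x, alpha):
        p *= xi ** (a - 1.0)
    return p
```

For α = (1, 1, 1) the density is constant (= 2) over the whole 3-class simplex, which matches the flat panel in the plots above.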
82. Problem & Our Approach
Dirichlet Particle Filter (DPF): graphical model with states x_{t−1}, x_t, x_{t+1} and observations y_{t−1}, y_t, y_{t+1} (likelihood parameter θ_2).
Defocus-aware Dirichlet Particle Filter (D-DPF): adds per-frame defocus variables γ_t and z_t alongside y_t.
Prediction (state transition):
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
Update (likelihood):
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with p(y_t, γ_t, z_t | x_t) = p(y_t | x_t, γ_t) p(z_t | γ_t).
83. Isolated Pixel Ratio (IPR) [Oh et al., MedIA2007]
(Figure: an endoscopic image and its edge pixels from a Canny edge detector; clear regions yield connected edges, while defocused regions yield isolated edge pixels.)
IPR: the percentage of isolated pixels among all edge pixels.
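A sketch of the IPR computation on a binary edge map (e.g., the output of a Canny detector): an edge pixel counts as isolated when none of its 8 neighbors is an edge pixel. The toy edge map below is illustrative.

```python
# Isolated Pixel Ratio: (# isolated edge pixels) / (# edge pixels).

def isolated_pixel_ratio(edges):
    """edges: 2-D list of 0/1 values from an edge detector."""
    H, W = len(edges), len(edges[0])
    total = isolated = 0
    for y in range(H):
        for x in range(W):
            if not edges[y][x]:
                continue
            total += 1
            # check the 8-connected neighborhood for another edge pixel
            has_neighbor = any(
                edges[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0) and 0 <= y + dy < H and 0 <= x + dx < W
            )
            isolated += 0 if has_neighbor else 1
    return isolated / total if total else 0.0

# Toy map: a 3-pixel connected edge segment plus one isolated pixel -> IPR = 0.25
edge_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
```

In a defocused frame the detector produces many scattered, disconnected edge responses, so the ratio rises.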
84. Modeling with the Rayleigh Distribution and IPR
(Histogram: frequency of IPR values in 0–0.015; fitted Rayleigh densities over γ_t for sigma = 0.5, 1, 2, 3, 4; Dirichlet distribution panel. A plot of σ(z_t) over z_t ∈ [0, 0.015], labeled from Defocus to Clear.)
Ray_x[σ] = (x / σ²) exp( −x² / (2σ²) )
σ(z_t) = 4 exp( 100 log(0.25) z_t )
p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)]
85. Sequential Filtering
Prediction:
p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
with state transition p(x_t | x_{t−1}, θ_1) = Dir_{x_t}[α_1(x_{t−1}, θ_1)].
Update:
p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
with likelihood p(y_t, γ_t, z_t | x_t) = p(y_t | x_t, γ_t) p(z_t | γ_t), where
p(y_t | x_t, γ_t) = Dir_{x_t}[α_2(y_t, γ_t)] and p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)].
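A compact sketch of the Dirichlet particle filter built from these densities (without the defocus terms): particles live on the 3-class simplex, are propagated through a Dirichlet transition with parameter a1·x_{t−1} + b1, and are reweighted by the Dirichlet likelihood evaluated at each particle with parameter a2·y_t + b2. The constants a1, b1, a2, b2 and the particle count are illustrative, not the slides' fitted values.

```python
# One predict/update/resample step of a Dirichlet particle filter.
import math
import numpy as np

def dir_pdf(x, alpha):
    """Dirichlet density at point x with parameter vector alpha."""
    norm = math.gamma(alpha.sum())
    for a in alpha:
        norm /= math.gamma(a)
    p = norm
    for xi, a in zip(x, alpha):
        p *= xi ** (a - 1.0)
    return p

def dpf_step(particles, y, rng, a1=20.0, b1=1.0, a2=10.0, b2=1.0):
    """particles: (P, 3) points on the simplex; y: observed class probabilities."""
    # Prediction: draw each new particle from Dir[a1 * x_{t-1} + b1]
    pred = np.array([rng.dirichlet(a1 * p + b1) for p in particles])
    # Update: weight by the likelihood Dir_x[a2 * y + b2]
    w = np.array([dir_pdf(p, a2 * y + b2) for p in pred])
    w /= w.sum()
    # Resample in proportion to the weights
    idx = rng.choice(len(pred), size=len(pred), p=w)
    return pred[idx]
```

Repeating the step over a sequence of observations and averaging the particles gives the smoothed per-frame class probabilities.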
86. The Performance for Defocus Frames
(Plots over 600 frames: ground truth, raw observations, IPR, result by DPF, and result by D-DPF.)
87. Smoothing Result for an Actual NBI Video
(Video frames comparing the result without smoothing and with smoothing, over the classes Type A, Type B, and Type C3.)