A Colorectal Cancer Recognition System for Colonoscopy
Bisser Raytchev, Toru Tamaki, and many others
http://www.sciencekids.co.nz/pictures/humanbody/braintomography.html
http://www.sciencekids.co.nz/pictures/humanbody/heartsurfaceanatomy.html
http://sozai.rash.jp/medical/p/000154.html
http://sozai.rash.jp/medical/p/000152.html
http://medical.toykikaku.com/ / /
http://www.sciencekids.co.nz/pictures/humanbody/humanorgans.html
• CT
•
•
• NBI
http://sozai.rash.jp/medical/p/000154.html
http://sozai.rash.jp/medical/p/000152.html
http://medical.toykikaku.com/ / /
Colorectal cancer in Japan:
• Incidence: 235,000 new cases per year (2009)†
• Deaths: 42,434 per year†
  – a 1.7-fold increase over the last 20 years
  – the 3rd leading cause of cancer death
• 5-year survival rate‡: about 20% at the most advanced stage
† http://www.mhlw.go.jp/toukei/saikin/
‡ http://www.gunma-cc.jp/sarukihan/seizonritu/index.html
[Chart: 5-year survival rate [%]‡ by stage (stage 1 through stage 4); stage 1 (earliest) is about 100%.]
[Chart: fatalities of colorectal cancer by year, 1990–2009 (0–50,000).]
http://www.mhlw.go.jp/toukei/saikin/hw/jinkou/geppo/nengai11/kekka03.html#k3_2
http://ameblo.jp/gomora16610/entry-10839715830.html
http://daichou.com/ben.htm
Colonoscopy:
• a CCD camera at the tip of the scope
• magnification up to about ×100
• the endoscopist's judgment is subjective: "I think this is a cancer…"
http://www.ajinomoto-seiyaku.co.jp/newsrelease/2004/1217.html
http://yotsuba-clinic.jp/WordPress/?p=63
http://www.oiya-clinic.jp/inform3.html
https://www.youtube.com/watch?v=40L-y9rNOzw
Capture ~Setup~
[Diagram: the NBI endoscope (scope, NBI light source, video processor, recorder, scope connector) and the processing PC.]
Magnifying NBI endoscopy: ×70–100 magnification
NBI (Narrow Band Imaging)
Reference (in Japanese): a text on image-enhanced endoscopy –NBI, AFI, IRI–, 2006.
[Diagram: NBI principle. A xenon lamp in the light source unit shines through an RGB rotary filter; with the NBI filter ON, only narrow bands at 415 nm (B) and 540 nm (G) illuminate the mucosa. The reflected light is captured by the CCD, color-transformed in the video processor, and shown on the monitor; with the filter OFF, normal white light is used.]
http://www.olympus.co.jp/jp/technology/technology/luceraelite/
http://cancernavi.nikkeibp.co.jp/daicho/worry/post_2.html
[Diagram: the endoscopist must choose among several lesion types ("or … or … or") and judge subjectively: "I think this is a cancer…"]
Related work on colorectal image classification:
• Oh, MIA, '07; Sundaram et al., MIA, '08; Diaz & Rao, PRL, '07; Al-Kadi, PR, '10; Gunduz-Demir et al., MIA, '10; Tosun, PR, '09
• Pit-pattern: Häfner et al., PAA, '09; Häfner, ICPR, '10; Häfner, PR, '09; Kwitt & Uhl, ICCV, '07; Tischendrof et al., Endoscopy, '10
• NBI: Stehle, MI, '09; Gross, MI, '08; PRMU, '10; Tamaki et al., ACCV, '10
Pit-pattern classification [S. Tanaka et al., '06]
• classifies lesions by the shapes of the pits (gland openings) on the mucosal surface
[Diagram: pit-pattern types, including the S / L subtypes and the I / N subtypes, related to invasion depth (m / sm).]
NBI magnification findings (NBI: Narrow-band Imaging) [H. Kanao et al., '09]
• classifies lesions by microvessel patterns rather than pits
[Table: Type A, Type B, Type C (subdivided into C1, C2, C3); Type C3 shows avascular areas (AVA) and correlates with deep (sm) invasion.]
Texture analysis approach:
Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047-1051 (Nov. 2010).
Bag-of-Visual Words Approach
[Diagram: local features (128-D descriptor vectors) are extracted from Type A / Type B / Type C3 training images; vector quantization against a codebook in feature space turns each image into a visual-word histogram; the histograms train a classifier, and a test image is quantized and classified the same way.]
Description of local features + Bag-of-features
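The whole pipeline fits in a few lines. Below is a minimal sketch, not the authors' implementation: the SIFT step is stubbed with random descriptors, and the image names, labels, and codebook size (64) are placeholder assumptions.

```python
# Minimal Bag-of-Visual-Words sketch (assumes scikit-learn and numpy;
# the deck itself used OpenCV/VLFeat).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def extract_descriptors(image):
    # Stand-in for dense SIFT: one 128-D descriptor per grid point.
    return rng.normal(size=(200, 128))

train_images = [f"img_{i}" for i in range(30)]   # placeholder names
train_labels = rng.integers(0, 3, size=30)       # A / B / C3 -> 0 / 1 / 2

# 1. Pool descriptors from all training images and cluster them:
pooled = np.vstack([extract_descriptors(im) for im in train_images])
codebook = KMeans(n_clusters=64, n_init=4, random_state=0).fit(pooled)

def bovw_histogram(image):
    # 2. Vector-quantize each descriptor to its nearest visual word,
    #    then build a normalized histogram over the 64 words.
    words = codebook.predict(extract_descriptors(image))
    hist = np.bincount(words, minlength=64).astype(float)
    return hist / hist.sum()

# 3. Train a classifier on the histograms and classify a test image:
X = np.array([bovw_histogram(im) for im in train_images])
clf = SVC(kernel="linear").fit(X, train_labels)
print(clf.predict(bovw_histogram("test_image").reshape(1, -1)))
```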
Object Bag of words
Slide by Li Fei-Fei at CVPR 2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
Analogy to documents
Of all the sensory impressions proceeding to
the brain, the visual experiences are the
dominant ones. Our perception of the world
around us is based essentially on the
messages that reach the brain from our eyes.
For a long time it was thought that the retinal
image was transmitted point by point to visual
centers in the brain; the cerebral cortex was a
movie screen, so to speak, upon which the
image in the eye was projected. Through the
discoveries of Hubel and Wiesel we now
know that behind the origin of the visual
perception in the brain there is a considerably
more complicated course of events. By
following the visual impulses along their path
to the various cell layers of the optical cortex,
Hubel and Wiesel have been able to
demonstrate that the message about the
image falling on the retina undergoes a step-
wise analysis in a system of nerve cells
stored in columns. In this system each cell
has its specific function and is responsible for
a specific detail in the pattern of the retinal
image.
sensory, brain,
visual, perception,
retinal, cerebral cortex,
eye, cell, optical
nerve, image
Hubel, Wiesel
China is forecasting a trade surplus of $90bn
(£51bn) to $100bn this year, a threefold
increase on 2004's $32bn. The Commerce
Ministry said the surplus would be created by
a predicted 30% jump in exports to $750bn,
compared with a 18% rise in imports to
$660bn. The figures are likely to further
annoy the US, which has long argued that
China's exports are unfairly helped by a
deliberately undervalued yuan. Beijing
agrees the surplus is too high, but says the
yuan is only one factor. Bank of China
governor Zhou Xiaochuan said the country
also needed to do more to boost domestic
demand so more goods stayed within the
country. China increased the value of the
yuan against the dollar by 2.1% in July and
permitted it to trade within a narrow band, but
the US wants the yuan to be allowed to trade
freely. However, Beijing has made it clear that
it will take its time and tread carefully before
allowing the yuan to rise further in value.
China, trade,
surplus, commerce,
exports, imports, US,
yuan, bank, domestic,
foreign, increase,
trade, value
Slide by Li Fei-Fei at CVPR 2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
Bag-of-Visual Words framework
Classifying image patches of lesions [Tamaki et al., 2013]:
• trained on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87)
• Types C1 and C2 are excluded because they contain many ambiguous regions
• best recognition rate: 96%
Recognition flow:
1. extract features from the training images and cluster them; the cluster representatives become the visual words
2. build a visual-word histogram for each training image and train an SVM
3. build the visual-word histogram of a test image ("Type ?") and classify it
Features: gridSIFT
• Scale Invariant Feature Transform (SIFT) [Lowe, '99]
  – 128-dimensional descriptor
  – keypoints detected with DoG (Difference of Gaussians)
• grid-sampled SIFT (gridSIFT)
  – SIFT descriptors computed at fixed grid points instead of DoG keypoints
  – parameters: grid spacing and scale (descriptor) size
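A grid-sampled SIFT sketch, assuming an OpenCV build that ships SIFT (cv2.SIFT_create, opencv-python >= 4.4); the grid spacing and scale values are illustrative defaults, not the tuned settings.

```python
# gridSIFT sketch: descriptors at fixed grid points instead of DoG keypoints.
import cv2

def grid_sift(gray, grid_space=5, scale_size=8.0):
    h, w = gray.shape
    # One keypoint per grid node; 'size' plays the role of the SIFT scale.
    kps = [cv2.KeyPoint(float(x), float(y), scale_size)
           for y in range(0, h, grid_space)
           for x in range(0, w, grid_space)]
    kps, desc = cv2.SIFT_create().compute(gray, kps)
    return desc  # shape: (number of grid points, 128)

gray = cv2.imread("patch.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
print(grid_sift(gray).shape)
```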
Classifier: Support Vector Machine (SVM)
• kernels compared:
  – linear: k_linear(u, v) = u · v
  – Radial basis function (RBF): k_RBF(u, v) = exp(−γ ||u − v||²)
  – χ²: k_χ²(u, v) = exp(−γ Σ_i (u_i − v_i)² / (u_i + v_i))
• multi-class strategy: One-Versus-One
• margin maximization: max 2/||w||  subject to  y_i w·φ(x_i) ≥ 1  (margin = 2/||w||)
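One plausible way to wire the χ² kernel into an SVM is scikit-learn's chi2_kernel with a precomputed-kernel SVC (which also happens to use the same One-Versus-One multi-class strategy); the gamma value and the random data below are assumptions.

```python
# χ² kernel SVM sketch using scikit-learn.
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.random((100, 64))          # BoVW histograms (non-negative)
y_train = rng.integers(0, 3, size=100)   # A / B / C3
X_test = rng.random((5, 64))

# chi2_kernel(u, v) = exp(-gamma * sum((u - v)^2 / (u + v)))
K_train = chi2_kernel(X_train, gamma=0.5)
clf = SVC(kernel="precomputed").fit(K_train, y_train)

K_test = chi2_kernel(X_test, X_train, gamma=0.5)  # rows: test, cols: train
print(clf.predict(K_test))
```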
Dataset:
• rectangular patches trimmed from lesion areas, labeled by type
• patch sizes: from 100×300 to 900×800 [pix.]
• 908 images (Type A: 359, Type B: 462, Type C3: 87)
Results <10-fold Cross Validation>
[Charts vs. number of visual words (10–100,000): correct rate [%] (best 96.00%), and per-type recall and precision [%] for Type A, Type B, Type C3.]
Results <Holdout Testing>
[Charts vs. number of visual words (10–100,000): correct rate [%] (best 92.86%), and per-type recall and precision [%] for Type A, Type B, Type C3.]
MOTIVATION
• Labeled NBI images (A / B / C3) are hard to collect, while unlabeled patches are plentiful.
ABSTRACT
• Use self-training to exploit unlabeled samples in addition to the labeled training set [Yoshimuta et al., '10]

Key Idea: Self-training
• classify unlabeled samples with the current classifier (a minimal sketch follows)
• Accept confident predictions as new training samples; Reject the rest
POINT
1. …
2. …
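A minimal self-training loop sketch; the fixed confidence threshold and linear SVM are assumptions, and the exact accept/reject criteria of [Yoshimuta et al., '10] are not reproduced.

```python
# Self-training sketch: grow the labeled set with confident predictions.
import numpy as np
from sklearn.svm import SVC

def self_training(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    X, y = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        clf = SVC(kernel="linear", probability=True).fit(X, y)
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        conf = proba.max(axis=1)
        accept = conf >= threshold           # Accept confident predictions...
        if not accept.any():
            break                            # ...Reject the rest and stop.
        X = np.vstack([X, pool[accept]])
        y = np.concatenate([y, clf.classes_[proba[accept].argmax(axis=1)]])
        pool = pool[~accept]                 # shrink the unlabeled pool
    return clf
```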
Labeled samples
• image patches trimmed from lesion areas
• sizes: 100×300 to 900×800 [pix.]
Type A: 359   Type B: 462   Type C3: 87   Total: 908
Unlabeled samples
• 10 patches* sampled from each labeled image
• sizes: 30×30 to 250×250 [pix.]
• their labels are hidden during self-training
* 10 patches per labeled image
Type A: 3590   Type B: 4610   Type C3: 870   Total: 9070
Result
[Chart: recognition rate (0.90–0.96) for Algorithm 1, Algorithm 2, and Algorithm 3; * p = 0.013314]
Recap: the Bag-of-Visual Words framework of [Tamaki et al., 2013] described above (908 NBI training images; best recognition rate 96%).
Grid spacing and recognition rate
Ø Extracting more features improves the recognition rate [Jurie et al., 2005]
Ø Narrowing the feature-extraction grid spacing improves recognition [Yoshimuta et al., 2011]

Grid spacing:            15 [pixel]   10 [pixel]   5 [pixel]
Best recognition rate:   92.11 [%]    93.89 [%]    96.00 [%]
Training time:           ~13 min      ~30 min      ~3 h

Spacing ×2/3 (15→10): 2.25× the features, +1.78%; spacing ×1/2 (10→5): 4× the features, +2.11%.
Problem: training time grows with the number of extracted features.
Training images: 908 NBI images (Type A: 359, Type B: 462, Type C3: 87)
Conventional visual-word construction:
1. extract features on a grid from all training images I = {I_n | n = 1, …, N}
2. cluster all extracted features; the cluster centers become the visual words
3. extract features on the same grid from each training image I_n ∈ I
4. vector-quantize them to obtain the visual-word histogram
Proposed visual-word construction (fewer features for clustering & denser features for histograms; sketched below):
1. extract a small number of features from all training images I = {I_n | n = 1, …, N}
2. cluster this small feature set; the cluster centers become the visual words
3. extract many features on a dense grid from each training image I_n ∈ I
4. vector-quantize them to obtain the visual-word histogram
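The contrast between the two procedures can be sketched as follows; the array sizes and the subsample of ~10,000 descriptors (cf. the 19,742 of 8,678,198 used in the experiments) are illustrative.

```python
# Speed-up sketch: cluster a small random subset of descriptors to build the
# visual words, but quantize the densely sampled descriptors for histograms.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
dense = rng.normal(size=(100_000, 128))   # dense-grid descriptors, all images

# Conventional: KMeans over all descriptors (slow).
# Proposed: cluster only a small subsample:
subset = dense[rng.choice(len(dense), 10_000, replace=False)]
codebook = KMeans(n_clusters=1024, n_init=1, random_state=0).fit(subset)

# Histograms still use every densely sampled descriptor of each image:
words = codebook.predict(dense[:4000])    # one image's descriptors
hist = np.bincount(words, minlength=1024) / 4000.0
```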
Goal: confirm reduced training time and improved recognition rate.
Execution environment: OS: Linux Fedora 18; CPU: Intel Xeon E5-2620; Memory: 128GB
Classifier:
Ø Linear SVM
Training images:
Ø 908 labeled NBI images (Type A: 359, Type B: 462, Type C3: 87)
Visual-word construction: reduce the number of features used (19,742 instead of 8,678,198)
Histogram construction: increase the number of features (grid spacing 5, 2, 1 [pixel])
Evaluation:
Ø total training time vs. number of features (confirm the time reduction)
Ø recognition rate vs. grid spacing (confirm the accuracy improvement)
[Chart: training CPU time [sec] at 32 visual words —
conventional (spacing 5): 10233.45;
proposed (spacing 5): 680.72 (6.6% of conventional);
proposed (spacing 2): 4167.89 (40.7%);
proposed (spacing 1): 16471.8 (160.9%).]
Training time is reduced at grid spacings of 5 and 2 [pixel]; at 1 [pixel] it increases.
[Chart: correct rate (0.80–0.98) vs. number of visual words (32–16384) for the conventional method (spacing 5) and the proposed method (spacings 5, 2, 1).]
Spacing 5 [pixel] clearly differs from spacings 2 and 1 [pixel]; spacings 2 and 1 [pixel] show no large difference.
Problem
Old and new endoscopes coexist: their optics differ, so the captured images differ, and so do the feature distributions.

                 Old endoscopy (EVIS LUCERA)   New endoscopy (EVIS LUCERA ELITE)
Viewing angle    140° (WIDE), 80° (TELE)       170° (WIDE), 90° (TELE)
Resolution       1440×1080                     1980×1080

Ø the new endoscope is wider-angle, higher-resolution, and brighter

Recognition performance drops on the new endoscope; old and new training images cannot simply be mixed (recognition assumes the training and test distributions match), and collecting new-endoscope training images is difficult:
Ø cancer patients are not numerous
Ø images can be captured only during examinations
Ø only physicians can label the images
Ø the newest device has just appeared; we are in a transition period
http://www.olympus.co.jp/jp/technology/technology/luceraelite/
Objective
Solution: transform the new endoscope's features into the old endoscope's feature space, then train.

Framework of Transfer Learning
The two kinds of images are related:
• train on old endoscope, test on old endoscope → works
• train on old endoscope, test on new endoscope → recognition rate drops
→ convert the target (new-endoscope) features to the source (old-endoscope) domain.
Related Work
Adapting Visual Category Models to New Domains [Saenko et al., ECCV 2010]
• a setting that recognizes Source (x) and Target (y) jointly; finds, for each class, a matrix A satisfying constraints between Source and Target:
  – same class: (x_i − y_j)ᵀ A (x_i − y_j) ≤ upper bound
  – different class: (x_i − y_j)ᵀ A (x_i − y_j) ≥ lower bound
  (A^{1/2} maps Target features toward the Source)
Ø that method has hyperparameters that must be tuned

Our Approach
• we only need to recognize the Target: find the matrix W that converts Target to Source,
  W ← arg min_W ||x − W y||²_F
Ø our method has no hyperparameters
Convert Histogram
1. Treat the visual-word histograms as vectors and stack them into matrices
   X = (x_1, …, x_N) (Source), Y = (y_1, …, y_N) (Target)
2. Find the transform W that minimizes the error between corresponding histograms, solved with ADMM*:
   arg min_W ||X − W Y||²_F   subject to   W_ij ≥ 0

*ADMM solution (for each row n = 1, …, N), iterating:
   W_n^{k+1} = (Σ_{n=1}^N y_n y_nᵀ + E)⁻¹ (Σ_{n=1}^N y_n x_nᵀ + z_n^k − u_n^k)
   z_n^{k+1} = π_c(W_n^{k+1} + u_n^k)    (projection onto W_ij ≥ 0)
   u_n^{k+1} = u_n^k + W_n^{k+1} − z_n^{k+1}
How to Make Pseudo Dataset
l The new endoscope is expected to look sharper and more vivid, so the Source images are processed with:
  ① contrast enhancement: a linear tone curve mapping input [42, 213] to output [0, 255]
  ② a 3×3 sharpening filter with center weight 25/9 and negative off-center weights
l This transform-learning scheme needs corresponding pairs of training images
  Ø in practice, obtaining corresponding image pairs is difficult, hence the pseudo dataset
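A sketch of the pseudo-target generation; the [42, 213] → [0, 255] tone curve follows the slide, while the off-center kernel weights of −2/9 (chosen so the kernel sums to 1 around the 25/9 center) are one plausible reading, not a value stated in the deck.

```python
# Pseudo-target generation: contrast stretch + 3x3 sharpening filter.
import numpy as np
import cv2

def pseudo_target(img):
    # (1) contrast enhancement: linear stretch of [42, 213] -> [0, 255]
    stretched = np.clip((img.astype(np.float32) - 42.0) * (255.0 / (213.0 - 42.0)),
                        0, 255)
    # (2) sharpening filter; off-center -2/9 is an assumption (kernel sums to 1)
    kernel = np.full((3, 3), -2.0 / 9.0, dtype=np.float32)
    kernel[1, 1] = 25.0 / 9.0
    sharp = cv2.filter2D(stretched, -1, kernel)
    return np.clip(sharp, 0, 255).astype(np.uint8)
```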
Result
[Chart: recognition rates for four settings —
n train: Source, test: Source;
n train: Source, test: Target;
n train: Source + Target, test: Target;
n train: Source + transformed Target, test: Target.]
Transferring the target features yields almost the same recognition rate as testing on the old endoscope.
Related Works
Cross-Domain Transform [Saenko et al., ECCV 2010]
  min tr(W) − log det W
  s.t. W ⪰ 0
       ||x^s_i − x^t_j||_W ≤ upper bound, (x^s_i, x^t_j) in the same class
       ||x^s_i − x^t_j||_W ≥ lower bound, (x^s_i, x^t_j) in different classes
Ø estimates a transformation matrix that minimizes a Mahalanobis distance
Ø considers only the transformed feature distributions
Ø does not ensure the classification result

Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR 2013]
  min_{W, θ, b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²₂
              + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k}
  s.t. y^s_{i,k} (θ_kᵀ x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
       y^t_{j,k} θ_kᵀ W x^t_j ≥ 1 − ξ^t_{j,k}
       ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
Ø optimizes the transformation matrix and the SVM parameters at the same time
Ø ensures the classification result
Ø does not guarantee the transformed feature distributions

W: transform matrix;  θ_k: SVM parameter;  ξ^s, ξ^t: slack variables;  y_{i,k}: indicator function
Proposed Method
Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2)
  min_{W, θ, b} (1/2)||W||²_F + (1/2) Σ_{k=1}^K ||θ_k||²₂ + C_s Σ_{i=1}^n Σ_{k=1}^K ξ^s_{i,k}
              + C_t Σ_{j=1}^m Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} ||W x^t_i − x^s_j||²₂
  s.t. y^s_{i,k} (θ_kᵀ x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
       y^t_{j,k} θ_kᵀ W x^t_j ≥ 1 − ξ^t_{j,k}
       ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
Ø adds L2 distance constraints to MMDT: the last term pulls transformed target features toward same-class source features
Ø our method ensures both the classification result and the transformed feature distributions
Decompose to Sub-problem
Hoffman et al. decompose the MMDT objective into two sub-problems; our method decomposes likewise, and the objective is optimized by iterating (1) and (2):

(1) optimize the SVM parameters:
  min_{θ, ξ^s, ξ^t} (1/2) Σ_{k=1}^K ||θ_k||²₂ + C_s Σ_{i=1}^N Σ_{k=1}^K ξ^s_{i,k} + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k}

(2) optimize the transform matrix (the L2 term keeps the transformed target close to the source):
  min_{W, ξ^t} (1/2)||W||²_F + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k} + (1/2) D Σ_{i=1}^M Σ_{j=1}^M y_{i,j} ||W x^t_i − x^s_j||²₂

both subject to
  y^s_{i,k} (θ_kᵀ x^s_i − b_k) ≥ 1 − ξ^s_{i,k}
  y^t_{j,k} θ_kᵀ W x^t_j ≥ 1 − ξ^t_{j,k}
  ξ^s_{i,k} ≥ 0,  ξ^t_{j,k} ≥ 0
Primal Problem
Derived from the objective for the transform matrix, with the vectorizations
  w = vec(W),  v_{i,j} = vec(x^s_j (x^t_i)ᵀ),  φ(x) = vec(θ xᵀ),
  U(x) = blockdiag(x xᵀ, …, x xᵀ):

(2) min_{w, ξ^t} (1/2)||w||²₂ + C_t Σ_{j=1}^M Σ_{k=1}^K ξ^t_{j,k}
      + (1/2) D Σ_{i=1}^M Σ_{j=1}^M ( wᵀ U(x^t_i) w − 2 v_{i,j}ᵀ w + (x^t_i)ᵀ x^s_j )
    s.t. ξ^t_i ≥ 0,  y^t_{i,k} φ_k(x^t_i)ᵀ w ≥ 1 − ξ^t_{i,k}

This is a standard quadratic program, but:
p high computational costs
p needs huge memory
p depends on the dimensionality of the data
→ we derive the dual problem instead.
Dual Problem
(2) max_a −(1/2) Σ_{k1=1}^K Σ_{k2=1}^K Σ_{i=1}^M Σ_{j=1}^M a_i a_j y^t_{i,k1} y^t_{j,k2} φ_{k1}(x^t_i)ᵀ V⁻¹ φ_{k2}(x^t_j)
      + Σ_{k=1}^K Σ_{i=1}^M a_i ( 1 − D φ_k(x^t_i)ᵀ V⁻¹ Σ_{m=1}^M Σ_{n=1}^N y_{m,n} v_{i,j} )
    s.t. 0 ≤ a_i ≤ C_t,  Σ_{i=1}^M a_i y^t_{i,k} = 0
where  V = I + D Σ_{i=1}^M Σ_{j=1}^N y_{i,j} U(x^t_i),  a_i: Lagrange multipliers

The dual problem has many advantages:
p low computational cost
p defined as a sparse problem
p depends on the number of target data, not the feature dimension
Comparison of Primal and Dual Computation Time
SetupTime: time to compute the coefficients (e.g. U(x) and v_{i,j}).
OptimizationTime: time to solve the quadratic program (for w in the primal, a in the dual).
CalculationTime: time to recover w from a (dual only).
[Chart: computation time (0–7000) for Primal vs. Dual at 128 visual words, broken into SetupTime, OptimizationTime, and CalculationTime; the dual is about 14 times faster.]
Result
MMDTL2 achieves performance on par with the baseline, but "not transfer" is the best performer here.
[Chart: recognition rate (0.4–1.0) vs. number of visual words (8–1024) for Baseline, Source only, Not transfer, MMDT, and MMDTL2.]
Online recognition: 14.7 [fps]
• SIFT features are extracted from a 120×120 [pix.] patch
• the patch's visual-word histogram is classified by an SVM into one class (A or B or C3)
• the probabilities of A, B, and C3 (0–1) are plotted over time
Objective
Connect the processing PC to the NBI endoscope and enable online recognition.

System configuration:
• NBI endoscope*1: OLYMPUS EVIS LUCERA ELITE
• capture board*2: Blackmagic DeckLink SDI (SDI input, PCI Express)
• processing PC: OS: Windows 7 Home Premium SP1 64bit; CPU: Intel Core i7-4470 3.40GHz; Memory: 16GB
Development environment:
Visual Studio 2012 (retail), OpenCV 3.0-devel, VLFeat 0.9.18, Boost 1.55.0, DeckLink SDK 10.0
*1 http://www.olympus.co.jp/jp/news/2012b/nr121002luceraj.jsp
*2 http://www.genkosha.com/vs/news/entry/sdi.html

Capture ~Setup~
[Photo: the NBI endoscope (with NBI scope) connected to the processing PC.]

Capture ~demo & performance~
[Photo: the NBI endoscope screen and the processing PC screen side by side; the PC performs color conversion and feature extraction per frame.]
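A sketch of the per-frame loop; the real system is C++ with the DeckLink SDK, so cv2.VideoCapture stands in for the capture board here and the classifier is stubbed.

```python
# Per-frame online recognition loop sketch.
import cv2
import numpy as np

def classify_patch(patch):
    # Stand-in for the BoVW histogram + SVM sketched earlier;
    # returns pseudo-probabilities for (A, B, C3).
    p = np.abs(np.random.default_rng().normal(size=3))
    return p / p.sum()

cap = cv2.VideoCapture(0)                  # stand-in for the DeckLink input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)           # color conversion
    h, w = gray.shape
    patch = gray[h//2-60:h//2+60, w//2-60:w//2+60]           # 120x120 center patch
    proba = classify_patch(patch)
    print(dict(zip(["A", "B", "C3"], proba.round(2))))
    if cv2.waitKey(1) == 27:               # Esc quits
        break
cap.release()
```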
[Chart: per-frame probabilities (0–1) of Type A and Type B over frame number (0–200); the raw per-frame outputs fluctuate strongly among Type A, Type B, and Type C3.]
Smoothing with MRF/HMM
  f(x | y) ∝ exp( Σ_i A(x_i, y_i) ) · exp( Σ_i Σ_{j∈N_i} I(x_i, x_j) )
x: smoothed label sequence,  y: per-frame SVM outputs
[Diagram: a chain of hidden labels x_1, …, x_200 (values such as B, C3) over frames i = 0–200, with observations y_1, …, y_200.]
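A sketch of the DP smoothing behind the "DP_0.8 … DP_0.999" runs below: a chain with self-transition probability p_stay, decoded with Viterbi. Using the per-frame SVM probabilities directly as emissions is an assumption; the slide's potentials A and I are not reproduced exactly.

```python
# Viterbi-style DP smoothing of per-frame class probabilities.
import numpy as np

def viterbi_smooth(proba, p_stay=0.99):
    """proba: (T, C) per-frame class probabilities; returns smoothed labels."""
    T, C = proba.shape
    trans = np.full((C, C), (1.0 - p_stay) / (C - 1))
    np.fill_diagonal(trans, p_stay)        # labels prefer to stay put
    logp = np.log(proba + 1e-12)
    score = logp[0].copy()
    back = np.zeros((T, C), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + np.log(trans)   # (from-state, to-state)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]
    labels = np.empty(T, dtype=int)
    labels[-1] = score.argmax()
    for t in range(T - 1, 0, -1):          # backtrack the best path
        labels[t - 1] = back[t, labels[t]]
    return labels
```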
[Charts: smoothed Type B label sequences over frame numbers 0–200 — the original per-frame output vs. DP smoothing with self-transition probability 0.8, 0.9, 0.99, 0.999, and vs. Gibbs sampling with p = 0.6, 0.7, 0.8, 0.9.]
[Charts: probabilities (0–1) of Type A / B / C over frames 20–200 for a Type B video and a Type A video; the Type A_1 sequence compared as original, DP (0.99), and Gibbs (p = 0.9); MAP estimates of the label sequence.]
[Chart: MRF-smoothed probabilities (0–1) of Type A / Type B / Type C3 over frames 0–200.]
Colorectal Tumor Classification System in Magnifying Endoscopic NBI Images [Tamaki et al., MedIA2013]
Recognizing colorectal images:
p Feature: Bag-of-Visual-Words of densely sampled SIFT
p Classifier: Linear SVM
Extended to video frames: display posterior probabilities at each frame.
[Chart: probability (0–1) of A / B / C over frame numbers 251–431.]
Highly unstable classification results.
Possible Cause of Instability
p Classification results appear to be affected by out-of-focus frames.
p Train images: 480 (160 per class); test images: 1191
Ø test images were degraded with Gaussian blur of different SDs
[Chart: recognition rate vs. number of visual words (10–10,000) for no defocus and SD = 0.5, 1, 2, 3, 5, 7, 9, 11; the rate drops as the SD grows.]
Recognition results for out-of-focus images.
Particle Filter (Online Bayesian Filtering)
State vector:        x_t = (x_t^(A), x_t^(B), x_t^(C3)),  x_t^(A) + x_t^(B) + x_t^(C3) = 1
Observation vector:  y_t = (y_t^(A), y_t^(B), y_t^(C3)),  y_t^(A) + y_t^(B) + y_t^(C3) = 1
t: time

Prediction (state transition):
  p(x_t | y_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}) dx_{t−1}
Update (likelihood):
  p(x_t | y_{1:t}) ∝ p(y_t | x_t, θ_2) p(x_t | y_{1:t−1})
We use Dirichlet distributions for both the state transition and the likelihood.
Dirichlet distribution
  Dir_x[α] = ( Γ(Σ_{i=1}^N α_i) / Π_{i=1}^N Γ(α_i) ) Π_{i=1}^N x_i^{α_i − 1}
Parameter of the distribution: α(x) = a x + b
[Plots: Dirichlet densities on the 3-class simplex (low → high) for
α = (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00),
(1.00, 1.76, 2.35), (4.00, 4.00, 4.00), and (3.40, 6.00, 8.00).]
Problem & Our Approach
Dirichlet Particle Filter (DPF): a chain of states x_{t−1}, x_t, x_{t+1} with observations y_{t−1}, y_t, y_{t+1}.
Defocus-aware Dirichlet Particle Filter (D-DPF): adds a defocus state γ_t and a blur observation z_t at each frame.

Prediction (state transition):
  p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
Update (likelihood):
  p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
  p(y_t, γ_t, z_t | x_t) = p(y_t, x_t, γ_t) p(z_t | γ_t)
Isolated Pixel Ratio (IPR) [Oh et al., MedIA2007]
Edge pixels are detected with a Canny edge detector; an isolated pixel is an edge pixel with no neighboring edge pixels. Clear edges form connected curves, while defocused edges break up into isolated pixels.
IPR: the percentage of isolated pixels among all edge pixels.
[Plots: histogram of IPR values (0–0.015) for an endoscopic image, and densities of the fitted distributions for Gaussian blur with sigma = 0.5, 1, 2, 3, 4.]
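A sketch of the IPR computation; the Canny thresholds are assumptions.

```python
# Isolated Pixel Ratio: fraction of edge pixels with no edge-pixel neighbors.
import numpy as np
import cv2

def isolated_pixel_ratio(gray):
    edges = (cv2.Canny(gray, 50, 150) > 0).astype(np.float32)
    n_edges = edges.sum()
    if n_edges == 0:
        return 0.0
    kernel = np.ones((3, 3), np.float32)
    kernel[1, 1] = 0.0                       # count only the 8 neighbors
    neighbors = cv2.filter2D(edges, -1, kernel)
    isolated = (edges == 1.0) & (neighbors == 0.0)
    return float(isolated.sum() / n_edges)   # share of isolated edge pixels
```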
Modeling with Rayleigh dist. and IPR
  Ray_x[σ] = (x / σ²) exp( −x² / (2σ²) )
  σ(z_t) = 4 exp( 100 log(0.25) z_t )
  p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)]
[Plot: σ(z_t) falls from 4.0 to 0.5 as z_t grows from 0.000 to 0.015, spanning defocused to clear frames.]
Sequential filtering (D-DPF)
Prediction:
  p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) = ∫ p(x_t | x_{t−1}, θ_1) p(x_{t−1} | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1}) dx_{t−1}
  with the state transition p(x_t | x_{t−1}, θ_1) = Dir_{x_t}[α_1(x_{t−1}, θ_1)]
Update:
  p(x_t | y_{1:t}, γ_{1:t}, z_{1:t}) ∝ p(y_t, γ_t, z_t | x_t) p(x_t | y_{1:t−1}, γ_{1:t−1}, z_{1:t−1})
  with the likelihood p(y_t, γ_t, z_t | x_t) = p(y_t, x_t, γ_t) p(z_t | γ_t),
  p(y_t, x_t, γ_t) = Dir_{x_t}[α_2(y_t, γ_t)],  p(z_t | γ_t) = Ray_{γ_t}[σ(z_t)]
The performance for defocus frames
[Charts over frame numbers 0–600: ground truth, raw observations, IPR (0–0.010), the result by DPF, and the result by D-DPF.]
Smoothing result for an actual NBI video
[Video frames: the no-smoothing result vs. the smoothing result, with Type A / Type B / Type C3 probabilities.]
Summary
• Recognition of colorectal tumors in NBI endoscopy images
• Baseline: SIFT + Bag-of-Visual Words
• Extensions:
  – self-training with unlabeled samples
  – sampling strategies for faster visual-word construction
  – domain adaptation / transfer learning between old and new endoscopes
• Video / online recognition:
  – temporal smoothing with MRF/HMM
  – defocus-aware Dirichlet particle filtering

Contenu connexe

En vedette

MLaPP 4章 「ガウシアンモデル」
MLaPP 4章 「ガウシアンモデル」MLaPP 4章 「ガウシアンモデル」
MLaPP 4章 「ガウシアンモデル」Shinichi Tamura
 
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...Gou Koutaki
 
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)Nguyen Tuan
 
Unsupervised Object Discovery and Localization in the Wild: Part-Based Match...
Unsupervised Object Discovery and Localization in the Wild:Part-Based Match...Unsupervised Object Discovery and Localization in the Wild:Part-Based Match...
Unsupervised Object Discovery and Localization in the Wild: Part-Based Match...Yoshitaka Ushiku
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationHidekazu Oiwa
 
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...Kenko Nakamura
 
Leveraging Visual Question Answering for Image-Caption Ranking (関東CV勉強会 ECCV ...
Leveraging Visual Question Answeringfor Image-Caption Ranking (関東CV勉強会 ECCV ...Leveraging Visual Question Answeringfor Image-Caption Ranking (関東CV勉強会 ECCV ...
Leveraging Visual Question Answering for Image-Caption Ranking (関東CV勉強会 ECCV ...Yoshitaka Ushiku
 
NIPS Paper Reading, Data Programing
NIPS Paper Reading, Data ProgramingNIPS Paper Reading, Data Programing
NIPS Paper Reading, Data ProgramingKotaro Tanahashi
 
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Nishanth Koganti
 
SGD+α: 確率的勾配降下法の現在と未来
SGD+α: 確率的勾配降下法の現在と未来SGD+α: 確率的勾配降下法の現在と未来
SGD+α: 確率的勾配降下法の現在と未来Hidekazu Oiwa
 
非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料Takuya Minagawa
 
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2Kaggle bosch presentation material for Kaggle Tokyo Meetup #2
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2Keisuke Hosaka
 
Kaggle boschコンペ振り返り
Kaggle boschコンペ振り返りKaggle boschコンペ振り返り
Kaggle boschコンペ振り返りKeisuke Hosaka
 
Chapter 8 ボルツマンマシン - 深層学習本読み会
Chapter 8 ボルツマンマシン - 深層学習本読み会Chapter 8 ボルツマンマシン - 深層学習本読み会
Chapter 8 ボルツマンマシン - 深層学習本読み会Taikai Takeda
 
Curriculum Learning (関東CV勉強会)
Curriculum Learning (関東CV勉強会)Curriculum Learning (関東CV勉強会)
Curriculum Learning (関東CV勉強会)Yoshitaka Ushiku
 
Binarized Neural Networks
Binarized Neural NetworksBinarized Neural Networks
Binarized Neural NetworksShotaro Sano
 

En vedette (20)

MLaPP 4章 「ガウシアンモデル」
MLaPP 4章 「ガウシアンモデル」MLaPP 4章 「ガウシアンモデル」
MLaPP 4章 「ガウシアンモデル」
 
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...
Locally Optimized Product Quantization for Approximate Nearest Neighbor Searc...
 
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)
最近傍探索と直積量子化(Nearest neighbor search and Product Quantization)
 
Unsupervised Object Discovery and Localization in the Wild: Part-Based Match...
Unsupervised Object Discovery and Localization in the Wild:Part-Based Match...Unsupervised Object Discovery and Localization in the Wild:Part-Based Match...
Unsupervised Object Discovery and Localization in the Wild: Part-Based Match...
 
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via RandomizationICML2013読み会 Large-Scale Learning with Less RAM via Randomization
ICML2013読み会 Large-Scale Learning with Less RAM via Randomization
 
PRML chapter7
PRML chapter7PRML chapter7
PRML chapter7
 
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin...
 
Leveraging Visual Question Answering for Image-Caption Ranking (関東CV勉強会 ECCV ...
Leveraging Visual Question Answeringfor Image-Caption Ranking (関東CV勉強会 ECCV ...Leveraging Visual Question Answeringfor Image-Caption Ranking (関東CV勉強会 ECCV ...
Leveraging Visual Question Answering for Image-Caption Ranking (関東CV勉強会 ECCV ...
 
NIPS Paper Reading, Data Programing
NIPS Paper Reading, Data ProgramingNIPS Paper Reading, Data Programing
NIPS Paper Reading, Data Programing
 
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
 
NIPS2016 Supervised Word Mover's Distance
NIPS2016 Supervised Word Mover's DistanceNIPS2016 Supervised Word Mover's Distance
NIPS2016 Supervised Word Mover's Distance
 
SGD+α: 確率的勾配降下法の現在と未来
SGD+α: 確率的勾配降下法の現在と未来SGD+α: 確率的勾配降下法の現在と未来
SGD+α: 確率的勾配降下法の現在と未来
 
非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料
 
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2Kaggle bosch presentation material for Kaggle Tokyo Meetup #2
Kaggle bosch presentation material for Kaggle Tokyo Meetup #2
 
Semantic segmentation
Semantic segmentationSemantic segmentation
Semantic segmentation
 
Kaggle boschコンペ振り返り
Kaggle boschコンペ振り返りKaggle boschコンペ振り返り
Kaggle boschコンペ振り返り
 
Chapter 8 ボルツマンマシン - 深層学習本読み会
Chapter 8 ボルツマンマシン - 深層学習本読み会Chapter 8 ボルツマンマシン - 深層学習本読み会
Chapter 8 ボルツマンマシン - 深層学習本読み会
 
Dynamic filter networks
Dynamic filter networksDynamic filter networks
Dynamic filter networks
 
Curriculum Learning (関東CV勉強会)
Curriculum Learning (関東CV勉強会)Curriculum Learning (関東CV勉強会)
Curriculum Learning (関東CV勉強会)
 
Binarized Neural Networks
Binarized Neural NetworksBinarized Neural Networks
Binarized Neural Networks
 

Similaire à 大腸内視鏡検査における大腸癌認識システム

기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법Kyunghoon Kim
 
Information Visualization for Health Care
Information Visualization for Health CareInformation Visualization for Health Care
Information Visualization for Health CareKrist Wongsuphasawat
 
Who wants to be a millionaire
Who wants to be a millionaireWho wants to be a millionaire
Who wants to be a millionairedeathfoxjinkarl
 
Fundraising from Online Communities
Fundraising from Online CommunitiesFundraising from Online Communities
Fundraising from Online CommunitiesNoam Kostucki
 
Technology Education Millionare Game
Technology Education Millionare GameTechnology Education Millionare Game
Technology Education Millionare Gamebenedijr
 
Data Journalism
Data JournalismData Journalism
Data Journalismpilhofer
 
12 cie552 object_recognition
12 cie552 object_recognition12 cie552 object_recognition
12 cie552 object_recognitionElsayed Hemayed
 

Similaire à 大腸内視鏡検査における大腸癌認識システム (8)

기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법
 
Information Visualization for Health Care
Information Visualization for Health CareInformation Visualization for Health Care
Information Visualization for Health Care
 
Who wants to be a millionaire
Who wants to be a millionaireWho wants to be a millionaire
Who wants to be a millionaire
 
Fundraising from Online Communities
Fundraising from Online CommunitiesFundraising from Online Communities
Fundraising from Online Communities
 
Technology Education Millionare Game
Technology Education Millionare GameTechnology Education Millionare Game
Technology Education Millionare Game
 
Data Journalism
Data JournalismData Journalism
Data Journalism
 
Abraji
AbrajiAbraji
Abraji
 
12 cie552 object_recognition
12 cie552 object_recognition12 cie552 object_recognition
12 cie552 object_recognition
 

Plus de Toru Tamaki

論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...
論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...
論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...Toru Tamaki
 
論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding
論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding
論文紹介:Selective Structured State-Spaces for Long-Form Video UnderstandingToru Tamaki
 
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...Toru Tamaki
 
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...Toru Tamaki
 
論文紹介:Automated Classification of Model Errors on ImageNet
論文紹介:Automated Classification of Model Errors on ImageNet論文紹介:Automated Classification of Model Errors on ImageNet
論文紹介:Automated Classification of Model Errors on ImageNetToru Tamaki
 
論文紹介:Semantic segmentation using Vision Transformers: A survey
論文紹介:Semantic segmentation using Vision Transformers: A survey論文紹介:Semantic segmentation using Vision Transformers: A survey
論文紹介:Semantic segmentation using Vision Transformers: A surveyToru Tamaki
 
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex ScenesToru Tamaki
 
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...Toru Tamaki
 
論文紹介:Tracking Anything with Decoupled Video Segmentation
論文紹介:Tracking Anything with Decoupled Video Segmentation論文紹介:Tracking Anything with Decoupled Video Segmentation
論文紹介:Tracking Anything with Decoupled Video SegmentationToru Tamaki
 
論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope
論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope
論文紹介:Real-Time Evaluation in Online Continual Learning: A New HopeToru Tamaki
 
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...Toru Tamaki
 
論文紹介:Multitask Vision-Language Prompt Tuning
論文紹介:Multitask Vision-Language Prompt Tuning論文紹介:Multitask Vision-Language Prompt Tuning
論文紹介:Multitask Vision-Language Prompt TuningToru Tamaki
 
論文紹介:MovieCLIP: Visual Scene Recognition in Movies
論文紹介:MovieCLIP: Visual Scene Recognition in Movies論文紹介:MovieCLIP: Visual Scene Recognition in Movies
論文紹介:MovieCLIP: Visual Scene Recognition in MoviesToru Tamaki
 
論文紹介:Discovering Universal Geometry in Embeddings with ICA
論文紹介:Discovering Universal Geometry in Embeddings with ICA論文紹介:Discovering Universal Geometry in Embeddings with ICA
論文紹介:Discovering Universal Geometry in Embeddings with ICAToru Tamaki
 
論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement
論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement
論文紹介:Efficient Video Action Detection with Token Dropout and Context RefinementToru Tamaki
 
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...Toru Tamaki
 
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...Toru Tamaki
 
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusionToru Tamaki
 
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous DrivingToru Tamaki
 
論文紹介:Spatio-Temporal Action Detection Under Large Motion
論文紹介:Spatio-Temporal Action Detection Under Large Motion論文紹介:Spatio-Temporal Action Detection Under Large Motion
論文紹介:Spatio-Temporal Action Detection Under Large MotionToru Tamaki
 

Plus de Toru Tamaki (20)

論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...
論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...
論文紹介:Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Groun...
 
論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding
論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding
論文紹介:Selective Structured State-Spaces for Long-Form Video Understanding
 
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...
論文紹介:Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Gene...
 
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...
論文紹介:Content-Aware Token Sharing for Efficient Semantic Segmentation With Vis...
 
論文紹介:Automated Classification of Model Errors on ImageNet
論文紹介:Automated Classification of Model Errors on ImageNet論文紹介:Automated Classification of Model Errors on ImageNet
論文紹介:Automated Classification of Model Errors on ImageNet
 
論文紹介:Semantic segmentation using Vision Transformers: A survey
論文紹介:Semantic segmentation using Vision Transformers: A survey論文紹介:Semantic segmentation using Vision Transformers: A survey
論文紹介:Semantic segmentation using Vision Transformers: A survey
 
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
論文紹介:MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
 
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...
論文紹介:MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Acti...
 
論文紹介:Tracking Anything with Decoupled Video Segmentation
論文紹介:Tracking Anything with Decoupled Video Segmentation論文紹介:Tracking Anything with Decoupled Video Segmentation
論文紹介:Tracking Anything with Decoupled Video Segmentation
 
論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope
論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope
論文紹介:Real-Time Evaluation in Online Continual Learning: A New Hope
 
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...
論文紹介:PointNet: Deep Learning on Point Sets for 3D Classification and Segmenta...
 
論文紹介:Multitask Vision-Language Prompt Tuning
論文紹介:Multitask Vision-Language Prompt Tuning論文紹介:Multitask Vision-Language Prompt Tuning
論文紹介:Multitask Vision-Language Prompt Tuning
 
論文紹介:MovieCLIP: Visual Scene Recognition in Movies
論文紹介:MovieCLIP: Visual Scene Recognition in Movies論文紹介:MovieCLIP: Visual Scene Recognition in Movies
論文紹介:MovieCLIP: Visual Scene Recognition in Movies
 
論文紹介:Discovering Universal Geometry in Embeddings with ICA
論文紹介:Discovering Universal Geometry in Embeddings with ICA論文紹介:Discovering Universal Geometry in Embeddings with ICA
論文紹介:Discovering Universal Geometry in Embeddings with ICA
 
論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement
論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement
論文紹介:Efficient Video Action Detection with Token Dropout and Context Refinement
 
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...
論文紹介:Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Lo...
 
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...
論文紹介:MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Lon...
 
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion
論文紹介:Revealing the unseen: Benchmarking video action recognition under occlusion
 
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
論文紹介:Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving
 
論文紹介:Spatio-Temporal Action Detection Under Large Motion
論文紹介:Spatio-Temporal Action Detection Under Large Motion論文紹介:Spatio-Temporal Action Detection Under Large Motion
論文紹介:Spatio-Temporal Action Detection Under Large Motion
 

Dernier

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Dernier (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

大腸内視鏡検査における大腸癌認識システム

  • 1. , Bisser Raytchev, , , and many others
  • 6. • : 235,000 ( 21 )† – • : 42,434 ( )† – 20 1.7 – 3 ( 1 : 2 : ) – 7 1 • 5 : 20% 11 † http://www.mhlw.go.jp/toukei/saikin/ ‡http://www.gunma-cc.jp/sarukihan/seizonritu/index.html 0 20 40 60 80 100 stage 1 stage 2 stage 3 stage 4 † 5 ‡ survivalrate[%] stage 1: stage 2: stage 3: stage 4: stage 1 ( ) 100% 0 10,000 20,000 30,000 40,000 50,000 '90 '91 '92 '93 '94 '95 '96 '97 '98 '99 '00 '01 '02 '03 '04 '05 '06 '07 '08 '09 fatalities of colorectal cancer year
  • 9. • CCD • 100 I think this is a cancer…
  • 14.
  • 15.
  • 16.
  • 18. NBI
  • 20. NBI (Narrow Band Imaging , –NBI AFI IRI – , 1 1 , , 2006. R B G 415nm 540nm Color Transform NBI filter Xenon lamp RGB rotary filter mucosal CCD Monitor Light source unit Video processor ON OFF Normal light Normal light NBI NBI http://www.olympus.co.jp/jp/technology/technology/luceraelite/
  • 23. or or or or http://cancernavi.nikkeibp.co.jp/daicho/worry/post_2.html ü ü ü ü I think this is a cancer… Oh, MIA, ‘07 Sundaram et al., MIA, ‘08 Diaz & Rao , PRL, ‘07 Al-Kadi, PR, ‘10 Gunduz-Demir et al., MIA, ‘10 Tosun, PR, ‘09 Pit-Pattern Häfner et al., PAA, ‘09 Häfner, ICPR, ‘10 Häfner, PR, ‘09 Kwitt & Uhl, ICCV, ‘07 Tischendrof et al., Endoscopy, ‘10 NBI Stehle, MI, ‘09 Gross, MI, ‘08 , PRMU, ‘10 Tamaki et al., ACCV, ‘10
  • 24. pit-pattern • pit – pit – 29 m sm pit pit pit pit pit S L I N pit pit pit S L pit-pattern [S.Tanaka et al., ‘06]
  • 25. NBI (NBI: Narrow-band Imaging) • pit – – Type A Type B Type C 1 2 3 pit pit / pit / pit / (AVA) NBI [H.Kanao et al., ‘09] sm
  • 26.
  • 27.
  • 28.
  • 29. texture analysis approach Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047-1051 (2010 11).
  • 30. Bag-of-Visual Words Approach Type A Type B Type C3 12, 55, 63, … 87, 49, 21, … 32, 20, 73, … 67, 6, 0, … 79, 5, 40, … 11, 36, 87, … 27, 64, 25, …, 87 93, 41, 75, …, 8 … 12, 55, 63, … 87, 49, 21, … 32, 20, 73, … 67, 6, 0, … 79, 5, 40, … 11, 36, 87, … 65, 33, 19, …, 101 52, 51, 32, …, 89 … 12, 55, 63, … 87, 49, 21, … 32, 20, 73, … 67, 6, 0, … 79, 5, 40, … 11, 36, 87, … 66, 95, 47, …, 85 11, 82, 3,…, 124 … Type A Type B Type C3 84, 99, 40, …, 121 5, 26, 91, …, 150 … Vector quantization Vector quantization Feature space Classifier Histogram Test image Learning Classification result Description of Local features + Bag-of-features
  • 31. Object Bag of words Slide by Li Fei-Feiat CVPR2007 Tutorialhttp://people.csail.mit.edu/torralba/shortCourseRLOC/
  • 32. Analogy to documents Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step- wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image. sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image Hubel, Wiesel China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value. China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value Slide by Li Fei-Feiat CVPR2007 Tutorialhttp://people.csail.mit.edu/torralba/shortCourseRLOC/
  • 33. Slide by Li Fei-Fei at CVPR 2007 Tutorial, http://people.csail.mit.edu/torralba/shortCourseRLOC/
  • 34. The Bag-of-Visual-Words framework: classifying lesion image patches [Tamaki et al., 2013]. Training uses 908 NBI images (Type A: 359, Type B: 462, Type C3: 87); Types C1 and C2 are excluded because many of their regions are ambiguous. Best recognition rate: 96%. Recognition flow. Training: extract local features from the training images, cluster them and take the cluster representatives as the visual words, build a visual-words histogram per training image, and train an SVM on the histograms. Recognition: extract features from the test image, build its visual-words histogram, and classify it with the SVM (e.g. Type B vs. Type ?).
  • 35. Feature extraction: gridSIFT. The Scale Invariant Feature Transform (SIFT) [Lowe, '99] gives a 128-dimensional descriptor, normally computed at DoG keypoints; here, instead of DoG detection, grid sampling is used (gridSIFT): SIFT descriptors are computed at regular grid positions, with the grid spacing and scale size as parameters.
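Dense grid sampling can be emulated in OpenCV by placing keypoints on a regular grid and computing descriptors there instead of at DoG detections; a sketch assuming opencv-python with SIFT available (the spacing and scale values are illustrative):

```python
import cv2

def grid_sift(gray, spacing=5, scale=8.0):
    """Compute SIFT descriptors on a regular grid instead of at DoG keypoints."""
    h, w = gray.shape
    keypoints = [cv2.KeyPoint(float(x), float(y), scale)
                 for y in range(spacing, h - spacing, spacing)
                 for x in range(spacing, w - spacing, spacing)]
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.compute(gray, keypoints)
    return descriptors  # (n_grid_points x 128)

patch = cv2.imread("patch.png", cv2.IMREAD_GRAYSCALE)
descs = grid_sift(patch, spacing=5)
```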
  • 36. Classifier: Support Vector Machine (SVM). Kernels compared: linear, $k_{\mathrm{linear}}(u,v)=u^{\top}v$; radial basis function (RBF), $k_{\mathrm{RBF}}(u,v)=\exp(-\gamma\|u-v\|^{2})$; and $\chi^{2}$, $k_{\chi^{2}}(u,v)=\exp\big(-\gamma\sum_i \frac{(u_i-v_i)^{2}}{u_i+v_i}\big)$. Training maximizes the margin, $\max_{w}\frac{1}{\|w\|}$, i.e. $\min_{w}\frac{1}{2}\|w\|^{2}$ subject to $y_i\,w^{\top}\phi(x_i)\ge 1$. Multi-class strategy: One-Versus-One.
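scikit-learn ships the χ² kernel directly, so the kernels above can be compared through a precomputed Gram matrix; a sketch (γ is an illustrative value; inputs must be non-negative, as BoVW histograms are):

```python
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

def fit_chi2_svm(X_train, y_train, gamma=1.0):
    """X_train: BoVW histograms (rows non-negative). Trains an SVM with a
    precomputed chi-square Gram matrix: exp(-gamma * sum (u-v)^2 / (u+v))."""
    K = chi2_kernel(X_train, gamma=gamma)
    return SVC(kernel="precomputed", decision_function_shape="ovo").fit(K, y_train)

def predict_chi2_svm(clf, X_train, X_test, gamma=1.0):
    """Gram matrix between test (rows) and training (columns) histograms."""
    K_test = chi2_kernel(X_test, X_train, gamma=gamma)
    return clf.predict(K_test)
```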
  • 37. Experimental setup: rectangular patches trimmed from NBI colonoscopy images and labeled with one of the three types; patch sizes range from about 100×300 to 900×800 pixels; 908 patches in total (Type A: 359, Type B: 462, Type C3: 87). Example patches are shown for Type A, Type B, and Type C3.
  • 38. Results (10-fold cross-validation). (Plots: correct rate, per-type recall, and per-type precision [%] vs. number of visual words, 10-100,000.) Best correct rate: 96.00%.
  • 39. Results (holdout testing). (Plots: correct rate, per-type recall, and per-type precision [%] vs. number of visual words, 10-100,000, on a held-out test set.) Best correct rate: 92.86%.
  • 43. Labeled samples: rectangular patches trimmed from NBI images, about 100×300 to 900×800 pixels. Type A: 359, Type B: 462, Type C3: 87, Total: 908.
  • 44. Unlabeled samples: roughly 10× the labeled set, with patch sizes of about 30×30 to 250×250 pixels. Type A: 3590, Type B: 4610, Type C3: 870, Total: 9070.
  • 46. (Recap of the Bag-of-Visual-Words framework of slide 34: 908 labeled NBI training images, Types C1/C2 excluded, best recognition rate 96%.)
  • 47. Grid spacing. Extracting more features improves the recognition rate [Jurie et al., 2005], and narrowing the sampling grid was confirmed to help on this task [Yoshimuta et al., 2011]. Training on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87): grid spacing 15 / 10 / 5 pixels gives best recognition rates of 92.11% / 93.89% / 96.00% and training times of about 13 minutes / 30 minutes / 3 hours. Shrinking the spacing to 2/3 (2.25× the features) gains +1.78%; shrinking it to 1/2 (4× the features) gains +2.11%. The problem: training time grows with the number of features.
  • 48. Conventional pipeline: 1. extract features on a grid from all training images $I = \{I_n \mid n = 1, \ldots, N\}$; 2. cluster the extracted features in feature space to obtain the visual words; 3. extract features on the same grid from each training image $I_n \in I$; 4. vector-quantize the features to form the visual-words histogram.
  • 49. Proposed pipeline (reduced clustering cost & denser histograms): 1. extract only a small number of features from all training images; 2. cluster this small feature set to obtain the visual words; 3. extract many features on a dense grid from each training image; 4. vector-quantize them to form the visual-words histogram. A sketch of this split follows below.
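A sketch of this split with scikit-learn; MiniBatchKMeans and the subsample size are illustrative stand-ins for the clustering actually used on the slides:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def codebook_from_subsample(all_descriptors, n_words=1024, n_sample=20000, seed=0):
    """Steps 1-2: cluster only a small random subset of the pooled descriptors.
    all_descriptors: (M x 128) array of all extracted features."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(all_descriptors),
                     size=min(n_sample, len(all_descriptors)), replace=False)
    return MiniBatchKMeans(n_clusters=n_words, random_state=seed,
                           n_init=3).fit(all_descriptors[idx])

def dense_histogram(image_descriptors, codebook):
    """Steps 3-4: quantize the full dense descriptor set of one image."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```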
  • 50. Experiment: confirm the reduction in training time and the improvement in recognition rate. Visual words are built from a reduced feature set (19,742 features instead of 8,678,198), while histograms are built from an increased feature set (grid spacing 5 / 2 / 1 pixels). Environment: Linux Fedora 18, Intel Xeon E5-2620 CPU, 128 GB memory. Classifier: linear SVM. Training data: 908 labeled NBI images (Type A: 359, Type B: 462, Type C3: 87). Evaluation: feature count vs. total training time (to confirm the time reduction) and grid spacing vs. recognition rate (to confirm the accuracy gain).
  • 51. CPU time (32 visual words): conventional method (spacing 5): 10,233.45 s; proposed method (spacing 5): 680.72 s (6.6%); proposed (spacing 2): 4,167.89 s (40.7%); proposed (spacing 1): 16,471.8 s (160.9%). Training time is reduced at grid spacings of 5 and 2 pixels but increases at 1 pixel.
  • 52. Correct rate vs. number of visual words (32-16,384), for the conventional method (spacing 5) and the proposed method (spacings 5, 2, 1): there is a clear gap between spacing 5 and spacings 2/1, but little difference between spacings 2 and 1.
  • 53. Problem: old and new endoscopes coexist. Old endoscope (EVIS LUCERA) vs. new endoscope (EVIS LUCERA ELITE): viewing angle 140° (WIDE) / 80° (TELE) vs. 170° (WIDE) / 90° (TELE); resolution 1440×1080 vs. 1980×1080. The new endoscope is wider-angle, higher-resolution, and brighter; the optical systems differ, so the captured images differ and the feature distributions differ. Recognition performance drops on new-endoscope images, and old and new training images cannot simply be mixed, since recognition assumes the training and test distributions match. Collecting new-endoscope training images is difficult: cancer patients are not numerous, images can only be captured during examinations, only physicians can label them, and the field is in a transitional period as the latest devices appear.
  • 55. Objective. Training and testing both on the old endoscope works, but training on the old endoscope and testing on the new one lowers the recognition rate. Solution: transform new-endoscope features into old-endoscope features and train on them, in a transfer-learning framework; the two image domains are related.
  • 56. Related work: "Adapting Visual Category Models to New Domains" [Saenko et al., ECCV 2010] addresses joint recognition of source ($x$) and target ($y$): it learns a metric $A$ (mapping $A^{1/2}$) such that, for each class, $(x_i - y_j)^{\top} A (x_i - y_j) \le$ upper bound for same-class pairs and $(x_i - y_j)^{\top} A (x_i - y_j) \ge$ lower bound for different-class pairs; this method has hyperparameters that must be tuned. Our approach addresses recognizing the target only: it finds a matrix $W$ converting target features to source features, $\arg\min_{W} \|x - Wy\|_{F}^{2}$, and has no hyperparameters.
  • 57. Convert histogram. Procedure: 1. treat the visual-words histograms as vectors and stack them into matrices $X = [x_1, \cdots, x_N]$ (source) and $Y = [y_1, \cdots, y_N]$ (target); 2. find the transformation minimizing the histogram error, $\arg\min_{W} \|X - WY\|_{F}^{2}$ subject to $W_{ij} \ge 0$, solved with ADMM row by row (for each row $n = 1, \ldots, N$) by iterating the dual updates $W_n^{k+1} = \big(\sum_{n=1}^{N} y_n y_n^{\top} + \rho E\big)^{-1}\big(\sum_{n=1}^{N} y_n x_n + \rho(z_n^{k} - u_n^{k})\big)$, $z_n^{k+1} = \pi_C(W_n^{k+1} + u_n^{k})$, $u_n^{k+1} = u_n^{k} + W_n^{k+1} - z_n^{k+1}$, where $\pi_C$ projects onto the nonnegativity constraint. A numerical sketch follows below.
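A row-wise ADMM for the nonnegative least-squares problem above can be sketched in numpy as follows; ρ and the iteration count are illustrative, and π_C is realized as projection onto the nonnegative orthant:

```python
import numpy as np

def nnls_transform_admm(X, Y, rho=1.0, n_iter=100):
    """Solve min_W ||X - W Y||_F^2  s.t. W >= 0, one row of W at a time by ADMM.
    X, Y: (D x N) matrices whose columns are paired source/target histograms."""
    D, N = X.shape
    W = np.zeros((D, D))
    A = Y @ Y.T + rho * np.eye(D)        # shared by every row update
    for d in range(D):
        b = Y @ X[d]                     # (D,) data term for row d
        w = np.zeros(D); z = np.zeros(D); u = np.zeros(D)
        for _ in range(n_iter):
            w = np.linalg.solve(A, b + rho * (z - u))  # quadratic step
            z = np.maximum(w + u, 0.0)                 # projection: W_ij >= 0
            u = u + w - z                              # dual update
        W[d] = z
    return W
```

A new-endoscope histogram `y` is then mapped into old-endoscope feature space as `W @ y` before classification.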
  • 58. How to make a pseudo dataset. Since the new endoscope is expected to look crisper and more vivid, apply (1) contrast enhancement (stretching the intensity range, e.g. [42, 213] to [0, 255]) and (2) a 3×3 sharpening filter to old-endoscope (source) images to produce pseudo new-endoscope (target) images. This approach only works when corresponding pairs of training images exist, and in reality such corresponding images are hard to obtain.
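The pseudo-pair construction can be sketched with OpenCV; the stretch range [42, 213] follows the numbers visible on the slide, while the sharpening kernel weights are an assumption (an ordinary high-boost kernel whose weights sum to 1):

```python
import cv2
import numpy as np

def make_pseudo_new_image(old_bgr, low=42, high=213):
    """Simulate a 'crisper, more vivid' new-endoscope image from an old one."""
    # 1. Contrast enhancement: linearly stretch [low, high] to [0, 255].
    stretched = np.clip((old_bgr.astype(np.float32) - low) * 255.0 / (high - low),
                        0, 255).astype(np.uint8)
    # 2. Sharpening: high-boost 3x3 kernel (off-center -1/9, center 17/9).
    kernel = -np.ones((3, 3), np.float32) / 9.0
    kernel[1, 1] = 17.0 / 9.0
    return cv2.filter2D(stretched, -1, kernel)
```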
  • 59. Result: with the transform, recognition on the new endoscope reaches almost the same rate as on the old one. Settings compared (training / test): ① Source / Source, ② Source / Target, ③ Source+Target / Target, ④ Source + transformed Target / Target.
  • 60. Related works. Cross-Domain Transform [Saenko et al., ECCV 2010]: $\min \operatorname{tr}(W) - \log\det W$ s.t. $W \succeq 0$, $\|x_i^s - x_j^t\|_W \le$ upper bound for $(x_i^s, x_j^t)$ in the same class and $\|x_i^s - x_j^t\|_W \ge$ lower bound for different classes. It estimates a transformation matrix minimizing the Mahalanobis distance, considers only the transformed feature distributions, and does not ensure the classification result. Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR 2013]: $\min_{W,\theta,b} \frac{1}{2}\|W\|_F^2 + \frac{1}{2}\sum_{k=1}^{K}\|\theta_k\|_2^2 + C_s\sum_{i=1}^{n}\sum_{k=1}^{K}\xi_{i,k}^s + C_t\sum_{j=1}^{m}\sum_{k=1}^{K}\xi_{j,k}^t$ s.t. $y_{i,k}^s(\theta_k^{\top}x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\,\theta_k^{\top}Wx_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$ ($W$: transform matrix, $\theta_k$: SVM parameters, $\xi^s, \xi^t$: slack variables, $y_{i,k}$: indicator function). It optimizes the transformation matrix and the SVM parameters at the same time and ensures the classification result, but does not guarantee the transformed feature distributions.
  • 61. Proposed method: Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2). Add to MMDT an L2 distance constraint pulling the transformed target close to the source: $\min_{W,\theta,b} \frac{1}{2}\|W\|_F^2 + \frac{1}{2}\sum_{k=1}^{K}\|\theta_k\|_2^2 + C_s\sum_{i=1}^{n}\sum_{k=1}^{K}\xi_{i,k}^s + C_t\sum_{j=1}^{m}\sum_{k=1}^{K}\xi_{j,k}^t + \frac{1}{2}\lambda_D\sum_{i=1}^{M}\sum_{j=1}^{N} y_{i,j}\|Wx_i^t - x_j^s\|_2^2$ s.t. $y_{i,k}^s(\theta_k^{\top}x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\,\theta_k^{\top}Wx_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$. Our method thus ensures both the classification result and the transformed feature distributions.
  • 62. Decompose into sub-problems. Hoffman et al. decompose the MMDT objective into two sub-problems; our method likewise decomposes as follows. (1) Objective for the SVM parameters: $\min_{\theta,\xi^s,\xi^t} \frac{1}{2}\sum_{k=1}^{K}\|\theta_k\|_2^2 + C_s\sum_{i=1}^{N}\sum_{k=1}^{K}\xi_{i,k}^s + C_t\sum_{j=1}^{M}\sum_{k=1}^{K}\xi_{j,k}^t$. (2) Objective for the transform matrix, with the constraint pulling the transformed target close to the source: $\min_{W,\xi^t} \frac{1}{2}\|W\|_F^2 + C_t\sum_{j=1}^{M}\sum_{k=1}^{K}\xi_{j,k}^t + \frac{1}{2}\lambda_D\sum_{i=1}^{M}\sum_{j=1}^{M} y_{i,j}\|Wx_i^t - x_j^s\|_2^2$. Both are subject to $y_{i,k}^s(\theta_k^{\top}x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\,\theta_k^{\top}Wx_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$. The overall objective is optimized by iterating (1) and (2).
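A heavily simplified sketch of this alternation: step (1) is an ordinary linear SVM fit on source plus transformed target, and step (2) is replaced here by the L2-distance term alone, which admits a closed-form ridge solution; the margin constraints on W are dropped, so this illustrates the structure of the iteration, not MMDTL2 itself:

```python
import numpy as np
from sklearn.svm import LinearSVC

def mmdt_like_alternation(Xs, ys, Xt, yt, n_rounds=5, reg=1e-3):
    """Alternate: (1) train an SVM on source + transformed target;
    (2) refit W so each target point maps near its class's source mean."""
    D = Xs.shape[1]
    W = np.eye(D)
    for _ in range(n_rounds):
        # (1) SVM step with W fixed.
        X = np.vstack([Xs, Xt @ W.T])
        y = np.concatenate([ys, yt])
        clf = LinearSVC(C=1.0).fit(X, y)
        # (2) W step: ridge least squares toward same-class source means
        # (only the L2 distance term of the objective is kept here).
        targets = np.array([Xs[ys == c].mean(axis=0) for c in yt])
        W = np.linalg.solve(Xt.T @ Xt + reg * np.eye(D), Xt.T @ targets).T
    return W, clf
```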
  • 63. Primal problem. With $U(x) = \operatorname{diag}(xx^{\top}, \ldots, xx^{\top})$, $v_{i,j} = \operatorname{vec}(x_j^s (x_i^t)^{\top})$, $w = \operatorname{vec}(W)$, and $\phi(x) = \operatorname{vec}(\theta x^{\top})$, sub-problem (2) derived from the transform-matrix objective becomes $\min_{w,\xi^t} \frac{1}{2}\|w\|_2^2 + C_t\sum_{j=1}^{M}\sum_{k=1}^{K}\xi_{j,k}^t + \frac{1}{2}\lambda_D\sum_{i=1}^{M}\sum_{j=1}^{M}\big(w^{\top}U(x_i^t)w - 2v_{i,j}^{\top}w + (x_i^t)^{\top}x_j^s\big)$ s.t. $\xi_i^t \ge 0$, $y_{i,k}^t\,\phi_k^{\top}(x_i^t)\,w \ge 1 - \xi_{i,k}^t$. This is a standard quadratic program, but it has high computational costs, needs huge memory, and depends on the dimensionality of the data; hence we derive the dual problem.
  • 64. Dual problem: $\max_a -\frac{1}{2}\sum_{k_1=1}^{K}\sum_{k_2=1}^{K}\sum_{i=1}^{M}\sum_{j=1}^{M} a_i a_j\, y_{i,k_1}^t y_{j,k_2}^t\, \phi_{k_1}^{\top}(x_i^t)\, V^{-1} \phi_{k_2}(x_j^t) + \sum_{k=1}^{K}\sum_{i=1}^{M} a_i\Big(1 - \lambda_D\, \phi_k^{\top}(x_i^t)\, V^{-1} \sum_{m=1}^{M}\sum_{n=1}^{N} y_{m,n} v_{m,n}\Big)$ s.t. $0 \le a_i \le C_T$, $\sum_{i=1}^{M} a_i y_{i,k}^t = 0$, where the $a_i$ are Lagrange multipliers and $V = I + \lambda_D\sum_{i=1}^{M}\sum_{j=1}^{N} y_{i,j}\, U(x_i^t)$. The dual problem has many advantages: low computational cost, a sparse problem structure, and dependence on the number of target data rather than on the data dimensionality.
  • 65. Comparison of primal vs. dual computation time. SetupTime: time to compute the coefficients (e.g. $U(x)$ and $v_{i,j}$); OptimizationTime: time to solve the quadratic program; CalculationTime: time to compute $w$ from $a$ (dual only). With 128 visual words, the dual is about 14 times faster than the primal.
  • 66. Result: MMDTL2 achieves performance roughly equivalent to the baseline (recognition rate vs. number of visual words, 8-1024, compared across Baseline, Source only, Not transfer, MMDT, and MMDTL2); however, the no-transfer setting still gives the best performance.
  • 67. Online recognition at 14.7 fps: SIFT features are extracted from a 120×120-pixel patch, quantized into a visual-word histogram, and classified by the SVM into one of A, B, or C3, with the per-class scores displayed.
  • 69. Objective: connect a processing PC to the NBI endoscope and enable online recognition. System configuration: OLYMPUS EVIS LUCERA ELITE NBI endoscope*1 → SDI → Blackmagic DeckLink SDI capture board*2 (PCI Express) → processing PC. Development environment: Visual Studio 2012 (retail), OpenCV 3.0-devel, VLFeat 0.9.18, Boost 1.55.0, DeckLink SDK 10.0. OS: Windows 7 Home Premium SP1 64-bit; CPU: Intel Core i7-4770 3.40 GHz; memory: 16 GB. *1 http://www.olympus.co.jp/jp/news/2012b/nr121002luceraj.jsp *2 http://www.genkosha.com/vs/news/entry/sdi.html
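The online loop can be sketched as a per-frame pipeline; since the DeckLink SDK is vendor-specific, cv2.VideoCapture stands in for the SDI capture input here, and classify_patch is a placeholder for the gridSIFT → BoVW → SVM chain described earlier:

```python
import cv2

def online_recognition_loop(classify_patch, device=0, patch=120):
    """Grab frames, crop a central patch, and overlay the predicted class."""
    cap = cv2.VideoCapture(device)   # stand-in for the DeckLink SDI input
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h, w = frame.shape[:2]
        y0, x0 = (h - patch) // 2, (w - patch) // 2
        roi = cv2.cvtColor(frame[y0:y0 + patch, x0:x0 + patch],
                           cv2.COLOR_BGR2GRAY)
        label = classify_patch(roi)  # e.g. gridSIFT -> histogram -> SVM
        cv2.putText(frame, label, (20, 40), cv2.FONT_HERSHEY_SIMPLEX,
                    1.0, (0, 255, 0), 2)
        cv2.imshow("NBI online recognition", frame)
        if cv2.waitKey(1) == 27:     # Esc to quit
            break
    cap.release()
```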
  • 71. Capture (demo & performance): the NBI endoscope screen and the processing-PC screen are shown side by side; the PC performs color conversion and feature extraction online.
  • 73. (Plot: class probability [0-1] vs. frame number [0-200] for Type A / Type B / Type C3.) The per-frame outputs switch back and forth between Type A and Type B, motivating temporal smoothing.
  • 74. MRF/HMM smoothing of the label sequence: $f(x \mid y) \propto \exp\Big(\sum_i A(x_i, y_i)\Big)\cdot\exp\Big(\sum_i\sum_{j \in N_i} I(x_i, x_j)\Big)$, where $y_i$ is the per-frame SVM output and $x_i$ the smoothed label of frame $i$ ($N_i$: neighboring frames). The chain over frames $x_1, \ldots, x_{50}, \ldots, x_{100}, \ldots, x_{150}, \ldots, x_{200}$ is decoded into a consistent label sequence (e.g. B, B, C3, C3, B); a decoding sketch follows below.
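For a chain over frames, MAP decoding of such a model reduces to Viterbi dynamic programming; a minimal sketch, where the self-transition probability p_stay plays the role of the DP_0.8-0.999 settings on the next slide:

```python
import numpy as np

def viterbi_smooth(probs, p_stay=0.99):
    """probs: (T x K) per-frame class probabilities from the SVM.
    Returns the MAP label sequence under a sticky transition model."""
    T, K = probs.shape
    trans = np.full((K, K), (1.0 - p_stay) / (K - 1))
    np.fill_diagonal(trans, p_stay)
    log_p = np.log(probs + 1e-12)
    log_t = np.log(trans)
    score = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = log_p[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_t      # (from-state, to-state)
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_p[t]
    path = np.zeros(T, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(T - 2, -1, -1):                # trace back the best path
        path[t] = back[t + 1, path[t + 1]]
    return path
```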
  • 75. (Plots: per-frame outputs vs. MAP-smoothed sequences over frames 0-200; a Type B video smoothed by DP with self-transition probability 0.8 / 0.9 / 0.99 / 0.999 and by Gibbs sampling with p4 = 0.6 / 0.7 / 0.8 / 0.9, and a Type A video shown as original, DP 0.99, and Gibbs p4 = 0.9.)
  • 76. (Plot: MRF-smoothed probabilities [0-1] vs. frame number [0-200] for Type A / Type B / Type C3.)
  • 77.
  • 79. Possible cause of instability: classification results appear to be affected by out-of-focus frames. Test: 1191 images, each degraded with Gaussian blur of increasing standard deviation (no defocus, SD = 0.5, 1, 2, 3, 5, 7, 9, 11); training: 480 images (160 per class). The recognition rate vs. number of visual words drops steadily as the blur SD grows.
  • 80. Particle filter (online Bayesian filtering). State vector $x_t = \big(x_t^{(A)}, x_t^{(B)}, x_t^{(C3)}\big)$ with $x_t^{(A)} + x_t^{(B)} + x_t^{(C3)} = 1$; observation vector $y_t = \big(y_t^{(A)}, y_t^{(B)}, y_t^{(C3)}\big)$ with $y_t^{(A)} + y_t^{(B)} + y_t^{(C3)} = 1$ ($t$: time). Prediction (state transition): $p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1}, \theta_1)\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$. Update (likelihood): $p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t, \theta_2)\, p(x_t \mid y_{1:t-1})$. We use Dirichlet distributions for both the state transition and the likelihood; a sketch follows below.
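A minimal DPF sketch in numpy/scipy; the concentration parameters θ₁ and θ₂ are illustrative scalars scaling the Dirichlet parameters, standing in for the α₁(·), α₂(·) maps on the slides:

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_particle_filter(Y, n_particles=500, theta1=50.0, theta2=20.0, seed=0):
    """Y: (T x 3) per-frame (A, B, C3) probabilities from the classifier.
    Returns the (T x 3) filtered posterior mean of the state x_t."""
    rng = np.random.default_rng(seed)
    T, K = Y.shape
    X = rng.dirichlet(np.ones(K), size=n_particles)        # initial particles
    out = np.zeros((T, K))
    eps = 1e-6
    for t in range(T):
        # Prediction: x_t ~ Dir(alpha_1), alpha_1 = theta1 * x_{t-1}.
        X = np.array([rng.dirichlet(theta1 * x + eps) for x in X])
        # Update: Dirichlet log-likelihood p(y_t | x_t) = Dir_{y_t}[theta2 * x_t].
        alpha = theta2 * X + eps
        logw = (gammaln(alpha.sum(axis=1)) - gammaln(alpha).sum(axis=1)
                + ((alpha - 1.0) * np.log(Y[t] + eps)).sum(axis=1))
        w = np.exp(logw - logw.max())
        w /= w.sum()
        out[t] = w @ X                                     # posterior mean
        X = X[rng.choice(n_particles, size=n_particles, p=w)]  # resample
    return out
```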
  • 81. Dirichlet distribution: $\mathrm{Dir}_x[\alpha] = \frac{\Gamma\!\big(\sum_{i=1}^{N}\alpha_i\big)}{\prod_{i=1}^{N}\Gamma(\alpha_i)} \prod_{i=1}^{N} x_i^{\alpha_i - 1}$, with distribution parameter $\alpha$; examples from low to high concentration: (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00), (1.00, 1.76, 2.35), (4.00, 4.00, 4.00), (3.40, 6.00, 8.00). The parameter is set through an affine map $\alpha(x) = ax + b$.
  • 82. Problem & our approach: the Dirichlet Particle Filter (DPF) is extended to a Defocus-aware Dirichlet Particle Filter (D-DPF) by adding a defocus measurement $z_t$ with latent blur level $\gamma_t$ alongside the observation $y_t$. Prediction (state transition): $p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1}) = \int p(x_t \mid x_{t-1}, \theta_1)\, p(x_{t-1} \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})\, dx_{t-1}$. Update (likelihood): $p(x_t \mid y_{1:t}, \gamma_{1:t}, z_{1:t}) \propto p(y_t, \gamma_t, z_t \mid x_t)\, p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})$, with $p(y_t, \gamma_t, z_t \mid x_t) = p(y_t \mid x_t, \gamma_t)\, p(z_t \mid \gamma_t)$.
  • 83. Isolated Pixel Ratio (IPR) [Oh et al., MedIA 2007]: apply a Canny edge detector to the endoscopic image. On a clear edge, edge pixels form connected chains; on a defocused edge, many edge pixels are isolated, with no edge pixels among their neighbors. The IPR is the percentage of isolated pixels among all edge pixels, as sketched below.
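The IPR can be computed by counting Canny edge pixels that have no 8-connected edge neighbor; a sketch with OpenCV (the Canny thresholds are illustrative):

```python
import cv2
import numpy as np

def isolated_pixel_ratio(gray, lo=50, hi=150):
    """IPR [Oh et al., MedIA 2007]: fraction of Canny edge pixels whose
    8-neighborhood contains no other edge pixel (high IPR => defocus)."""
    edges = (cv2.Canny(gray, lo, hi) > 0).astype(np.float32)
    # Edge-neighbor count per pixel: 3x3 box sum minus the pixel itself.
    neighbors = cv2.filter2D(edges, -1, np.ones((3, 3), np.float32)) - edges
    n_edges = edges.sum()
    if n_edges == 0:
        return 0.0
    isolated = np.logical_and(edges > 0, neighbors < 0.5).sum()
    return float(isolated) / float(n_edges)
```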
  • 84. Modeling with a Rayleigh distribution and the IPR: $\mathrm{Ray}_x[\sigma] = \frac{x}{\sigma^2}\exp\!\Big(-\frac{x^2}{2\sigma^2}\Big)$. (Plots: IPR histogram separating defocused from clear frames; Dirichlet distributions for several σ; σ as a function of $z_t$.) The defocus likelihood is modeled as $p(z_t \mid \gamma_t) = \mathrm{Ray}_{\gamma_t}[\sigma(z_t)]$ with $\sigma(z_t) = 4\exp(100\log(0.25)\,z_t)$.
  • 85. Sequential filtering (D-DPF). Prediction: $p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1}) = \int p(x_t \mid x_{t-1}, \theta_1)\, p(x_{t-1} \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})\, dx_{t-1}$ with Dirichlet transition $p(x_t \mid x_{t-1}, \theta_1) = \mathrm{Dir}_{x_t}[\alpha_1(x_{t-1}, \theta_1)]$. Update: $p(x_t \mid y_{1:t}, \gamma_{1:t}, z_{1:t}) \propto p(y_t, \gamma_t, z_t \mid x_t)\, p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})$, with likelihood $p(y_t, \gamma_t, z_t \mid x_t) = p(y_t \mid x_t, \gamma_t)\, p(z_t \mid \gamma_t)$, $p(y_t \mid x_t, \gamma_t) = \mathrm{Dir}_{x_t}[\alpha_2(y_t, \gamma_t)]$, and $p(z_t \mid \gamma_t) = \mathrm{Ray}_{\gamma_t}[\sigma(z_t)]$.
  • 86. The performance for defocused frames. (Plots over frames 0-600: ground truth, raw observation, IPR, result by DPF, and result by D-DPF.)
  • 87. Smoothing result for an actual NBI video: per-frame results without smoothing vs. with smoothing, for Type A / Type B / Type C3.
  • 88. Summary: recognition of NBI colorectal endoscopy images. Baseline: SIFT + Bag-of-Visual-Words. Extensions on the training side: self-training with unlabeled samples, subsampled codebook construction, and domain adaptation / transfer learning across endoscopes. Extensions for video / online use: temporal smoothing by MRF/HMM and defocus-aware particle filtering.