10. Strengths of Deep Learning (2): No More Need for Handcrafted Feature Extraction
<Conventional pattern recognition>
[Figure: Feature Extractor (※) → Trainable Classifier → "Citrus" / "Not Citrus"]
※ Features extracted by hand (handcrafted)
<Deep learning>
[Figure: Low-Level Features → Mid-Level Features → High-Level Features → Trainable Classifier → "Citrus" / "Not Citrus"]
Representations are learned hierarchically
■ Deep Learning makes handcrafted feature extraction unnecessary
Google's paper (Le et al., "Building High-level Features Using Large Scale Unsupervised Learning", ICML 2012; arXiv:1112.6209)
+ Trained on 10 million 200x200-pixel images extracted from YouTube
+ A neural network with 3 million neurons and 1 billion connections
+ Unsupervised learning on unlabeled data produced neurons that respond to cat faces and human faces
+ Human-face images made up only about 3% of the data
Figure source: Google Blog
■ A "cat neuron" emerged without supervision
38. Image/DNA-based DNN Model Architectures Adopted in Recent (~2017) Studies

IMAGE
REFERENCE | DATE | Input (IN) / Target (TGT) | DNN Architecture, etc.
PMID: 29086034 | 2017.10 | IN=Endoscopy images, TGT=Classification | CNN (pre-trained AlexNet, 96 convolutional kernels) + SVM
PMID: 29083930 | 2017.10 | IN=Histopathological images, TGT=Osteosarcoma classification | CNN (VGG, AlexNet)
PMID: 29082086 | 2017.9 | IN=OCT images, TGT=Cochlear endolymphatic hydrops | CNN (VGG16-based), trained from scratch

DNA
REFERENCE | DATE | Tool Name | Input (IN) / Target (TGT) | DNN Architecture, etc.
PMID: 29069282 | 2017.10 | DeOpen | IN=DNA sequence, TGT=Chromatin accessibility prediction | Composition model (original 4-layer CNN + BP)
PMID: 28158264 | 2017.2 | CNNProm | IN=DNA sequence (Hs/Mm/At: 251 nt; Ec/Bs: 81 nt), TGT=Promoter classification | Original 2-3-layer CNN
arXiv: 1608.03644 | 2017.1 | DeepMotif | IN=DNA sequence, TGT=TFBS classification | CNN, RNN (LSTM), CNN+RNN; original 1-4 layers
PMID: 27587684 | 2016.9 | DeepChrom | IN=Peak-based shift-window matrix, TGT=Gene expression prediction | Original 2-3-layer CNN
ResNet, Inception, and Xception do not yet appear to have been adopted in life-science studies.
The CNN models used for DNA sequences had only a few layers.
Survey
39. CNN Model Architectures (1)

CNNProm (TATA promoter prediction)
PMID: 28158264

DeepMotif (TFBS classification)
arXiv:1608.03644, PSB 2017
"Each model has the same input (one-hot encoded matrix of the raw nucleotide inputs), and the same output (softmax classifier to make a binary prediction). This encoding matrix is used as the input to a convolutional, recurrent, or convolutional-recurrent module that each outputs a vector of fixed dimension. The output vector of each model is linearly fed to a softmax function as the last layer, which learns the mapping from the hidden space to the output class label space C ∈ [+1, −1]. The final output is a probability indicating whether an input is a positive or a negative binding site (binary classification task)."
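The DeepMotif pipeline quoted above (one-hot encoded nucleotides → convolutional module → softmax over the two class labels) can be sketched in a few lines of NumPy. This is a toy illustration, not DeepMotif's actual implementation: the kernel count, kernel width, and random weights are all placeholder assumptions.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix."""
    m = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        m[i, BASES.index(b)] = 1.0
    return m

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(seq, kernels, w_out, width=8):
    """One conv layer -> global max pooling -> linear layer -> softmax."""
    x = one_hot(seq)                               # (L, 4)
    L = len(seq)
    # Valid 1-D convolution: one activation per kernel per position.
    conv = np.array([[(x[i:i + width] * k).sum()
                      for i in range(L - width + 1)]
                     for k in kernels])            # (n_kernels, L-width+1)
    pooled = conv.max(axis=1)                      # global max pooling
    return softmax(w_out @ pooled)                 # P(positive), P(negative)

rng = np.random.default_rng(0)
kernels = rng.normal(size=(16, 8, 4))              # 16 kernels of width 8 (illustrative)
w_out = rng.normal(size=(2, 16))                   # linear map to the 2 class labels
p = predict("ACGTACGTTTGACAGTACGT", kernels, w_out)
print(p)                                           # two probabilities summing to 1
```

With trained rather than random weights, each kernel would act as a learned sequence-motif detector, which is why even shallow CNNs work for TFBS classification.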
40. CNN Model Architectures (2)

DeOpen (chromatin accessibility prediction)
PMID: 29069282
A bipartite model combining a CNN with a typical three-layer BP neural network. "It consists of 9 convolutional layers, 3 max-pooling layers, and 3 fully connected layers. Each convolution layer contains 128 convolution kernels. The parameter k is set to 6 in our model, thus creating a 1024-dimensional feature vector for each DNA sequence. We also apply dropout to the output of MergeLayer with rate 0.5 to guard against overfitting."

DeepChrom (gene expression prediction from histone modifications)
PMID: 27587684
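DeOpen's rate-0.5 dropout on the MergeLayer output can be sketched with inverted dropout, the standard formulation. This is a minimal sketch, not DeOpen's code; the 1024-unit layer size follows the feature-vector dimension quoted above, and the seed is arbitrary.

```python
import numpy as np

def dropout(x, rate=0.5, training=True, rng=None):
    """Zero each unit with probability `rate`; scale survivors by 1/(1-rate)
    so the expected activation is unchanged (inverted dropout).
    At inference time (training=False) the input passes through unchanged."""
    if not training:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
features = rng.normal(size=1024)            # e.g. a merged feature vector
dropped = dropout(features, rate=0.5, rng=rng)
print((dropped == 0).mean())                # roughly half of the units are zeroed
```

Randomly silencing half the units on each training pass prevents the fully connected layers from co-adapting to any single feature, which is the overfitting guard the DeOpen description refers to.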
41. K-mer-based DNN Model Architectures Adopted in Recent (~2017) Studies

K-mer
REFERENCE | DATE | Tool Name | Input (IN) / Target (TGT) | DNN Architecture, etc.
PMID: 28881969 (preprint) | 2017.7 | ismb2017_lstm | IN=k-mer frequency (K=6), TGT=Chromatin accessibility prediction | RNN (LSTM), 1 layer; AUC: 0.881 (K562)
PMID: 27506469 | 2016.8 | IPMiner | IN=k-mer frequency, TGT=ncRNA-protein interaction prediction | Multiple CNNs + logistic regression; ACC: 0.891

For k-mer frequency inputs, CNN-based and RNN (LSTM)-based DNN models should be compared.
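The k-mer frequency input used by the tools above is a fixed-length vector indexed by all 4^K possible k-mers. A minimal sketch (K=2 here just to keep the vector small; ismb2017_lstm uses K=6, giving 4096 dimensions):

```python
from collections import Counter
from itertools import product

def kmer_freq(seq, k):
    """Return the 4^k-dimensional relative-frequency vector of k-mers,
    indexed by all k-mers over ACGT in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return [counts[km] / total for km in kmers]

v = kmer_freq("ACGTACGTAC", k=2)
print(len(v))     # 16 = 4^2
print(sum(v))     # frequencies sum to ~1.0
```

Unlike the one-hot matrices fed to the sequence CNNs on the earlier slides, this representation discards positional information, which is one reason the CNN-vs-LSTM comparison noted above is of interest.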