Bioinformatics kernels relations

Kernel Methods and Relational Learning in
Bioinformatics

ir. Michiel Stock
Dr. Willem Waegeman
Prof. dr. Bernard De Baets

Faculty of Bioscience Engineering
Ghent University

November 2012

KERMIT

ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 1 / 40

Outline

1 Introduction

2 Kernel methods

3 Learning relations

4 Case studies
Enzyme function prediction
Protein-ligand interactions
Microbial ecology

5 Conclusions


Introduction

Introductory example

Problem statement
Predict protein-protein interactions based on high-throughput data.
Based on a gold standard
Typical features that can be used:
Yeast two-hybrid
Pfam profile
Phylogenetic profile
Localization
PSI-BLAST
Expression
...


Introduction

Machine learning is widely used in bioinformatics

[Figure 1 (Larrañaga et al.): Classification of the topics where machine learning methods are applied.]

Introduction

Bioinformatics deals with complex data

Bioinformatics data is typically:
high-dimensional (e.g., microarrays or proteomics data)
structured (e.g., gene sequences, small molecules, interaction networks, phylogenetic trees...)
heterogeneous (e.g., vectors, sequences, graphs to describe the same protein)
in large quantities (e.g., more than 10^6 known protein sequences)
noisy (e.g., many features are not relevant)


Kernel methods

Formal definition of a kernel

Kernels are non-linear functions defined over objects x ∈ X.
Definition
A function k : X × X → R is called a positive definite kernel if it is symmetric, that is, k(x, x') = k(x', x) for any two objects x, x' ∈ X, and positive semi-definite, that is,

\sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j k(x_i, x_j) \geq 0

for any N > 0, any choice of N objects x_1, ..., x_N ∈ X, and any choice of real numbers c_1, ..., c_N ∈ R.

Can be seen as generalized covariances.
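
As a rough numerical illustration of the definition (not part of the original slides), the two conditions can be checked on a finite sample by building the Gram matrix and inspecting its eigenvalues. The Gaussian kernel, the `gamma` value and the helper names are assumptions chosen for this sketch.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel, used here only as a well-known positive definite example."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def gram_matrix(objects, kernel):
    """K[i, j] = kernel(objects[i], objects[j])."""
    n = len(objects)
    return np.array([[kernel(objects[i], objects[j]) for j in range(n)]
                     for i in range(n)])

def is_symmetric_psd(K, tol=1e-10):
    """True if K is symmetric and has no eigenvalue below -tol,
    which is equivalent to c^T K c >= 0 for every real vector c."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))          # six toy objects with three features each
K = gram_matrix(X, rbf_kernel)
psd = is_symmetric_psd(K)
```

A check on one sample cannot prove that a function is a kernel, but a single negative eigenvalue is enough to show that it is not.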


Kernel methods

Interpretation of kernels

Suppose an object x has an implicit feature representation φ(x) ∈ F.
A kernel function can be seen as a dot product in this feature space:

k(x, x') = \langle \phi(x), \phi(x') \rangle

[Figure: the feature map φ takes objects from X to the feature space F.]

Linear models in this feature space F can be made:

y(x) = w^T \phi(x) = \sum_n a_n k(x_n, x)
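
The equality between the weight-vector form and the kernel form can be verified numerically. The sketch below assumes regularized least squares as the learner, the identity as feature map (so the kernel is linear) and an arbitrary regularization value `lam`; none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))     # rows are phi(x_n); phi is the identity here
y = rng.normal(size=20)
lam = 0.1                        # regularization strength (assumed value)

# Dual solution: one coefficient a_n per training object.
K = X @ X.T                      # linear kernel k(x_n, x_m) = <x_n, x_m>
a = np.linalg.solve(K + lam * np.eye(len(y)), y)

# Weight vector implied by the dual coefficients: w = sum_n a_n phi(x_n).
w = X.T @ a

x_new = rng.normal(size=4)
y_primal = w @ x_new             # w^T phi(x)
y_dual = a @ (X @ x_new)         # sum_n a_n k(x_n, x)
```

Only the second form generalizes to objects without an explicit feature vector, since it needs nothing but kernel evaluations.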


Kernel methods

Many kernel methods exist
Examples of popular kernel methods:
Support vector machine (SVM)
Regularized least squares (RLS)
Kernel principal component analysis (KPCA)
The learning algorithm is independent of the kernel representation!


Kernel methods

Kernels for (protein) sequences

Spectrum kernel (SK)
The SK considers the number of k-mers m that two sequences s_i and s_j have in common:

SK_k(s_i, s_j) = \sum_{m \in \Sigma^k} N(m, s_i) N(m, s_j)

with N(m, s) the number of occurrences of k-mer m in sequence s.
To predict structure, function... of DNA, RNA or proteins.
A discriminative alternative to Hidden Markov Models.
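
A minimal sketch of the spectrum kernel formula, assuming sequences are plain Python strings and k-mers are counted with overlapping windows. The function names are illustrative.

```python
from collections import Counter

def kmer_counts(s, k):
    """N(m, s): number of occurrences of every k-mer m in sequence s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s_i, s_j, k=3):
    """Sum over all k-mers m of N(m, s_i) * N(m, s_j)."""
    counts_i = kmer_counts(s_i, k)
    counts_j = kmer_counts(s_j, k)
    # k-mers missing from either sequence contribute zero to the sum,
    # so iterating over the k-mers of s_i is sufficient.
    return sum(n * counts_j[m] for m, n in counts_i.items())

value = spectrum_kernel("ABAB", "BABA", k=2)
```

Here "ABAB" contains AB twice and BA once, while "BABA" contains BA twice and AB once, so the kernel value is 2·1 + 1·2 = 4.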


Kernel methods

Kernels for graphs (1)
Graph
A graph is a set of objects, called vertices (or nodes), that are connected through edges.

Graphs can show the structure of an object or interactions between different objects.

Graphs are important in bioinformatics!

Kernel methods


Graph kernel
Constructing a similarity between graphs.

Based on performing a random walk on both graphs and counting the number of matching walks.
Usually very computationally demanding!

[Figures: example graphs in chemoinformatics and in structural bioinformatics.]
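
One way to count matching walks is through the direct product graph. The sketch below assumes unlabeled, undirected graphs given as adjacency matrices, a geometric down-weighting `decay` of longer walks and a cut-off `max_length`. These are simplifications for illustration, not the kernel used in the slides.

```python
import numpy as np

def random_walk_kernel(A1, A2, decay=0.1, max_length=4):
    """Weighted count of pairs of equal-length walks in two unlabeled graphs."""
    A_prod = np.kron(A1, A2)        # adjacency matrix of the direct product graph
    ones = np.ones(A_prod.shape[0])
    walks = ones.copy()             # length 0: one trivial walk per vertex pair
    total = 0.0
    for length in range(max_length + 1):
        total += decay ** length * (ones @ walks)
        walks = A_prod @ walks      # extend every walk by one step
    return total

# Two copies of a single-edge graph
A_edge = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
k_edge = random_walk_kernel(A_edge, A_edge, decay=0.5, max_length=2)
```

The product graph has as many vertices as the product of the two vertex counts, which gives a feel for why such kernels are computationally demanding.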

Kernel methods


Diffusion kernel
Constructing a similarity between vertices within the same graph.

Also based on performing a random walk on a graph.
Captures the long-range relationships between vertices.
Inspired by the heat equation. The kernel quantifies how quickly ‘heat’ can spread from one node to another.
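
A small sketch of a diffusion kernel, assuming the common matrix-exponential form K = exp(-βL) with L the graph Laplacian and β a diffusion parameter. Both the form and the value of `beta` are assumptions, as the slides do not give the formula.

```python
import numpy as np

def diffusion_kernel(A, beta=1.0):
    """K = exp(-beta * L), with L = D - A the graph Laplacian of an undirected graph."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)          # L is symmetric
    # Matrix exponential through the eigendecomposition of L
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

# Path graph 0 - 1 - 2
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
K = diffusion_kernel(A, beta=1.0)
```

In this toy graph the two end vertices are not connected directly, yet K[0, 2] is positive: heat reaches vertex 2 through vertex 1, only more slowly than it reaches vertex 1 itself.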


Kernel methods

Kernels for fingerprints

Objects that can be described by a long binary vector x can be represented by the Tanimoto kernel:

K_{Tan}(x_m, x_n) = \frac{\langle x_m, x_n \rangle}{\langle x_m, x_m \rangle + \langle x_n, x_n \rangle - \langle x_m, x_n \rangle}

[Figure: fingerprint representation of an object.]
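
The Tanimoto kernel is short enough to write out directly. In this sketch fingerprints are assumed to be 0/1 sequences of equal length, and the value returned for two all-zero fingerprints is a convention chosen here.

```python
import numpy as np

def tanimoto_kernel(x_m, x_n):
    """<x_m, x_n> / (<x_m, x_m> + <x_n, x_n> - <x_m, x_n>) for binary vectors."""
    x_m = np.asarray(x_m, dtype=float)
    x_n = np.asarray(x_n, dtype=float)
    shared = x_m @ x_n                       # bits set in both fingerprints
    denom = x_m @ x_m + x_n @ x_n - shared   # bits set in at least one
    # Two all-zero fingerprints: return 0 by convention (assumption of this sketch)
    return float(shared / denom) if denom > 0 else 0.0

value = tanimoto_kernel([1, 1, 0, 1], [1, 0, 0, 1])
```

For binary vectors this is the number of shared bits divided by the number of bits set in either fingerprint, here 2 / 3.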


Learning relations

Kernels for pairs of objects

Problem statement
Predict the binding interaction between a given protein and a ligand (small molecule).

The problem deals with two types of objects:
Proteins (graph kernel of structure, sequence kernel, fingerprints...)
Ligands (fingerprints, graph kernel...)
The label is for a pair of objects.

[Figure panels: Learning; Molecular docking.]


Learning relations

Kernels for pairs of objects

Pairwise kernel
Combine the kernel matrices of the individual objects to construct a kernel matrix for pairs of objects.

Starting from individual kernels for the proteins and ligands:

[Figure: data set of (protein, ligand) pairs → object kernels → pairwise kernel → learning algorithm (SVM, RLS, ...)]

[Poster excerpt shown in the figure (Michiel Stock, Willem Waegeman, Bernard De Baets), partly cut off: by optimizing a ranking loss, the algorithms can also be used for conditional ranking. The framework is suited for bioinformatics challenges: efficient learning process; can handle complex objects (graphs, trees, sequences...); ability to deal with information retrieval problems.]
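
One common way to combine two object kernels into a kernel on pairs is the Kronecker product kernel. The sketch below assumes every protein is paired with every ligand and uses made-up similarity values; `pair_index` is a hypothetical helper for locating a pair in the resulting matrix.

```python
import numpy as np

K_prot = np.array([[1.0, 0.2],
                   [0.2, 1.0]])           # 2 proteins (made-up similarities)
K_lig = np.array([[1.0, 0.5, 0.1],
                  [0.5, 1.0, 0.3],
                  [0.1, 0.3, 1.0]])       # 3 ligands (made-up similarities)

# Pairwise kernel over all (protein, ligand) pairs:
# K_pair[(p, l), (p', l')] = K_prot[p, p'] * K_lig[l, l']
K_pair = np.kron(K_prot, K_lig)

def pair_index(p, l, n_ligands=K_lig.shape[0]):
    """Row/column of the pair (protein p, ligand l) in the Kronecker ordering."""
    return p * n_ligands + l

entry = K_pair[pair_index(0, 1), pair_index(1, 2)]
```

Two pairs are similar under this kernel only if both the proteins and the ligands are similar. The resulting matrix can be passed to any kernel method, at the cost of a matrix whose size grows with the number of pairs.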

Learning relations

Conditional ranking (1)

Motivation
Suppose one is not particularly interested in the exact value of the interaction but in the order of the proteins for a given ligand.

[Figure: database objects ordered by relevance ("More relevant") for Query 1 and Query 2.]


Learning relations

Conditional ranking (2)

Based on a graph description, with e a pair of objects.
Train the model:

h(e) = \langle w, \Phi(e) \rangle = \sum_{\bar{e} \in E} a_{\bar{e}} K^{\Phi}(e, \bar{e})

using the algorithm:

A(T) = \arg\min_{h \in H} L(h, T) + \lambda \|h\|_H^2

where we use a ranking loss:

L(h, T) = \sum_{v \in V} \sum_{e, \bar{e} \in E_v} (y_e - y_{\bar{e}} - h(e) + h(\bar{e}))^2

[Figure 1: Example of a multi-graph. Figure reproduced from (Pahikkala et al., 2010).]

[Text excerpt shown next to the figure, cut off at the margin: "The proposed framework is based on the Kronecker product ke... implicit joint feature representations of queries and the sets of ob... It has been proposed independently by three ... modeling pairwise inputs in different application domains (Basilico ... et al. 2004, Ben-Hur et al. 2005)."]
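
The ranking loss can be evaluated with a few lines of NumPy. This sketch assumes labels and predictions are stored per edge, that `edges_per_query` (a hypothetical structure) maps each query vertex v to the indices of its edges E_v, and that the inner sum runs over all ordered pairs of edges.

```python
import numpy as np

def conditional_ranking_loss(y, h, edges_per_query):
    """Sum over queries v and edge pairs (e, e_bar) in E_v of
    (y_e - y_ebar - h(e) + h(e_bar))^2."""
    y = np.asarray(y, dtype=float)
    h = np.asarray(h, dtype=float)
    loss = 0.0
    for edge_idx in edges_per_query.values():
        idx = np.asarray(edge_idx)
        r = y[idx] - h[idx]              # residual y_e - h(e) per edge of this query
        diff = r[:, None] - r[None, :]   # residual differences for all ordered pairs
        loss += np.sum(diff ** 2)
    return float(loss)

loss = conditional_ranking_loss(y=[1.0, 0.0], h=[0.5, 0.5],
                                edges_per_query={"C": [0, 1]})
```

The loss only compares edges that share a query, and it is unchanged when a constant is added to all predictions for that query. This is why it targets the order of the objects rather than the exact interaction values.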

Case studies Enzyme function prediction

Predicting enzyme function

Problem statement
Predict the function (EC number) of an enzyme using structural
information of the active site.
Data:
1730 enzymes with 21 different functions
four different structural similarities:
CavBase
maximum common subgraph
labeled point cloud superposition
fingerprints

[Figure: active site of an enzyme.]



EC numbers

EC number
A functional label of an enzyme, based on the reaction that is catalyzed.

Example: EC 2.7.6.1 = ribose-phosphate diphosphokinase



Defining catalytic similarity
Catalytic similarity
The catalytic similarity is the number of successive equal digits in the EC number between two enzymes, starting from the first digit.

[Figure: graph of the enzymes EC 2.7.7.34, EC 2.7.1.12, EC 2.7.7.12, EC 4.2.3.90, EC 4.6.1.11 and an unannotated enzyme EC ?.?.?.?, with edges labeled by catalytic similarity (values 0 to 3).]
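
The definition of catalytic similarity translates directly into code. This is a minimal sketch, assuming EC numbers are given as strings with four dot-separated fields and an optional "EC" prefix.

```python
def ec_digits(ec):
    """Split an EC number such as 'EC 2.7.7.34' into its four fields."""
    return ec.replace("EC", "").strip().split(".")

def catalytic_similarity(ec_a, ec_b):
    """Number of successive equal EC digits, starting from the first digit."""
    similarity = 0
    for a, b in zip(ec_digits(ec_a), ec_digits(ec_b)):
        if a != b:
            break                # stop at the first mismatch
        similarity += 1
    return similarity

sim = catalytic_similarity("EC 2.7.7.34", "EC 2.7.7.12")
```

The first three digits of EC 2.7.7.34 and EC 2.7.7.12 agree, giving a similarity of 3. A mismatch in the first digit gives 0 regardless of the later digits.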

