Protein structure alignment beyond spatial proximity (3DSIG 2012)


Background / Purpose: Automatically constructing an accurate protein structure alignment remains challenging, especially when the proteins to be aligned are distantly related. Main conclusion: We present a novel method, DeepAlign, which aligns two protein structures using not only the spatial proximity of equivalent residues (after rigid-body superposition), but also evolutionary distance and hydrogen-bonding similarity.


Protein structure alignment
beyond spatial proximity
3DSIG 2012
Jul 14, Long Beach, California
Sheng WANG
Toyota Technological Institute at Chicago

Related work on pairwise structure alignment
[Figure with five numbered groups of methods; recoverable labels follow]
1, 2: Almost all the structure alignment tools (TMalign, fr-TMalign)
3: DALI, MUSTANG
4: MAMMOTH, Vorolign, YAKUSA
5: FATCAT, CE, MATT, FlexProt
Note: for all proteins we align, only the C-alpha atoms are considered.

Our contribution
Design a scoring function
• local sub-structure similarity
• evolutionary and functional information
• angular similarity for hydrogen bonding
Employ a fast and efficient search algorithm
• start from highly similar local sub-structure pairs (similar fragment pairs, SFPs)
• recruit new SFPs that satisfy spatial constraints
• finally, refine the alignment within a bounded area

Scoring Function
Score(i,j) = (max(0, BLOSUM(i,j)) + CLESUM(i,j)) * v(i,j) * d(i,j)
Local similarity: max(0, BLOSUM(i,j)) + CLESUM(i,j). Global similarity: v(i,j) * d(i,j).
CLESUM is the local structure substitution matrix;
BLOSUM is the amino acid substitution matrix;
v(i,j) measures the angular similarity using three vectors;
d(i,j) measures the spatial proximity of two aligned residues.
Note: both v(i,j) and d(i,j) are calculated after rigid-body superposition.
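A minimal sketch of how the four terms combine for one residue pair, assuming BLOSUM(i,j), CLESUM(i,j), v(i,j) and d(i,j) have already been looked up or computed elsewhere. The function name `deepalign_pair_score` and the sample values are illustrative only and are not taken from the DeepAlign program.

```python
def deepalign_pair_score(blosum: float, clesum: float, v: float, d: float) -> float:
    """Combine the four per-pair terms as in the Score(i,j) formula.

    blosum, clesum: substitution scores for the residue pair (looked up elsewhere).
    v, d: angular similarity and spatial proximity, both computed after
          rigid-body superposition (their exact definitions are not shown here).
    """
    # A negative BLOSUM score is clamped to zero before CLESUM is added.
    local_similarity = max(0.0, blosum) + clesum
    # The local term is then scaled by the two superposition-dependent terms.
    return local_similarity * v * d


# Hypothetical values: a negative BLOSUM score does not lower the result.
print(deepalign_pair_score(blosum=-2, clesum=30, v=0.5, d=0.5))  # 7.5
print(deepalign_pair_score(blosum=4, clesum=30, v=0.5, d=0.5))   # 8.5
```

Because the local term is multiplied by v(i,j) and d(i,j), a pair that superimposes poorly contributes little even when its substitution scores are high.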

[Figure: the transformation from 3D structure to 1D CLE strings. (A) Angles θ, θ' and τ defined on residues i-2, i-1, i and i+1. (B) Example CLE strings (legend: alpha, beta, coil):]
RRFEDECCGAIHHHHHHHHHHHHHHHOMICQEECBLDFQNBFEEEEFEQNNGCP
LDDEEEDEEENOGCEDEEEEEEPKKOGFEDPLDEQBGCCR
S. Wang, W.M. Zheng, "CLePAPS: Fast Pair Alignment of Protein Structures Based on Conformational Letters." JBCB, 2008
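The angles θ, θ' and τ in the figure can be computed from four consecutive C-alpha coordinates. The sketch below assumes θ is the bend at residue i-1, θ' the bend at residue i, and τ the torsion over all four atoms; this assignment, the sign convention of τ, and all function names are assumptions for illustration. Mapping the angle triple to a conformational letter needs the trained CLE model, which is not shown.

```python
import math


def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def _dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def _norm(p):
    return math.sqrt(_dot(p, p))


def bend_angle(a, b, c):
    """Angle at b (degrees) between the directions b->a and b->c."""
    u, w = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, w) / (_norm(u) * _norm(w))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def torsion_angle(a, b, c, d):
    """Signed dihedral (degrees) over a-b-c-d; sign convention is an assumption."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_unit = tuple(x / _norm(b2) for x in b2)
    m1 = _cross(n1, b2_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def local_angles(ca_im2, ca_im1, ca_i, ca_ip1):
    """Return (theta, theta_prime, tau) for C-alpha atoms i-2, i-1, i, i+1."""
    theta = bend_angle(ca_im2, ca_im1, ca_i)
    theta_prime = bend_angle(ca_im1, ca_i, ca_ip1)
    tau = torsion_angle(ca_im2, ca_im1, ca_i, ca_ip1)
    return theta, theta_prime, tau


# A planar zigzag of four points: both bends are right angles and the chain is trans.
theta, theta_prime, tau = local_angles((0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0))
print(round(theta), round(theta_prime), round(abs(tau)))
```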

CLESUM: Conformational LEtter SUbstitution Matrix
M_ij = 20 * log2( P_ij / (P_i * P_j) )
Note: CLESUM is constructed using FSSP representatives.
[Figure: the CLESUM matrix, annotated with "typical helix", "typical sheet" and "evolutionary + geometric"]
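A small worked example of the log-odds formula for one matrix entry, assuming P_ij is the observed frequency of letters i and j being aligned and P_i, P_j are their background frequencies. The function name and the probabilities are made up for illustration; whether the published matrix rounds its entries is not stated here, so no rounding is applied.

```python
import math


def log_odds_entry(p_ij: float, p_i: float, p_j: float, scale: float = 20.0) -> float:
    """One substitution-matrix entry: scale * log2(P_ij / (P_i * P_j))."""
    return scale * math.log2(p_ij / (p_i * p_j))


# Aligned four times more often than expected by chance: positive entry.
print(log_odds_entry(p_ij=0.25, p_i=0.25, p_j=0.25))      # 40.0
# Aligned four times less often than expected by chance: negative entry.
print(log_odds_entry(p_ij=0.015625, p_i=0.25, p_j=0.25))  # -40.0
```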

Same CLESUM, different BLOSUM
(A) correct
CLE -> HHHHHHH
AMI -> EGHILLI
AMI -> DGHVLLV
CLE -> HHHHHHH
(B) incorrect
CLE -> HHHHHHH
AMI -> GHILLIQ
AMI -> DGHVLLV
CLE -> HHHHHHH
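To make the point of panels (A) and (B) concrete, the sketch below scores both alignments with a toy match/mismatch scheme (+1 / -1). This toy scheme is only a stand-in for CLESUM and BLOSUM, chosen so the example stays self-contained; real matrix values would differ, but the pattern is the same: the conformational letters cannot tell the two alignments apart, while the amino acids can.

```python
def toy_score(seq_a: str, seq_b: str, match: int = 1, mismatch: int = -1) -> int:
    """Sum a toy match/mismatch score over an ungapped alignment (not BLOSUM/CLESUM)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned strings must have equal length")
    return sum(match if x == y else mismatch for x, y in zip(seq_a, seq_b))


# Conformational letters: identical in both panels.
cle_a = toy_score("HHHHHHH", "HHHHHHH")
cle_b = toy_score("HHHHHHH", "HHHHHHH")

# Amino acids: panel (A) versus the shifted alignment in panel (B).
ami_a = toy_score("EGHILLI", "DGHVLLV")
ami_b = toy_score("GHILLIQ", "DGHVLLV")

print(cle_a, cle_b)  # 7 7
print(ami_a, ami_b)  # 1 -5
```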

Why Max and Add?
max(0, CLESUM(i,j) + BLOSUM(i,j))
[Table: sign of BLOSUM (+ / -) against sign of CLESUM (+ / -), with cells marked √, o, × and o]
Note: log( C_ij / (C_i C_j) ) + log( B_ij / (B_i B_j) ) = log( C_ij B_ij / (C_i C_j B_i B_j) )
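The note can be checked numerically: adding two log-odds scores equals the log-odds of the product of the two ratios, and the max then discards pairs whose combined evidence is negative. The sketch below uses made-up probabilities; the function names are illustrative.

```python
import math


def log_odds(p_joint: float, p_a: float, p_b: float) -> float:
    """Natural-log odds ratio of observing a pair versus chance."""
    return math.log(p_joint / (p_a * p_b))


def combined_score(clesum: float, blosum: float) -> float:
    """Add the two substitution scores, then clamp negative sums to zero."""
    return max(0, clesum + blosum)


# Hypothetical frequencies for one structural-letter pair (C) and one amino-acid pair (B).
c_ij, c_i, c_j = 0.02, 0.10, 0.08
b_ij, b_i, b_j = 0.01, 0.05, 0.06

lhs = log_odds(c_ij, c_i, c_j) + log_odds(b_ij, b_i, b_j)
rhs = math.log((c_ij * b_ij) / (c_i * c_j * b_i * b_j))
print(math.isclose(lhs, rhs))  # True

print(combined_score(3, 2))   # 5: both matrices agree
print(combined_score(3, -5))  # 0: negative sum is clamped
print(combined_score(-1, 4))  # 3: one positive score outweighs the other
```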

Why use angular similarity?
[Figure: (A) incorrect alignment with smaller RMSD; (B) correct alignment with larger RMSD]

The three vectors used in the vect-score v(i,j).
Angular similarity is measured by the deviation of these three vectors.

Search Algorithm
[Flowchart: SFP_long and SFP_short lists; sort both SFP lists; DeepAlign-score]
Note:
[1] TopK > TopJ > M
[2] SFP stands for Similar Fragment Pair, scored using ∑ max(0, CLESUM(i,j) + BLOSUM(i,j))
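A minimal sketch of the SFP score in note [2] and of sorting an SFP list by that score. It assumes the per-position CLESUM and BLOSUM values for a fragment pair have already been looked up; the fragment names and numbers are hypothetical.

```python
def sfp_score(clesum_scores, blosum_scores):
    """Sum of max(0, CLESUM + BLOSUM) over the aligned positions of a fragment pair."""
    if len(clesum_scores) != len(blosum_scores):
        raise ValueError("both score lists must cover the same positions")
    return sum(max(0, c + b) for c, b in zip(clesum_scores, blosum_scores))


def rank_sfps(scored_sfps):
    """Sort (name, score) pairs from highest to lowest score."""
    return sorted(scored_sfps, key=lambda item: item[1], reverse=True)


# Two hypothetical fragment pairs of length 3.
candidates = [
    ("sfp_a", sfp_score([3, -5, 2], [1, 2, -1])),  # 4 + 0 + 1 = 5
    ("sfp_b", sfp_score([4, 4, 4], [0, -1, 2])),   # 4 + 3 + 6 = 13
]
ranked = rank_sfps(candidates)
print(ranked)  # [('sfp_b', 13), ('sfp_a', 5)]
```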

From TopK coarse-grained to TopJ fine-grained initial alignment
[Figure: SFP_long list ordered by score rank, showing SFPs ranked 1 to 5]
Example: TopK = 5; TopJ = 1
Top2 SFP: # of consistent SFPs = 4. Top1 SFP: # of consistent SFPs = 1.
Top2 SFP is globally supported by three other SFPs, while Top1 SFP is supported only by itself.
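The selection step in this example can be sketched as counting, for each of the TopK SFPs, how many SFPs are consistent with it, and keeping the TopJ with the most support. How consistency is actually tested (it depends on the superposition each SFP defines) is not shown on the slide, so the sketch takes it as a caller-supplied predicate; the toy consistency pairs below are invented to reproduce the counts in the example.

```python
def support_counts(sfps, is_consistent):
    """For each SFP, count the SFPs consistent with it (itself included)."""
    return {a: sum(1 for b in sfps if is_consistent(a, b)) for a in sfps}


def pick_initial_sfps(sfps, is_consistent, top_j):
    """Keep the top_j SFPs with the most support; ties keep the score-rank order."""
    counts = support_counts(sfps, is_consistent)
    by_support = sorted(sfps, key=lambda s: counts[s], reverse=True)
    return by_support[:top_j], counts


# SFPs identified by score rank; TopK = 5.
top_k = [1, 2, 3, 4, 5]

# Hypothetical consistency relation: rank 2 agrees with ranks 3, 4 and 5.
toy_pairs = {(2, 3), (2, 4), (2, 5)}


def toy_consistent(a, b):
    return a == b or (a, b) in toy_pairs or (b, a) in toy_pairs


selected, counts = pick_initial_sfps(top_k, toy_consistent, top_j=1)
print(counts[1], counts[2])  # 1 4
print(selected)              # [2]
```

The best-scoring SFP is not necessarily the best seed: here rank 2 wins because it is supported by three other SFPs.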

Refine each fine-grained initial alignment in three iterations
[Flowchart: first update (d1), second update (d2), third update (d3), with d1 > d2 > d3; SFP_short list ordered by score rank (high -> low); then final refinement and output alignment]

Final refinement on DeepAlign-score, restricted to a bounded area
(1) take the refined fine-grained alignment
(2) define a bounded area around the alignment
(3) use dynamic programming to find the path with maximal DeepAlign-score within the bounded area
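Step (3) can be sketched as a dynamic program in which residue pairs may only be matched inside the bounded area. The sketch below makes several assumptions not stated on the slide: gaps cost nothing, the bounded area is given as a predicate `in_band(i, j)`, and the score matrix is precomputed. For brevity it fills the whole table and only restricts where matches may occur; an efficient implementation would visit only the cells of the bounded area.

```python
def bounded_dp(score, in_band):
    """Maximize the summed pair score, matching (i, j) only where in_band(i, j) holds.

    score: n x m matrix of pair scores (e.g. DeepAlign-score values).
    Gaps are free in this sketch (an assumption, not the DeepAlign setting).
    Returns (best_total, list of matched (i, j) pairs).
    """
    n, m = len(score), len(score[0])
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            value = max(best[i - 1][j], best[i][j - 1])  # leave i or j unmatched
            if in_band(i - 1, j - 1):
                value = max(value, best[i - 1][j - 1] + score[i - 1][j - 1])
            best[i][j] = value

    # Trace back from the corner to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if in_band(i - 1, j - 1) and best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Hypothetical 3 x 3 score matrix with one strong off-diagonal pair.
toy_scores = [[5, 1, 0],
              [1, 5, 20],
              [0, 1, 5]]

# Bounded area = cells within a given distance of the diagonal alignment.
narrow = bounded_dp(toy_scores, lambda i, j: abs(i - j) <= 0)
wide = bounded_dp(toy_scores, lambda i, j: abs(i - j) <= 1)
print(narrow)  # (15, [(0, 0), (1, 1), (2, 2)])
print(wide)    # (25, [(0, 0), (1, 2)])
```

Widening the bounded area lets the refinement reach better-scoring pairs near the initial alignment, at the cost of a larger search space.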

Result on manually-curated data
• CDD (Conserved Domain Database): contains 3591 conserved domain structure alignments.
• MALIDUP: contains 241 alignments for homologous domains originating from internal duplication.
• MALISAM: contains 130 alignments for structurally analogous motifs in proteins.

Result on discrimination data
• We use SABmark to test the ability to identify distant homologs (super-family) and structural analogs (fold) among negative data (pairs with no structural similarity).
[Figure: two panels, super-family and fold, each highlighting DeepAlign]

One example
Superimposition of domains d1pqsa_ and d1poh__ from MALISAM: (A) TMalign, (B) DeepAlign optimizing TM-score and (C) DeepAlign.
TM-scores shown in the figure, in order: 0.288, 0.514, 0.473

Thank you !!
Please find the executable program of DeepAlign at:
http://ttic.uchicago.edu/~jinbo/DeepAlign/DeepAlign_exe_V1.00.tar.gz
