Seyed Mohammad Motevalli
December 2013
Outline
 Introduction to bioinformatics
 Biological databases
 Sequence alignment and their algorithms
 Structural prediction

 Web-based tools
 Stand-alone software
Introduction to bioinformatics
 What is bioinformatics?
Bioinformatics is an interdisciplinary research area at the interface between
computer science and biological science.
Introduction to bioinformatics
 What are the differences between bioinformatics and informatics?
 What are the differences between bioinformatics and computational biology?
 What is an algorithm?
 What is proteomics?
Biological databases
 Database

A database is a computerized archive used to store and organize data in such a
way that information can be retrieved easily via a variety of search criteria
 Entry
Each record, or entry, contains a number of fields that hold the actual data items
 Value
A particular piece of information held in a field
 Making a query
To retrieve a particular record from the database, a user specifies a value to
be found in a particular field, and the computer retrieves the whole data record
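The entry, field, value, and query ideas above can be sketched with a toy in-memory table. This is only an illustration: the accession numbers, field names, and the `query` helper are made up and do not correspond to any real database interface.

```python
# Toy "database": each entry (record) is a dict mapping field names to values.
# All accessions and field names here are hypothetical.
records = [
    {"accession": "X00001", "organism": "Homo sapiens", "molecule": "DNA", "length": 1200},
    {"accession": "X00002", "organism": "Mus musculus", "molecule": "mRNA", "length": 850},
    {"accession": "X00003", "organism": "Homo sapiens", "molecule": "mRNA", "length": 430},
]


def query(db, field, value):
    """Return every whole record whose `field` holds exactly `value`."""
    return [entry for entry in db if entry.get(field) == value]


# Making a query: give a value for one field, get back the complete records.
for entry in query(records, "organism", "Homo sapiens"):
    print(entry["accession"], entry["molecule"], entry["length"])
```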
Biological databases
 Primary databases
 GenBank (NCBI) www.ncbi.nlm.nih.gov
 EMBL www.ebi.ac.uk/embl/index.html
 DDBJ www.ddbj.nig.ac.jp
 Secondary databases
 ExPASy http://web.expasy.org
 PIR http://pir.georgetown.edu/pirwww/pirhome3.shtml
 Swiss-Prot www.ebi.ac.uk/swissprot/access.html
Biological databases
 Interconnection between Biological Databases
Biological databases
 Pitfalls of biological databases
 Redundant sequences
The causes of redundancy include repeated submission of identical or
overlapping sequences by the same or different authors, revision of
annotations, and dumping of expressed sequence tag (EST) data
 Non-redundant sequences (RefSeq)
Biological databases
 Further databases
 NCBI www.ncbi.nlm.nih.gov
 UniProt http://www.uniprot.org
 ExPASy http://web.expasy.org
 PIR http://pir.georgetown.edu/
 Swiss-Prot http://swissmodel.expasy.org/
 PDB http://www.rcsb.org/pdb/home/home.do
 Enzyme structure http://www.ebi.ac.uk/thornton-srv/databases/enzymes
Sequence alignment and their
algorithms
 Pairwise sequence alignment
Pairwise sequence alignment is the process of aligning two sequences and is
the basis of database similarity searching and multiple sequence alignment

 Sequence similarity versus sequence homology
When two sequences are descended from a common evolutionary origin, they
are said to have a homologous relationship or share homology. A related but
different term is sequence similarity, which is the percentage of aligned
residues that are similar in physicochemical properties such as size, charge,
and hydrophobicity

 Sequence similarity versus sequence identity
In a protein sequence alignment, sequence identity refers to the percentage of
matches of the same amino acid residues between two aligned sequences.
Similarity refers to the percentage of aligned residues that have similar
physicochemical characteristics and can be more readily substituted for each
other
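The difference between identity and similarity can be made concrete with a small calculation over two already-aligned protein sequences. This is a minimal sketch with two assumptions: the grouping of "similar" residues below is a hypothetical simplification (real tools usually take similarity from a substitution matrix), and percentages are computed over the columns where neither sequence has a gap.

```python
# Hypothetical grouping of amino acids with similar physicochemical properties.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("STNQ"), set("AG")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gap columns are not counted in this sketch
        aligned += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs")
    return 100 * identical / aligned, 100 * similar / aligned


print(identity_and_similarity("KLVDE-", "RLIDDA"))
```

Similarity is never lower than identity, because every identical pair is also counted as a similar pair.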
Sequence alignment and their
algorithms
 Sequence alignment strategies
 Global alignment

In global alignment, two sequences to be aligned are assumed to be generally
similar over their entire length. Alignment is carried out from beginning to end
of both sequences to find the best possible alignment across the entire length
between the two sequences
 Local alignment
Local alignment does not assume that the two sequences in question are
similar over their entire length. It finds only the local regions with the highest
level of similarity between the two sequences and aligns these regions without
regard for the alignment of the rest of the sequences
Sequence alignment and their
algorithms
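Linear gap penalty: the costs of creating and extending a gap are the same
$W(I) = gI$, where $g$ is the cost of each gap position and $I$ is the length of the gap

Affine gap penalty: creating a gap and extending it have different costs
$W(I) = g_{open} + g_{ext}(I - 1)$, with $g_{open} > g_{ext}$

Alignment score: the sum of the substitution scores of the aligned residue pairs minus the gap penalties
$S = \sum S_{i,j} - \sum W(I)$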
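The two gap penalty schemes can be compared numerically. The sketch below uses illustrative values only (`g=2`, `g_open=10`, `g_ext=1` are assumptions, not recommended settings), chosen so that opening a gap costs more than extending one, which is the usual choice.

```python
def linear_gap_penalty(length, g=2):
    """Linear scheme: every gap position costs the same amount g."""
    return g * length


def affine_gap_penalty(length, g_open=10, g_ext=1):
    """Affine scheme: the first gap position costs g_open, each further one g_ext."""
    if length <= 0:
        return 0
    return g_open + g_ext * (length - 1)


for gap_length in (1, 2, 5, 10):
    print(gap_length, linear_gap_penalty(gap_length), affine_gap_penalty(gap_length))
```

With these values a single long gap is much cheaper under the affine scheme than several short gaps of the same total length, which is why affine penalties tend to group gap positions together.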
Sequence alignment and their
algorithms
 Alignment algorithms and methods
 The dot matrix method
 The word method
 The dynamic programming method
Sequence alignment and their
algorithms
 Alignment Algorithms
 The dot matrix method

The most basic sequence alignment method is the dot matrix method, also
known as the dot plot method
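A dot matrix is easy to build by hand. The following is a minimal sketch, assuming the simplest variant in which a dot is placed wherever two single residues are identical (no window or threshold filtering); the function names and example sequences are illustrative.

```python
def dot_matrix(seq1, seq2):
    """matrix[i][j] is True when residue i of seq1 equals residue j of seq2."""
    return [[a == b for b in seq2] for a in seq1]


def render(seq1, seq2):
    """Draw the dot matrix as text: '*' marks a match, '.' a mismatch."""
    matrix = dot_matrix(seq1, seq2)
    lines = ["  " + " ".join(seq2)]
    for residue, row in zip(seq1, matrix):
        lines.append(residue + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Regions of similarity show up as diagonal runs of dots; the scattered off-diagonal dots are chance matches.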
Sequence alignment and their
algorithms
 Alignment Algorithms
 The word method

It works by finding short stretches of identical or nearly identical letters in
two sequences. These short strings of characters are called words, which
are similar to the windows used in the dot matrix method
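The first step of the word method can be sketched as an index lookup. This is a simplified illustration that finds only identical words (nearly identical words are not handled), and the word length `k=3` and helper names are assumptions made for the example.

```python
from collections import defaultdict


def word_index(seq, k):
    """Map every word of length k in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def word_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where the same word occurs in both."""
    index = word_index(subject, k)
    hits = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            hits.append((i, j))
    return hits


print(word_hits("ACGTAC", "TTACGT"))
```

Each hit is a candidate seed that a full method would then try to extend into a longer alignment.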
Sequence alignment and their
algorithms
 Alignment Algorithms
 The dynamic programming method

Dynamic programming is a method that determines optimal alignment by
matching two sequences for all possible pairs of characters between the
two sequences
Sequence alignment and their
algorithms
 Alignment Algorithms
 The dynamic programming method
 Global alignment

The classical global pairwise alignment algorithm using dynamic
programming is the Needleman–Wunsch algorithm. In this algorithm, an
optimal alignment is obtained over the entire lengths of the two sequences
 Local alignment

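The first application of dynamic programming in local alignment is the
Smith–Waterman algorithm. In this algorithm, positive scores are
assigned for matching residues, and any cell of the scoring matrix whose
score would fall below zero is set to zero, so the matrix holds no negative
scores and a local alignment can start anywhere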
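Both dynamic programming variants can be sketched in a few lines. The code below is a minimal illustration that returns only the optimal score (no traceback), assuming a simple match/mismatch scheme and a linear gap penalty; the numeric values are arbitrary choices for the example, whereas real protein alignments would use a substitution matrix.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment: best score over the full length of both sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # leading gaps are penalized in global alignment
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment: best score of any pair of subsequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Cells never drop below zero, so an alignment can restart anywhere.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


print(needleman_wunsch_score("ACGT", "AGT"))
print(smith_waterman_score("TTTGATTACATTT", "CCCGATTACACCC"))
```

The only structural differences are the initialization of the first row and column, the floor at zero, and where the final score is read from (the last cell versus the highest cell).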
Sequence alignment and their
algorithms
 substitution matrix
 PAM matrices (point accepted mutation)

The PAM matrices were derived from the evolutionary divergence between
closely related sequences of the same cluster. One PAM unit is defined as
1% of the amino acid positions having changed. Because very closely
related homologs were used, the observed mutations were not expected to
significantly change the common function of the proteins
Sequence alignment and their
algorithms
 substitution matrix
 BLOSUM matrices

The blocks amino acid substitution matrices (BLOSUM) are a series of matrices, all of
which are derived from direct observation of every possible amino acid
substitution in multiple sequence alignments
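To show what "derived from direct observation" can mean in practice, here is a rough sketch of a BLOSUM-style log-odds calculation on a toy ungapped block. It assumes the commonly described recipe (count residue pairs per column, compare observed with expected pair frequencies, take twice the base-2 logarithm and round) and omits the sequence clustering step; the block itself is invented, so the resulting numbers are not real BLOSUM values.

```python
from collections import Counter
from itertools import combinations
from math import log2


def log_odds_scores(block):
    """Score each observed residue pair in an ungapped alignment block."""
    pair_counts = Counter()
    for column in zip(*block):
        for x, y in combinations(column, 2):
            pair_counts[tuple(sorted((x, y)))] += 1
    total = sum(pair_counts.values())
    observed = {pair: n / total for pair, n in pair_counts.items()}

    # Background frequency of each residue, derived from the pair frequencies.
    background = Counter()
    for (x, y), freq in observed.items():
        if x == y:
            background[x] += freq
        else:
            background[x] += freq / 2
            background[y] += freq / 2

    scores = {}
    for (x, y), freq in observed.items():
        expected = background[x] ** 2 if x == y else 2 * background[x] * background[y]
        scores[(x, y)] = round(2 * log2(freq / expected))
    return scores


toy_block = ["AV", "AV", "AI"]  # hypothetical three-sequence block
print(log_odds_scores(toy_block))
```

A positive score means the pair was seen more often than expected by chance; pairs never observed in the block get no score in this sketch.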
Sequence alignment and their
algorithms
What matrices should be used and when?
Matrix      Best use                                               Similarity (%)
PAM40       Short alignments that are highly similar               70-90
PAM160      Detecting members of a protein family                  50-60
PAM250      Longer alignments of more divergent sequences          approx. 30
BLOSUM90    Short alignments that are highly similar               70-90
BLOSUM80    Detecting members of a protein family                  50-60
BLOSUM62    Most effective in finding all potential similarities   30-40
BLOSUM30    Longer alignments of more divergent sequences          <30
Similarity: the range of similarities that the matrix is best able to detect.
Comparison
• PAM is based on an evolutionary model
using phylogenetic trees
• BLOSUM assumes no evolutionary model,
but rather conserved “blocks” of proteins
Sequence alignment and their
algorithms
 Heuristic database searching
The heuristic algorithms perform faster searches because they examine only a
fraction of the possible alignments examined in regular dynamic programming
 BLAST (basic local alignment search tool)
BLAST uses heuristics to align a query sequence with all sequences in a
database
Sequence alignment and their
algorithms
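6- Finishing
[Figure: running score of a hit during extension, marking the neighborhood score threshold (T), the minimum score (S), and the drop-off threshold (X) for stopping extension; negative scores from the scoring matrix pull the running score down]
Extension stops once the score has dropped by more than X. If the extended
segment scores at least S, the alignment is called a
high-scoring segment pair (HSP)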
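The extension step can be sketched as follows. This is a simplified illustration, not BLAST's actual implementation: it extends a seed in one direction only, without gaps, and uses plain match/mismatch scores instead of a substitution matrix. The parameter names and values (`x_drop=3`, `match=1`, `mismatch=-2`) are assumptions made for the example.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 x_drop=3, match=1, mismatch=-2):
    """Extend a seed to the right until the score falls x_drop below its best.

    Returns (best_score, best_length) of the extended segment.
    """
    # Score of the seed word itself.
    score = sum(
        match if query[q_start + k] == subject[s_start + k] else mismatch
        for k in range(word_len)
    )
    best_score, best_len = score, word_len
    k = word_len
    while q_start + k < len(query) and s_start + k < len(subject):
        score += match if query[q_start + k] == subject[s_start + k] else mismatch
        if score > best_score:
            best_score, best_len = score, k + 1
        elif best_score - score > x_drop:
            break  # the score dropped too far below the best seen: stop extending
        k += 1
    return best_score, best_len


best_score, best_len = extend_right("ACGTACGTTTTT", "ACGTACGAGGGG", 0, 0, word_len=3)
print(best_score, best_len)
```

A caller would then keep the segment as an HSP only if `best_score` reaches the minimum score S chosen for the search.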
Sequence alignment and their
algorithms
Suggested BLAST cutoffs
Chance matches are more likely in a nucleotide database than in a protein database
Sequence identity is more informative for proteins than for nucleic acids
For nucleotide-based searches: hits with E-values of 10^-6 or
less and sequence identity of 70% or more
For protein-based searches: hits with E-values of 10^-3 or less and
sequence identity of 25% or more
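These cutoffs are simple to apply when filtering a list of hits. The sketch below reads the E-value thresholds as 1e-6 (nucleotide) and 1e-3 (protein); the `passes_cutoff` helper and the shape of the hit data are hypothetical and not tied to any particular BLAST output format.

```python
# Suggested cutoffs from the slide, keyed by the type of search.
CUTOFFS = {
    "nucleotide": {"max_evalue": 1e-6, "min_identity": 70.0},
    "protein": {"max_evalue": 1e-3, "min_identity": 25.0},
}


def passes_cutoff(evalue, identity, search_type):
    """True when a hit meets both the E-value and the percent identity cutoff."""
    cutoff = CUTOFFS[search_type]
    return evalue <= cutoff["max_evalue"] and identity >= cutoff["min_identity"]


# Hypothetical hits: (identifier, E-value, percent identity).
hits = [("hit1", 1e-10, 85.0), ("hit2", 1e-4, 85.0), ("hit3", 1e-8, 60.0)]
kept = [name for name, evalue, identity in hits
        if passes_cutoff(evalue, identity, "nucleotide")]
print(kept)
```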
Sequence alignment and their
algorithms
 BLAST (basic local alignment search tool)
 BLASTN

queries nucleotide sequences with a nucleotide sequence database
 BLASTP
uses protein sequences as queries to search against a protein sequence
database
 BLASTX
uses nucleotide sequences as queries and translates them in all six reading
frames to produce translated protein sequences, which are used to query a
protein sequence database
 TBLASTN
queries protein sequences to a nucleotide sequence database with the
sequences translated in all six reading frames
 TBLASTX
uses nucleotide sequences, which are translated in all six frames, to search
against a nucleotide sequence database that has all the sequences
translated in six frames
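The six-frame translation used by BLASTX, TBLASTN, and TBLASTX can be sketched as below. This is a minimal illustration assuming the standard genetic code and an uppercase or lowercase DNA input without ambiguity codes; the frame labels (`+1` to `-3`) and function names are conventions chosen for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return the translations of the three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```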
Sequence alignment and their
algorithms
 PSI-BLAST

Position-specific iterated BLAST (PSI-BLAST) builds profiles and performs
database searches in an iterative fashion. The main feature of PSI-BLAST is
that profiles are constructed automatically and are fine-tuned in each successive
cycle
Sequence alignment and their
algorithms
 Multiple sequence alignment
 Exhaustive algorithms

The exhaustive alignment method involves examining all possible aligned
positions simultaneously
 Heuristic algorithms
Because the use of dynamic programming is not feasible for routine multiple
sequence alignment, faster heuristic algorithms have been developed.
A heuristic is a computational strategy that finds a near-optimal solution by using rules of
thumb. Essentially, this strategy takes shortcuts by reducing the search
space according to certain criteria
Sequence alignment and their
algorithms
 Multiple sequence alignment
 Heuristic algorithms
 Progressive alignment
Progressive alignment depends on the stepwise assembly of a multiple
alignment and is heuristic in nature
 Clustal
A progressive multiple alignment program available either as a stand-alone or an online program
 T-Coffee
T-Coffee performs progressive sequence alignments as in Clustal. The main
difference is that, in processing a query, T-Coffee performs both global and
local pairwise alignment for all possible pairs involved. The global pairwise
alignment is performed using the Clustal program
Sequence alignment and their
algorithms
 Multiple sequence alignment
 Heuristic algorithms
 Iterative alignment

The iterative approach is based on the idea that an optimal
solution can be found by repeatedly modifying existing
suboptimal solutions
Sequence alignment and their
algorithms
 Multiple sequence alignment
 Heuristic algorithms
 Block-Based Alignment

The strategy identifies a block of ungapped alignment shared by all the
sequences, hence, the block-based local alignment strategy
Structural prediction
 Structural prediction methods
 Ab-initio prediction

Computational prediction based on first principles or using the most
elementary information
 Threading
Method of predicting the most likely protein structural fold based on secondary
structure similarity with database structures and assessment of energies of the
potential fold. The term has been used interchangeably with fold recognition
 Homology-based modeling
Method for predicting the three-dimensional structure of a protein based on
homology by assigning the structure of an unknown protein using an existing
homologous protein structure as a template
Hidden Markov model
Statistical model composed of a number of interconnected Markov chains
with the capability to generate the probability value of an event by taking
into account the influence from hidden variables. Mathematically, it
calculates probability values of connected states among the Markov chains
to find an optimal path within the network of states. It requires training to
obtain the probability values of state transitions. When using a hidden
Markov model to represent a multiple sequence alignment, a sequence can
be generated through the model by incorporating probability values of
match, insertion, and deletion states
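Finding an optimal path through the states of a hidden Markov model is usually done with the Viterbi algorithm. The sketch below is a generic two-state illustration rather than a profile HMM with match, insertion, and deletion states; the state names and all probability values are invented for the example, and in practice they would come from training.

```python
from math import log


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a non-empty observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log(trans_p[r][s]))
            scores[s] = best[-1][prev] + log(trans_p[prev][s]) + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical model: two hidden regions that prefer different letters.
states = ["A-rich", "G-rich"]
start_p = {"A-rich": 0.5, "G-rich": 0.5}
trans_p = {"A-rich": {"A-rich": 0.9, "G-rich": 0.1},
           "G-rich": {"A-rich": 0.1, "G-rich": 0.9}}
emit_p = {"A-rich": {"A": 0.9, "G": 0.1},
          "G-rich": {"A": 0.1, "G": 0.9}}

print(viterbi("AAAGGG", states, start_p, trans_p, emit_p))
```

Logarithms are used so that long sequences do not underflow when many small probabilities are multiplied.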
Neural network algorithm
Machine-learning algorithm for pattern recognition. It is composed of
input, hidden, and output layers. Units of information in each layer are
called nodes. The nodes of different layers are interconnected to form a
network analogous to a biological nervous system. Between the nodes are
mathematical weight parameters that can be trained with known patterns
so they can be used for later predictions. After training, the network is able
to recognize correlation between an input and output
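The layered structure described above can be illustrated with a forward pass through a tiny network. This is only a sketch: the sigmoid activation, the layer sizes, and the weight values are arbitrary assumptions, and the training step that would adjust the weights from known patterns is not shown.

```python
from math import exp


def sigmoid(x):
    return 1 / (1 + exp(-x))


def layer(inputs, weights, biases):
    """Each node sums its weighted inputs plus a bias and applies the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Pass the inputs through one hidden layer and then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Hypothetical weights: 2 input nodes, 2 hidden nodes, 1 output node.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.0, -0.1]
output_w = [[1.2, -0.7]]
output_b = [0.05]

print(forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b))
```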
Web-based tools
 Alignment tools
 Sequence-based methods
 T-Coffee http://tcoffee.crg.cat/apps/tcoffee/do:regular
 NCBI http://blast.ncbi.nlm.nih.gov/Blast.cgi
 UniProt http://www.uniprot.org
 EMBL http://coot.embl.de/Alignment
 Structure-based methods
 Dali server http://ekhidna.biocenter.helsinki.fi/dali_server
 FSSP http://protein.hbu.cn/fssp
 Signal peptide resource http://proline.bic.nus.edu.sg/spdb/searchn.html
 Active site prediction http://www.scfbio-iitd.res.in/dock/ActiveSite.jsp
Web-based tools
 Secondary structure prediction
 Sopma http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html
 Jpred3 http://www.compbio.dundee.ac.uk/www-jpred
 PreSSaPro http://bioinformatica.isa.cnr.it/PRESSAPRO
 HMM protein structure prediction http://compbio.soe.ucsc.edu/SAM_T08/T08-query.html
 PROF http://www.aber.ac.uk/~phiwww/prof
 Software package http://molbiol-tools.ca/Protein_secondary_structure.htm
Web-based tools
 Tertiary structure prediction
 Phyre2

http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index
Web-based tools
 Biochemical features
 Protein calculator http://www.scripps.edu/~cdputnam/protcalc.html
 Amino acid calculator http://proteome.gs.washington.edu/cgi-bin/aa_calc.pl
 Peptide property calculator https://www.genscript.com/ssl-bin/site2/peptide_calculation.cgi
 Peptide property calculator http://www.innovagen.se/custom-peptidesynthesis/peptide-property-calculator/peptide-property-calculator.asp
 Physico-chemical profiles http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_pcprof.html
 TagIdent tool http://web.expasy.org/tagident/
Web-based tools
 Biochemical features
 Peptide cutter http://web.expasy.org/peptide_cutter/
 Kyte-Doolittle hydropathy plot http://gcat.davidson.edu/DGPB/kd/kyte-doolittle.htm
 GRAVY calculator http://www.gravy-calculator.de/index.php
 ProtScale http://web.expasy.org/protscale/
 ProtParam http://web.expasy.org/protparam/
 PROSITE http://prosite.expasy.org/prosite.html
 InterPro http://www.ebi.ac.uk/interpro/
Stand-alone software
 MEGA
 CLC Main Workbench
 UGENE
 Spdb viewer (Swiss-PdbViewer)
 Pairwise structure alignment
 Cn3D
 BioEdit
 ClustalX
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Dernier (20)

Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Bioinformatics

  • 2. outline  Introduction to bioinformatics  Biological databases  Sequence alignment and their algorithms  Structural prediction  Web-based tools  Stand-alone software
  • 3. Introduction to bioinformatics  What is bioinformatics? Bioinformatics is an interdisciplinary research area at the interface between computer science and biological science.
  • 4. Introduction to bioinformatics  What are the differences between bioinformatics and informatics?  What are the differences between bioinformatics and computational biology?  What is an algorithm?
  • 5.
  • 6. What is proteomics?
  • 7. Biological databases  Database: a computerized archive used to store and organize data in such a way that information can be retrieved easily via a variety of search criteria.  Entry: each record should contain a number of fields that hold the actual data items.  Value: a particular piece of information.  Making a query: to retrieve a particular record from the database, a user can specify a value to be found in a particular field and expect the computer to retrieve the whole data record.
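The entry, field, value, and query vocabulary above can be illustrated with a minimal sketch. The records, field names, and accession-like identifiers below are hypothetical and are not taken from any real database.

```python
# Hypothetical entries: each record is a set of fields holding values.
records = [
    {"accession": "X0001", "organism": "Homo sapiens", "length": 120},
    {"accession": "X0002", "organism": "Mus musculus", "length": 98},
    {"accession": "X0003", "organism": "Homo sapiens", "length": 310},
]


def query(entries, field, value):
    """Return every whole record whose given field holds the given value."""
    return [entry for entry in entries if entry.get(field) == value]


human_entries = query(records, "organism", "Homo sapiens")
```

The query names one field and one value, and the whole matching records come back, which is the behaviour the slide describes.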
  • 8. Biological databases
      Primary databases
        GenBank (NCBI): www.ncbi.nlm.nih.gov
        EMBL: www.ebi.ac.uk/embl/index.html
        DDBJ: www.ddbj.nig.ac.jp
      Secondary databases
        ExPASy: http://web.expasy.org
        PIR: http://pir.georgetown.edu/pirwww/pirhome3.shtml
        Swiss-Prot: www.ebi.ac.uk/swissprot/access.html
  • 9. Biological databases  Interconnection between Biological Databases
  • 10. Biological databases  Pitfalls of biological databases  The causes of redundancy include repeated submission of identical or overlapping sequences by the same or different authors, revision of annotations, and dumping of expressed sequence tag (EST) data.  Redundant sequences  Non-redundant sequences (RefSeq)
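As a rough illustration of the redundancy problem, the sketch below collapses entries whose sequences are exactly identical. It is a simplification under stated assumptions: real non-redundant sets also deal with overlapping sequences and revised annotations, which this does not attempt, and the identifiers are hypothetical.

```python
def remove_identical(entries):
    """Keep the first entry for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for identifier, sequence in entries:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((identifier, sequence))
    return unique


# Hypothetical submissions: the second is a resubmission of the first.
submissions = [
    ("sub1", "ATGGCCTAA"),
    ("sub2", "atggcctaa"),
    ("sub3", "ATGGCGTAA"),
]
non_redundant = remove_identical(submissions)
```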
  • 11. Biological databases  Further databases
      NCBI: www.ncbi.nlm.nih.gov
      UniProt: http://www.uniprot.org
      ExPASy: http://web.expasy.org
      PIR: http://pir.georgetown.edu/
      SWISS-Prot: http://swissmodel.expasy.org/
      PDB: http://www.rcsb.org/pdb/home/home.do
      Enzyme structure: http://www.ebi.ac.uk/thornton-srv/databases/enzymes
  • 18. Biological databases  Enzyme structure http://www.ebi.ac.uk/thornton-srv/databases/enzymes
  • 19. Sequence alignment and their algorithms  Pairwise sequence alignment: Pairwise sequence alignment is the process of aligning two sequences and is the basis of database similarity searching and multiple sequence alignment.  Sequence similarity versus sequence homology: When two sequences are descended from a common evolutionary origin, they are said to have a homologous relationship or to share homology. A related but different term is sequence similarity, which is the percentage of aligned residues that are similar in physicochemical properties such as size, charge, and hydrophobicity.  Sequence similarity versus sequence identity: In a protein sequence alignment, sequence identity refers to the percentage of matches of the same amino acid residues between two aligned sequences. Similarity refers to the percentage of aligned residues that have similar physicochemical characteristics and can be more readily substituted for each other.
  • 20. Sequence alignment and their algorithms  Sequence alignment strategies  Global alignment: In global alignment, the two sequences to be aligned are assumed to be generally similar over their entire length. Alignment is carried out from beginning to end of both sequences to find the best possible alignment across the entire length of the two sequences.  Local alignment: Local alignment does not assume that the two sequences in question are similar over their entire length. It only finds local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequence regions.
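The difference between identity and similarity can be made concrete with a small sketch. Two things here are assumptions rather than part of the slides: the grouping of amino acids treated as "similar" is purely illustrative, and percentages are taken over aligned columns that contain no gap.

```python
# Illustrative (hypothetical) groups of physicochemically similar residues.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("STNQ")]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gap columns are skipped in this sketch
        aligned += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if aligned == 0:
        return 0.0, 0.0
    return 100 * identical / aligned, 100 * similar / aligned


identity, similarity = percent_identity_similarity("ILKDA", "IVRDG")
```

In this toy pair, two of five columns are identical, and two more (L/V and K/R) fall in the same illustrative group, so similarity is higher than identity.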
  • 21. Sequence alignment and their algorithms
  • 22. Sequence alignment and their algorithms  Linear gap penalty: the cost of creating a gap and of extending it is the same, $W(l) = g \cdot l$, where $g$ is the cost of each gap position and $l$ is the gap length.  Affine gap penalty: creating and extending a gap have different costs, $W(l) = g_{open} + g_{ext}(l - 1)$. Opening a gap is penalized more heavily than extending it; written as negative scores this is $g_{open} < g_{ext}$.
  • 23. Sequence alignment and their algorithms  Alignment Algorithms and Methods  The dot matrix method  The word method  The dynamic programming method
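The two gap-penalty schemes can be compared numerically. This is a minimal sketch that treats penalties as positive costs; the numeric values are hypothetical and chosen only for illustration.

```python
def linear_gap_cost(length, g):
    """Linear scheme: every gap position costs the same amount g."""
    return g * length


def affine_gap_cost(length, g_open, g_ext):
    """Affine scheme: g_open for the first position, g_ext for each further one."""
    if length <= 0:
        return 0
    return g_open + g_ext * (length - 1)


# One gap of length 5 versus five separate gaps of length 1 (hypothetical costs).
one_long_gap = affine_gap_cost(5, g_open=10, g_ext=1)
five_short_gaps = 5 * affine_gap_cost(1, g_open=10, g_ext=1)
```

With an opening cost larger than the extension cost, a single long gap is cheaper than several short ones, which is the motivation usually given for the affine scheme.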
  • 24. Sequence alignment and their algorithms  Alignment Algorithms  The dot matrix method The most basic sequence alignment method is the dot matrix method, also known as the dot plot method
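A dot matrix can be sketched in a few lines: one sequence runs down the rows, the other across the columns, and a dot is placed wherever the two windows match. The `window` parameter and the text rendering are choices made for this sketch.

```python
def dot_matrix(seq_a, seq_b, window=1):
    """Boolean matrix: cell [i][j] is True when the windows starting at
    position i of seq_a and position j of seq_b are identical."""
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = []
        for j in range(len(seq_b) - window + 1):
            row.append(seq_a[i:i + window] == seq_b[j:j + window])
        rows.append(row)
    return rows


def render(matrix):
    """Draw the matrix as text, '*' for a match and '.' otherwise."""
    return "\n".join("".join("*" if cell else "." for cell in row)
                     for row in matrix)


self_plot = dot_matrix("GATTACA", "GATTACA")
```

Comparing a sequence with itself gives an unbroken main diagonal; regions of similarity between two different sequences show up as shorter diagonal runs.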
  • 25. Sequence alignment and their algorithms  Alignment Algorithms  The word method It works by finding short stretches of identical or nearly identical letters in two sequences. These short strings of characters are called words, which are similar to the windows used in the dot matrix method
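The word-finding step can be sketched by indexing every word of length k in one sequence and looking up the words of the other. This sketch only finds exactly identical words; the "nearly identical" case mentioned above is not handled, and the function name and default k are assumptions.

```python
from collections import defaultdict


def word_matches(query, subject, k=3):
    """Return (query position, subject position, word) for every shared k-letter word."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)
    hits = []
    for i in range(len(query) - k + 1):
        word = query[i:i + k]
        for j in index.get(word, []):
            hits.append((i, j, word))
    return hits


hits = word_matches("ACGTAC", "TTACGT", k=3)
```

Each hit is a candidate starting point from which an alignment can be extended in both directions.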
  • 26. Sequence alignment and their algorithms  Alignment Algorithms  The word method
  • 27. Sequence alignment and their algorithms  Alignment Algorithms  The dynamic programming method Dynamic programming is a method that determines optimal alignment by matching two sequences for all possible pairs of characters between the two sequences
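One way to see dynamic programming at work is a global alignment in the style of Needleman–Wunsch. This is a minimal sketch: the match, mismatch, and gap scores are hypothetical defaults, the gap penalty is linear, and only one optimal alignment is traced back.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


nw_score, nw_a, nw_b = needleman_wunsch("GATTACA", "GATACA")
```

Every cell is filled from its three neighbours, so all pairs of positions are considered once and the corner cell holds the optimal score for the full lengths.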
  • 28.
  • 29. Sequence alignment and their algorithms  Alignment Algorithms  The dynamic programming method  Global alignment: The classical global pairwise alignment algorithm using dynamic programming is the Needleman–Wunsch algorithm. In this algorithm, an optimal alignment is obtained over the entire lengths of the two sequences.  Local alignment: The first application of dynamic programming to local alignment is the Smith–Waterman algorithm. In this algorithm, matching residues receive positive scores, and the cumulative score in a matrix cell is never allowed to fall below zero: a cell that would become negative is reset to zero, so an alignment can start and end anywhere within the two sequences.
  • 30. Sequence alignment and their algorithms  Substitution matrix  PAM matrices (point accepted mutation): The PAM matrices were derived from the evolutionary divergence between sequences of the same cluster. One PAM unit corresponds to a change in 1% of the amino acid positions. Because only very closely related homologs were used, the observed mutations were not expected to significantly change the common function of the proteins.
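A local alignment in the style of Smith–Waterman can be sketched as below. In the common formulation used here, mismatches and gaps do carry negative penalties, and it is the cell values that are floored at zero. The scoring values are hypothetical and the gap penalty is linear.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + pair,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        pair = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


sw_result = smith_waterman("TTTGATTACATTT", "CCCGATTACACCC")
```

Only the shared core is reported; the dissimilar flanks are ignored, which is the contrast with global alignment.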
  • 31. Sequence alignment and their algorithms  substitution matrix  PAM matrices (point accepted mutation)
  • 32. Sequence alignment and their algorithms  Substitution matrix  BLOSUM matrices: The blocks amino acid substitution matrices (BLOSUM) are a series of matrices, all of which are derived from direct observation of every possible amino acid substitution in multiple sequence alignments.
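Substitution matrices of this kind are usually expressed as log-odds scores: how often a pair of residues is observed in alignments, relative to how often it would be expected by chance. The sketch below assumes half-bit units (a scale factor of 2 on a base-2 logarithm) and uses hypothetical frequencies; it is not a reconstruction of any published matrix.

```python
import math


def log_odds_score(observed_freq, expected_freq, scale=2.0):
    """Rounded log-odds score for one residue pair.

    observed_freq: how often the pair appears in the alignment columns.
    expected_freq: how often it would appear by chance, computed by the
    caller from the background residue frequencies.
    """
    return round(scale * math.log2(observed_freq / expected_freq))


# Hypothetical numbers: a pair seen four times more often than chance,
# and a pair seen half as often as chance.
favoured = log_odds_score(0.04, 0.01)
disfavoured = log_odds_score(0.005, 0.01)
```

Positive scores mark substitutions seen more often than chance, negative scores mark those seen less often.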
  • 33. Sequence alignment and their algorithms  substitution matrix  BLOSUM matrices
  • 34. Sequence alignment and their algorithms  Which matrices should be used, and when?
      Matrix | Best use | Similarity (%)
      PAM40 | Short alignments that are highly similar | 70-90
      PAM160 | Detecting members of a protein family | 50-60
      PAM250 | Longer alignments of more divergent sequences | approx. 30
      BLOSUM90 | Short alignments that are highly similar | 70-90
      BLOSUM80 | Detecting members of a protein family | 50-60
      BLOSUM62 | Most effective in finding all potential similarities | 30-40
      BLOSUM30 | Longer alignments of more divergent sequences | <30
      Similarity: the range of similarities that the matrix is best able to detect.
  • 35. Comparison • PAM is based on an evolutionary model using phylogenetic trees • BLOSUM assumes no evolutionary model, but rather conserved “blocks” of proteins
  • 36. Sequence alignment and their algorithms  Heuristic database searching The heuristic algorithms perform faster searches because they examine only a fraction of the possible alignments examined in regular dynamic programming  BLAST (basic local alignment search tool) BLAST uses heuristics to align a query sequence with all sequences in a database
  • 37. Sequence alignment and their algorithms  BLAST (basic local alignment search tool)
  • 38. Sequence alignment and their algorithms  Step 6: finishing. Figure labels: negative scores from the scoring matrix; threshold for stopping extension (X); minimum score (S); neighborhood score threshold (T). If the extension stops after crossing the threshold X, the alignment is called a high-scoring segment pair (HSP).
  • 39. Sequence alignment and their algorithms  Suggested BLAST cutoffs  Chance matches are more likely in nucleotide databases than in protein databases.  Identity in proteins is more informative than identity in nucleic acids.  For nucleotide-based searches: hits with E values of 10^-6 or less and sequence identity of 70% or more.  For protein-based searches: hits with E values of 10^-3 or less and sequence identity of 25% or more.
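The stopping rule for extension can be sketched as follows. This is a simplified illustration, not the BLAST implementation: it extends an exact word hit to the right only, without gaps, uses a plain match/mismatch score instead of a scoring matrix, and the parameter values are hypothetical. `x_drop` plays the role of the threshold X and `min_score` the role of S; the seeding threshold T is not modelled.

```python
def extend_hit(query, subject, q_start, s_start, word_size,
               match=1, mismatch=-1, x_drop=2, min_score=5):
    """Extend an exact word hit to the right until the running score falls
    more than x_drop below the best score seen so far."""
    score = best = word_size * match  # the seed itself is an exact match
    best_len = length = word_size
    while q_start + length < len(query) and s_start + length < len(subject):
        same = query[q_start + length] == subject[s_start + length]
        score += match if same else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # the score dropped too far: stop extending
    return {"score": best, "length": best_len, "is_hsp": best >= min_score}


good_hit = extend_hit("ACGTACGTTTTT", "ACGTACGTCCCC", 0, 0, word_size=3)
weak_hit = extend_hit("ACGTTTT", "ACGCCCC", 0, 0, word_size=3)
```

The segment is trimmed back to the point of its best score, and it is kept only if that score reaches the minimum.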
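The suggested cutoffs can be applied as a simple filter over search hits. This sketch reads the E-value cutoffs as 10^-6 and 10^-3; the hit tuples and their layout are hypothetical and do not correspond to any particular BLAST output format.

```python
def passes_cutoff(e_value, percent_identity, search_type):
    """Apply the suggested E-value and identity cutoffs to one hit."""
    if search_type == "nucleotide":
        return e_value <= 1e-6 and percent_identity >= 70
    if search_type == "protein":
        return e_value <= 1e-3 and percent_identity >= 25
    raise ValueError("search_type must be 'nucleotide' or 'protein'")


# Hypothetical protein hits as (name, E value, percent identity).
hits = [("hitA", 1e-20, 48.0), ("hitB", 0.5, 31.0), ("hitC", 1e-5, 19.0)]
kept = [name for name, e, ident in hits if passes_cutoff(e, ident, "protein")]
```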
  • 40. Sequence alignment and their algorithms  BLAST (basic local alignment search tool)  BLASTN queries nucleotide sequences with a nucleotide sequence database  BLASTP uses protein sequences as queries to search against a protein sequence database  BLASTX uses nucleotide sequences as queries and translates them in all six reading frames to produce translated protein sequences, which are used to query a protein sequence database  TBLASTN queries protein sequences to a nucleotide sequence database with the sequences translated in all six reading frames  TBLASTX uses nucleotide sequences, which are translated in all six frames, to search against a nucleotide sequence database that has all the sequences translated in six frames
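Translation in all six reading frames, as used by the translated search modes above, can be sketched directly: three frames on the given strand and three on its reverse complement. The sketch assumes the standard genetic code, written in the usual compact T, C, A, G ordering, and marks incomplete or ambiguous codons with "X".

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' is a stop).
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; unknown codons give 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
```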
  • 41. Sequence alignment and their algorithms  PSI-BLAST: Position-specific iterated BLAST (PSI-BLAST) builds profiles and performs database searches in an iterative fashion. The main feature of PSI-BLAST is that profiles are constructed automatically and are fine-tuned in each successive cycle.
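What a profile looks like can be sketched by turning the columns of a small alignment into position-specific log-odds scores. This is a heavily simplified illustration, not the PSI-BLAST procedure: it assumes a uniform background, a flat pseudocount, and no sequence weighting, and it shows a single round rather than the iterative refinement.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log-odds score."""
    background = 1 / len(AMINO_ACIDS)  # assumed uniform background frequency
    profile = []
    for pos in range(len(aligned_seqs[0])):
        column = [seq[pos] for seq in aligned_seqs if seq[pos] in AMINO_ACIDS]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


profile = build_profile(["ACD", "ACE", "ACD"])
```

Residues that dominate a column score positively at that position only, which is what makes the scoring position-specific.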
  • 42. Sequence alignment and their algorithms  PSI-BLAST
  • 43. Sequence alignment and their algorithms  Multiple sequence alignment
  • 44. Sequence alignment and their algorithms  Multiple sequence alignment  Exhaustive algorithms: The exhaustive alignment method involves examining all possible aligned positions simultaneously.  Heuristic algorithms: Because the use of dynamic programming is not feasible for routine multiple sequence alignment, faster heuristic algorithms have been developed. A heuristic is a computational strategy that finds a near-optimal solution by using rules of thumb. Essentially, this strategy takes shortcuts by reducing the search space according to certain criteria.
  • 45. Sequence alignment and their algorithms  Multiple sequence alignment  Heuristic algorithms  Progressive alignment: Progressive alignment depends on the stepwise assembly of a multiple alignment and is heuristic in nature.  Clustal: a progressive multiple alignment program available either as a stand-alone or an on-line program.  T-Coffee: T-Coffee performs progressive sequence alignments as in Clustal. The main difference is that, in processing a query, T-Coffee performs both global and local pairwise alignment for all possible pairs involved. The global pairwise alignment is performed using the Clustal program.
  • 46.
  • 47. Sequence alignment and their algorithms  Multiple sequence alignment  Heuristic algorithms  Iterative alignment The iterative approach is based on the idea that an optimal solution can be found by repeatedly modifying existing suboptimal solutions
  • 48. Sequence alignment and their algorithms  Multiple sequence alignment  Heuristic algorithms  Block-Based Alignment The strategy identifies a block of ungapped alignment shared by all the sequences, hence, the block-based local alignment strategy
  • 49. Structural prediction  Structural prediction methods  Ab-initio prediction Computational prediction based on first principles or using the most elementary information  Threading Method of predicting the most likely protein structural fold based on secondary structure similarity with database structures and assessment of energies of the potential fold. The term has been used interchangeably with fold recognition  Homology-based modeling Method for predicting the three-dimensional structure of a protein based on homology by assigning the structure of an unknown protein using an existing homologous protein structure as a template
  • 50. Hidden Markov model  A statistical model composed of a number of interconnected Markov chains, with the capability to generate the probability value of an event by taking into account the influence of hidden variables. Mathematically, it calculates probability values of connected states among the Markov chains to find an optimal path within the network of states. It requires training to obtain the probability values of state transitions. When a hidden Markov model is used to represent a multiple sequence alignment, a sequence can be generated through the model by incorporating probability values of match, insertion, and deletion states.
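Finding the optimal path through the hidden states is commonly done with the Viterbi algorithm, sketched below. The two-state model ("R" for an AT-rich region, "S" for a GC-rich one) and all of its probabilities are hypothetical; a trained model would supply its own values, and a practical implementation would work with log probabilities to avoid underflow on long sequences.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its probability)."""
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        V.append({})
        back.append({})
        for s in states:
            # Best predecessor state for reaching s at time t.
            prev, p = max(((r, V[t - 1][r] * trans_p[r][s]) for r in states),
                          key=lambda item: item[1])
            V[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, V[-1][last]


# Hypothetical two-state model.
states = ["R", "S"]
start_p = {"R": 0.5, "S": 0.5}
trans_p = {"R": {"R": 0.9, "S": 0.1}, "S": {"R": 0.1, "S": 0.9}}
emit_p = {"R": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
          "S": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4}}
path, probability = viterbi("AAAAGGGG", states, start_p, trans_p, emit_p)
```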
  • 52. Neural network algorithm Machine-learning algorithm for pattern recognition. It is composed of input, hidden, and output layers. Units of information in each layer are called nodes. The nodes of different layers are interconnected to form a network analogous to a biological nervous system. Between the nodes are mathematical weight parameters that can be trained with known patterns so they can be used for later predictions. After training, the network is able to recognize correlation between an input and output
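The layered structure described above can be sketched as a single forward pass through one hidden layer. The weights here are hypothetical placeholders, training is not shown, and the sigmoid activation is an assumption of this sketch.

```python
import math


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate inputs through one hidden layer to a single output node."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Two input nodes, two hidden nodes, one output node (hypothetical weights).
prediction = forward([1.0, 0.0],
                     hidden_weights=[[2.0, -1.0], [-1.5, 0.5]],
                     hidden_biases=[0.0, 0.0],
                     output_weights=[1.0, -1.0],
                     output_bias=0.0)
```

Training would adjust the weight values so that outputs for known patterns approach their known labels; after that, the same forward pass is used for prediction.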
  • 54. Web-based tools  Alignment tools
      Sequence-based methods
        T-Coffee: http://tcoffee.crg.cat/apps/tcoffee/do:regular
        NCBI: http://blast.ncbi.nlm.nih.gov/Blast.cgi
        UniProt: http://www.uniprot.org
        EMBL: http://coot.embl.de/Alignment
      Structural-based methods
        Dali server: http://ekhidna.biocenter.helsinki.fi/dali_server
        FSSP: http://protein.hbu.cn/fssp
      Signal peptide resource: http://proline.bic.nus.edu.sg/spdb/searchn.html
      Active site prediction: http://www.scfbio-iitd.res.in/dock/ActiveSite.jsp
  • 59. Web-based tools  Dali server http://ekhidna.biocenter.helsinki.fi/dali_server
  • 61. Web-based tools  Secondary structure prediction
      Sopma: http://npsapbil.ibcp.fr/cgibin/npsa_automat.pl?page=npsa_sopma.html
      Jpred3: http://www.compbio.dundee.ac.uk/www-jpred
      PreSSaPro: http://bioinformatica.isa.cnr.it/PRESSAPRO
      HMM protein structure prediction: http://compbio.soe.ucsc.edu/SAM_T08/T08-query.html
      PROF: http://www.aber.ac.uk/~phiwww/prof
      Software package: http://molbiol-tools.ca/Protein_secondary_structure.htm
  • 66. Web-based tools  HMM protein structure prediction http://compbio.soe.ucsc.edu/SAM_T08/T08-query.html
  • 70. Web-based tools  Active site prediction http://www.scfbio-iitd.res.in/dock/ActiveSite.jsp
  • 71. Web-based tools  Tertiary structure prediction  Phyre2 http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index
  • 72. Web-based tools  Biochemical features
      Protein calculator: http://www.scripps.edu/~cdputnam/protcalc.html
      Amino acid calculator: http://proteome.gs.washington.edu/cgi-bin/aa_calc.pl
      Peptide property calculator: https://www.genscript.com/ssl-bin/site2/peptide_calculation.cgi
      Peptide property calculator: http://www.innovagen.se/custom-peptidesynthesis/peptide-property-calculator/peptide-property-calculator.asp
      Physico-chemical profiles: http://npsa-pbil.ibcp.fr/cgibin/npsa_automat.pl?page=/NPSA/npsa_pcprof.html
      TagIdent tool: http://web.expasy.org/tagident/
  • 73. Web-based tools  Biochemical features
      Peptide cutter: http://web.expasy.org/peptide_cutter/
      Kyte-Doolittle hydropathy plot: http://gcat.davidson.edu/DGPB/kd/kytedoolittle.htm
      GRAVY calculator: http://www.gravy-calculator.de/index.php
      ProtScale: http://web.expasy.org/protscale/
      ProtParam: http://web.expasy.org/protparam/
      Prosite: http://prosite.expasy.org/prosite.html
      InterPro: http://www.ebi.ac.uk/interpro/
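Two of the quantities these tools report, the GRAVY value and the Kyte-Doolittle hydropathy profile, are simple enough to sketch locally. The code below uses the Kyte-Doolittle hydropathy scale; the window size of 9 and the function names are choices made for this sketch, and it does not reproduce the exact behaviour of any of the listed web tools.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each sliding window along the sequence."""
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


profile = hydropathy_profile("AAAIII", window=3)
```

Positive values indicate hydrophobic stretches and negative values hydrophilic ones; plotting the profile against position gives the familiar hydropathy plot.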
  • 74. Web-based tools Protein calculator http://www.scripps.edu/~cdputnam/protcalc.html 
  • 75. Web-based tools Amino acid calculator  http://proteome.gs.washington.edu/cgi-bin/aa_calc.pl
  • 76. Web-based tools Peptide property calculator  https://www.genscript.com/ssl-bin/site2/peptide_calculation.cgi
  • 77. Web-based tools  Peptide property calculator http://www.innovagen.se/custom-peptidesynthesis/peptide-property-calculator/peptide-property-calculator.asp
  • 78. Web-based tools  Physico-chemical profiles http://npsa-pbil.ibcp.fr/cgibin/npsa_automat.pl?page=/NPSA/npsa_pcprof.html
  • 79. Web-based tools  TagIdent tool http://web.expasy.org/tagident/
  • 81. Web-based tools Kyte-Doolittle hydropathy plot http://gcat.davidson.edu/DGPB/kd/kytedoolittle.htm
  • 91. Stand-alone software  Pairwise structure alignment