Sucheta Tripathy, 16th November 2012
   A protein sequence from species A
    ◦ Which species has the most similar protein?
    ◦ Where did it originate?
    ◦ What is its putative function?
    ◦ Does it contain a conserved motif, etc.?
   Blast (Basic Local Alignment Search Tool)
    ◦ NCBI Blast
    ◦ Wu-Blast
    ◦ PSI-Blast
   Fasta
   SSearch
   Heuristic (educated guess)
   Does not compare the sequences over their entire length.
   Quickly locates short matches (seeds)
   Word size: the length of each seed
   Seeds are extended in both directions
   A score threshold is defined
    ◦ Score > threshold -> keep the alignment
    ◦ Score < threshold -> discard the alignment
GLKFA, word size 3 ->
   GLK, LKF, KFA
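The word list for GLKFA can be reproduced with a sliding window. This is a minimal sketch; the function name `words` is chosen here for illustration and does not come from any BLAST implementation.

```python
def words(sequence: str, word_size: int) -> list:
    """Return every overlapping word of length word_size, left to right."""
    last_start = len(sequence) - word_size
    return [sequence[i:i + word_size] for i in range(last_start + 1)]


print(words("GLKFA", 3))  # ['GLK', 'LKF', 'KFA']
```

A sequence of length L yields L - w + 1 words of size w, so a 5-residue sequence gives 3 words of size 3.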
   A Query sequence:
    ◦ Nucleotide
    ◦ Protein
   A Target Database
    ◦ Nucleotide
    ◦ Protein
   Blast Program
    ◦ Blastn
    ◦ Blastp
    ◦ tBlastx (slowest: translated nucleotide query against a
      translated nucleotide database)
    ◦ tBlastn (protein query against a translated nucleotide database)
    ◦ Blastx (translated nucleotide query against a protein database)
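The choice of program follows from the query type and the database type. The lookup below is only a sketch of that decision; the function name `choose_program` and the `translate_both` flag are hypothetical and are not part of any BLAST command line.

```python
# (query type, database type) -> program, for the non-tBlastx cases
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def choose_program(query_type: str, database_type: str, translate_both: bool = False) -> str:
    """Pick a BLAST program name from the query and database types."""
    if translate_both:
        # tBlastx translates both sides, so both must be nucleotide.
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and a nucleotide database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]


print(choose_program("protein", "nucleotide"))  # tblastn
```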
   E-value -> Expected number of hits with at least this
    score that would occur by chance
   Score -> Similarity score.
    ◦ Compare: the probability of rain by chance is 0.001
    ◦ Passing by chance, etc.
    ◦ The lower the E-value, the more significant the
      alignment.
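One common way to relate score and E-value is the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths and S is the raw score. The sketch below assumes that formula. The default values of `lam` and `k` are placeholders for illustration only, since the real values depend on the scoring matrix and gap penalties.

```python
import math


def expected_hits(raw_score: float, query_len: int, db_len: int,
                  lam: float = 0.267, k: float = 0.041) -> float:
    """E = K * m * n * exp(-lambda * S); lam and k are illustrative placeholders."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# A higher score gives a smaller E-value; a larger database gives a larger one.
for score in (30, 50, 70):
    print(score, expected_hits(score, query_len=250, db_len=1_000_000))
```

Because the E-value scales with the size of the search space, the same alignment score is less significant against a larger database.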
   Remove low-complexity regions.
   Generate all the k-mers.
   List all possible matching words.
    - Blast keeps only high-scoring word pairs.
    - Fasta stores all word pairs irrespective of their
    scores.
   Extend the matches into high-scoring
    pairs (HSPs).
   Evaluate results against the thresholds set.
   Extend HSPs and join them together.
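The seed-and-extend steps above can be sketched in a few lines. This is a simplified illustration, not BLAST's actual implementation: it uses exact-match seeds, ungapped extension with an assumed +1/-1 scoring and an assumed drop-off limit, and it skips low-complexity filtering and the joining of HSPs. All function names and parameter values are hypothetical.

```python
def find_seeds(query, target, word_size):
    """Index the target's words, then look up each query word (exact matches)."""
    index = {}
    for j in range(len(target) - word_size + 1):
        index.setdefault(target[j:j + word_size], []).append(j)
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


def _best_extension(pairs, match, mismatch, x_drop):
    """Walk aligned residue pairs; return (best score gain, residues kept)."""
    running = best = kept = 0
    for step, (a, b) in enumerate(pairs, start=1):
        running += match if a == b else mismatch
        if running > best:
            best, kept = running, step
        elif best - running >= x_drop:
            break  # score fell too far below the best seen: stop extending
    return best, kept


def extend_seed(query, target, i, j, word_size, match=1, mismatch=-1, x_drop=2):
    """Extend one seed in both directions without gaps."""
    right_gain, right_len = _best_extension(
        zip(query[i + word_size:], target[j + word_size:]), match, mismatch, x_drop)
    left_gain, left_len = _best_extension(
        zip(reversed(query[:i]), reversed(target[:j])), match, mismatch, x_drop)
    score = word_size * match + left_gain + right_gain
    length = left_len + word_size + right_len
    return score, i - left_len, j - left_len, length


def high_scoring_pairs(query, target, word_size=3, threshold=5):
    """Return (score, query start, target start, length) for HSPs >= threshold."""
    hsps = set()
    for i, j in find_seeds(query, target, word_size):
        hsp = extend_seed(query, target, i, j, word_size)
        if hsp[0] >= threshold:
            hsps.add(hsp)
    return sorted(hsps, reverse=True)


print(high_scoring_pairs("ACGTACGT", "TTACGTACGTTT")[0])  # (8, 0, 2, 8)
```

Seeds on the same diagonal extend to the same segment, which is why the results are collected in a set before sorting.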
ATGGGGCGAGGCAGCGGCACCTTCGAGCGTCTCCTAGACAAGGCGACCAGCCAGCTCCTGTTG
GAGACAGATTGGGAGTCCATTTTGCAGATCTGCGACCTGATCCGCCAAGGGGACACACAAGCA
AAATATGCTGTGAATTCCATCAAGAAGAAAGTCAACGACAAGAACCCACACGTCGCCTTGTATG
CCCTGGAGGTCATGGAATCTGTGGTAAAGAACTGTGGCCAGACAGTTCATGATGAGGTGGCCA
ACAAGCAGACCATGGAGGAGCTGAAGGACCTGCTGAAGAGACAAGTGGAGGTAAACGTCCGTA
ACAAGATCCTGTACCTGATCCAGGCCTGGGCGCATGCCTTCCGGAACGAGCCCAAGTACAAGG
TGGTCCAGGACACCTACCAGATCATGAAGGTGGAGGGGCACGTCTTTCCAGAATTCAAAGAGA
GCGATGCCATGTTTGCTGCCGAGAGAGCCCCAGACTGGGTGGACGCTGAGGAATGCCACCGCT
GCAGGGTGCAGTTCGGGGTGATGACCCGTAAGCACCACTGCCGGGCGTGTGGGCAGATATTCT
GTGGAAAGTGTTCTTCCAAGTACTCCACCATCCCCAAGTTTGGCATCGAGAAGGAGGTGCGCGT
GTGTGAGCCCTGCTACGAGCAGCTGAACAGGAAAGCGGAGGGAAAGGCCACTTCCACCACTGA
   Dot matrix method (bioinfx.net)
   Dynamic Programming method
    ◦ Global (Needleman-Wunsch method)
    ◦ Local (Smith-Waterman method)
   Word method or k-tuple method (heuristic)




    FTFTALILLAVAV
    FTALLLAAV
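As an illustration of the dynamic programming approach, the two sequences above can be aligned globally with the Needleman-Wunsch method. The sketch below is a minimal version; the scoring scheme (match +1, mismatch -1, linear gap -2) is an assumption made here, not something taken from the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment; returns (score, aligned a, aligned b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = needleman_wunsch("FTFTALILLAVAV", "FTALLLAAV")
print(score)
print(top)
print(bottom)
```

A local (Smith-Waterman) variant would differ mainly by clamping each cell at zero and starting the traceback from the highest-scoring cell.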



http://www.ncbi.nlm.nih.gov/pmc/articles/PMC50453/pdf/pnas01096-
   Uses a neighbor-joining (NJ) guide tree.
    ◦ N sequences
      ½ * N! / (N-r)! with r = 2 -> N(N-1)/2 pairs
      5 sequences (5,4,3,2,1) -> 10 pairs
        (5,4), (5,3), (5,2), (5,1); (4,3),(4,2),(4,1);(3,2),(3,1);(2,1)
PAM
BLOSUM
Gonnet
DNA Identity Matrix
DNA PUPY matrix
   Substitution Matrices
      Insertions and deletions are less likely than
    substitutions
      An insertion or deletion in a coding DNA sequence can cause a frame
       shift.



PAM Matrices (Point Accepted Mutation Matrices)
Margaret Dayhoff, 1978

PAM1 -> Expected rates of substitution if 1% of the
amino acids have changed
 BLOSUM: Blocks Substitution Matrix (the number is the % identity used to cluster the sequences)
PAM matrices are based on a
   simple evolutionary model
    MATLFC          MLTLCC




          M(A/L)TL(F/C)C     Two changes
       Ancestral sequence?
• Only mutations are allowed
• Sites evolve independently
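Under this model, PAM1 substitution probabilities are extrapolated to larger distances by multiplying the matrix by itself, so PAM250 probabilities correspond to PAM1 raised to the 250th power. The sketch below shows only that extrapolation step, on a made-up three-letter alphabet; the numbers in `TOY_PAM1` are invented for illustration and are not Dayhoff's data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result


# Toy "PAM1": each letter stays unchanged with probability 0.99 (1% change).
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

toy_pam250 = mat_pow(TOY_PAM1, 250)
for row in toy_pam250:
    print([round(p, 4) for p in row])
```

The published PAM scoring matrices are log-odds scores derived from such probability matrices, a step this sketch leaves out.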
Guidelines for using matrices

Protein Query Length   Matrix     Open Gap   Extend Gap
>300                   BLOSUM50   -10        -2
85-300                 BLOSUM62   -7         -1
50-85                  BLOSUM80   -16        -4
>300                   PAM250     -10        -2
85-300                 PAM120     -16        -4
35-85                  MDM40      -12        -2
<=35                   MDM20      -22        -4
<=10                   MDM10      -23        -4
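The BLOSUM rows of this table can be turned into a small lookup. This is only a sketch: the function name `blosum_settings` is hypothetical, and because the table lists 85 in two ranges, the code assumes a query of exactly 85 residues falls in the 85-300 row.

```python
def blosum_settings(query_length: int):
    """Return (matrix, open gap, extend gap) from the BLOSUM rows of the table."""
    if query_length > 300:
        return ("BLOSUM50", -10, -2)
    if query_length >= 85:  # assumption: 85 belongs to the 85-300 row
        return ("BLOSUM62", -7, -1)
    if query_length >= 50:
        return ("BLOSUM80", -16, -4)
    raise ValueError("the table has no BLOSUM row for queries shorter than 50")


print(blosum_settings(120))  # ('BLOSUM62', -7, -1)
```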

PAM100   ==>    BLOSUM90
PAM120   ==>    BLOSUM80
PAM160   ==>    BLOSUM60
PAM200   ==>    BLOSUM52
PAM250   ==>    BLOSUM45
Scoring Matrices
S = [sij] gives score of aligning character i
  with character j for every pair i, j.


                              STPP
                              CTCA

                               0 + 3 + (-3) + 1

                                  =1
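The same sum can be written as a lookup over aligned positions. The sketch below holds only the four pair scores used in the example, copied from the arithmetic above; a real scoring matrix covers every pair of residues.

```python
# Only the pairs needed for the example, with the scores shown above.
SCORES = {
    ("S", "C"): 0,
    ("T", "T"): 3,
    ("P", "C"): -3,
    ("P", "A"): 1,
}


def pair_score(a: str, b: str) -> int:
    """Look up s_ij; the matrix is symmetric, so try both orders."""
    if (a, b) in SCORES:
        return SCORES[(a, b)]
    return SCORES[(b, a)]


def alignment_score(top: str, bottom: str) -> int:
    """Sum the pair scores column by column for an ungapped alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(top, bottom))


print(alignment_score("STPP", "CTCA"))  # 1
```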

Sequence Alignment, Blast, Fasta, MSA

Editor's notes

  1. A series of methods that rely on pairwise alignments