Short introduction to Bioinformatics
What are the Probabilistic Models?
Sequence Alignment
Pairwise Alignment
Multiple Sequence Alignment Models
What is Phylogenetics?
Building Phylogenetic Trees
Other Models
Contact Us




Introduction to Probabilistic Models for Bioinformatics

              Igor Bogicevic (igor.bogicevic@sbgenomics.com)




                                          July 3, 2011




              Seven Bridges Genomics, LLC




Short introduction to Bioinformatics




       Bioinformatics is the application of statistics and computer science to the field of
       molecular biology.
       Major research efforts in the field include sequence alignment, gene finding,
       genome assembly, drug design, drug discovery, protein structure alignment,
       protein structure prediction, prediction of gene expression and protein-protein
       interactions, genome-wide association studies and the modeling of evolution.
       Today, given the enormous volume of sequenced data, one of the biggest
       challenges is not producing the data but understanding it.




What are the Probabilistic Models?

       There are two basic definitions:
       A statistical analysis tool that estimates, on the basis of past (historical) data,
       the probability of an event occurring again.
       A system that simulates the object under consideration and produces different
       outcomes with different probabilities.
       A simple example - rolling a die.
       A more relevant example - the random sequence model for DNA.
       Biological sequences are strings over a finite alphabet of residues, most
       commonly either the four nucleotides or the twenty amino acids.
       Suppose that a residue a occurs with probability q_a, independently at each
       position. If a protein or DNA sequence is denoted x_1 ... x_n, then the
       probability of the whole sequence is:

       \[ q_{x_1} q_{x_2} \cdots q_{x_n} = \prod_{i=1}^{n} q_{x_i} \]
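
       The product above can be sketched in a few lines of Python. This is only an
       illustration: the function names and the uniform residue frequencies are
       assumptions made for the example, not values from the slides. Summing
       logarithms instead of multiplying probabilities avoids numerical underflow
       on long sequences.

```python
import math


def sequence_log_probability(sequence, residue_probs):
    """Log-probability of a sequence under the random sequence model.

    Each residue is assumed to be drawn independently with probability
    residue_probs[residue], so the log-probabilities simply add up.
    """
    return sum(math.log(residue_probs[residue]) for residue in sequence)


def sequence_probability(sequence, residue_probs):
    """Probability of the whole sequence (product of the q values)."""
    return math.exp(sequence_log_probability(sequence, residue_probs))


# Hypothetical residue frequencies: a uniform distribution over nucleotides.
uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

p = sequence_probability("ACGT", uniform_dna)
print(p)  # close to 0.25 ** 4
```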
Sequence Alignment




       Sequence alignment is a way of arranging the sequences of DNA, RNA, or protein
       to identify regions of similarity that may be a consequence of functional,
       structural, or evolutionary relationships between the sequences.
       A variety of computational algorithms have been applied to the sequence
       alignment problem, e.g. dynamic programming, heuristic algorithms, and
       probabilistic methods.
       Common file formats for representing sequences and alignments include FASTA
       and GenBank.
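
       As a rough illustration of the FASTA format, each record starts with a header
       line beginning with ">" and is followed by one or more lines of sequence. The
       parser below is a minimal sketch: the function name and the record names are
       hypothetical, and a real project would more likely use an established library.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical input with two records; the second is wrapped over two lines.
example = """>seq1 hypothetical record
ACACACTA
>seq2 hypothetical record
AGCA
CACA
"""

fasta_records = parse_fasta(example)
print(fasta_records)
```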




Pairwise Alignment


       Pairwise sequence alignment methods are used to find the best-matching
       piecewise (local) or global alignments of two query sequences.
       The three primary methods of producing pairwise alignments are dot-matrix
       methods, dynamic programming, and word methods.
       Needleman-Wunsch algorithm (Global Alignment)
       Smith-Waterman algorithm (Local Alignment)
       FASTA/BLAST algorithms (k-tuple heuristic word methods, often combined with
       dynamic programming)
       Gap penalties - modeling the cost of a gap in the aligned sequences (linear,
       affine, etc.)
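
       The two gap models named above can be sketched as follows. The function and
       parameter names are assumptions for illustration. A linear penalty charges
       the same cost for every gap position, while an affine penalty charges more
       for opening a gap than for extending it.

```python
def linear_gap_penalty(length, gap_cost):
    """Linear model: every gap position costs the same amount."""
    return -gap_cost * length


def affine_gap_penalty(length, gap_open, gap_extend):
    """Affine model: one opening cost plus a cheaper cost per extra position."""
    if length == 0:
        return 0
    return -gap_open - (length - 1) * gap_extend


# Hypothetical costs: a gap of length 3 under each model.
linear_score = linear_gap_penalty(3, gap_cost=1)
affine_score = affine_gap_penalty(3, gap_open=5, gap_extend=1)
print(linear_score, affine_score)
```

       Under the affine model one long gap is cheaper than several short gaps of the
       same total length, which is often closer to how insertions and deletions
       behave in real sequences.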



Example - Smith-Waterman: a matrix H is built as follows:

\[ H(i, 0) = 0, \quad 0 \le i \le m \]
\[ H(0, j) = 0, \quad 0 \le j \le n \]

\[ w(a_i, b_j) =
   \begin{cases}
     w(\text{match})    & \text{if } a_i = b_j \\
     w(\text{mismatch}) & \text{if } a_i \ne b_j
   \end{cases} \]

\[ H(i, j) = \max
   \begin{cases}
     0 \\
     H(i-1, j-1) + w(a_i, b_j) & \text{Match/Mismatch} \\
     H(i-1, j) + w(a_i, -)     & \text{Deletion} \\
     H(i, j-1) + w(-, b_j)     & \text{Insertion}
   \end{cases},
   \quad 1 \le i \le m, \; 1 \le j \le n \]



Sequence 1 = ACACACTA, Sequence 2 = AGCACACA
w(match) = +2
w(a,-) = w(-,b) = w(mismatch) = -1

H (rows: Sequence 2, columns: Sequence 1):

        -   A   C   A   C   A   C   T   A
    -   0   0   0   0   0   0   0   0   0
    A   0   2   1   2   1   2   1   0   2
    G   0   1   1   1   1   1   1   0   1
    C   0   0   3   2   3   2   3   2   1
    A   0   2   2   5   4   5   4   3   4
    C   0   1   4   4   7   6   7   6   5
    A   0   2   3   6   6   9   8   7   8
    C   0   1   4   5   8   8  11  10   9
    A   0   2   3   6   7  10  10  10  12

In the example, the highest value, 12, is in the cell at position (8,8). The
traceback visits (8,8), (7,7), (7,6), (6,5), (5,4), (4,3), (3,2), (2,1),
(1,1), and (0,0), which gives the alignment:
Sequence 1 = A-CACACTA
Sequence 2 = AGCACAC-A
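
The worked example can be reproduced with a short Python sketch. It assumes the
scores used above (match +2, mismatch -1, gap -1) and a linear gap cost; the
function name and return layout are choices made for this illustration, not part
of the original slides. Rows are indexed by the first argument and columns by
the second, so Sequence 2 is passed first to match the matrix layout.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of a (rows) and b (columns) with a linear gap cost.

    Returns (H, best_score, aligned_a, aligned_b).
    """
    m, n = len(a), len(b)
    # First row and first column stay at 0, as in the boundary conditions.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + w,  # match/mismatch
                H[i - 1][j] + gap,    # gap in b (residue of a unmatched)
                H[i][j - 1] + gap,    # gap in a (residue of b unmatched)
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        w = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + w:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return H, best, "".join(reversed(out_a)), "".join(reversed(out_b))


H, score, aligned_seq2, aligned_seq1 = smith_waterman("AGCACACA", "ACACACTA")
print(score)         # 12
print(aligned_seq1)  # A-CACACTA
print(aligned_seq2)  # AGCACAC-A
```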
Multiple Sequence Alignment Models



       A multiple sequence alignment (MSA) is a sequence alignment of three or more
       biological sequences, commonly protein, DNA, or RNA.
       We usually want to do multiple alignments to find a homologous sequences that
       point to a shared evolutionary origins that can be used for further phylogenetic
       analysis.
       Progressive Alignment Methods - constructing succession of a pairwise alignment.




                                                                                                                    EVEN BRIDGES
                                                                                                                        G E N O M I C S, LLC




             Igor Bogicevic (igor.bogicevic@sbgenomics.com)   Introduction to Probabilistic Models for Bioinformatics
Short introduction to Bioinformatics
                        What are the Probabilistic Models?
                                       Sequence Alignment
                                        Pairwise Alignment
                       Multiple Sequence Alignment Models
                                    What is Phylogenetics?
                                Building Phylogenetic Trees
                                              Other Models
                                               Conctact Us


Multiple Sequence Alignment Models



       A multiple sequence alignment (MSA) is an alignment of three or more
       biological sequences, commonly protein, DNA, or RNA.
       We usually build multiple alignments to find homologous sequences, which
       point to a shared evolutionary origin and can be used for further
       phylogenetic analysis.
       Progressive alignment methods - build the MSA through a succession of
       pairwise alignments.
       Hidden Markov models - represent the MSA as a directed acyclic graph (DAG);
       the observed states are the individual alignment columns and the hidden
       states represent the presumed ancestral sequence.
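To make the hidden Markov model idea concrete, here is a minimal sketch of the forward recursion, which sums over all hidden-state paths to give the probability of an observed sequence. This is a generic two-state toy HMM, not a profile HMM for a real alignment: the state names (`conserved`, `variable`) and every probability below are invented for illustration.

```python
from itertools import product

# Hypothetical toy model: all names and numbers are illustrative assumptions.
STATES = ("conserved", "variable")
START = {"conserved": 0.5, "variable": 0.5}
TRANS = {
    "conserved": {"conserved": 0.9, "variable": 0.1},
    "variable": {"conserved": 0.2, "variable": 0.8},
}
EMIT = {
    "conserved": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "variable": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}


def forward(seq):
    """Probability of a non-empty observed sequence, summed over hidden paths."""
    # f[s] = P(symbols seen so far, current hidden state = s)
    f = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for symbol in seq[1:]:
        f = {
            s: EMIT[s][symbol] * sum(f[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return sum(f.values())


# Sanity check: probabilities of all sequences of a fixed length sum to 1.
total = sum(forward("".join(p)) for p in product("ACGT", repeat=3))
print(round(total, 9))
print(forward("AAA") > forward("CGT"))  # A-rich input fits the toy model better
```

Profile HMMs used for alignment add match, insert, and delete states for each alignment column, but they score sequences with the same kind of recursion.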




What is Phylogenetics?



       Phylogenetics is the study of evolutionary relatedness among groups of organisms
       (e.g. species, populations), which is inferred from molecular sequencing
       data and morphological data matrices.
       Evolution is regarded as a branching process, whereby populations change
       over time and may speciate into separate branches, hybridize together, or
       terminate by extinction. This may be visualized in a phylogenetic tree.
       Ernst Haeckel's recapitulation theory ("ontogeny recapitulates phylogeny") is a
       hypothesis, now largely discredited, that in developing from embryo to adult,
       animals go through stages resembling or representing successive stages in the
       evolution of their remote ancestors.



Building Phylogenetic Trees


       Phylogenetic trees for a nontrivial number of input sequences are constructed
       using computational phylogenetics methods.
       A common method is to search for the maximum-likelihood tree, often within a
       Bayesian framework, applying an explicit model of evolution to phylogenetic
       tree estimation.
       Identifying the optimal tree under many of these criteria is NP-hard, so
       heuristic search and optimization methods are combined with tree-scoring
       functions to identify a reasonably good tree that fits the data.
       The resulting trees do not necessarily represent the species' evolutionary
       history accurately, because the data on which they are based are noisy; the
       analysis can be confounded by horizontal gene transfer, hybridisation between
       species that were not nearest neighbors on the tree before the hybridisation
       took place, convergent evolution, and conserved sequences.
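As an illustration of a likelihood-based tree-scoring function, the sketch below computes the likelihood of one alignment column on a fixed tree using Felsenstein's pruning recursion. It assumes the Jukes-Cantor (JC69) substitution model with uniform base frequencies; the tree encoding (nested tuples of `(child, branch_length)` pairs), the species names, and the branch lengths are all hypothetical. A second helper counts unrooted binary topologies, (2n-5)!! for n leaves, to show how quickly the search space grows.

```python
import math
from itertools import product

BASES = "ACGT"


def jc69_prob(parent, child, t):
    """JC69 probability of `child` given `parent` after branch length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if parent == child else 0.25 - 0.25 * e


def partial_likelihoods(node, site):
    """For each base b, P(observed leaves below node | node has base b)."""
    if isinstance(node, str):  # leaf: the observed base has probability 1
        return {b: 1.0 if site[node] == b else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in node:      # internal node: tuple of (child, branch_length)
        below = partial_likelihoods(child, site)
        for b in BASES:
            result[b] *= sum(jc69_prob(b, c, t) * below[c] for c in BASES)
    return result


def site_likelihood(tree, site):
    """Likelihood of one alignment column, with uniform root frequencies."""
    at_root = partial_likelihoods(tree, site)
    return sum(0.25 * at_root[b] for b in BASES)


def count_unrooted_trees(n_leaves):
    """Number of unrooted binary topologies: (2n-5)!! for n >= 3."""
    count = 1
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


# Hypothetical three-species tree with made-up branch lengths.
inner = (("human", 0.1), ("chimp", 0.1))
tree = ((inner, 0.2), ("mouse", 0.5))

same = site_likelihood(tree, {"human": "A", "chimp": "A", "mouse": "A"})
mixed = site_likelihood(tree, {"human": "A", "chimp": "C", "mouse": "G"})
print(same > mixed)              # identical bases are the more likely column
print(count_unrooted_trees(10))  # 2027025 topologies for just 10 sequences
```

A heuristic search would evaluate a score like this (multiplied over all alignment columns) for a sequence of candidate trees, rather than for every topology.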

Other Models




       Transformational grammars (Chomsky hierarchy)
       RNA structure analysis models (in RNA, evolution tends to preserve the
       base-pairing interactions rather than the sequence itself)
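To show what modelling base-pairing interactions can look like, here is a small sketch of Nussinov-style base-pair maximization, a simple non-probabilistic dynamic program that is often introduced before grammar-based RNA models. It is not taken from the slides. It assumes only Watson-Crick pairs (A-U, G-C) and a minimum hairpin loop of three unpaired bases; the function name `nussinov` and the `min_loop` parameter are illustrative.

```python
def nussinov(rna, min_loop=3):
    """Maximum number of nested base pairs (assumed rules: A-U and G-C only)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(rna)
    if n == 0:
        return 0
    # N[i][j] = best pair count for the subsequence rna[i..j]
    N = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(N[i + 1][j], N[i][j - 1])       # leave i or j unpaired
            if (rna[i], rna[j]) in pairs:
                best = max(best, N[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):                  # split into two parts
                best = max(best, N[i][k] + N[k + 1][j])
            N[i][j] = best
    return N[0][n - 1]


print(nussinov("GGGAAACCC"))  # 3: a stem of three G-C pairs around an AAA loop
print(nussinov("GAAC"))       # 0: the loop would be shorter than min_loop
```

Stochastic context-free grammars extend this kind of recursion by attaching probabilities to pairing and non-pairing productions.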




Contact Us




       We are Hiring!




 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Introduction to Probabilistic Models for Bioinformatics

  • 1. Introduction to Probabilistic Models for Bioinformatics. Igor Bogicevic (igor.bogicevic@sbgenomics.com), July 3, 2011. Seven Bridges Genomics, LLC. Outline: Short introduction to Bioinformatics; What are the Probabilistic Models?; Sequence Alignment; Pairwise Alignment; Multiple Sequence Alignment Models; What is Phylogenetics?; Building Phylogenetic Trees; Other Models; Contact Us.
  • 4. Short introduction to Bioinformatics: Bioinformatics is the application of statistics and computer science to the field of molecular biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, genome-wide association studies, and the modeling of evolution. Given the enormous volumes of sequenced data now available, one of the biggest challenges is not producing the data but understanding it.
  • 8. What are the Probabilistic Models? There are two basic definitions: (1) a statistical analysis tool that estimates, on the basis of past (historical) data, the probability of an event occurring again; (2) a system that simulates the object under consideration and produces different outcomes with different probabilities. A simple example is rolling a die. A more relevant example is the random sequence model for DNA. Biological sequences are strings over a finite alphabet of residues, most commonly either four nucleotides or twenty amino acids. Suppose a residue $a$ occurs with probability $q_a$. If a protein or DNA sequence is denoted $x_1 \ldots x_n$, then, assuming residues occur independently, the probability of the whole sequence is:
    $$q_{x_1} q_{x_2} \cdots q_{x_n} = \prod_{i=1}^{n} q_{x_i}$$
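The product formula above can be sketched in a few lines of Python. This is only an illustration: the uniform residue frequencies in `q_uniform` and the function name `sequence_log_prob` are assumptions made for the example, not values or names from the slides. The sum of logarithms is used instead of the raw product because multiplying many small probabilities underflows quickly for long sequences.

```python
import math


def sequence_log_prob(seq, q):
    """Log-probability of `seq` under the random sequence model.

    Each residue x_i is assumed to be drawn independently with
    probability q[x_i], so log P(x) = sum_i log q[x_i].
    """
    return sum(math.log(q[x]) for x in seq)


# Hypothetical residue frequencies: a uniform distribution over the
# four nucleotides (an assumption made only for this example).
q_uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

seq = "ACACACTA"
log_p = sequence_log_prob(seq, q_uniform)
p = math.exp(log_p)  # equals 0.25 ** len(seq) up to rounding
```

With real data the frequencies in `q` would typically be estimated from observed residue counts rather than fixed by hand.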
  • 11. Sequence Alignment: Sequence alignment is a way of arranging sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. A variety of computational algorithms have been applied to the sequence alignment problem, e.g. dynamic programming, heuristic algorithms, and probabilistic methods. Common formats for representing alignments are FASTA and GenBank.
  • 18. Pairwise Alignment: Pairwise sequence alignment methods are used to find the best-matching piecewise (local) or global alignments of two query sequences. The three primary methods of producing pairwise alignments are dot-matrix methods, dynamic programming, and word methods. Examples: the Needleman-Wunsch algorithm (global alignment); the Smith-Waterman algorithm (local alignment); the FASTA/BLAST algorithms (k-tuple heuristic methods, often combined with dynamic programming). Gap penalties model the cost of a gap in matched sequences (linear, affine, etc.).
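To make the difference between linear and affine gap penalties concrete, here is a minimal sketch. The function names and the default costs (`d`, `e`) are hypothetical choices for illustration; the slides do not give specific values. A linear penalty charges the same cost for every gap position, while an affine penalty charges a larger cost for opening a gap and a smaller one for each extension.

```python
def linear_gap(length, d=1.0):
    """Linear gap score: every gap position costs d."""
    return -d * length


def affine_gap(length, d=3.0, e=1.0):
    """Affine gap score: d to open the gap, e for each extra position."""
    if length == 0:
        return 0.0
    return -d - e * (length - 1)


# One gap of length 4 versus four separate gaps of length 1:
one_long_gap = affine_gap(4)          # -3 - 3*1 = -6
four_short_gaps = 4 * affine_gap(1)   # 4 * -3   = -12
```

Under the affine model a single long gap is cheaper than several short ones of the same total length, which is the usual motivation for choosing it; under the linear model the two cases score the same.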
  • 19. Example: Smith-Waterman. A matrix $H$ is built as follows:
    $$H(i, 0) = 0,\quad 0 \le i \le m; \qquad H(0, j) = 0,\quad 0 \le j \le n$$
    $$w(a_i, b_j) = \begin{cases} w(\text{match}) & \text{if } a_i = b_j \\ w(\text{mismatch}) & \text{if } a_i \ne b_j \end{cases}$$
    $$H(i, j) = \max \begin{cases} 0 \\ H(i-1, j-1) + w(a_i, b_j) & \text{Match/Mismatch} \\ H(i-1, j) + w(a_i, -) & \text{Deletion} \\ H(i, j-1) + w(-, b_j) & \text{Insertion} \end{cases}, \quad 1 \le i \le m,\ 1 \le j \le n$$
  • 22. Sequence 1 = ACACACTA, Sequence 2 = AGCACACA; w(match) = +2, w(a,-) = w(-,b) = w(mismatch) = -1. The matrix $H$ has one column per residue of Sequence 1 and one row per residue of Sequence 2:
    $$H = \begin{array}{c|ccccccccc}
      & - & A & C & A & C & A & C & T & A \\ \hline
    - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
    A & 0 & 2 & 1 & 2 & 1 & 2 & 1 & 0 & 2 \\
    G & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 \\
    C & 0 & 0 & 3 & 2 & 3 & 2 & 3 & 2 & 1 \\
    A & 0 & 2 & 2 & 5 & 4 & 5 & 4 & 3 & 4 \\
    C & 0 & 1 & 4 & 4 & 7 & 6 & 7 & 6 & 5 \\
    A & 0 & 2 & 3 & 6 & 6 & 9 & 8 & 7 & 8 \\
    C & 0 & 1 & 4 & 5 & 8 & 8 & 11 & 10 & 9 \\
    A & 0 & 2 & 3 & 6 & 7 & 10 & 10 & 10 & 12
    \end{array}$$
    In the example, the highest value corresponds to the cell in position (8,8). The walk back corresponds to (8,8), (7,7), (7,6), (6,5), (5,4), (4,3), (3,2), (2,1), (1,1), and (0,0), which gives the alignment Sequence 1 = A-CACACTA, Sequence 2 = AGCACAC-A.
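The matrix fill and the walk back in this example can be reproduced with a short Python sketch. It assumes a linear gap cost and the scores from the example (+2 match, -1 mismatch, -1 gap). The function names and the tie-breaking order in the traceback (diagonal first, then up, then left) are choices made for this illustration, not something the slides specify. Rows are indexed by Sequence 2 and columns by Sequence 1, following the matrix layout of the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the local-alignment matrix H; rows follow a, columns follow b."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]  # first row and column stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + w,  # match / mismatch
                H[i - 1][j] + gap,    # residue of a aligned to a gap
                H[i][j - 1] + gap,    # residue of b aligned to a gap
            )
    return H


def traceback(H, a, b, match=2, mismatch=-1, gap=-1):
    """Walk back from the highest-scoring cell until a zero cell is reached."""
    cells = ((i, j) for i in range(len(H)) for j in range(len(H[0])))
    i, j = max(cells, key=lambda ij: H[ij[0]][ij[1]])
    row_aln, col_aln = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        w = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + w:      # diagonal step
            row_aln.append(a[i - 1]); col_aln.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:      # step up
            row_aln.append(a[i - 1]); col_aln.append("-")
            i -= 1
        else:                                   # step left
            row_aln.append("-"); col_aln.append(b[j - 1])
            j -= 1
    return "".join(reversed(row_aln)), "".join(reversed(col_aln))


seq1, seq2 = "ACACACTA", "AGCACACA"
H = smith_waterman(seq2, seq1)           # rows: Sequence 2, columns: Sequence 1
aligned2, aligned1 = traceback(H, seq2, seq1)
```

Both the fill and the traceback take time and memory proportional to the product of the two sequence lengths, which is why the k-tuple heuristics mentioned earlier are preferred for database-scale searches.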
  • 25. Multiple Sequence Alignment Models: A multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, commonly protein, DNA, or RNA. Multiple alignments are usually built to find homologous sequences that point to a shared evolutionary origin and can be used for further phylogenetic analysis. Progressive alignment methods construct the MSA as a succession of pairwise alignments. Hidden Markov models represent the MSA as a directed acyclic graph (DAG): the observed states are the individual alignment columns, and the hidden states represent the presumed ancestral sequence.
  • 29. What is Phylogenetics? Phylogenetics is the study of evolutionary relatedness among groups of organisms (e.g. species, populations), which is discovered through molecular sequencing data and morphological data matrices. Evolution is regarded as a branching process, whereby populations are altered over time and may speciate into separate branches, hybridize together, or terminate by extinction. This may be visualized in a phylogenetic tree. Ernst Haeckel's recapitulation theory ("ontogeny recapitulates phylogeny") is a hypothesis that, in developing from embryo to adult, animals go through stages resembling or representing successive stages in the evolution of their remote ancestors.
  • 33. Building Phylogenetic Trees: Phylogenetic trees among a nontrivial number of input sequences are constructed using computational phylogenetics methods. A common method is to search for the maximum-likelihood tree, often within a Bayesian framework, applying an explicit model of evolution to phylogenetic tree estimation. Identifying the optimal tree using many of these techniques is NP-hard, so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data. The resulting trees do not necessarily represent the species' evolutionary history accurately, because the data on which they are based is noisy; the analysis can be confounded by horizontal gene transfer, hybridisation between species that were not nearest neighbors on the tree before hybridisation took place, convergent evolution, and conserved sequences.
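One way to get a feel for why exhaustive tree search is impractical is to count the candidate topologies. The sketch below uses the standard combinatorial result that n labelled taxa admit (2n-5)!! distinct unrooted bifurcating trees; this count is not taken from the slides, and the function name is hypothetical. Note that the size of the search space is only a motivation for heuristics and is not by itself the NP-hardness argument.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating tree topologies on n labelled taxa.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5), n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


# Growth of the search space for a few small inputs.
counts = {n: num_unrooted_trees(n) for n in (3, 4, 5, 10)}
```

Ten taxa already give about two million topologies, and each additional taxon multiplies the count by the next odd number, so scoring every tree stops being feasible for quite small inputs.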
  • 35. Other Models: transformational grammars (Chomsky hierarchy); RNA structure analysis models (RNA tends to conserve its base-pairing interactions rather than the sequence itself).
  • 36. Contact Us: We are hiring! Igor Bogicevic (igor.bogicevic@sbgenomics.com), Seven Bridges Genomics, LLC.