Comparative genomics
in eukaryotes
Genome analysis



  Klaas Vandepoele, PhD


Professor Ghent University
Comparative & Integrative Genomics
VIB – Ghent University, Belgium
I. Genome conservation & genomic
        homology
       Alignment of homologous regions
            Inter-genomic: aligning genomic sequences from different
             species
            Intra-genomic: aligning genomic sequences from the same
             species

       Different levels of resolution
          Comparative mapping (markers)
          Synteny (~ gene content)
          Colinearity (gene content + order conservation)
          DNA-based alignments (base-to-base mapping)




2
Human – Mouse - Rat
                          resolution




3
Human – Mouse orthologous regions

    Comparative mapping
    Genome translocations associated with human-mouse speciation

    [Figure: human chromosome segments orthologous to Mouse chr IV, with resolution scale]

4                                                       www.ensembl.org
Human genome browser

    Conserved gene content & order: Human chr I, Mouse chr IV
    Gene loss and insertions in orthologous segments since human-mouse speciation

    [Figure: genome browser view with tracks for EST/cDNA similarities, genome similarities and the human gene model, with resolution scale]

5
Human – Mouse base-to-base mapping

    Functional sequences (e.g. exons) evolve more slowly than
    non-functional ones (e.g. introns) due to natural selection
    against mutations in these regions.

    Consequently, functional elements, both coding and non-coding,
    are unusually well conserved in orthologous regions.

    [Figure: base-to-base alignment with resolution scale. Blue: coding exons; GT donor and AG acceptor sites marked]

6
DNA substitution rates for different
    gene/genome regions




7                                Molecular Evolution, Li WH
Multiple species comparisons
             (gene-based)




8   Hedges, 2002                            PhIGs
Genome size variation in the grasses:
       the use of model systems

    Clade   Species   Genome size
    BEP     Rice      450 Mb
    BEP     Barley    ~5000 Mb
    PACC    Sorghum   ~750 Mb
    PACC    Maize     ~2400 Mb

    [Figure: phylogenetic tree of the grasses with node ages of 55, 46 and 28 MYA]

9                                          Gaut 2002
Grass genomes: a single genetic
     system?
                               Gale and Devos, 1998




10
Micro-colinearity within the grasses




11                                  Bennetzen lab
Yeast Gene Order Browser (YGOB)




12
II. Computational detection of
         genomic homology
        Synteny
         ~ conservation of gene content
        Colinearity
         ~ conservation of (gene) content & order

        Macro-colinearity
            Marker-based
        Micro-colinearity
            DNA based or gene-based
13
How to find evidence for gene
          colinearity?
     A    1   2    3      4       5        6   7   8     9    10    11

                              speciation

     S1   1   2    3      4       5        6   7   8     9    10    11

     S2   1   2    3      4       5        6   7   8     9    10    11
                                                                         Time
                  Gene loss, insertions,
                  rearrangements,
                  translocation, etc …
              2

     S1   1        3      4                6   7              10    11

     S2   1   2           4                6   7   8     9          11




                               retained orthologs (anchor points)
14
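The anchor-point idea on the slide above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: genes are represented by the numbers used in the diagram, orthology is taken to be identity of those numbers, and the function names `anchor_points` and `is_colinear` are hypothetical, not part of any tool named in these slides.

```python
def anchor_points(seg1, seg2):
    """Return (gene, index in seg1, index in seg2) for genes retained in both segments."""
    pos2 = {gene: j for j, gene in enumerate(seg2)}
    return [(gene, i, pos2[gene]) for i, gene in enumerate(seg1) if gene in pos2]


def is_colinear(anchors):
    """True when the anchors occur in the same relative order in both segments."""
    positions = [j for _, _, j in anchors]
    return all(a < b for a, b in zip(positions, positions[1:]))


# Gene content of the two segments after divergence, as drawn in the diagram.
S1 = [1, 3, 4, 6, 7, 10, 11]
S2 = [1, 2, 4, 6, 7, 8, 9, 11]

anchors = anchor_points(S1, S2)
print([gene for gene, _, _ in anchors])  # [1, 4, 6, 7, 11]
print(is_colinear(anchors))              # True
```

Genes lost in one segment (for example 3 and 10 in S2) simply drop out of the anchor list, so gene loss alone does not break colinearity in this sketch.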
Matrix representation

     S1   1        3   4        6   7             10   11

     S2   1   2        4        6   7   8   9          11

     [Figure: matrix with segment S1 (1 - 3 4 - 6 7 X X 10 11) as columns
      and segment S2 (1 2 - 4 X 6 7 8 9 - 11) as rows]

15
Map-based approach
     • Represent chromosomes as sorted gene lists
     • Identify all homologous gene pairs between chromosomes
       (all-against-all BLASTP*).
     • Score pairs of homologues in matrix

     [Figure: gene homology matrix with Chromosome 1 and Chromosome 2 as axes]

     Identifying homologous regions = identifying diagonal series of
     elements in the gene homology matrix (GHM).
16                                            Vandepoele et al., Genome Research 2002
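A minimal Python sketch of the three steps on this slide follows. It is not the published implementation: the homologous pairs are supplied by hand instead of coming from an all-against-all BLASTP run, the gene names are made up, and the greedy search with its `max_gap` and `min_anchors` parameters is an assumption chosen to keep the example short.

```python
def build_ghm(chrom1, chrom2, homologs):
    """Gene homology matrix: cell (i, j) is 1 when gene i of chrom1
    and gene j of chrom2 form a homologous pair."""
    pairs = {frozenset(pair) for pair in homologs}
    return [[1 if frozenset((g1, g2)) in pairs else 0 for g2 in chrom2]
            for g1 in chrom1]


def diagonal_runs(ghm, max_gap=2, min_anchors=3):
    """Greedily chain hits whose row and column index both increase by
    at most max_gap + 1; keep chains with at least min_anchors hits."""
    hits = sorted((i, j) for i, row in enumerate(ghm)
                  for j, value in enumerate(row) if value)
    used, runs = set(), []
    for start in hits:
        if start in used:
            continue
        run = [start]
        for hit in hits:
            if hit in used:
                continue
            di = hit[0] - run[-1][0]
            dj = hit[1] - run[-1][1]
            if 1 <= di <= max_gap + 1 and 1 <= dj <= max_gap + 1:
                run.append(hit)
        if len(run) >= min_anchors:
            runs.append(run)
            used.update(run)
    return runs


# Hypothetical sorted gene lists and homologous pairs.
chrom1 = ["a1", "a2", "a3", "a4", "a5"]
chrom2 = ["b1", "b2", "b3", "b4", "b5", "b6"]
homologs = [("a1", "b2"), ("a2", "b3"), ("a3", "b1"), ("a4", "b4"), ("a5", "b6")]

ghm = build_ghm(chrom1, chrom2, homologs)
runs = diagonal_runs(ghm)
print(runs)  # [[(0, 1), (1, 2), (3, 3), (4, 5)]]
```

The hit at (2, 0) lies off the diagonal and is left out of the run, which is how an isolated homologous gene shows up in the matrix.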
The map-based approach: terminology


     [Figure: Gene Homology Matrix (GHM) with Chromosome 1 and Chromosome 2 as axes, marking:]
          Colinear segment
          Tandem duplication
          Homologous gene
          Inverted colinear segment
17
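The terms on this slide can be tied to simple patterns of hits in the matrix. The sketch below assumes that a colinear segment is a run of hits with increasing column index, an inverted colinear segment a run with decreasing column index, and a tandem duplication a gene on chromosome 1 that hits adjacent genes on chromosome 2. Both function names are hypothetical.

```python
def classify_run(run):
    """Label a chain of GHM hits (row, column) by its orientation."""
    cols = [j for _, j in sorted(run)]
    steps = list(zip(cols, cols[1:]))
    if all(a < b for a, b in steps):
        return "colinear segment"
    if all(a > b for a, b in steps):
        return "inverted colinear segment"
    return "mixed"


def tandem_duplications(hits):
    """Return (row, columns) for rows that hit two or more adjacent columns."""
    by_row = {}
    for i, j in hits:
        by_row.setdefault(i, []).append(j)
    tandems = []
    for i, cols in sorted(by_row.items()):
        cols.sort()
        group = [cols[0]]
        for j in cols[1:]:
            if j == group[-1] + 1:
                group.append(j)
            else:
                if len(group) > 1:
                    tandems.append((i, group))
                group = [j]
        if len(group) > 1:
            tandems.append((i, group))
    return tandems


print(classify_run([(0, 1), (1, 2), (3, 3)]))  # colinear segment
print(classify_run([(0, 5), (1, 4), (2, 3)]))  # inverted colinear segment
print(tandem_duplications([(3, 7), (3, 8), (3, 9), (5, 1)]))  # [(3, [7, 8, 9])]
```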
Detection of colinear homologous
             regions


     [Figure: two gene homology matrices. Human-mouse: HsaC1 vs MmuC4.
      Chicken-human: GgaC23 vs HsaC1.]
18
Detection of colinear homologous
             regions


     [Figure: two gene homology matrices. Human-mouse: HsaC1 vs MmuC4.
      Human-tetraodon: HsaC1 vs TviC1.]
19
MUMmer
     NUCmer   PROmer




20
And what about synteny?
     • Application of 2-dimensional sliding-window approach to score
       regions with a high density of homologous genes between 2
       chromosomes

     [Figure: gene homology matrix of HsaC1 vs HsaC9 showing an ancient duplication]

     Identifying syntenic regions = identifying high homolog-density
     regions in the gene homology matrix (GHM).
21                                                   DeSyRe, Vandepoele et al. unpublished
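A 2-dimensional sliding window over the GHM can be sketched as below. This is not the DeSyRe method, which the slide cites as unpublished; the square window, the raw hit count used as score and the `min_hits` cut-off are all assumptions made for illustration.

```python
def window_density(ghm, size):
    """Count homologous gene pairs in every size x size window of the GHM."""
    n_rows, n_cols = len(ghm), len(ghm[0])
    scores = {}
    for i in range(n_rows - size + 1):
        for j in range(n_cols - size + 1):
            scores[(i, j)] = sum(ghm[r][c]
                                 for r in range(i, i + size)
                                 for c in range(j, j + size))
    return scores


def dense_windows(ghm, size, min_hits):
    """Top-left corners of windows holding at least min_hits homologous pairs."""
    return sorted(corner for corner, score in window_density(ghm, size).items()
                  if score >= min_hits)


# Hypothetical matrix: a cluster of homologs whose order is scrambled,
# plus one isolated hit in the bottom-right corner.
ghm = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]

print(dense_windows(ghm, size=3, min_hits=4))  # [(1, 1), (1, 2)]
```

The cluster is found even though its hits do not form a diagonal, which is the difference between scoring gene content (synteny) and scoring gene order (colinearity).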
Detection of recent and ancient large-
            scale duplications

     [Figure: two gene homology matrices. Left: recent duplication (C2 vs C4),
      labelled colinearity. Right: ancient duplication (HsaC1 vs HsaC9),
      labelled synteny.]

22
III. Whole-genome alignments

        Evolutionarily constrained sequences are a
         good indicator of functional genome regions

        Basic protocol
         1.   Sequence generation
         2.   Reconstructing homologous colinearity across
              related genomes
         3.   Multi-sequence alignment
         4.   Detection of sequences under purifying selection



23                                             Margulies & Birney, NRG 2008
Reconstructing homologous
     colinearity




     • Segmental duplication and other species-specific
     rearrangements (e.g. inversions, insertions, deletions)
     interfere with the accurate detection of orthologous
     genomic regions


24
Tools

        Mercator (Ensembl)
            coding exons as anchor points
            graph of colinearity information
            travel through graph to generate homologous
             regions
        chains-and-nets (UCSC)
            reference-based local alignments between different
             genomes (BLASTZ)
            filtering highest-scoring chains
            net together chains from same locus

25
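The chaining step can be illustrated with a small dynamic-programming sketch. It is a simplification and not the UCSC implementation: gap penalties are ignored, only forward-strand blocks are considered, each block is assumed to have end > start with half-open coordinates, and the input list is assumed to be non-empty. The block tuples and the function name `best_chain` are hypothetical.

```python
def best_chain(blocks):
    """Highest-scoring chain of local alignment blocks that do not overlap
    and increase in both reference and query coordinates.

    Each block is (ref_start, ref_end, qry_start, qry_end, score).
    """
    blocks = sorted(blocks)
    best = [block[4] for block in blocks]   # best chain score ending at block k
    prev = [None] * len(blocks)             # predecessor of block k in that chain
    for k, (ref_start, _, qry_start, _, score) in enumerate(blocks):
        for m in range(k):
            _, m_ref_end, _, m_qry_end, _ = blocks[m]
            if (m_ref_end <= ref_start and m_qry_end <= qry_start
                    and best[m] + score > best[k]):
                best[k] = best[m] + score
                prev[k] = m
    k = max(range(len(blocks)), key=best.__getitem__)
    total, chain = best[k], []
    while k is not None:
        chain.append(blocks[k])
        k = prev[k]
    return total, chain[::-1]


# Hypothetical local alignments between a reference and a query genome.
blocks = [
    (0, 100, 0, 100, 90),
    (120, 200, 110, 190, 70),
    (150, 260, 500, 610, 95),   # high score, but out of order in the query
    (300, 380, 220, 300, 60),
]

total, chain = best_chain(blocks)
print(total)  # 220
print(chain)  # [(0, 100, 0, 100, 90), (120, 200, 110, 190, 70), (300, 380, 220, 300, 60)]
```

The out-of-order block scores well on its own but cannot join the main chain, which is the kind of alignment that filtering for the highest-scoring chain discards.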
Sequence alignment & constraint
     detection




                               PhastCons
                               BinCons
                               GERP
                               SiPhy




26
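The tools listed on this slide use explicit evolutionary models. As a much cruder stand-in, the sketch below scores each column of a multiple alignment by the fraction of sequences that share the most common base. It only illustrates the idea of column-wise constraint and is not the method of PhastCons, BinCons, GERP or SiPhy; the toy alignment and the function name are made up.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of sequences carrying the most common base.

    Gaps ('-') never count as the common base; a gap-only column scores 0.
    """
    scores = []
    for column in zip(*alignment):
        bases = [base for base in column if base != "-"]
        if not bases:
            scores.append(0.0)
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical three-species alignment (rows have equal length).
alignment = [
    "ACGT-A",
    "ACGTTA",
    "ACCT-G",
]

scores = column_conservation(alignment)
print([round(score, 2) for score in scores])
```

Columns with a score near 1 are identical across species; a real constraint detector would compare the observed substitutions with those expected under neutral evolution on the species tree.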
Whole-genome base-pair
         alignment

        Challenges
            multi-species alignment
            long DNA sequences (reflecting homologous
             colinear regions)
            one-to-one mapping (with reference genome)
            various levels of sequence divergence




27
Whole-genome base-pair
         alignment toolbox
        MLAGAN
            CHAOS seeding algorithm (k-mer anchors)
            Dynamic programming (pairwise)
            Multiple alignment using progressive strategy
            Shuffle-LAGAN (incl. rearrangement map); VISTA
        TBA / MultiZ; UCSC
            Pairwise BLASTZ alignments (local blocks)
            Merging and joining of blocks using MultiZ
            Complex ordering of blocks using Threaded Blockset Aligner
        PECAN (Ensembl)
            Consistency alignment based on pairwise alignments (incl. outgroup
             information)
        MAVID




28
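Seeding with k-mer anchors can be sketched as an index lookup. The example uses exact k-mer matches only, which is simpler than the CHAOS algorithm named on the slide; the sequences, the value of `k` and the function name `kmer_seeds` are illustrative assumptions.

```python
from collections import defaultdict


def kmer_seeds(seq_a, seq_b, k):
    """Return sorted (pos_a, pos_b) pairs where both sequences share an exact k-mer."""
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    seeds = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            seeds.append((i, j))
    return sorted(seeds)


seeds = kmer_seeds("ACGTACGGTTCA", "TTACGGTTGA", k=5)
print(seeds)  # [(3, 1), (4, 2), (5, 3)]
```

All three seeds lie on the same diagonal (pos_a - pos_b = 2), so a later dynamic-programming step only needs to align the sequence around and between them.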
From gene to DNA-based
                  colinearity…

Pairwise approach: Human segment as reference




29                                                    VISTA
                                           http://genome.lbl.gov/vista
From gene to DNA-based
     colinearity…




30
Input and output files




                              PipMaker




31                                Frazer et al., 2003
Conserved Non-coding Sequences or
              Elements (CNS/CNE)

Human/dog

Human/mouse

 Mouse/dog




                                                           VISTA plot
                                          Blue: exons
                                          Turquoise: UTR
32
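A conservation plot of this kind is a sliding-window percent identity along a pairwise alignment. The sketch below is a generic version of that calculation and not VISTA's own code; the window size, step, threshold and toy sequences are illustrative assumptions.

```python
def window_identity(aln_ref, aln_other, window=100, step=1):
    """Percent identity per window of a gapped pairwise alignment.

    Returns (window start, percent identity) pairs; gap columns count as mismatches.
    """
    if len(aln_ref) != len(aln_other):
        raise ValueError("aligned sequences must have equal length")
    matches = [1 if a == b and a != "-" else 0
               for a, b in zip(aln_ref, aln_other)]
    return [(start, 100.0 * sum(matches[start:start + window]) / window)
            for start in range(0, len(matches) - window + 1, step)]


def conserved_windows(profile, threshold=70.0):
    """Start positions of windows at or above the identity threshold."""
    return [start for start, pct in profile if pct >= threshold]


# Hypothetical aligned segments, with a tiny window so the result is easy to check.
profile = window_identity("ACGTACGTAC", "ACGTTCGAAC", window=4, step=2)
print(profile)                     # [(0, 100.0), (2, 75.0), (4, 50.0), (6, 75.0)]
print(conserved_windows(profile))  # [0, 2, 6]
```

Conserved windows that fall outside annotated exons and UTRs are the candidate conserved non-coding sequences discussed on this slide.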
Exercise

        Explore the genome organization and
         conservation of your favorite locus in a set of
         related species.

        Plants
           http://bioinformatics.psb.ugent.be/plaza/


        Vertebrates
           http://teleost.cs.uoregon.edu/synteny_db/


        Yeast
           http://wolfe.gen.tcd.ie/ygob/


33
34

Contenu connexe

Tendances

Whole Genome Sequencing Analysis
Whole Genome Sequencing AnalysisWhole Genome Sequencing Analysis
Whole Genome Sequencing AnalysisEfi Athieniti
 
Role of ensembl in genome browsing
Role of ensembl in genome browsingRole of ensembl in genome browsing
Role of ensembl in genome browsingJoydeep16
 
SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)talhakhat
 
Techniques in proteomics
Techniques in proteomicsTechniques in proteomics
Techniques in proteomicsN Poorin
 
Gene prediction methods vijay
Gene prediction methods  vijayGene prediction methods  vijay
Gene prediction methods vijayVijay Hemmadi
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicsAthira RG
 
BITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl databaseAshfaq Ahmad
 
Yeast two hybrid system
Yeast two hybrid systemYeast two hybrid system
Yeast two hybrid systemiqraakbar8
 
COMPARATIVE GENOMICS.ppt
COMPARATIVE GENOMICS.pptCOMPARATIVE GENOMICS.ppt
COMPARATIVE GENOMICS.pptSilpa87
 
Map based cloning
Map based cloning Map based cloning
Map based cloning PREETHYDAVID
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programsMugdhaSharma11
 
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Manikhandan Mudaliar
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencingDenis C. Bauer
 

Tendances (20)

Whole Genome Sequencing Analysis
Whole Genome Sequencing AnalysisWhole Genome Sequencing Analysis
Whole Genome Sequencing Analysis
 
Role of ensembl in genome browsing
Role of ensembl in genome browsingRole of ensembl in genome browsing
Role of ensembl in genome browsing
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)
 
Techniques in proteomics
Techniques in proteomicsTechniques in proteomics
Techniques in proteomics
 
Gene prediction methods vijay
Gene prediction methods  vijayGene prediction methods  vijay
Gene prediction methods vijay
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
BITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS - Introduction to comparative genomics
BITS - Introduction to comparative genomics
 
Sequence assembly
Sequence assemblySequence assembly
Sequence assembly
 
Genome assembly
Genome assemblyGenome assembly
Genome assembly
 
genomic comparison
genomic comparison genomic comparison
genomic comparison
 
The ensembl database
The ensembl databaseThe ensembl database
The ensembl database
 
Yeast two hybrid system
Yeast two hybrid systemYeast two hybrid system
Yeast two hybrid system
 
COMPARATIVE GENOMICS.ppt
COMPARATIVE GENOMICS.pptCOMPARATIVE GENOMICS.ppt
COMPARATIVE GENOMICS.ppt
 
Genome Assembly
Genome AssemblyGenome Assembly
Genome Assembly
 
Map based cloning
Map based cloning Map based cloning
Map based cloning
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programs
 
CODON BIAS
CODON BIASCODON BIAS
CODON BIAS
 
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencing
 

En vedette

Productivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformaticsProductivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformaticsBITS
 
The structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformaticsThe structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformaticsBITS
 
RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2BITS
 
Text mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformaticsText mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformaticsBITS
 
Managing your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformaticsManaging your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformaticsBITS
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3BITS
 
BITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry dataBITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry dataBITS
 
BITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics dataBITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics dataBITS
 
BITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra toolBITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra toolBITS
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingmikaelhuss
 
RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5BITS
 
RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1BITS
 
RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6BITS
 
RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4BITS
 
BITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysisBITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysisBITS
 
BITS: Introduction to Linux - Software installation the graphical and the co...
BITS: Introduction to Linux -  Software installation the graphical and the co...BITS: Introduction to Linux -  Software installation the graphical and the co...
BITS: Introduction to Linux - Software installation the graphical and the co...BITS
 
Projekt sociala ekonomin i motala - slutrapport 2015
Projekt sociala ekonomin i motala - slutrapport 2015Projekt sociala ekonomin i motala - slutrapport 2015
Projekt sociala ekonomin i motala - slutrapport 2015Jonas Lagander
 
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utveckling
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utvecklingLokala banksystem utan vinstkrav - för tillväxt och hållbar utveckling
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utvecklingJonas Lagander
 
Genevestigator
GenevestigatorGenevestigator
GenevestigatorBITS
 
Besök kimstad rapport förstudie
Besök kimstad   rapport förstudieBesök kimstad   rapport förstudie
Besök kimstad rapport förstudieJonas Lagander
 

En vedette (20)

Productivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformaticsProductivity tips - Introduction to linux for bioinformatics
Productivity tips - Introduction to linux for bioinformatics
 
The structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformaticsThe structure of Linux - Introduction to Linux for bioinformatics
The structure of Linux - Introduction to Linux for bioinformatics
 
RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2
 
Text mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformaticsText mining on the command line - Introduction to linux for bioinformatics
Text mining on the command line - Introduction to linux for bioinformatics
 
Managing your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformaticsManaging your data - Introduction to Linux for bioinformatics
Managing your data - Introduction to Linux for bioinformatics
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3
 
BITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry dataBITS - Protein inference from mass spectrometry data
BITS - Protein inference from mass spectrometry data
 
BITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics dataBITS - Genevestigator to easily access transcriptomics data
BITS - Genevestigator to easily access transcriptomics data
 
BITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra toolBITS - Comparative genomics: the Contra tool
BITS - Comparative genomics: the Contra tool
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
 
RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5
 
RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1
 
RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6RNA-seq for DE analysis: the biology behind observed changes - part 6
RNA-seq for DE analysis: the biology behind observed changes - part 6
 
RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4RNA-seq for DE analysis: extracting counts and QC - part 4
RNA-seq for DE analysis: extracting counts and QC - part 4
 
BITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysisBITS - Comparative genomics: gene family analysis
BITS - Comparative genomics: gene family analysis
 
BITS: Introduction to Linux - Software installation the graphical and the co...
BITS: Introduction to Linux -  Software installation the graphical and the co...BITS: Introduction to Linux -  Software installation the graphical and the co...
BITS: Introduction to Linux - Software installation the graphical and the co...
 
Projekt sociala ekonomin i motala - slutrapport 2015
Projekt sociala ekonomin i motala - slutrapport 2015Projekt sociala ekonomin i motala - slutrapport 2015
Projekt sociala ekonomin i motala - slutrapport 2015
 
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utveckling
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utvecklingLokala banksystem utan vinstkrav - för tillväxt och hållbar utveckling
Lokala banksystem utan vinstkrav - för tillväxt och hållbar utveckling
 
Genevestigator
GenevestigatorGenevestigator
Genevestigator
 
Besök kimstad rapport förstudie
Besök kimstad   rapport förstudieBesök kimstad   rapport förstudie
Besök kimstad rapport förstudie
 

Similaire à BITS - Comparative genomics on the genome level

Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesKlaas Vandepoele
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsgroovescience
 
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformDissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformKlaas Vandepoele
 
Role of molecular marker
Role of molecular markerRole of molecular marker
Role of molecular markerShweta Tiwari
 
The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...Borlaug Global Rust Initiative
 
Molecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingMolecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingFOODCROPS
 
Structural genomics
Structural genomicsStructural genomics
Structural genomicsAshfaq Ahmad
 
transposons complete ppt
transposons complete ppttransposons complete ppt
transposons complete ppttauseefsko
 
cytogenomics tools and techniques and chromosome sorting.pptx
cytogenomics tools and techniques and chromosome sorting.pptxcytogenomics tools and techniques and chromosome sorting.pptx
cytogenomics tools and techniques and chromosome sorting.pptxPABOLU TEJASREE
 
Proteomics course 1
Proteomics course 1Proteomics course 1
Proteomics course 1utpaltatu
 
Apollo Introduction for i5K Groups 2015-10-07
Apollo Introduction for i5K Groups 2015-10-07Apollo Introduction for i5K Groups 2015-10-07
Apollo Introduction for i5K Groups 2015-10-07Monica Munoz-Torres
 
Genomic Analyses: QTLs, etc.
Genomic Analyses:  QTLs, etc.Genomic Analyses:  QTLs, etc.
Genomic Analyses: QTLs, etc.gfb1
 
Research report (alternative splicing, protein structure; retinitis pigmentosa)
Research report (alternative splicing, protein structure; retinitis pigmentosa)Research report (alternative splicing, protein structure; retinitis pigmentosa)
Research report (alternative splicing, protein structure; retinitis pigmentosa)avalgar
 

Similaire à BITS - Comparative genomics on the genome level (20)

Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomes
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traits
 
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformDissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
 
Role of molecular marker
Role of molecular markerRole of molecular marker
Role of molecular marker
 
The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...
 
Molecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breedingMolecular marker and its application to genome mapping and molecular breeding
Molecular marker and its application to genome mapping and molecular breeding
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
transposons complete ppt
transposons complete ppttransposons complete ppt
transposons complete ppt
 
THE human genome
THE human genomeTHE human genome
THE human genome
 
Bioinformatica t8-go-hmm
Bioinformatica t8-go-hmmBioinformatica t8-go-hmm
Bioinformatica t8-go-hmm
 
cytogenomics tools and techniques and chromosome sorting.pptx
cytogenomics tools and techniques and chromosome sorting.pptxcytogenomics tools and techniques and chromosome sorting.pptx
cytogenomics tools and techniques and chromosome sorting.pptx
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Molecular tagging
Molecular tagging Molecular tagging
Molecular tagging
 
Proteomics course 1
Proteomics course 1Proteomics course 1
Proteomics course 1
 
Apollo Introduction for i5K Groups 2015-10-07
Apollo Introduction for i5K Groups 2015-10-07Apollo Introduction for i5K Groups 2015-10-07
Apollo Introduction for i5K Groups 2015-10-07
 
Genomic Analyses: QTLs, etc.
Genomic Analyses:  QTLs, etc.Genomic Analyses:  QTLs, etc.
Genomic Analyses: QTLs, etc.
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
 
Genomics
Genomics Genomics
Genomics
 
Research report (alternative splicing, protein structure; retinitis pigmentosa)
Research report (alternative splicing, protein structure; retinitis pigmentosa)Research report (alternative splicing, protein structure; retinitis pigmentosa)
Research report (alternative splicing, protein structure; retinitis pigmentosa)
 
Zinc finger
Zinc fingerZinc finger
Zinc finger
 

BITS - Comparative genomics on the genome level

  • 1. Comparative genomics in eukaryotes: genome analysis. Klaas Vandepoele, PhD, Professor, Ghent University; Comparative & Integrative Genomics, VIB – Ghent University, Belgium
  • 2. I. Genome conservation & genomic homology. Alignment of homologous regions: inter-genomic (aligning genomic sequences from different species) and intra-genomic (aligning genomic sequences from the same species). Different levels of resolution: comparative mapping (markers); synteny (~ gene content); colinearity (gene content + order conservation); DNA-based alignments (base-to-base mapping).
  • 3. Human – Mouse – Rat. [Figure: resolution scale]
  • 4. Human – Mouse orthologous regions (resolution: comparative mapping). Genome translocations associated with human-mouse speciation. [Figure: human chromosomes against mouse chr IV; source: www.ensembl.org]
  • 5. Human genome browser (resolution: conserved gene content & order). Human chr I against mouse chr IV. Gene loss and insertions in orthologous segments since human-mouse speciation. Tracks: human gene model; EST/cDNA similarities; genome similarities.
  • 6. Human – Mouse base-to-base mapping (resolution: DNA-based alignment). Functional sequences (e.g. exons) evolve more slowly than non-functional ones (e.g. introns) because natural selection acts against mutations in these regions. Consequently, functional elements, both coding and non-coding, are unusually well conserved in orthologous regions. Figure legend: blue, coding exons; GT donor and AG acceptor splice sites.
  • 7. DNA substitution rates for different gene/genome regions. (Li WH, Molecular Evolution)
  • 8. Multiple species comparisons (gene-based). (Hedges, 2002; PhIGs)
  • 9. Genome size variation in the grasses: the use of model systems. BEP clade: rice (450 Mb), barley (~5000 Mb). PACC clade: sorghum (~750 Mb), maize (~2400 Mb). Divergence times shown in the figure: 28, 46 and 55 MYA. (Gaut 2002)
  • 10. Grass genomes: a single genetic system? (Gale and Devos, 1998)
  • 11. Micro-colinearity within the grasses. (Bennetzen lab)
  • 12. Yeast Gene Order Browser (YGOB)
  • 13. II. Computational detection of genomic homology. Synteny ~ conservation of gene content. Colinearity ~ conservation of (gene) content & order. Macro-colinearity: marker-based. Micro-colinearity: DNA-based or gene-based.
  • 14. How to find evidence for gene colinearity? Ancestor A carries genes 1 2 3 4 5 6 7 8 9 10 11. Directly after speciation, S1 and S2 both carry genes 1 to 11 in the same order. Over time: gene loss, insertions, rearrangements, translocation, etc. Present day: S1 = 1 3 4 6 7 10 11; S2 = 1 2 4 6 7 8 9 11. The retained orthologs serve as anchor points.
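
The anchor points of slide 14 can be recovered with a few lines of Python. This is only a minimal sketch: the function name `anchor_points` is hypothetical, genes are identified by their ancestral number as on the slide, and each gene is assumed to occur at most once per segment.

```python
# Present-day gene orders taken from slide 14.
s1 = [1, 3, 4, 6, 7, 10, 11]
s2 = [1, 2, 4, 6, 7, 8, 9, 11]

def anchor_points(a, b):
    """Return (gene, index_in_a, index_in_b) for genes retained in both segments.

    Assumes every gene occurs at most once per segment (no tandem duplicates).
    """
    pos_b = {gene: j for j, gene in enumerate(b)}
    return [(gene, i, pos_b[gene]) for i, gene in enumerate(a) if gene in pos_b]

anchors = anchor_points(s1, s2)
retained = [gene for gene, _, _ in anchors]

# The segments are colinear if the anchors appear in the same order in both.
positions_in_s2 = [j for _, _, j in anchors]
is_colinear = all(x < y for x, y in zip(positions_in_s2, positions_in_s2[1:]))

print(retained)      # [1, 4, 6, 7, 11]
print(is_colinear)   # True
```

Genes 1, 4, 6, 7 and 11 survive in both lineages and keep their relative order, which is the signal the matrix representation on the next slide makes visible.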
  • 15. Matrix representation. Segment S1 (1 3 4 6 7 10 11) is plotted against segment S2 (1 2 4 6 7 8 9 11). [Figure: matrix of segment S1 against segment S2, with gaps marked in each segment]
  • 16. Map-based approach. Represent chromosomes as sorted gene lists. Identify all homologous gene pairs between chromosomes (all-against-all BLASTP*). Score pairs of homologues in a matrix. Identifying homologous regions = identifying diagonal series of elements in the gene homology matrix (GHM). (Vandepoele et al., Genome Research 2002)
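
The steps of slide 16 can be illustrated with a small sketch. It is not the algorithm of the cited paper: the greedy diagonal search, the function names, the gene identifiers and the `max_gap` / `min_len` parameters are all assumptions made for illustration, and the homologous pairs are given directly rather than computed with BLASTP. Only forward diagonals are searched; inverted colinear segments would need the mirrored test.

```python
def gene_homology_matrix(chrom1, chrom2, homolog_pairs):
    """Return the set of cells (i, j) where gene i of chrom1 and gene j of chrom2 are homologous."""
    pairs = {frozenset(p) for p in homolog_pairs}
    return {(i, j)
            for i, g1 in enumerate(chrom1)
            for j, g2 in enumerate(chrom2)
            if frozenset((g1, g2)) in pairs}

def diagonal_runs(cells, max_gap=2, min_len=3):
    """Greedily chain cells into forward diagonals, tolerating up to max_gap skipped genes."""
    remaining = set(cells)
    runs = []
    for start in sorted(cells):
        if start not in remaining:
            continue
        run = [start]
        remaining.discard(start)
        while True:
            i, j = run[-1]
            # Candidate next cells lie down-right of the last cell, within the gap limit.
            nxt = [(a, b) for (a, b) in remaining
                   if 0 < a - i <= max_gap + 1 and 0 < b - j <= max_gap + 1]
            if not nxt:
                break
            # Take the closest candidate; ties are broken by coordinates.
            best = min(nxt, key=lambda c: ((c[0] - i) + (c[1] - j), c))
            run.append(best)
            remaining.discard(best)
        if len(run) >= min_len:
            runs.append(run)
    return runs

# Hypothetical sorted gene lists and homologous pairs.
chrom1 = ["a1", "a2", "a3", "a4", "a5"]
chrom2 = ["b9", "b1", "b2", "bx", "b4", "b5"]
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b9"), ("a4", "b4"), ("a5", "b5")]

ghm = gene_homology_matrix(chrom1, chrom2, pairs)
runs = diagonal_runs(ghm)
print(runs)   # [[(0, 1), (1, 2), (3, 4), (4, 5)]]
```

The isolated hit between `a3` and `b9` lies off the diagonal and is discarded, while the four remaining hits form one colinear segment despite the gap.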
  • 17. The map-based approach: terminology. Elements labelled in the Gene Homology Matrix (GHM) of chromosome 1 against chromosome 2: homologous gene, colinear segment, inverted colinear segment, tandem duplication.
  • 18. Detection of colinear homologous regions. Human-mouse (HsaC1 against MmuC4); chicken-human (GgaC23 against HsaC1).
  • 19. Detection of colinear homologous regions. Human-mouse (HsaC1 against MmuC4); human-Tetraodon (HsaC1 against TviC1).
  • 20. MUMmer (NUCmer, PROmer)
  • 21. And what about synteny? Application of a 2-dimensional sliding-window approach to score regions with a high density of homologous genes between 2 chromosomes (example: HsaC1 against HsaC9, an ancient duplication). Identifying syntenic regions = identifying high homolog-density regions in the gene homology matrix (GHM). (DeSyRe, Vandepoele et al., unpublished)
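
A 2-dimensional sliding window over the GHM can be sketched as follows. This is an illustrative brute-force version, not the unpublished DeSyRe implementation; the function name, the square window and the example cells are assumptions.

```python
def densest_window(cells, n_rows, n_cols, size):
    """Slide a size x size window over the GHM.

    Returns (count, top, left) for the window holding the most homologous pairs;
    the first such window in row-major order wins ties.
    """
    best = (0, 0, 0)
    for top in range(max(1, n_rows - size + 1)):
        for left in range(max(1, n_cols - size + 1)):
            count = sum(1 for (i, j) in cells
                        if top <= i < top + size and left <= j < left + size)
            if count > best[0]:
                best = (count, top, left)
    return best

# Hypothetical GHM cells for two chromosomes of 10 genes each.
cells = {(0, 0), (5, 5), (5, 6), (6, 5), (7, 7), (9, 1)}
result = densest_window(cells, n_rows=10, n_cols=10, size=3)
print(result)   # (4, 5, 5)
```

The four clustered hits are not on one diagonal, so a colinearity search would miss them, but the window count still flags the region as syntenic because gene order inside the window is ignored.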
  • 22. Detection of recent and ancient large-scale duplications. Recent duplication (C2 against C4): detected through colinearity. Ancient duplication (HsaC1 against HsaC9): detected through synteny.
  • 23. III. Whole-genome alignments. Evolutionarily constrained sequences are a good indicator of functional genome regions. Basic protocol: (1) sequence generation; (2) reconstructing homologous colinearity across related genomes; (3) multi-sequence alignment; (4) detection of sequences under purifying selection. (Margulies & Birney, NRG 2008)
  • 24. Reconstructing homologous colinearity. Segmental duplication and other species-specific rearrangements (e.g. inversions, insertions, deletions) interfere with the accurate detection of orthologous genomic regions.
  • 25. Tools. Mercator (Ensembl): coding exons as anchor points; graph of colinearity information; traversal of the graph to generate homologous regions. Chains-and-nets (UCSC): reference-based local alignments between different genomes (BLASTZ); filtering of the highest-scoring chains; netting together of chains from the same locus.
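
To make the idea of a "highest-scoring chain" concrete, the sketch below chains local alignment blocks with a simple quadratic dynamic program. It is a simplification under stated assumptions, not the UCSC implementation: blocks are hypothetical tuples with half-open coordinates, only forward-strand blocks are handled, and no gap penalties are applied.

```python
def best_chain(blocks):
    """Find the highest-scoring colinear chain of local alignment blocks.

    blocks: list of (ref_start, ref_end, qry_start, qry_end, score), half-open.
    A block may follow another only if it starts at or after the other's end
    in both the reference and the query.
    """
    if not blocks:
        return 0, []
    blocks = sorted(blocks)
    n = len(blocks)
    best = [b[4] for b in blocks]   # best chain score ending at each block
    prev = [None] * n               # back-pointers for the traceback
    for k in range(n):
        for p in range(k):
            if blocks[p][1] <= blocks[k][0] and blocks[p][3] <= blocks[k][2]:
                candidate = best[p] + blocks[k][4]
                if candidate > best[k]:
                    best[k] = candidate
                    prev[k] = p
    k = max(range(n), key=best.__getitem__)
    total = best[k]
    chain = []
    while k is not None:
        chain.append(blocks[k])
        k = prev[k]
    return total, chain[::-1]

# Hypothetical blocks; the third one maps out of order in the query.
blocks = [
    (0, 100, 0, 100, 50),
    (120, 200, 110, 190, 40),
    (150, 260, 20, 130, 70),
    (300, 400, 210, 310, 60),
]
total, chain = best_chain(blocks)
print(total)   # 150
print(chain)   # [(0, 100, 0, 100, 50), (120, 200, 110, 190, 40), (300, 400, 210, 310, 60)]
```

The out-of-order block has the highest individual score but cannot be combined with its neighbours, so the chain through the three colinear blocks wins.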
  • 26. Sequence alignment & constraint detection: PhastCons, BinCons, GERP, SiPhy.
  • 27. Whole-genome base-pair alignment. Challenges: multi-species alignment; long DNA sequences (reflecting homologous colinear regions); one-to-one mapping (with reference genome); various levels of sequence divergence.
  • 28. Whole-genome base-pair alignment toolbox. MLAGAN: CHAOS seeding algorithm (k-mer anchors); dynamic programming (pairwise); multiple alignment using a progressive strategy; Shuffle-LAGAN (incl. rearrangement map); VISTA. TBA / MultiZ (UCSC): pairwise BLASTZ alignments (local blocks); merging and joining blocks using MultiZ; complex ordering of blocks using the Threaded Blockset Aligner. PECAN (Ensembl): consistency alignment based on pairwise alignments (incl. outgroup information). MAVID.
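
The notion of k-mer anchors used for seeding can be sketched with exact k-mer matches. This is a deliberately simplified illustration and not the CHAOS algorithm itself; the function name and the toy sequences are assumptions.

```python
from collections import defaultdict

def kmer_anchors(seq1, seq2, k):
    """Return sorted (pos_in_seq1, pos_in_seq2) pairs where both sequences share an exact k-mer."""
    index = defaultdict(list)
    for i in range(len(seq1) - k + 1):
        index[seq1[i:i + k]].append(i)
    anchors = []
    for j in range(len(seq2) - k + 1):
        # .get avoids inserting empty entries for k-mers absent from seq1
        for i in index.get(seq2[j:j + k], ()):
            anchors.append((i, j))
    return sorted(anchors)

seq1 = "ACGTACGGT"
seq2 = "TTACGGTA"
anchors = kmer_anchors(seq1, seq2, k=4)
print(anchors)   # [(3, 1), (4, 2), (5, 3)]
```

All three anchors share the same offset (position in `seq1` minus position in `seq2` equals 2), so they lie on one diagonal; a pairwise dynamic-programming step then only needs to align the regions between and around such anchors.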
  • 29. From gene to DNA-based colinearity… Pairwise approach: human segment as reference. (VISTA, http://genome.lbl.gov/vista)
  • 30. From gene to DNA-based colinearity…
  • 31. Input and output files: PipMaker. (Frazer et al., 2003)
  • 32. Conserved Non-coding Sequences or Elements (CNS/CNE). VISTA plot with three pairwise comparisons: human/dog, human/mouse, mouse/dog. Legend: blue, exons; turquoise, UTR.
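
Conservation plots of this kind are typically drawn from percent identity in a window that slides along an existing pairwise alignment. The sketch below shows that calculation only; the function name, the toy alignment and the window length are assumptions, and it does not reproduce the exact settings of any particular tool.

```python
def window_identity(aln_ref, aln_qry, window):
    """Percent identity for each window position of a gapped pairwise alignment.

    aln_ref and aln_qry are aligned strings of equal length, with '-' for gaps.
    Columns containing a gap never count as matches.
    """
    if len(aln_ref) != len(aln_qry):
        raise ValueError("aligned sequences must have the same length")
    scores = []
    for start in range(len(aln_ref) - window + 1):
        ref_part = aln_ref[start:start + window]
        qry_part = aln_qry[start:start + window]
        matches = sum(1 for a, b in zip(ref_part, qry_part) if a == b and a != "-")
        scores.append(100.0 * matches / window)
    return scores

ref = "ACGTACGTAC"
qry = "ACGTTCG-AC"
scores = window_identity(ref, qry, window=5)
print(scores)   # [80.0, 80.0, 80.0, 60.0, 60.0, 80.0]
```

Windows that stay above a chosen identity threshold and do not overlap annotated exons or UTRs would be the candidate conserved non-coding sequences; the threshold itself is a user choice.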
  • 33. Exercise. Explore the genome organization and conservation of your favorite locus in a set of related species. Plants: http://bioinformatics.psb.ugent.be/plaza/ Vertebrates: http://teleost.cs.uoregon.edu/synteny_db/ Yeast: http://wolfe.gen.tcd.ie/ygob/