WHOLE GENOME SEQUENCING OF
BACTERIA & ANALYSIS
ELAMURUGAN. A
Ph.D Scholar,
Vet. Immunology
INTRODUCTION
 1977 - the first complete DNA genome to be sequenced was
bacteriophage φX174 (5,386 bp)
 1995 - the first complete genome sequence of a free-living
organism, Haemophilus influenzae (1.83 Mb), was obtained by the
whole genome shotgun approach
 Sanger, Nicklen & Coulson (1977) - chain-termination sequencing
using dideoxynucleotide analogues
 Maxam & Gilbert (1977) - chemical degradation DNA
sequencing: terminally labeled DNA fragments are
chemically cleaved at specific bases and separated by gel
electrophoresis
Figure: sequencing status distribution, Genomes OnLine Database (GOLD)
http://www.genomesonline.org/cgi-bin/GOLD/sequencing_status_distribution.cgi
ARCHON X PRIZE
 X PRIZE Foundation in Santa Monica, CA, has
introduced the Archon X PRIZE for Genomics and will
award a sum of $10 million to the first team that can
design a system capable of sequencing 100 human
genomes in 10 days
SEQUENCING TECHNOLOGY
 First generation
 Sanger dideoxy chain-termination method
 Maxam & Gilbert chemical degradation method
 Next generation sequencing (NGS)
Second generation HT-NGS - sequencing after amplification
 454/Roche - pyrosequencing
 Illumina/Solexa - reversible dye terminators
 SOLiD/ABI - sequential ligation of oligonucleotide probes
Third generation HT-NGS - single molecule sequencing
 Heliscope
 SMRT (Pacific Biosciences)
 Single molecule real time (RNAP) sequencer
 Nanopore DNA sequencer
 Ion Torrent sequencing technology (PostLight)
 VisiGen Biotechnologies - FRET
 Advantages of 3rd generation HT-NGS over 2nd
 higher throughput
 faster turnaround time
 longer read lengths
 higher consensus accuracy
 small amounts of starting material
 low cost
ADVANTAGES OF HT-NGS
 Massively parallel sequencing of hundreds of thousands
or millions of templates
 Preliminary and tedious cloning work is eliminated and
substituted by PCR amplification
 In the most recent technologies even PCR is eliminated,
because single DNA molecules are sequenced directly
 Economical
 Reduced time
DISADVANTAGES OF HT-NGS
 Most NGS technologies produce short reads
 Construction of fragment libraries remains tricky and
involves several steps of fragmentation, adaptor ligation
and PCR amplification
 Homopolymer stretches are read inaccurately by the 454 technology
 Modified nucleotides cause mis-incorporation or block
further incorporation if the fluorescent moiety cannot be
completely removed
 Assembly of short reads into longer sequences is difficult
Figure: Illumina/Solexa technology
Figure: zero-mode waveguides (ZMWs)
Figure: Selection of a technology for an experiment
GENOME ASSEMBLY
 Assemblers join sequences together based on
overlapping regions between the sequences
 An assembly is composed of contigs and scaffolds
 Contigs - contiguous consensus sequences that are
derived from collections of overlapping reads
 Scaffolds - ordered and oriented sets of contigs that are
linked to one another by mate pairs of sequencing reads
 N50 - basic statistic for describing the contiguity of a
genome assembly: the contig length at which contigs of that
length or longer contain at least half of the assembled bases.
In general, the longer the N50, the more contiguous the assembly
 Reference-based assembly - alignment of reads against a
reference genome sequence
 De novo assembly - construction of longer sequences, such
as contigs or genomes, from shorter sequences, such as
sequence reads, without prior knowledge of the order of
the reads or reference to a closely related sequence
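A minimal Python sketch of two ideas from the assembly slides above: joining two reads through a suffix-prefix overlap, and computing N50 from a list of contig lengths. The function names, the min_overlap threshold and the example reads are illustrative assumptions, not taken from any particular assembler. Real assemblers also have to deal with sequencing errors, repeats and reverse-complement reads.

```python
from typing import List, Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads into one longer sequence, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


def n50(contig_lengths: List[int]) -> int:
    """Contig length at which the longest contigs together cover at least half of the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Hypothetical reads and contig lengths, for illustration only.
print(merge_reads("ATGCGT", "CGTTAA"))   # ATGCGTTAA
print(n50([80, 70, 50, 40, 30, 20]))     # 70
```

In the second example the assembly totals 290 bases; the two longest contigs (80 + 70 = 150) already cover more than half of it, so N50 is 70.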
GENE PREDICTION
 Ab initio gene prediction - uses mathematical models
rather than external evidence (such as EST and
protein alignments) to identify genes and to
determine their intron–exon structures
 Evidence-driven gene prediction - external evidence such
as ESTs can identify exon boundaries unambiguously
and has great potential to improve the
quality of gene prediction in newly sequenced
genomes. ESTs and proteins must first be aligned
to the genome
 Commonly used tools for gene prediction in
prokaryotes: Glimmer, GeneMark
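The simplest form of sequence-only gene finding can be sketched as a six-frame scan for open reading frames. This is a naive illustration, not how Glimmer or GeneMark work: those tools score candidates with statistical (Markov) models. The sketch assumes ATG as the only start codon, and the function names, the min_codons cut-off and the example sequence are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[str, int, int, str]]:
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end, orf) tuples; start and end are 0-based,
    end-exclusive coordinates on the scanned strand, and the length
    threshold counts the stop codon.
    """
    orfs = []
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(dna) - 2, 3):
                codon = dna[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos                      # first in-frame start codon
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, pos + 3, dna[start:pos + 3]))
                    start = None                     # look for the next ORF in this frame
    return orfs


# Hypothetical sequence, for illustration only.
print(find_orfs("CCATGAAATTTTAGGG"))
# [('+', 2, 14, 'ATGAAATTTTAG')]
```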
GENOME ANNOTATION
 Extraction of biological knowledge from raw
nucleotide sequences
 Seeks to identify every potential protein-coding gene
(open reading frames, ORFs)
 Predicted proteins are compared against available databases using tools such as BLASTP
 ‘Structural’ genome annotation is the process of identifying
genes and their intron–exon structures
 ‘Functional’ genome annotation is the process of attaching
meta-data such as gene ontology terms to structural
annotations
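Before a predicted ORF can be compared with protein databases by BLASTP, it has to be translated into an amino acid sequence. A minimal sketch of that step using the standard genetic code follows. The translate function and the example ORFs are illustrative assumptions; unknown codons are reported as X, and the special handling of alternative bacterial start codons is not modelled.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE.get(orf[pos:pos + 3].upper(), "X")
        if amino_acid == "*":      # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical ORF, for illustration only.
print(translate("ATGAAATTTTAG"))   # MKF
```

The resulting protein string is what would be submitted as a BLASTP query; the search itself is outside the scope of this sketch.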
APPLICATIONS
 The very large number of short reads helps to identify single nucleotide
polymorphisms (SNPs) when the reads are compared with a reference
genome
 Identification of rearrangements, deletions, insertions,
inversions
 Used to generate expressed sequence tags (ESTs) from RNA
sequencing
 Also used to detect small regulatory RNAs
 Illumina technology - ChIP-Seq to study protein-DNA
interactions
 Metagenomics
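The SNP application listed above can be sketched as a simple pileup: reads already aligned to the reference are stacked position by position, and a site is reported when the majority base differs from the reference. This is a toy illustration under strong assumptions (ungapped alignments, no base qualities, a hypothetical min_depth threshold), and the function name and data are invented for the example; production variant callers use statistical models.

```python
from collections import Counter
from typing import List, Tuple


def call_snps(
    reference: str,
    aligned_reads: List[Tuple[int, str]],
    min_depth: int = 2,
) -> List[Tuple[int, str, str, int, int]]:
    """Report positions where most reads disagree with the reference.

    `aligned_reads` holds (start, sequence) pairs with 0-based start positions.
    Returns (position, reference_base, variant_base, support, depth) tuples.
    """
    pileup = [Counter() for _ in reference]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue                      # too few reads to make a call
        base, support = counts.most_common(1)[0]
        if base != reference[pos] and support * 2 > depth:
            snps.append((pos, reference[pos], base, support, depth))
    return snps


# Hypothetical reference and reads, for illustration only.
reference = "ACGTACGTAC"
reads = [(0, "ACGTTCG"), (2, "GTTCGTA"), (3, "TTCGTAC")]
print(call_snps(reference, reads))
# [(4, 'A', 'T', 3, 3)]
```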
LEADS TO DEVELOPMENT
 Functional genomics
 Comparative genomics
 Environmental genomics (Metagenomics)
FUNCTIONAL GENOMICS
 Reveals genome structure and its relation to function
 Orthologs are homologs derived from a common ancestral gene
that diverged because of divergence of the
organisms; they tend to have similar functions
 Paralogs are homologs produced by gene duplication:
genes derived from a common ancestral gene
that duplicated within an organism and then diverged; they tend
to have different functions
 Xenologs are homologs resulting from the horizontal
transfer of a gene between two organisms. The function of
xenologs can be variable, depending on how significant the
change in context was for the horizontally moving gene. In
general, though, the function tends to be similar
PHYLOGENETIC ANALYSIS
 Phylogenetic trees, which are used to classify the
evolutionary relationships between homologous
genes represented in the genomes of divergent species
Figure: rooted phylogenetic tree with terminal nodes A, B, C, D and E; labels: ancestral node (root of the tree), internal nodes (divergence points), branches (lineages), terminal nodes
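The parts of a tree named above (root, internal nodes, terminal nodes) map naturally onto a nested data structure. The sketch below stores a rooted tree as nested tuples and writes it out in Newick notation. The topology chosen for the five taxa A to E is a hypothetical example, since the branching order of the original figure is not recoverable, and the function names are illustrative.

```python
from typing import List, Tuple, Union

# A tree is either a terminal node (a taxon name) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def terminal_nodes(tree: Tree) -> List[str]:
    """List the terminal nodes (leaves) from left to right."""
    if isinstance(tree, str):
        return [tree]
    leaves: List[str] = []
    for child in tree:
        leaves.extend(terminal_nodes(child))
    return leaves


def count_internal_nodes(tree: Tree) -> int:
    """Count divergence points, including the root."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_internal_nodes(child) for child in tree)


def to_newick(tree: Tree) -> str:
    """Serialise the tree topology in Newick notation (no branch lengths)."""
    def render(node: Tree) -> str:
        if isinstance(node, str):
            return node
        return "(" + ",".join(render(child) for child in node) + ")"
    return render(tree) + ";"


# Hypothetical topology: the outermost tuple is the root.
tree = ((("A", "B"), "C"), ("D", "E"))
print(terminal_nodes(tree))         # ['A', 'B', 'C', 'D', 'E']
print(count_internal_nodes(tree))   # 4
print(to_newick(tree))              # (((A,B),C),(D,E));
```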
COMPARATIVE GENOMICS
 Comparison of genome sequences reveals much
information about genome structure and evolution,
including importance of lateral gene transfer
 Tool to discover how microbes have adapted to a particular
ecology, and to aid development of new therapeutic
agents
METAGENOMICS
 Genomics-based study of genetic material
recovered directly from environmentally derived
samples without laboratory culture and compared
with all previously sequenced genes
 Reveals how microbes adapt to extreme environments,
which helps to discover new metabolic pathways and
protective mechanisms
IMPACT OF GENOME SEQUENCING
 Revealed genome reduction in intracellular (I/C) bacteria
 Genome plasticity (rearrangements, mobile elements)
 Gene duplication and diversification of protein function
 Lateral gene transfer & acquisition of new functions
 Adaptation to environments, virulence
 Industrial processes - fermentation technology
 Bioremediation
 Biotransformation
 Development of vaccines
 Bacterial diversity
 Synthetic biology
 Epigenetics
REVERSE VACCINOLOGY
 Use of genomic sequence information to identify novel
and better suited protein candidates for vaccines
 Serogroup B Neisseria meningitidis - candidates were drawn,
based on genomic data, from all proteins predicted to be surface
exposed and therefore accessible to antibodies
 Suitable candidates selected after sequencing various
strains
 Streptococcus agalactiae
 Pan-genome - composed of the core genome (the genes
present in all sequenced strains) and the dispensable
genome (genes present in a subset of strains)
 Synthetic biology - from the sequence of an entire genome to
the synthesis of genes de novo
 Identification of the minimal genome, the smallest set of
genes that enables life - Mycoplasma genitalium
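The pan-genome definition above reduces to set operations once each strain is described by the genes it carries. A minimal sketch follows; the strain names and gene identifiers are invented for illustration, and deciding which genes count as "the same gene" across strains (ortholog clustering) is assumed to have been done beforehand.

```python
from typing import Dict, Set, Tuple


def pan_genome(strain_genes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split the genes of several strains into core, dispensable and pan-genome sets."""
    gene_sets = list(strain_genes.values())
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets)            # genes seen in at least one strain
    core = set.intersection(*gene_sets)      # genes present in every strain
    dispensable = pan - core                 # genes present in only a subset of strains
    return core, dispensable, pan


# Hypothetical strains and gene names, for illustration only.
strains = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneC", "geneE"},
}
core, dispensable, pan = pan_genome(strains)
print(sorted(core))          # ['geneA', 'geneB']
print(sorted(dispensable))   # ['geneC', 'geneD', 'geneE']
print(len(pan))              # 5
```

For vaccine candidate selection, proteins encoded by the core set are the ones present in every sequenced strain.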
DATABASES AND TOOLS RELATED WITH BACTERIAL
GENOMIC DATA
 NCBI Entrez Genome Project database:
 http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj
 A searchable collection of complete and incomplete (in-progress)
large-scale sequencing, assembly, annotation, and mapping projects
for cellular organisms
 NCBI, Bacteria Genome Database:
 http://www.ncbi.nlm.nih.gov/genomes/static/eub.html
 The Genome database provides views for a variety of genomes,
complete chromosomes, sequence maps with contigs, and integrated
genetic and physical maps
 Bacterial Genomes at The Sanger Institute:
 http://www.sanger.ac.uk/Projects/Microbes/
 This website lists funded, ongoing, or completed sequencing projects of
pathogens at this institute
 TIGR Comprehensive Microbial Resource (CMR):
 http://cmr.tigr.org/tigr-scripts/CMR/CmrHomePage.cgi
 A free website displaying information on all the publicly available,
complete prokaryotic genomes
 GOLD: Genomes OnLine Database:
 http://www.genomesonline.org/
 A genome database containing information about which genomes have
been sequenced or are in progress
 Microbial Genome Database for Comparative Analysis (MBGD):
 http://mbgd.genome.ad.jp/
 A database for comparative analysis of completely sequenced microbial
genomes
 Virulence Factors of Bacterial Pathogens (VFDB):
 http://zdsys.chgb.org.cn/VFs/main.htm
 VFDB is an integrated and comprehensive database of virulence
factors for bacterial pathogens
 Genome Information Broker:
 http://gib.genes.nig.ac.jp/
 A comprehensive data repository of complete microbial genomes in the
public domain. Many microbial genomes can be explored graphically
 Islander, a Database of Genomic Islands:
 http://www.indiana.edu/~islander
 This database contains genomic islands discovered in completely
sequenced bacterial genomes
 GenoList genome browser at Institut Pasteur:
 http://genolist.pasteur.fr/
 Provides access to diverse genome browsers of pathogenic
bacteria
 IslandPath:
 http://www.pathogenomics.sfu.ca/islandpath/update/IPindex.pl
 An aid to the identification of genomic islands, including
pathogenicity islands, of potentially horizontally transferred genes
 HGT-DB:
 http://www.tinet.org/~debb/HGT/
 A database containing the prediction of horizontally transferred
genes in several prokaryotic complete genomes
 E. coli genome project:
 http://www.genome.wisc.edu
 A site devoted to the E. coli genome project with an updated
annotation of the genome