Molecular Genetics 
Genome 
Sequencing
At a glance
• What is a genome
• Types of genomes
• What is genomics
• How is genomics different from genetics
• Types of genomics
• Genome sequencing
• Milestones in genomic sequencing
• Technical foundations of genomics
• Steps of genome sequencing
• DNA sequencing approaches
• Hierarchical shotgun sequencing
• Markers used in mapping large genomes
• Whole genome shotgun sequencing
• New technologies
• Genome sequencing achievement in Bangladesh
• Benefits of Genome Research
WHAT IS A GENOME?
• Genome: One complete set of genetic information (total amount of DNA) from a haploid set of chromosomes of a single cell in eukaryotes, in a single chromosome in bacteria, or in the DNA or RNA of viruses.
• The basic set of chromosomes in an organism.
"The whole hereditary information of an organism that is encoded in the DNA"
• In cytogenetics, genome means a single (basic) set of chromosomes.
• It is denoted by x. The number of genomes in an organism depends on its ploidy level.
• In Drosophila melanogaster (2n = 2x = 8); genome x = 4.
• In hexaploid Triticum aestivum (2n = 6x = 42); genome x = 7.
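The relationship between somatic chromosome number, ploidy level and the basic number x in the two examples above can be checked with a short calculation. This is only a sketch; the function name `basic_chromosome_number` is hypothetical and not part of any library.

```python
def basic_chromosome_number(somatic_count: int, ploidy: int) -> int:
    """Return x, the chromosome number of one genome, from 2n = ploidy * x."""
    if ploidy <= 0 or somatic_count % ploidy != 0:
        raise ValueError("somatic count must be a positive multiple of the ploidy")
    return somatic_count // ploidy


# Drosophila melanogaster: 2n = 2x = 8
print(basic_chromosome_number(8, 2))
# Hexaploid Triticum aestivum: 2n = 6x = 42
print(basic_chromosome_number(42, 6))
```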
The genome is found inside every cell, and in cells that have a nucleus, the genome is situated inside the nucleus. Specifically, it is all the DNA in an organism.
• The term genome was introduced by H. Winkler in 1920 to denote the complete set of chromosomal and extrachromosomal genes present in an organism, including a virus.
How many types of genomes are there?
1. Prokaryotic Genomes
2. Eukaryotic Genomes
• Nuclear Genomes
• Mitochondrial Genomes
• Chloroplast Genomes
If not specified, "genome" usually refers to the nuclear genome.
WHAT IS GENOMICS?
• Genomics is the study of the structure and function of whole genomes.
• Genomics is the comprehensive study of whole sets of genes and their interactions rather than single genes or proteins.
• According to T.H. Roderick, genomics is the mapping and sequencing of genomes to analyze their structure and organization.
Origin of terminology
• The term genome was used by German botanist Hans Winkler in 1920
• Collection of genes in a haploid set of chromosomes
• Now it encompasses all DNA in a cell
Genomics is the subdiscipline of molecular genetics devoted to the mapping and sequencing of genomes and the analysis of their structure and organization.
• The field includes studies of intra-genomic phenomena such as heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome.
• The sequence information of the genome will show:
– The position of every gene along the chromosome,
– The regulatory regions that flank each gene, and
– The coding sequence that determines the protein produced by each gene.
How is Genomics different from Genetics?
Genetics is the study of inheritance, and genomics is the study of genomes.
– Genetics looks at single genes, one at a time, like a picture or snapshot.
– Genomics looks at the big picture and examines all the genes as an entire system.
Types of Genomics
1. Structural: It deals with the determination of the complete sequence of genomes and the gene map. This has progressed in steps as follows:
(i) construction of high-resolution genetic and physical maps,
(ii) sequencing of the genome, and
(iii) determination of the complete set of proteins in an organism.
2. Functional: It refers to the study of the functioning of genes, their regulation and their products (metabolic pathways), i.e., the gene expression patterns in an organism.
3. Comparative: It compares genes from different genomes to elucidate functional and evolutionary relationships.
Genome Sequencing
Genome sequencing is the technique that allows researchers to read the genetic information found in the DNA of anything from bacteria to plants to animals. Sequencing involves determining the order of bases, the nucleotide subunits adenine (A), guanine (G), cytosine (C) and thymine (T), found in DNA.
In short, genome sequencing is figuring out the order of DNA nucleotides.
Challenges of genome sequencing
• Data are produced in the form of short reads, which have to be assembled correctly into large contigs and chromosomes.
• Short reads may contain low-quality bases and vector/adaptor contamination.
• Several genome assemblers are available, but their performance has to be compared to find the best one.
Milestones in Genomic Sequencing
1977; Fred Sanger; φX174 bacteriophage (first sequenced genome); 5,375 bp
Amino acid sequence of phage proteins
Overlapping genes only in viruses
Fig: The genetic map of phage φX174 (overlapping reading frames)
1995; Craig Venter & Hamilton Smith; Haemophilus influenzae (1,830,137 bp) (first free-living organism).
Mycoplasma genitalium (smallest free-living organism, 580,000 bp; 470 genes)
1996; Saccharomyces cerevisiae (first eukaryote); 12,068,000 bp
1997; Escherichia coli; 4,639,221 bp; genetically more important.
1999; Human chromosome 22; 53,000,000 bp
2000; Drosophila melanogaster; 180,000,000 bp
2001; Human; working draft; 3,200,000,000 bp
2002; Plasmodium falciparum; 23,000,000 bp
Anopheles gambiae; 278,000,000 bp
Mus musculus; 2,500,000,000 bp
2003; Human; finished sequence; 3,200,000,000 bp
2005; Oryza sativa (first cereal grain); 489,000,000 bp
2006; Populus trichocarpa (first tree); 485,000,000 bp
Technical foundations of genomics
• Molecular biology: Almost all of the underlying techniques of genomics originated with recombinant-DNA technology.
• DNA sequencing: In particular, almost all DNA sequencing is still performed using the approach pioneered by Sanger.
• Library construction: Also essential to high-throughput sequencing is the ability to generate libraries of genomic clones and then cut portions of these clones and introduce them into other vectors.
• PCR amplification: The use of the polymerase chain reaction (PCR) to amplify DNA, developed in the 1980s, is another technique at the core of genomics approaches.
[Figure: log molecular weight (Log MW) plotted against migration distance]
• Hybridization techniques: Finally, the hybridization of one nucleic acid to another is used to detect and quantitate DNA and RNA (as in Southern blotting). This method remains the basis for genomics techniques such as microarrays.
Steps of genome sequencing
• Break the genome into smaller fragments
• Sequence those smaller pieces
• Piece the sequences of the short fragments together
DNA sequencing approaches
Two different methods are used:
1. Hierarchical shotgun sequencing
– Useful for sequencing genomes of higher vertebrates that contain repetitive sequences
2. Whole genome shotgun sequencing
– Useful for smaller genomes
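The three steps of genome sequencing (break, sequence, piece together) can be illustrated with a toy example. This is a minimal sketch under strong assumptions: reads are error-free, the genome has no repeats, and the fragments are taken as evenly spaced overlapping windows rather than at random. The names `fragment`, `overlap` and `greedy_assemble` are hypothetical, and real assemblers use far more elaborate methods.

```python
def fragment(genome, read_len, step):
    """Step 1 and 2 (toy): cut the genome into overlapping reads of fixed length."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Step 3 (toy): repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps: the result stays as separate contigs
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads


genome = "ATGGCGTACCTGA"  # made-up sequence for illustration
reads = fragment(genome, read_len=7, step=3)
print(reads)
print(greedy_assemble(reads))
```

If the reads do not overlap by at least `min_len` bases, the function returns several contigs instead of one sequence, which mirrors the gaps that remain in real assemblies.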
Hierarchical Shotgun Sequencing
• The method preferred by the Human Genome Project is the hierarchical shotgun sequencing method.
• Also known as
– the clone-by-clone strategy
– the map-based method
– map first, sequence later
– top-down sequencing
The Human Genome Project adopted a map-based strategy:
– Start with a well-defined physical map
– Produce the shortest tiling path for large-insert clones
– Assemble the sequence for each clone
– Then assemble the entire sequence, based on the physical map
In the Clone-by-Clone Strategy
1) Markers for regions of the genome are identified.
2) The genome is split into large fragments (50-200 kb) that contain a known marker, using restriction (cutting) enzymes.
3) These fragments are cloned in bacteria (E. coli) using BACs (Bacterial Artificial Chromosomes), where they are replicated and stored.
4) The BAC inserts are isolated and the whole genome is mapped by finding markers regularly spaced along each chromosome to determine the order of the cloned fragments.
5) The fragments contained in these clones have different ends, so with enough coverage a scaffold of overlapping BACs (a BAC contig) can be found. The BAC contig that covers the entire genomic area of interest makes up the tiling path.
6) Each BAC fragment in the tiling path (the "Golden Path") is fragmented randomly into smaller pieces, and these pieces are individually sequenced on both strands using automated Sanger sequencing.
7) These sequences are aligned so that identical sequences overlap. Assembly of the genome is done on the basis of prior knowledge of the markers used to localize sequenced fragments to their genomic location. A computer stitches the sequences together using the markers as a reference guide.
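Choosing a tiling path from mapped clones, as in step 5, can be sketched as an interval-covering problem. The example below is an illustration only: it assumes each clone already has known start and end coordinates on the physical map (in kb), and the clone names and coordinates are invented. The greedy rule (always take the clone that reaches furthest to the right) is a standard way to cover an interval with the fewest pieces; it is not necessarily the procedure used by any particular genome project.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Pick few clones that together cover [region_start, region_end].

    clones: list of (name, start, end) tuples with map coordinates.
    """
    path = []
    covered_to = region_start
    while covered_to < region_end:
        # Clones that start at or before the covered point and extend past it.
        candidates = [c for c in clones if c[1] <= covered_to < c[2]]
        if not candidates:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        best = max(candidates, key=lambda c: c[2])  # reaches furthest right
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical BAC clones with map positions in kb.
bacs = [
    ("BAC1", 0, 150),
    ("BAC2", 50, 200),
    ("BAC3", 120, 300),
    ("BAC4", 180, 330),
    ("BAC5", 290, 450),
]
print(minimal_tiling_path(bacs, 0, 450))
```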
Fig: Hierarchical shotgun sequencing
In this approach, every part of the genome is actually sequenced roughly 4-5 times to ensure that no part of the genome is left out.
Each 150,000 bp fragment is inserted into a BAC (bacterial artificial 
chromosome). A BAC can replicate inside a bacterial cell. A set of BACs 
containing an entire human genome is called a BAC library. 
The clone-by-clone strategy was used for
S. cerevisiae (yeast),
C. elegans (nematode),
Arabidopsis thaliana (mustard weed),
Oryza sativa,
Drosophila melanogaster and
Homo sapiens (human), etc.
Markers used in mapping large genomes
Different types of markers are used in mapping large genomes, such as
A. Restriction Fragment Length Polymorphisms (RFLP)
B. Variable Number of Tandem Repeats (VNTRs)
C. Sequence Tagged Sites (STS)
D. Microsatellites, etc.
A. Restriction Fragment Length Polymorphisms (RFLP)
Polymorphism means that a genetic locus has different forms, or alleles.
Cutting the DNA from any two individuals with a restriction enzyme may yield fragments of different lengths, called Restriction Fragment Length Polymorphisms (RFLPs). The abbreviation is usually pronounced "rifflip".
• The pattern of RFLPs generated will depend mainly on
– 1) The differences in the DNA of the selected strains or species
– 2) The restriction enzymes used
– 3) The DNA probe employed for Southern hybridization
Steps:
a. Consider the restriction enzyme HindIII, which recognizes the sequence AAGCTT.
b. Of two individuals, one has three HindIII sites in a region of a chromosome, so cutting the DNA with HindIII yields two fragments, 2 and 4 kb long.
Figure: Detecting an RFLP
c. The other individual may lack the middle site but have the other two, so cutting the DNA with HindIII yields one fragment 6 kb long. These fragments of different lengths constitute the RFLP.
d. These restriction fragments of different lengths between the genotypes can be detected on Southern blots with a suitable probe. An RFLP is detected as a differential movement of a band in the gel lanes from different species and strains. Each such band is regarded as a single RFLP locus, so any differences among the DNA of individuals are easy to see.
e. This RFLP is used as a marker in chromosomal mapping.
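The HindIII example in steps a to c can be reproduced in miniature. The sketch below scales the 2 kb and 4 kb fragments down to 20 bp and 40 bp so the sequences stay short; the two "individuals" are made-up sequences and `digest` is a hypothetical helper. HindIII cuts after the first A of AAGCTT, which is modelled by `cut_offset=1`.

```python
HINDIII_SITE = "AAGCTT"


def digest(dna, site=HINDIII_SITE, cut_offset=1):
    """Return the lengths of the fragments produced by cutting at every site."""
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Individual 1 has three sites; individual 2 has lost the middle one
# through a single base change (AAGCTT -> AAGCAT).
individual_1 = "GGGGG" + "AAGCTT" + "C" * 14 + "AAGCTT" + "C" * 34 + "AAGCTT" + "GGGGG"
individual_2 = "GGGGG" + "AAGCTT" + "C" * 14 + "AAGCAT" + "C" * 34 + "AAGCTT" + "GGGGG"

# Fragments between the outer sites (the two end pieces are dropped).
print(digest(individual_1)[1:-1])
print(digest(individual_2)[1:-1])
```

The two fragments of individual 1 add up to the single fragment of individual 2, which is the length difference a Southern blot would reveal.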
Limitations
• Requires a relatively large amount of highly pure DNA.
• Laborious and expensive to identify suitable marker and restriction enzyme combinations.
• Time consuming.
• Requires expertise in autoradiography because radioactively labeled probes are used.
B. Variable Number of Tandem Repeats (VNTRs)
The greater the degree of polymorphism of a marker, the more useful it is for mapping. Mapping with RFLPs that have only a few alleles becomes very tedious; in this case variable number tandem repeats (VNTRs) are more useful.
Tandem repeats occur in DNA when a pattern of one or more nucleotides is repeated and the repetitions are directly adjacent to each other.
An example would be:
ATTCGCCAATC ATTCGCCAATC ATTCGCCAATC
in which the sequence ATTCGCCAATC is repeated three times.
• A variable number tandem repeat (VNTR) is a location in a genome where a short nucleotide sequence is organized as a tandem repeat.
• The repeated sequence is relatively long, about 10-100 base pairs.
• The full genetic profiles of individuals reveal many differences.
• Most human genes are the same from person to person, but VNTRs tend to differ among different people.
• While the repeated sequences themselves are usually the same from person to person, the number of times they are repeated tends to vary.
• VNTRs are highly polymorphic. They can be isolated from an individual's DNA and are therefore relatively easy to map.
• However, VNTRs have a disadvantage as genetic markers: they tend to bunch together at the ends of chromosomes, leaving the interiors of the chromosomes relatively devoid of markers.
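The idea that individuals carry the same repeat unit but a different number of copies can be shown with the ATTCGCCAATC example. This is a simple sketch: the two alleles are invented, the flanking bases are arbitrary, and `tandem_repeat_count` is a hypothetical helper that only handles exact, uninterrupted repeats.

```python
import re


def tandem_repeat_count(dna, unit):
    """Largest number of back-to-back exact copies of unit found in dna."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)


unit = "ATTCGCCAATC"
allele_a = "GG" + unit * 3 + "TT"  # three copies, as in the example above
allele_b = "GG" + unit * 5 + "TT"  # a hypothetical allele with five copies

print(tandem_repeat_count(allele_a, unit))
print(tandem_repeat_count(allele_b, unit))
```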
C. Sequence Tagged Sites (STS)
Another kind of genetic marker, which is very useful to genome mappers, is the sequence-tagged site (STS).
• STSs are short sequences, about 60–1000 bp long, that can be easily detected by PCR using specific primers.
• The sequences of small regions of this DNA must be known, so that one can design primers that will hybridize to these regions and allow PCR to produce double-stranded fragments of predictable lengths. If a fragment of the proper size appears, then the DNA has the STS of interest.
• One great advantage of STSs as a mapping tool is that no DNA must be cloned and examined.
• Instead, the sequences of the primers used to generate an STS are published, and then anyone in the world can order those same primers and find the same STS in an experiment that takes just a few hours.
Figure: Sequence-tagged sites. In this example, two PCR primers (red) spaced 250 bp apart have been used. Several cycles of PCR generate many double-stranded PCR products that are precisely 250 bp long. Electrophoresis of this product allows one to measure its size exactly and confirm that it is the correct one.
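The STS test (do the two primers bind, and is the product the expected size?) can be imitated on a sequence in software. The sketch below is only an illustration: the template and both primers are made up, only exact primer matches on one orientation of the template are considered, and `pcr_product_length` is a hypothetical helper, not a function from any PCR or bioinformatics package.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def pcr_product_length(template, forward, reverse):
    """Expected product length, or None if the primers do not both bind in order."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


forward = "GATTACAGGC"                         # hypothetical 10-base primer
reverse = reverse_complement("CCTGAGTCAA")     # hypothetical 10-base primer
template = "TT" + forward + "AC" * 115 + "CCTGAGTCAA" + "GG"

print(pcr_product_length(template, forward, reverse))
print(pcr_product_length("ACGT" * 50, forward, reverse))
```

With these made-up sequences the first call gives a 250 bp product, matching the spacing described in the figure, and the second call finds no STS.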
Making a physical map using Sequence Tagged Sites (STS)
1. Geneticists interested in physically mapping or sequencing a given region of a genome aim to assemble a set of clones called a contig, which contains contiguous (actually overlapping) DNAs spanning long distances.
2. It is essential to have vectors like BACs and YACs that hold big chunks of DNA. Assuming we have a BAC library of the human genome, we need some way to identify the clones that contain the region we want to map.
3. A reliable method is to look for STSs in the BACs. It is best to screen the BAC library for at least two STSs, spaced hundreds of kilobases apart, so that BACs spanning a long distance are selected.
4. After we have found a number of positive BACs, we begin mapping by screening them for several additional STSs, so we can line them up in an overlapping fashion. This set of overlapping BACs is our new contig. We can now begin finer mapping, and even sequencing, of the contig.
Fig: Mapping with STSs.
At top left, several representative BACs are shown, with different symbols representing different STSs placed at specific intervals. In step (a) of the mapping procedure, screen for two or more widely spaced STSs; in this case, screen for STS1 and STS4. All those BACs with either STS1 or STS4 are shown at top right. The identified STSs are shown in color. In step (b), each of these positive BACs is further screened for the presence of STS2, STS3, and STS5. The colored symbols on the BACs at bottom right denote the STSs detected in each BAC. In step (c), align the STSs in each BAC to form the contig. Measuring the lengths of the BACs by pulsed-field gel electrophoresis helps to pin down the spacing between pairs of BACs.
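Steps (a) to (c) of the figure can be mimicked with sets. This is a rough sketch under stated assumptions: the BAC names and their STS content are invented, and the order of the STSs along the chromosome is taken as already known, whereas in a real project that order is itself inferred from the screening data. The helper names `screen` and `order_contig` are hypothetical.

```python
def screen(library, wanted):
    """Steps (a)/(b): keep the BACs that contain at least one of the wanted STSs."""
    wanted = set(wanted)
    return {bac: stss for bac, stss in library.items() if stss & wanted}


def order_contig(positives, sts_order):
    """Step (c): line up BACs by the leftmost and rightmost STS they carry."""
    rank = {sts: i for i, sts in enumerate(sts_order)}

    def span(bac):
        ranks = [rank[s] for s in positives[bac] if s in rank]
        return (min(ranks), max(ranks))

    return sorted(positives, key=span)


# Hypothetical library: which STSs were detected by PCR in each BAC.
library = {
    "BAC_D": {"STS4", "STS5"},
    "BAC_B": {"STS1", "STS2", "STS3"},
    "BAC_F": {"STS9"},
    "BAC_A": {"STS1", "STS2"},
    "BAC_E": {"STS2", "STS3"},
    "BAC_C": {"STS3", "STS4"},
}

positives = screen(library, ["STS1", "STS4"])
contig = order_contig(positives, ["STS1", "STS2", "STS3", "STS4", "STS5"])
print(contig)
```

Neighbouring BACs in the result share at least one STS, which is the evidence that they overlap.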
D. Microsatellites
STSs are very useful in physical mapping or locating specific sequences in the genome. But sometimes it is not possible to use them for genetic mapping.
Fortunately, geneticists have discovered a class of STSs called microsatellites.
GCTTGGTGTGATGTAGAAGGCGCCAATGCATCTCGACGTAT
GCGTATACGGGTTACCCCCTTTGCAATCAGTGCACACACAC
ACACACACACACACACACACACACACACACAGTGCCAAGCA
AAAATAACGCCAAGCAGAACGAAGACGTTCTCGAGAACACC
• Microsatellites are similar to minisatellites in that they consist of a core sequence repeated over and over many times in a row.
• The core sequence in typical microsatellites is smaller, usually only 2–4 bp long.
• Microsatellites are highly polymorphic; they are also widespread and relatively uniformly distributed in the human genome.
• The number of repeats varies quite a bit from one individual to another.
• Thus, they are ideal as markers for both linkage and physical mapping.
• In 1992, Jean Weissenbach et al. produced a linkage map of the entire human genome based on 814 microsatellites containing a C–A dinucleotide repeat.
• The most common way to detect microsatellites is to design PCR primers that are unique to one locus in the genome and that base pair on either side of the repeated portion.
• Therefore, a single pair of PCR primers will work for every individual in the species and produce different-sized products for each of the different-length microsatellites.
• The PCR products are then separated by either gel electrophoresis or capillary electrophoresis. Either way, the investigator can determine the size of the PCR product and thus how many times the dinucleotide ("CA") was repeated for each allele.
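Going from the measured size of a PCR product to the number of CA repeats is a small subtraction and division. The sketch below assumes that the combined length of the primers and any non-repeat sequence between them (here called `flank_len`) is known; the value of 100 bp and the product sizes are invented for illustration, and `repeat_count` is a hypothetical helper.

```python
def repeat_count(product_len, flank_len, unit_len=2):
    """Number of repeat units in a PCR product of the given total length."""
    repeat_bp = product_len - flank_len
    if repeat_bp < 0 or repeat_bp % unit_len != 0:
        raise ValueError("product length is not consistent with whole repeat units")
    return repeat_bp // unit_len


FLANK = 100  # assumed: primers plus non-repeat flanking sequence, in bp

# Two hypothetical alleles of one individual, measured by electrophoresis.
for size in (140, 152):
    print(size, "bp ->", repeat_count(size, FLANK), "CA repeats")
```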
Whole Genome Shotgun Sequencing
The shotgun-sequencing strategy, first proposed by Craig Venter, Hamilton Smith, and Leroy Hood in 1996, bypasses the mapping stage and goes right to the sequencing stage.
This method was employed by Celera Genomics, a private company that was trying to monopolise the human genome sequence by patenting it. To do this it had to try to beat the publicly funded project, and it therefore adopted whole genome shotgun sequencing.
1. BAC library: A BAC library of random fragments of the human genome is generated using restriction digestion followed by cloning.
The sequencing starts with a set of BAC clones containing very large DNA inserts, averaging about 150 kb. The insert in each BAC is sequenced on both ends using an automated sequencer that can usually read about 500 bases at a time, so 500 bases at each end of the clone will be determined.
Assuming that 300,000 clones of human DNA are sequenced this way, that would generate 300 million bases of sequence, or about 10% of the total human genome. These 500-base sequences serve as an identity tag, called a sequence-tagged connector (STC), for each BAC clone. This is the origin of the term connector: each clone should be "connected" via its STCs to about 30 other clones.
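The figures in this step follow from simple arithmetic, reproduced below as a check. The genome size of 3.2 billion bp is the value given in the milestones list; the variable names are only illustrative.

```python
clones = 300_000              # BAC clones sequenced from both ends
read_len = 500                # bases read at each end
insert_size = 150_000         # average BAC insert, in bp
genome_size = 3_200_000_000   # human genome, in bp

stc_bases = clones * 2 * read_len           # total STC sequence
fraction = stc_bases / genome_size          # share of the genome covered by STCs
stc_spacing = genome_size / (clones * 2)    # average distance between STCs
stcs_per_insert = insert_size / stc_spacing # STCs expected inside one BAC

print(f"STC sequence: {stc_bases:,} bases ({fraction:.1%} of the genome)")
print(f"One STC about every {stc_spacing:,.0f} bp")
print(f"About {stcs_per_insert:.0f} STCs fall within one 150-kb BAC")
```

With one STC roughly every 5 kb, a 150-kb insert is expected to contain on the order of 30 STCs from other clones, which is where the figure of about 30 connected clones comes from.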
Steps:
1. BAC library
2. Fingerprinting
3. Plasmid library
4. BAC walking
5. Powerful computer program
Fig: Whole Genome Shotgun Sequencing Method
2. Fingerprinting: This step is to fingerprint each clone by digesting it with a restriction enzyme. This serves two important purposes. First, it tells the insert size (the sum of the sizes of all the fragments generated by the restriction enzyme). Second, it allows one to eliminate aberrant clones whose fragmentation patterns do not fit the consensus of the overlapping clones. Note that this clone fingerprinting is not the same as mapping; it is just a simple check before sequencing begins.
3. Plasmid library: A seed BAC is selected for sequencing. The seed BAC is subcloned into a plasmid vector by subdividing the BAC into smaller clones of only about 2 kb. A plasmid library is prepared by transforming E. coli strains with the plasmids.
4. BAC walking: Three thousand of the plasmid clones are sequenced, and the sequences are ordered by their overlaps, producing the sequence of the whole 150-kb BAC. This whole BAC sequence allows the identification of the 30 or so other BACs that overlap with the seed: they are the ones with STCs that occur somewhere in the seed BAC. The next step is to compare these overlapping BACs by fingerprinting to find those with minimal overlaps, and sequence them. This strategy, called BAC walking, would in principle allow one laboratory to sequence the whole human genome.
5. Powerful computer program: In practice there is not that much time, so Venter and colleagues modified the procedure by sequencing BACs at random until they had about 35 billion bp of sequence. In principle that should cover the human genome ten times over, giving a high degree of coverage and accuracy. Then they fed all the sequence into a computer with a powerful program that found areas of overlap between clones and fit their sequences together, building the sequence of the whole genome.
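The "ten times over" figure is the average coverage, total sequenced bases divided by genome size. The sketch below also estimates how much of the genome would be missed by chance, using the idealized assumption (not stated in the slides) that reads land uniformly at random, so that the chance a given base is never sequenced is e^(-c) for coverage c. The function names are hypothetical, and real genomes deviate from this model because of repeats and cloning bias.

```python
import math


def coverage(total_bases, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return total_bases / genome_size


def expected_uncovered_fraction(c):
    """Fraction of bases expected to be missed at coverage c (random-read model)."""
    return math.exp(-c)


genome_size = 3_200_000_000
c = coverage(35_000_000_000, genome_size)
missed = expected_uncovered_fraction(c) * genome_size

print(f"coverage: {c:.1f}x")
print(f"bases expected to be missed: about {missed:,.0f}")
```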
Finishing
• Process of assembling raw sequence reads into accurate contiguous sequence
– Required to achieve an error rate of no more than 1 in 10,000 bases
• Manual process
– Look at sequence reads at positions where programs can't tell which base is the correct one
– Fill gaps
– Ensure adequate coverage
[Figure: assembled reads with a gap and a single-stranded region]
• To fill gaps in the sequence, design primers and sequence from the primers
• To ensure adequate coverage, find regions where coverage is not sufficient and use specific primers for those areas
[Figure: primers placed on either side of a gap]
Verification 
• Region verified for the following: 
– Coverage 
– Sequence quality 
– Contiguity 
• Determine restriction-enzyme cleavage 
sites 
– Generate restriction map of sequenced region 
– Must agree with fingerprint generated of clone 
during mapping step
New technologies
• A high-priority goal at the beginning of the Human Genome Project was to develop new mapping and sequencing technologies
• To date, no major breakthrough technology has been developed
– Possible exception: whole-genome shotgun sequencing applied to large genomes (Celera)
Automated sequencers
• Perhaps the most important contribution to large-scale sequencing was the development of automated sequencers
– Most use the Sanger sequencing method
– Fluorescently labeled reaction products
– Capillary electrophoresis for separation
Automated sequencers: ABI 3700 and MegaBACE
[Figure: ABI 3700 components: 96-well plate, robotic arm and syringe, 96 glass capillaries, load bar]
Automatic gel reading
[Figure: computer image of sequence read by automated sequencer]
Sequence assembly readout
[Figure: consensus building]
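Consensus building can be pictured as a vote at each position of the aligned reads. The sketch below is a minimal illustration: the reads are invented, they are assumed to be already aligned and padded with "-" where a read does not cover a position, and the simple majority vote ignores the base-quality scores that real assembly software takes into account.

```python
from collections import Counter


def consensus(aligned_reads):
    """Column-wise majority vote over aligned reads ('-' = position not covered)."""
    length = max(len(read) for read in aligned_reads)
    result = []
    for i in range(length):
        column = [read[i] for read in aligned_reads
                  if i < len(read) and read[i] != "-"]
        # 'N' marks a position that no read covers; ties go to the base seen first.
        result.append(Counter(column).most_common(1)[0][0] if column else "N")
    return "".join(result)


reads = [
    "ATGGCGT------",
    "---GCCTACC---",  # contains one sequencing error (C instead of G)
    "---GCGTACCTGA",
    "------TACCTGA",
]
print(consensus(reads))
```

The single miscalled base in the second read is outvoted by the two reads that agree at that position.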
Genome sequencing achievement in Bangladesh
• Genome sequencing of Macrophomina phaseolina
• Genome sequencing of Jute
Genome of destructive pathogen Macrophomina phaseolina unraveled by Maqsudul Alam & BJRI Associates
• Macrophomina phaseolina is a soil- and seed-borne fungus.
• It can infect more than 500 cultivated and wild plant species.
• It causes seedling blight, dry root rot, wilt, leaf blight, stem blight, and root and stem rot in different cultivated and wild plant species.
• The fungus can remain viable for more than 4 years in soil and crop residues.
• The Basic and Applied Research on Jute (BARJ) project team, led by Prof Maqsudul Alam, took up this unique challenge and, for the first time in the world, decoded the genome of this most dangerous fungus.
• They have identified the proteins and their networks that the fungus uses to attack and kill the plant. This fundamental knowledge will help to defend and fight against this fungus and to promote the development of resistant varieties of jute as well as other crops.
Genome sequencing of Tossa jute (Corchorus olitorius)
• Jute was called the Golden Fiber of Bangladesh, as Bangladesh was the largest jute-producing country in the world.
• The jute genome has been sequenced by Bangladeshi scientists.
• Bangladesh was the first country in the world to decode the jute genome.
• The research team was led by Professor Maqsudul Alam from the University of Hawaii, who also successfully led the genome sequencing of papaya in the USA and rubber in Malaysia.
• The team also included
– a group of Bangladeshi researchers from Dhaka University's Biochemistry and Biotechnology departments,
– Bangladesh Jute Research Institute (BJRI),
– software firm Data Soft, in collaboration with the Centre for Chemical Biology, University of Science, Malaysia, and
– University of Hawaii.
This was done under the Basic & Applied Research on Jute Project (BARJ).
Fig: Internationally famed geneticist Maqsudul Alam and other scientists of the jute genome project
Anticipated Benefits of Genome Research
Molecular Medicine 
• improve diagnosis of disease 
• detect genetic predispositions to disease 
• create drugs based on molecular information 
Microbial Genomics 
• rapidly detect and treat pathogens (disease-causing microbes) in 
clinical practice 
• develop new energy sources (biofuels) 
• monitor environments to detect pollutants 
• clean up toxic waste safely and efficiently. 
Risk Assessment 
• evaluate the health risks faced by individuals who may be exposed to 
radiation and to cancer-causing chemicals and toxins 
Bio-archaeology, Anthropology, Evolution 
• study evolution through mutations in lineages 
• study migration of different population groups based on maternal inheritance
• compare breakpoints in the evolution of mutations with ages of populations and historical events.
Agriculture, Livestock Breeding, and Bio-processing 
• grow disease-, insect-, and drought-resistant crops 
• breed healthier, more productive, disease-resistant farm animals 
• grow more nutritious produce 
• develop biopesticides 
• incorporate edible vaccines into food products
DNA Identification (Forensics) 
• identify potential suspects whose DNA may match evidence left at 
crime scenes 
• identify crime victims 
• establish paternity and other family relationships 
• identify endangered and protected species as an aid to wildlife officials 
• detect bacteria and other organisms that may pollute air, water, soil, 
and food 
• match organ donors with recipients in transplant programs
References 
• Weaver RF 2005. Molecular Biology. McGraw-Hill 
International edition, NY. 
• Gardner EJ, MJ Simmons and DP Snustad 1991. 
Principles of Genetics. John Wiley and Sons Inc, 
NY. 
• Gupta, P.K. 2007. Genetics. Rastogi Publications, 
Meerut. 
• Allison LA, 2007. Fundamental Molecular Biology, 
Blackwell publishing, USA 
• Internet
Thank 
You

More Related Content

What's hot

Nucleic Acid Sequence databases
Nucleic Acid Sequence databasesNucleic Acid Sequence databases
Nucleic Acid Sequence databasesPranavathiyani G
 
Antisense rna technology
Antisense rna technologyAntisense rna technology
Antisense rna technologySaurav Das
 
DNA Sequencing- Sanger's Method
DNA Sequencing- Sanger's MethodDNA Sequencing- Sanger's Method
DNA Sequencing- Sanger's MethodHarsha Joseph
 
PHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSPHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSUsman Arshad
 
NEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCINGNEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCINGBilal Nizami
 
Chromosome walking
Chromosome walkingChromosome walking
Chromosome walkingAleena Khan
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignmentRamya S
 
SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)talhakhat
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomicsajay301
 
sequence of file formats in bioinformatics
sequence of file formats in bioinformaticssequence of file formats in bioinformatics
sequence of file formats in bioinformaticsnadeem akhter
 
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCING
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCINGDNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCING
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCINGPuneet Kulyana
 
Linker, Adaptor, Homopolymeric Tailing & Terminal Transferase
Linker, Adaptor, Homopolymeric Tailing & Terminal TransferaseLinker, Adaptor, Homopolymeric Tailing & Terminal Transferase
Linker, Adaptor, Homopolymeric Tailing & Terminal TransferaseUtsa Roy
 
AFLP, RFLP & RAPD
AFLP, RFLP & RAPDAFLP, RFLP & RAPD
AFLP, RFLP & RAPDDOCTOR WHO
 

What's hot (20)

Nucleic Acid Sequence databases
Nucleic Acid Sequence databasesNucleic Acid Sequence databases
Nucleic Acid Sequence databases
 
Pyrosequencing
PyrosequencingPyrosequencing
Pyrosequencing
 
Antisense rna technology
Antisense rna technologyAntisense rna technology
Antisense rna technology
 
DNA Sequencing- Sanger's Method
DNA Sequencing- Sanger's MethodDNA Sequencing- Sanger's Method
DNA Sequencing- Sanger's Method
 
PHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICSPHYSICAL MAPPING STRATEGIES IN GENOMICS
PHYSICAL MAPPING STRATEGIES IN GENOMICS
 
NEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCINGNEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCING
 
Whole genome sequencing
Whole genome sequencingWhole genome sequencing
Whole genome sequencing
 
Chromosome walking
Chromosome walkingChromosome walking
Chromosome walking
 
Pyrosequencing
PyrosequencingPyrosequencing
Pyrosequencing
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)SAGE (Serial analysis of Gene Expression)
SAGE (Serial analysis of Gene Expression)
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
sequence of file formats in bioinformatics
sequence of file formats in bioinformaticssequence of file formats in bioinformatics
sequence of file formats in bioinformatics
 
Express sequence tags
Express sequence tagsExpress sequence tags
Express sequence tags
 
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCING
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCINGDNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCING
DNA SEQUENCING METHODS AND STRATEGIES FOR GENOME SEQUENCING
 
YEAST TWO HYBRID SYSTEM
 YEAST TWO HYBRID SYSTEM YEAST TWO HYBRID SYSTEM
YEAST TWO HYBRID SYSTEM
 
MODIFYING ENZYMES
MODIFYING ENZYMESMODIFYING ENZYMES
MODIFYING ENZYMES
 
Linker, Adaptor, Homopolymeric Tailing & Terminal Transferase
Linker, Adaptor, Homopolymeric Tailing & Terminal TransferaseLinker, Adaptor, Homopolymeric Tailing & Terminal Transferase
Linker, Adaptor, Homopolymeric Tailing & Terminal Transferase
 
Rasmol
RasmolRasmol
Rasmol
 
AFLP, RFLP & RAPD
AFLP, RFLP & RAPDAFLP, RFLP & RAPD
AFLP, RFLP & RAPD
 

Viewers also liked (8)

Whole Genome Analysis
Whole Genome AnalysisWhole Genome Analysis
Whole Genome Analysis
 
Gene concept
Gene conceptGene concept
Gene concept
 
TRANSPOSABLE ELEMENTS
TRANSPOSABLE ELEMENTSTRANSPOSABLE ELEMENTS
TRANSPOSABLE ELEMENTS
 
DNA Sequencing
DNA SequencingDNA Sequencing
DNA Sequencing
 
transposons complete ppt
transposons complete ppttransposons complete ppt
transposons complete ppt
 
Ngs ppt
Ngs pptNgs ppt
Ngs ppt
 
Introduction to next generation sequencing
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencing
 
DNA SEQUENCING METHOD
DNA SEQUENCING METHODDNA SEQUENCING METHOD
DNA SEQUENCING METHOD
 

Genome sequencing

  • 2. At a glance: What is a genome • Types of genomes • What is genomics • How is genomics different from genetics • Types of genomics • Genome sequencing • Milestones in genomic sequencing • Technical foundations of genomics • Steps of genome sequencing • DNA sequencing approaches • Hierarchical shotgun sequencing • Markers used in mapping large genomes • Whole genome shotgun sequencing • New technologies • Genome sequencing achievement in Bangladesh • Benefits of genome research
  • 3. WHAT IS A GENOME? • Genome: one complete set of genetic information (total amount of DNA) from a haploid set of chromosomes of a single cell in eukaryotes, in a single chromosome in bacteria, or in the DNA or RNA of viruses. • The basic set of chromosomes in an organism. • “The whole hereditary information of an organism that is encoded in the DNA.” • In cytogenetics, genome means a single set of chromosomes. • It is denoted by x; its relation to the somatic chromosome number depends on the ploidy level of the organism. • In Drosophila melanogaster (2n = 2x = 8), genome x = 4. • In hexaploid Triticum aestivum (2n = 6x = 42), genome x = 7.
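The two examples follow from dividing the somatic chromosome number (2n) by the ploidy level. A minimal sketch of that arithmetic is given below; the function name basic_chromosome_number is hypothetical and used only for illustration.

```python
def basic_chromosome_number(somatic_count: int, ploidy: int) -> int:
    """Return x, the basic chromosome number, from 2n and the ploidy level."""
    if ploidy <= 0 or somatic_count % ploidy != 0:
        raise ValueError("somatic count must be a positive multiple of ploidy")
    return somatic_count // ploidy


# Values taken from the slide: 2n = 2x = 8 and 2n = 6x = 42.
drosophila_x = basic_chromosome_number(8, 2)
wheat_x = basic_chromosome_number(42, 6)
print("Drosophila melanogaster x =", drosophila_x)
print("Triticum aestivum x =", wheat_x)
```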
  • 4. The genome is found inside every cell; in cells that have a nucleus, the genome is situated inside the nucleus. Specifically, it is all the DNA in an organism. • The term genome was introduced by H. Winkler in 1920 to denote the complete set of chromosomal and extrachromosomal genes present in an organism, including a virus.
  • 5. How many types of genomes are there: 1. Prokaryotic genomes 2. Eukaryotic genomes • Nuclear genomes • Mitochondrial genomes • Chloroplast genomes. If not specified, “genome” usually refers to the nuclear genome. WHAT IS GENOMICS? • Genomics is the study of the structure and function of whole genomes. • Genomics is the comprehensive study of whole sets of genes and their interactions rather than single genes or proteins. • According to T.H. Roderick, genomics is the mapping and sequencing of genomes to analyze their structure and organization.
  • 6. Origin of terminology • The term genome was first used by the German botanist Hans Winkler in 1920. • It originally meant the collection of genes in a haploid set of chromosomes. • It now encompasses all the DNA in a cell. • Genomics is the subdiscipline of molecular genetics devoted to the study of genomes. • The field includes studies of intragenomic phenomena such as heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome.
  • 7. The sequence information of the genome will show: • the position of every gene along the chromosome, • the regulatory regions that flank each gene, and • the coding sequence that determines the protein produced by each gene. How is genomics different from genetics? Genetics is the study of inheritance; genomics is the study of genomes. – Genetics looks at single genes, one at a time, like a picture or snapshot. – Genomics looks at the big picture and examines all the genes as an entire system.
  • 8. Types of Genomics 1. Structural: It deals with the determination of the complete sequence of genomes and of the gene map. This has progressed in steps as follows: (i) construction of high-resolution genetic and physical maps, (ii) sequencing of the genome, and (iii) determination of the complete set of proteins in an organism. 2. Functional: It refers to the study of the functioning of genes, their regulation and their products (metabolic pathways), i.e., the gene expression patterns in an organism. 3. Comparative: It compares genes from different genomes to elucidate functional and evolutionary relationships.
  • 9. Genome Sequencing Genome sequencing is the technique that allows researchers to read the genetic information found in the DNA of anything from bacteria to plants to animals. Sequencing involves determining the order of the bases, the nucleotide subunits adenine (A), guanine (G), cytosine (C) and thymine (T), found in DNA. Genome sequencing is figuring out the order of DNA nucleotides. Challenges of genome sequencing • Data are produced in the form of short reads, which have to be assembled correctly into large contigs and chromosomes. • Short reads can contain low-quality bases and vector/adaptor contamination. • Several genome assemblers are available, but their performance has to be compared to find the best one for a given data set.
  • 10. Milestones in Genomic Sequencing 1977: Fred Sanger; φX174 bacteriophage (first sequenced genome); 5,375 bp. Amino acid sequence of phage proteins. Overlapping genes only in viruses. Fig: The genetic map of phage φX174 (overlapping reading frames)
  • 11. 1995: Craig Venter & Hamilton Smith; Haemophilus influenzae (1,830,137 bp; first free-living organism); Mycoplasma genitalium (smallest free-living organism; 580,000 bp; 470 genes). 1996: Saccharomyces cerevisiae (first eukaryote); 12,068,000 bp. 1997: Escherichia coli; 4,639,221 bp; genetically more important. 1999: Human chromosome 22; 53,000,000 bp. 2000: Drosophila melanogaster; 180,000,000 bp. 2001: Human, working draft; 3,200,000,000 bp. 2002: Plasmodium falciparum; 23,000,000 bp. Anopheles gambiae; 278,000,000 bp. Mus musculus; 2,500,000,000 bp. 2003: Human, finished sequence; 3,200,000,000 bp. 2005: Oryza sativa (first cereal grain); 489,000,000 bp. 2006: Populus trichocarpa (first tree); 485,000,000 bp.
  • 12. Technical foundations of genomics • Molecular biology: Almost all of the underlying techniques of genomics originated with recombinant-DNA technology. • DNA sequencing: In particular, almost all DNA sequencing is still performed using the approach pioneered by Sanger. • Library construction: Also essential to high-throughput sequencing is the ability to generate libraries of genomic clones and then cut portions of these clones and introduce them into other vectors. • PCR amplification: The use of the polymerase chain reaction (PCR) to amplify DNA, developed in the 1980s, is another technique at the core of genomics approaches. • Hybridization techniques: Finally, hybridization of one nucleic acid to another is used to detect and quantitate DNA and RNA (for example, Southern blotting). This method remains the basis for genomics techniques such as microarrays.
  • 13. Steps of genome sequencing • Break the genome into smaller fragments • Sequence those smaller pieces • Piece the sequences of the short fragments together. DNA sequencing approaches: two different methods are used. 1. Hierarchical shotgun sequencing: useful for sequencing genomes of higher vertebrates that contain repetitive sequences. 2. Whole genome shotgun sequencing: useful for smaller genomes.
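The third step, piecing short fragments back together, can be illustrated with a toy greedy overlap merger. This is only a sketch under strong assumptions (error-free reads, no repeats, all reads from the same strand); the function names and the example sequence are hypothetical, and real assemblers use far more robust methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy "genome" broken into three overlapping fragments (hypothetical data).
genome = "ATGCGTACGTTAGCCTAGGA"
fragments = [genome[12:20], genome[0:10], genome[6:16]]
contigs = greedy_assemble(fragments)
print(contigs)
```

Repetitive sequence breaks this simple approach, which is one reason the map-based strategy is preferred for repeat-rich genomes.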
  • 14. Hierarchical Shotgun Sequencing • The method preferred by the Human Genome Project is the hierarchical shotgun sequencing method. • Also known as – the clone-by-clone strategy – the map-based method – map first, sequence later – top-down sequencing. The Human Genome Project adopted a map-based strategy: – Start with a well-defined physical map – Produce the shortest tiling path for large-insert clones – Assemble the sequence for each clone – Then assemble the entire sequence, based on the physical map
  • 15. In the Clone-by-Clone Strategy 1) Markers for regions of the genome are identified. 2) The genome is split into large fragments (50-200 kb), each containing a known marker, using restriction enzymes. 3) These fragments are cloned in bacteria (E. coli) using BACs (Bacterial Artificial Chromosomes), where they are replicated and stored. 4) The BAC inserts are isolated and the whole genome is mapped by finding markers regularly spaced along each chromosome to determine the order of the clones. 5) The fragments contained in these clones have different ends, so with enough coverage a scaffold of overlapping BAC contigs can be found. The set of BAC contigs that covers the entire genomic area of interest makes up the tiling path (the “Golden Path”). 6) Each BAC fragment in the tiling path is fragmented randomly into smaller pieces, and these pieces are individually sequenced on both strands using automated Sanger sequencing. 7) These sequences are aligned so that identical sequences overlap. Assembly of the genome is done on the basis of prior knowledge of the markers used to localize sequenced fragments to their genomic location. A computer stitches the sequences together using the markers as a reference guide.
  • 16. Fig: Hierarchical shotgun sequencing. In this approach, every part of the genome is sequenced roughly 4-5 times to ensure that no part of the genome is left out.
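Why redundancy of this order matters can be sketched with the idealized Poisson (Lander-Waterman) model, which is not part of the slides and is brought in here only as an assumption: if reads land uniformly at random, the expected fraction of bases not covered at average coverage c is exp(-c).

```python
import math


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of the genome with zero reads under a Poisson model."""
    return math.exp(-coverage)


# Compare a few coverage levels, including the 4-5x mentioned on the slide.
for c in (1, 4, 5, 10):
    print(f"{c}x coverage: about {uncovered_fraction(c):.4%} of bases uncovered")
```

Real libraries are not perfectly random, so the actual number of gaps is usually higher than this estimate.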
  • 17. Each 150,000 bp fragment is inserted into a BAC (bacterial artificial chromosome). A BAC can replicate inside a bacterial cell. A set of BACs containing an entire human genome is called a BAC library. The clone-by-clone strategy was used for S. cerevisiae (yeast), C. elegans (nematode), Arabidopsis thaliana (mustard weed), Oryza sativa, Drosophila melanogaster, Homo sapiens (human), etc.
  • 18. The Clone-by-Clone Strategy: Markers used in mapping large genomes. Different types of markers are used in mapping large genomes, such as A. Restriction Fragment Length Polymorphisms (RFLP) B. Variable Number of Tandem Repeats (VNTRs) C. Sequence Tagged Sites (STS) D. Microsatellites, etc.
  • 19. A. Restriction Fragment Length Polymorphisms (RFLP) Polymorphism means that a genetic locus has different forms, or alleles. Cutting the DNA from any two individuals with a restriction enzyme may yield fragments of different lengths; these differences are called restriction fragment length polymorphisms (RFLPs), usually pronounced “rifflips”. • The pattern of RFLPs generated depends mainly on: – 1) the differences in the DNA of the selected strains or species, – 2) the restriction enzymes used, – 3) the DNA probe employed for Southern hybridization. Steps: a. Consider the restriction enzyme HindIII, which recognizes the sequence AAGCTT. b. Of two individuals, one has three HindIII sites in a region of a chromosome, so cutting the DNA with HindIII yields two fragments, 2 and 4 kb long.
  • 20. Figure: Detecting an RFLP. c. Another individual may lack the middle site but have the other two, so cutting the DNA with HindIII yields one fragment 6 kb long. These fragment-length differences are called RFLPs.
  • 21. d. These restriction fragments of different lengths between the genotypes can be detected on Southern blots with a suitable probe. An RFLP is detected as a difference in the position of a band in the gel lanes from different species or strains. Each such band is regarded as a single RFLP locus, so any differences among the DNA of individuals are easy to see. e. The RFLP is used as a marker in chromosomal mapping. Limitations • Requires a relatively large amount of highly pure DNA. • Identifying a suitable combination of marker and restriction enzyme is laborious and expensive. • Time consuming. • Requires expertise in autoradiography because radioactively labeled probes are used.
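The HindIII example in steps a to c can be reproduced with a small in-silico digest. This is a sketch on made-up sequences: the filler DNA, the mutated middle site (AAGCCT) and the function name digest are all assumptions for illustration. HindIII is modeled as cutting after the first A of AAGCTT.

```python
SITE = "AAGCTT"   # HindIII recognition sequence
CUT_OFFSET = 1    # cut between the first A and AGCTT


def digest(seq: str, site: str = SITE, cut_offset: int = CUT_OFFSET):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Individual 1: three sites spaced to give 2 kb and 4 kb fragments.
allele_1 = SITE + "C" * 1994 + SITE + "C" * 3994 + SITE
# Individual 2: the middle site is mutated, so it is no longer cut.
allele_2 = SITE + "C" * 1994 + "AAGCCT" + "C" * 3994 + SITE

# Ignore the tiny end pieces outside the outermost sites.
inner_1 = digest(allele_1)[1:-1]
inner_2 = digest(allele_2)[1:-1]
print("individual 1 fragments (bp):", inner_1)
print("individual 2 fragments (bp):", inner_2)
```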
  • 22. B. Variable Number of Tandem Repeats (VNTRs) The greater the degree of polymorphism of a marker, the more useful it is for mapping. With RFLPs that have only a few alleles, mapping becomes very tedious; in this case variable number tandem repeats (VNTRs) are more useful. Tandem repeats occur in DNA when a pattern of one or more nucleotides is repeated and the repetitions are directly adjacent to each other. An example would be: ATTCGCCAATC ATTCGCCAATC ATTCGCCAATC, in which the sequence ATTCGCCAATC is repeated three times. • A variable number tandem repeat (VNTR) is a location in a genome where a short nucleotide sequence is organized as a tandem repeat. • The repeated unit is relatively long, about 10-100 base pairs. • The full genetic profiles of individuals reveal many differences. • Most human genes are the same from person to person, but VNTRs tend to differ among different people.
  • 23. • While the repeated sequences themselves are usually the same from person to person, the number of times they are repeated tends to vary. • VNTRs are highly polymorphic. They can be isolated from an individual’s DNA and are therefore relatively easy to map. • However, VNTRs have a disadvantage as genetic markers: they tend to bunch together at the ends of chromosomes, leaving the interiors of the chromosomes relatively devoid of markers.
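Because individuals differ in the number of copies rather than in the repeat unit itself, a VNTR allele can be described by its copy number. The sketch below counts back-to-back copies of the example unit ATTCGCCAATC; the flanking bases and the function name are hypothetical.

```python
def count_tandem_repeats(seq: str, unit: str) -> int:
    """Largest number of directly adjacent copies of unit found in seq."""
    best = 0
    start = seq.find(unit)
    while start != -1:
        copies = 0
        # Walk forward one unit at a time while the copies stay adjacent.
        while seq.startswith(unit, start + copies * len(unit)):
            copies += 1
        best = max(best, copies)
        start = seq.find(unit, start + 1)
    return best


UNIT = "ATTCGCCAATC"  # repeat unit from the slide's example
person_a = "GGGG" + UNIT * 3 + "TTTT"
person_b = "GGGG" + UNIT * 5 + "TTTT"
print(count_tandem_repeats(person_a, UNIT), count_tandem_repeats(person_b, UNIT))
```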
  • 24. C. Sequence Tagged Sites (STS) Another kind of genetic marker, which is very useful to genome mappers, is the sequence-tagged site (STS). • STSs are short sequences, about 60–1000 bp long, that can be easily detected by PCR using specific primers. • The sequences of small areas of this DNA must be known, so that one can design primers that will hybridize to these regions and allow PCR to produce double-stranded fragments of predictable lengths. If a product of the proper size appears, then the DNA has the STS of interest. • One great advantage of STSs as a mapping tool is that no DNA needs to be cloned and examined. • Instead, the sequences of the primers used to generate an STS are published, and then anyone in the world can order those same primers and find the same STS in an experiment that takes just a few hours.
  • 25. In this example, two PCR primers (red) spaced 250 bp apart have been used. Several cycles of PCR generate many double-stranded PCR products that are precisely 250 bp long. Electrophoresis of this product allows one to measure its size exactly and confirm that it is the correct one. Figure : Sequence-tagged sites
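The 250 bp example can be mimicked with a simple in-silico PCR check. This is a sketch only: the primer sequences, the template and the function names are invented for illustration, and perfect primer matching is assumed.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def pcr_product_size(template: str, forward: str, reverse: str):
    """Size of the product bounded by the two primers, or None if absent."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Hypothetical 20-nt primers placed so that the product is 250 bp long.
FWD = "ATGCGTACCTGAAGTCCATG"
REV = "TTGACCGATAGGCTACGTCA"
template = "G" * 30 + FWD + "C" * 210 + reverse_complement(REV) + "G" * 30

size = pcr_product_size(template, FWD, REV)
print("STS present" if size == 250 else "STS absent", "- product size:", size)
```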
  • 26. Making a physical map using Sequence Tagged Sites (STS) 1. Geneticists interested in physically mapping or sequencing a given region of a genome aim to assemble a set of clones called a contig, which contains contiguous (actually overlapping) DNAs spanning long distances. 2. It is essential to have vectors like BACs and YACs that hold big chunks of DNA. Assuming we have a BAC library of the human genome, we need some way to identify the clones that contain the region we want to map. 3. A reliable method is to look for STSs in the BACs. It is best to screen the BAC library for at least two STSs, spaced hundreds of kilobases apart, so that BACs spanning a long distance are selected. 4. After we have found a number of positive BACs, we begin mapping by screening them for several additional STSs, so we can line them up in an overlapping fashion as shown in the following figure. This set of overlapping BACs is our new contig. We can now begin finer mapping, and even sequencing, of the contig.
  • 27. Fig: Mapping with STSs. At top left, several representative BACs are shown, with different symbols representing different STSs placed at specific intervals. In step (a) of the mapping procedure, screen for two or more widely spaced STSs; in this case, screen for STS1 and STS4. All those BACs with either STS1 or STS4 are shown at top right. The identified STSs are shown in color. In step (b), each of these positive BACs is further screened for the presence of STS2, STS3, and STS5. The colored symbols on the BACs at bottom right denote the STSs detected in each BAC. In step (c), align the STSs in each BAC to form the contig. Measuring the lengths of the BACs by pulsed-field gel electrophoresis helps to pin down the spacing between pairs of BACs.
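Step (c), lining up the BACs by their shared STSs, can be sketched as a small search problem: find the STS orders in which every BAC carries an uninterrupted run of markers. The BAC names and STS contents below are hypothetical, and the brute-force search over permutations is only practical for a handful of markers.

```python
from itertools import permutations


def consistent_sts_orders(bacs):
    """All STS orders in which each BAC's STSs form one contiguous run."""
    markers = sorted(set().union(*bacs.values()))
    orders = []
    for order in permutations(markers):
        position = {m: i for i, m in enumerate(order)}
        if all(
            max(position[m] for m in sts) - min(position[m] for m in sts)
            == len(sts) - 1
            for sts in bacs.values()
        ):
            orders.append(order)
    return orders


def order_bacs(bacs, sts_order):
    """Sort BAC names by the leftmost STS they contain in the given order."""
    position = {m: i for i, m in enumerate(sts_order)}
    return sorted(bacs, key=lambda name: min(position[m] for m in bacs[name]))


# Hypothetical screening results: which STSs were detected in which BAC.
bacs = {
    "BAC_C": {"STS3", "STS4"},
    "BAC_A": {"STS1", "STS2"},
    "BAC_D": {"STS4", "STS5"},
    "BAC_B": {"STS2", "STS3"},
}
orders = consistent_sts_orders(bacs)
contig = order_bacs(bacs, ("STS1", "STS2", "STS3", "STS4", "STS5"))
print(orders)
print(contig)
```

The map is recovered only up to orientation, so the reversed marker order is equally consistent with the data.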
  • 28. D. Microsatellites STSs are very useful in physical mapping or locating specific sequences in the genome. But sometimes it is not possible to use them for genetic mapping. Fortunately, geneticists have discovered a class of STSs called microsatellites. GCTTGGTGTGATGTAGAAGGCGCCAATGCATCTCGACGTAT GCGTATACGGGTTACCCCCTTTGCAATCAGTGCACACACAC ACACACACACACACACACACACACACACACAGTGCCAAGCA AAAATAACGCCAAGCAGAACGAAGACGTTCTCGAGAACACC • Microsatellites are similar to minisatellites in that they consist of a core sequence repeated over and over many times in a row. • The core sequence in typical microsatellites is smaller, usually only 2–4 bp long. • Microsatellites are highly polymorphic; they are also widespread and relatively uniformly distributed in the human genome. • The number of repeats varies quite a bit from one individual to another. • Thus, they are ideal as markers for both linkage and physical mapping.
  • 29. • In 1992, Jean Weissenbach et al. produced a linkage map of the entire human genome based on 814 microsatellites containing a C–A dinucleotide repeat. • The most common way to detect microsatellites is to design PCR primers that are unique to one locus in the genome and that base-pair on either side of the repeated portion. • Therefore, a single pair of PCR primers will work for every individual in the species and produce different-sized products for each of the different-length microsatellites. • The PCR products are then separated by gel electrophoresis. The investigator can then determine the size of the PCR product and thus how many times the dinucleotide ("CA") was repeated for each allele.
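The last step, converting a PCR product size into a repeat count, is simple arithmetic once the length of the non-repeat flanking sequence (including the primers) is known. The sketch below assumes a hypothetical 100 bp of flanking sequence and two made-up product sizes.

```python
def ca_repeat_count(product_bp: int, flanking_bp: int, unit_len: int = 2) -> int:
    """Number of repeat units implied by a PCR product of the given size."""
    repeat_bp = product_bp - flanking_bp
    if repeat_bp < 0 or repeat_bp % unit_len != 0:
        raise ValueError("product size is inconsistent with the flanking length")
    return repeat_bp // unit_len


FLANKING_BP = 100  # assumed: primers plus unique sequence around the repeat
allele_sizes = [130, 142]  # hypothetical band sizes read from a gel
repeats = [ca_repeat_count(size, FLANKING_BP) for size in allele_sizes]
print(dict(zip(allele_sizes, repeats)))
```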
  • 30. Whole Genome Shotgun Sequencing The shotgun-sequencing strategy, first proposed by Craig Venter, Hamilton Smith, and Leroy Hood in 1996, bypasses the mapping stage and goes right to the sequencing stage. This method was employed by Celera Genomics, a private entity that was trying to monopolise the human genome sequence by patenting it. To do this, it had to beat the publicly funded project, and it therefore adopted whole genome shotgun sequencing. 1. BAC library: A BAC library of random fragments of the human genome is generated by restriction digestion followed by cloning. The sequencing starts with a set of BAC clones containing very large DNA inserts, averaging about 150 kb. The insert in each BAC is sequenced on both ends using an automated sequencer that can usually read about 500 bases at a time, so 500 bases at each end of the clone will be determined. Assuming that 300,000 clones of human DNA are sequenced this way, that would generate 300 million bases of sequence, or about 10% of the total human genome. These 500-base sequences serve as an identity tag, called a sequence-tagged connector (STC), for each BAC clone. This is the origin of the term connector: each clone should be “connected” via its STCs to about 30 other clones.
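The figures quoted for the STC step can be checked with a short calculation. The genome size of 3.2 billion bp is taken from the milestones slide and is an approximation, so the results land near, not exactly on, the rounded values of 10% and 30.

```python
GENOME_BP = 3_200_000_000   # approximate human genome size (milestones slide)
CLONES = 300_000            # BAC clones end-sequenced
READ_BP = 500               # bases read from each end
BAC_INSERT_BP = 150_000     # average BAC insert size

stc_count = CLONES * 2                   # one STC per clone end
stc_bases = stc_count * READ_BP          # total bases of end sequence
genome_fraction = stc_bases / GENOME_BP  # share of the genome sampled
stc_spacing_bp = GENOME_BP / stc_count   # average distance between STCs
stcs_per_bac = BAC_INSERT_BP / stc_spacing_bp  # STCs expected inside one BAC

print(f"end sequence: {stc_bases:,} bp ({genome_fraction:.1%} of the genome)")
print(f"one STC about every {stc_spacing_bp:,.0f} bp")
print(f"about {stcs_per_bac:.0f} STCs fall within a 150-kb BAC")
```

Each STC found inside a BAC points to another clone that overlaps it, which is where the estimate of roughly 30 connected clones comes from.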
  • 31. Steps: 1. BAC library 2. Fingerprinting 3. Plasmid library 4. BAC walking 5. Powerful computer program. Fig: Whole Genome Shotgun Sequencing Method
  • 32. 2. Fingerprinting: Each clone is fingerprinted by digesting it with a restriction enzyme. This serves two important purposes. First, it gives the insert size (the sum of the sizes of all the fragments generated by the restriction enzyme). Second, it allows one to eliminate aberrant clones whose fragmentation patterns do not fit the consensus of the overlapping clones. Note that this clone fingerprinting is not the same as mapping; it is just a simple check before sequencing begins. 3. Plasmid library: A seed BAC is selected for sequencing. The seed BAC is subcloned into a plasmid vector by subdividing it into smaller clones of only about 2 kb, and a plasmid library is prepared by transforming E. coli with the plasmids. Three thousand of the plasmid clones are sequenced, and the sequences are ordered by their overlaps, producing the sequence of the whole 150-kb BAC. 4. BAC walking: The whole seed BAC sequence allows the identification of the 30 or so other BACs that overlap with the seed: they are the ones with STCs that occur somewhere in the seed BAC. These overlapping BACs are compared by fingerprinting to find those with minimal overlaps, which are sequenced next. This strategy, called BAC walking, would in principle allow one laboratory to sequence the whole human genome.
  • 33. 5. Powerful computer program: Because BAC walking alone would take too long, Venter and colleagues modified the procedure by sequencing BACs at random until they had about 35 billion bp of sequence. In principle that should cover the human genome about ten times over, giving a high degree of coverage and accuracy. Then they fed all the sequence into a computer with a powerful program that found areas of overlap between clones and fit their sequences together, building the sequence of the whole genome.
  • 34. Finishing • Process of assembling raw sequence reads into accurate contiguous sequence – Required to achieve 1/10,000 accuracy • Manual process – Look at sequence reads at positions where programs can’t tell which base is the correct one – Fill gaps – Ensure adequate coverage (Figure labels: gap; single-stranded region)
  • 35. Finishing • To fill gaps in sequence, design primers and sequence from primer • To ensure adequate coverage, find regions where there is not sufficient coverage and use specific primers for those areas (Figure labels: gap flanked by two primers)
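The 1/10,000 accuracy target for finished sequence is often expressed on the Phred quality scale, Q = -10 log10(P), where P is the probability that a base is wrong. The Phred scale is not mentioned in the slides and is used here only as a convenient, widely adopted convention; the function names are hypothetical.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Phred-scaled quality score for a given per-base error probability."""
    return -10 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Per-base error probability for a given Phred quality score."""
    return 10 ** (-quality / 10)


finished_q = phred_quality(1 / 10_000)
print(f"1 error in 10,000 bases corresponds to Q{finished_q:.0f}")
print(f"Q20 corresponds to an error probability of {error_probability(20):.2%}")
```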
  • 36. Verification • Region verified for the following: – Coverage – Sequence quality – Contiguity • Determine restriction-enzyme cleavage sites – Generate restriction map of sequenced region – Must agree with fingerprint generated of clone during mapping step
  • 37. New technologies • A high-priority goal at the beginning of the Human Genome Project was to develop new mapping and sequencing technologies • To date, no major breakthrough technology has been developed – Possible exception: whole-genome shotgun sequencing applied to large genomes (Celera). Automated sequencers • Perhaps the most important contribution to large-scale sequencing was the development of automated sequencers – Most use the Sanger sequencing method – Fluorescently labeled reaction products – Capillary electrophoresis for separation
  • 38. Automated sequencers: ABI 3700 96–well plate robotic arm and syringe 96 glass capillaries load bar MegaBACE ABI 3700
  • 39. Automatic gel reading Computer image of sequence read by automated sequencer
  • 40. Sequence assembly readout Consensus building
  • 41. Genome sequencing achievement in Bangladesh • Genome sequencing of Macrophomina phaseolina • Genome sequencing of Jute
  • 42. Genome of destructive pathogen Macrophomina phaseolina unraveled by Maqsudul Alam & BJRI Associates • Macrophomina phaseolina is a soil- and seed-borne fungus. • It can infect more than 500 cultivated and wild plant species. • It causes seedling blight, dry root rot, wilt, leaf blight, stem blight, and root and stem rot of different cultivated and wild plant species. • The fungus can remain viable for more than 4 years in soil and crop. Continue………
  • 43. • The Basic and Applied Research on Jute (BARJ) project team, led by Prof Maqsudul Alam, took up this unique challenge and, for the first time in the world, decoded the genome of this most dangerous fungus. • They have identified the proteins and their networks that the fungus uses to attack and kill the plant. This fundamental knowledge will help in defending against this fungus and in developing resistant varieties of jute as well as other crops.
  • 44. Genome sequencing of Tossa jute (Corchorus olitorius) • Jute was called the Golden Fiber of Bangladesh, as Bangladesh was the largest jute-producing country in the world. • The jute genome has been sequenced by Bangladeshi scientists. Continue………
  • 45. • Bangladesh was the first country in the world to decode the jute genome. • The research team was led by Professor Maqsudul Alam from the University of Hawaii, who also successfully led the genome discovery of papaya in the USA and rubber in Malaysia. • The team also included – a group of Bangladeshi researchers from Dhaka University's Biochemistry and Biotechnology departments – Bangladesh Jute Research Institute (BJRI) – software firm Data Soft, in collaboration with the Centre for Chemical Biology, University of Science, Malaysia, and the University of Hawaii • The work was done under the Basic & Applied Research on Jute Project (BARJ).
  • 46. Fig: Internationally famed geneticist Maqsudul Alam and other scientists of jute genome project
  • 47. Anticipated Benefits of Genome Research Molecular Medicine • improve diagnosis of disease • detect genetic predispositions to disease • create drugs based on molecular information Microbial Genomics • rapidly detect and treat pathogens (disease-causing microbes) in clinical practice • develop new energy sources (biofuels) • monitor environments to detect pollutants • clean up toxic waste safely and efficiently Risk Assessment • evaluate the health risks faced by individuals who may be exposed to radiation and to cancer-causing chemicals and toxins Bio-archaeology, Anthropology, Evolution • study evolution through mutations in lineages • study migration of different population groups based on maternal inheritance Continue………
  • 48. • compare breakpoints in the evolution of mutations with ages of populations and historical events Agriculture, Livestock Breeding, and Bio-processing • grow disease-, insect-, and drought-resistant crops • breed healthier, more productive, disease-resistant farm animals • grow more nutritious produce • develop biopesticides • incorporate edible vaccines into food products DNA Identification (Forensics) • identify potential suspects whose DNA may match evidence left at crime scenes • identify crime victims • establish paternity and other family relationships • identify endangered and protected species as an aid to wildlife officials • detect bacteria and other organisms that may pollute air, water, soil, and food • match organ donors with recipients in transplant programs
  • 49. References • Weaver RF 2005. Molecular Biology. McGraw-Hill International Edition, NY. • Gardner EJ, MJ Simmons and DP Snustad 1991. Principles of Genetics. John Wiley and Sons Inc, NY. • Gupta PK 2007. Genetics. Rastogi Publications, Meerut. • Allison LA 2007. Fundamental Molecular Biology. Blackwell Publishing, USA. • Internet

Editor's Notes

  1. Although automated editing programs like PHRAP have greatly increased the efficiency of sequencing, there remains a need for human judgment and intervention. This occurs during the finishing step, which is defined as the process of assembling the raw sequence reads into an accurate contiguous genomic sequence. For genomic sequencing with an accuracy of one error in 10,000 bases, a manual finishing step is essential. The finisher looks at positions where the automated editing program can’t tell which base is the correct one. By examining the various raw sequence reads, the finisher then makes a judgment call as to the correct base or sends the region back for additional sequencing. Similarly, when there are gaps in the sequence or insufficient coverage, the finisher will flag the region and send it back to the production sequencing team for more work.
  2. Gaps are usually filled by designing custom sequencing primers that are complementary to the regions adjacent to the gap. The sequencing reaction is then performed using these custom primers on a clone containing the problematic region of DNA as the template. A similar strategy is used for regions with insufficient coverage: Custom primers are made and then used for directed sequencing of a particular region.
  3. The final step of finishing is to verify the sequence. All regions are checked for the extent of coverage (i.e., how many times the same region has been sequenced, and in what direction), for sequence quality (i.e., whether ambiguity has been removed for all positions in the sequence), and for contiguity (i.e., whether the sequence forms one uninterrupted stretch of DNA). A good test of sequence quality that is frequently used in the finishing stage is to determine the sites where restriction enzymes would cut in the newly acquired sequence. A restriction map is generated from the sequence and then compared with the known fingerprint generated from the clone during the mapping step. If both show the same pattern, then it is considered to be an indication that the sequence is of high quality and relatively error free.
  4. The ABI 3700, made by Applied Biosystems, is probably the most widely used instrument for large-scale sequencing. It has 96 capillaries that are fed by robotic loading from two 384-well microtiter plates. It makes a sequence run every two to three hours and can read on average 600–700 bases per run. Celera, the company that produced a rough draft of the human genome in three years, used 200 of these machines running 24/7 to do so.
  5. Both automated sequencers detect fluorescently labeled DNA strands as they pass through the capillaries. The MegaBACE sequencer uses the confocal imaging system shown in the top image. The readout is given as peaks of fluorescence, as shown in the lower image.
  6. The readout generated by the PHRAP program shows overlapping sequences lined up to form one contiguous sequence.
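As a rough check on the figures in note 4, the raw output of a fleet of ABI 3700 machines can be estimated with a few lines of Python. This is back-of-the-envelope arithmetic only: it assumes that "600–700 bases per run" means bases per capillary per run, that every capillary yields a usable read, and that there is no downtime, none of which the notes state. The function name is hypothetical.

```python
def daily_bases(machines, capillaries, hours_per_run, bases_per_read):
    """Raw bases per day for a fleet of sequencers running around the clock."""
    runs_per_day = 24 / hours_per_run
    return machines * capillaries * runs_per_day * bases_per_read


# Figures from note 4: 200 machines, 96 capillaries, a run every 2-3 hours,
# 600-700 bases per read (per capillary is an assumption).
slow = daily_bases(200, 96, 3, 600)
fast = daily_bases(200, 96, 2, 700)

print(slow, fast)  # 92160000.0 161280000.0

# Days needed to reach the roughly 35 billion bp mentioned on slide 33.
target_bp = 35_000_000_000
print(round(target_bp / slow), round(target_bp / fast))
```

Under these optimistic assumptions the fleet produces on the order of 100 million raw bases per day, so tens of billions of bases are reachable within about a year of continuous operation.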