3. Introduction:
• DNA sequencing: Process of determining the precise order
of nucleotides within a DNA molecule.
• It includes any method or technology used to determine the order
of the four bases:
* Adenine (A)
* Guanine (G)
* Cytosine (C)
* Thymine (T)
• The advent of rapid DNA sequencing methods has greatly accelerated
biological and medical research and discovery.
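As a small illustration of "the order of the four bases", the sketch below checks that a string contains only A, G, C and T and counts each base. The function name `base_counts` and the sample sequence are hypothetical, chosen only for this example.

```python
from collections import Counter

# The four DNA bases named on this slide.
BASES = "AGCT"


def base_counts(sequence):
    """Count each of the four bases in a DNA sequence (hypothetical helper)."""
    seq = sequence.upper()
    unknown = set(seq) - set(BASES)
    if unknown:
        # Reject anything that is not A, G, C or T.
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    counts = Counter(seq)
    # Report the bases in a fixed order; absent bases count as 0.
    return {base: counts[base] for base in BASES}


if __name__ == "__main__":
    print(base_counts("GATTACA"))
```

A real sequence file would also contain ambiguity codes such as N, which this sketch deliberately rejects.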
4. Contd.
• Knowledge of DNA sequences has become indispensable for
basic biological research and in numerous applied fields:
* Diagnostics
* Biotechnology
* Forensic biology
* Biological systematics
• The speed of modern DNA sequencing technology has been
instrumental in sequencing complete genomes of numerous
species, including the human genome and the genomes of
many other animal, plant, and microbial species.
5. History:
• The first DNA sequences were obtained in the early 1970s by
academic researchers using laborious methods based on
two-dimensional chromatography. Following the development
of fluorescence-based sequencing methods with automated
analysis, DNA sequencing became easier and orders of
magnitude faster.
• Several notable advancements in DNA sequencing were made
during the 1970s. Frederick Sanger developed rapid DNA
sequencing methods at the MRC Centre, Cambridge, UK and
published a method for "DNA sequencing with chain-
terminating inhibitors" in 1977.
• Walter Gilbert and Allan Maxam at Harvard also developed
sequencing methods, including one for "DNA sequencing by
chemical degradation".
7. Contd.
• The first full DNA genome to be sequenced was that
of bacteriophage φX174 in 1977. Medical Research
Council scientists deciphered the complete DNA sequence of
the Epstein-Barr virus in 1984, finding it to be 170 thousand
base-pairs long.
• Leroy E. Hood's laboratory at the California Institute of
Technology, including Lloyd M. Smith, announced the first
semi-automated DNA sequencing machine in 1986.
• This was followed by Applied Biosystems' marketing of the first
fully automated sequencing machine, the ABI 370, in 1987.
• By 1990, the U.S. NIH had begun large-scale sequencing trials
on Mycoplasma capricolum, Escherichia coli, Caenorhabditis
elegans, and Saccharomyces cerevisiae at a cost of US$0.75
per base.
8. Several new methods for DNA sequencing were developed in the mid to
late 1990s. These techniques comprise the first of the "next-generation"
sequencing methods.
In 1996, Pål Nyrén and his student Mostafa Ronaghi at the Royal Institute
of Technology in Stockholm published their method of pyrosequencing.
Lynx Therapeutics published and marketed "Massively parallel signature
sequencing", or MPSS, in 2000. This method incorporated a parallelized,
adapter/ligation-mediated, bead-based sequencing technology and
served as the first commercially available "next-generation" sequencing
method, though no DNA sequencers were sold to independent
laboratories.
9. Use of Sequencing:
• DNA sequencing may be used to determine the sequence of
individual genes, larger genetic regions, full chromosomes or
entire genomes.
• Depending on the methods used, sequencing may provide the
order of nucleotides in DNA or RNA isolated from cells of
animals, plants, bacteria or virtually any other source of
genetic information.
• The resulting sequences may be used by researchers
in molecular biology or genetics to further scientific progress
or may be used by medical personnel to make treatment
decisions or aid in genetic counselling.
11. Sanger's Method:
• The DNA sample is divided into four separate sequencing
reactions, each containing all four of the
standard deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA
polymerase.
• To each reaction is added only one of the four dideoxynucleotides
(ddATP, ddGTP, ddCTP, or ddTTP).
• Following rounds of template DNA extension from the bound primer, the
resulting DNA fragments are heat denatured and separated by size using gel
electrophoresis.
• This is frequently performed using a denaturing polyacrylamide-urea gel
with each of the four reactions run in one of four individual lanes (lanes
A, T, G, C). The DNA bands may then be visualized by autoradiography or
UV light and the DNA sequence can be directly read off the X-ray film or
gel image.
• Figure: part of a radioactively labelled sequencing gel.
• In the gel image, X-ray film was exposed to the gel, and the dark
bands correspond to DNA fragments of different lengths. A dark band in a
lane indicates a DNA fragment that is the result of chain termination after
incorporation of a dideoxynucleotide (ddATP, ddGTP, ddCTP, or ddTTP).
The relative positions of the different bands among the four lanes, from
bottom to top, are then used to read the DNA sequence.
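The gel-reading logic above can be sketched in a few lines of Python. This is an idealized model, not a simulation of the chemistry: it assumes every position of the newly synthesized strand yields exactly one terminated fragment in the matching ddNTP lane, and that fragment lengths are counted from the end of the primer. The names `sanger_lanes`, `read_gel` and `reverse_complement` are hypothetical.

```python
# Idealized sketch of Sanger chain termination and gel reading.
# Assumption: each position in the synthesized strand produces one
# fragment that ends there, in the lane of the matching ddNTP.

COMPLEMENT = str.maketrans("ATGC", "TACG")


def sanger_lanes(synthesized):
    """Map each ddNTP lane (A, T, G, C) to the fragment lengths it contains."""
    lanes = {"A": [], "T": [], "G": [], "C": []}
    for length, base in enumerate(synthesized, start=1):
        # A fragment of this length ends in a dideoxy version of `base`.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the bands from bottom (shortest) to top (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


def reverse_complement(sequence):
    """Return the template strand implied by the synthesized strand."""
    return sequence.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    lanes = sanger_lanes("GATTACA")
    print(lanes)
    print(read_gel(lanes))
    print(reverse_complement(read_gel(lanes)))
```

Shorter fragments migrate further through the gel, which is why sorting by length stands in for reading the lanes from bottom to top.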
13. Next Generation Sequencing
• Employs micro- and nanotechnologies to reduce the size of
sample components, reducing reagent costs and enabling
massively parallel sequencing reactions.
• Highly multiplexed, allowing simultaneous sequencing and
analysis of millions of DNA fragments.
• Became commercially available from 2005, with Solexa
sequencing technology among the first platforms.
• Several different sequencing methods have been developed,
and all of them continue to advance rapidly.
14. Sanger vs. NGS
                     Sanger                                   NGS
Sequencing samples   Clones, PCR                              DNA libraries
Sample tracking      Many samples in 96- or 384-well plates   Few
Preparation steps    Few; sequencing reactions, clean-up      Many; complex procedures
Data collection      Samples in plates (96, 384)              Samples on slides (1 to 16+)
Data                 One read per sample                      Thousands to millions of reads per sample
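15. Comparison of Sequencing Methods
• Single-molecule real-time sequencing
* Read length: 2900 bp average
* Accuracy: 87% (read length mode), 99% (accuracy mode)
* Reads per run: 35 to 75 thousand
* Time per run: 30 minutes to 2 hours
* Cost per 1 million bases: $2
* Advantages: Longest read length. Fast. Detects 4mC, 5mC, 6mA.
* Disadvantages: Low yield at high accuracy. Equipment can be very expensive.
• Ion semiconductor
* Read length: 200 bp
* Accuracy: 98%
* Reads per run: up to 5 million
* Time per run: 2 hours
* Cost per 1 million bases: $1
* Advantages: Less expensive equipment. Fast.
* Disadvantages: Homopolymer errors.
• Pyrosequencing (454)
* Read length: 700 bp
* Accuracy: 99.9%
* Reads per run: 1 million
* Time per run: 24 hours
* Cost per 1 million bases: $10
* Advantages: Long read size. Fast.
* Disadvantages: Runs are expensive. Homopolymer errors.
• Sequencing by synthesis (Illumina)
* Read length: 50 to 250 bp
* Accuracy: 98%
* Reads per run: up to 3 billion
* Time per run: 1 to 10 days, depending upon sequencer and specified read length
* Cost per 1 million bases: $0.05 to $0.15
* Advantages: Potential for high sequence yield, depending upon sequencer model.
* Disadvantages: Equipment can be very expensive.
• Sequencing by ligation (SOLiD sequencing)
* Read length: 50+35 or 50+50 bp
* Accuracy: 99.9%
* Reads per run: 1.2 to 1.4 billion
* Time per run: 1 to 2 weeks
* Cost per 1 million bases: $0.13
* Advantages: Low cost per base.
* Disadvantages: Slower than other methods.
• Chain termination (Sanger sequencing)
* Read length: 400 to 900 bp
* Accuracy: 99.9%
* Reads per run: N/A
* Time per run: 20 minutes to 3 hours
* Cost per 1 million bases: $2400
* Advantages: Long individual reads. Useful for many applications.
* Disadvantages: More expensive and impractical for larger sequencing projects.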
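To make the cost column concrete, the sketch below multiplies the "cost per 1 million bases" figures from this slide by a chosen number of bases. It is rough arithmetic only: it ignores coverage depth, equipment and labour, and the 3 billion base example (roughly the size of a human genome) is an assumption introduced here, as are the names `COST_PER_MILLION_BASES` and `raw_cost_range`.

```python
# Cost per 1 million bases (US$), copied from the comparison above.
# Each entry is a (low, high) pair so that ranges can be represented.
COST_PER_MILLION_BASES = {
    "Single-molecule real-time": (2.00, 2.00),
    "Ion semiconductor": (1.00, 1.00),
    "Pyrosequencing (454)": (10.00, 10.00),
    "Sequencing by synthesis (Illumina)": (0.05, 0.15),
    "Sequencing by ligation (SOLiD)": (0.13, 0.13),
    "Chain termination (Sanger)": (2400.00, 2400.00),
}


def raw_cost_range(method, total_bases):
    """Return the (low, high) reagent cost in US$ for reading total_bases once."""
    low, high = COST_PER_MILLION_BASES[method]
    millions = total_bases / 1_000_000
    return low * millions, high * millions


if __name__ == "__main__":
    # Assumption: about 3 billion bases, read once (1x coverage).
    example_bases = 3_000_000_000
    for method in COST_PER_MILLION_BASES:
        low, high = raw_cost_range(method, example_bases)
        print(f"{method}: ${low:,.2f} to ${high:,.2f}")
```

Even at a single pass, the gap between Sanger sequencing and the NGS methods spans several orders of magnitude, which is the point behind "impractical for larger sequencing projects".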
16. Thank You
Archa Dave
M.Sc Micro Biology
Semester 2
12031G1901