Original Next Gen Seq Methods set of slides prepared for Technorazz Vibes 2016. There is also a shorter version.
This starts with an introduction to qPCR followed by an introduction to Library Complexity. Microarrays are discussed as well along with a very short introduction to FISH. Finally discussion of Next gen seq methods is done where generation of sequencers are discussed and a short discussion of the ILLUMINA protocol. Finally comparison of ILLUMINA amongst other 3rd gen sequencer, description of the standard pipeline and the omics technologies that have risen from this seq data.
Deciphering DNA sequences is essential for virtually all branches of biological research. With the
advent of capillary electrophoresis (CE)-based Sanger sequencing, scientists gained the ability to
elucidate genetic information from any given biological system. This technology has become widely
adopted in laboratories around the world, yet has always been hampered by inherent limitations in
throughput, scalability, speed, and resolution that often preclude scientists from obtaining the essential
information they need for their course of study. To overcome these barriers, an entirely new technology
was required—Next-Generation Sequencing (NGS), a fundamentally different approach to sequencing
that triggered numerous ground-breaking discoveries and ignited a revolution in genomic science.
In this lecture tried to introduce some basic methods of DNA sequencing like pyrosequencing, sequencing by ligation, sequencing by synthesis and Ion Semiconductor Sequencing
and describe them. Also introduced some new sequencing method (third generation sequencing) like SMRT (Single Molecule Real-Time Sequencing) and GridION.
whole genome analysis
history
needs
steps involved
human genome data
NGS
pyrosequencing
illumina
SOLiD
Ion torrent
PacBio
applications
problems
benefits
complete Single Nucleotide Polymorphiitsm Detection methods with Advance techniques with its applications
Single nucleotide polymorphisms are single base variations between genomes within a species.
There are at least 10 million polymorphic sites in the human genome.
SNPs can distinguish individuals from one another
Denaturing Gradient Gel Electrophoresis
Chemical Cleavage Of Mismatch
Single-stranded Conformation Polymorphism (SSCP)
MutS Protein-binding Assays
Mismatch Repair Detection (MRD)
Heteroduplex Analysis (HA)
Denaturing High Performance Liquid Chromatography (DHPLC)
UNG-Mediated T-Sequencing
RNA-Mediated Finger printing with MALDI MS Detection
Sequencing by Hybridization
Direct DNA Sequencing
Single-feature polymorphism (SFP)
Invader probe
Allele-specific oligonucleotide probes
PCR-based methods
Allele specific primers
Sequence Polymorphism-Derived (SPD) markers
Targeting induced local lesions in genomes (TILLinG)
Minisequencing primers
Allele-specific ligation probes
This presentation is explains about the genome sequencing, its traditional method and modern method. This basically focus on Next Generation Sequencing and its types.
Deciphering DNA sequences is essential for virtually all branches of biological research. With the
advent of capillary electrophoresis (CE)-based Sanger sequencing, scientists gained the ability to
elucidate genetic information from any given biological system. This technology has become widely
adopted in laboratories around the world, yet has always been hampered by inherent limitations in
throughput, scalability, speed, and resolution that often preclude scientists from obtaining the essential
information they need for their course of study. To overcome these barriers, an entirely new technology
was required—Next-Generation Sequencing (NGS), a fundamentally different approach to sequencing
that triggered numerous ground-breaking discoveries and ignited a revolution in genomic science.
In this lecture tried to introduce some basic methods of DNA sequencing like pyrosequencing, sequencing by ligation, sequencing by synthesis and Ion Semiconductor Sequencing
and describe them. Also introduced some new sequencing method (third generation sequencing) like SMRT (Single Molecule Real-Time Sequencing) and GridION.
whole genome analysis
history
needs
steps involved
human genome data
NGS
pyrosequencing
illumina
SOLiD
Ion torrent
PacBio
applications
problems
benefits
complete Single Nucleotide Polymorphiitsm Detection methods with Advance techniques with its applications
Single nucleotide polymorphisms are single base variations between genomes within a species.
There are at least 10 million polymorphic sites in the human genome.
SNPs can distinguish individuals from one another
Denaturing Gradient Gel Electrophoresis
Chemical Cleavage Of Mismatch
Single-stranded Conformation Polymorphism (SSCP)
MutS Protein-binding Assays
Mismatch Repair Detection (MRD)
Heteroduplex Analysis (HA)
Denaturing High Performance Liquid Chromatography (DHPLC)
UNG-Mediated T-Sequencing
RNA-Mediated Finger printing with MALDI MS Detection
Sequencing by Hybridization
Direct DNA Sequencing
Single-feature polymorphism (SFP)
Invader probe
Allele-specific oligonucleotide probes
PCR-based methods
Allele specific primers
Sequence Polymorphism-Derived (SPD) markers
Targeting induced local lesions in genomes (TILLinG)
Minisequencing primers
Allele-specific ligation probes
This presentation is explains about the genome sequencing, its traditional method and modern method. This basically focus on Next Generation Sequencing and its types.
A microarray is a laboratory tool used to detect the expression of thousands of genes at the same time. DNA microarrays are microscope slides that are printed with thousands of tiny spots in defined positions, with each spot containing a known DNA sequence or gene.
Sequencing genes and genomes in biology. The most important technique available to the molecular biologist is DNA sequencing, by which the precise order of nucleotides in a piece of DNA can be determined
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...QIAGEN
This slidedeck provides a technical overview of DNA/RNA preprocessing, template preparation, sequencing and data analysis. It covers the applications for NGS technologies, including guidelines for how to select the technology that will best address your biological question.
Course: Bioinformatics for Biomedical Research (2014).
Session: 4.1- Introduction to RNA-seq and RNA-seq Data Analysis.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
DNA Sequencing - DNA sequencing is like reading the instructions inside a cellAmitSamadhiya1
DNA sequencing is like reading the instructions inside a cell. It's figuring out the exact order of the building blocks that make up our DNA, represented by the letters A, T, C, and G. This order is like a code that tells our bodies how to function and grow.
By reading this code, scientists can understand genes, diagnose diseases, and even trace our ancestry. There are different ways to sequence DNA, kind of like having a few different ways to read a book. These techniques are constantly improving, making it faster and easier to unlock the secrets hidden in our DNA.
Water billing management system project report.pdfKamal Acharya
Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard.
The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record.
We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular.
MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.
Online aptitude test management system project report.pdfKamal Acharya
The purpose of on-line aptitude test system is to take online test in an efficient manner and no time wasting for checking the paper. The main objective of on-line aptitude test system is to efficiently evaluate the candidate thoroughly through a fully automated system that not only saves lot of time but also gives fast results. For students they give papers according to their convenience and time and there is no need of using extra thing like paper, pen etc. This can be used in educational institutions as well as in corporate world. Can be used anywhere any time as it is a web based application (user Location doesn’t matter). No restriction that examiner has to be present when the candidate takes the test.
Every time when lecturers/professors need to conduct examinations they have to sit down think about the questions and then create a whole new set of questions for each and every exam. In some cases the professor may want to give an open book online exam that is the student can take the exam any time anywhere, but the student might have to answer the questions in a limited time period. The professor may want to change the sequence of questions for every student. The problem that a student has is whenever a date for the exam is declared the student has to take it and there is no way he can take it at some other time. This project will create an interface for the examiner to create and store questions in a repository. It will also create an interface for the student to take examinations at his convenience and the questions and/or exams may be timed. Thereby creating an application which can be used by examiners and examinee’s simultaneously.
Examination System is very useful for Teachers/Professors. As in the teaching profession, you are responsible for writing question papers. In the conventional method, you write the question paper on paper, keep question papers separate from answers and all this information you have to keep in a locker to avoid unauthorized access. Using the Examination System you can create a question paper and everything will be written to a single exam file in encrypted format. You can set the General and Administrator password to avoid unauthorized access to your question paper. Every time you start the examination, the program shuffles all the questions and selects them randomly from the database, which reduces the chances of memorizing the questions.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
2. Sequencing
• Sequencing is the process of determining the precise
order of nucleotides within a DNA, RNA molecule.
• In case of proteins, amino acids
• It includes any method or technology that is used to
determine the order of the four bases—adenine,
guanine, cytosine, and thymine—in a strand of DNA.
• Next-generation sequencing (NGS or high-throughput
sequencing are collectively technologies developed by:
– Illumina (Solexa) sequencing
– Roche 454 sequencing
– Ion torrent: Proton / PGM sequencing
– SOLiD sequencing (Thermo Fisher Scientific)
3.
4. qPCR
• Quantitative PCR measures the level of a particular
nucleic acid (usually mRNA) defined by a primer pair
sequence that amplifies it
• The mRNA is converted into DNA by reverse
transcriptase (a retroviral enzyme)
• Makes use of a high temperature stable polymerase i.e.
Taq polymerase
• It is a non-leniar amplification
• Selective bias of molecules can register less library
complexity
5. • Quantitaiveness by: measuring the no. of
cycles needed to reach a fluorescence
intensity threshold
• However, larger no. of cycles = lower original
amount of DNA i.e. less DNA available for
experimentation purposes
• It is a medium-throughput approach and is
usually limited to only being of validative use
6. Library Complexity
• The no. of unique molecules in
the “library” that is sampled by
finite sequencing constitutes
library complexity.
• Simple representation methods
(such as Poisson's distribution)
can be wrong in representing the
complexity
• Poisson’s dist. does not account
for over dispersion, hence
Poisson’s sampling is only
effective for smaller population
sizes
• Negative binomial distribution
(Poisson-Gamma dist.) is one
method of correctly estimating
the library complexity
7. How to register Lib. Complexity
• Attach a set of 5 nucleotide
barcodes (1,024 possible) to
the 5’ end of (nearly) every
mRNA
• Sequence the 5’ ends and
count the number of unique
barcodes that appear
• This is enough to ensure
unique alignment
• Obtained is the number of
mRNA molecules in the
original sample
8. • Gamma sampling rates describe the
entire population library (complexity)
• In high-throughput when reads are as large
as hundreds of millions, it becomes useful
to ask:
1.) How complex is the original library?
2.) Is the data really good?
• If the observed complexity matches with
the theoretical complexity , then it solves
the dilemma whether further sequencing
should be done to capture entire complexity
9. Microarrays
• These were one of the first omic
methods
• Used for measuring: transcript
levels, genotyping, DNA mapping
(viz. DNA copy no., methylome)
• Principle: On microarray chip there
are spots – each spot has a diff.
oligonucleotide – sample binds to
complimentary ones and generates
a fluorescent signal – we
photograph it – analyze the
photograpth – get results
• It is primarily used for
Transcriptome and registers lib.
complexity is very well
• Instead of chips/slides, beads are
also used (as we will see in
ILLUMINA seq tech.)
10. FISH
• Fluorescence in situ
hybridization uses labeled
oligos to hybridize to the
target nucleic acids
• It is shown to be done for
single molecules
(Raj et al, Nat. Met., 5(10),
2008)
• It can also be highly
multiplexed with super-
resolution and fluropore
barcoding (Lubec and cori,
Nat. Met. 9(7), 2012)
11. General rules of thumb
• Few different molecules (RNA): Northern
Blotting or FISH
• Medium throughput: qPCR
• High throughput: Microarrays, Next-Gen
Sequencing
12. Need of sequencing
• Sequencing such as that of mRNA allows for:
(a.) Quantification of expressed transcriptome (transcript library)
(b.) The ratios of splice variant levels i.e. for a gene that differentially
expresses two splice isoforms
(c.) Identification of novel splice variants
• Seq Techniques for Nucleic acids is broadly divided into two types:
(1.) Population average: Large amount of starting material
required (~1 M cells)
e.g. qPCR, microarrays, Deep Sequencing Technologies (viz. Whole
genome, Exome seq, Transcriptome, Bisulphate seq, ChIP seq) etc.
(2.) Single cell: Detecting analytes from single cell.
-> Main challenge- separating measurement noise
e.g. qPCR, FISH, RNA seq etc.
13. Sequencing Based Methods
• Although high-throughput
microarrays have certain
limitations viz.
(a.) High background noise
(b.) Need of large starting material
(c.) Microarray is hard to compare
across different techs.
(c.) Limited ability to distinguish
isoforms and allelic expression
• These are overcome by Next Gen
RNA sequencers
• Where depth of sequencing or
the average no. of short reads
per base pair is a key parameter
in seq.
• Exome seq give more info. for
less short reads
• Apart from the general protocol–
a step of exonic fragment
isolation is more to exon seq.
14. Generations of RNA sequencers
• 1st gen/ Sanger
Sequencing
These were primer
based methods which
used ddNTPs,
fluorescent markers and
enzymes
Low throughput, ~700
bp read length
Very slow and expensive
but highly accurate
Parallel seq was used to
increase sequencing
speed – 96 and 384 well
formats
16. • 2nd gen sequencing
DNA strand sep. problem
was eliminated by 2nd
gen. seq.
Parallel identification of
nts. during synthesis
A fundamentally different
approach
Limitations:
These require
amplification of DNA to
meet detection threshold
Amplification bias
Practical limits in read
lengths
17. ● Polymerase releases H+
during base incorporation
● H+ is measured by a
semi‐conductor wafer
● Essentially a massively
parallel pH meter
18. • 3rd gen sequencing
Longer read length without the
need of amplification
Involves immobilized polymerase +
fluorescent DNTPs + highly
sensitive optometry
DNTPs have 6 Ps instead of 3, thus
a longer fluorescence pulse is
generated
Color of fluorescence is compared
to give results
It is quite noisy and has a high
error rate
One powerful aspect is
methylation detection –
methylated bases take longer time
to be read
19.
20. • ILLUMINA Protocol (for a 3rd gen seq
machine)
Two components: (a.) RNA seq library
preparation (Mol Bio. Component) &
(b.) Actual seq
and Data analysis
(a.) We use Bioanalyzer chip -- a high-
throughput electrophoresis on a chip
for
QC of the mRNA sample
(Takes up to 45 min.)
Next is adapter ligation– automated –
takes up to 5 hrs.
Preparation of cDNA libraries in PCR – 1
hr.
Reanalysis of QC on a DNA chip
21. Libraries are
loaded at flow in
(tunnel / lane) with a
separate automated
machine (5 hrs.)
Each lane – 20 B of
the seq data nts.
25. Fruits of sequencing
• Answering key questions such as – are certain
SNIPs associated with diseases? What critical
mutations are there which caused the
disease?
• However, data must be quantitative and
sampling population should be large to make
any such assessment e.g. mutation
frequencies
26.
27. Conclusion
• These technologies allows for sequencing of
DNA and RNA much more quickly and cheaply
than the previously used Sanger sequencing,
and as such have revolutionised the study of
genomics and molecular biology
28. The Latest
• MinION has the potential to
revolutionize the field of sequencing
completed
• It was field tested recently for seq.
EBOLA virus
• A threat to the decade long
ILLUMINA dominancy over the
market
• Very cheap!
30. # References#Image Credits
Mostly CC0 images, FISH, microarray, ILLUMINA images from Nature reviews and ILLUMINA
site resp., in lecture images from MIT OCW lectures, Lab-based videos of SBCNY (Ic, CC0
slides from), Cost per genome image from NHGRI, MinION from google images
#Content
Goodwin S, McPherson JD, McCombie RW, Coming of age: ten years of next-generation
sequencing technologies, Nature Reviews Genetics 17, 333–351 (2016)
doi:10.1038/nrg.2016.49
Buermans HPJ, Den Dunnen JT, Next generation sequencing technology: Advances and
applications, Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease, 1842, Issue
10, 1932–1941 (2014), http://dx.doi.org/10.1016/j.bbadis.2014.06.015
Ed Yong, A DNA Sequencer in Every Pocket – The Atlantic, Apr 28, 2016
Lectures from SBCNY : ILLUMINA Protocol (Icahn School of Medicine at Mt. Sinai)
Lectures from MIT OCW: 2. Local Alignment (BLAST) and Statistics
Selected content VHIR : Introduction to next generation sequencing (VHIR (Vall d’Hebron
Institut de Recerca)