How to cluster and sequence an ngs library (james hadfield160416)

‘How to prepare, cluster and sequence an NGS library’
AN OVERVIEW OF NGS IN THE GENOMICS CORE
– Introduction
– Understanding library prep
– Understanding clustering and sequencing
– Understanding instruments
– NGS QC
– NGS applications

A potted history of Illumina sequencing
[Timeline figure, 1994 to 2015. Year labels: 1994, 1998, 2004, 2006, 2007, 2010, 2011, 2012, 2014, 2015. Output labels: 1Gb, 25Gb, 200Gb, 500Gb, 1000Gb, 1500Gb.]

Understanding library prep
[Workflow diagram, built up over several slides]
– Quant DNA
– Fragment DNA
– End-repair
– Adenylate
– Ligate adapters
– PCR
– QC: BioAnalyser, qPCR

[Workflow diagram, built up over several slides. The step labels below are combined differently for each library type.]
– DNA steps shown: Quant DNA, Fragment DNA, Shear, Tagment, Size select, Bead purify, Gel purify, Adenylate, 5’ ligation, 3’ ligation, Circularise & exo, Strand displace, 1st capture, 2nd capture, PCR, 2nd PCR
– RNA steps shown: Fragment RNA, Ribozero deplete, Oligo-dT enrich, Reverse transcribe, 1st strand, 2nd strand, dUTP, UNG
– Stages: Template preparation, Library synthesis, Amplify & Index, QC (BioAnalyser), QT (qPCR)
– Kits: TSCA, TruSeq stranded mRNA, Nextera XT, TruSeq PCR-free, TruSeq sRNA, Nextera, TruSeq Nano, Nextera Rapid Exome, TruSeq Ribozero, Nextera Mate-Pair, TruSeq ChIP-seq, TREX, Thruplex

6 hours to 3 days depending on the actual sample type;
not high-throughput or 96-well

Adapter ligation, PCR and sequencing
[Figure: adapter ligation, PCR and sequencing. Labels: A, P, OH, T, P5, Read1, BC, Read2, P7]

Illumina adapters: ask for Illumina letter!
[Figure: adapters ligated to A-tailed insert DNA, with the ADAPTER, PCR PRIMER and SEQ PRIMER regions marked. Sequence fragments shown: CTCTTCCGATCT (its final T drawn as the overhang), P-GATCGGAAGAG and CTCGTATGCCGTCTTCTGCTTG]
Oligonucleotide sequences © Illumina, Inc. All rights reserved.
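The base pairing drawn in the adapter figure can be checked with a reverse complement. This is a minimal sketch that uses only the sequence fragments printed on the slide; the function and variable names are ours, not Illumina's.

```python
# Sequence fragments as printed on the slide
# (oligonucleotide sequences (c) Illumina, Inc.).
ADAPTER_TOP_3PRIME = "CTCTTCCGATCT"    # adapter strand ending in the T overhang
ADAPTER_BOTTOM_5PRIME = "GATCGGAAGAG"  # 5'-phosphorylated strand ("P-GATCGGAAGAG")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# The strand that pairs with the phosphorylated oligo.
stem = reverse_complement(ADAPTER_BOTTOM_5PRIME)

# Whatever is left on the top strand after the paired stem is the overhang.
overhang = ADAPTER_TOP_3PRIME[len(stem):]

print("paired stem:", stem)
print("unpaired 3' overhang:", overhang)
```

The single unpaired base is the T that pairs with the A added to the insert at the adenylation step.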

The library prep spike
[Figure labels: [DNA]; Illumina Processing]

Understanding library prep – Nextera!
[Figure, panels a to c: Nextera library prep. Labels: P5, BC1, R1, R2, BC2, P7]

Understanding cluster generation (2500 etc)

A) Diluted & denatured libraries are annealed to lawn oligos at their 3’ end, and a
polymerase creates a covalently attached copy of the library molecule.
B) The original strand is removed by denaturation with NaOH.
C) In non-denaturing conditions the library molecule bends and hybridises to a lawn
oligo complementary to the 5’ end, and a polymerase creates a second covalently
attached molecule. This amplification is repeated to create a cluster with around 1000
copies of the original library molecule.
[Figure: panels A to C]
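To get a feel for the "around 1000 copies" figure, the sketch below works out how many amplification cycles that would take if every cycle exactly doubled the cluster. Perfect doubling is an idealisation and the cycle count is not given on the slide, so treat the result as a lower bound rather than the instrument's setting.

```python
import math


def ideal_cycles_for_copies(target_copies: int) -> int:
    """Fewest cycles needed if each cycle exactly doubled the copy number."""
    return math.ceil(math.log2(target_copies))


cycles = ideal_cycles_for_copies(1000)
print(f"{cycles} ideal doublings give {2 ** cycles} copies")
```

Real bridge amplification is less efficient than perfect doubling, so more cycles than this are needed in practice.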

[Figure: panels D to H]
D) Clusters are linearized by cleavage at the 3’ end of the original library molecule, and
denaturation leaves the single stranded DNA which will be sequenced. A sequencing
primer is hybridised* and sequencing-by-synthesis generates the first read in your fastq
file.
-) For single-end indexing the SBS template is removed by denaturation, and the index
1 sequencing primer is hybridised ready to generate index 1 (i7). Dual-indexing is
complicated and differs on single- or paired-end flowcells, but the process is essentially the
same to generate index 2 (i5).
E-G) For paired-end sequencing the SBS template is removed by denaturation, the cluster
is re-amplified for several cycles and cleaved at the 5’ end, and the paired-end sequencing
primer is hybridised ready to generate read 2.
*Beware: if you create new adapters let us know if you need a custom sequencing primer
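The reads described in D to G add up to the number of sequencing cycles a run needs. A rough sketch follows. The 8bp index length and the function name are assumptions, not from the slides. The position of index 2 relative to the paired-end turnaround differs between flowcell types, which changes the order but not the total, and any chemistry-only cycles are ignored.

```python
from typing import List, Tuple


def run_cycles(read_len: int, paired_end: bool, index_reads: int,
               index_len: int = 8) -> Tuple[List[Tuple[str, int]], int]:
    """Return the read segments of a run and the total number of cycles.

    index_len=8 is an assumed index length; adjust it for your adapters.
    """
    segments = [("Read 1", read_len)]
    if index_reads >= 1:
        segments.append(("Index 1 (i7)", index_len))
    if index_reads >= 2:
        segments.append(("Index 2 (i5)", index_len))
    if paired_end:
        segments.append(("Read 2", read_len))
    total = sum(n for _, n in segments)
    return segments, total


segments, total = run_cycles(read_len=125, paired_end=True, index_reads=2)
for name, n in segments:
    print(f"{name}: {n} cycles")
print("total:", total)
```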

Understanding cluster generation (X Ten & 4000)
Exclusion Amplification
The same hybridisation and solid-surface amplification occurs but in an all-in-one
phase called “exclusion amplification” (ExAmp). Once a library molecule “lands” in a
well it should occupy it completely.

Understanding sequencing: Sanger-seq

Understanding sequencing: Pyro-seq

Understanding sequencing: Sequencing-by-synthesis

Understanding “sequencing by synthesis”
Instrument “colours”
HiSeq, MiSeq 4-colour SBS
NextSeq 2-colour SBS
Firefly 1-colour SBS?
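How two colours can encode four bases is easiest to see as a lookup table. The mapping below is not on the slide; it is the commonly described 2-channel scheme (A in both channels, C red only, T green only, G in neither) and should be treated as an assumption. The names are ours.

```python
from typing import Iterable, Tuple

# (red signal, green signal) -> base call; the mapping is an assumption.
TWO_COLOUR_CALLS = {
    (True, True): "A",    # seen in both images
    (True, False): "C",   # red only
    (False, True): "T",   # green only
    (False, False): "G",  # no signal in either image
}


def call_bases(signals: Iterable[Tuple[bool, bool]]) -> str:
    """Turn per-cycle (red, green) detections for one cluster into a read."""
    return "".join(TWO_COLOUR_CALLS[(red, green)] for red, green in signals)


print(call_bases([(True, True), (True, False), (False, True), (False, False)]))
```

One consequence of this scheme is that a cluster giving no signal at all is indistinguishable from a run of G.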

Instruments explained – HiSeq 2500 & 4000

Different sequencing configurations
– 2500 Rapid: 150M reads; SE 50bp 85% Q30; PE 250bp 75% Q30; PE 150 takes 2 days
– 2500 High output: 250M reads; SE 50bp 85% Q30; PE 125bp 80% Q30; PE 125 takes 6 days
– 4000 High output: 312M reads; SE 50bp 85% Q30; PE 150bp 75% Q30; PE 150 takes 3 days
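The read counts and read lengths above convert to sequence yield with simple arithmetic. This sketch assumes the quoted "reads" are clusters and that a paired-end run gives two reads per cluster; the slide states neither, nor whether the figures are per lane or per flowcell, so the outputs are illustrative only.

```python
def yield_gb(clusters_millions: float, read_len: int, paired_end: bool) -> float:
    """Sequence yield in gigabases for a given cluster count and read length."""
    reads_per_cluster = 2 if paired_end else 1
    bases = clusters_millions * 1e6 * read_len * reads_per_cluster
    return bases / 1e9


# (clusters in millions, read length, paired-end?) taken from the slide
configs = {
    "2500 Rapid PE150": (150, 150, True),
    "2500 High output PE125": (250, 125, True),
    "4000 High output PE150": (312, 150, True),
}

for name, (millions, length, paired) in configs.items():
    print(f"{name}: {yield_gb(millions, length, paired):.1f} Gb")
```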

HiSeq 4000 considerations
CLUSTERING IS VERY DIFFERENT FROM 2500
– PE150 - >125 is not great*
– %Q30 “passes Illumina spec”*
– ExAmp duplicates*
– Need to consider how you handle duplicates
– RNA-seq is fine
– Exome-seq is fine
– Genomes are fine
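The %Q30 figures quoted for each instrument use the Phred scale, where Q = -10 log10(P) and P is the probability that a base call is wrong. A small sketch of both conversions, with function names of our own choosing:

```python
from typing import Sequence


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred quality q is wrong."""
    return 10 ** (-q / 10)


def fraction_at_or_above(quals: Sequence[int], threshold: int = 30) -> float:
    """Fraction of base calls with quality >= threshold (e.g. %Q30 / 100)."""
    return sum(q >= threshold for q in quals) / len(quals)


print(phred_to_error_prob(30))                 # 1 error in 1000 calls
print(fraction_at_or_above([40, 35, 30, 20]))  # three of four calls are >= Q30
```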

Instruments explained - MiSeq
[Figure: ~600bp fragments, 300bp reads, +/- 50bp overlap]
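The overlap between the two reads of a pair follows from the fragment and read lengths. A minimal sketch, assuming the fragment length is the insert without adapters and that one read starts from each end; the function name is ours.

```python
def read_overlap(fragment_len: int, read_len: int) -> int:
    """Bases covered by both reads of a pair; negative means an unsequenced gap."""
    return 2 * read_len - fragment_len


for fragment in (550, 600, 650):
    print(f"{fragment}bp fragment, 2 x 300bp reads: overlap {read_overlap(fragment, 300)}bp")
```

With 300bp reads, fragments around 600bp sit right at the boundary between a small overlap and a small gap.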

Instruments explained - NextSeq

NGS QC – library prep
QUALITY CONTROL OF LIBRARIES IS IMPORTANT.
TITRATION FLOWCELLS AND FAILED RUNS ARE EXPENSIVE.
TRY TO IDENTIFY ISSUES BEFORE RUNNING ANY LANES.
QC IS SPECIFIC TO YOUR SAMPLES.
QUANTITATION OF LIBRARIES IS IMPORTANT.
SOME QC CAN ONLY BE DONE ONCE YOU HAVE GENERATED DATA.
[Figure: good and bad examples. Labels: Bioanalyser, qPCR, Analysis]

NGS QC – MGA
LIBRARY QC – CONTAMINANT DETECTION
SAMPLE 100,000 READS FROM FASTQ
READS TRIMMED TO 36BP
ALIGN TO MULTIPLE GENOMES USING BOWTIE
LIBRARY QC – ADAPTER DETECTION
SAMPLE 100,000 READS FROM FASTQ
READS CONVERTED TO FASTA
ALIGN TO “ADAPT-OME” USING EXONERATE
LIBRARY QC – YIELD
COUNT NUMBER OF READS (SINGLE-END ONLY)
DISPLAY NUMBER ON A PRE-DEFINED SCALE
DISPLAY LANES IN FLOWCELL CONFIGURATION
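The sampling, trimming and FASTA conversion steps listed above can be sketched in a few lines. This is not the MGA code: the function names, the reservoir-sampling approach and the fixed seed are all assumptions, and the alignment steps (Bowtie, Exonerate) are left out.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple

Record = Tuple[str, str]  # (read name, sequence)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(it).strip()
        next(it)  # '+' separator line
        next(it)  # quality line, not needed here
        yield header.strip()[1:], seq


def sample_reads(records: Iterable[Record], n: int, seed: int = 0) -> List[Record]:
    """Reservoir-sample up to n records without loading the whole file."""
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = rec
    return reservoir


def to_fasta(records: Iterable[Record], trim_len: Optional[int] = None) -> str:
    """Write records as FASTA, optionally trimming each read to trim_len bases."""
    return "".join(f">{name}\n{seq[:trim_len]}\n" for name, seq in records)
```

For the contaminant screen this might be called as `to_fasta(sample_reads(read_fastq(handle), 100_000), trim_len=36)`, where `handle` is an open FASTQ file; for the adapter screen the same call without `trim_len` keeps full-length reads.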

The Genomics Core sequencing services
James Hadfield NEB March 2016

– This Tweet is 6 hours old
– There are 13 samples in the queue
– It will take about 1 week to sequence your sample
– There is 1x paired-end 125bp sample in the queue
– This is driven by our Genologics LIMS
– Sequencing is on our Illumina sequencers

Service metrics Jan 2016
– TAT has been 2-3 weeks (often as little as 1 week)
– Most sequencing works very well, but…

A genomic case report
NFKBIA S32G
SIFT: deleterious(0)
PolyPhen: probably_damaging(0.979)
