AGBT 2016 Workshop Magrini

MGI Reference Genomes Workshop
Vince Magrini
February 10th 2016

Sequencing Plan
• PacBio Large Insert Library Construction
• Linked Reads with 10X Genomics
• Physical Map contiguity using BioNano IRYS

The NA19240 Large Insert Library Experience
[Figure: ROI length (bp) per SMRT Cell (cells 1-75) for libraries Lib2, Lib3, Lib4, Lib5, Lib6, Lib7, Lib8, and LibF; y-axis 10,000 to 15,000 bp]

Considerations for PacBio WGS
• High molecular weight genomic DNA
    • DNA must be of sufficient quality to allow for >30 kb shearing to produce PacBio Continuous Long Reads (CLR)
• Consistent shearing >30 kb
    • Shearing genomic DNA >30 kb is challenging and requires a consistent technology
    • Preferred method: Diagenode Megaruptor
    • Alternate method: Covaris g-Tube
• Sufficient DNA for PacBio sample prep
    • A single PacBio sample prep reaction requires 5 μg sheared DNA
    • One library is composed of 8-10 sample prep reactions
    • At least 2-4 libraries are required for 60x coverage
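The DNA requirement implied by the bullets above can be worked out as a simple product. The following is a minimal sketch using only the figures on the slide (5 μg per reaction, 8-10 reactions per library, 2-4 libraries); the function name is hypothetical, and the result covers sheared DNA only, not losses during extraction or shearing.

```python
# Minimal sketch: total sheared DNA implied by the slide's figures.
# The function name is hypothetical; the numbers come from the bullets above.

UG_PER_REACTION = 5              # ug sheared DNA per PacBio sample prep reaction
REACTIONS_PER_LIBRARY = (8, 10)  # low and high end of reactions per library
LIBRARIES_FOR_60X = (2, 4)       # low and high end of libraries for 60x coverage


def sheared_dna_range_ug(ug_per_reaction, reactions, libraries):
    """Return (low, high) total sheared DNA in micrograms."""
    low = ug_per_reaction * reactions[0] * libraries[0]
    high = ug_per_reaction * reactions[1] * libraries[1]
    return low, high


low_ug, high_ug = sheared_dna_range_ug(
    UG_PER_REACTION, REACTIONS_PER_LIBRARY, LIBRARIES_FOR_60X
)
print(f"Sheared DNA needed for 60x: {low_ug}-{high_ug} ug")
```

Under these figures, a 60x project needs roughly 80 to 200 μg of sheared DNA per sample.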

NA19240 Sheared DNA Comparison
Library   Shear Type                 Shear Settings
2         g-Tube                     5500 rpm
3         g-Tube                     4800 rpm
4         g-Tube                     4800 rpm
5         g-Tube                     4500 rpm
6         MegaRuptor (Menlo Park)    30 kb
7         MegaRuptor (Menlo Park)    30 kb
8         MegaRuptor (MGI)           30 kb
[Figure: sheared DNA comparison; samples labeled 30kb MGI, 30kb MP, G-Tube 4800 (✜), G-Tube 4500 (✪)]

PacBio Workflow
Workflow steps: DNA Shear, DNA Repair, Ligation/Exonuclease, BluePippin >18 kb Sizing, DNA Repair, Seq. Primer Anneal, P6 Polymerase Bind, MagBead Bind, Sequencing
Cleanups between steps: AMPure PB, AMPure PB, 3x AMPure PB, Rinse wells, AMPure PB, AMPure PB
Step notes: 30 minutes or 4 hours; 20 minutes to 2 hours; Denature primer prior to use; 4 to 6 hour collection time
• Adding DNA Damage Repair after BluePippin sizing increased the average Reads of Insert length by ~1 kb.
• Extending the P6 Polymerase Binding time from 30 minutes to 4 hours improved library complex loading per SMRT cell.

Standard PacBio protocol (sample prep & complex)
[Figure: NA19240 Library 4, per SMRT Cell ROI length (bp) and ROI yield (Mbases), SMRT Cells 1-77; plot annotations: Titration, 6 hour movies, 4 hour movies]
• G-Tube 4800 (✜)
• No Post-BluePippin DNA Damage Repair
• 30 min P6 polymerase bind
• 125 pM “on plate” loading concentration

DNA Damage Repair & extended P6 bind
[Figure: per SMRT Cell average ROI (bp) and ROI yield (Mbases), SMRT Cells 1-48]
• G-Tube 4500 (✪)
• First condition: No Post-BluePippin DNA Damage Repair, 30 min P6 polymerase bind
• Second condition: Post-BluePippin DNA Damage Repair, 4 hour P6 polymerase bind

Menlo Park 30 kb MegaRuptor
[Figure: per SMRT Cell average ROI (bp) and ROI yield (Mbases), SMRT Cells 1-40]
• 30kb MP
• Post-BluePippin DNA Damage Repair
• 4 hour or 30 minute P6 polymerase bind
• Plot annotations, in order: Titration; 4 hr P6 bind, 8Pac lot # 231848; 30 min P6 bind, 8Pac lot # 231848; 4 hr P6 bind, 8Pac lot # 231818; 4 hr P6 bind, 8Pac lot # 231848; 4 hr P6 bind, 8Pac lot # 231818

MGI 30 kb MegaRuptor
[Figure: per SMRT Cell average ROI (bp) and ROI yield (Mbases), SMRT Cells 1-24; plot annotations: Titration, Failed cell]
• 30kb MGI
• 125 pM “on plate” loading concentration
• Clear cell-to-cell variability

PacBio NA19240 Sequencing Statistics
Sample     8 Packs   Reads        Mbp (Pol)   RL (Pol)   Mbp (ROI)   RL (ROI)   ROI Mbp/Cell
NA19240    37        16,088,050   214,621     13,605     195,619     12,487     661
HG00733    30        15,858,313   209,619     13,193     190,430     11,958     793
HG00514    40        20,707,629   311,500     13,473     277,690     13,473     868
NA12878*   22        11,029,811   165,153     14,949     146,833     13,174     962
Assembly Stats will be highlighted in Tina’s presentation.
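The ROI Mbp/Cell column can be reproduced from the other columns. The sketch below assumes that one 8 Pack holds 8 SMRT Cells, and it assumes a haploid human genome size of about 3,100 Mbp to estimate fold coverage; neither value is stated on the slide, and the function names are hypothetical. The asterisked NA12878 row is left out because the meaning of its asterisk is not given in the source.

```python
# Minimal sketch: derive per-cell ROI yield and approximate fold coverage
# from the table values. Both constants below are assumptions.

CELLS_PER_8PAC = 8          # assumption: an 8 Pack contains 8 SMRT Cells
HUMAN_GENOME_MBP = 3_100    # assumption: approximate haploid human genome size

# (number of 8 Packs, ROI yield in Mbp), copied from the table
samples = {
    "NA19240": (37, 195_619),
    "HG00733": (30, 190_430),
    "HG00514": (40, 277_690),
}


def roi_mbp_per_cell(packs, roi_mbp):
    """ROI yield divided by the number of SMRT Cells run."""
    return roi_mbp / (packs * CELLS_PER_8PAC)


def fold_coverage(roi_mbp, genome_mbp=HUMAN_GENOME_MBP):
    """Approximate ROI fold coverage of the genome."""
    return roi_mbp / genome_mbp


summary = {
    name: (round(roi_mbp_per_cell(packs, roi)), round(fold_coverage(roi), 1))
    for name, (packs, roi) in samples.items()
}

for name, (per_cell, coverage) in summary.items():
    print(f"{name}: {per_cell} ROI Mbp/cell, ~{coverage}x ROI coverage")
```

Under these assumptions the per-cell values match the table (661, 793, 868), and the ROI coverage estimates are at or above the 60x target.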

PacBio Sequencing Observations
HG00514: 4 h vs. 6 h movie lengths
Instrument   Movie Time (min)   Avg. ROI (bp)   ROI Mb/Cell   # Cells
00116        240                13,502          803           119
42274        240                13,036          881           95
00116        360                14,324          998           56
42274        360                13,282          1,063         24
• DNA Input and Sizing
    • The library DNA >18 kb is fractionated using the Sage Science BluePippin
    • DNA Damage Repair enzyme mix used post-BluePippin (increased read length)
• Chemistry
    • (+) DNA Damage Repair/4 hr bind: 970.2 Mbases/cell
• Instruments
    • Longer average reads
    • Increased loading efficiency
• What about long term storage?
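The 4 h versus 6 h comparison in the HG00514 table can be summarized by pooling both instruments with a cell-weighted mean. This is a minimal sketch using only the table values; the function name is hypothetical, and weighting by cell count is one reasonable choice, not necessarily how the presenters summarized the data.

```python
# Minimal sketch: cell-weighted mean ROI yield per movie length,
# pooling both instruments. Rows are copied from the table above.

# (instrument, movie time in minutes, ROI Mb/cell, number of cells)
runs = [
    ("00116", 240, 803, 119),
    ("42274", 240, 881, 95),
    ("00116", 360, 998, 56),
    ("42274", 360, 1063, 24),
]


def weighted_yield(rows, movie_minutes):
    """Mean ROI Mb/cell for one movie length, weighted by cell count."""
    selected = [(mb, cells) for _, minutes, mb, cells in rows
                if minutes == movie_minutes]
    total_cells = sum(cells for _, cells in selected)
    return sum(mb * cells for mb, cells in selected) / total_cells


yield_4h = weighted_yield(runs, 240)
yield_6h = weighted_yield(runs, 360)
gain = yield_6h / yield_4h - 1

print(f"4 h movies: {yield_4h:.1f} ROI Mb/cell")
print(f"6 h movies: {yield_6h:.1f} ROI Mb/cell")
print(f"Relative gain with 6 h movies: {gain:.1%}")
```

Pooled this way, the 6 h movies yield roughly one fifth more ROI bases per cell than the 4 h movies.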

10X Genomics Overview
Reconfigured Oligo
- Uses inline index sequence
- No P5 index – HiSeq X single index compatible

10X Chromium Workflow
WGX Beta Product Workflow
Stages: gDNA Extraction, GEM Formation (Pre-GEMs, Post-GEMs), Library Prep, NGS, Long Ranger, Loupe
• gDNA Extraction: Qiagen MagAttract HMW Kit; Qubit quantifications; Dilute to 1 ng/ul; gDNA egram
• GEM Formation: Aliquot Master Mix; NaOH template denature; Load WGX Chip; Run instrument; Chip volume assessment; Instrument log; Isothermal incubation; Emulsion breaking; Bioanalyzer
• Library Prep: End Repair/A-tailing; Adapter ligation; SI PCR; Bioanalyzer; KAPA qPCR
• NGS: HiSeq X or HiSeq 4000; 2x150bp sequencing; 200pmol loading
• Long Ranger: Demultiplexing; Alignment; De-duplication; SNP and indel calling; Large SV calling; Phasing
• Loupe: Visualization

Chromium NA19240 Library Sequencing Statistics
• HiSeq 4000
• 2x150, 200 pmol loading
• 2 lanes
[Figure: Post-GEM isothermal amplification size distribution]
[Figure: Library size distribution]
The spike at 0 in that graph is due to the N's in the reference assembly.

Chromium Molecule and Phasing Statistics
                            NA19240 (MGI)       NA12878 (10X)
Molecule Length (bp)        26,768 (±33,673)    94,923 (±64,103)
DNA in Molecules > 10kb     50.85%              95.0%
DNA in Molecules > 100kb    1.38%               36.4%
SNPs Phased                 99.1%               97.8%
Longest Phase Block         9.6 Mbp             34.7 Mbp
N50 Phase Block             1.9 Mbp             9.5 Mbp
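For readers unfamiliar with the N50 statistic reported for phase blocks (and later for the BioNano consensus genome maps), here is a minimal sketch of the usual definition: the length L such that blocks of length L or longer contain at least half of the total length. The block lengths in the example are made-up illustrative values, not data from this workshop, and the function name is hypothetical.

```python
# Minimal sketch of the standard N50 definition.
# The example lengths are hypothetical and only illustrate the calculation.

def n50(lengths):
    """Return the N50 of a collection of positive lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest block down until half the total length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


example_blocks_mbp = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical block lengths
example_n50 = n50(example_blocks_mbp)
print(f"N50 of example blocks: {example_n50} Mbp")
```

Because N50 is weighted toward long blocks, a handful of very long phase blocks can raise it substantially even when most blocks are short.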

BioNano Overview
Sample prep steps: Harvest Cells, Dissociate Tissue; Embed Cells in Gel Plugs; Lyse Cells, Digest Protein; Melt and Digest Agarose Plugs; Sample Cleanup; Labeling Reaction

BioNano: Yoruban Trio Statistics
[Figure: Molecules/Bin (%) by Molecule Length Bin (10-500kb, 100-500kb, 150-500kb, 200-500kb, 250-500kb, >500kb) for 19240, 19239, and 19238; y-axis 0% to 100%]
                                     19240        19239        19238
Mapped Molecule Quantity (Mb)        189,138.79   256,281.33   226,854.88
Mapped Avg Size (Kb)                 232          280          289
Avg Label Density (per 100 Kb)       9.6          8.7          8.8
Number of Consensus Genome Maps      3051         2565         2798
Consensus Genome Maps Size (Mb)      2833.045     2965.972     2933.294
Consensus Genome Maps N50 (Mb)       1.276        1.685        1.477
Avg Depth of Mol Coverage            59.1         56.1         50.6

Sequencing Plan
[Figure: PacBio Assembly Contig and BioNano Genome Map Contigs]
• Add 10X Linked Read information
• Add Dovetail Hi-C/Chicago Data

Summary
• Goal: Generate robust data sets for additional high-quality reference genomes that better represent the full range of genetic diversity in humans.
• These long read (long range) sequencing/mapping applications vary in approach and will provide synergistic data sets to help accomplish our goal.
• Each system possesses unique challenges and requires optimization of protocols and running conditions specific to our needs.
• Experience and communication are key.
• Increasing applications and utility
    • Polymerase read = read of insert
    • BAC Pooling
    • Low input SNV
    • Multicolor labeling

Acknowledgements
The McDonnell Genome Institute at
Washington University in St. Louis
Rick Wilson
Sean McGrath
Amy Ly
Ryan Demeter
Dave Larson
Karyn Meltz Steinberg
Tina Graves
Bob Fulton
Derek Albracht
Milinn Kremitzki
Susan Rock
Debbie Scheer
Wes Warren
Chad Tomlinson
10X Genomics
Cassandra Jabara
Michael Schnall-Levin
Drew Kebbel
Rob Tarbox
Deanna Church
BioNano Genomics
Andrew Anfora
Palak Sheth
Alex Hastie
Pacific Biosciences
Paul Peluso
Nick Sisneros
