MGI Reference Genomes
Workshop
Vince Magrini
February 10th 2016
Sequencing Plan
• PacBio Large Insert Library Construction
• Linked Reads with 10X Genomics
• Physical Map contiguity using BioNano IRYS
Pacific Biosciences
The NA19240 Large Insert Library Experience
[Figure: ROI length (bp), axis 10,000 to 15,000, per SMRT Cell (1 to 75). Series: Lib2, Lib3, Lib4, Lib5, Lib6, Lib7, Lib8, and LibF ROI length.]
Considerations for PacBio WGS
• High molecular weight genomic DNA
    • DNA must be of sufficient quality to allow for >30 kb shearing to produce PacBio Continuous Long Reads (CLR)
• Consistent shearing >30 kb
    • Shearing genomic DNA >30 kb is challenging and requires a consistent technology
    • Preferred method: Diagenode Megaruptor
    • Alternate method: Covaris g-Tube
• Sufficient DNA for PacBio sample prep
    • A single PacBio sample prep reaction requires 5 μg of sheared DNA
    • One library is composed of 8-10 sample prep reactions
    • At least 2-4 libraries are required for 60x coverage
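The input requirement implied by these figures can be worked out directly. The sketch below is only an illustration of that arithmetic; the constant and function names are made up for this example, and the numbers are the ones listed on the slide.

```python
# Figures taken from the slide; the names are illustrative only.
UG_PER_REACTION = 5              # µg of sheared DNA per sample prep reaction
REACTIONS_PER_LIBRARY = (8, 10)  # low and high end of the stated range
LIBRARIES_FOR_60X = (2, 4)       # low and high end of the stated range


def sheared_dna_range_ug():
    """Return the (low, high) total sheared DNA in µg needed for ~60x."""
    low = UG_PER_REACTION * REACTIONS_PER_LIBRARY[0] * LIBRARIES_FOR_60X[0]
    high = UG_PER_REACTION * REACTIONS_PER_LIBRARY[1] * LIBRARIES_FOR_60X[1]
    return low, high


low, high = sheared_dna_range_ug()
print(f"Sheared DNA for ~60x coverage: {low} to {high} µg")
```

Under these figures, a 60x data set needs roughly 80 to 200 μg of sheared DNA, before accounting for any losses during shearing and cleanup.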
NA19240 Sheared DNA Comparison
Library  Shear Type  Shear Settings
2        g-Tube      5500 rpm
3        g-Tube      4800 rpm
4        g-Tube      4800 rpm
5        g-Tube      4500 rpm
6        MegaRuptor  Menlo Park 30 kb
7        MegaRuptor  Menlo Park 30 kb
8        MegaRuptor  MGI 30 kb
[Figure: sheared DNA comparison, labeled 30kb MGI, 30kb MP, G-Tube 4800 (✜), G-Tube 4500 (✪)]
PacBio Workflow
DNA Shear
DNA Repair
Ligation/Exonuclease
BluePippin >18 kb Sizing
DNA Repair
AMPure PB
AMPure PB
3x AMPure PB
Rinse wells
AMPure PB
AMPure PB
Seq. Primer Anneal (denature primer prior to use)
P6 Polymerase Bind (30 minutes or 4 hours)
MagBead Bind (20 minutes to 2 hours)
Sequencing (4 to 6 hour collection time)
• Adding DNA Damage Repair after BluePippin sizing increased the average Reads of Insert length by ~1 kb.
• Extending the P6 Polymerase Binding time from 30 minutes to 4 hours improved library complex loading per SMRT cell.
Standard PacBio protocol (sample prep & complex)
[Figure: NA19240 Library 4 (G-Tube 4800 ✜), per SMRT Cell ROI length/yield. Axes: SMRT Cell (1 to 77); Average ROI (bp), 8,000 to 15,000; ROI Yield (Mbases), 0 to 1,400. Series: ROI length (bp), ROI Yield (Mbases). Annotations: Titration; 6 hour movies; 4 hour movies; 125 pM “on plate” loading concentration.]
• No Post-BluePippin DNA Damage Repair
• 30 min P6 polymerase bind
DNA Damage Repair & extended P6 bind
[Figure: NA19240 Library 5 (G-Tube 4500 ✪), per SMRT Cell ROI length/yield. Axes: SMRT Cell (1 to 48); Average ROI (bp), 8,000 to 15,000; ROI Yield (Mbases), 0 to 1,400. Series: ROI length (bp), ROI Yield (Mbases).]
Conditions marked on the plot:
• No Post-BluePippin DNA Damage Repair, 30 min P6 polymerase bind
• Post-BluePippin DNA Damage Repair, 4 hour P6 polymerase bind
Menlo Park 30 kb MegaRuptor
[Figure: NA19240 Library 7 (30kb MP), per SMRT Cell ROI length/yield. Axes: SMRT Cell (1 to 40); Average ROI (bp), 8,000 to 15,000; ROI Yield (Mbases), 0 to 1,400. Series: ROI length (bp), ROI Yield (Mbases). Annotations, in order: Titration; 4 hr P6 bind, 8Pac lot # 231848; 30 min P6 bind, 8Pac lot # 231848; 4 hr P6 bind, 8Pac lot # 231818; 4 hr P6 bind, 8Pac lot # 231848; 4 hr P6 bind, 8Pac lot # 231818.]
• Post-BluePippin DNA Damage Repair
• 4 hour or 30 minute P6 polymerase bind
MGI 30 kb MegaRuptor
[Figure: NA19240 Library 8 (30kb MGI), per SMRT Cell ROI length/yield. Axes: SMRT Cell (1 to 24); Average ROI (bp), 8,000 to 15,000; ROI Yield (Mbases), 0 to 1,400. Series: ROI length (bp), ROI Yield (Mbases). Annotations: Titration; 125 pM “on plate” loading concentration; Failed cell.]
• Clear cell-to-cell variability
PacBio NA19240 Sequencing Statistics
Sample    8 Packs  Reads       Mbp (Pol)  RL (Pol)  Mbp (ROI)  RL (ROI)  ROI Mbp/Cell
NA19240   37       16,088,050  214,621    13,605    195,619    12,487    661
HG00733   30       15,858,313  209,619    13,193    190,430    11,958    793
HG00514   40       20,707,629  311,500    13,473    277,690    13,473    868
NA12878*  22       11,029,811  165,153    14,949    146,833    13,174    962
Assembly Stats will be highlighted in Tina’s presentation.
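As a rough cross-check of the table, per-cell yield and fold coverage can be derived from the ROI totals. The sketch below assumes that one "8 Pack" holds 8 SMRT Cells and that the genome is about 3,100 Mbp; neither value is stated on the slide, and the function names are hypothetical. The asterisked NA12878 row is left out because its footnote is not available here.

```python
CELLS_PER_8PAC = 8       # assumption: one "8 Pack" holds 8 SMRT Cells
GENOME_SIZE_MBP = 3100   # assumption: approximate human genome size in Mbp

# (sample, 8 Packs, Mbp (ROI)) copied from the statistics table
runs = [
    ("NA19240", 37, 195_619),
    ("HG00733", 30, 190_430),
    ("HG00514", 40, 277_690),
]


def roi_mbp_per_cell(packs, roi_mbp):
    """ROI yield per SMRT Cell, in Mbp."""
    return roi_mbp / (packs * CELLS_PER_8PAC)


def fold_coverage(roi_mbp):
    """Approximate fold coverage from total ROI yield."""
    return roi_mbp / GENOME_SIZE_MBP


for sample, packs, roi_mbp in runs:
    per_cell = round(roi_mbp_per_cell(packs, roi_mbp))
    coverage = round(fold_coverage(roi_mbp), 1)
    print(f"{sample}: {per_cell} Mbp/cell, ~{coverage}x")
```

With these assumptions, the per-cell figures reproduce the ROI Mbp/Cell column for the three samples, and the coverage estimates fall between 60x and 100x.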
PacBio Sequencing Observations
HG00514: 4h v 6h movie lengths
Instrument  Movie Time (min)  Avg. ROI (bp)  ROI Mb/Cell  # Cells
00116       240               13,502         803          119
42274       240               13,036         881          95
00116       360               14,324         998          56
42274       360               13,282         1,063        24
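One way to summarize the 4 hour versus 6 hour comparison is a cell-weighted mean yield per movie length, pooled across both instruments. This is only a sketch of that arithmetic using the table values; the function name is hypothetical, and pooling the two instruments is a simplification rather than the analysis used in the talk.

```python
# (instrument, movie minutes, ROI Mb/cell, number of cells) from the table
rows = [
    ("00116", 240, 803, 119),
    ("42274", 240, 881, 95),
    ("00116", 360, 998, 56),
    ("42274", 360, 1063, 24),
]


def weighted_yield(rows, movie_minutes):
    """Cell-weighted mean ROI Mb/cell for one movie length."""
    selected = [(y, n) for _, m, y, n in rows if m == movie_minutes]
    total_cells = sum(n for _, n in selected)
    return sum(y * n for y, n in selected) / total_cells


four_h = weighted_yield(rows, 240)
six_h = weighted_yield(rows, 360)
gain = six_h / four_h - 1
print(f"4 h: {four_h:.0f} Mb/cell, 6 h: {six_h:.0f} Mb/cell, gain: {gain:.0%}")
```

Weighting by cell count keeps the instrument with more cells from being under-represented in the pooled figure.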
• DNA Input and Sizing
• The library DNA >18 kb is fractionated using the Sage Science BluePippin
• DNA Damage Repair enzyme mix used post BluePippin (increased read length)
• Chemistry
• (+) DNA Damage Repair/4 hr bind: 970.2 Mbases/cell
• Instruments
• Longer average Reads
• Increased Loading Efficiency
• What about long term storage?
10X Genomics
Reconfigured Oligo
- Uses inline index sequence
- No P5 index – HiSeq X single index compatible
10X Genomics Overview
10X Chromium Workflow
WGX Beta Product Workflow
gDNA Extraction: Qiagen MagAttract HMW Kit; Qubit quantifications; Dilute to 1 ng/ul; gDNA egram
Pre-GEMs: Aliquot Master Mix; NaOH template denature; Load WGX Chip
GEM Formation: Run instrument; Chip volume assessment; Instrument log
Post-GEMs: Isothermal incubation; Emulsion breaking; Bioanalyzer
Library Prep: End Repair/A-tailing; Adapter ligation; SI PCR; Bioanalyzer; KAPA qPCR
NGS: HiSeq X or HiSeq 4000; 2x150bp sequencing; 200pmol loading
Long Ranger: Demultiplexing; Alignment; De-duplication; SNP and indel calling; Large SV calling; Phasing
Loupe: Visualization
Chromium NA19240 Library Sequencing Statistics
• HiSeq 4000
• 2x150, 200 pmol loading
• 2 lanes
[Figure panels: Post GEM: Isothermal Amp size dist.; Library Size Distribution]
The spike at 0 in that graph is due to the N's in the reference assembly.
Chromium Molecule and Phasing Statistics
                          NA19240 (MGI)     NA12878 (10X)
Molecule Length (kb)      26,768 (±33,673)  94,923 (±64,103)
DNA in Molecules > 10kb   50.85%            95.0%
DNA in Molecules > 100kb  1.38%             36.4%
SNPs Phased               99.1%             97.8%
Longest Phase Block       9.6 Mbp           34.7 Mbp
N50 Phase Block           1.9 Mbp           9.5 Mbp
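For readers unfamiliar with the N50 statistic reported for phase blocks, it is the length L such that blocks of length L or longer contain at least half of the total length. The sketch below shows the usual calculation; the block lengths are hypothetical values chosen for illustration, not data from this project, and this is not necessarily how Long Ranger computes it internally.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths (same unit in and out)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half is the N50.
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Hypothetical phase block lengths in Mbp, for illustration only
blocks_mbp = [5, 4, 3, 2, 2, 1, 1]
print(n50(blocks_mbp))
```

The same function applies to any length distribution, including the consensus genome map N50 values reported for the BioNano data.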
BioNano
BioNano Overview
Harvest Cells
Dissociate Tissue
Embed Cells in Gel Plugs
Lyse Cells, Digest Protein
Melt and Digest Agarose Plugs
Sample Cleanup
Labeling Reaction
BioNano: Yoruban Trio Statistics
[Figure: Molecules/Bin (%), 0% to 100%, by Molecule Length Bin (10-500kb, 100-500kb, 150-500kb, 200-500kb, 250-500kb, >500kb) for samples 19240, 19239, and 19238.]
                                  19240       19239       19238
Mapped Molecule Quantity (Mb)     189,138.79  256,281.33  226,854.88
Mapped Avg Size (Kb)              232         280         289
Avg Label Density (per 100 Kb)    9.6         8.7         8.8
Number of Consensus Genome Maps   3051        2565        2798
Consensus Genome Maps Size (Mb)   2833.045    2965.972    2933.294
Consensus Genome Maps N50 (Mb)    1.276       1.685       1.477
Avg Depth of Mol Coverage         59.1        56.1        50.6
Sequencing Plan
[Figure labels: PacBio Assembly Contig; BioNano Genome Map Contigs]
Add 10X Linked Read information
Add Dovetail Hi-C/chiCago Data
Summary
• Goal: Generate robust data sets for additional high-quality reference genomes that better represent the full range of genetic diversity in humans.
• These long read (long range) sequencing/mapping applications vary in approach and will provide synergistic data sets to help accomplish our goal.
• Each system possesses unique challenges and requires optimization of protocols and running conditions specific to our needs.
• Experience and communication are key.
• Increasing applications and utility
    • Polymerase read = read of insert
    • BAC Pooling
    • Low input SNV
    • Multicolor labeling
Acknowledgements
The McDonnell Genome Institute at
Washington University in St. Louis
Rick Wilson
Sean McGrath
Amy Ly
Ryan Demeter
Dave Larson
Karyn Meltz Steinberg
Tina Graves
Bob Fulton
Derek Albracht
Milinn Kremitzki
Susan Rock
Debbie Scheer
Wes Warren
Chad Tomlinson
10X Genomics
Cassandra Jabara
Michael Schnall-Levin
Drew Kebbel
Rob Tarbox
Deanna Church
BioNano Genomics
Andrew Anfora
Palak Sheth
Alex Hastie
Pacific Biosciences
Paul Peluso
Nick Sisneros

AGBT 2016 Workshop Magrini

  • 1. MGI Reference Genomes Workshop Vince Magrini February 10th 2016
  • 2. Sequencing Plan • PacBio Large Insert Library Construction • Linked Reads with 10X Genomics • Physical Map contiguity using BioNano IRYS
  • 4. The NA19240 Large Insert Library Experience 10,000 10,500 11,000 11,500 12,000 12,500 13,000 13,500 14,000 14,500 15,000 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 ROILength(bp) SMRT Cell ROI Length Lib4-ROI length Lib2-ROI length Lib3-ROI length Lib5-ROI length Lib6-ROI length Lib7-ROI length LibF-ROI length Lib8-ROI length
  • 5. Considerations for PacBio WGS • High molecular weight genomic DNA • DNA must be of sufficient quality to allow for >30 kb shearing to produce PacBio Continuous Long Reads (CLR) • Consistent shearing >30 kb • Shearing genomic DNA >30 kb is challenging and requires a consistent technology • Preferred method: Diagenode Megaruptor • Alternate method: Covaris g-Tube • Sufficient DNA for PacBio sample prep • A single PacBio sample prep reaction requires 5 μg sheared DNA • One library is composed of 8-10 sample prep reactions • At least 2-4 libraries are required for 60x coverage
  • 6. NA19240 Sheared DNA Comparison Library Shear Type Shear Settings 2 g-Tube 5500 rpm 3 g-Tube 4800 rpm 4 g-Tube 4800 rpm 5 g-Tube 4500 rpm 6 MegaRuptor Menlo Park 30 kb 7 MegaRuptor Menlo Park 30 kb 8 MegaRuptor MGI 30 kb 30kb MGI 30kb MP G-Tube 4800 G-Tube 4500   ✜ ✪
  • 7. PacBio Workflow DNA Shear DNA Repair Ligation/Exonuclease BluePippin >18kb Sizing DNA Repair AMPure PB AMPure PB 3x AMPure PB Rinse wells AMPure PB AMPure PB Seq. Primer Anneal P6 Polymerase Bind MagBead Bind Sequencing 30 minutes or 4 hours 20 minutes to 2 hours Denature primer prior to use 4 to 6 hour collection time • Adding DNA Damage Repair after BluePippin sizing increased the average Reads of Insert length by ~1 kb. • Extending the P6 Polymerase Binding time from 30 minutes to 4 hours improved library complex loading per SMRT cell
  • 8. Standard PacBio protocol (sample prep & complex) 0.0 200.0 400.0 600.0 800.0 1,000.0 1,200.0 1,400.0 8,000 9,000 10,000 11,000 12,000 13,000 14,000 15,000 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 ROIYield(Mbases) AverageROI(bp) SMRT Cell NA19240 Library 4 - Per SMRT Cell ROI length/yield ROI length (bp) ROI Yield (Mbases) Titration • No Post-BluePippin DNA Damage Repair • 30 min P6 polymerase bind 6 hour Movies 4 hour Movies 125 pM “on plate” loading concentration G-Tube 4800✜
  • 9. DNA Damage Repair & extended P6 bind 0.0 200.0 400.0 600.0 800.0 1,000.0 1,200.0 1,400.0 8,000 9,000 10,000 11,000 12,000 13,000 14,000 15,000 1 2 3 4 5 6 7 8 9 101112131415161718192021222324252627282930313233343536373839404142434445464748 ROIYield(Mbases) AverageROI(bp) SMRT Cell NA19240 Library 5 - Per SMRT Cell ROI length/yield ROI length (bp) ROI Yield (Mbases) • No Post-BluePippin DNA Damage Repair • 30 min P6 polymerase bind • Post-BluePippin DNA Damage Repair • 4 hour P6 polymerase bind G-Tube 4500✪
  • 10. Menlo Park 30 kb MegaRuptor 0.0 200.0 400.0 600.0 800.0 1,000.0 1,200.0 1,400.0 8,000 9,000 10,000 11,000 12,000 13,000 14,000 15,000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 ROIYield(Mbases) AverageROI(bp) SMRT Cell NA19240 Library 7 - Per SMRT Cell ROI length/yield ROI length (bp) ROI Yield (Mbases) Titration 4hrP6bind 8Paclot# 231848 30minP6bind 8Paclot# 231848 4 hr P6 bind 8Pac lot # 231818 4 hr P6 bind 8Pac lot # 231848 4 hr P6 bind 8Pac lot # 231818 • Post-BluePippin DNA Damage Repair • 4 hour or 30 minute P6 polymerase bind 30kb MP
  • 11. MGI 30 kb MegaRuptor 0.0 200.0 400.0 600.0 800.0 1,000.0 1,200.0 1,400.0 8,000 9,000 10,000 11,000 12,000 13,000 14,000 15,000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ROIYield(Mbases) AverageROI(bp) SMRT Cell NA19240 Library 8 - Per SMRT Cell ROI length/yield ROI length (bp) ROI Yield (Mbases) Titration 125 pM “on plate” loading concentration Clear cell-to-cell variability Failed cell 30kb MGI
  • 12. PacBio NA19240 Sequencing Statistics Sample 8 Packs Reads Mbp (Pol) RL Mbp(ROI) RL ROI Mbp/Cell NA19240 37 16,088,050 214,621 13,605 195,619 12,487 661 HG00733 30 15,858,313 209,619 13,193 190,430 11,958 793 HG00514 40 20,707,629 311,500 13,473 277,690 13,473 868 NA12878* 22 11,029,811 165,153 14,949 146,833 13,174 962 Assembly Stats will be highlighted in Tina’s presentation.
  • 13. PacBio Sequencing Observations HG00514: 4h v 6h movie lengths Instrument Movie Time Avg. ROI (bp) ROI Mb/Cell # Cells 00116 240 13,502 803 119 42274 240 13,036 881 95 00116 360 14,324 998 56 42274 360 13,282 1,063 24 • DNA Input and Sizing • The library DNA >18 kb is fractionated using the Sage Science BluePippin • DNA Damage Repair enzyme mix used post BluePippin (increased read length) • Chemistry • (+) DNA Damage Repair/4 hr bind: 970.2 Mbases/cell • Instruments • Longer average Reads • Increased Loading Efficiency • What about long term storage?
  • 15. Reconfigured Oligo - Uses inline index sequence - No P5 index – HiSeq X single index compatible 10X Genomics Overview
  • 16. 10X Chromium Workflow WGX Beta Product Workflow gDNA Extraction GEM Formation Library Prep Long Ranger Pre-GEMs Post- GEMs NGS Loupe Qiagen MagAttract HMW Kit Qubit quantifications Dilute to 1 ng/ul gDNA egram Aliquot Master Mix NaOH template denature Load WGX Chip Run instrument Chip volume assessment Instrument log Isothermal incubation Emulsion breaking Bioanalyzer HiSeq X or HiSeq 4000 2x150bp sequencing 200pmol loading End Repair/A-tailing Adapter ligation SI PCR Bioanalyzer KAPA qPCR Visualization Demultiplexing Alignment De-duplication SNP and indel calling Large SV calling Phasing
  • 17. • HiSeq 4000 • 2x150, 200 pmol loading • 2 lanes Chromium NA19240 Library Sequencing Statistics Post Gem: Isothermal Amp size dist. Library Size Distribution The spike at 0 in that graph is due to the N's in the reference assembly.
  • 18. NA19240 (MGI) NA12878 (10X) Molecule Length (kb): 26,768 (±33,673) 94,923 (±64,103) DNA in Molecules > 10kb 50.85 % 95.0% DNA in Molecules > 100kb 1.38% 36.4% SNPs Phased: 99.1% 97.8% Longest Phase Block: 9.6 Mbp 34.7 Mbp N50 Phase Block: 1.9 Mbp 9.5 Mbp Chromium Molecule and Phasing Statistics
  • 20. BioNano Overview: Harvest Cells, Dissociate Tissue, Embed Cells in Gel Plugs, Lyse Cells, Digest Protein, Melt and Digest Agarose Plugs, Sample Cleanup, Labeling Reaction
  • 21. BioNano: Yoruban Trio Statistics
      [Figure: Molecules/Bin (%) by Molecule Length Bin (10-500kb, 100-500kb, 150-500kb, 200-500kb, 250-500kb, >500kb) for 19240, 19239, and 19238; y-axis 0% to 100%]
      Metric                               19240        19239        19238
      Mapped Molecule Quantity (Mb)        189,138.79   256,281.33   226,854.88
      Mapped Avg Size (Kb)                 232          280          289
      Avg Label Density (per 100 Kb)       9.6          8.7          8.8
      Number of Consensus Genome Maps      3051         2565         2798
      Consensus Genome Maps Size (Mb)      2833.045     2965.972     2933.294
      Consensus Genome Maps N50 (Mb)       1.276        1.685        1.477
      Avg Depth of Mol Coverage            59.1         56.1         50.6
  • 22. PacBio Assembly Contig BioNano Genome Map Contigs Sequencing Plan Add 10X Linked Read information Add Dovetail Hi-C/chiCago Data
  • 23. Summary • Goal: Generate robust data sets for additional high-quality reference genomes that better represent the full range of genetic diversity in humans • These long read (long range) sequencing/mapping applications vary in approach and will provide synergistic data sets to help accomplish our goal. • Each system possesses unique challenges and requires optimization of protocols and running conditions specific to our needs. • Experience and communication are key. • Increasing applications and utility • Polymerase read = read of insert • BAC Pooling • Low input SNV • Multicolor labeling
  • 24. Acknowledgements The McDonnell Genome Institute at Washington University in St. Louis Rick Wilson Sean McGrath Amy Ly Ryan Demeter Dave Larson Karyn Meltz Steinberg Tina Graves Bob Fulton Derek Albracht Milinn Kremitzki Susan Rock Debbie Scheer Wes Warren Chad Tomlinson 10X Genomics Cassandra Jabara Michael Schnall-Levin Drew Kebbel Rob Tarbox Deanna Church BioNano Genomics Andrew Anfora Palak Sheth Alex Hastie Pacific Biosciences Paul Peluso Nick Sisneros
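
Slides 18 and 21 report N50 values (N50 Phase Block, Consensus Genome Maps N50). As a reminder of what that statistic measures, here is a minimal sketch of the usual N50 definition. The function name `n50` and the toy lengths are illustrative assumptions, not data from the slides, and this is not necessarily how Long Ranger or the BioNano software computes the number internally.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths.

    N50 is the length L such that the items of length >= L together
    account for at least half of the summed length of all items.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical block sizes, arbitrary units).
toy_blocks = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_blocks)
print(toy_n50)
```

In the toy example the total is 290, so the running sum first reaches half of it (145) at the second-largest block, and the N50 is 70. A larger N50 means more of the genome sits in long blocks or maps, which is why the slides use it to compare samples.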

Editor's Notes

  1. For this project, we are generating ~60X-100X coverage of PacBio long read data based on Reads of Insert data. The plan: 1. De novo assembly of PacBio data. 2. Scaffold the assembly with BioNano data as well as Dovetail chiCago and 10X Genomics linked read data sets. With the combined data sets we will begin to generate scaffolds and identify areas of potential misassembly. 3. We are also targeting difficult-to-assemble regions of the genome by sequencing BACs. Once the BACs are incorporated, we plan to align all of this data to the Reference very stringently to produce chromosomal AGPs. The end product will be a very high quality whole genome assembly. Our role in this project has mostly focused on larger insert PacBio libraries, adding BioNano data sets, and early work with 10X Genomics. Today, I will highlight our progress with these large molecule applications.
  2. Highlight library: SMRTbell. Read types: Polymerase Reads, Subreads, Reads of Insert. Currently, we are enriching SMRTbell libraries >18 kb.
  3. The sloppy slide. For the NA19240 project, we generated a number of libraries (8). Please note, when I say library, I mean that for each library a 10-reaction kit was used to process 50 µg of DNA in 8-10 independent reactions, which were then pooled. This graph illustrates the total number of SMRT Cells for each of the libraries as well as the average ROI read length. Based on these values, we were able to consistently show inconsistent results, which required some tweaking of the process and many discussions with Nick and Paul at PacBio. Based on these fruitful discussions, we were able to show a marked improvement, which will be highlighted in this section.
  4. Genomic DNA TapeStation shows an overlay of each of the different shearing conditions used to generate the multiple NA19240 SMRTbell libraries. The symbols represent the mode for each electropherogram. The table on the right highlights the shearing method for 7 of the 8 libraries, a
  5. Data accumulation to date for each of the four genomes we’ve sequenced. The total number of cells is given by the number of 8 Packs. The table also shows the average ROI yield in Mbp per cell. Based on our modifications, we’ve transitioned the PacBio larger insert library protocol into production, and we are happy to report a positive trend with each new library: increased ROI data throughput per cell.
  6. In addition to sequence-based methods, we are also using the BioNano IRYS system to generate physical maps of the genome. The advantage of the physical map is that it allows us to maintain order and orientation based on the nicking endonuclease recognition sites. The slide illustrates the processing overview, starting with a cell culture.
  7. Another resource we have is a BioNano Genome Map of CHM1. BioNano is a nanochannel-based mapping technology in which the DNA, in very long molecules, is nicked, labeled, and run through a nanochannel. Here is an example of the CHM1 BioNano map aligned to a 1.5 Mb PacBio contig. On top in green is the PacBio contig. The lines indicate the in silico nick sites. The blue bars indicate the BioNano contigs. You can see how well they align. The BioNano data can be used as an independent source to assess the CHM1 assembly.
  8. I want to acknowledge all of the collaborators on this project and all of the work that has gone into it thus far.
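
The per-cell yields and the "~60X-100X" coverage target mentioned in these notes can be checked against the slide 12 numbers with simple arithmetic. The sketch below is illustrative only: it assumes 8 SMRT Cells per 8 Pack and an approximate haploid human genome size of 3,100 Mbp (that genome size is an assumption, not a figure from the slides). NA12878 is left out because the slide marks it with an asterisk whose footnote is not part of the transcript.

```python
# Values transcribed from slide 12 (PacBio NA19240 Sequencing Statistics).
CELLS_PER_8PACK = 8        # assumption: one 8 Pack holds 8 SMRT Cells
GENOME_SIZE_MBP = 3_100    # assumption: approximate haploid human genome size

samples = {
    "NA19240": {"packs": 37, "roi_mbp": 195_619},
    "HG00733": {"packs": 30, "roi_mbp": 190_430},
    "HG00514": {"packs": 40, "roi_mbp": 277_690},
}


def roi_mbp_per_cell(packs, roi_mbp):
    """Average Reads of Insert yield per SMRT Cell, in Mbp."""
    return roi_mbp / (packs * CELLS_PER_8PACK)


def fold_coverage(roi_mbp, genome_size_mbp=GENOME_SIZE_MBP):
    """Approximate fold coverage: total ROI bases divided by genome size."""
    return roi_mbp / genome_size_mbp


for name, stats in samples.items():
    per_cell = roi_mbp_per_cell(stats["packs"], stats["roi_mbp"])
    coverage = fold_coverage(stats["roi_mbp"])
    print(f"{name}: {per_cell:.0f} ROI Mbp/cell, ~{coverage:.0f}x coverage")
```

Under these assumptions the per-cell figures round to the ROI Mbp/Cell values on the slide (661, 793, 868), and the three genomes fall at roughly 61x to 90x ROI coverage, in line with the stated target.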