Using long and linked reads to generate
a new Genome in a Bottle small variant
benchmark
Justin Wagner, Andrew Carroll, Ian T. Fiddes, Aaron M. Wenger, William J.
Rowell, Nathan Olson, Lindsey Harris, Jenny McDaniel, Xin Zhou, Sergey
Aganezov, Melanie Kirsche, Bohan Ni, Samantha Zarate, Byunggil Yoo, Neil
Miller, C. Xiao, Marc Salit, Justin Zook, Genome in a Bottle Consortium
GRC/GIAB Workshop ASHG 2019
Overview
• v3.3.2 benchmark variants and regions cover 87.84% of assembled
bases in chromosomes 1-22 in GRCh37 for the sample HG002
• Short read variant callers perform poorly in genomic locations with
high homology such as segmental duplications and low-complexity
repeat-rich regions
• Now utilizing PacBio CCS and 10X Genomics data to expand the GIAB
benchmark regions and reduce errors in current regions
• Long and linked reads add variants to the benchmark, mostly in
regions difficult to map with short reads
• GRCh37: 276,840 SNPs and 53,482 INDELs
• GRCh38: 286,483 SNPs and 42,980 INDELs
How the benchmark is generated
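When do we trust variants and regions from each method
[Figure: schematic for a single Variant Calling Method X. Track labels: Variants (PASS; filtered outliers) and Callable regions (excluding low/high coverage or low MQ, or low GQ for gVCF, and difficult regions/SVs, e.g. TR). Example sites (1), (2), and (3) carry genotypes 1/1 and 0/1.]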
Arbitrating between variant calls in different methods
[Figure: schematic of two input methods (PASS variants #1, Callable regions #1, PASS variants #2, Callable regions #2) and the resulting Benchmark calls and Benchmark regions, with genotypes 0/1 and 1/1 at example sites. Site categories shown: (1) Concordant, (2) Discordant unresolved, (3) Discordant arbitrated, (4) Concordant not callable.]
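The four site categories on this slide can be illustrated with a small sketch. The decision rules below are an assumption made for illustration (a call is "trusted" when it is PASS and lies in that method's callable regions); the slide does not spell out the exact logic, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class MethodCall:
    genotype: str             # e.g. "0/1" or "1/1"
    passed: bool              # True for PASS, False for a filtered outlier
    in_callable_region: bool  # True if the site is callable for this method


def classify_site(calls: Sequence[MethodCall]) -> Tuple[str, Optional[str]]:
    """Return (category, benchmark genotype or None) for one site.

    Assumed rules, not taken from the slides:
    - all methods agree and at least one call is trusted -> concordant
    - all methods agree but no call is trusted -> concordant, not callable
    - methods disagree and the trusted calls agree -> discordant, arbitrated
    - otherwise -> discordant, unresolved
    """
    genotypes = {c.genotype for c in calls}
    trusted = {c.genotype for c in calls if c.passed and c.in_callable_region}
    if len(genotypes) == 1:
        genotype = next(iter(genotypes))
        if trusted:
            return "concordant", genotype
        return "concordant, not callable", None
    if len(trusted) == 1:
        return "discordant, arbitrated", next(iter(trusted))
    return "discordant, unresolved", None


if __name__ == "__main__":
    site = [MethodCall("0/1", True, True), MethodCall("1/1", True, False)]
    print(classify_site(site))
```

In this sketch only concordant and arbitrated sites yield a benchmark genotype; the other two categories return no genotype.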
Sequencing data used in integration for HG002
Platform | Characteristics | Alignment; Variant Calling
Illumina | 150x150bp, ~300x coverage | Novoalign; GATK v3.5
CG | 26x26bp; ~100x coverage | Complete Genomics Pipeline
Illumina | 150x150bp, ~300x coverage | Novoalign; Freebayes
Illumina | 250x250bp; ~45x coverage | Novoalign; GATK v3.5
Illumina | 250x250bp; ~45x coverage | Novoalign; Freebayes
Illumina | 6Kbp mate pair; ~13x coverage | bwa_mem; GATK v3.5
Illumina | 6Kbp mate pair; ~13x coverage | bwa_mem; Freebayes
Ion | Exome, 1000x coverage | Torrent Suite v4.2; Torrent Variant Caller v4.4
SOLiD | 75bp; ~60x coverage | LifeScope v2.5.1; GATK v3.5
PacBio CCS | Sequel II ~11kb reads; ~32x coverage | minimap2; GATK4
PacBio CCS | Sequel II ~11kb reads; ~32x coverage | minimap2; DeepVariant v0.8
10x Genomics | Linked reads; ~84x coverage | LongRanger Pipeline
Long and linked reads cover more variants and regions
[Figure: schematic for a single Variant Calling Method X showing Variants (PASS; filtered outliers) and Callable regions (excluding low/high coverage or low MQ, or low GQ for gVCF, and difficult regions/SVs, e.g. TR), with example sites (1), (2), and (3).]
10x Genomics and PacBio CCS data add new variants (1), regions with good coverage of high MQ reads (2), and access to difficult regions (3)
How the benchmark is generated
Difficult Regions Excluded from all Methods
Difficult Region Description | Bases Covered in GRCh37 | Bases Covered in GRCh38
v0.6 SV GIAB Benchmark | 32,596,754 | 32,872,907
Potential copy number variation | 51,713,344 | 62,666,746
Tandem Repeats > 10kb | 5,731,885 | 71,942,255
Highly similar and high depth segmental duplications | 1,232,701 | 2,094,143
Regions that are collapsed and expanded from GRCh37/38 Primary Assembly Alignments | 17,979,597 | N/A
Modeled centromere and heterochromatin | N/A | 62,304,573
Difficult Regions Excluded by Method
• Tandem Repeats < 51bp except GATK from Illumina PCR-free, Complete
Genomics, and CCS DeepVariant
• Tandem Repeats > 51bp and < 200bp except GATK from Illumina PCR-free and CCS DeepVariant
• Tandem Repeats > 200bp except CCS DeepVariant
• Homopolymers > 6bp except GATK from Illumina PCR-free, Complete
Genomics, Ion Exome, PacBio CCS
• Imperfect homopolymer > 10bp except GATK from Illumina PCR-Free
• Difficult to map regions for short reads except 10x and CCS
• LINE:L1Hs > 500bp except Illumina MatePair, 10x, and CCS
• Segmental duplications except 10x and CCS
v4 draft benchmark includes variants found
with haplotype-resolved assembly of MHC
• Worked with a team from the March 2019 NCBI Pangenome
Hackathon to generate haplotype-resolved assembly of MHC region
(chr6:28,477,797-33,448,354 in GRCh37)
• Use assembly to call small variants
• Small variants from assembly are integrated with mapping-based calls
in the MHC region for v4 draft benchmark
• v4 draft benchmark includes 23,229 variants in the MHC region
• Covers most HLA genes and CYP21A2/TNXA/TNXB
v4 draft benchmark includes more bases, variants, and segmental duplications
Metric | v4 draft GRCh37 | v4 draft GRCh38
Base pairs | 2,504,027,936 | 2,509,269,277
Reference covered | 93.2% | 91.03%
SNPs | 3,323,773 | 3,314,941
Indels | 519,152 | 519,494
Base pairs in Segmental Duplications | 64,300,499 | 73,819,342
[Figure: bar chart of percent of reference covered; axis from 80.00% to 95.00%.]
Some variants and segmental duplications only covered in v3.3.2 or v4 draft
[Figure: comparisons of v3.3.2 and v4 draft for GRCh37 and GRCh38, with panels for SNPs, INDELs, and Segmental Duplications, each labeled "Only in v3.3.2" and "Only in v4 draft". Values as extracted, in order, for SNPs and INDELs: 343,358; 69,495; 77,324; 23,828; 376,653; 91,837; 91,719; 48,753. Values as extracted, in order, for Segmental Duplications: 25,445; 63,949,151; 1,928,353; 70,187,985.]
v4 draft enables benchmarking in regions
difficult for short reads
Comparison of Illumina RTG VCF against benchmark sets
• SNP FNs increase by a factor of more than 3, mostly due to new
benchmark variants in difficult to map regions and segmental
duplications
• False negatives: variants present in the truth set, but missed in the query
Subset | v3.3.2 FNs | v4 draft FNs
All SNPs | 8,594 | 30,229
Low mappability | 6,708 | 25,295
Segmental duplications | 1,429 | 14,008
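The "factor of more than 3" claim can be checked directly from the FN counts in the table above. This is a minimal arithmetic sketch; the dictionary and function names are illustrative only.

```python
# False-negative SNP counts from the table: (v3.3.2 FNs, v4 draft FNs)
FN_COUNTS = {
    "All SNPs": (8594, 30229),
    "Low mappability": (6708, 25295),
    "Segmental duplications": (1429, 14008),
}


def fold_increase(old: int, new: int) -> float:
    """Ratio of v4 draft FNs to v3.3.2 FNs for one subset."""
    return new / old


if __name__ == "__main__":
    for subset, (v332, v4_draft) in FN_COUNTS.items():
        print(f"{subset}: {fold_increase(v332, v4_draft):.2f}x")
```

The segmental duplication subset shows the largest relative increase, which is consistent with the slide's point that the new FNs come mostly from regions that are difficult for short reads.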
v4 draft benchmark contains more medically-
relevant variants
• v4 draft covers more of the MHC region
• Outside of MHC updates, top 5 genes with variants increased from v3.3.2
to v4 draft benchmark: TSPEAR (31), LAMA5 (28), FCGBP (18), TPSAB1
(15), HSPG2 (13)
• PMS2 from ACMG59 has 2 more variants and RET, SCN5A, TNNI3 have 1
more variant covered in v4 draft benchmark that are not in v3.3.2
Variants in Medical Exome (genes from OMIM, HGMD, ClinVar, UniProt)
Benchmark Regions v3.3.2 | 8,209
Benchmark Regions v4 draft | 9,527
Sanger sequencing confirms medically-
relevant variants
• Performed long range PCR
before sequencing
• Confirmed 12 variants in
CYP21A2, which is a medically-
relevant gene in the MHC region
• Confirmed 6 variants in PMS2
• Confirmed 15 variants in 5 other
genes
Evaluation by GIAB collaborators
Compared benchmark to callsets from a variety of technologies and
variant calling methods including:
• Illumina PCR-Free and Dragen
• PacBio CCS and GATK4
• PacBio CCS and DeepVariant
• PacBio CCS and Clair (Next generation of Clairvoyante)
• ONT Promethion and Clair
Preliminary results suggest that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
More volunteers welcomed
Manual curation by callset developers
Process
• Compare callset to benchmark using
hap.py and/or vcfeval
• Randomly select 5 FP SNPs, 5 FN SNPs, 5
FP indels and 5 FN indels, each from
inside and outside the v3.3.2 benchmark
bed, in GRCh37 and GRCh38
(5*4*2*2=80 total)
• Use IGV with PCR-free Illumina, PacBio
CCS, 10x, and ONT + difficult bed files
Questions to ask
• Are both alleles correct in the
benchmark?
• Yes/No/Unsure
• Are both alleles correct in the callset
being tested?
• Yes/No/Unsure
• If the benchmark is wrong or
questionable, how did you make this
determination?
• Instructions: Be critical of the benchmark,
and select unsure if the evidence does
not strongly support the benchmark
being correct
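The site selection described in the Process list (5 sites for each combination of FP/FN, SNP/indel, inside/outside the v3.3.2 benchmark bed, and reference, for 80 total) could be sketched as a stratified random sample. The record fields (`error`, `vtype`, `region`, `ref`) and the function name are hypothetical; in practice they would be derived from hap.py or vcfeval output.

```python
import random
from collections import defaultdict
from itertools import product

ERROR_TYPES = ("FP", "FN")
VARIANT_TYPES = ("SNP", "indel")
REGIONS = ("inside_v3.3.2_bed", "outside_v3.3.2_bed")
REFERENCES = ("GRCh37", "GRCh38")


def select_for_curation(sites, per_stratum=5, seed=0):
    """Randomly pick up to `per_stratum` sites from each of the 16 strata.

    `sites` is an iterable of dicts with the hypothetical keys
    'error', 'vtype', 'region', and 'ref'.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    strata = defaultdict(list)
    for site in sites:
        key = (site["error"], site["vtype"], site["region"], site["ref"])
        strata[key].append(site)
    picked = []
    for key in product(ERROR_TYPES, VARIANT_TYPES, REGIONS, REFERENCES):
        pool = strata.get(key, [])
        picked.extend(rng.sample(pool, min(per_stratum, len(pool))))
    return picked


if __name__ == "__main__":
    demo = [
        {"error": e, "vtype": v, "region": r, "ref": g, "id": i}
        for e, v, r, g in product(ERROR_TYPES, VARIANT_TYPES, REGIONS, REFERENCES)
        for i in range(8)
    ]
    print(len(select_for_curation(demo)))
```

A stratum with fewer than 5 candidate sites contributes all of them, so the total can fall below 80 for a callset with few errors in some category.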
Process for independent evaluations
• Callset developer curates putative errors
• Benchmark is correct: no further curation
• Benchmark is wrong or questionable: a NIST curator reviews the site
  • NIST curator agrees: classify source of potential error in benchmark
  • NIST curator disagrees: discuss with callset developer
Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
CCS with GATK GRCh37 FP | 19 | 1 | 0 | 19 | 20
CCS with GATK GRCh37 FN | 15 | 3 | 2 | 18 | 20
ONT with Clair GRCh37 FP | 33 | 1 | 0 | 34 | 34
ONT with Clair GRCh37 FN | 27 | 3 | 0 | 30 | 30
CCS with Clair GRCh37 FP | 7 | 13 | 0 | 6 | 20
CCS with Clair GRCh37 FN | 19 | 1 | 0 | 19 | 20
Illumina with Dragen GRCh37 FP | 14 | 6 | 0 | 11 | 20
Illumina with Dragen GRCh37 FN | 17 | 3 | 0 | 17 | 20
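To make the "majority" statement concrete, the GRCh37 curation counts above can be tallied per callset and overall. This is only an arithmetic sketch over the numbers on the slide; the variable and function names are illustrative.

```python
# (callset, error type, benchmark correct, benchmark unsure,
#  benchmark not correct, total sites), copied from the GRCh37 table
CURATION_GRCH37 = [
    ("CCS with GATK", "FP", 19, 1, 0, 20),
    ("CCS with GATK", "FN", 15, 3, 2, 20),
    ("ONT with Clair", "FP", 33, 1, 0, 34),
    ("ONT with Clair", "FN", 27, 3, 0, 30),
    ("CCS with Clair", "FP", 7, 13, 0, 20),
    ("CCS with Clair", "FN", 19, 1, 0, 20),
    ("Illumina with Dragen", "FP", 14, 6, 0, 20),
    ("Illumina with Dragen", "FN", 17, 3, 0, 20),
]


def fraction_correct(rows):
    """Fraction of curated sites where the benchmark was judged correct."""
    correct = sum(row[2] for row in rows)
    total = sum(row[5] for row in rows)
    return correct / total


if __name__ == "__main__":
    for callset, kind, correct, unsure, wrong, total in CURATION_GRCH37:
        print(f"{callset} {kind}: {correct}/{total} benchmark correct")
    print(f"overall: {fraction_correct(CURATION_GRCH37):.1%}")
```

The per-row output also makes visible where the benchmark was most often marked unsure, such as the CCS with Clair FP row.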
Evaluation FPs – Inversions
LINEs
Evaluation FPs – Complex SVs
Evaluation FPs – Near SVs
Evaluation FPs – Near low coverage
Potential refinements identified for v4.1
• Exclude VDJ
• Exclude Inversions
• Improve CNV coverage
  • Use ONT for excessive coverage
  • Explore smoothing on excessive coverage beds
  • Use new diploid assemblies to identify CNVs
• MHC
  • Exclude CNVs in the MHC, partial repeats in MHC, small regions that are questionable in the DRB genes
• Benchmark regions density
  • Regions with dense variation and many gaps in bed
  • Dense variants near SVs
• Segmental duplications
  • Small region of duplication covered by benchmark
  • Containing an SV
Conclusions
• Long and linked reads add variants to the benchmark, mostly in
regions difficult to map with short reads
• GRCh37: 276,840 SNPs and 53,482 INDELs
• GRCh38: 286,483 SNPs and 42,980 INDELs
• v4 draft benchmark is available for GRCh37 and GRCh38
• GRCh37 Percent Chromosomes 1-22 Covered: 93.2%
• GRCh38 Percent Chromosomes 1-22 Covered: 91.03%
• Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
• More volunteers welcomed
• Identified refinements for v4.1
On-going and Future Work
• Refine use of genome stratifications
• Adding variant calls from raw PacBio and Oxford Nanopore
• Improve benchmark for larger indels, homopolymers, and tandem
repeats
• Improve normalization of complex variants
• Generating benchmark variants from diploid assemblies
• Machine learning
• Outlier detection, active learning
• Generate v4 draft for other GIAB genomes
Acknowledgements
• Andrew Carroll
• Ian T. Fiddes
• Aaron M. Wenger
• William J. Rowell
• Nathan Olson
• Lindsey Harris
• Jenny McDaniel
• Chunlin Xiao
• Marc Salit
• Justin Zook
• Genome in a Bottle Consortium
Draft Benchmark Evaluators
• Xin Zhou
• Sergey Aganezov
• Melanie Kirsche
• Bohan Ni
• Samantha Zarate
• Byunggil Yoo
• Neil Miller
Backup
Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
CCS with GATK GRCh38 FP | 16 | 4 | 0 | 16 | 20
CCS with GATK GRCh38 FN | 17 | 3 | 0 | 16 | 20
ONT with Clair GRCh38 FP | 19 | 1 | 0 | 19 | 20
ONT with Clair GRCh38 FN | 14 | 6 | 0 | 19 | 20
CCS with Clair GRCh38 FP | 15 | 5 | 0 | 16 | 20
CCS with Clair GRCh38 FN | 18 | 2 | 0 | 20 | 20
Illumina with Dragen GRCh38 FP | 16 | 3 | 1 | 16 | 20
Illumina with Dragen GRCh38 FN | 18 | 2 | 0 | 18 | 20
Integration Pipeline Process
(1) Find sensitive variant calls and callable regions for each dataset, excluding difficult regions/SVs that are problematic for each type of data and variant caller
(2) Find “consensus” calls with support from 2+ technologies (and no other technologies disagree) using callable regions
(3) Use “consensus” calls to train simple one-class model for each dataset and find “outliers” that are less trustworthy for each dataset
(4) Find benchmark calls by using callable regions and “outliers” to arbitrate between datasets when they disagree
(5) Find benchmark regions by taking union of callable regions and subtracting uncertain variants
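Two of the pipeline steps lend themselves to a short sketch: finding consensus calls supported by 2+ technologies, and building benchmark regions as the union of callable regions minus uncertain variants. The code below is a simplified illustration under stated assumptions, not the GIAB integration pipeline itself: intervals are half-open (start, end) pairs on a single chromosome, a technology with no call at a site is treated as having no opinion, and all function names are hypothetical.

```python
from collections import defaultdict


def consensus_calls(calls_by_technology):
    """Keep sites where 2+ technologies report the same genotype and none disagree.

    `calls_by_technology` maps technology -> {site: genotype}, already
    restricted to each technology's callable regions (assumption).
    """
    votes = defaultdict(lambda: defaultdict(set))
    for tech, calls in calls_by_technology.items():
        for site, genotype in calls.items():
            votes[site][genotype].add(tech)
    consensus = {}
    for site, by_genotype in votes.items():
        if len(by_genotype) == 1:
            genotype, techs = next(iter(by_genotype.items()))
            if len(techs) >= 2:
                consensus[site] = genotype
    return consensus


def merge(intervals):
    """Union of half-open intervals, returned sorted and non-overlapping."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(intervals, holes):
    """Remove every hole from the union of the intervals."""
    holes = merge(holes)
    result = []
    for start, end in merge(intervals):
        cursor = start
        for hole_start, hole_end in holes:
            if hole_end <= cursor or hole_start >= end:
                continue  # hole does not overlap what remains of this interval
            if hole_start > cursor:
                result.append((cursor, hole_start))
            cursor = max(cursor, hole_end)
        if cursor < end:
            result.append((cursor, end))
    return result


def benchmark_regions(callable_by_dataset, uncertain_positions, pad=0):
    """Union of callable regions minus (optionally padded) uncertain sites."""
    all_callable = [iv for ivs in callable_by_dataset.values() for iv in ivs]
    holes = [(max(0, p - pad), p + 1 + pad) for p in uncertain_positions]
    return subtract(all_callable, holes)


if __name__ == "__main__":
    regions = benchmark_regions({"a": [(0, 50)], "b": [(40, 100)]}, [10, 60])
    print(regions)
```

The `pad` parameter is an illustrative knob for excluding some flanking sequence around each uncertain variant; the slides do not state how much, if any, is excluded.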
Sanger sequencing results
Initial evaluation shows that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
CCS with DeepVariant GRCh37 FP | 3 | 9 | 8 |  | 20
CCS with DeepVariant GRCh37 FN | 17 | 3 | 0 |  | 20
CCS with GATK GRCh37 FP | 19 | 1 | 0 | 19 | 20
CCS with GATK GRCh37 FN | 15 | 3 | 2 | 18 | 20
ONT with Clair GRCh37 FP | 33 | 1 | 0 | 34 | 34
ONT with Clair GRCh37 FN | 27 | 3 | 0 | 30 | 30
CCS with Clair GRCh37 FP | 7 | 13 | 0 | 6 | 20
CCS with Clair GRCh37 FN | 19 | 1 | 0 | 19 | 20
Illumina with Dragen GRCh37 FP | 14 | 6 | 0 | 11 | 20
Illumina with Dragen GRCh37 FN | 17 | 3 | 0 | 17 | 20
Initial evaluation shows that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
CCS with DeepVariant GRCh38 FP | 6 | 7 | 7 |  | 20
CCS with DeepVariant GRCh38 FN | 20 | 0 | 0 |  | 20
CCS with GATK GRCh38 FP | 16 | 4 | 0 | 16 | 20
CCS with GATK GRCh38 FN | 17 | 3 | 0 | 16 | 20
ONT with Clair GRCh38 FP | 19 | 1 | 0 | 19 | 20
ONT with Clair GRCh38 FN | 14 | 6 | 0 | 19 | 20
CCS with Clair GRCh38 FP | 15 | 5 | 0 | 16 | 20
CCS with Clair GRCh38 FN | 18 | 2 | 0 | 20 | 20
Illumina with Dragen GRCh38 FP | 16 | 3 | 1 | 16 | 20
Illumina with Dragen GRCh38 FN | 18 | 2 | 0 | 18 | 20
Initial evaluation shows that, at a majority of FP and FN sites, the benchmark is correct and the tested callset is in error
Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure/No | Number Callset Incorrect
CCS with GATK GRCh37 | 32 | 8 | 32
CCS with GATK GRCh38 | 33 | 7 | 32
ONT with Clair GRCh37 | 60 | 4 | 60
CCS with Clair GRCh37 | 26 | 14 | 24
CCS with Clair GRCh38 | 33 | 7 | 36
Illumina with Dragen GRCh37 | 31 | 9 | 28
Illumina with Dragen GRCh38 | 34 | 6 | 34

Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...narwatsonia7
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersnarwatsonia7
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknownarwatsonia7
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Glomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxGlomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxDr.Nusrat Tariq
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingNehru place Escorts
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Modelssonalikaur4
 
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Me
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near MeHigh Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Me
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Menarwatsonia7
 
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️saminamagar
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...narwatsonia7
 
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️saminamagar
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbaisonalikaur4
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...saminamagar
 
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...Nehru place Escorts
 

Dernier (20)

9873777170 Full Enjoy @24/7 Call Girls In North Avenue Delhi Ncr
9873777170 Full Enjoy @24/7 Call Girls In North Avenue Delhi Ncr9873777170 Full Enjoy @24/7 Call Girls In North Avenue Delhi Ncr
9873777170 Full Enjoy @24/7 Call Girls In North Avenue Delhi Ncr
 
Noida Sector 135 Call Girls ( 9873940964 ) Book Hot And Sexy Girls In A Few C...
Noida Sector 135 Call Girls ( 9873940964 ) Book Hot And Sexy Girls In A Few C...Noida Sector 135 Call Girls ( 9873940964 ) Book Hot And Sexy Girls In A Few C...
Noida Sector 135 Call Girls ( 9873940964 ) Book Hot And Sexy Girls In A Few C...
 
Case Report Peripartum Cardiomyopathy.pptx
Case Report Peripartum Cardiomyopathy.pptxCase Report Peripartum Cardiomyopathy.pptx
Case Report Peripartum Cardiomyopathy.pptx
 
Air-Hostess Call Girls Madambakkam - Phone No 7001305949 For Ultimate Sexual ...
Air-Hostess Call Girls Madambakkam - Phone No 7001305949 For Ultimate Sexual ...Air-Hostess Call Girls Madambakkam - Phone No 7001305949 For Ultimate Sexual ...
Air-Hostess Call Girls Madambakkam - Phone No 7001305949 For Ultimate Sexual ...
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
 
Glomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptxGlomerular Filtration rate and its determinants.pptx
Glomerular Filtration rate and its determinants.pptx
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
 
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Me
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near MeHigh Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Me
High Profile Call Girls Mavalli - 7001305949 | 24x7 Service Available Near Me
 
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in paharganj DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
 
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...
Call Girls Service in Virugambakkam - 7001305949 | 24x7 Service Available Nea...
 

Using long reads to expand Genome in a Bottle small variant benchmark

  • 1. Using long and linked reads to generate a new Genome in a Bottle small variant benchmark
      Justin Wagner, Andrew Carroll, Ian T. Fiddes, Aaron M. Wenger, William J. Rowell, Nathan Olson, Lindsey Harris, Jenny McDaniel, Xin Zhou, Sergey Aganezov, Melanie Kirsche, Bohan Ni, Samantha Zarate, Byunggil Yoo, Neil Miller, C. Xiao, Marc Salit, Justin Zook, Genome in a Bottle Consortium
      GRC/GIAB Workshop, ASHG 2019
  • 2. Overview
      • v3.3.2 benchmark variants and regions cover 87.84% of assembled bases in chromosomes 1-22 in GRCh37 for the sample HG002
      • Short-read variant callers perform poorly in genomic locations with high homology, such as segmental duplications and low-complexity, repeat-rich regions
      • Now using PacBio CCS and 10x Genomics data to expand the GIAB benchmark regions and reduce errors in current regions
      • Long and linked reads add variants to the benchmark, mostly in regions difficult to map with short reads
          • GRCh37: 276,840 SNPs and 53,482 INDELs
          • GRCh38: 286,483 SNPs and 42,980 INDELs
  • 3. How the benchmark is generated
  • 4. When do we trust variants and regions from each method
      [Diagram: for variant calling method X, PASS variants, filtered outliers, low/high coverage or low MQ (or low GQ for gVCF), and difficult regions/SVs determine the callable regions; example sites (1), (2), (3) with genotypes 1/1 and 0/1]
  • 5. Arbitrating between variant calls in different methods
      [Diagram: PASS variants and callable regions from input methods #1 and #2 are combined into benchmark calls and benchmark regions]
      Site categories: (1) Concordant; (2) Discordant unresolved; (3) Discordant arbitrated; (4) Concordant not callable
  • 6. Sequencing data used in integration for HG002
      Platform | Characteristics | Alignment; Variant Calling
      Illumina | 150x150bp; ~300x coverage | Novoalign; GATK v3.5
      CG | 26x26bp; ~100x coverage | Complete Genomics Pipeline
      Illumina | 150x150bp; ~300x coverage | Novoalign; Freebayes
      Illumina | 250x250bp; ~45x coverage | Novoalign; GATK v3.5
      Illumina | 250x250bp; ~45x coverage | Novoalign; Freebayes
      Illumina | 6Kbp mate pair; ~13x coverage | bwa_mem; GATK v3.5
      Illumina | 6Kbp mate pair; ~13x coverage | bwa_mem; Freebayes
      Ion | Exome; 1000x coverage | Torrent Suite v4.2; Torrent Variant Caller v4.4
      Solid | 75bp; ~60x coverage | LifeScope v2.5.1; GATK v3.5
      PacBio CCS | Sequel II ~11kb reads; ~32x coverage | minimap2; GATK4
      PacBio CCS | Sequel II ~11kb reads; ~32x coverage | minimap2; DeepVariant v0.8
      10x Genomics | Linked reads; ~84x coverage | LongRanger Pipeline
  • 7. Long and linked reads cover more variants and regions
      [Diagram: the callable-region schematic from slide 4, with sites (1), (2), (3) highlighted]
      10x Genomics and PacBio CCS data add new variants (1), regions with good coverage of high MQ reads (2), and access to difficult regions (3)
  • 8. How the benchmark is generated
  • 9. Difficult Regions Excluded from all Methods
      Difficult Region Description | Bases Covered in GRCh37 | Bases Covered in GRCh38
      v0.6 SV GIAB Benchmark | 32,596,754 | 32,872,907
      Potential copy number variation | 51,713,344 | 62,666,746
      Tandem Repeats > 10kb | 5,731,885 | 71,942,255
      Highly similar and high depth segmental duplications | 1,232,701 | 2,094,143
      Regions that are collapsed and expanded from GRCh37/38 Primary Assembly Alignments | 17,979,597 | N/A
      Modeled centromere and heterochromatin | N/A | 62,304,573
  • 10. Difficult Regions Excluded by Method
      • Tandem Repeats < 51bp, except GATK from Illumina PCR-free, Complete Genomics, and CCS DeepVariant
      • Tandem Repeats > 51bp and < 200bp, except GATK from Illumina PCR-free and CCS DeepVariant
      • Tandem Repeats > 200bp, except CCS DeepVariant
      • Homopolymers > 6bp, except GATK from Illumina PCR-free, Complete Genomics, Ion Exome, PacBio CCS
      • Imperfect homopolymers > 10bp, except GATK from Illumina PCR-free
      • Difficult to map regions for short reads, except 10x and CCS
      • LINE:L1Hs > 500bp, except Illumina MatePair, 10x, and CCS
      • Segmental duplications, except 10x and CCS
  • 11. v4 draft benchmark includes variants found with haplotype-resolved assembly of MHC
      • Worked with a team from the March 2019 NCBI Pangenome Hackathon to generate a haplotype-resolved assembly of the MHC region (chr6:28,477,797-33,448,354 in GRCh37)
      • Use the assembly to call small variants
      • Small variants from the assembly are integrated with mapping-based calls in the MHC region for the v4 draft benchmark
      • v4 draft benchmark includes 23,229 variants in the MHC region
      • Covers most HLA genes and CYP21A2/TNXA/TNXB
  • 12. v4 draft benchmark includes more bases, variants, and segmental duplications
      Metric | v4 draft GRCh37 | v4 draft GRCh38
      Base pairs | 2,504,027,936 | 2,509,269,277
      Reference covered | 93.2% | 91.03%
      SNPs | 3,323,773 | 3,314,941
      Indels | 519,152 | 519,494
      Base pairs in Segmental Duplications | 64,300,499 | 73,819,342
      [Chart: percent of reference covered]
  • 13. Some variants and segmental duplications only covered in v3.3.2 or v4 draft Only in v3.3.2 GRCh37 Only in v4 draft GRCh37 SNPs INDELs SNPs INDELs Only in v3.3.2 GRCh38 Only in v4 draft GRCh38 343,358 69,495 77,324 23,828 376,653 91,837 91,719 48,753 Segmental Duplications Segmental Duplications 25,445 63,949,151 1,928,353 70,187,985
  • 14. v4 draft enables benchmarking in regions difficult for short reads
      Comparison of Illumina RTG VCF against benchmark sets
      • SNP FNs increase by a factor of more than 3, mostly due to new benchmark variants in difficult to map regions and segmental duplications
      • False negatives: variants present in the truth set, but missed in the query
      Subset | v3.3.2 FNs | v4 draft FNs
      All SNPs | 8,594 | 30,229
      Low mappability | 6,708 | 25,295
      Segmental duplications | 1,429 | 14,008
  • 15. v4 draft benchmark contains more medically relevant variants
      • v4 draft covers more of the MHC region
      • Outside of MHC updates, the top 5 genes by increase in variants from v3.3.2 to v4 draft benchmark: TSPEAR (31), LAMA5 (28), FCGBP (18), TPSAB1 (15), HSPG2 (13)
      • PMS2 from ACMG59 has 2 more variants, and RET, SCN5A, and TNNI3 each have 1 more variant, covered in the v4 draft benchmark but not in v3.3.2
      Variants in Medical Exome (genes from OMIM, HGMD, ClinVar, UniProt)
      Benchmark Regions v3.3.2 | 8,209
      Benchmark Regions v4 draft | 9,527
  • 16. Sanger sequencing confirms medically relevant variants
      • Performed long-range PCR before sequencing
      • Confirmed 12 variants in CYP21A2, which is a medically relevant gene in the MHC region
      • Confirmed 6 variants in PMS2
      • Confirmed 15 variants in 5 other genes
  • 17. Evaluation by GIAB collaborators
      Compared benchmark to callsets from a variety of technologies and variant calling methods, including:
      • Illumina PCR-Free and Dragen
      • PacBio CCS and GATK4
      • PacBio CCS and DeepVariant
      • PacBio CCS and Clair (next generation of Clairvoyante)
      • ONT PromethION and Clair
      Preliminary results suggest that, at a majority of FP and FN sites, the benchmark is correct and the errors are in the tested callsets
      More volunteers welcomed
  • 18. Manual curation by callset developers
      Process
      • Compare callset to benchmark using hap.py and/or vcfeval
      • Randomly select 5 FP SNPs, 5 FN SNPs, 5 FP indels and 5 FN indels, each from inside and outside the v3.3.2 benchmark bed, in GRCh37 and GRCh38 (5*4*2*2=80 total)
      • Use IGV with PCR-free Illumina, PacBio CCS, 10x, and ONT + difficult bed files
      Questions to ask
      • Are both alleles correct in the benchmark? (Yes/No/Unsure)
      • Are both alleles correct in the callset being tested? (Yes/No/Unsure)
      • If the benchmark is wrong or questionable, how did you make this determination?
      Instructions: Be critical of the benchmark, and select unsure if the evidence does not strongly support the benchmark being correct
  • 19. Process for independent evaluations
      [Flowchart] Callset developer curates putative errors:
      • Benchmark is correct: no further curation
      • Benchmark is wrong or questionable:
          • NIST curator agrees: classify source of potential error in benchmark
          • NIST curator disagrees: discuss with callset developer
  • 20. Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the errors are in the tested callsets
      Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
      CCS with GATK GRCh37 FP | 19 | 1 | 0 | 19 | 20
      CCS with GATK GRCh37 FN | 15 | 3 | 2 | 18 | 20
      ONT with Clair GRCh37 FP | 33 | 1 | 0 | 34 | 34
      ONT with Clair GRCh37 FN | 27 | 3 | 0 | 30 | 30
      CCS with Clair GRCh37 FP | 7 | 13 | 0 | 6 | 20
      CCS with Clair GRCh37 FN | 19 | 1 | 0 | 19 | 20
      Illumina with Dragen GRCh37 FP | 14 | 6 | 0 | 11 | 20
      Illumina with Dragen GRCh37 FN | 17 | 3 | 0 | 17 | 20
  • 21. Evaluation FPs – Inversions LINEs
  • 22. Evaluation FPs – Complex SVs
  • 23. Evaluation FPs – Near SVs
  • 24. Evaluation FPs – Near low coverage
  • 25. Potential refinements identified for v4.1
      • Exclude VDJ
      • Exclude Inversions
      • Improve CNV coverage
          • Use ONT for excessive coverage
          • Explore smoothing on excessive coverage beds
          • Use new diploid assemblies to identify CNVs
      • MHC
          • Exclude CNVs in the MHC, partial repeats in MHC, small regions that are questionable in the DRB genes
      • Benchmark regions density
          • Regions with dense variation and many gaps in bed
          • Dense variants near SVs
      • Segmental duplications
          • Small region of duplication covered by benchmark
          • Containing an SV
  • 26. Conclusions
      • Long and linked reads add variants to the benchmark, mostly in regions difficult to map with short reads
          • GRCh37: 276,840 SNPs and 53,482 INDELs
          • GRCh38: 286,483 SNPs and 42,980 INDELs
      • v4 draft benchmark is available for GRCh37 and GRCh38
          • GRCh37 Percent Chromosomes 1-22 Covered: 93.2%
          • GRCh38 Percent Chromosomes 1-22 Covered: 91.03%
      • Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the errors are in the tested callsets
          • More volunteers welcomed
      • Identified refinements for v4.1
  • 27. Ongoing and Future Work
      • Refine use of genome stratifications
      • Adding variant calls from raw PacBio and Oxford Nanopore
      • Improve benchmark for larger indels, homopolymers, and tandem repeats
      • Improve normalization of complex variants
      • Generating benchmark variants from diploid assemblies
      • Machine learning
          • Outlier detection, active learning
      • Generate v4 draft for other GIAB genomes
  • 28. Acknowledgements
      Andrew Carroll, Ian T. Fiddes, Aaron M. Wenger, William J. Rowell, Nathan Olson, Lindsey Harris, Jenny McDaniel, Chunlin Xiao, Marc Salit, Justin Zook, Genome in a Bottle Consortium
      Draft Benchmark Evaluators: Xin Zhou, Sergey Aganezov, Melanie Kirsche, Bohan Ni, Samantha Zarate, Byunggil Yoo, Neil Miller
  • 30. Initial evaluation suggests that, at a majority of FP and FN sites, the benchmark is correct and the errors are in the tested callsets
      Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
      CCS with GATK GRCh38 FP | 16 | 4 | 0 | 16 | 20
      CCS with GATK GRCh38 FN | 17 | 3 | 0 | 16 | 20
      ONT with Clair GRCh38 FP | 19 | 1 | 0 | 19 | 20
      ONT with Clair GRCh38 FN | 14 | 6 | 0 | 19 | 20
      CCS with Clair GRCh38 FP | 15 | 5 | 0 | 16 | 20
      CCS with Clair GRCh38 FN | 18 | 2 | 0 | 20 | 20
      Illumina with Dragen GRCh38 FP | 16 | 3 | 1 | 16 | 20
      Illumina with Dragen GRCh38 FN | 18 | 2 | 0 | 18 | 20
  • 31. Integration Pipeline Process
      • Find sensitive variant calls and callable regions for each dataset, excluding difficult regions/SVs that are problematic for each type of data and variant caller
      • Find “consensus” calls with support from 2+ technologies (and no other technologies disagree) using callable regions
      • Use “consensus” calls to train a simple one-class model for each dataset and find “outliers” that are less trustworthy for each dataset
      • Find benchmark calls by using callable regions and “outliers” to arbitrate between datasets when they disagree
      • Find benchmark regions by taking the union of callable regions and subtracting uncertain variants
  • 33. Initial evaluation shows a majority of FPs and FNs are correct in the benchmark and errors in the tested callsets
      Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
      CCS with DeepVariant GRCh37 FP | 3 | 9 | 8 | | 20
      CCS with DeepVariant GRCh37 FN | 17 | 3 | 0 | | 20
      CCS with GATK GRCh37 FP | 19 | 1 | 0 | 19 | 20
      CCS with GATK GRCh37 FN | 15 | 3 | 2 | 18 | 20
      ONT with Clair GRCh37 FP | 33 | 1 | 0 | 34 | 34
      ONT with Clair GRCh37 FN | 27 | 3 | 0 | 30 | 30
      CCS with Clair GRCh37 FP | 7 | 13 | 0 | 6 | 20
      CCS with Clair GRCh37 FN | 19 | 1 | 0 | 19 | 20
      Illumina with Dragen GRCh37 FP | 14 | 6 | 0 | 11 | 20
      Illumina with Dragen GRCh37 FN | 17 | 3 | 0 | 17 | 20
  • 34. Initial evaluation shows a majority of FPs and FNs are correct in the benchmark and errors in the tested callsets
      Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure | Benchmark is not correct | Comparison callset is not correct | Total sites
      CCS with DeepVariant GRCh38 FP | 6 | 7 | 7 | | 20
      CCS with DeepVariant GRCh38 FN | 20 | 0 | 0 | | 20
      CCS with GATK GRCh38 FP | 16 | 4 | 0 | 16 | 20
      CCS with GATK GRCh38 FN | 17 | 3 | 0 | 16 | 20
      ONT with Clair GRCh38 FP | 19 | 1 | 0 | 19 | 20
      ONT with Clair GRCh38 FN | 14 | 6 | 0 | 19 | 20
      CCS with Clair GRCh38 FP | 15 | 5 | 0 | 16 | 20
      CCS with Clair GRCh38 FN | 18 | 2 | 0 | 20 | 20
      Illumina with Dragen GRCh38 FP | 16 | 3 | 1 | 16 | 20
      Illumina with Dragen GRCh38 FN | 18 | 2 | 0 | 18 | 20
  • 35. Initial evaluation shows a majority of FPs and FNs are correct in the benchmark and errors in the tested callsets
      Platform and Caller | Number Benchmark Correct | Number Benchmark Unsure/No | Number Callset Incorrect
      CCS with GATK GRCh37 | 32 | 8 | 32
      CCS with GATK GRCh38 | 33 | 7 | 32
      ONT with Clair GRCh37 | 60 | 4 | 60
      CCS with Clair GRCh37 | 26 | 14 | 24
      CCS with Clair GRCh38 | 33 | 7 | 36
      Illumina with Dragen GRCh37 | 31 | 9 | 28
      Illumina with Dragen GRCh38 | 34 | 6 | 34
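
The integration steps listed in slide 31 can be illustrated with a toy sketch. This is not the GIAB pipeline: it assumes single positions instead of BED intervals, treats a callable position with no call as homozygous reference, and skips the one-class outlier model entirely, so any variant seen by only one technology is simply left out of the benchmark regions rather than arbitrated. The names `Callset`, `integrate`, and the example data are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set, Tuple

HOM_REF = "0/0"  # assumption: callable with no PASS call means homozygous reference


class Callset(NamedTuple):
    technology: str         # e.g. "Illumina", "PacBio CCS"
    calls: Dict[int, str]   # position -> genotype of PASS variants
    callable_pos: Set[int]  # positions where this method is trusted


def integrate(callsets: List[Callset]) -> Tuple[Dict[int, str], Set[int]]:
    """Return (benchmark calls, benchmark regions) for toy position-based input."""
    all_callable: Set[int] = set().union(*(c.callable_pos for c in callsets))
    benchmark_calls: Dict[int, str] = {}
    uncertain: Set[int] = set()

    for pos in sorted(all_callable):
        # Collect the genotype reported by every method that is callable here.
        votes: Dict[str, Set[str]] = {}
        for c in callsets:
            if pos in c.callable_pos:
                gt = c.calls.get(pos, HOM_REF)
                votes.setdefault(gt, set()).add(c.technology)

        if len(votes) > 1:
            uncertain.add(pos)  # methods disagree: discordant, unresolved here
            continue
        gt, techs = next(iter(votes.items()))
        if gt == HOM_REF:
            continue  # concordant reference site stays in the regions
        if len(techs) >= 2:
            benchmark_calls[pos] = gt  # consensus: 2+ technologies, none disagree
        else:
            uncertain.add(pos)  # single technology: would need arbitration

    # Benchmark regions: union of callable regions minus uncertain sites.
    return benchmark_calls, all_callable - uncertain


callsets = [
    Callset("Illumina", {10: "0/1", 20: "1/1", 30: "0/1"}, {10, 20, 30, 40}),
    Callset("PacBio CCS", {10: "0/1", 20: "0/1", 50: "1/1"}, {10, 20, 40, 50}),
]
benchmark_calls, benchmark_regions = integrate(callsets)
print(benchmark_calls)            # {10: '0/1'}
print(sorted(benchmark_regions))  # [10, 40]
```

In this example position 10 is concordant, position 20 is discordant, positions 30 and 50 are each supported by one technology only, and position 40 is a concordant reference site, so only positions 10 and 40 remain in the toy benchmark regions.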

Editor's notes

  1. Exclude tandem repeats approximately larger than the read length for each method. Homopolymers are excluded from 10x and PacBio CCS. Really long homopolymers are only included for GATK-based calls from PCR-free data, because the GATK gVCF has a low genotype quality score if no reads totally encompass the homopolymer. Trust homopolymers most from PCR-free short reads.
  2. Ongoing work includes checking if many are in regions that might be in potential CNVs as they could be errors in v3.3.2
  3. False negatives (FN): variants present in the truth set, but missed in the query.
  4. 3_79181930 Add this from what lindsey sent on slack
  5. Combine GRCh37 and GRCh38
  6. Left is an inversion. Right is likely a LINE-mediated inversion. If an inversion is near repetitive elements, then exclude the repetitive elements as well. Show just two LINEs and the inversion they flank.
  7. Left is likely a tandem duplication, large insertion, or complex insertion. Right is an inversion followed by a deletion that is in the SV benchmark, likely a complex SV.
  8. Update this table – Includes Billy’s new results 10x-Aquila_37 16 24 16 10x-Aquila_38 22 18 17