vg: the variation graph toolkit
Eric Dawson
October 2016
@erictdawson
1
Variation graphs
"Variation graphs provide a succinct encoding of the sequences of many genomes.
A variation graph (in particular as implemented in vg) is composed of:

• nodes, which are labeled by sequences and ids
• edges, which connect two nodes via either of their respective ends
• paths, describe genomes, sequence alignments, and annotations (such as gene models and transcripts) as walks through nodes connected by edges"*
*From the vg wiki
Variation graphs allow us to map directly against known (or
proposed) variation.
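The node / edge / path description above can be made concrete with a small sketch. This is a toy model for illustration only, not vg's own data structures: the class name `ToyVariationGraph` and its methods are hypothetical, and the node labels are borrowed from the example graph shown later in the talk. Edges record which end of each node they attach to, which is what lets the same graph express a deletion (skip a node) or an inversion (enter a node through its end and read it reverse-complemented).

```python
from dataclasses import dataclass, field

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class ToyVariationGraph:
    """Hypothetical toy model; vg's own data structures are not shown here."""
    nodes: dict = field(default_factory=dict)   # node id -> sequence
    edges: set = field(default_factory=set)     # (from_id, from_start, to_id, to_end)
    paths: dict = field(default_factory=dict)   # name -> [(node_id, is_reverse), ...]

    def add_edge(self, from_id, to_id, from_start=False, to_end=False):
        # By default an edge leaves the end of from_id and enters the start of to_id.
        self.edges.add((from_id, from_start, to_id, to_end))

    def connected(self, step_a, step_b):
        """True if an edge permits walking from step_a directly to step_b."""
        (a, a_rev), (b, b_rev) = step_a, step_b
        forward = (a, a_rev, b, b_rev)
        backward = (b, not b_rev, a, not a_rev)  # same edge, written from the other side
        return forward in self.edges or backward in self.edges

    def path_sequence(self, name):
        """Spell out the sequence of a named path, checking every step."""
        steps = self.paths[name]
        for step_a, step_b in zip(steps, steps[1:]):
            if not self.connected(step_a, step_b):
                raise ValueError(f"no edge between {step_a} and {step_b}")
        return "".join(
            reverse_complement(self.nodes[n]) if rev else self.nodes[n]
            for n, rev in steps
        )


graph = ToyVariationGraph()
graph.nodes = {1: "CAAATAAG", 9: "GC", 11: "TTGGAAATT"}
graph.add_edge(1, 9)
graph.add_edge(9, 11)
graph.add_edge(1, 11)                # deletion: skip node 9
graph.add_edge(1, 11, to_end=True)   # inversion: enter node 11 through its end
graph.paths["ref"] = [(1, False), (9, False), (11, False)]
graph.paths["del"] = [(1, False), (11, False)]
graph.paths["inv"] = [(1, False), (11, True)]
```

Calling `graph.path_sequence("ref")` walks all three nodes, while the "del" and "inv" paths reuse the same nodes through different edges.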
2
Variation graphs
GRCh38 alts in B-3106 from human MHC
(@erikgarrison)
3
Variation graphs are pangenomes
“Computational Pan-Genomics: Status,
Promises and Challenges.”
Computational Pan-Genomics Consortium.
Briefings in Bioinformatics (2016) in press
Pangenomes should fulfill a
number of basic functions.
Erik Garrison, A toolkit for practical pangenomics, ECCB 2016
4
Variation graphs are pangenomes
We've implemented most
of these operations in vg.
github.com/vgteam/vg
Erik Garrison, A toolkit for practical pangenomics, ECCB 2016
5
Constructing graphs
construct - build graphs from paired FASTA/VCF files.
msga - use progressive assembly to generate a graph from input
sequences.
an example of a graph made with MSGA
@erikgarrison
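As a rough illustration of what building a graph from a reference plus variants involves, here is a minimal sketch. It is not how `vg construct` is implemented: the function `toy_construct` is hypothetical, it takes an in-memory string and a list of (position, ref allele, alt allele) tuples instead of FASTA/VCF files, and it assumes the variants do not overlap.

```python
def toy_construct(reference, variants):
    """Build a toy graph from a reference string and non-overlapping variants.

    variants: list of (pos, ref_allele, alt_allele) with 0-based positions.
    Returns (nodes, edges): nodes maps id -> sequence, edges is a set of
    (from_id, to_id) pairs connecting node ends to node starts.
    """
    nodes, edges = {}, set()
    frontier = []   # nodes whose ends still need to be connected onward
    cursor = 0      # next reference position not yet placed in a node

    def new_node(seq):
        node_id = len(nodes) + 1
        nodes[node_id] = seq
        for source in frontier:
            edges.add((source, node_id))
        return node_id

    for pos, ref, alt in sorted(variants):
        assert reference[pos:pos + len(ref)] == ref, "ref allele mismatch"
        if pos > cursor:
            # Shared flanking sequence before the variant site.
            frontier = [new_node(reference[cursor:pos])]
        branch = []
        for allele in (ref, alt):
            if allele:
                branch.append(new_node(allele))
            else:
                # Empty allele: the flank connects straight past the site.
                branch.extend(frontier)
        frontier = branch
        cursor = pos + len(ref)

    if cursor < len(reference):
        new_node(reference[cursor:])
    return nodes, edges


snp_nodes, snp_edges = toy_construct("ACGT", [(1, "C", "T")])
del_nodes, del_edges = toy_construct("CAAATAAGGCTTGGAAATT", [(8, "GC", "")])
```

The SNP produces a bubble of two single-base nodes between shared flanks; the deletion produces an extra edge that bypasses the deleted node.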
6
Graph modification
mod - prune and normalize graphs, shorten nodes, and much more.
circularize - circularize paths in the graph.
ids - coordinate the ID spaces of multiple subgraphs.
Circular HPV variation graph
@erictdawson & Sarah Wagner (NCI)
7
Indexing variation graphs
The core, mutable VG data structures support graph manipulation,
queries, and alignment, but they are not scalable.
Thus...
XG - immutable bidirectional graph index
gPBWT - graph-generalized positional BWT (Adam Novak)
GCSA2 - path index queries for variation and de Bruijn graphs (Jouni
Sirén)
These structures enable mapping at whole-genome scale.
9
Mapping to a variation graph
vg can operate on arbitrary sequence graphs. All other graph-based
resequencing implementations (GRAL, BWBBLE, vBWT, GCSAv1)
require a DAG.
Local alignment to cyclic graphs is provided by unrolling:
10
Exact-match guided alignment in vg
@erikgarrison
Obtain MEMs from GCSA2, cluster MEMs with xg's positional index,
then fully resolve alignment for non-matching portions using dynamic
programming.
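The seed, cluster, and resolve steps described above can be sketched on a linear reference. This is a simplified stand-in, not vg's mapper: a brute-force scan replaces the GCSA2 index, grouping by alignment diagonal replaces clustering with xg's positional index, and a plain edit-distance table stands in for the dynamic programming step. All function names are hypothetical, and the chain is assumed to be colinear and non-overlapping.

```python
def find_mems(read, ref, min_len=4):
    """Brute-force maximal exact matches as (read_pos, ref_pos, length)."""
    mems = []
    for i in range(len(read)):
        for j in range(len(ref)):
            if read[i] != ref[j]:
                continue
            if i > 0 and j > 0 and read[i - 1] == ref[j - 1]:
                continue  # extendable to the left, so not maximal
            k = 0
            while i + k < len(read) and j + k < len(ref) and read[i + k] == ref[j + k]:
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems


def cluster_by_diagonal(mems, max_shift=2):
    """Group MEMs whose diagonals (ref_pos - read_pos) are close together."""
    clusters = []
    for mem in sorted(mems, key=lambda m: (m[1] - m[0], m[0])):
        diagonal = mem[1] - mem[0]
        if clusters:
            last = clusters[-1][-1]
            if diagonal - (last[1] - last[0]) <= max_shift:
                clusters[-1].append(mem)
                continue
        clusters.append([mem])
    return clusters


def edit_distance(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def gap_edits(read, ref, chain):
    """Sum edit distances over the unmatched stretches between consecutive MEMs."""
    chain = sorted(chain)
    total = 0
    for (i1, j1, k1), (i2, j2, _) in zip(chain, chain[1:]):
        total += edit_distance(read[i1 + k1:i2], ref[j1 + k1:j2])
    return total


ref = "GATTACAGGCCTTAACG"
read = "TTACAGGACTTAAC"   # ref[2:16] with one substitution
mems = find_mems(read, ref, min_len=4)
clusters = cluster_by_diagonal(mems)
best = max(clusters, key=lambda c: sum(m[2] for m in c))
edits = gap_edits(read, ref, best)
```

Only the single mismatching base between the two exact matches reaches the dynamic programming step, which is the point of seeding with exact matches first.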
12
Mapping to a variation graph
1 billion reads / 32 vCPUs / 30 hours
13
Small variant calling in vg
call - bubble-informed, pileup-based caller
genotype - FreeBayes-style genotyping using graph augmentation and superbubble detection
14
Genotyping
Genotyping and MSGA produce identical results (as expected).
15
Interchange with other programs
vectorize - export Alignments as Vowpal-Wabbit vectors for ML
view - GFA/JSON/DOT for many graph entities
1.0 1 ref_1A | ref 1 1 0 1 0 1 0 1
16
Other functions
locify (beta) - extract relevant info for external phasing.
deconstruct (beta) - extract an input VCF from the graph.
sim - simulate reads and exact alignments from the graph.
stats - print relevant graph properties.
surject - push graph alignments to BAM space.
sift/scrub (beta) - filter / select alignments by mapping properties.
translate - lift graph coordinates between graphs.
17
Structural variation in variation graphs
[Figure: example variation graph. Node labels (id:sequence): 1:CAAATAAG, 9:GC, 11:TTGGAAATT, 20:TTCTGGA, 27:GTT, 30:CTAT, 34:TATATTCCAACTCTCTG]
18
Structural variation
Loosely defined as changes to the genomic sequence >50bp in length.
1. Balanced events
   1. Inversions
   2. Translocations
2. Unbalanced events
   1. Insertions
   2. Deletions
   3. Duplications
3. "Complex" (not shown)
   - Multiple events occurring in tandem, specific time series of events, etc.
19
Why are we talking about SVs?
1. Evidence for structural variation in the NCI Chernobyl
study.
• Matched tumor/normal samples from >400 individuals exposed to ¹³¹I post-Chernobyl who subsequently developed papillary thyroid carcinoma.
2. Variation graphs provide novel ways to locate and score
SV calls.
• Graphs are mutable - candidate variation can be inserted, mapped
against, and refined.
20
Evidence of SV in Chernobyl data
*From Stephen Chanock, IARC 2016
21
Evidence of SV in Chernobyl data
Median and (total) Lumpy calls

Sample (n)      DEL               INV              Other (excl. BND)
Tumor (12)      9074 (187,150)    3925 (53,643)    2185 (25,182)
Normal (11*)    5059 (60,508)     378 (9,494)      1234 (13,682)
Blood (12)      6634 (88,985)     98 (1,195)       1247 (14,181)

Median and total call numbers (as well as analysis by others)
indicate we might expect a high burden of deletions and
inversions in tumours from our dataset (relative to normals)
after normalization and QC.
*No A90N normal tissue sample; A90G metastasis sequenced instead
23
Representing SVs in variation graphs
[Figure: example variation graph. Node labels (id:sequence): 1:CAAATAAG, 9:GC, 11:TTGGAAATT, 20:TTCTGGA, 27:GTT, 30:CTAT, 34:TATATTCCAACTCTCTG. Edge labels: DEL, DEL, INV.]
[Figure: Chr22 (hg38) + all well-defined DEL in COSMIC. Node ID labels omitted.]
github.com/edawson/lasso
24
Why use a variation graph and not [your favorite SV caller here]?
25
Read alignment is Bayesian
Our best alignment gets the
highest posterior score
The reference
genome
The read from the sequencer
26
@erikgarrison
Detecting SVs with vg
[Figure labels: graph, reads, sample. Read signals shown: mapq, soft clipping, mate orientation, fragment length, path divergence, unmapped reads. Legend: informative for SV type / uninformative.]
We're stuck with our reads, but with variation graphs we can sample from possible reference priors to maximize:
P(reference | reads)
We can also refine our breakpoints using this approach.
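One way to read the P(reference | reads) objective is as Bayes' rule applied over a set of candidate graphs: P(reference | reads) is proportional to P(reads | reference) × P(reference). The sketch below normalizes that product in log space. It is an illustration under stated assumptions, not vg's scoring code: the function name, the candidate names, and every number are hypothetical, and the log-likelihoods are assumed to come from remapping the reads against each candidate.

```python
import math


def posterior_over_references(log_likelihoods, log_priors):
    """Return {name: P(reference | reads)} from per-candidate log terms.

    log_likelihoods: {name: log P(reads | reference)}
    log_priors:      {name: log P(reference)}
    """
    log_joint = {name: log_likelihoods[name] + log_priors[name]
                 for name in log_likelihoods}
    # Log-sum-exp keeps the normalization stable for very negative values.
    peak = max(log_joint.values())
    log_evidence = peak + math.log(
        sum(math.exp(value - peak) for value in log_joint.values())
    )
    return {name: math.exp(value - log_evidence)
            for name, value in log_joint.items()}


# Made-up numbers: the flat graph is favored a priori, but the reads fit
# the graph containing the candidate deletion much better.
posterior = posterior_over_references(
    log_likelihoods={"flat": -120.0, "with_deletion": -100.0},
    log_priors={"flat": math.log(0.9), "with_deletion": math.log(0.1)},
)
best_reference = max(posterior, key=posterior.get)
```

With these illustrative values the candidate containing the deletion takes almost all of the posterior mass despite its lower prior.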
27
Detecting deletions with vg
1000 simulated reads with a 1 kb deletion, mapped to an 8 kb flat HPV genome graph.
[Figure: read alignments across graph nodes 1-80; the 1 kb deletion is marked.]
28
Detecting deletions with vg
[Figure: simulated read alignments across graph nodes 1-80, with a second panel covering nodes 1-32; the 1 kb deletion is marked. Annotated read signatures:]
• Stacked soft clips
• Long fragment length
• Low mapping depth
29
SV realignment / calling in vg
Arbitrary genome graph
1. Map our reads
2. Collect read signatures
   - soft clipped reads
   - discordant fragment length
3. Create candidate alleles based on signatures and add them to graph
4. Remap our reads, score the new alignments, and repeat if necessary
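Step 2 above (collecting read signatures) might look something like the following sketch. It assumes the alignments have already been converted to BAM-like records with SAM-style CIGAR strings and a fragment length; the record layout, function names, and thresholds are all hypothetical and are not taken from the vg pipeline.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clip lengths from a SAM-style CIGAR string."""
    # Hard clips sit outside soft clips, so drop them before inspecting the ends.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def collect_signatures(reads, expected_mean, expected_sd, min_clip=10, n_sd=3.0):
    """Map read name -> list of SV signatures seen on that read."""
    signatures = {}
    for read in reads:
        found = []
        if max(soft_clip_lengths(read["cigar"])) >= min_clip:
            found.append("soft_clip")
        if abs(read["fragment_length"] - expected_mean) > n_sd * expected_sd:
            found.append("discordant_fragment_length")
        if found:
            signatures[read["name"]] = found
    return signatures


# Illustrative records only: r2 is clipped at a breakpoint, r3 spans a
# roughly 1 kb deletion and so appears about 1 kb longer than expected.
reads = [
    {"name": "r1", "cigar": "100M", "fragment_length": 350},
    {"name": "r2", "cigar": "60M40S", "fragment_length": 340},
    {"name": "r3", "cigar": "100M", "fragment_length": 1350},
]
signatures = collect_signatures(reads, expected_mean=350, expected_sd=30)
```

Reads that carry a signature would then seed the candidate alleles in step 3; reads with no signature (r1 here) are left alone.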
34
Early results - small variants
We submitted the first whole human genome analysis using variation graph
reference methods as part of the PrecisionFDA resequencing competition.
(May 2016.)
We did not win... but we did get a star for:
We're now approaching 99% F-score with vg call.
Erik Garrison, A toolkit for practical pangenomics, ECCB 2016
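For reference, the F-score quoted above is usually the harmonic mean of precision and recall against a truth set. The sketch below shows that arithmetic; the function name and the counts are made up for illustration and are not the PrecisionFDA results.

```python
def f_score(true_pos, false_pos, false_neg):
    """F1: harmonic mean of precision and recall (counts must be non-zero)."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 990 calls match the truth set, 10 are spurious,
# and 10 true variants are missed.
example = f_score(true_pos=990, false_pos=10, false_neg=10)
```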
35
Progress - structural variants
Pipeline still under development:
• Deletions in time for the holidays.
• Inversions, insertions, and duplications to follow.
36
Opportunities for improvement
• Variant calling at all scales remains unsolved.
• Haplotype phasing on the graph is unexplored.
• Currently no local assembly functionality.
  - Exploring fermi-lite, but by no means finished.
• Few downstream analysis options.
  - Apps can easily operate on vg structures, JSON, GFA, etc.
  - Additional user feedback / requests would be useful.
and many more...
37
Where do I see vg making a difference in my work?
• Viral coinfection detection / classification
• Multiclonal tumors
• SV breakpoint refinement
• Tumor / normal specific graphs
@erictdawson
38
Acknowledgements
Many thanks to:
Richard Durbin
Stephen Chanock
Erik Garrison
Adam Novak
Benedict Paten
Glenn Hickey
Jordan Eizenga
Jerven Bolleman
Maciek Smuga
Mike Lin
... and many others!
Thank y'all for listening!
@erictdawson
39

Variation Graphs and Structural Variation

  • 1. vg: the variation graph toolkit. Eric Dawson, October 2016, @erictdawson 1
  • 2. Variation graphs "Variation graphs provide a succinct encoding of the sequences of many genomes. A variation graph (in particular as implemented in vg) is composed of: • nodes, which are labeled by sequences and ids • edges, which connect two nodes via either of their respective ends • paths, describe genomes, sequence alignments, and annotations (such as gene models and transcripts) as walks through nodes connected by edges"* *From the vg wiki Variation graphs allow us to map directly against known (or proposed) variation. 2
  • 3. Variation graphs GRCh38 alts in B-3106 from human MHC (@erikgarrison) 3
  • 4. Variation graphs are pangenomes “Computational Pan-Genomics: Status, Promises and Challenges.” Computational Pan-Genomics Consortium. Briefings in Bioinformatics (2016) in press Pangenomes should fulfill a number of basic functions. Erik Garrison, A toolkit for practical pangenomics, ECCB 2016 4
  • 5. Variation graphs are pangenomes We've implemented most of these operations in vg. github.com/vgteam/vg Erik Garrison, A toolkit for practical pangenomics, ECCB 2016 5
  • 6. Constructing graphs construct - build graphs from paired FASTA/VCF files. msga - use progressive assembly to generate a graph from input sequences. an example of a graph made with MSGA @erikgarrison 6
  • 7. Graph modification mod - prune and normalize graphs, shorten nodes, and much more. circularize - circularize paths in the graph. ids - coordinate the ID spaces of multiple subgraphs. Circular HPV variation graph @erictdawson & Sarah Wagner (NCI) 7
  • 8. Indexing variation graphs The core, mutable VG data structures support graph manipulation, queries, and alignment, but they are not scalable. Thus... XG - immutable bidirectional graph index gPBWT - graph-generalized positional BWT (Adam Novak) GCSA2 - path index queries for variation and de Bruijn graphs (Jouni Sirén) 8
  • 9. Indexing variation graphs The core, mutable VG data structures support graph manipulation, queries, and alignment, but they are not scalable. Thus... XG - immutable bidirectional graph index gPBWT - graph-generalized positional BWT (Adam Novak) GCSA2 - path index queries for variation and de Bruijn graphs (Jouni Sirén) These structures enable mapping at whole-genome scale. 9
  • 10. Mapping to a variation graph vg can operate on arbitrary sequence graphs. All other graph-based resequencing implementations (GRAL, BWBBLE, vBWT, GCSAv1) require a DAG. Local alignment to cyclic graphs is provided by unrolling: 10
  • 11. Exact-match guided alignment in vg @erikgarrison Obtain MEMs from GCSA2, cluster MEMs with xg's positional index, then fully resolve alignment for non-matching portions using dynamic programming. 11
  • 12. Obtain MEMs from GCSA2, cluster MEMs with xg's positional index, then fully resolve alignment for non-matching portions using dynamic programming. Exact-match guided alignment in vg @erikgarrison 12
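The seeding step described on slides 11 and 12 can be illustrated on plain strings. This is a toy sketch only, not vg's implementation: it finds maximal exact matches (MEMs) by brute force against a linear reference rather than querying GCSA2 on a graph, and it stops before the clustering and dynamic-programming steps. `find_mems`, `min_len`, and the example sequences are hypothetical.

```python
def find_mems(read, ref, min_len=3):
    """Return (read_start, ref_start, length) for each maximal exact match.

    A match is maximal when it cannot be extended to the left or to the
    right without hitting a mismatch or the end of either string.
    """
    mems = []
    for i in range(len(read)):
        for j in range(len(ref)):
            if read[i] != ref[j]:
                continue
            # Skip matches that could still be extended to the left.
            if i > 0 and j > 0 and read[i - 1] == ref[j - 1]:
                continue
            # Extend to the right for as long as the characters agree.
            k = 0
            while (i + k < len(read) and j + k < len(ref)
                   and read[i + k] == ref[j + k]):
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems


ref = "CAAATAAGGCTTGGAAATT"
read = "AAGGCTAGGAAA"  # ref[5:11] + one mismatching base + ref[12:17]

mems = find_mems(read, ref)
for read_start, ref_start, length in mems:
    print(read_start, ref_start, read[read_start:read_start + length])
```

The two long matches on either side of the mismatching base are the seeds; in the approach the slides describe, only the unmatched portion between them would need full dynamic programming.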
  • 13. Mapping to a variation graph 1 billion reads / 32 vCPUs / 30 hours 13
  • 14. Small variant calling in vg call - bubble informed pileup-based caller genotype - FreeBayes style genotyping using graph augmentation and superbubble detection 14
  • 15. Genotyping Genotyping and MSGA produce identical results (as expected). 15
  • 16. Interchange with other programs vectorize - export Alignments as Vowpal-Wabbit vectors for ML. view - GFA/JSON/DOT for many graph entities. [Example output: 1.0 1 ref_1A | ref 1 1 0 1 0 1 0 1] 16
  • 17. Other functions locify(beta) - extract relevant info for external phasing. deconstruct(beta) - extract an input VCF from the graph. sim - simulate reads and exact alignments from the graph. stats - print relevant graph properties. surject - push graph alignments to BAM space. sift/scrub(beta) - filter / select alignments by mapping properties. translate - lift graph coordinates between graphs. 17
  • 18. Structural variation in variation graphs [Figure: graph with nodes 1:CAAATAAG, 9:GC, 11:TTGGAAATT, 20:TTCTGGA, 27:GTT, 30:CTAT, 34:TATATTCCAACTCTCTG] 18
  • 19. Structural variation Loosely defined as changes to the genomic sequence >50bp in length.
      1. Balanced events
         1. Inversions
         2. Translocations
      2. Unbalanced events
         1. Insertions
         2. Deletions
         3. Duplications
      3. "Complex" - not shown - Multiple events occurring in tandem, specific time series of events, etc. 19
  • 20. Why are we talking about SVs?
      1. Evidence for structural variation in the NCI Chernobyl study.
         • Matched tumor/normal samples from >400 individuals exposed to ¹³¹I post-Chernobyl who subsequently developed papillary thyroid carcinoma.
      2. Variation graphs provide novel ways to locate and score SV calls.
         • Graphs are mutable - candidate variation can be inserted, mapped against, and refined. 20
  • 21. Evidence of SV in Chernobyl data* (*From Stephen Chanock, IARC 2016) 21
  • 22. Evidence of SV in Chernobyl data* (*From Stephen Chanock, IARC 2016) 22
  • 23. Evidence of SV in Chernobyl data
      Median and (Total) Lumpy Calls
                      DEL               INV              Other (excl. BND)
      Tumor (12)      9074 (187,150)    3925 (53,643)    2185 (25,182)
      Normal (11*)    5059 (60,508)     378 (9,494)      1234 (13,682)
      Blood (12)      6634 (88,985)     98 (1,195)       1247 (14,181)
      Median and total call numbers (as well as analysis by others) indicate we might expect a high burden of deletions and inversions in tumours from our dataset (relative to normals) after normalization and QC.
      *No A90N normal tissue sample; A90G metastasis sequenced instead. 23
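The tumour-versus-normal comparison in the slide 23 table can be made concrete with a quick ratio of medians. This is a sketch on the raw median counts copied from that slide; it applies none of the normalization or QC the slide says is still required, and `median_calls` and `fold_change` are hypothetical names.

```python
# Median Lumpy calls per sample group, copied from the slide 23 table.
median_calls = {
    "Tumor": {"DEL": 9074, "INV": 3925, "Other": 2185},
    "Normal": {"DEL": 5059, "INV": 378, "Other": 1234},
    "Blood": {"DEL": 6634, "INV": 98, "Other": 1247},
}


def fold_change(sv_type, case="Tumor", control="Normal"):
    """Ratio of raw median call counts for one SV type (no QC applied)."""
    return median_calls[case][sv_type] / median_calls[control][sv_type]


for sv_type in ("DEL", "INV", "Other"):
    print(f"{sv_type}: {fold_change(sv_type):.2f}x tumor vs normal")
```

On these raw medians the inversion ratio is far larger than the deletion ratio, which is the kind of signal that would need to survive normalization before being interpreted.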
  • 24. Representing SVs in variation graphs [Figure: graph with nodes 1:CAAATAAG, 9:GC, 11:TTGGAAATT, 20:TTCTGGA, 27:GTT, 30:CTAT, 34:TATATTCCAACTCTCTG] [Figure: Chr22 (hg38) + all well-defined DEL in COSMIC; edge labels DEL, DEL, INV; node ID labels omitted] github.com/edawson/lasso 24
  • 25. Why use a variation graph and not [your favorite SV caller here]? 25
  • 26. Read alignment is Bayesian. Our best alignment gets the highest posterior score. [Figure labels: the reference genome; the read from the sequencer] @erikgarrison 26
  • 27. Detecting SVs with vg [Diagram labels: graph, mapq, soft clipping, mate orientation, reads, fragment length, path divergence, unmapped reads, sample. Legend: informative for SV type; uninformative] We're stuck with our reads, but with variation graphs we can sample from possible reference priors to maximize P(reference | reads). We can also refine our breakpoints using this approach. 27
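The quantity P(reference | reads) on slide 27 can be written out with Bayes' rule. This is a sketch under an assumption the slides do not state, namely that reads are conditionally independent given the graph. The symbols are notation introduced here: G is a candidate reference graph, R = {r_1, ..., r_n} is the read set, and \hat{G} is the graph that maximizes the posterior.

```latex
\[
P(G \mid R) \;=\; \frac{P(R \mid G)\,P(G)}{P(R)}
\;\propto\; P(G)\prod_{i=1}^{n} P(r_i \mid G),
\qquad
\hat{G} \;=\; \arg\max_{G}\; P(G)\prod_{i=1}^{n} P(r_i \mid G)
\]
```

Because P(R) does not depend on G, comparing candidate graphs (for example with and without a proposed SV inserted) only requires the prior and the per-read likelihoods.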
  • 28. Detecting deletions with vg 1000 simulated reads w/ 1kb deletion mapped to an 8kb flat HPV genome graph. [Figure: simulated read alignments drawn over graph nodes 1-80; node ID labels omitted]
3 6 5 41 40 36 37 52 53 54 57 56 52 51 50 48 49 67 68 72 71 30 29 26 27 51 52 53 57 56 55 39 40 45 44 43 53 52 50 51 74 73 68 69 70 38 39 43 42 65 66 70 69 6 5 1 2 64 6360 61 35 36 37 41 40 39 74 75 78 77 5 6 10 9 32 33 34 37 36 7 6 1 2 66 67 70 6955 56 60 59 63 64 65 68 67 47 46 41 42 43 67 68 71 70 47 46 45 42 43 1 2 6 5 10 9 5 6 10 6 7 69 70 71 74 73 73 74 78 77 50 49 45 46 2 3 7 6 50 49 46 47 24 23 22 9 10 51 52 56 55 31 32 37 36 69 68 64 65 72 73 77 76 43 44 48 47 46 28 27 24 25 35 34 31 32 79 78 74 75 52 51 47 48 59 60 63 62 67 66 61 62 32 31 27 28 50 51 54 53 44 43 39 40 34 3328 29 77 76 72 73 44 45 49 48 35 34 33 29 30 31 23 24 28 27 8 7 3 4 40 39 35 36 37 68 69 74 73 72 60 59 55 56 25 24 21 22 43 44 47 46 41 42 43 47 46 30 31 35 34 59 58 55 56 50 49 45 46 71 72 73 76 75 66 65 62 63 68 69 73 72 77 76 72 73 22 21 7 8 9 48 49 53 52 9 10 24 23 39 38 34 35 21 22 23 26 25 38 37 36 33 34 25 26 30 29 3 4 7 6 22 21 7 8 9 51 50 45 46 3 4 8 7 8 9 23 22 21 22 23 26 25 10 9 8 4 5 50 49 46 47 35 34 30 31 70 71 74 73 72 53 5248 49 60 61 65 64 5 6 9 8 7 8 22 21 51 52 56 55 61 62 66 65 73 72 69 70 69 70 75 74 38 37 34 35 51 52 53 56 55 10 24 23 57 58 62 61 55 54 53 51 52 34 3329 30 31 30 29 26 27 75 74 70 71 58 57 53 54 31 32 36 35 34 43 42 38 39 61 62 66 65 27 26 25 21 22 35 36 39 38 29 28 24 25 72 71 68 69 70 69 65 66 56 57 61 60 67 68 72 71 43 42 38 39 61 60 57 58 67 66 65 62 63 58 59 60 63 62 61 60 59 56 57 58 54 53 50 51 65 66 70 69 35 34 33 30 31 67 68 69 72 71 44 45 49 48 79 78 74 75 76 46 45 41 42 6 5 1 2 33 32 31 28 29 39 40 44 43 59 58 54 55 37 36 35 31 32 7 8 21 30 31 32 36 35 44 45 49 48 47 69 68 65 66 3 4 9 8 7 24 25 29 28 7 6 5 2 3 4 21 22 27 26 25 77 76 73 74 75 43 42 41 38 39 60 59 55 56 55 56 57 60 59 61 60 57 58 4 5 8 7 35 36 40 39 38 21 22 26 25 34 35 40 39 38 37 34 35 36 37 41 40 39 73 74 78 77 8 7 3 4 69 68 67 63 64 65 59 58 55 56 30 31 32 35 34 36 35 31 32 40 41 46 45 44 38 39 42 41 24 23 22 8 9 35 34 30 31 71 72 76 75 9 10 24 23 48 49 52 51 29 30 33 
32 41 40 39 35 36 26 27 33 32 8 7 3 4 27 26 22 23 51 52 56 55 46 45 41 42 33 34 38 37 7 6 5 1 2 3 30 31 36 35 28 27 26 23 24 25 63 62 58 5941 40 36 37 44 45 50 49 44 45 48 47 3 4 8 7 72 73 77 76 75 8 7 3 4 22 21 7 8 30 29 28 24 25 51 50 46 47 68 69 72 71 39 40 43 42 24 25 29 28 59 58 55 56 74 73 70 71 39 40 41 45 44 1kb 28
  • 29. Detecting deletions with vg [Figure: reads shown as walks over numbered graph nodes; scale bar 1kb.] Signatures labeled on the figure: stacked soft clips, long fragment length, low mapping depth.
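One of the three signatures named on this slide, low mapping depth, can be illustrated with a toy scan over per-base depth. This is a minimal sketch and not vg code: it assumes linear coordinates rather than graph positions, and the function name `low_depth_windows`, the window size, and the threshold fraction are hypothetical choices made only for illustration.

```python
from statistics import median


def low_depth_windows(depth, window=100, fraction=0.5):
    """Return (start, end) windows whose mean depth falls below
    `fraction` times the median of all window means.

    `depth` is a list of per-base read depths (toy input, 0-based).
    """
    if not depth:
        return []
    means = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        means.append((start, start + len(chunk), sum(chunk) / len(chunk)))
    typical = median(m for _, _, m in means)
    return [(s, e) for s, e, m in means if m < fraction * typical]


# Invented depth profile: even coverage with a 100 bp dip in the middle.
depth = [30] * 300 + [5] * 100 + [30] * 300
print(low_depth_windows(depth))  # [(300, 400)]
```

A depth dip alone is weak evidence, which is presumably why the slide pairs it with soft clips and fragment length.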
  • 30. SV realignment / calling in vg Arbitrary genome graph
  • 31. SV realignment / calling in vg 1. Map our reads
  • 32. SV realignment / calling in vg 1. Map our reads
 2. Collect read signatures: soft-clipped reads, discordant fragment length
  • 33. SV realignment / calling in vg 1. Map our reads
 2. Collect read signatures: soft-clipped reads, discordant fragment length
 3. Create candidate alleles based on signatures and add them to graph
  • 34. SV realignment / calling in vg 1. Map our reads
 2. Collect read signatures: soft-clipped reads, discordant fragment length
 3. Create candidate alleles based on signatures and add them to graph
 4. Remap our reads, score the new alignments, and repeat if necessary
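Steps 2 and 3 of this outline can be sketched on toy data. The code below is an illustration under stated assumptions and not the vg pipeline: it uses linear reference coordinates instead of graph positions, reads are plain dicts with hypothetical fields (`pos`, `cigar`, `fragment`), and the thresholds are invented. Soft clips are read from SAM-style CIGAR strings, and a candidate deletion is proposed wherever a cluster of trailing clips sits to the left of a cluster of leading clips. Adding the candidate to the graph and remapping (steps 3 and 4) are not shown.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos, cigar):
    """Return (side, reference_position) for each soft clip in one alignment.

    A trailing clip marks a possible left breakpoint, a leading clip a
    possible right breakpoint. Hard clips are ignored in this toy version.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    found = []
    if ops and ops[0][1] == "S":
        found.append(("right", pos))
    if len(ops) > 1 and ops[-1][1] == "S":
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        found.append(("left", pos + ref_len))
    return found


def candidate_deletions(reads, expected_fragment=400, tolerance=150, min_support=2):
    """Pair stacked soft-clip positions into candidate deletion intervals."""
    left, right = Counter(), Counter()
    for read in reads:
        for side, where in clip_breakpoints(read["pos"], read["cigar"]):
            (left if side == "left" else right)[where] += 1

    candidates = []
    for start, n_left in sorted(left.items()):
        for end, n_right in sorted(right.items()):
            if start >= end or min(n_left, n_right) < min_support:
                continue
            # Discordant (too long) fragments that span the whole interval.
            spanning = sum(
                1 for r in reads
                if r["fragment"] > expected_fragment + tolerance
                and r["pos"] < start and r["pos"] + r["fragment"] > end
            )
            candidates.append({"start": start, "end": end,
                               "clip_support": n_left + n_right,
                               "long_fragments": spanning})
    return candidates


# Invented reads around a 500 bp deletion at [1000, 1500).
reads = [
    {"pos": 950, "cigar": "50M50S", "fragment": 400},
    {"pos": 960, "cigar": "40M60S", "fragment": 410},
    {"pos": 1500, "cigar": "30S70M", "fragment": 390},
    {"pos": 1500, "cigar": "45S55M", "fragment": 405},
    {"pos": 800, "cigar": "100M", "fragment": 900},
    {"pos": 100, "cigar": "100M", "fragment": 400},
]
print(candidate_deletions(reads))
```

In a real graph setting each candidate would become a new edge or path and the reads would be realigned against it; here the output is only the list of proposed intervals with their supporting evidence.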
  • 35. Early results - small variants We submitted the first whole human genome analysis using variation graph reference methods as part of the PrecisionFDA resequencing competition (May 2016). We did not win... but we did get a star for: We're now approaching 99% F-score with vg call. Erik Garrison, A toolkit for practical pangenomics, ECCB 2016
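For readers unfamiliar with the metric quoted above, the F-score is the harmonic mean of precision and recall over called variants. A minimal sketch follows; the counts in the example are invented for illustration and are not results from the talk or from PrecisionFDA.

```python
def f_score(tp, fp, fn):
    """F1 score from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp)  # fraction of calls that are correct
    recall = tp / (tp + fn)     # fraction of true variants that were called
    return 2 * precision * recall / (precision + recall)


# Invented counts: 990 correct calls, 10 spurious calls, 10 missed variants.
print(round(f_score(tp=990, fp=10, fn=10), 4))
```

Because it is a harmonic mean, the score is pulled toward whichever of precision or recall is lower, so a caller cannot reach a high F-score by over-calling or under-calling.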
  • 36. Progress - structural variants Pipeline still under development: • Deletions in time for the holidays. • Inversions, insertions, and duplications to follow.
  • 37. Opportunities for improvement • Variant calling at all scales remains unsolved. • Haplotype phasing on the graph unexplored. • Currently no local assembly functionality. • Exploring fermi-lite, but by no means finished. • Few downstream analysis options. • Apps can easily operate on vg structures, JSON, GFA, etc. • Additional user feedback / requests would be useful. and many more...
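As an illustration of the point that downstream apps can work from an exchange format such as GFA, here is a minimal sketch that reads GFA 1 segment (`S`), link (`L`) and path (`P`) records and spells out the sequence of each path. The toy graph and the helper names (`parse_gfa`, `path_sequence`) are hypothetical; this is not vg's own import or export code, and it ignores optional tags and overlaps.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def parse_gfa(text):
    """Parse S, L and P lines of tab-separated GFA 1 text."""
    nodes, edges, paths = {}, [], {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":    # S <name> <sequence>
            nodes[fields[1]] = fields[2]
        elif fields[0] == "L":  # L <from> <orient> <to> <orient> <overlap>
            edges.append((fields[1], fields[2], fields[3], fields[4]))
        elif fields[0] == "P":  # P <name> <comma-separated oriented steps> <overlaps>
            paths[fields[1]] = fields[2].split(",")
    return nodes, edges, paths


def path_sequence(nodes, steps):
    """Concatenate node sequences along a path, honouring +/- orientation."""
    parts = []
    for step in steps:
        name, orient = step[:-1], step[-1]
        seq = nodes[name]
        parts.append(seq if orient == "+" else reverse_complement(seq))
    return "".join(parts)


# Invented four-node graph with one SNP bubble (node 2 vs node 3).
gfa = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tGATT",
    "S\t2\tA",
    "S\t3\tC",
    "S\t4\tACA",
    "L\t1\t+\t2\t+\t0M",
    "L\t1\t+\t3\t+\t0M",
    "L\t2\t+\t4\t+\t0M",
    "L\t3\t+\t4\t+\t0M",
    "P\tref\t1+,2+,4+\t*",
    "P\talt\t1+,3+,4+\t*",
])
nodes, edges, paths = parse_gfa(gfa)
for name, steps in paths.items():
    print(name, path_sequence(nodes, steps))
```

The two paths differ only in which node of the bubble they traverse, which mirrors the nodes, edges and paths model described at the start of the talk.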
  • 38. Where do I see vg making a difference in my work?* • Viral coinfection detection / classification • Multiclonal tumors • SV breakpoint refinement • Tumor / normal specific graphs @erictdawson
  • 39. Acknowledgements Many thanks to: Richard Durbin Stephen Chanock Erik Garrison Adam Novak Benedict Paten Glenn Hickey Jordan Eizenga Jerven Bolleman Maciek Smuga Mike Lin ... and many others! Thank y'all for listening! @erictdawson