Di 2011 houston
1. Extending the Grammar of Graphics to Genomic Data
Tengfei Yin, Di Cook
Iowa State University
Michael Lawrence
Genentech
Interface 2012, Rice University
[Title-slide figures: ggbio circular view of chromosomes 1 to 22 with rearrangements (interchromosomal, intrachromosomal) and tumreads; gene model along hg19::chrX colored by strand; karyogram of chromosomes 1 to 22, X, Y colored by gieStain (acen, gneg, gpos100, gpos25, gpos50, gpos75, gvar, stalk); Stepping and Coverage tracks]
2. Motivation
Lots of tools exist for displaying genomic data
Many different packages, many standalone, many different data standards
ggbio - Genomic Data Vis - Interface 2012, Rice University 2 /31
10. Motivation
Circos
Need to construct a central configuration file and many others from scratch; the learning curve is very steep
Adding a legend is not easy
Cannot map aesthetics to certain variables
13. Motivation
UCSC Genome Browser
Karyogram view, with associated data plotted
17. Motivation
UCSC Genome Browser
Logical zoom, all we know about this genetic code
20. Motivation
UCSC Genome Browser
Very commonly used, very popular
Gives a broadly applicable and generic, but narrow, selection of plot choices
No operations on views of genomic ranges to facilitate perception of structure
22. Motivation
Gviz (Hahne et al)
[Figure: Gviz data tracks along Chromosome X, 65.93 mb to 65.97 mb]
24. Motivation
Gviz (Hahne et al)
Pretty good!
Integrated with R and R data structures
Uses grid (low-level) graphics: very flexible, but does not leverage tools like ggplot2
25. Outline
What is the grammar of graphics?
How it is extended for genomic data.
Examples
Next steps: interactive graphics
26. Grammar
Grammar forms the foundation of a language. It is a set of structural rules that govern composition.
For graphics, it provides a way to construct a plot in a common form, and enables clarification of similarities and differences between plots.
33. Grammar (ggplot2)
Bar chart
ggplot(data=tips,
       aes(x=day, fill=day)) +
  geom_bar(width=1)
[Figure: bar chart of count by day (Thu, Fri, Sat, Sun), filled by day]
Rose plot/Coxcomb (labeled "Pie chart" on the earlier builds of this slide)
ggplot(data=tips,
       aes(x=day, fill=day)) +
  geom_bar(width=1) +
  coord_polar()
[Figure: the same bars drawn in polar coordinates, one wedge per day]
37. Grammar (ggplot2)
Stacked bar chart
ggplot(data=tips,
       aes(x="", fill=day)) +
  geom_bar(width=1)
[Figure: a single stacked bar of count (0 to 200), filled by day (Thu, Fri, Sat, Sun)]
Pie chart
ggplot(data=tips,
       aes(x="", fill=day)) +
  geom_bar(width=1) +
  coord_polar(theta="y")
[Figure: pie chart of count, one slice per day]
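The point of these slides is that the charts differ only in their coordinate system. The following is a minimal sketch of that idea, not the speakers' own script: the `tips` data frame here is a small hypothetical stand-in built inline (the slides' data set is not reproduced), and the object names `stacked`, `pie`, `bars` and `rose` are illustrative.

```r
library(ggplot2)

# Hypothetical stand-in for the slides' `tips` data: one row per meal.
tips <- data.frame(
  day = factor(rep(c("Thu", "Fri", "Sat", "Sun"), times = c(6, 2, 9, 7)),
               levels = c("Thu", "Fri", "Sat", "Sun"))
)

# One stacked bar: every row shares the same x position and is filled by day.
stacked <- ggplot(data = tips, aes(x = "", fill = day)) +
  geom_bar(width = 1)

# Same data, stat, geom and scale; only the coordinate system changes.
# Mapping the stacked counts to the angle turns the bar into a pie chart.
pie <- stacked + coord_polar(theta = "y")

# Bars side by side, then the same bars in polar coordinates (a rose plot).
bars <- ggplot(data = tips, aes(x = day, fill = day)) +
  geom_bar(width = 1)
rose <- bars + coord_polar()
```

Printing `stacked`, `pie`, `bars` or `rose` at the console draws each plot.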
38. Grammar Elements
DATA: What is to be plotted
STAT: Statistical operations to perform on the data, like binning
GEOM: Geometric objects, the elements used to display aspects of the data
SCALE: Maps data values to the aesthetics of a geom
COORD: Coordinate system to use, e.g. Cartesian
(FACET): Subset the data and display each subset
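As a rough illustration of how these elements appear in a single ggplot2 call, the sketch below labels each one in a comment. It uses the `mtcars` data set that ships with R, which is an assumption made for the example and not data from the talk.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                       # DATA: what is plotted
            aes(x = mpg, fill = factor(cyl))) +  # aesthetic mappings
  stat_bin(geom = "bar", binwidth = 5) +         # STAT: binning; GEOM: bars
  scale_fill_grey(name = "cyl") +                # SCALE: values to fill colors
  coord_cartesian() +                            # COORD: Cartesian system
  facet_wrap(~ am)                               # FACET: one panel per subset
```

Swapping any one line (for example `coord_cartesian()` for `coord_polar()`) changes that component while leaving the rest of the specification untouched.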
52. Example: MA plot
qplot(baseMean, log2FoldChange,
      data = res, geom = "point",
      xlab = "Normalized mean",
      ylab = "log2 fold change",
      xlim = c(0, 10000),
      color = group) +
  scale_x_log10() +
  scale_color_manual(
    values = c("black", "red"))
53. What’s different?
Genomic data has interval context
Several common geoms used in standard plots are not in the current grammar
Additional transformations are common
Lining up of multiple data plots, especially against the genome
54. What’s different?
DATA: Genomic ranges
No   seqnames  ranges                strand  tx_id  exon_id
1    chrX      [48242968, 48243005]  +       35775  132624
2    chrX      [48243475, 48243563]  +       35775  132625
3    chrX      [48244003, 48244117]  +       35775  132626
4    chrX      [48244794, 48244889]  +       35775  132627
5    chrX      [48246753, 48246802]  +       35775  132628
...  ...       ...                   ...     ...    ...
26   chrX      [48270193, 48270307]  -       35778  132637
27   chrX      [48269421, 48269516]  -       35778  132636
28   chrX      [48267508, 48267557]  -       35778  132635
29   chrX      [48262894, 48262998]  -       35778  132633
30   chrX      [48261524, 48262111]  -       35778  132632
Table 2: Typical biological data coerced into a data frame: a GRanges table representing genes SSX4 and SSX4B. One row represents one exon, seqnames indicates the chromosome name, ranges indicates the interval of exons, strand is the direction, and tx_id and exon_id are the internal IDs used for mapping across databases.
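To make the interval context concrete, here is a small sketch that builds the first three rows of the table as a GRanges object with the Bioconductor GenomicRanges package. The object names `gr` and `df` are illustrative, and only those three exons are included.

```r
library(GenomicRanges)

# First three exons from the table above, all on chrX, "+" strand.
gr <- GRanges(
  seqnames = rep("chrX", 3),
  ranges   = IRanges(start = c(48242968, 48243475, 48244003),
                     end   = c(48243005, 48243563, 48244117)),
  strand   = rep("+", 3),
  tx_id    = rep(35775L, 3),                 # metadata column
  exon_id  = c(132624L, 132625L, 132626L)    # metadata column
)

# Interval-aware accessors that a plain data frame does not provide.
width(gr)   # exon lengths, computed as end - start + 1

# Coercion to a data frame yields one row per exon.
df <- as.data.frame(gr)
```

The coerced data frame has columns `seqnames`, `start`, `end`, `width` and `strand`, followed by the metadata columns.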
55. Extensions
[Diagram: data source(s); I/O packages in Bioconductor; abstract data (formal model) and meta data; grammar of graphics with extensions (autoplot, geom, stat, scale, coord, facet, layout); plots and tracks]
56. Extensions
... current software. Development of new visualization tools should be independently factorized into components of the grammar. Table 1 describes the extensions developed in this work.
Comp      name            usage
geom      geom_rect       rectangle
          geom_segment    segment
          geom_chevron    chevron
          geom_arrow      arrow
          geom_arch       arches
          geom_bar        bar
          geom_alignment  alignment (gene)
stat      stat_coverage   coverage (of reads)
          stat_mismatch   mismatch pileup for alignments
          stat_aggregate  aggregate in sliding window
          stat_stepping   avoid overplotting
          stat_gene       consider gene structure
          stat_table      tabulate ranges
          stat_identity   no change
coord     linear          ggplot2 linear but facet by chromosome
          genome          put everything on genomic coordinates
          truncate gaps   compact view by shrinking gaps
layout    track           stacked tracks
          karyogram       karyogram display
          circle          circular
faceting  formula         facet by formula
          ranges          facet by ranges
scale     not extended    ggplot2 default
59. Extensions
autoplot
Tries, and does a jolly good job, of recognizing the data object to be plotted, and how it should be displayed.
60. Example
[Figure: gene model along hg19::chrX, 48245000 to 48270000, drawn in two stepping rows (1, 2) and colored by strand (+, −)]
DATA=GRangesList Object
GEOM=alignment, chevron
SCALE=stepping, color=strand
LAYOUT=linear
69. Examples
p1 <- autoplot(gr)
p2 <- autoplot(gr,
               stat = "coverage")
tracks(p1, p2)
Examine short reads
Stack them (top)
Collapse into “density” (bottom)
[Figure: two tracks over positions 0 to 300: Stepping (stacked reads, top) and Coverage (0 to 15, bottom)]
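The slide does not show how `gr` was built. The sketch below fills that in with simulated reads so the three calls can be tried end to end; the read count, read width, position range and track labels are assumptions chosen only to resemble the figure, and the code needs the Bioconductor packages GenomicRanges and ggbio.

```r
library(GenomicRanges)
library(ggbio)

set.seed(1)  # reproducible toy data

# Hypothetical short reads: 100 reads of width 30 on one chromosome.
gr <- GRanges(
  seqnames = rep("chr1", 100),
  ranges   = IRanges(start = sample(1:270, 100, replace = TRUE), width = 30)
)

p1 <- autoplot(gr)                     # reads stacked to avoid overplotting
p2 <- autoplot(gr, stat = "coverage")  # reads collapsed into per-base depth
tracks(Stepping = p1, Coverage = p2)   # align both plots on the same x axis

# The quantity summarized in the bottom track can also be computed directly.
cov <- coverage(gr)[["chr1"]]
```

Summing `cov` over all positions gives the total number of bases covered by reads, which is a quick sanity check on the coverage track.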
70. Examples
p1 <- autoplot(txdb,
               which = genesymbol["A"])
p2 <- autoplot(txdb,
               which = genesymbol["A"],
               stat = "reduce")
tracks(p1, p2,
       heights = c(4, 1))
Compare transcripts
Reduce all to one
[Figure: transcripts uc002dwc.3(226), uc010veg.2(226), uc002dwa.4(226), uc002dvz.3(226), uc002dvw.3(226), uc002dvx.3(226) and uc010bzo.2(226) over positions 30065000 to 30080000, with the reduced model in a track below]