genomation is a package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
Poster/cheatsheet for R/BioC package genomation
Summarize, annotate and visualize genomic intervals
with R/BioC package genomation
Genomic intervals
Genomic intervals are the basis of genome
annotation. Intervals can contain only
location information (TFBS locations) or can
contain a variety of scores, with different
scales, such as percent methylation, ChIP-seq
enrichment or read coverage.
Windows/regions of interest
over the genome
Summarizing genomic intervals over different
sub-genomic contexts or windows of interest
is the necessary first step in making inferences
about the biological importance of the data.
The windows can be of equal length (e.g. pre-defined
regions around TSS) or not (e.g. exons,
transcripts or CpG islands).
Extract genomic intervals for
windows of interest
Genomic intervals can be extracted for
windows of interest and stored in matrix
format. If the windows are not of equal
length, then binning strategies can be used
to create an equal number of bins over those
windows and still use a matrix to store the
information.
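The binning idea can be sketched in base R. This is only an illustration under stated assumptions: `bin_window` is a hypothetical helper (not a genomation function), the per-base scores are made up, and every window is assumed to be at least as long as the number of bins.

```r
# Hypothetical helper (not part of genomation): summarize the per-base scores
# of one window into a fixed number of bins by averaging within each bin.
# Assumes length(scores) >= n_bins so that no bin is empty.
bin_window <- function(scores, n_bins = 5) {
  # Assign each base position to one of n_bins roughly equal-sized bins
  bin_id <- cut(seq_along(scores), breaks = n_bins, labels = FALSE)
  as.numeric(tapply(scores, bin_id, mean))
}

# Three windows of unequal length with made-up per-base scores
windows <- list(
  exon1 = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),  # 10 bp
  exon2 = rep(2, 25),                       # 25 bp
  exon3 = seq(0, 1, length.out = 40)        # 40 bp
)

# Every window becomes one row with the same number of columns (bins)
binned <- t(sapply(windows, bin_window, n_bins = 5))
dim(binned)
```

Averaging is just one possible bin summary; a median, maximum or sum per bin would fit the same matrix layout.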
Visualize the summary of
genomic intervals
The matrices described above can be
visualized using heatmaps or meta-region
plots. Meta-region plots show the average
value of the signal from genomic intervals
over the windows of interest; this could be a
line plot or a heatmap where colors indicate
average values. Another type of heatmap can
be used to show the values for matrices that
contain the signal value for genomic intervals
over the windows. In these heatmaps, every
row represents a window and every column
is a base position or a bin.
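To make the relation between the matrix and the two plot types concrete, here is a minimal base R sketch. The matrix is simulated (an assumption for illustration only), and the plotting calls are plain base graphics rather than the package's own functions.

```r
set.seed(42)

# Simulated score matrix: 50 windows (rows) x 101 base positions (columns),
# with signal enriched around the central position
positions <- -50:50
signal <- dnorm(positions, sd = 15)
mat <- t(replicate(50, rpois(length(positions), lambda = 100 * signal + 0.5)))

# Meta-region profile: average signal at each position across all windows
meta_profile <- colMeans(mat)

# Meta-region line plot
plot(positions, meta_profile, type = "l",
     xlab = "bases around anchor", ylab = "average score")

# Heatmap of the full matrix: every row is a window, every column a position
image(x = positions, y = seq_len(nrow(mat)), z = t(mat),
      xlab = "bases around anchor", ylab = "window")
```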
Read genomic intervals into R
Genomic intervals of any kind can be read into
R using the functions below, all of which return
GRanges or GRangesList objects:
readGeneric(file,...) can read generic
text files with genomic interval information.
readBed(file,...) can read BED files.
readTranscriptFeatures(file,...) can
read BED12 files with exon/intron structure.
gffToGRanges can read a GFF file.
readBroadPeak, readNarrowPeak and
readFeatureFlank are other convenience
functions to read BED-like files.
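A minimal sketch of reading intervals, assuming genomation is installed. The BED file is written on the fly with made-up coordinates so the example does not depend on any external data; the column arguments passed to `readGeneric` reflect this toy file, not a general recommendation.

```r
library(genomation)

# Write a tiny 3-column BED file (made-up coordinates) to a temporary location
bed_file <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t200",
             "chr1\t500\t650",
             "chr2\t1000\t1200"), bed_file)

# readBed() returns a GRanges object
peaks <- readBed(bed_file)

# readGeneric() reads the same file as a generic table; the column positions
# of chromosome, start and end are given explicitly
peaks_generic <- readGeneric(bed_file, chr = 1, start = 2, end = 3)

peaks
```

Note that BED coordinates are zero-based, so the two readers may report different start positions for the same file depending on their `zero.based` settings; see the respective help pages.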
Extract genomic intervals for
windows of interest in R
Genomic intervals can be extracted for
windows of interest using the
ScoreMatrix(target,windows) and
ScoreMatrixBin(target,windows)
functions. These functions can handle BAM and
BigWig files and GRanges objects as inputs.
patternMatrix() returns the relative
locations or scores of k-mers or motifs, useful
for analyzing ChIP-seq.
All of these return ScoreMatrix or
ScoreMatrixList objects. These objects can
be manipulated further using the orderBy,
binMatrix, scaleScoreMatrix,
scaleScoreMatrixList,
intersectScoreMatrixList and []
functions.
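The two extraction functions can be tried on in-memory GRanges objects. The sketch below assumes genomation and GenomicRanges are installed; the target intervals and windows are simulated, and one read is placed at position 9000 only so that all windows fall inside the covered part of the chromosome.

```r
library(genomation)
library(GenomicRanges)

set.seed(1)
# Simulated target intervals: 200 short "reads" of 50 bp on chr1
starts <- c(sample(1:8999, 199), 9000)
target <- GRanges("chr1", IRanges(start = starts, width = 50))

# Equal-width windows (500 bp each) around hypothetical anchor points
windows <- GRanges("chr1",
                   IRanges(start = seq(1000, 8000, by = 1000), width = 500))

# One row per window, one column per base: coverage of target over each window
sm <- ScoreMatrix(target = target, windows = windows)

# Same windows summarized into 10 bins per window (mean score per bin)
smb <- ScoreMatrixBin(target = target, windows = windows,
                      bin.num = 10, bin.op = "mean")

dim(sm)
dim(smb)
```

ScoreMatrix expects windows of equal width, while ScoreMatrixBin also accepts windows of unequal width because each window is reduced to the same number of bins.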
Visualize the summary of
genomic intervals in R
heatMatrix(scoreMatrixobj) and
multiHeatMatrix(scoreMatrixListObj)
make the heatmaps for windows of
interest.
plotMeta() makes meta-region plots and
heatMeta() makes meta-region heatmaps.
These functions return values invisibly; see
the respective help pages for details.
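A short sketch of the plotting calls, assuming genomation and GenomicRanges are installed. The data are simulated, and the relative coordinates passed as `xcoords` are an arbitrary choice for this toy example.

```r
library(genomation)
library(GenomicRanges)

set.seed(2)
# Simulated target intervals and equal-width (1000 bp) windows on chr1
target <- GRanges("chr1",
                  IRanges(start = c(sample(1:8999, 499), 9000), width = 50))
windows <- GRanges("chr1",
                   IRanges(start = seq(1000, 8000, by = 500), width = 1000))
sm <- ScoreMatrix(target = target, windows = windows)

# Relative coordinates for the columns, one value per base
xc <- seq(-500, 499)

pdf(NULL)  # discard graphics output; drop this line in an interactive session

# Heatmap: every row is a window, every column a base position
heatMatrix(sm, xcoords = xc)

# Meta-region line plot; the plotted summary values are returned invisibly
meta <- plotMeta(sm, xcoords = xc)

dev.off()
```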
[Figure panels: pie chart for annotation (Intergenic, Intron, Exon, Promoter); meta-region plots (TF1 to TF4; x-axis: base-pairs around anchor, y-axis: read per million); meta-region heatmaps (TF1 to TF4; x-axis: base-pairs around anchor); heatmaps for genomic interval sets (panels TF 1 to TF 4)]
Annotation for genomic
intervals in R
Annotation summaries for target genomic
intervals can be obtained by
annotateWithFeatures(). The resulting
object can be visualized by
plotTargetAnnotation() for a pie chart
and heatTargetAnnotation() for a
heatmap of annotation overlapping
percentages.
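A hedged sketch of the annotation step, assuming genomation and GenomicRanges are installed. The target intervals and the two feature sets ("promoter", "exon") are made up, and the exact class of the returned object should be checked against the help page of `annotateWithFeatures`.

```r
library(genomation)
library(GenomicRanges)

# Made-up target intervals (e.g. peaks) on chr1
target <- GRanges("chr1",
                  IRanges(start = c(100, 1200, 2500, 4000, 7000), width = 100))

# Made-up annotation: a named GRangesList with one GRanges per feature type
features <- GRangesList(
  promoter = GRanges("chr1", IRanges(start = c(50, 3900), end = c(300, 4200))),
  exon     = GRanges("chr1", IRanges(start = c(1000, 2400), end = c(1500, 2600)))
)

# Summarize how the target intervals overlap each feature set
annot <- annotateWithFeatures(target, features)
annot

pdf(NULL)  # discard graphics output; drop this line in an interactive session
plotTargetAnnotation(annot)
dev.off()
```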
Contributors: Altuna Akalin [aut, cre], Vedran Franke [aut, cre], Katarzyna Wreczycka [aut],
Alexander Gosdschan [ctb], Liz Ing-Simmons [ctb]
Citation: Akalin A, Franke V, Vlahovicek K, Mason CE, Schubeler D. (2015). Bioinformatics. doi:
10.1093/bioinformatics/btu775
Genomic intervals with different kinds of information
Extract subset of genomic intervals for windows of interest
Meta-region line plots
Meta-region heatmap
Heatmaps for genomic intervals
Annotation pie charts
Annotation heatmaps
Annotation for genomic
intervals
Genomic intervals such as ChIP-seq peaks or
differentially methylated regions need to be
annotated further with known genomic
annotations such as the promoter, intron and exon
structures of genes. This gives additional
information for functional characterization of
the genomic intervals. This is usually
represented as the percentage of genomic features
overlapping with the annotation.
urls: http://bioinformatics.mdc-berlin.de/genomation
http://www.bioconductor.org/release/genomation
Install genomation:
devtools::install_github("BIMSBbioinfo/genomation", build_vignettes=FALSE)
OR
source("https://bioconductor.org/biocLite.R"); biocLite("genomation")
Visualize summaries and annotation for genomic intervals
Misc. functions
getRandomEnrichment() calculates the significance of
association between two sets of genomic intervals.
Valid for package version >= 1.5.6