Xu W., Wongsa A., Lee J., Shang L., Cannone J.J., and Gutell R.R. (2011).
RNA2DMap: A Visual Exploration Tool of the Information in RNA's Higher-Order Structure.
Proceedings of 2011 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2011), Atlanta, GA. November 12-15, 2011. IEEE Computer Society, Washington, DC, USA. pp. 613-617.
secondary structures. RNAMovies is a system for
visualizing multiple sets of secondary structure data and
creates an interpolated animation of the data [12].
RNAMovies is primarily used to show RNA secondary
structure spaces for evaluating different secondary structure
predictions. One of the commonly used tools is XRNA, an
open source tool to create, annotate and display secondary
structure diagrams of RNA sequences [13]. This software tool
is designed as a user interface to create new or edit existing
secondary structures in a special format and is limited in its
capabilities to create visual representations of the data and
to adapt to applications in analysis. A more recent
visualization tool for RNA secondary structure is VARNA.
This tool draws an RNA secondary structure automatically
from a few different RNA structure file formats. VARNA
implements four different drawing algorithms in Java [14].
The primary feature of VARNA is that biologists can
interactively edit and annotate secondary structures and
output the visualization as static images.
II. VISUALIZATION IMPLEMENTATIONS AND FEATURES
A. Main interface and basic interactions
The main application interface of RNA2DMap (Fig. 1)
consists of two groups: control panels on the left side and
the main visualization on the right.
Figure 1 Overview of the RNA2DMap interface.
The control panel is further divided into three parts, (1) a
molecule selector, (2) nucleotide information, and (3)
display options. Different RNA molecules can be selected
with the molecule selector. Panel 2 reveals available
sequence and structure information for the nucleotides
selected by left-click and/or hovered over by the mouse cursor.
Panel 3 provides options for different views, which are
discussed in detail in subsequent sections.
The main visualization panel at the right is composed of
three parts, (4) the structure toolbar, (5) the main
visualization window (Structure Navigator), and (6) the
saved position toolbar. The structure toolbar at the top
contains options to change the size of the secondary
structure (zoom controls), to print the current view or the entire
secondary structure, to locate nucleotides by position
number, to present all positions and their base pair partner if
they form a secondary or tertiary interaction and the
conservation values for each position in a table. The saved
position toolbar at the bottom lists those positions that have
been selected, by double-clicking on the nucleotide or
entering the position number on the bottom right, to be
saved in their current state (e.g. RNA distance coloring).
B. Secondary Structure view
Figure 2. Highlighting the selected base pair groups and their
conformations.
Currently the 5S, 16S, and 23S rRNA secondary structure
diagrams for the high resolution three-dimensional crystal
structures (Thermus thermophilus (16S) and Haloarcula
marismortui (5S and 23S)) and the comparative structure
models (Escherichia coli) are available. In the future, many
more RNA families will be included. RNA2DMap
utilizes the coordinates for an RNA's secondary structure
diagram and maps them within the main visualization
window. The visualization can be zoomed in to show details
or zoomed out to show an overview of the entire structure
with a sliding bar located at the top of the screen to increase
or decrease the zoom values. Users can also drag the white
space of the visualization to move the current view.
The Display Options panel has two tabs, “Data Sets” and
“Structure” (Fig. 2). The graphical format of the nucleotides
and base pairs is modified in the “Structure” section. The
base pair symbol by default reveals the conformation, as
defined by the Lee & Gutell nomenclature [15]. The symbol
and color for each conformation are shown with different
glyphs. The legend is in the Display Options panel (Figure
2). The tertiary interactions can be displayed or hidden. The
position numbers for the crystal structure and the equivalent
Escherichia coli (the typical reference species) positions can
also be shown (or hidden) with tick marks every ten positions
and numbers displayed every 50.
C. Visualization of additional information
The “Data Sets” tab in the Display Options panel includes
six types of information: 1) the physical three-dimensional
distances between the selected nucleotide and all other
nucleotides in the RNA molecule, 2) the conservation values
for every nucleotide, 3) coaxial stacking of helices, 4) the
base pair types, 5) the conformation of the base pairs, and 6)
secondary structure motifs.
Highlighting base pairing, conformation and motif
Figure 3 Five motif types are highlighted in different colors.
To simultaneously render both the base pair groups
and the types of conformation, RNA2DMap uses a glyph-based
visualization technique, a two-colored nested circle
representation; the color of the outer circle reveals the base
pair type while the color of the inner circle is for the base pair's
conformation (Fig. 3). Structural motifs are categorized
based on their unique arrangement of base pair types and
conformations with unpaired nucleotides. Five different
structural motifs, displayed with large colored nucleotide
circles, are shown in Fig. 3. The colored circles for
motifs are larger than those used for base pair groups.
Therefore, users can select multiple base pair groups,
different types of conformations and motifs at the same
time. The color used for any of the circle renderings can be
customized to emphasize the desired pattern.
Showing distance values in three-dimensional space
RNA2DMap can also show the distance values between
nucleotides based on their position in the crystal structure (Fig.
4, left). The distance values between zero and the user
specified upper limit value are mapped to a continuous color
map ranging from black to green, where black
indicates the smallest distance values and green
indicates the largest distance values.
Figure 4 (left) 3D distances are highlighted with a color map; (right)
The conservation view of the 16S rRNA secondary structure.
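
The black-to-green distance coloring described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: the linear interpolation, the clamping of out-of-range distances, and the function name `distance_to_rgb` are hypothetical, since the text does not say how RNA2DMap computes the color map internally.

```python
def distance_to_rgb(distance, upper_limit):
    """Map a distance in [0, upper_limit] to an (r, g, b) color from black to green.

    Assumption: the interpolation is linear in the distance value.
    """
    if upper_limit <= 0:
        raise ValueError("upper_limit must be positive")
    # Assumption: distances outside the range take the nearest end color.
    t = min(max(distance / upper_limit, 0.0), 1.0)
    # Black (0, 0, 0) for the smallest distances, green (0, 255, 0) for the largest.
    return (0, round(255 * t), 0)


if __name__ == "__main__":
    # Hypothetical distances with a user-specified upper limit of 30.
    for d in (0.0, 6.0, 30.0, 45.0):
        print(d, distance_to_rgb(d, 30.0))
```

Clamping keeps every nucleotide beyond the upper limit at full green, so the contrast of the map is spent on the range the user chose.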
Highlighting conservation values
The extent of nucleotide conservation and variation is
determined with a modified Shannon equation [16]. These
conservation values are associated with the colors red and
different shades of black and gray. Positions that are
invariant have a conservation value of 2. Lower values
indicate greater variation. Positions with values of 1.9 or
greater are shown in red. Positions with values immediately
below 1.9 are black. The density of the black decreases with
lower conservation values (i.e., greater variation). Fig. 4
(right) reveals the extent and locations of highly conserved to
highly variable regions of the RNA.
IV. EXEMPLAR USER SCENARIOS
A. Explore base pair types and conformation types
Figure 5 A (Top). Most frequent base pair groups and
conformations. B (Bottom). Least frequent base pair groups and
conformations (see text).
A total of 16 base pair types are possible, and each base
pair type can form approximately 15 different
conformations. The frequency of each base pair type, the
conformations associated with each base pair type, and their
proximity to one another are diagnostic of the type of
structural motif they might be associated with and are
fundamental to an RNA's higher-order structure.
Visualizing the frequency and organization of the base pair
types and their conformations is a very effective means to
identify patterns that have been characterized and to
discover new patterns that could be a new motif or a
variation of an existing motif.
The most frequent base pair groups (G:C, A:U, and G:U)
and conformations (Watson-Crick (WC) and Wobble (Wb)),
and the least frequent base pair types (G:G, G:A, A:A, A:C,
C:C, C:U, and U:U) and conformations (nearly 40 in total)
are shown in Figures 5A and 5B respectively. As expected,
the vast majority of the most frequent base pair types and
conformations occur within the regular secondary structure
helices, although a few of these base pairs occur outside of
the regular helix (Fig. 5A). In contrast, the vast majority of
the least frequent base pair groups and conformations occur
outside of the secondary structure helix (Fig. 5B). Within
this latter group, the most common of the least frequent base
pair types is A:G, and the most common of the least
frequent conformations is sheared (S). The majority of the
A:G base pairs have the sheared conformation. These
usually occur immediately adjacent to the end of a
secondary structure helix. A previous study revealed that the
A:G base pairs at the end of a helix frequently change to
A:A base pairs. These A:A base pairs usually form the same
sheared conformation [17]. However, there are some
exceptions where two consecutive A:G base pairs occur in
the middle of the helix. This observation of tandem A:G
pairs within a secondary structure helix is consistent with
previous analysis [18].
B. Nucleotides within structure motifs form tertiary interactions
Figure 6 Tertiary interactions between nucleotides in the tetraloop,
lone-pair tri-loop, and the UAA/GAN motifs and their partners are drawn
with a thicker line in RNA2DMap. Blue lines for base-base
interactions and green lines for base-backbone interactions.
While comparative analysis accurately predicted the
secondary structure for numerous RNAs, including the 5S,
16S, and 23S rRNAs [19], the high-resolution
three-dimensional structures for the 30S and 50S ribosomal
subunits determined with X-ray crystallography
substantiated the comparative structure model for the
rRNAs and identified the tertiary interactions that are drawn
on our Thermus thermophilus 16S rRNA and Haloarcula
marismortui 23S and 5S rRNA secondary structure
diagrams [20] [21].
One of the grand challenges in biology is to accurately
predict an RNA's secondary and three-dimensional structure.
To achieve this very ambitious goal, an understanding of the
fundamental rules that influence the correct folding of RNA
is essential. A few structural motifs that have been identified
and characterized, including the tetraloop [22], lone-pair
tri-loop [23], and UAA/GAN [24], all contain specific bases
that have a preponderance to hydrogen bond to another base
or the sugar-phosphate backbone of the RNA molecule.
Fig. 6 reveals a region of the 23S rRNA that has four
examples of these three motifs: two UAA/GAN (reddish
orange), one tetraloop (purple), and one lone-pair tri-loop
(pink). The tertiary interactions that form base pairs or
base-backbone bonding with nucleotides within these motifs
are emphasized with a thicker line to distinguish them from
the other tertiary structure interactions.
C. Investigation of non-base pairing constraints
Figure 7 The secondary structural map of T. thermophilus 16S rRNA
highlights all neighbour effects (red lines connecting nucleotides).
While the strongest covariation scores for two nucleotide
positions indicate a base pair, weaker but still statistically
significant covariation scores have been observed for
clusters of nucleotides. Our previous analysis of tRNA
reveals that the positions involved in these associations are
in proximity in the three-dimensional structure of tRNA or
between two regions of the tRNA that are involved in a
specific function during protein synthesis [25] [26]. Our
more recent analysis of 16S rRNA reveals several regions of
the rRNA secondary structure that have these clusterings of
weaker covariation scores (also called neighbour effects).
Fig. 7 reveals the nucleotides that are part of a network of
weak covariations. Three sets of covariations are not
immediately adjacent to one another on the secondary
structure diagram. These shaded regions in Fig. 7 were
evaluated with the RNA distance option in RNA2DMap to
determine their absolute physical distance based on the
coordinates in the high-resolution crystal structure of the T.
thermophilus 16S rRNA. All three sets of nucleotides are
within hydrogen bonding distance, suggesting that these
nucleotides might form a base pair or are sufficiently close
in three-dimensional space to have structural constraints
with the other nucleotides in these clusters of neighbour
effects. This observation with RNA2DMap increases our
confidence that these neighbour effects are true structural
constraints and demonstrates that nucleotides that do not form a
base pair can influence the evolution of other nucleotides
that are physically close to them.
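
The kind of proximity check described in this scenario can be sketched in Python as below. It is a hedged illustration, not RNA2DMap's implementation: the 3.5 angstrom cutoff, the function names, and the input layout (a mapping from position number to a list of atom coordinates) are all assumptions, because the paper gives neither a numeric threshold nor the way the distance is computed from the crystal structure.

```python
import math
from itertools import combinations

# Assumed cutoff in angstroms for "hydrogen bonding distance"; not stated in the paper.
H_BOND_CUTOFF = 3.5


def min_atom_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom of one nucleotide and any atom of another."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def close_pairs(nucleotides, cutoff=H_BOND_CUTOFF):
    """Return the position pairs whose closest atoms lie within the cutoff.

    `nucleotides` maps a position number to a list of (x, y, z) atom
    coordinates (a hypothetical layout chosen for this sketch).
    """
    pairs = []
    for a, b in combinations(sorted(nucleotides), 2):
        if min_atom_distance(nucleotides[a], nucleotides[b]) <= cutoff:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Made-up coordinates for three positions, for illustration only.
    example = {
        10: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        25: [(4.0, 0.0, 0.0)],
        40: [(20.0, 0.0, 0.0)],
    }
    print(close_pairs(example))
```

Using the closest pair of atoms rather than a single representative atom per nucleotide is a design choice of this sketch; it favors detecting any possible contact between two positions.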
V. CONCLUSIONS
In this paper, we present RNA2DMap, an interactive tool
for visualizing multiple types of information on an RNA
secondary structure. The primary visualization is based on
the standard RNA secondary structure diagram. Nested
circles with different colors were used to reveal multiple
dimensions of information, including base pair types, base
pair conformations, conservation values of RNA sequences,
and the physical three-dimensional distances between the
selected nucleotide and all other nucleotides. RNA2DMap
has the flexibility to show different combinations of
information on the RNA secondary structure. We
demonstrate three use cases. The first reveals the frequency
and organization of the different base pair types and their
conformations. The second reveals tertiary interactions
associated with a few of the structural motifs. The third
utilizes the RNA distance function to determine if different
sets of positions with a weak covariation are sufficiently
close in three-dimensional space to either form a base pair
or affect the spatial constraints of the nucleotides on other
nucleotides in this local region of the RNA structure.
RNA2DMap can be adapted to work with any secondary
structure diagram generated with other programs.
ACKNOWLEDGEMENT
This work was supported by NIH grants GM085337
(awarded to RG and WX) and GM067317 (awarded to RG),
and the Welch Foundation F-1427 (awarded to RG).
REFERENCES
[1] W. Xu, S. Ozer, and R. R. Gutell, "Covariant Evolutionary Event
Analysis for Base Interaction Prediction Using a Relational Database
Management System for RNA," in proceedings of Scientific and
Statistical Database Management (SSDBM'09), LNCS, Winslett, Ed.:
Springer, pp. 200-216. 2009
[2] S. Ozer, K. J. Doshi, W. Xu, and R. R Gutell, "rCAD: A Novel
Database Schema for the Comparative Analysis of RNA," to appear
IEEE e-science Conference, Dec. 2011.
[3] K Yamamoto, N Sakurai, and H Yoshikura, "Graphics of RNA
secondary structure; towards an object-oriented algorithm," Comput
Appl Biosci. , vol. 3, no. 2, pp. 99-103, June 1987.
[4] O. Matzura and A. Wennborg, "RNAdraw: an integrated program for
RNA secondary structure calculation and analysis under 32-bit
Microsoft Windows," Computer Applications in the Biosciences, vol.
12, no. 3, pp. 247-9, 1996.
[5] R. M. Felciano, R. O. Chen, and R. B. Altman, "RNA Secondary
Structure as a Reusable Interface to Biological Information
Resources," SMI, SMI-96-0641, 1997.
[6] P.D. Rijk, J. Wuyts, and R D Wachter, "RnaViz 2: an improved
representation of RNA secondary structure," Bioinformatics, vol. 19,
no. 2, pp. 299-300, 2003.
[7] RNAfamily. http://bioinfo.lifl.fr/RNA/RNAfamily/rnafamily.php
[8] KC Wiese, E Glen, and A Vasudevan, "JViz.Rna--a Java tool for
RNA secondary structure visualization," IEEE Trans Nanobioscience,
vol. 4, no. 3, pp. 212-8, Sep. 2005.
[9] H. Yang et al., "Tools for the automatic identification and
classification of RNA base pairs," Nucleic Acids Research, vol. 31,
no. 13, pp. 3450-60, 2003.
[10] Y. Byun and K. Han, "PseudoViewer3: generating planar drawings of
large-scale RNA structures with pseudoknots," Bioinformatics, vol.
25, pp. 1435-7, 2009.
[11] R. Giegerich and D J. Evers, "RNA Movies: visualizing RNA
secondary structure spaces," Bioinformatics, 15(1) pp. 32-7, 1999.
[12] UCSC, XRNA. http://rna.ucsc.edu/rnacenter/xrna/xrna.html
[13] K. Darty, A. Denise, and Y. Ponty, "VARNA: Interactive drawing and
editing of the RNA secondary structure," Bioinformatics, vol. 25, no.
15, pp. 1974-5, April 2009.
[14] J.C. Lee and R.R. Gutell, "Diversity of Base-pair Conformations and
their Occurrence in rRNA Structure and RNA Structural Motifs,"
Journal of Molecular Biology, vol. 344, pp. 1225-1249, 2004.
[15] R.R. Gutell, B. Weiser, C.R. Woese, and H.F. Noller, "Comparative
anatomy of 16S-like ribosomal RNA," Progress in Nucleic Acid
Research and Molecular Biology, no. 32, pp. 155-216, 1985.
[16] T. Elgavish, J.J. Cannone, J.C. Lee, S.C. Harvey, and R.R. Gutell,
"AA.AG@Helix.Ends: A:A and A:G Base-pairs at the Ends of 16 S
and 23 S rRNA Helices," Journal of Molecular Biology, vol. 310, no.
4, pp. 735-753, 2001.
[17] D. Gautheret, D. Konings, and R.R. Gutell, "A major family of motifs
involving G.A mismatches in ribosomal RNA," J. Mol. Biol., vol. 242,
no. 1, pp. 1-8, 1994.
[18] R.R. Gutell, J.C. Lee, and J.J. Cannone, "The Accuracy of
Ribosomal RNA Comparative Structure Models," Current Opinion in
Structural Biology, vol. 12, no. 3, pp. 301-310, 2002.
[19] B. T. Wimberly et al., "Structure of the 30S ribosomal subunit,"
Nature , no. 407, pp. 327-339, 2000.
[20] N. Ban, P. Nissen, J. Hansen, P. B. Moore, and T. A. Steitz, "The
complete atomic structure of the large ribosomal subunit at 2.4 A
resolution," Science, no. 289, pp. 905-920, 2000.
[21] C.R. Woese, S. Winker, and R.R. Gutell, "Architecture of Ribosomal
RNA: Constraints on the sequence of Tetra-loops," Proceedings of the
National Academy of Sciences, vol. 87, no. 21, pp. 8467-8471, 1990.
[22] J.C. Lee, J.J. Cannone, and R.R. Gutell, "The Lonepair Triloop: A
New Motif in RNA Structure," Journal of Molecular Biology, vol.
325, no. 1, pp. 65-83, 2003.
[23] J.C. Lee, R.R. Gutell, and R. Russell, "The UAA/GAN internal loop
motif: a new RNA structural element that forms a cross-strand AAA
stack and long-range tertiary interactions," Journal of Molecular
Biology, vol. 360, no. 5, pp. 978-988, 2006.
[24] R.R. Gutell, A. Power, G. Hertz, E. Putz, and G. Stormo, "Identifying
Constraints on the Higher-Order Structure of RNA: Continued
Development and Application of Comparative Sequence Analysis
Methods," Nucleic Acids Research, no. 20, pp. 5875-95, 1992.
[25] D. Gautheret, S.H. Damberger, and R.R. Gutell, "Identification of
Base Triples in RNA Using Comparative Sequence Analysis," Journal
of Molecular Biology, vol. 248, no. 1, pp. 27-43, 1995.
[26] J.J. Cannone, S. Subramanian, M.N. Schnare, J.R. Collett, L.M.
D'Souza, Y. Du, B. Feng, N. Lin, L.V. Madabusi, K.M. Müller, N.
Pande, Z. Shang, N. Yu, and R.R. Gutell "The Comparative RNA
Web (CRW) Site: An Online Database of Comparative Sequence and
Structure Information for Ribosomal, Intron, and Other RNAs,"
BioMed Central Bioinformatics, 3:2, 2002