Diversity of Base-pair Conformations and their
Occurrence in rRNA Structure and RNA
Structural Motifs
Jung C. Lee and Robin R. Gutell*
The Institute for Cellular and
Molecular Biology, The
University of Texas at Austin
1 University Station A4800
Austin, TX 78712-0159, USA
In addition to the canonical base-pairs comprising the standard Watson–Crick
(C:G and U:A) and wobble U:G conformations, an analysis of the base-pair types
and conformations in the rRNAs in the high-resolution crystal structures of the
Thermus thermophilus 30 S and Haloarcula marismortui 50 S ribosomal subunits
has identified a wide variety of non-canonical base-pair types and
conformations. However, the existing nomenclatures do not describe all of the
observed non-canonical conformations or describe them with some ambiguity.
Thus, a standardized system is required to classify all of these non-canonical
conformations appropriately. Here, we propose a new, simple and systematic
nomenclature that unambiguously classifies base-pair conformations occurring
in base-pairs, base-triples and base-quadruples that are associated with
secondary and tertiary interactions. This system is based on the topological
arrangement of the two bases and glycosidic bonds in a given base-pair.
Base-pairs in the internal positions of regular secondary structure helices
usually form with canonical base-pair groups (C:G, U:A, and U:G) and canonical
conformations (C:G WC, U:A WC, and U:G Wb). In contrast, non-helical base-pairs
outside of regular structure helices usually have non-canonical base-pair
groups and conformations. In addition, many non-helical base-pairs are involved
in RNA motifs that form a defined set of non-canonical conformations. Thus,
each rare non-canonical conformation may be functionally and structurally
important. Finally, the topology-based isostericity of base-pair conformations
can rationalize base-pair exchanges in the evolution of RNA molecules.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: base-pair conformation; isostericity; bifurcated hydrogen bonds; RNA motif
*Corresponding author
Introduction
Recently, the high-resolution crystal structures of the bacterial Thermus
thermophilus 30 S (PDB, 1FJF¹) and archaeal Haloarcula marismortui 50 S (PDB,
1FFK² and 1JJ2³) ribosomal subunits were determined; the former includes the
16 S rRNA and the latter the 23 S and 5 S rRNAs. An analysis of the base-pairs
present in the rRNAs in the two crystal structures not only validated the
authenticity of the covariation-based rRNA structure models,⁴ but also provides
a wealth of RNA structural folds, conformations and motifs to identify and
relate to nucleotide sequences and base-pairs. In addition to the canonical
base-pairs with canonical conformations consisting of the standard
Watson–Crick (C:G and U:A)⁵ and the wobble U:G⁶ base-pair types and
conformations, these crystal structures contain many canonical and
non-canonical base-pair types with non-canonical conformations. The
non-canonical conformations are frequently involved in a variety of motifs,
including GNRA, UUCG, and CUUG tetraloops,⁷⁻¹¹ G·U base-pair
0022-2836/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
Abbreviations used: WC, Watson–Crick; Wb, wobble;
sWC, slipped Watson–Crick; sWb, slipped wobble; rWC,
reversed Watson–Crick; rWb, reversed wobble; H,
Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS,
reversed sheared; fS, flipped sheared; pfS, parallel flipped
sheared; pS, parallel sheared; rpS, reversed parallel
sheared; BHB, bifurcated hydrogen bond; PDB, Protein
Data Bank; LPTL, lonepair triloop; dDA, distance
between the hydrogen bond donor and acceptor; Ec,
Escherichia coli.
E-mail address of the corresponding author:
robin.gutell@mail.utexas.edu
doi:10.1016/j.jmb.2004.09.072 J. Mol. Biol. (2004) 344, 1225–1249
motifs,¹² A platforms,¹³ AA.AG at helix.ends motifs,¹⁴ tandem GA motifs,¹⁵,¹⁶
lonepair triloop motifs,¹⁷,¹⁸ sticky motifs consisting of AGUA/GAA, GUA/GAA,
and GGA/GAA motifs (J.C.L. & R.R.G., unpublished results), U-turns,¹⁹,²⁰
A-minor motifs,²¹,²² K-turns,³ and H-turns.²³
Many of the previously identified non-canonical base-pairs have been organized
onto a web site†.²⁴ As well, a nomenclature has been proposed to classify
base-pair conformations by introducing the interacting edges,²⁵ while a
computational approach was developed to automatically identify these latter
base-pair conformations from RNA crystal structures.²⁶ More recently, a new
computational study attempted to theoretically model base-pair conformations
with no nomenclature, based on isomorphic relationships of base
interactions.²⁷ The proposed naming systems are not analogous to the
traditional system (e.g. Watson–Crick, wobble, reversed Watson–Crick, reversed
wobble, Hoogsteen, reversed Hoogsteen, and sheared conformations), which
includes the original base-pair conformations.⁵,⁶,²⁸,²⁹ Unfortunately, not all
of the observed non-canonical conformations have been described unambiguously
with the system. Thus, a simple and widely applicable nomenclature is needed
to specifically describe all of the observed and reasonable base-pair
conformations. By analyzing topological arrangements of the bases and
glycosidic bonds for all base-pairs and by expanding the traditional
classification, we propose a new, simple and systematic nomenclature to
classify all of the observed base-pair conformations, regardless of the number
of hydrogen bonds between the two bases.
In addition to the introduction to our new
nomenclature, we analyze: (1) the distribution of
structural parameters of representative base-pair
conformations observed in the rRNAs; (2) the
relationship between the sugar puckering patterns
and base-pair conformations; (3) the protonated
base-pairs and the bifurcated hydrogen bonding
interactions in RNA structure; and (4) the distribution
of base-pair conformations on the rRNA
secondary structure models and the significance of
non-canonical conformations in RNA structure.
Results
Topological relationships of base-pair
conformations
The base-pair conformation refers to the spatial
arrangement of the two bases in a given base-pair,
X:Y, which are hydrogen-bonded to one another. In
addition to the standard Watson–Crick C:G and
U:A and wobble U:G base-pair types with canonical
base-pair conformations, a visual identification and
characterization of the base-pair conformations in
the rRNAs in the high-resolution crystal structures
of the T. thermophilus 30 S (PDB, 1FJF¹) and H. marismortui 50 S (PDB, 1FFK²
and 1JJ2³) ribosomal subunits using the RasMol program³⁰,³¹ reveals many
canonical and non-canonical base-pair types with non-canonical conformations
(Table 1).
While all 16 possible base-pairs are divided into
ten base-pair groups (i.e. C:G, U:A, U:G, G:A, C:A,
U:C, A:A, C:C, G:G, and U:U), their conformations
are classified into 14 major conformational families
(Table 1): Watson–Crick (WC), wobble (Wb),
slipped Watson–Crick (sWC), slipped wobble
(sWb), reversed Watson–Crick (rWC), reversed
wobble (rWb), Hoogsteen (H), reversed Hoogsteen
(rH), sheared (S), reversed sheared (rS), flipped
sheared (fS), parallel flipped sheared (pfS), parallel
sheared (pS), and reversed parallel sheared (rpS).
Base-pair conformations can be systematically and unambiguously named based on
the topological arrangements of the two bases and the two glycosidic bonds in
a given base-pair, X:Y. As depicted in Figure 1, all of the non-canonical
conformations are derived by simple topological manipulation of the starting
Watson–Crick (WC) or wobble (Wb) conformation for each base-pair group. For
example, sWC/sWb is generated by slipping (translating base Y either along the
negative y-axis (sWC/sWb) or along the positive y-axis (sWC*/sWb*) to form
only a single hydrogen bond); rWC/rWb, by reversing (rotating nucleotide Y
180° about the x-axis); H, by flipping (rotating either base Y (H) or base X
(H*) 180° about its glycosidic bond); rH, by flipping and then reversing base
Y (rH) or base X (rH*); S, by shearing (either translating base Y along the
negative y-axis and then along the negative x-axis (S) or translating base X
along the negative y-axis and then along the positive x-axis (S*)); rS, by
shearing and then reversing base Y (rS) or base X (rS*); fS, by flipping and
then shearing base Y (fS) or base X (fS*); pfS, by flipping, shearing, and
then paralleling nucleotide Y (pfS) or X (pfS*) (rotating either nucleotide Y
or X about the y-axis so that the glycosidic bonds run parallel in the same
direction); pS, by paralleling and then shearing either Y (pS) or X (pS*);
rpS, by paralleling, shearing, and then reversing either X or Y. The order of
successive manipulations of the base Y (or X) does not matter. The topological
relationships of the observed and theoretically possible conformations for the
ten base-pair groups are shown in Figures 2–11: C:G, Figure 2; U:A, Figure 3;
U:G, Figure 4; G:A, Figure 5; C:A, Figure 6; U:C, Figure 7; A:A, Figure 8;
C:C, Figure 9; G:G, Figure 10; and U:U, Figure 11. Interestingly, some
conformations within the same conformation family have more than one hydrogen
bonding possibility with a similar topology: H, rH*, fS*, and pfS* for U:G in
Figure 4; H*, rH, rH*, pfS, pfS*, and pS for G:A in Figure 5; rH, S, pfS, and
pS for G:G in Figure 10. In addition, either base of the two bases can be
reversed in the rS and rpS conformations for
† http://prion.bchs.uh.edu/bp_type/
1226 Diversity of Base-pair Conformations
Table 1. Base-pair conformations present in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures
      bpC    C:G     U:A     U:G     G:A     C:A     U:C     A:A     C:C     G:G     U:U     Total
BP    WC     869     252     1       21      2       8       –       –       #       –       1153
      Wb     1       1       124(3)  –       3(2)    –       6       9       3       16      168
      sWC    –       4(2)    #       #       #       2       #       #       #       #       8
      sWb    #       #       (1)     #       3(1)    #       #       3       –       1       9
      rWC    4       11      –       –       –       1       #       –       #       –       16
      rWb    –       –       2       –       3       –       5       –       –       5       15
      H      1       8       1       3(2)    (1)     –       –       –       2       –       18
      rH     –       58(1)   –       1       13(1)   (1)     9       1       3       –       88
      S      –       7(2)    1(7)    143     8(3)    5       10      4       4       –       194
      rS     3       4       2       7       2       1       19      #       2       –       40
      fS     1(1)    (2)     6       5       1(1)    –       1       –       –       –       18
      pfS    4       (2)     1(1)    1       –       –       –       –       1       –       10
      pS     –       1       1(1)    1       2       –       1       –       –       –       7
      rpS    –       –       –       1       #       –       #       #       –       –       1
      Total  884     355     152     185     46      18      51      17      15      22      1745
TQ    WC     3       4       –       1       –       2       –       –       #       –       10
      Wb     –       –       –       –       1       –       1       –       1       1       4
      sWC    (1)     –       #       #       #       –       #       #       #       #       1
      sWb    #       #       –       #       –       #       #       –       –       –       0
      rWC    –       1       –       –       –       –       #       –       #       –       1
      rWb    –       –       1       –       –       –       9       –       –       –       10
      H      –       4       2       –       1       –       –       2       6       1       16
      rH     2(5)    5       (1)     3       8(3)    –       2       –       1       3       33
      S      –       3       (1)     5       11      1       2       1       7       –       31
      rS     2(1)    1       –       56      1       3       2       #       4       –       70
      fS     (1)     1(1)    (6)     18      –       (1)     –       –       –       1       29
      pfS    1(1)    1(2)    3(2)    9       3(5)    1(1)    –       4       1       –       34
      pS     1(1)    4       1(16)   3       (2)     5       2       1       2       1       39
      rpS    1(2)    –       4(2)    19      #       –       #       #       3       –       31
      Total  22      27      39      114     35      14      18      8       25      7       309
Base-pair types are divided into ten base-pair groups, while base-pair conformations (bpC; X:Y Z) involved both in simple base-pairs (BP) and in higher-order interactions including base–base-pair and base-pair–base-pair interactions (TQ) are classified into 14 main families: WC, Watson–Crick; Wb, wobble; sWC, slipped Watson–Crick; sWb, slipped wobble; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS, reversed sheared; fS, flipped sheared; pfS, parallel flipped sheared; pS, parallel sheared; rpS, reversed parallel sheared. The parentheses represent the alternative base-pair conformations with an asterisk (*), and the hash sign (#) indicates the 19 base-pair conformations that are not likely to form.
G:A in Figure 5 and in the rpS conformation for G:G
in Figure 10.
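The naming rule described above can be sketched as a small lookup. This is only an illustration, assuming hypothetical operation labels ("slip", "reverse", "flip", "shear", "parallel") for the manipulations in Figure 1; it ignores the asterisk variants and does not distinguish WC from Wb starting conformations. A frozenset is used as the key because the order of successive manipulations does not matter.

```python
# Hypothetical sketch: map the set of topological manipulations applied to
# the starting WC/Wb conformation onto the conformation family name.
# Operation labels are illustrative assumptions, not part of the nomenclature.
FAMILY_BY_OPERATIONS = {
    frozenset(): "WC/Wb",
    frozenset({"slip"}): "sWC/sWb",
    frozenset({"reverse"}): "rWC/rWb",
    frozenset({"flip"}): "H",
    frozenset({"flip", "reverse"}): "rH",
    frozenset({"shear"}): "S",
    frozenset({"shear", "reverse"}): "rS",
    frozenset({"flip", "shear"}): "fS",
    frozenset({"flip", "shear", "parallel"}): "pfS",
    frozenset({"parallel", "shear"}): "pS",
    frozenset({"parallel", "shear", "reverse"}): "rpS",
}


def family(*operations):
    """Return the family name for a set of manipulations (order-independent)."""
    return FAMILY_BY_OPERATIONS[frozenset(operations)]


# Applying the same manipulations in a different order gives the same family.
flipped_sheared = family("flip", "shear")
sheared_flipped = family("shear", "flip")
```

The table covers the 11 distinct combinations named in the text; with the WC/Wb, sWC/sWb and rWC/rWb pairs counted separately, these correspond to the 14 families.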
Statistics of base-pair conformations observed
in rRNA crystal structures
A total of 121 nucleotide and conformational arrangements are possible for the
ten base-pair groups and 14 theoretically possible conformations for each
base-pair group; 19 of the 140 possible arrangements will not form and are not
considered. Of these 121 conformations, 73 simple base-pairs (BP in Table 1)
and 69 higher-order interactions associated with base–base-pair and
base-pair–base-pair interactions (TQ in Table 1) occur in the rRNAs in the
T. thermophilus 30 S and H. marismortui 50 S crystal structures. Four
significant trends are identified in Table 1. (1) Of the 1745 simple
base-pairs, 884 (51%) are C:G, followed by U:A (355; 20%), G:A (185; 11%),
U:G (152; 10%), and the remaining six base-pair groups (9%). (2) The base-pair
conformation that occurs at the highest frequency is WC (1153; 66%), followed
by S (194; 11%), Wb (168; 10%), rH (88; 5%), and the remaining ten base-pair
conformations (8%). (3) Of the C:G base-pairs, 869 (98%) have the WC
conformation; 252 (71%) and 59 (17%) of the U:A base-pairs form WC and rH
conformations, respectively; 127 (84%) of the U:G base-pairs have the Wb
conformation; 143 (77%) and 21 (11%) of the G:A base-pairs have S and WC
conformations, respectively; more than half of the C:A base-pairs adopt the rH
(14; 31%)
Figure 1. A schematic representation of the topological relationships between base-pair conformations for a given
base-pair, X:Y. Bases are represented as triangles and glycosidic bonds as thick lines attached to triangles. For simplicity,
the starting Watson–Crick and wobble conformations are represented as X:Y WC/Wb in a box with a shaded border in
the center. Each conformation is obtained by simply manipulating Y or X: s, shearing; f, flipping; r, reversing; p,
paralleling; sl, slipping (see the text for details). The alternative conformations are shown with an asterisk (*) in the
parentheses. The dotted arrow shows other conformations that are not simply derived by manipulating either Y or X.
Figures 2–11 have the same presentation scheme with the possible protonated hydrogen bonds marked with wavy lines.
and S (11; 24%) conformations, while 19 (37%) and
10 (20%) of the A:A base-pairs assume the rS and S
conformations, respectively. (4) Of the 1745 base-
pair conformations, the most populated is C:G WC
(869; 50%), followed by U:A WC (252; 14%), G:A S
(143; 8%), U:G Wb (127; 7%), and U:A rH (59; 3%).
While the C:G WC, U:A WC, and U:G Wb
conformations predominantly occur within regular
secondary RNA helices, the vast majority of the G:A
S conformations occur immediately outside of
regular secondary helices.
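As a quick cross-check of trend (2), the rounded percentages can be recomputed from the row totals for simple base-pairs (BP) in Table 1. The sketch below is illustrative only; the dictionary and function names are assumptions introduced here, and the counts are copied from the Total column of Table 1.

```python
# Row totals for simple base-pairs (BP) taken from Table 1.
bp_conformation_totals = {
    "WC": 1153, "Wb": 168, "sWC": 8, "sWb": 9, "rWC": 16, "rWb": 15,
    "H": 18, "rH": 88, "S": 194, "rS": 40, "fS": 18, "pfS": 10,
    "pS": 7, "rpS": 1,
}

total_simple_pairs = sum(bp_conformation_totals.values())


def percent(count, whole):
    """Percentage of `whole` represented by `count`, rounded to an integer."""
    return round(100 * count / whole)


# The four most frequent conformations and the remaining ten combined.
top_four = ["WC", "S", "Wb", "rH"]
top_four_percent = {
    name: percent(bp_conformation_totals[name], total_simple_pairs)
    for name in top_four
}
remaining_count = total_simple_pairs - sum(
    bp_conformation_totals[name] for name in top_four
)
remaining_percent = percent(remaining_count, total_simple_pairs)
```

Under these counts the recomputed values agree with the percentages quoted for WC, S, Wb, rH and the remaining ten conformations.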
While the most common conformation for simple
secondary structure base-pairs is WC, the higher-
order interactions have a wide variety of non-
canonical conformations (Table 1). The most
commonly observed conformations in higher-
order interactions are rS (70; 23%), pS (39; 13%),
pfS (34; 11%), rH (33; 11%), S (31; 10%), rpS (31;
Figure 2. Base-pair conformations for the C:G base-pair group.
10%), and fS (29; 9%), while the most common base-
pair groups in higher-order interactions are G:A
(114; 37%), U:G (39; 13%), C:A (35; 11%), and U:A
(27; 9%). Jointly, the most frequent arrangements in
higher-order interactions are the G:A rS (56), G:A
rpS (19), and G:A fS (18) (Table 1).
Geometries of base-pair conformations
Hydrogen bonds are weak and largely electro-
static in nature because of the partial positive
hydrogen atom from the donor and the partial
negative acceptor atom. Consequently, the two
bases of a base-pair conformation are usually not
perfectly coplanar even in the internal regions of the
regular secondary structure helices. Instead, they
are frequently propeller twisted, sometimes
buckled, staggered, stretched, or sometimes open
toward either the major or the minor groove side,32
while maintaining their topological arrangement,
suggesting that the base-pair conformations are in
constant motion. The average structural parameters
Figure 3. Base-pair conformations for the U:A base-pair group.
for the representative base-pair conformations observed in the rRNAs in the
T. thermophilus 30 S and H. marismortui 50 S crystal structures are
illustrated in Figure 12: dCC, the C1′–C1′ distance; dDA, the donor–acceptor
distance associated with a hydrogen bond; ∠X and ∠Y, the N1–C1′–C1′ and
N9–C1′–C1′ angles.
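For readers who want to reproduce these parameters from atomic coordinates, a minimal sketch follows. It assumes that each angle is measured at the nucleotide's own C1′ atom (glycosidic N1 or N9, then its C1′, then the partner's C1′); the function names and the toy coordinates are hypothetical and are not taken from the crystal structures.

```python
import math


def c1_c1_distance(c1_x, c1_y):
    """dCC: distance between the two C1' atoms (same units as the input)."""
    return math.dist(c1_x, c1_y)


def glycosidic_angle(n_atom, c1_own, c1_partner):
    """Angle N-C1'-C1' in degrees, with the vertex at the nucleotide's own C1'.

    n_atom is N1 (pyrimidine) or N9 (purine) of the same nucleotide.
    """
    v1 = [n - c for n, c in zip(n_atom, c1_own)]
    v2 = [p - c for p, c in zip(c1_partner, c1_own)]
    dot = sum(a * b for a, b in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Toy coordinates (illustrative only, not from 1FJF or 1JJ2).
c1_x = (0.0, 0.0, 0.0)
c1_y = (10.5, 0.0, 0.0)
n_x = (1.0, 1.0, 0.0)
n_y = (9.5, 1.0, 0.0)

d_cc = c1_c1_distance(c1_x, c1_y)
angle_x = glycosidic_angle(n_x, c1_x, c1_y)
angle_y = glycosidic_angle(n_y, c1_y, c1_x)
```

The donor–acceptor distance dDA can be obtained with the same distance function applied to the donor and acceptor atoms.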
While the canonical conformations associated with the most commonly occurring
base-pairs in the regular secondary helical regions (C:G WC, U:A WC, and U:G
Wb) have their dCCs in the range of 10.4 Å and 10.6 Å, their dDAs gradually
increase from the minor to the major groove (Figure 12(a)). This structural
consistency of the A-form RNA will be maintained unless a regular helix
accommodates a base-pair with a dramatically shifted dCC. For example, the G:A
base-pair with the G 5′ and the A 3′ to a regular secondary helix forms the
G:A WC conformation with a much longer dCC of 12.6 Å,¹⁴ whose dDAs increase
from the minor to the major groove. Nonetheless, the 16 U:U Wb conformations
manage to be embedded within secondary helices
Figure 4. Base-pair conformations for the U:G base-pair group.
(data not shown), despite the much shorter average dCC of 8.7 Å
(Figure 12(a)). Moreover, the U:C WC and C:C Wb conformations also form within
regular helices (data not shown) and their dDAs significantly increase from
the minor to the major groove; the former has an opening toward the minor
groove, while the latter has the O2–N3 separation (4.0 Å) beyond the
putatively protonated hydrogen bonding distance. Furthermore, the U:A sWC and
C:A sWb conformations have dCCs elongated relative to those of their WC and Wb
counterparts.
While the shearing and flipping arrangements of the two bases in a base-pair
result in the reduction of dCC, the shearing and flipping arrangements
followed by the reversing frequently do the opposite. For example, while the
G:A S and A:A S conformations have shorter dCCs, 9.5 Å and 9.8 Å,
respectively, the C:A rH and A:A rS conformations have the elongated dCCs,
10.9 Å and 11.1 Å, respectively (Figure 12(b)). When an A is involved in a
base-pairing interaction with two hydrogen bonds, the dDA associated with the
A at the –NH2
Figure 5. Base-pair conformations for the G:A base-pair group.
group is longer than the one associated with the A at N7 (Figure 12(b)),
because of the moderate-strength, non-linear hydrogen bonding interaction
(data not shown).
While ∠X is usually larger than ∠Y in Figure 12, ∠X is less than ∠Y in the
alternative conformations with an asterisk (e.g. the Wb*, sWC*, sWb*, and S*
conformations). Interestingly, regardless of the base-pair group, the
difference between the two angles, |∠X − ∠Y|, is less than 5° for the WC
conformations, 20–40° for the Wb conformations, approximately 45–60° for sWC
and sWb conformations, and 75–90° for the S conformations. In this regard, the
|∠X − ∠Y| value can be used to determine the vast majority of the base-pair
conformations in the rRNAs, in that approximately 88% of the 1745 simple
base-pairs have the WC, Wb, sWC, sWb, and S conformations (Table 1).
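The angle-difference ranges quoted above suggest a simple first-pass classifier. The sketch below is a hedged illustration: the function name is hypothetical, the boundaries are treated as hard cut-offs although the text gives them as approximate, and values that fall between the quoted ranges are left unassigned rather than guessed.

```python
def classify_by_angle_difference(angle_x, angle_y):
    """First-pass conformation guess from |angle_x - angle_y| in degrees.

    Returns None when the difference falls outside the ranges quoted in the
    text, since those cases need the full topological analysis.
    """
    difference = abs(angle_x - angle_y)
    if difference < 5:
        return "WC"
    if 20 <= difference <= 40:
        return "Wb"
    if 45 <= difference <= 60:
        return "sWC/sWb"
    if 75 <= difference <= 90:
        return "S"
    return None


# Illustrative angle pairs (degrees); not measured values.
examples = {
    "near-equal angles": classify_by_angle_difference(55.0, 53.0),
    "30 degree gap": classify_by_angle_difference(70.0, 40.0),
    "50 degree gap": classify_by_angle_difference(80.0, 30.0),
    "80 degree gap": classify_by_angle_difference(100.0, 20.0),
    "10 degree gap": classify_by_angle_difference(60.0, 50.0),
}
```

Because the absolute difference is used, the asterisk variants (where ∠X is less than ∠Y) fall in the same class as their unstarred counterparts; the sign of the difference would be needed to tell them apart.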
Almost all of the base-pairs within regular helices
(called internal base-pairs) have the canonical
Figure 6. Base-pair conformations for the C:A base-pair group.
conformations, while the base-pairs at helix ends
(called terminal base-pairs) sometimes have the
non-canonical conformations (Table 2). However,
the vast majority of base-pairs with the non-
canonical conformations either occur in the
unpaired regions in the covariation-based
rRNA secondary structure models or are associated
with higher-order interactions. The distribution of
base-pair conformations is discussed in detail
below.
Base-pair conformations and their sugar
puckering patterns
While 1561 (89%) of the 1745 simple base-pairs in the rRNAs have the C3′-endo
sugar puckering in both nucleotides that are base-paired, the remaining 184
(11%) have the C2′-endo or O4′-endo sugar puckering in at least one of the two
base-paired nucleotides (Table 3). Of the 184 base-pairs with the sugar
puckering other than the C3′-endo puckering,
Figure 7. Base-pair conformations for the U:C base-pair group.
26 have the C2′-endo puckering at both nucleotides and three have the unusual
O4′-endo puckering. The authenticity of the latter O4′-endo puckering was
questioned in a recent publication.³³ However, the 184 base-pairs with the
“perturbed” sugar puckerings are not restricted to any specific base-pair
group or conformation; they include 23 C:G WC, 12 U:A WC, 3 U:G Wb, 21 U:A rH,
20 G:A S, 15 A:A rS, and 90 other non-canonical conformations (data not
shown).
All of the base-pairs in the internal positions of the helices in the 16 S and
the 23 S rRNA comparative structure models have the C3′-endo puckering at both
nucleotides that are base-paired, except for three C:G WC base-pairs at
positions 1555–1566 (Ec: 1448:1463), 1827–2021 (Ec: 1771:1980), and 1853–1878
(Ec: 1797:1822) in the 23 S rRNA. All
of the remaining 181 base-pairs containing the
“perturbed” sugar puckerings occur at the ends of
helices, in lonepairs, in base-pairs associated with
motifs (e.g. tetraloops and E-loops), and in tertiary
interactions (data not shown). Nonetheless, no
correlations between base-pair conformations and
sugar puckering are observed.
Protonated base-pair conformations
The C:A Wb and C:C Wb conformations can have
two hydrogen bonds, one of which results from
protonation of A at N1 and of C at N3, respectively.
The protonated C:A Wb and C:C Wb conformations
with two hydrogen bonds have been reported.³⁴,³⁵ A recent spectroscopic study
of the Escherichia coli tRNAᴬˡᵃ acceptor stem showed that N1 of the C:A Wb
conformation is protonated at pH 5.0–5.5 and unprotonated at pH 7.0–7.5.³⁶
A ¹H NMR study indicated that, upon forming DNA triplexes, the
Figure 8. Base-pair conformations for the A:A base-pair group.
C:C Wb conformation is protonated up to pH 7.0 but completely unprotonated at
pH 7.6.³⁷
In the rRNAs in the T. thermophilus 30 S (PDB, 1FJF; pH 6.5)¹ and the
H. marismortui 50 S (PDB, 1JJ2; pH 5.8)³ structures, several C:A base-pairs
including C1384:A1477 (Ec: C1402:A1500) in the 16 S and C963:A1005 (Ec:
U868:A909) in the 23 S rRNA have conformational arrangements identical with
that of the protonated C:A base-pair previously reported. Interestingly,
however, the C:A Wb conformation forms at a pH value higher than the reported
protonation pH limit with the topology of the protonated C:A Wb conformation.
For example, the 16 S rRNA base-pair C1384:A1477 (Ec: C1402:A1500) forms very
similar conformations in the native 30 S (PDB, 1FJF; pH 6.5)¹ and the
substrate-bound 30 S (PDB, 1I94; pH 7.8)³⁸ structures; the distance from C=O
of C to N1 of A, d(O2–N1), is 2.41 Å in the native 16 S and 2.24 Å in the
ligand-bound 16 S rRNA, respectively. In addition, the 16 S rRNA base-pair
C240:C278 (Ec: U245:U283) forms the C:C Wb conformation at pH 7.8, which is
topologically identical with the protonated C:C base-pair; the distance from
C=O of one C to N3 of the other, d(O2–N3), is 2.47 Å and 2.65 Å in the native
and the substrate-bound 30 S structures, respectively.
These two “protonated-like” C:A Wb and C:C Wb
conformations at high pH 7.8 could result from a
localized pH change in the vicinity of these base-pairs.
Figure 9. Base-pair conformations for the C:C base-pair group.
In contrast, all of the eight C:C Wb conformations in
the 23 S rRNAs at pH 5.8 are flanked by two internal
base-pairs but have much longer d(O2–N3) values,
leading to an opening toward the minor groove
(Figure 12). In addition, the vast majority of the
water molecules interacting with base-pairs in the
H. marismortui 50 S structure are located in the
major groove, not in the minor groove, preventing
the protonation of C at N3 and A at N1 from the
minor groove (data not shown). Thus, the C:A Wb
and C:C Wb conformations may or may not be
protonated due to a localized pH change in their
proximity.
The two “protonated-like” base-pairs were previously predicted with
comparative sequence analysis. First, while the base-pair at 1384:1477 (Ec:
1402:1500) is a C:A in more than 10,000 16 S and 16 S-like rRNA sequences, it
is a U:G in a few rRNA sequences in mitochondria from eukaryotes that map to
different branches of the phylogenetic tree,³⁹,⁴⁰ rationalizing that the
covarying C:A and
U:G base-pairs have similar conformations
Figure 10. Base-pair conformations for the G:G base-pair group.
(Figures 4 and 6). Second, the non-canonical base-
pair at positions C240:C278 (Ec: U245:U283) in 16 S
rRNA was proposed based on the covariation between U:U and C:C,⁴¹ implying
that the covarying U:U and C:C base-pairs have similar
conformations (Figures 9 and 11).
Base-pair conformations involving bifurcated
hydrogen bonds
The four theoretically possible instances of
bifurcated hydrogen bonding interactions are: (1)
when one hydrogen atom simultaneously interacts
with two acceptor atoms (type I); (2) when one
acceptor atom simultaneously makes contact with
two hydrogen atoms (type II); (3) when two
hydrogen atoms from the donor make contacts
with two different acceptor atoms (type III);
(4) when one hydrogen atom interacts with one
acceptor atom while the donor interacts with
another hydrogen atom (type IV) (Figure 13). The
type II bifurcated hydrogen bonding interactions systematically and commonly
occur in protein β-sheets.⁴² Some base-pair conformations
Figure 11. Base-pair conformations for the U:U base-pair group.
Figure 12. Geometries of (a) WC, Wb, sWC, and sWb and (b) S, rH, H, and rS. The average values for dDAs (donor–acceptor distances for hydrogen bonds), dCC (C1′–C1′ distance), and ∠X and ∠Y (N1–C1′–C1′ or N9–C1′–C1′ angles) are obtained using N number of each base-pair conformation. The standard deviations for these structural parameters are not provided intentionally.
Table 2. Distribution of canonical and non-canonical base-pair conformations on the covariation-based structure models of rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures
Location    Association       Canonical      Non-canonical   Total
Pairedᵃ                       1241 (96.4)    46 (3.6)        1287
            Internal region   717 (98.9)     8 (1.1)         725
            Helix endsᵇ       384 (93.7)     26 (6.3)        410
            Dipair helixᶜ     110 (97.4)     3 (2.6)         113
            Lonepair helixᶜ   30 (76.9)      9 (23.1)        39
Unpairedᵃ                     80 (17.5)      378 (82.5)      458
            Motifsᵈ           13 (7.0)       172 (93.0)      185
            Unknown           67 (24.5)      206 (75.5)      273
Total                         1321 (75.7)    424 (24.3)      1745
While canonical conformations (percentage in parentheses) are defined as WC and Wb conformations of any base-pair, the remaining 12 conformations are considered to be non-canonical (percentage in parentheses).
ᵃ Paired and unpaired in the covariation-based structure models.
ᵇ Helix ends are defined here as the terminal base-pairs occurring at the ends of a regular secondary helix.
ᶜ Covariation-based helices with one or two base-pairs.
ᵈ Identified motifs are GNRA and UNCG tetraloops, AA.AG at helix.ends, E loops, tandem G:A base-pairs, lonepair triloops, K-turns, H-turns, and sticky motifs (see the text).
with bifurcated hydrogen bonds (BHBs) in RNA structure have also been
reported.²⁴,²⁵ Although not explicitly shown in Figures 2–11, the type I and
II BHBs are possible in some conformations associated with simple base-pairs:
the former includes C:A sWb (Figure 6), U:C sWC (Figure 7), and C:C sWb and H
(Figure 9); the latter includes C:G fS and pfS (Figure 2), U:G fS and pfS
(Figure 4), G:A rH, fS* and pfS* (Figure 5), C:A pfS (Figure 6), and G:G Wb
and pfS (Figure 10).
Our conformational analysis revealed no type I BHBs but identified the type II
BHBs in a few of the simple base-pairs in the rRNAs in the T. thermophilus
30 S and the H. marismortui 50 S structures. For example, the type II BHBs are
observed in two of the three G:G Wb conformations shown in Table 1, which are
distorted probably due to the steric clash between NH2 of one G and NH of the
other, followed by forming two simultaneous interactions of NH and NH2 of one
G with C=O of the other (data not shown). A very similar G:G Wb conformation
is observed with G76:G100 in the crystal structure of the E. coli 5 S rRNA
fragment in complex with L25 (PDB, 1DFU⁴³). In fact, the G:G Wb conformations
with bifurcated hydrogen bonds have the topological arrangement with the
glycosidic bonds of their two G bases almost reversed (data not shown); they
are possibly an intermediate step for the transition between G:G Wb and G:G rH
(Figure 10).
The fS and pfS conformations of the C:G and U:G base-pairs can have either a
single hydrogen bond or the type II BHBs, while maintaining the identical
topological arrangement (Figure 13(a)). Specifically, the U:G fS conformation
features the base-pair conformation formed between the first and the last
nucleotides in the five UNCG tetraloops in the 16 S and 23 S rRNAs; these
include U338:G341 (Ec: U343:G346), U415:G418 (Ec: U420:G423), U1117:G1120 (Ec:
U1135:G1138), and U1430:G1433 (Ec: U1450:G1453) in the 16 S and U1770:G1773
(Ec: U1692:G1695) in the 23 S rRNA. In fact, the UNCG tetraloops involve an
additional hydrogen bond between 2′-OH of U and C=O of G, stabilizing their
formation in RNA structure (Figure 13(a)). The same U:G fS conformation
containing the type II BHBs is observed with U9:G12 in the UUCG tetraloop
crystal structure (PDB, 1F7Y⁴⁴). In contrast to U:G fS in the crystal
structures, the solution structure for the UUCG tetraloop revealed the
canonical U:G Wb conformation.⁸
The type II BHBs also commonly occur in higher-order interactions involving
base-triples and base-quadruples. Together with the type II BHBs, the type III
and IV BHBs frequently occur in higher-order interactions including the
A-minor interactions.²¹,²² For example, the two base-pairs in the 23 S rRNA,
C2833:G2847 (Ec: G2816:C2830) and G2851:A2906 (Ec: G2834:A2883), interact with
each other to simultaneously form the type II, III, and IV BHBs, which are
formed with C2833 (Ec: G2816) at C=O, G2847 (Ec: C2830) at NH2, and A2906 (Ec:
A2883) at 2′-OH, respectively (Figure 13(b)). In this respect, the type II,
III, and IV bifurcated hydrogen bonding interactions play a significant role
in
Table 3. Sugar puckering patterns for base-pairs observed in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures
RNA     [3:3]   [3:2]   [2:3]   [2:2]   [3:o]   [o:2]   Total
16 S    565     17      20      4       1       1       608
23 S    953     64      50      22      1       0       1090
5 S     43      1       3       0       0       0       47
Total   1561    82      73      26      2       1       1745
The sugar puckering pattern for a given base-pair, X:Y, is represented as [m:n], where m and n are either 3, 2, or o: 3, C3′-endo puckering; 2, C2′-endo puckering; o, O4′-endo puckering.
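The counts in Table 3 can be tallied to recover the figures quoted in the sugar puckering section. The following is a small sketch under the assumption that the table rows are transcribed as nested dictionaries; the variable names are hypothetical.

```python
# Counts transcribed from Table 3; keys are the [m:n] puckering patterns.
pucker_counts = {
    "16 S": {"3:3": 565, "3:2": 17, "2:3": 20, "2:2": 4, "3:o": 1, "o:2": 1},
    "23 S": {"3:3": 953, "3:2": 64, "2:3": 50, "2:2": 22, "3:o": 1, "o:2": 0},
    "5 S": {"3:3": 43, "3:2": 1, "2:3": 3, "2:2": 0, "3:o": 0, "o:2": 0},
}

# Per-rRNA totals and the grand total of simple base-pairs.
row_totals = {rna: sum(row.values()) for rna, row in pucker_counts.items()}
total_pairs = sum(row_totals.values())

# Base-pairs with C3'-endo puckering at both nucleotides ([3:3]).
both_c3_endo = sum(row["3:3"] for row in pucker_counts.values())

# "Perturbed" base-pairs: at least one nucleotide is not C3'-endo.
perturbed = total_pairs - both_c3_endo

# Base-pairs with C2'-endo at both nucleotides, and those with any O4'-endo.
both_c2_endo = sum(row["2:2"] for row in pucker_counts.values())
any_o4_endo = sum(row["3:o"] + row["o:2"] for row in pucker_counts.values())

percent_both_c3_endo = round(100 * both_c3_endo / total_pairs)
percent_perturbed = round(100 * perturbed / total_pairs)
```

The derived values match the totals stated in the text: 1561 and 184 base-pairs, of which 26 are C2′-endo at both nucleotides and three involve O4′-endo puckering.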
Figure 13. Bifurcated hydrogen bonds (BHBs) observed
in the rRNAs: (a) U:G fS with and without type II BHBs;
(b) an A-mediated higher-order interaction with type II,
III, and IV BHBs. The dDA values are explicitly illustrated
using broken lines and atoms are assigned different
colors: C, black; N, cyan; O, oxygen; P, orange.
stabilizing folded RNA structure, for example, by
increasing the number of hydrogen bonds in long-
range tertiary interactions.
Isostericity of base-pair conformations
Two or more base-pair types with a topologically identical arrangement of the
two base-pairing nucleotides are structurally equivalent or isosteric.⁴⁵ The
two best-known isosteric base-pairs, C:G WC and U:A WC, are also isosteric
with G:A WC and U:C WC as well as with U:G WC (Figure 4) and C:A WC
(Figure 6); the latter two conformations are theoretically possible with
keto-enol and amino-imino tautomerism, respectively. For example, the
A288:C364 (Ec: A282:U358) base-pair in the 23 S
rRNA has a topological arrangement very similar to
that of the U:A WC conformation (Figure 14(a)). In
theory, all base-pair types other than G:G can form
their corresponding WC conformations, while all
the base-pair types can form the Wb conformation
(Figures 2–11).
In contrast, the U:G Wb conformation is not
isosteric to the standard C:G WC and U:A WC
conformations and occurs less frequently than the
C:G WC and U:A WC conformations (Table 1).
Besides the 124 U:G Wb conformations, our analysis
identified three U:G Wb* and one C:G Wb
conformations in the rRNAs. The C:G Wb conformation
was previously observed with C84:G92 in the 1.6 Å
resolution crystal structure for domain E in the
Thermus flavus 5 S rRNA (PDB, 439D).46
Interestingly, while C:G Wb is isosteric to U:G Wb,
U:G Wb* must be flipped horizontally to be
isosteric to U:G Wb (Figure 14(b)). G647:G724
(Ec: G664:G741) in the 16 S rRNA also adopts a
conformational arrangement similar to that of the
U:G Wb conformation (Figure 14(b)).
When two consecutive nucleotides on a single
RNA strand are base-paired, they form the charac-
teristic, non-canonical pS(*) conformation. The first
set of examples were observed in the adenosine
platform motif in the Tetrahymena group I intron.47
Several more examples of this type of base-pairing
occur in the rRNAs at positions G175:U176 (Ec:
A181:A182) and U624:A625 (Ec: U641:A642) in the
16 S, C1105:A1106 (Ec: A1008:A1009), G1119:U1120
(Ec: G1022:U1023), and G1235:A1236 (Ec:
G1131:U1132) in the 23 S, and A51:A52 in the 5 S
rRNAs. In all of these examples, the base of the
leading 5′ nucleotide always moves into the major
groove and that of the 3′ nucleotide into the minor
groove. A similar but not identical example
involves the two consecutive bases A1193 and
A1194 (Ec: A1089 and A1090) in the L11-binding
region of the 23 S rRNA, which exchange with G
and U, respectively. These two positions form a
base-pair with the pS conformation,19,48,49 while
position A1194 (Ec: A1090) forms a regular base-pair
with position U1205 (Ec: U1101), forming a
base-triple. Moreover, the consecutive GU bases in
the AGUA/GAA motif also form the parallel
sheared conformation, U:G pS* (Figure 4).50
Figure 14. Isostericity between base-pair conformations: (a) U:A WC and C:A WC;
(b) U:G Wb, C:G Wb, U:G Wb*, and G:G Wb. The dDA values are explicitly
illustrated using broken lines and atoms are assigned different colors: C, black; N,
cyan; O, oxygen; P, orange.
Moreover, the G:G H conformation (Figure 10)
observed at positions G294:G549 (Ec: G299:G566) in
the 16 S and G604:G607 (Ec: not homologous) in the
23 S rRNAs is isosteric to all other base-pairs in the H
conformation (Figures 2–11). The G:G H conformation
was originally observed in the NMR and crystal
structures of the G-quadruplex DNAs formed by
telomeric DNA sequences (PDB, 139D, ref. 51; and
1JPQ, ref. 52).
Furthermore, the frequent exchange between G:A
and A:A (or sometimes C:A) at the ends of helices14
can simply be explained by the topological iso-
stericity of the G:A S, A:A S, and C:A S confor-
mations (Figures 5, 6, and 8). The U:A rH and C:A
rH conformations that are frequently observed in
the rRNAs (Table 1) are also topologically similar to
all other base-pairs in the rH conformation
(Figures 2–11).
These isosteric base-pairs covary or exchange with
one another at similar positions in homologous RNA
molecules from phylogenetically different organisms,
without affecting the overall three-dimensional RNA
structures. Therefore, the conformational isostericity
of base-pairs can be applied to rationalize base-pair
exchanges in an alignment of homologous RNA
sequences.
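The way conformational isostericity can be used to rationalize exchanges at an alignment position may be sketched as follows. This is only an illustration under a simplifying assumption: two base-pairs are treated as isosteric when their LG conformation labels (including any asterisk) are identical, which is how the examples in the text behave. The function names are hypothetical and the labels are taken from the examples above.

```python
def parse(label):
    """Split an LG label such as 'G:A S' into (base-pair group, conformation)."""
    group, conformation = label.split()
    return group, conformation


def is_isosteric(label_a, label_b):
    """Simplified rule (an assumption of this sketch): same conformation label,
    asterisk included, regardless of the base-pair group."""
    return parse(label_a)[1] == parse(label_b)[1]


def group_by_conformation(labels):
    """Group the base-pairs observed at one alignment position by conformation."""
    groups = {}
    for label in labels:
        group, conformation = parse(label)
        groups.setdefault(conformation, []).append(group)
    return groups


if __name__ == "__main__":
    print(is_isosteric("C:G WC", "U:A WC"))   # canonical pairs share the WC topology
    print(is_isosteric("U:G Wb", "C:G WC"))   # wobble is not isosteric to WC
    print(group_by_conformation(["G:A S", "A:A S", "C:A S"]))
```

Under this rule the G:A, A:A, and C:A exchanges at helix ends fall into a single S group, whereas U:G Wb and U:G Wb* are kept apart because the latter must be flipped to match.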
Unusual base-pair conformations observed in
other non-ribosomal crystal structures
Along with the 73 base-pair types and conformational
arrangements in the rRNAs (Table 1), six
additional conformational arrangements are identified
in some non-rRNA crystal structures. (1) C:C
rWC (Figure 9) is observed at positions C16:C59 in
the E. coli Cys-tRNA crystal structure (PDB, 1B23; ref. 53).
This same conformation was observed in the
telomeric C-rich sequences forming an unusually
intercalated DNA structure known as the i-motif
(PDB, 105D; ref. 54), which is known to be
stabilized by TMPyP4, a DNA-binding cationic
porphyrin causing chromosomal destabilization.55
(2) U:G H* (Figure 4) is formed at positions G80:U96
in domain E of the T. flavus 5 S rRNA crystal
structure (PDB, 361D; ref. 56). (3) U:G rH* (Figure 4) is
formed at positions U168:G188 in the crystal
structures of the P4–P6 domain of the Tetrahymena
group I intron (PDB, 1GID, ref. 57; and 1HR2, ref. 58).
(4) G:G sWb (Figure 10) is formed at positions G28:G40 in
the UUCG tetraloop crystal structure (PDB, 1F7Y; ref. 43).
(5) G:A rH* (left of the two G:A rH* structures in
Figure 5) is formed at positions G22:A46 in the
(C13:G22)A46 base-triple in the crystal structure of
the Saccharomyces cerevisiae Asp-tRNA complexed
with Asp-tRNA synthetase (PDB, 1ASY; ref. 59). This
conformation can be protonated to have two
hydrogen bonds and is isosteric to G:G rH at
positions G22:G46 of the (C13:G22)G46 base-triple
in the S. cerevisiae Phe-tRNA crystal structure (PDB,
6TNA; ref. 60). (6) U:U rWC (Figure 11) is observed at
positions U1301:U1339 (Ec: G1288:U1326) in the
23 S rRNA in the Deinococcus radiodurans 50 S
crystal structure (PDB, 1LNR; pH 7.8),61
which is equivalent to U:C rWC (Figure 7)
for positions C1394:U1432 (Ec: G1288:U1326) in the
H. marismortui 23 S rRNA. Thus, as the number and
diversity of RNA crystal and NMR structures
increase, we expect to find more of the theoretically
possible arrangements of base-pair types and
conformations shown in Table 1 that have not
yet been observed. Ultimately, this information
will help us understand the biological significance
of the rare non-canonical base-pair conformations.
Discussion
Distribution of base-pair conformations on rRNA
secondary structures
The statistics for the distribution of base-pair
conformations for the 1745 simple base-pairs
observed in the rRNAs in the T. thermophilus 30 S
and the H. marismortui 50 S crystal structures are
summarized in Table 2. Overall, 96% of the base-
pairs associated with secondary structure helices
have the canonical WC and Wb conformations, with
the highest percentage of the canonical confor-
mations in internal regions, followed by two base-
pair helices, helix ends, and lonepair helices. As
expected, the majority (76%; 35 out of 46) of the non-
canonical conformations occurring in helical
regions are associated with the helix ends and
lonepair helices (Table 2).
In contrast to the helical base-pairs with the
canonical conformations, the vast majority (83%) of
the base-pairs associated with the unpaired regions
on the secondary structure models† have
non-canonical conformations. In particular, 93% of the
base-pairs associated with the previously known
structure motifs, such as GNRA and UNCG
tetraloops,7–10 A platforms,13,62 AA.AG at helix.ends
motifs,14 E-loops,47,63–65 tandem GAs,15,16 lonepair
triloops,17,18 K-turns,3 H-turns,23 and sticky motifs
(J.C.L. and R.R.G., unpublished results), contain
non-canonical conformations (Table 2).
As shown in Table 2, a total of 273 base-pairs not
associated with the previously reported motifs are
observed in the unpaired regions of the rRNA
secondary structure models in the T. thermophilus
30 S and the H. marismortui 50 S crystal structures.
These base-pairs either extend secondary helices,
probably by stabilizing the ends of regular secondary
helices, or are involved in the organization and
folding of RNA structure by mediating long-range
tertiary interactions. While 67 (24%) of these
base-pairs have the canonical conformations, the
majority (76%) have non-canonical base-pair
conformations, providing further opportunities for
identifying additional new RNA motifs.
† http://www.rna.icmb.utexas.edu/
Implications of non-canonical base-pair
conformations in RNA structure
Bases on a single RNA chain are vertically
projected from the backbone to minimize steric
clashes between bases and sugar rings and, simul-
taneously, consecutive sugar rings in the backbone
are helically twisted to minimize their steric
collisions, intrinsically leading to the helical stack-
ing of bases in the RNA chain. Thus, while
maintaining the structural integrity in each RNA
chain, the base-pairs within regular secondary
helices are structurally constrained to adopt the
canonical WC and Wb conformations (Table 2). In
contrast to the internal base-pairs structurally
locked in the WC and Wb conformations, the
base-pairs outside or at the termini of regular
helices are subject to conformational change,
leading to diversity of base-pair conformations.
Nonetheless, many non-helical base-pairs are
locked in RNA structure motifs and adopt a
consistently defined set of non-canonical base-pair
conformations, suggesting that base-pair conformations
are context-dependent. Thus, the more
constrained the base-pairs, the less diverse their
conformations. The base-pair conformations
adopted by known RNA motifs in rRNAs are
shown in Table 4.
The four RNA motifs, the GNRA and UNCG
tetraloops, A platforms, E-loops, and K-turns,
involve a specific set of base-pair types and
consistently form a unique conformation for each
base-pair (Table 4). For example, the GNRA
tetraloops in the rRNAs form the G:A S conformation
and are usually involved in long-range tertiary
interactions (data not shown). The A platforms
occurring in internal loops associated with the
5′-CUAAG/UAUG-3′ sequence serve as a receptor
for the GAAA tetraloop and consistently form
four base-pairs, C:G WC, U:A rH, A:A pS, and
G:U Wb. The E-loop motifs50,63–65 occur in internal
or multi-stem loops and form a defined set of
base-pair conformations, A:A rS, U:A rH and G:U pS*,
and A:G S (or A:A S). Interestingly, the E-loop motif
with the AGUA/GAC sequence has the A:C rS
conformation for its leading base-pair, which has no
hydrogen bonds between the two bases but is a
topological isostere to the A:A rS conformation for
the E-loop motif with the AGUA/GAA sequence.
However, an additional non-canonical base-pair
immediately outside of the E-loop motif forms
different conformations, depending on its
sequence context: the S conformation with the
AGUA/GAA sequence and the sWb conformation
with the AGUA/GAC sequence (data not shown).
The K-turns3 occur in asymmetric internal loops
and form two discrete base-pairs, C:G WC and G:A
S, with usually three to four intervening unpaired
nucleotides leading to a sharp turn in the backbone.
Table 4. Base-pairs and their conformations formed in RNA motifs
Motif                       Base-pairs   Conformations         Comments
Tetraloops7–10              G:A          S                     5′-GNRA-3′: hairpin loops
                            U:G          fS                    5′-UNCG-3′: hairpin loops
A platforms13,62            C:G          WC                    5′-CUAAG/UAUG-3′: internal loops
                            U:A          rH
                            A:A          pS
                            U:A          Wb
E-loops51,63–65             A:A (A:C)    rS                    5′-AGUA/GAA-3′: internal loops
                            G(U:A)(a)    U:A rH and G:U pS*
                            A:G (A:A)    S
K-turns3                    C:G          WC                    5′-AG/CNNNG-3′: internal loops
                            G:A          S
Lonepair triloops17         U:A          rH, WC                R1 LPTL (5′-UGNRA-3′): hairpin loops
                            C:A          rH, rWb               R2 LPTL (5′-UUYRA-3′): hairpin loops
                            C:G          WC, rWC
                            G:A          S, H
H-turns23                   G:A          S                     5′-GA/UA-3′: multi-stem loops
                            U:A          rH
AA.AG at helix.ends14       G:A (C:A)    S, WC, H*, rH, Sv(b)  G:A with G 3′ and A 5′ to helix
                            A:A          S, Wb
                            G:A          WC                    G:A with G 5′ and A 3′ to helix
Tandem GAs15–16             G:A          S                     5′-GA/GA-3′: internal and multi-stem loops
                            A:A          S, rS                 5′-GA/AA-3′: internal and multi-stem loops
                            U:A          rH                    5′-GA/UA-3′: internal loops
A-mediated interactions(c)  A:G          rS, rpS, fS, pfS      Unpaired As in long-range tertiary interactions
                            C:Y          fS, pfS
N = {A, C, G, U}, R = {A, G}, and Y = {C, U}.
(a) The G(U:A) base-triple with the U:A rS and G:U pS* conformations is sandwiched between A:A rS and A:G S.
(b) Sv represents an S-like conformation with two bases vertically arranged. It may be an intermediate between G:A S and G:A rS.
(c) The A-mediated interactions include the A-minor motifs.21,22
In contrast, the remaining five RNA motifs form
base-pairs, each of which is capable of having
several different conformations, depending on the
structure context. For example, the R1 and R2
LPTLs occur in hairpin loops with the UGNRA and
UUYRA sequences, respectively, and allow only the
U:A rH (or sometimes C:A rH) conformation for
their lonepair, due to their constrained sequence
and structure. These two groups of LPTLs are
involved in long-range tertiary interactions by
recruiting an unpaired A between the fourth base
in the triloop and the 3′ base of the lonepair; the
three-dimensional structure of the resulting hairpin
loop mimics that of the GNRA tetraloops.17
However, lonepairs in the other LPTLs containing
variable loop sequences are not restricted to any
specific conformation; they depend on the base-pair
type and their structural context. For instance, the
R3 LPTLs containing the UAA triloop sequence
occur in the multi-stem loop and the third position
of the triloop is involved in a long-range tertiary
interaction, although their lonepair conformation is
dependent on the lonepair type.17 The H-turns23
form two base-pairs in multi-stem loops, G:A S and
U:A rH, but either of the two base-pairs may fail to
form, probably because of insufficient structural
constraints upon forming a sharp hook-turn of one
strand.
The AA.AG at helix.ends motif14 involves a
single base-pair, either G:A (with G 3′ and A 5′ to a
regular secondary helix) or A:A, and usually
forms G:A S and A:A S, stabilizing helix ends by
preventing any potential structural perturbation
from being further propagated into helical stems.
Our analysis also revealed that some AA.AG at
helix.ends base-pairs exchange with C:A, forming
C:A S isosteric to G:A S (data not shown). In
addition, some other exceptional conformations can
be formed, depending on the structural context
(Table 4). Specifically, the G:A S and A:A S
conformations occur 100% of the time in hairpin loops, 82%
in internal loops (with 8% WC), and 61% in
multi-stem loops (with 13% WC).14 In addition, several
AA.AG at helix.ends motifs in internal and
multi-stem loops are not formed in the rRNA crystal
structures (data not shown). Consequently, the
AA.AG at helix.ends base-pairs are highly
constrained in hairpin loops, constrained in internal
loops, and relatively unconstrained in multi-stem
loops. In contrast, the eight G:A base-pairs with the
reversed orientation (G 5′ and A 3′ to helix) always
adopt the G:A WC conformation with the long dCC
of 12.6 Å in the rRNAs (Figure 12(a)),14 which were
recently rediscovered as the cis Watson–Crick A/G
base-pairs,66 and are involved in helical stacking
(data not shown).
The tandem GA motifs15,16 occur in internal or
multi-stem loops and are composed of the GA/GA,
GA/AA, and GA/UA sequences, wherein the GA/UA
sequence always exists as part of the E-loops
and forms U:A rH and A:G S. Both of the tandem
GA base-pairs usually adopt G:A S (or A:A S) in
2×2 internal loops. In large internal and multi-stem
loops, the helix-side base-pair forms G:A S (or A:A S)
and the loop-side base-pair is frequently not formed.
Interestingly, however, the loop-side A:A base-pairs
form the A:A rS conformation (data not shown).
The A-mediated interactions involve the N1 and
N3 positions of A, which are intrinsically nucleo-
philic due to the electron-donating amino group.
Consequently, many unpaired A bases in the rRNA
secondary structure models are involved in tertiary
interactions with other sections of the RNA chain.
In particular, such unpaired A bases frequently
interact in the minor groove with C:G (or sometimes
U:G and U:A) within helical stems. Due to the
lack of any structural constraint, however, the
A-mediated tertiary interactions lead to diverse
conformations depending on the topology of an
unpaired A in the minor groove of a helical stem
(Table 4). The most common A-mediated tertiary
interaction employs an unpaired A at the N3
position and the G of a C:G, forming either G:A rS
(known as type I A-minor motif21,22) or G:A rpS
(Figure 5). The alternative use of the N1 position of
the unpaired A results in the G:A fS or G:A pfS
conformation (Figure 5). On the other hand, when
the unpaired A interacts with a pyrimidine (C or U),
its amino group is hydrogen-bonded to the
pyrimidine carbonyl group in the minor groove,
leading to either C:A fS or C:A pfS (Figure 6) and
U:A fS (Figure 3).
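The rules in this paragraph amount to a small lookup from the partner base and the group of the unpaired A that is used to the conformations that may result. The sketch below simply restates those rules as data; the dictionary, the key spelling ("N3", "N1", "amino"), and the function name are illustrative assumptions rather than part of the LG system.

```python
# (partner base in the helix, group used on the unpaired A) -> possible conformations.
# Entries restate the text above; names and key spellings are hypothetical.
A_MEDIATED = {
    ("G", "N3"): ["G:A rS", "G:A rpS"],    # G:A rS is the type I A-minor motif
    ("G", "N1"): ["G:A fS", "G:A pfS"],
    ("C", "amino"): ["C:A fS", "C:A pfS"],
    ("U", "amino"): ["U:A fS"],
}


def candidate_conformations(partner, a_group):
    """Return the conformations listed in the text for this combination,
    or an empty list when the text does not describe it."""
    return A_MEDIATED.get((partner, a_group), [])


if __name__ == "__main__":
    for key in A_MEDIATED:
        print(key, "->", candidate_conformations(*key))
```

A combination that the text does not describe, for example a pyrimidine partner contacted through N3, returns an empty list instead of a guess.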
In this regard, none of the wide variety of rare
non-canonical conformations observed in RNA
structure should be ignored. Although rare,
each conformation, or a cluster of them, may be
structurally and biologically relevant, either by
organizing nearby local structures or by playing a
critical role in RNA folding through long-range
tertiary interactions.
Evaluation of the existing naming systems
This new Lee–Gutell (LG) system is based on
the topology of base–base interactions and unam-
biguously describes all possible arrangements
and orientations of the two bases that are
hydrogen-bonded to each other, even without the
explicit inclusion of the base-backbone interactions
between the 2′-OH group of one nucleotide and the
base of the other. Table 5 compares the LG system
with the existing systems, including the common
designation (CD) system,5,6,24,27,28 the
Leontis–Westhof (LW) system,25 and the Saenger system.67
While the majority of base-pair conformations
correspond between the first three systems, the
LG system has several advantages over the existing
systems.
First, the LG system is simple, systematic, and
convenient to use; instead of the long names, it uses
short names. The CD system based on the interact-
ing chemical groups is not easy to use except for
some traditional names such as Watson–Crick,
Table 5. Correspondence between the ten base-pair groups described here and 14 theoretically possible base-pair
conformations and the three primary naming systems
This work Common designation5,6,24,27,28
Leontis & Westhof25
Saenger67
C:G WC GC Watson–Crick CG cis WC/WC XIX (WC)
Wb GC NH-COa
– –
sWC – – –
sWb # # #
rWC CG reverse Watson–Crick CG trans WC/WC XXII (rWC)
rWb – – –
H(*) GC Hoogsteen (–) CC
G cis WC/H (–) –
rH(*) GC NH2-COa
(–) CC
G trans WC/H (–) –
S(*) – – –
rS(*) GC N7-NH2
a
(–) CG trans H/H (GC trans S/S) –
fS(*) – (GC N3-NH2, NH2-N3) GC trans WC/S (CG trans WC/S) –
pfS(*) GC NH-CO (–) GC cis WC/S (CG cis WC/S) –
pS(*) – (GC N3-NH2
a
– (CG cis H/S) –
rpS(*) GC CO-NH2
a
(GC NH2-COa
) – (CG cis S/S or GC cis S/Sb
) –
U:A WC AU Watson–Crick UA cis WC/WC XX (WC)
Wb – – –
sWC AU NH2K2-COa
– –
sWb # # #
rWC AU reverse Watson–Crick UA trans WC/WC XXI (rWC)
rWb – – –
H(*) AU Hoogsteen (AU NH2-4-COa
) UA cis WC/H (AU cis WC/H) XXIII (H)
rH(*) AU reverse Hoogsteen (–) UA trans WC/H (–) XXIV (rH)
S(*) – AU trans H/S or AU cis WC/S –
rS(*) – UA trans H/H or UA trans S/Sb
(–) –
fS(*) AU NH2K2-CO (AU N3-NH2
a
) UA cis WC/S (–) –
pfS(*) – AU cis WC/S (UA cis WC/S) –
pS – AU cis H/S (–) –
rpS(*) – UA cis S/Sb
(AU cis S/Sb
) –
U:G WC – – –
Wb(*) GU wobble (GU NH-4-COa
) UG cis WC/WC (–) XXVIII
sWC # # #
sWb – – –
rWC – – –
rWb GU reverse wobble UG trans WC/WC XXVII
H(*) GU CO-NHa
(–) UG cis WC/H (–) –
rH(*) GU N7-NHa
(–) UG trans WC/S (GU trans WC/S) –
S(*) – – (UG trans H/S) –
rS(*) – – (GU trans S/S) –
fS(*) GU N3-NH, NH2-CO (–) GU trans WC/S (UG trans WC/S) –
pfS(*) – (GU NH2-4-CO) GU cis WC/S (UG cis WC/S) –
pS(*) – (GU NH2K2-CO) GU cis H/S (–) –
rpS(*) – GU cis S/Sb
(UG cis S/S) –
G:A WC GA imino GA cis WC/WC VIII
Wb – – –
sWC # # #
sWb # # #
rWC – – –
rWb – – –
H(*) GA N7–N1,CO-NH2 (GAC
N7–N1,CO-NH2) GA cis WC/H (AC
G cis WC/H) IX
rH(*) – – (AG trans WC/H) –
S(*) GA sheared or GA NH2-N7a
(–) AG trans H/S or AG cis WC/S (–) XI
rS(*) GA NH2-N3a
(–) GA transH/H or GA trans S/S (AG trans S/S) –
fS(*) GA N3-NH2, NH2-N1 (–) AG trans WC/S (–) X
pfS(*) GA NH2-N1 or GA N3-NH2
a
(–) AG cis WC/S (GA cis WC/S) –
pS(*) – AG cis H/S (–) –
rpS(*) – GA cis S/Sb
or GA cis H/H (AG cis S/S) –
C:A WC – – –
Wb(*) AC wobble (AC N1-NH2
a
) CC
A cis WC/WC (–) –
sWC # # #
sWb AC N1-NH2
a
– –
rWC – – –
rWb AC reverse wobble CA trans WC/WC XXVI
H(*) AC N7-NH2 (NH2-2-COa
) – –
rH(*) AC reverse Hoogsteen (–) CA trans WC/H (–) XXV
S(*) – AC trans H/S or AC cis WC/S
(CA trans H/S or CA cis WC/S)
–
rS(*) – CA trans H/H (–) –
fS(*) – AC trans WC/S (CA trans WC/S) –
pfS(*) – (AC N3-NH2
a
) – –
pS(*) – AC cis H/S (CA cis H/S) –
rpS(*) # # CA cis S/Sb
(AC cis S/Sb
) #
U:C WC UC 4-CO-NH2 UC cis WC/WC XVIII
Wb – – –
sWC UC 2-CO-NH2
a
– –
sWb # # #
rWC UC 2-CO-NH2
a
UC trans WC/WC XVII
rWb – – –
H(*) – – (CU cis WC/H) –
rH(*) – – –
S(*) – CU trans H/S (–) –
rS(*) UC 2-CO-NH2
a
(–) UC trans H/S (–) –
fS(*) – (UC NH-CO) CU trans WC/S (UC trans WC/S) –
pfS(*) – CU cis WC/S (UC cis WC/S) –
pS – CU cis H/S (–) –
rpS(*) – UC cis S/Sb
(CU cis S/Sb
) –
A:A WC – – –
Wb AA N1-NH2 AA cis WC/WC –
sWC # # #
sWb # # #
rWC # # #
rWb AA N1-NH2, sym AA trans WC/WC I
H AA N7-NH2
a
– –
rH AA N7-NH2 AA trans WC/H V
S AA sheared or AA N3-NH2
a
AA trans H/S or AA cis WC/S –
rS AA N7-NH2, sym AA trans H/H II
fS – AA transWC/S –
pfS – – –
pS – AA cis H/S –
rpS # AA cis S/Sb
#
C:C WC – – –
Wb CC N3-CO, NH2-N3 CC
C cis WC/WC –
sWC # # #
sWb CC CO-NH2
a
– –
rWC CC CO-NH2, sym CC trans WC/WC –
rWb CC N3-NH2, sym – XIV
rWC – CC trans WC/WC XV
H – CC cis WC/H –
rH – CC trans WC/H –
S – CC trans H/S or CC cis WC/S –
rS # # #
fS – – –
pfS – CC trans WC/S –
pS – CC cis H/S –
rpS # CC cis S/Sb
#
G:G WC # # #
Wb GG CO-NHa
GG cis WC/WC –
sWC # # #
sWb – – –
rWC # # #
rWb GG N1-CO, sym GG trans WC/WC III
H GG N1-CO, N7-NH2 GG cis WC/H VI
rH GG N7-NH GG trans WC/H VII
S GG NH2-N7a
GG trans H/S –
rS GG N3-NH2, sym GG trans S/S IV
fS – – –
pfS GG N3-NH2
a
GG cis WC/S –
pS – – –
rpS – GG cis S/S –
U:U WC – – –
Wb UU NH-CO UU cis WC/WC XVI
sWC # # #
sWb – – –
rWC – – –
rWb(*) UU 2-CO-NH,sym (UU 4-CO-NH2, sym) UU trans WC/WC XII, XIII
H – UU cis WC/H –
rH UU 4-CO-C5H, NH-4-CO UU trans WC/H –
wobble, Hoogsteen, and reverse Hoogsteen. This
system also needs the explicit designation of the
number of hydrogen bonds to avoid confusing
names between different conformations. In contrast,
the LW system requires the explicit designation of
the relative orientations (cis or trans) of the
glycosidic bonds. Second, the LG system describes
the topological arrangements of the two bases in a
given base-pair, regardless of the presence or
absence of the hydrogen bond between the 2′-OH
group of one nucleotide and the base of the other,
which is required for many base-pairs in the LW
system (e.g. cis WC/S, trans WC/S, and trans H/S).
However, our analysis has revealed many cis
WC/S, trans WC/S, and trans H/S conformations
that do not have the 2′-OH-mediated hydrogen
bond, suggesting that these conformations
fluctuate. In particular, the G:A S conformation has two
names in the LW system, AG cis WC/S and AG
trans H/S, both of which are topologically equival-
ent with one another in the crystal structures. Third,
the LG system is not dependent on the order of two
paired nucleotides, but instead it is based on the
base-pair groups (Table 1). The alternative names
with an asterisk are used for the different relative
orientation of the two bases, instead of switching
the order of the two nucleotides. For example, while
the GU trans WC/S conformation in the LW system
can be described as U:G fS or G:U fS with the LG
system, the UG trans WC/S conformation in the LW
system can be described as U:G fS* or G:U fS*
(Figure 4). Fourth, the LG system describes more
base-pair conformations. The LW system describes
74 of the 121 major conformations (exclusive of the
alternative conformations) defined by the LG
system (Table 5). For example, several conformations
including U:G Wb*, C:G Wb, and C:A Wb*
in the LG system are not described with the
LW system. In addition, the sWC (and sWb)
conformations are unique to the LG system. In
fact, the LW system describes 84 of the 119 confor-
mations involved in simple base-pairs and higher-
order interactions, inclusive of the alternative
conformations, which are described with the LG
system and are present in the set of crystal
structures analyzed here (data not shown). Fifth,
the LG system also provides formal names for
higher-order interactions involved in base-triples
and quadruples (Table 1), while the LW system does
not. Sixth, the LG system may be used to trace the
intermediates for the topological changes of base-
pair conformations. For example, the intermediate
conformations may be derived for the topological
transition between the three topologically related
conformations, C:A S, C:A rH and C:A sWb
(Figure 6). Together, the established topological
isostericity between base-pairs can be associated
with the base-pair exchange patterns in an align-
ment of homologous RNA sequences to predict
base-pair conformations.
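The naming examples given above can be collected into a small translation table. The sketch below covers only the LW names explicitly discussed in the text and is not a full rendering of Table 5; the dictionary and function names are illustrative assumptions.

```python
# LW name -> LG name, restricted to the examples discussed in the text.
# (The U:G entries could equally be written G:U fS and G:U fS*.)
LW_TO_LG = {
    "GU trans WC/S": "U:G fS",
    "UG trans WC/S": "U:G fS*",
    "AG cis WC/S": "G:A S",
    "AG trans H/S": "G:A S",
}


def to_lg(lw_name):
    """Translate an LW name to its LG name, or None if it is not in the table."""
    return LW_TO_LG.get(lw_name)


def lw_synonyms(lg_name):
    """List every LW name in the table that maps to the given LG name."""
    return sorted(lw for lw, lg in LW_TO_LG.items() if lg == lg_name)


if __name__ == "__main__":
    print(to_lg("UG trans WC/S"))
    print(lw_synonyms("G:A S"))
```

The second call illustrates the point made for G:A S: two LW names collapse onto one LG name, while swapping the nucleotide order in the LW name changes only the asterisk in the LG name.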
Materials and Methods
Analysis and classification of base-pair
conformations
Base-pairs in the rRNAs in the crystal structures of the
T. thermophilus 30 S (PDB, 1FJF; ref. 1) and H. marismortui 50 S
(PDB, 1FFK, ref. 2; and 1JJ2, ref. 3) ribosomal subunits were visually
identified and characterized using the RasMol program.30,31
The base-pairs were then (1) divided into ten
base-pair groups, C:G, U:A, U:G, G:A, C:A, U:C, A:A, C:C,
G:G, and U:U, and (2) classified into 14 major families
based on the topological arrangement of the two bases
and two glycosidic bonds of a given base-pair (Table 1).
Two bases that form base-backbone hydrogen bonding
interactions with the 2′-OH group and no direct donor–
acceptor interactions between the two bases are not
considered as a base-pair with our classification system.
The wavy lines in base-pair conformation Figures 2–11
represent hydrogen bonds once the base is protonated.
Since the base-pairs that can form bifurcated hydrogen
bonding interactions usually maintain the same topolo-
gical arrangement in the presence and absence of
bifurcated hydrogen bonds (BHBs), the BHBs are not
shown in Figures 2–11. In addition, the base-pairs that can
theoretically form their keto-enol and amino-imino
tautomers were depicted in Figures 2–11. Hydrogen
bonds were typically considered when the distance
between the hydrogen bond donor and acceptor, dDA,
is less than 3.5 Å. While it was not possible to measure the
angles for hydrogen bonding interactions due to the lack of
hydrogen atoms in the crystal structures, hydrogen bonds
were considered to form linear and nearly linear hydrogen
bonding interactions. Base-pair positions for the rRNAs are
represented using the T. thermophilus numbering for the
16 S rRNA and the H. marismortui numbering for the 23 S
rRNA, with the E. coli numbering in parentheses†.
Table 5 (continued)
This work   Common designation5,6,24,27,28   Leontis & Westhof25   Saenger67
S           –                                –                     –
rS          –                                –                     –
fS          –                                UU trans WC/S         –
pfS         –                                UU cis WC/S           –
pS          –                                –                     –
rpS         –                                UU cis S/S(b)         –
The hash sign (#) represents the conformations that are not likely to form a hydrogen bond(s) between two base-pairing bases, and the
long dash mark (–) represents the conformations that were not assigned by the three other naming systems. The conformations available
(at http://prion.bchs.uh.edu/bp_type/) are simply represented by using acronyms; CO for carbonyl, NH for imino, NH2 for amino,
and sym for symmetric.
(a) Base-pair conformations with a single hydrogen bond are explicitly designated.
(b) Base-pair conformations proposed by the LW system, which have either base–backbone or backbone–backbone hydrogen bonding
interactions between two nucleotides. These interactions were not considered as base-pairs and are not included in our classification
system for simplicity (Figures 2–11).
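The donor–acceptor distance criterion described in Materials and Methods can be expressed as a simple test on atomic coordinates. The following is a minimal sketch only: the base-pairs in this work were identified visually in RasMol, not with code, and the function names and example coordinates are hypothetical. No angle test is included, consistent with the absence of hydrogen atoms in the crystal structures.

```python
import math

D_DA_CUTOFF = 3.5  # donor-acceptor distance cutoff in angstroms, as stated in the text


def donor_acceptor_distance(donor_xyz, acceptor_xyz):
    """Euclidean distance between two (x, y, z) coordinates in angstroms."""
    return math.dist(donor_xyz, acceptor_xyz)


def is_hydrogen_bond(donor_xyz, acceptor_xyz, cutoff=D_DA_CUTOFF):
    """Apply the distance-only criterion: dDA strictly less than the cutoff."""
    return donor_acceptor_distance(donor_xyz, acceptor_xyz) < cutoff


if __name__ == "__main__":
    # Hypothetical coordinates, chosen only to exercise the function.
    n1 = (0.0, 0.0, 0.0)
    n3 = (2.9, 0.0, 0.0)
    far = (3.0, 4.0, 0.0)
    print(is_hydrogen_bond(n1, n3))   # 2.9 angstroms apart
    print(is_hydrogen_bond(n1, far))  # 5.0 angstroms apart
```

In practice the coordinates would come from the heavy-atom records of a PDB file such as 1FJF or 1JJ2; reading those files is left out of this sketch.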
Acknowledgements
We thank Jamie J. Cannone for proofreading the
manuscript. This work was supported by the
National Institutes of Health (GM067317), the
Welch Foundation (F-1427), start-up funds from
the Institute for Cellular and Molecular Biology at
the University of Texas at Austin, and Ibis Thera-
peutics, a division of Isis Pharmaceuticals.
References
1. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr,
Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al.
(2000). Structure of the 30 S ribosomal subunit. Nature,
407, 327–339.
2. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz,
T. A. (2000). The complete atomic structure of the
large ribosomal subunit at 2.4 Å resolution. Science,
289, 905–920.
3. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz,
T. A. (2001). The kink-turn: a new RNA secondary
structure motif. EMBO J. 20, 4214–4221.
4. Gutell, R. R., Lee, J. C. & Cannone, J. J. (2002). The
accuracy of ribosomal RNA comparative structure
models. Curr. Opin. Struct. Biol. 12, 301–310.
5. Watson, J. D. & Crick, F. H. C. (1953). Molecular
structure of nucleic acids: a structure for deoxy-
ribonucleic acid. Nature, 171, 737–738.
6. Crick, F. H. (1966). Codon-anticodon pairing: the
wobble hypothesis. J. Mol. Biol. 19, 548–555.
7. Woese, C. R., Winker, S. & Gutell, R. R. (1990).
Architecture of ribosomal RNA: constraints on the
sequence of “tetra-loops”. Proc. Natl Acad. Sci. USA,
87, 8467–8471.
8. Cheong, C., Varani, G. & Tinoco, I., Jr (1990). Solution
structure of an unusually stable RNA hairpin,
5′GGAC(UUCG)GUCC. Nature, 346, 680–682.
9. Heus, H. A. & Pardi, A. (1991). Structural features that
give rise to the unusual stability of RNA hairpins
containing GNRA loops. Science, 253, 191–194.
10. Jucker, F. M. & Pardi, A. (1995). GNRA tetraloops
make a U-turn. RNA, 1, 219–222.
11. Jucker, F. M. & Pardi, A. (1995). Solution structure of
the CUUG hairpin loop: a novel RNA tetraloop motif.
Biochemistry, 34, 14416–14427.
12. Gautheret, D., Konings, D. & Gutell, R. R. (1995).
G·U base pairing motifs in ribosomal RNA. RNA, 1,
807–814.
13. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K.,
Golden, B. L., Szewczak, A. A. et al. (1996). RNA
tertiary structure mediation by adenosine platforms.
Science, 273, 1676–1677.
14. Elgavish, T., Cannone, J. J., Lee, J. C., Harvey, S. C. &
Gutell, R. R. (2001). AA.AG at helix ends: A:A and
A:G base-pairs at the ends of 16 S and 23 S rRNA
helices. J. Mol. Biol. 310, 735–753.
15. SantaLucia, J., Kierzek, R. & Turner, D. H. (1990).
Effects of GA mismatches on the structure and
thermodynamics of RNA internal loop. Biochemistry,
29, 8813–8819.
16. Gautheret, D., Konings, D. & Gutell, R. R. (1994).
A major family of motifs involving GA mismatches in
ribosomal RNA. J. Mol. Biol. 242, 1–8.
17. Lee, J. C., Cannone, J. J. & Gutell, R. R. (2003).
Lonepair triloop: a new motif in RNA structure.
J. Mol. Biol. 325, 65–83.
18. Nagaswamy, U. & Fox, G. E. (2002). Frequent
occurrence of the T-loop RNA folding motif in
ribosomal RNAs. RNA, 8, 1112–1119.
19. Quigley, G. J. & Rich, A. (1976). Structural domains of
transfer RNA molecules. Science, 194, 796–806.
20. Gutell, R. R., Cannone, J. J., Konings, D. & Gautheret,
D. (2000). Predicting U-turns in ribosomal RNA with
comparative sequence analysis. J. Mol. Biol. 300,
791–803.
21. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz,
T. A. (2001). RNA tertiary interactions in the large
ribosomal subunit: the A-minor motif. Proc. Natl Acad.
Sci. USA, 98, 4899–4903.
22. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna,
J. A. (2001). A universal mode of helix packing in
RNA. Nature Struct. Biol. 8, 339–343.
23. Szép, S., Wang, J. & Moore, P. B. (2003). The crystal
structure of a 26-nucleotide RNA containing a hook-
turn. RNA, 9, 44–51.
24. Nagaswamy, U., Larios-Sanz, M., Hury, J., Collins, S.,
Zhang, Z., Zhao, Q. & Fox, G. E. (2002). NCIR: a
database of non-canonical interactions in known RNA
structures. Nucl. Acids Res. 30, 395–397.
25. Leontis, N. B., Stombaugh, J. & Westhof, E. (2002). The
non-Watson–Crick base pairs and their associated
isostericity matrices. Nucl. Acids Res. 30, 3497–3531.
26. Lemieux, S. & Major, F. (2002). RNA canonical and non-
canonical base pairing types: a recognition method and
complete repertoire. Nucl. Acids Res. 30, 4250–4263.
27. Walberer, B. J., Cheng, A. C. & Frankel, A. D. (2003).
Structural diversity and isomorphism of hydrogen-
bonded base interactions in nucleic acids. J. Mol. Biol.
327, 767–780.
28. Donohue, J. (1956). Hydrogen-bonded helical
configurations of polynucleotides. Proc. Natl Acad. Sci.
USA, 42, 60–65.
29. Donohue, J. & Trueblood, K. N. (1960). Base-pairing in
DNA. J. Mol. Biol. 2, 363–371.
30. Sayle, R. A. & Milner-White, E. J. (1995). RasMol:
biomolecular graphics for all. Trends Biochem. Sci. 20,
374–376.
31. Bernstein, H. J. (2000). Recent changes to RasMol: recombining
the variants. Trends Biochem. Sci. 25, 453–455.
32. Dickerson, R. E., Bansal, M., Calladine, C. R.,
Diekmann, S., Hunter, W. N., Kennard, O. et al.
(1989). Definitions and nomenclature of nucleic acid
structure parameters. EMBO J. 8, 1–4.
33. Murray, L. J. W., Arendall, W. B., III, Richardson, D. C.
& Richardson, J. S. (2003). RNA backbone is rotameric.
Proc. Natl Acad. Sci. USA, 100, 13904–13909.
34. Puglisi, J. D., Wyatt, J. R. & Tinoco, I., Jr (1990).
Solution conformation of an RNA hairpin loop.
Biochemistry, 29, 4215–4226.
35. SantaLucia, J., Jr, Kierzek, R. & Turner, D. H. (1991).
Stabilities of consecutive AC, CC, GG, UC, and UU
mismatches in RNA internal loops: evidence for
stable hydrogen-bonded UU and CC+ pairs.
Biochemistry, 30, 8242–8251.

† The Tables and Figures shown here are also available
at http://www.rna.icmb.utexas.edu/ANALYSIS/BPC/.

36. Biala, E. & Strazewski, P. (2002). Internal mismatched
RNA: pH and solvent dependence of the thermal
unfolding of tRNAAla acceptor stem microhairpins.
J. Am. Chem. Soc. 124, 3540–3545.
37. Leitner, D., Schröder, W. & Weisz, K. (2000). Influence
of sequence-dependent cytosine protonation and
methylation on DNA triplex stability. Biochemistry,
39, 5886–5892.
38. Pioletti, M., Schlünzen, F., Harms, J., Zarivach, R.,
Glühmann, M., Avila, H. et al. (2001). Crystal structures
of complexes of the small ribosomal subunit with
tetracycline, edeine and IF3. EMBO J. 20, 1829–1839.
39. Gutell, R. R. (1993). The simplicity behind the
elucidation of complex structure in ribosomal RNA.
In The Translational Apparatus (Nierhaus, J. H. et al.,
eds), pp. 477–488, Plenum Press, New York.
40. Gutell, R. R. (1996). Comparative sequence analysis
and the structure of 16 S and 23 S rRNA. In Ribosomal
RNA: Structure, Evolution, Processing, and Function in
Protein Synthesis (Dahlberg, A. E. & Zimmerman,
R. A., eds), pp. 111–128, CRC Press, Boca Raton, FL.
41. Gutell, R. R. & Woese, C. R. (1990). Higher-order
structural elements in ribosomal RNAs: pseudoknots
and the use of non-canonical pairs. Proc. Natl Acad.
Sci. USA, 87, 663–667.
42. Fabiola, G. F., Krishnaswamy, S., Nagarajan, V. &
Pattabhi, V. (1997). C–H/O hydrogen bonds in beta-
sheets. Acta Crystallog. sect. D, 53, 316–320.
43. Lu, M. & Steitz, T. A. (2000). Structure of Escherichia
coli ribosomal protein L25 complexed with a 5 S rRNA
fragment at 1.8 Å resolution. Proc. Natl Acad. Sci. USA,
97, 2023–2028.
44. Ennifar, E., Nikulin, A., Tishchenko, S., Serganov, A.,
Nevskaya, N., Garber, M. et al. (2000). The crystal
structure of UUCG tetraloop. J. Mol. Biol. 304, 35–42.
45. Gautheret, D. & Gutell, R. R. (1997). Inferring the
conformation of RNA base pairs and triples from
patterns of sequence variation. Nucl. Acids Res. 25,
1559–1564.
46. Perbandt, M., Vallazza, M., Lippmann, C., Betzel, C. &
Erdmann, V. A. (2000). Structure of an RNA duplex
with an unusual GC pair in wobble-like conformation at
1.6 Å resolution. Acta Crystallog. sect. D, 57, 219–224.
47. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K.,
Golden, B. L., Szewczak, A. A. et al. (1996). RNA
tertiary structure mediation by adenosine platforms.
Science, 273, 1696–1699.
48. Conn, G. L., Draper, D. E., Lattman, E. E. & Gittis,
A. G. (1999). Crystal structure of a conserved
ribosomal protein–RNA complex. Science, 284, 1171.
49. Wimberly, B. T., Guymon, R., McCutcheon, J. P.,
White, S. W. & Ramakrishnan, V. (1999). A detailed
view of a ribosomal active site: the structure of the
L11-RNA complex. Cell, 97, 491.
50. Wimberly, B., Varani, G. & Tinoco, I., Jr (1993). The
conformation of loop E of eukaryotic 5 S ribosomal
RNA. Biochemistry, 32, 1078–1087.
51. Wang, Y. & Patel, D. J. (1993). Solution structure of a
parallel-stranded G-quadruplex DNA. J. Mol. Biol.
234, 1171–1183.
52. Haider, S., Parkinson, G. N. & Neidle, S. (2002). Crystal
structure of the potassium form of an Oxytricha nova G-
quadruplex. J. Mol. Biol. 320, 189–200.
53. Nissen, P., Kjeldgaard, M., Thirup, S. & Nyborg, J. (1999).
The crystal structure of Cys-tRNACys-EF-Tu-GDPNP
reveals general and specific features in the ternary
complex and in tRNA. Struct. Fold. Des. 7, 143–156.
54. Phan, A. T., Gueron, M. & Leroy, J.-L. (2000). The
solution structure and internal motions of a fragment
of the cytidine-rich strand of the human telomere.
J. Mol. Biol. 299, 123–144.
55. Fedoroff, O. Y., Rangan, A., Chemeris, V. V. & Hurley,
L. H. (2000). Cationic porphyrins promote the formation
of i-motif DNA and bind peripherally by a non-intercalative
mechanism. Biochemistry, 39, 15083–15090.
56. Perbandt, M., Nolte, A., Lorenz, S., Bald, R., Betzel, C.
& Erdmann, V. A. (1998). Crystal structure of domain
E of Thermus flavus 5 S rRNA: a helical RNA structure
including a hairpin loop. FEBS Letters, 429, 211–215.
57. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K.,
Golden, B. L., Kundrot, C. E. et al. (1996). Crystal
structure of a group I ribozyme domain: principles of
RNA packing. Science, 273, 1678–1685.
58. Juneau, K., Podell, E. R., Harrington, D. J. & Cech,
T. R. (2001). Structural basis of the enhanced stability
of a mutant ribozyme domain and a detailed view of
RNA–solvent interactions. Structure, 9, 221–231.
59. Ruff, M., Krishnaswamy, S., Boeglin, M., Poterszman,
A., Mitschler, A., Podjarny, A. et al. (1991). Class II
aminoacyl transfer RNA synthetases: crystal structure
of yeast aspartyl-tRNA synthetase complexed with
tRNA. Science, 252, 1682–1689.
60. Sussman, J. L., Holbrook, S. R., Warrant, R. W.,
Church, G. M. & Kim, S. H. (1978). Crystal structure
of yeast phenylalanine tRNA. I. Crystallographic
refinement. J. Mol. Biol. 123, 607–630.
61. Harms, J., Schluenzen, F., Zarivach, R., Bashan, A.,
Gat, S., Agmon, I. et al. (2001). High resolution
structure of the large ribosomal subunit from a
mesophilic eubacterium. Cell, 107, 679–688.
62. Adams, P. L., Stahley, M. R., Kosek, A. B., Wang, J. &
Strobel, S. (2004). Crystal structure of a self-splicing
group I intron with both exons. Nature, 430, 45–50.
63. Gutell, R. R., Cannone, J. J., Shang, Z., Du, Y. & Serra,
M. J. (2000). A story: unpaired adenosine bases in
ribosomal RNAs. J. Mol. Biol. 304, 335–354.
64. Leontis, N. B. & Westhof, E. (1998). A common motif
organizes the structure of multi-helix loops in 16 S
and 23 S ribosomal RNAs. J. Mol. Biol. 283, 571–583.
65. Wimberly, B. (1994). A common RNA loop motif as a
docking module and its function in the hammerhead
ribozyme. Nature Struct. Biol. 1, 820–827.
66. Sponer, J., Mokdad, A., Sponer, J. E., Špačková, N.,
Leszczynski, J. & Leontis, N. B. (2003). Recent unique
tertiary and neighbor interactions determine conservation
patterns of cis Watson–Crick A/G base-pairs.
J. Mol. Biol. 330, 967–978.
67. Saenger, W. (1984). Principles of Nucleic Acid Structure,
pp. 120–121, Springer-Verlag, New York.
Edited by D. E. Draper
(Received 7 June 2004; received in revised form 20 September 2004; accepted 24 September 2004)
All rights reserved. Keywords: base-pair conformation; isostericity; bifurcated hydrogen bonds; RNA motif*Corresponding author Introduction Recently, the high-resolution crystal structures of the bacterial Thermus thermophilus 30 S (PDB, 1FJF1 ) and archaeal Haloarcula marismortui 50 S (PDB, 1FFK2 and 1JJ23 ) ribosomal subunits were deter- mined; the former includes the 16 S rRNA and the latter the 23 S and 5 S rRNAs. An analysis of the base-pairs present in the rRNAs in the two crystal structures not only validated the authenticity of the covariation-based rRNA structure models,4 but also provides a wealth of RNA structural folds, confor- mations and motifs to identify and relate to nucleotide sequences and base-pairs. In addition to the canonical base-pairs with canonical confor- mations consisting of the standard Watson–Crick (C:G and U:A)5 and the wobble U:G6 base-pair types and conformations, these crystal structures contain many canonical and non-canonical base- pair types with non-canonical conformations. The non-canonical conformations are frequently involved in a variety of motifs, including GNRA, UUCG, and CUUG tetraloops,7–11 G$U base-pair 0022-2836/$ - see front matter q 2004 Elsevier Ltd. All rights reserved. Abbreviations used: WC, Watson–Crick; Wb, wobble; sWC, slipped Watson–Crick; sWb, slipped wobble; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS, reversed sheared; fS, flipped sheared; pfS, parallel flipped sheared; pS, parallel sheared; rpS, reversed parallel sheared; BHB, bifurcated hydrogen bond; PDB, Protein Data Bank; LPTL, lonepair triloop; dDA, distance between the hydrogen bond donor and acceptor; Ec, Escherichia coli. E-mail address of the corresponding author: robin.gutell@mail.utexas.edu doi:10.1016/j.jmb.2004.09.072 J. Mol. Biol. (2004) 344, 1225–1249
  • 2. motifs [12], A platforms [13], AA.AG at helix.ends motifs [14], tandem GA motifs [15,16], lonepair triloop motifs [17,18], sticky motifs consisting of AGUA/GAA, GUA/GAA, and GGA/GAA motifs (J.C.L. & R.R.G., unpublished results), U-turns [19,20], A-minor motifs [21,22], K-turns [3], and H-turns [23]. Many of the previously identified non-canonical base-pairs have been organized onto a web site† [24]. In addition, a nomenclature has been proposed to classify base-pair conformations by introducing the interacting edges [25], while a computational approach was developed to automatically identify these latter base-pair conformations from RNA crystal structures [26]. More recently, a new computational study attempted to theoretically model base-pair conformations with no nomenclature, based on isomorphic relationships of base interactions [27]. The proposed naming systems are not analogous to the traditional system (e.g. Watson–Crick, wobble, reversed Watson–Crick, reversed wobble, Hoogsteen, reversed Hoogsteen, and sheared conformations), which includes the original base-pair conformations [5,6,28,29]. Unfortunately, not all of the observed non-canonical conformations have been described unambiguously with the system. Thus, a simple and widely applicable nomenclature is needed to specifically describe all of the observed and reasonable base-pair conformations. By analyzing topological arrangements of the bases and glycosidic bonds for all base-pairs and by expanding the traditional classification, we propose a new, simple and systematic nomenclature to classify all of the observed base-pair conformations, regardless of the number of hydrogen bonds between the two bases.
In addition to the introduction to our new nomenclature, we analyze: (1) the distribution of structural parameters of representative base-pair conformations observed in the rRNAs; (2) the relationship between the sugar puckering patterns and base-pair conformations; (3) the protonated base-pairs and the bifurcated hydrogen bonding interactions in RNA structure; and (4) the distribution of base-pair conformations on the rRNA secondary structure models and the significance of non-canonical conformations in RNA structure.
Results
Topological relationships of base-pair conformations
The base-pair conformation refers to the spatial arrangement of the two bases in a given base-pair, X:Y, which are hydrogen-bonded to one another. In addition to the standard Watson–Crick C:G and U:A and wobble U:G base-pair types with canonical base-pair conformations, a visual identification and characterization of the base-pair conformations in the rRNAs in the high-resolution crystal structures of the T. thermophilus 30 S (PDB, 1FJF [1]) and H. marismortui 50 S (PDB, 1FFK [2] and 1JJ2 [3]) ribosomal subunits using the RasMol program [30,31] reveals many canonical and non-canonical base-pair types with non-canonical conformations (Table 1). While all 16 possible base-pairs are divided into ten base-pair groups (i.e. C:G, U:A, U:G, G:A, C:A, U:C, A:A, C:C, G:G, and U:U), their conformations are classified into 14 major conformational families (Table 1): Watson–Crick (WC), wobble (Wb), slipped Watson–Crick (sWC), slipped wobble (sWb), reversed Watson–Crick (rWC), reversed wobble (rWb), Hoogsteen (H), reversed Hoogsteen (rH), sheared (S), reversed sheared (rS), flipped sheared (fS), parallel flipped sheared (pfS), parallel sheared (pS), and reversed parallel sheared (rpS). Base-pair conformations can be systematically and unambiguously named based on the topological arrangements of the two bases and the two glycosidic bonds in a given base-pair, X:Y.
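The counting in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch, not part of the original analysis: the names BASES, FAMILIES, ordered_pairs and groups are assumptions introduced here. It shows that the 16 ordered X:Y combinations of the four RNA bases collapse into ten unordered base-pair groups, and that ten groups times 14 families gives 140 nominal group/family arrangements.

```python
from itertools import product

# The four RNA bases and the 14 family abbreviations listed in the text.
BASES = "ACGU"
FAMILIES = ["WC", "Wb", "sWC", "sWb", "rWC", "rWb", "H",
            "rH", "S", "rS", "fS", "pfS", "pS", "rpS"]

# All ordered X:Y combinations (16 of them).
ordered_pairs = list(product(BASES, repeat=2))

# Ignoring order merges X:Y with Y:X; a frozenset does exactly that
# (frozenset(("A", "A")) is simply {"A"}).
groups = {frozenset(pair) for pair in ordered_pairs}

print(len(ordered_pairs), len(groups), len(groups) * len(FAMILIES))
```

The frozenset representation is a design convenience for the sketch; the paper itself writes each group in a fixed orientation (for example G:A rather than A:G).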
As depicted in Figure 1, all of the non-canonical conformations are derived by simple topological manipulation of the starting Watson–Crick (WC) or wobble (Wb) conformation for each base-pair group. For example, sWC/sWb is generated by slipping (translating base Y either along the negative y-axis (sWC/sWb) or along the positive y-axis (sWC*/sWb*) to form only a single hydrogen bond); rWC/rWb, by reversing (rotating nucleotide Y 180° about the x-axis); H, by flipping (rotating either base Y (H) or base X (H*) 180° about its glycosidic bond); rH, by flipping and then reversing base Y (rH) or base X (rH*); S, by shearing (either translating base Y along the negative y-axis and then along the negative x-axis (S) or translating base X along the negative y-axis and then along the positive x-axis (S*)); rS, by shearing and then reversing base Y (rS) or base X (rS*); fS, by flipping and then shearing base Y (fS) or base X (fS*); pfS, by flipping, shearing, and then paralleling nucleotide Y (pfS) or X (pfS*) (rotating either nucleotide Y or X about the y-axis so that the glycosidic bonds run parallel in the same direction); pS, by paralleling and then shearing either Y (pS) or X (pS*); rpS, by paralleling, shearing, and then reversing either X or Y. The order of successive manipulations of the base Y (or X) does not matter. The topological relationships of the observed and theoretically possible conformations for the ten base-pair groups are shown in Figures 2–11: C:G, Figure 2; U:A, Figure 3; U:G, Figure 4; G:A, Figure 5; C:A, Figure 6; U:C, Figure 7; A:A, Figure 8; C:C, Figure 9; G:G, Figure 10; and U:U, Figure 11. Interestingly, some conformations within the same conformation family have more than one hydrogen bonding possibility with a similar topology: H, rH*, fS*, pfS* for U:G in Figure 4; H*, rH, rH*, pfS, pfS*, and pS for G:A in Figure 5; rH, S, pfS, and pS for G:G in Figure 10.
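Because the text states that the order of successive manipulations does not matter, each family name can be thought of as a label for an unordered set of manipulations. The sketch below illustrates that reading under stated assumptions: the operation labels ("slip", "reverse", "flip", "shear", "parallel"), the dictionary FAMILY_BY_OPERATIONS and the function family_name are hypothetical names introduced here, and the asterisk variants (manipulating X instead of Y) are deliberately not modeled.

```python
# Each family is keyed by the unordered set of manipulations that
# derives it from the starting WC/Wb conformation (see the text).
FAMILY_BY_OPERATIONS = {
    frozenset(): "WC/Wb",
    frozenset({"slip"}): "sWC/sWb",
    frozenset({"reverse"}): "rWC/rWb",
    frozenset({"flip"}): "H",
    frozenset({"flip", "reverse"}): "rH",
    frozenset({"shear"}): "S",
    frozenset({"shear", "reverse"}): "rS",
    frozenset({"flip", "shear"}): "fS",
    frozenset({"flip", "shear", "parallel"}): "pfS",
    frozenset({"parallel", "shear"}): "pS",
    frozenset({"parallel", "shear", "reverse"}): "rpS",
}


def family_name(operations):
    """Return the family label for a sequence of manipulations.

    The sequence is converted to a set, so the order in which the
    manipulations are listed has no effect on the result.
    Combinations not named in the text return "unnamed".
    """
    return FAMILY_BY_OPERATIONS.get(frozenset(operations), "unnamed")


print(family_name(["flip", "shear"]), family_name(["shear", "flip"]))
```

Keying the table on a frozenset is what encodes the order-independence; a tuple key would instead force one canonical ordering of the manipulations.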
† http://prion.bchs.uh.edu/bp_type/
1226 Diversity of Base-pair Conformations
In addition, either of the two bases can be reversed in the rS and rpS conformations for
  • 3. Table 1. Base-pair conformations present in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

Simple base-pairs (BP)
bpC    C:G     U:A     U:G     G:A     C:A     U:C     A:A     C:C     G:G     U:U     Total
WC     869     252     1       21      2       8       –       –       #       –       1153
Wb     1       1       124(3)  –       3(2)    –       6       9       3       16      168
sWC    –       4(2)    #       #       #       2       #       #       #       #       8
sWb    #       #       (1)     #       3(1)    #       #       3       –       1       9
rWC    4       11      –       –       –       1       #       –       #       –       16
rWb    –       –       2       –       3       –       5       –       –       5       15
H      1       8       1       3(2)    (1)     –       –       –       2       –       18
rH     –       58(1)   –       1       13(1)   (1)     9       1       3       –       88
S      –       7(2)    1(7)    143     8(3)    5       10      4       4       –       194
rS     3       4       2       7       2       1       19      #       2       –       40
fS     1(1)    (2)     6       5       1(1)    –       1       –       –       –       18
pfS    4       (2)     1(1)    1       –       –       –       –       1       –       10
pS     –       1       1(1)    1       2       –       1       –       –       –       7
rpS    –       –       –       1       #       –       #       #       –       –       1
Total  884     355     152     185     46      18      51      17      15      22      1745

Higher-order interactions (TQ)
bpC    C:G     U:A     U:G     G:A     C:A     U:C     A:A     C:C     G:G     U:U     Total
WC     3       4       –       1       –       2       –       –       #       –       10
Wb     –       –       –       –       1       –       1       –       1       1       4
sWC    (1)     –       #       #       #       –       #       #       #       #       1
sWb    #       #       –       #       –       #       #       –       –       –       0
rWC    –       1       –       –       –       –       #       –       #       –       1
rWb    –       –       1       –       –       –       9       –       –       –       10
H      –       4       2       –       1       –       –       2       6       1       16
rH     2(5)    5       (1)     3       8(3)    –       2       –       1       3       33
S      –       3       (1)     5       11      1       2       1       7       –       31
rS     2(1)    1       –       56      1       3       2       #       4       –       70
fS     (1)     1(1)    (6)     18      –       (1)     –       –       –       1       29
pfS    1(1)    1(2)    3(2)    9       3(5)    1(1)    –       4       1       –       34
pS     1(1)    4       1(16)   3       (2)     5       2       1       2       1       39
rpS    1(2)    –       4(2)    19      #       –       #       #       3       –       31
Total  22      27      39      114     35      14      18      8       25      7       309

Base-pair types are divided into ten base-pair groups, while base-pair conformations (bpC; X:Y Z) involved both in simple base-pairs (BP) and in higher-order interactions including base–base-pair and base-pair–base-pair interactions (TQ) are classified into 14 main families: WC, Watson–Crick; Wb, wobble; sWC, slipped Watson–Crick; sWb, slipped wobble; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS, reversed sheared; fS, flipped sheared; pfS, parallel flipped sheared; pS, parallel sheared; rpS, reversed parallel sheared. The numbers in parentheses are the counts of the alternative base-pair conformations marked with an asterisk (*), and the hash sign (#) indicates the 19 base-pair conformations that are not likely to form.
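The row and column totals for the simple base-pairs in Table 1 are enough to recompute the population shares of each base-pair group and each conformational family. The following Python sketch is offered only as an aid for checking such figures; the dictionary names and the helper shares are assumptions introduced here, and the counts are copied from the "Total" row and column of the BP part of Table 1.

```python
# Totals for simple base-pairs (BP), copied from Table 1.
BP_GROUP_TOTALS = {
    "C:G": 884, "U:A": 355, "U:G": 152, "G:A": 185, "C:A": 46,
    "U:C": 18, "A:A": 51, "C:C": 17, "G:G": 15, "U:U": 22,
}
BP_FAMILY_TOTALS = {
    "WC": 1153, "Wb": 168, "sWC": 8, "sWb": 9, "rWC": 16, "rWb": 15,
    "H": 18, "rH": 88, "S": 194, "rS": 40, "fS": 18, "pfS": 10,
    "pS": 7, "rpS": 1,
}


def shares(counts):
    """Return {label: percentage of the grand total}, largest first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: -item[1])
    return {label: 100.0 * n / total for label, n in ranked}


for label, pct in shares(BP_FAMILY_TOTALS).items():
    print(f"{label:4s} {pct:5.1f}%")
```

Both dictionaries sum to the same grand total of 1745 simple base-pairs, which is a quick internal consistency check on the transcribed table.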
  • 4. G:A in Figure 5 and in the rpS conformation for G:G in Figure 10.
Statistics of base-pair conformations observed in rRNA crystal structures
The ten base-pair groups, each with 14 theoretically possible conformations, give 140 possible arrangements; 19 of these will not form and are not considered, leaving a total of 121 nucleotide and conformational arrangements. Of these 121 conformations, 73 simple base-pairs (BP in Table 1) and 69 higher-order interactions associated with base–base-pair and base-pair–base-pair interactions (TQ in Table 1) occur in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures. Four significant trends are identified in Table 1. (1) Of the 1745 simple base-pairs, 884 (51%) are C:G, followed by U:A (355; 20%), G:A (185; 11%), U:G (152; 10%), and the remaining six base-pair groups (9%). (2) The base-pair conformation that occurs at the highest frequency is WC (1153; 66%), followed by S (194; 11%), Wb (168; 10%), rH (88; 5%), and the remaining ten base-pair conformations (8%). (3) Of the C:G base-pairs, 869 (98%) have the WC conformation; 252 (71%) and 59 (17%) of the U:A base-pairs form WC and rH conformations, respectively; 127 (84%) of the U:G base-pairs have the Wb conformation; 143 (77%) and 21 (11%) of the G:A base-pairs have S and WC conformations, respectively; more than half of the C:A base-pairs adopt the rH (14; 31%)
Figure 1. A schematic representation of the topological relationships between base-pair conformations for a given base-pair, X:Y. Bases are represented as triangles and glycosidic bonds as thick lines attached to triangles. For simplicity, the starting Watson–Crick and wobble conformations are represented as X:Y WC/Wb in a box with a shaded border in the center. Each conformation is obtained by simply manipulating Y or X: s, shearing; f, flipping; r, reversing; p, paralleling; sl, slipping (see the text for details).
The alternative conformations are shown with an asterisk (*) in the parentheses. The dotted arrow shows other conformations that are not simply derived by manipulating either Y or X. Figures 2–11 have the same presentation scheme, with the possible protonated hydrogen bonds marked with wavy lines.
  • 5. and S (11; 24%) conformations, while 19 (37%) and 10 (20%) of the A:A base-pairs assume the rS and S conformations, respectively. (4) Of the 1745 base-pair conformations, the most populated is C:G WC (869; 50%), followed by U:A WC (252; 14%), G:A S (143; 8%), U:G Wb (127; 7%), and U:A rH (59; 3%). While the C:G WC, U:A WC, and U:G Wb conformations predominantly occur within regular secondary RNA helices, the vast majority of the G:A S conformations occur immediately outside of regular secondary helices.
Figure 2. Base-pair conformations for the C:G base-pair group.
While the most common conformation for simple secondary structure base-pairs is WC, the higher-order interactions have a wide variety of non-canonical conformations (Table 1). The most commonly observed conformations in higher-order interactions are rS (70; 23%), pS (39; 13%), pfS (34; 11%), rH (33; 11%), S (31; 10%), rpS (31;
  • 6. 10%), and fS (29; 9%), while the most common base-pair groups in higher-order interactions are G:A (114; 37%), U:G (39; 13%), C:A (35; 11%), and U:A (27; 9%). Jointly, the most frequent arrangements in higher-order interactions are the G:A rS (56), G:A rpS (19), and G:A fS (18) (Table 1).
Figure 3. Base-pair conformations for the U:A base-pair group.
Geometries of base-pair conformations
Hydrogen bonds are weak and largely electrostatic in nature, arising from the partially positive hydrogen atom of the donor and the partially negative acceptor atom. Consequently, the two bases of a base-pair conformation are usually not perfectly coplanar even in the internal regions of the regular secondary structure helices. Instead, they are frequently propeller twisted, sometimes buckled, staggered, stretched, or sometimes open toward either the major or the minor groove side [32], while maintaining their topological arrangement, suggesting that the base-pair conformations are in constant motion. The average structural parameters
  • 7. for the representative base-pair conformations observed in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures are illustrated in Figure 12: dCC, the C1′–C1′ distance; dDA, the donor–acceptor distance associated with a hydrogen bond; ∠X and ∠Y, the N1–C1′–C1′ and N9–C1′–C1′ angles. While the canonical conformations associated with the most commonly occurring base-pairs in the regular secondary helical regions (C:G WC, U:A WC, and U:G Wb) have their dCCs in the range of 10.4 Å to 10.6 Å, their dDAs gradually increase from the minor to the major groove (Figure 12(a)). This structural consistency of the A-form RNA will be maintained unless a regular helix accommodates a base-pair with a dramatically shifted dCC. For example, the G:A base-pair with the G 5′ and the A 3′ to a regular secondary helix forms the G:A WC conformation with a much longer dCC of 12.6 Å [14], whose dDAs increase from the minor to the major groove.
Figure 4. Base-pair conformations for the U:G base-pair group.
Nonetheless, the 16 U:U Wb conformations manage to be embedded within secondary helices
  • 8. (data not shown), despite the much shorter average dCC of 8.7 Å (Figure 12(a)). Moreover, the U:C WC and C:C Wb conformations also form within regular helices (data not shown) and their dDAs significantly increase from the minor to the major groove; the former has an opening toward the minor groove, while the latter has the O2–N3 separation (4.0 Å) beyond the putatively protonated hydrogen bonding distance. Furthermore, the U:A sWC and C:A sWb conformations have dCCs that are elongated relative to those of their WC and Wb counterparts. While the shearing and flipping arrangements of the two bases in a base-pair result in the reduction of dCC, the shearing and flipping arrangements followed by the reversing frequently do the opposite. For example, while the G:A S and A:A S conformations have shorter dCCs, 9.5 Å and 9.8 Å, respectively, the C:A rH and A:A rS conformations have the elongated dCCs, 10.9 Å and 11.1 Å, respectively (Figure 12(b)).
Figure 5. Base-pair conformations for the G:A base-pair group.
When an A is involved in a base-pairing interaction with two hydrogen bonds, the dDA associated with the A at the –NH2
  • 9. group is longer than the one associated with the A at N7 (Figure 12(b)), because of the moderate-strength, non-linear hydrogen bonding interaction (data not shown). While ∠X is usually larger than ∠Y in Figure 12, ∠X is less than ∠Y in the alternative conformations with an asterisk (e.g. the Wb*, sWC*, sWb*, and S* conformations). Interestingly, regardless of the base-pair group, the difference between the two angles, |∠X − ∠Y|, is less than 5° for the WC conformations, 20–40° for the Wb conformations, approximately 45–60° for the sWC and sWb conformations, and 75–90° for the S conformations. In this regard, the |∠X − ∠Y| value can be used to determine the vast majority of the base-pair conformations in the rRNAs, in that approximately 88% of the 1745 simple base-pairs have the WC, Wb, sWC, sWb, and S conformations (Table 1).
Figure 6. Base-pair conformations for the C:A base-pair group.
Almost all of the base-pairs within regular helices (called internal base-pairs) have the canonical
  • 10. conformations, while the base-pairs at helix ends (called terminal base-pairs) sometimes have the non-canonical conformations (Table 2). However, the vast majority of base-pairs with the non-canonical conformations either occur in the unpaired regions in the covariation-based rRNA secondary structure models or are associated with higher-order interactions. The distribution of base-pair conformations is discussed in detail below.
Figure 7. Base-pair conformations for the U:C base-pair group.
Base-pair conformations and their sugar puckering patterns
While 1561 (89%) of the 1745 simple base-pairs in the rRNAs have the C3′-endo sugar puckering in both nucleotides that are base-paired, the remaining 184 (11%) have the C2′-endo or O4′-endo sugar puckering in at least one of the two base-paired nucleotides (Table 3). Of the 184 base-pairs with the sugar puckering other than the C3′-endo puckering,
  • 11. 26 have the C2′-endo puckering at both nucleotides and three have the unusual O4′-endo puckering. The authenticity of the latter O4′-endo puckering was questioned in a recent publication [33]. However, the 184 base-pairs with the “perturbed” sugar puckerings are not restricted to any specific base-pair group or conformation; they include 23 C:G WC, 12 U:A WC, 3 U:G Wb, 21 U:A rH, 20 G:A S, 15 A:A rS, and 90 other non-canonical conformations (data not shown). All of the base-pairs in the internal positions of the helices in the 16 S and the 23 S rRNA comparative structure models have the C3′-endo puckering at both nucleotides that are base-paired, except for three C:G WC base-pairs at positions 1555:1566 (Ec: 1448:1463), 1827:2021 (Ec: 1771:1980), and 1853:1878 (Ec: 1797:1822) in the 23 S rRNA. All of the remaining 181 base-pairs containing the “perturbed” sugar puckerings occur at the ends of helices, in lonepairs, in base-pairs associated with motifs (e.g. tetraloops and E-loops), and in tertiary interactions (data not shown). Nonetheless, no correlations between base-pair conformations and sugar puckering are observed.
Figure 8. Base-pair conformations for the A:A base-pair group.
Protonated base-pair conformations
The C:A Wb and C:C Wb conformations can have two hydrogen bonds, one of which results from protonation of A at N1 and of C at N3, respectively. The protonated C:A Wb and C:C Wb conformations with two hydrogen bonds have been reported [34,35]. A recent spectroscopic study of the Escherichia coli tRNA(Ala) acceptor stem showed that N1 of the C:A Wb conformation is protonated at pH 5.0–5.5 and unprotonated at pH 7.0–7.5 [36]. A ¹H NMR study indicated that, upon forming DNA triplexes, the
  • 12. C:C Wb conformation is protonated up to pH 7.0 but completely unprotonated at pH 7.6 [37]. In the rRNAs in the T. thermophilus 30 S (PDB, 1FJF; pH 6.5) [1] and the H. marismortui 50 S (PDB, 1JJ2; pH 5.8) [3] structures, several C:A base-pairs including C1384:A1477 (Ec: C1402:A1500) in the 16 S and C963:A1005 (Ec: U868:A909) in the 23 S rRNA have conformational arrangements identical with that of the protonated C:A base-pair previously reported. Interestingly, however, the C:A Wb conformation with the topology of the protonated C:A Wb conformation also forms at a pH value higher than the reported protonation pH limit. For example, the 16 S rRNA base-pair C1384:A1477 (Ec: C1402:A1500) forms very similar conformations in the native 30 S (PDB, 1FJF; pH 6.5) [1] and the substrate-bound 30 S (PDB, 1I94; pH 7.8) [38] structures; the distance from C=O of C to N1 of A, d(O2–N1), is 2.41 Å in the native 16 S and 2.24 Å in the ligand-bound 16 S rRNA. In addition, the 16 S rRNA base-pair C240:C278 (Ec: U245:U283) forms the C:C Wb conformation at pH 7.8, which is topologically identical with the protonated C:C base-pair; the distance from C=O of one C to N3 of the other, d(O2–N3), is 2.47 Å and 2.65 Å in the native and the substrate-bound 30 S structures, respectively. These two “protonated-like” C:A Wb and C:C Wb conformations at the high pH of 7.8 could result from a localized pH change in the vicinity of these base-pairs.
Figure 9. Base-pair conformations for the C:C base-pair group.
  • 13. In contrast, all of the eight C:C Wb conformations in the 23 S rRNA at pH 5.8 are flanked by two internal base-pairs but have much longer d(O2–N3) values, leading to an opening toward the minor groove (Figure 12). In addition, the vast majority of the water molecules interacting with base-pairs in the H. marismortui 50 S structure are located in the major groove, not in the minor groove, preventing the protonation of C at N3 and A at N1 from the minor groove (data not shown). Thus, the C:A Wb and C:C Wb conformations may or may not be protonated, depending on a localized pH change in their proximity.
Figure 10. Base-pair conformations for the G:G base-pair group.
The two protonated-like base-pairs were previously predicted with comparative sequence analysis. First, while the base-pair at 1384:1477 (Ec: 1402:1500) is a C:A in more than 10,000 16 S and 16 S-like rRNA sequences, it is a U:G in a few rRNA sequences in mitochondria from eukaryotes that map to different branches of the phylogenetic tree [39,40], rationalizing that the covarying C:A and U:G base-pairs have similar conformations
  • 14. (Figures 4 and 6). Second, the non-canonical base-pair at positions C240:C278 (Ec: U245:U283) in 16 S rRNA was proposed based on the covariation between U:U and C:C [41], implying that the covarying U:U and C:C base-pairs have similar conformations (Figures 9 and 11).
Figure 11. Base-pair conformations for the U:U base-pair group.
Base-pair conformations involving bifurcated hydrogen bonds
The four theoretically possible instances of bifurcated hydrogen bonding interactions are: (1) when one hydrogen atom simultaneously interacts with two acceptor atoms (type I); (2) when one acceptor atom simultaneously makes contact with two hydrogen atoms (type II); (3) when two hydrogen atoms from the donor make contacts with two different acceptor atoms (type III); (4) when one hydrogen atom interacts with one acceptor atom while the donor interacts with another hydrogen atom (type IV) (Figure 13). The type II bifurcated hydrogen bonding interactions systematically and commonly occur in protein β-sheets [42]. Some base-pair conformations
  • 15. Figure 12. Geometries of (a) WC, Wb, sWC, and sWb and (b) S, rH, H, and rS. The average values for dDAs (donor–acceptor distances for hydrogen bonds), dCC (C1′–C1′ distance), and ∠X and ∠Y (N1–C1′–C1′ or N9–C1′–C1′ angles) are obtained from the N examples of each base-pair conformation. The standard deviations for these structural parameters are intentionally not provided.

Table 2. Distribution of canonical and non-canonical base-pair conformations on the covariation-based structure models of rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

Location      Association          Canonical      Non-canonical   Total
Paired (a)                         1241 (96.4)    46 (3.6)        1287
              Internal region      717 (98.9)     8 (1.1)         725
              Helix ends (b)       384 (93.7)     26 (6.3)        410
              Dipair helix (c)     110 (97.4)     3 (2.6)         113
              Lonepair helix (c)   30 (76.9)      9 (23.1)        39
Unpaired (a)                       80 (17.5)      378 (82.5)      458
              Motifs (d)           13 (7.0)       172 (93.0)      185
              Unknown              67 (24.5)      206 (75.5)      273
Total                              1321 (75.7)    424 (24.3)      1745

While canonical conformations (percentage in parentheses) are defined as WC and Wb conformations of any base-pair, the remaining 12 conformations are considered to be non-canonical (percentage in parentheses).
(a) Paired and unpaired in the covariation-based structure models.
(b) Helix ends are defined here as the terminal base-pairs occurring at the ends of a regular secondary helix.
(c) Covariation-based helices with one or two base-pairs.
(d) Identified motifs are GNRA and UNCG tetraloops, AA.AG at helix.ends, E loops, tandem G:A base-pairs, lonepair triloops, K-turns, H-turns, and sticky motifs (see the text).
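The text reports that the difference between the two angles defined in the Figure 12 caption, |∠X − ∠Y|, falls in a characteristic range for each of the most populated families. A minimal Python sketch of that rule of thumb follows. It is not the authors' procedure: the function name, the inclusive range boundaries, and the "unassigned" label for values that fall between the reported ranges are all assumptions made here, and families other than WC, Wb, sWC/sWb and S are not covered because the text gives no ranges for them.

```python
def family_from_angle_difference(angle_x, angle_y):
    """Suggest a family from |angle_X - angle_Y| (both in degrees).

    Ranges follow the values quoted in the text: below 5 for WC,
    20-40 for Wb, roughly 45-60 for sWC/sWb, and 75-90 for S.
    Anything outside these ranges is left unassigned.
    """
    diff = abs(angle_x - angle_y)
    if diff < 5:
        return "WC"
    if 20 <= diff <= 40:
        return "Wb"
    if 45 <= diff <= 60:
        return "sWC/sWb"
    if 75 <= diff <= 90:
        return "S"
    return "unassigned"


# Hypothetical angle values, used only to exercise each branch.
print(family_from_angle_difference(55.0, 53.0))
print(family_from_angle_difference(100.0, 20.0))
```

Since the angle difference alone cannot separate sWC from sWb, or distinguish the remaining nine families, a result from this sketch would still need to be confirmed against the hydrogen bonding pattern.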
  • 16. with bifurcated hydrogen bonds (BHBs) in RNA structure have also been reported [24,25]. Although not explicitly shown in Figures 2–11, the type I and II BHBs are possible in some conformations associated with simple base-pairs: the former includes C:A sWb (Figure 6), U:C sWC (Figure 7), and C:C sWb and H (Figure 9); the latter includes C:G fS and pfS (Figure 2), U:G fS and pfS (Figure 4), G:A rH, fS* and pfS* (Figure 5), C:A pfS (Figure 6), and G:G Wb and pfS (Figure 10). Our conformational analysis revealed no type I BHBs but identified the type II BHBs in a few of the simple base-pairs in the rRNAs in the T. thermophilus 30 S and the H. marismortui 50 S structures. For example, the type II BHBs are observed in two of the three G:G Wb conformations shown in Table 1, which are distorted probably due to the steric clash between NH2 of one G and NH of the other, followed by forming two simultaneous interactions of NH and NH2 of one G with C=O of the other (data not shown). A very similar G:G Wb conformation is observed with G76:G100 in the crystal structure of the E. coli 5 S rRNA fragment in complex with L25 (PDB, 1DFU [43]). In fact, the G:G Wb conformations with bifurcated hydrogen bonds have the topological arrangement with the glycosidic bonds of their two G bases almost reversed (data not shown); they are possibly an intermediate step for the transition between G:G Wb and G:G rH (Figure 10). The fS and pfS conformations of the C:G and U:G base-pairs can have either a single hydrogen bond or the type II BHBs, while maintaining the identical topological arrangement (Figure 13(a)).
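The four bifurcation types defined above can be read as patterns in a list of donor–hydrogen–acceptor contacts. The Python sketch below is one possible reading of those definitions, offered with caution: the function name bifurcation_types, the (donor, hydrogen, acceptor) tuple representation, and the atom labels in the example are assumptions introduced here, and the type IV test (an atom that donates in one contact and accepts in another) is an interpretation of the text rather than a definition taken from it.

```python
from collections import defaultdict


def bifurcation_types(contacts):
    """Return the set of BHB types present in a list of contacts.

    contacts: iterable of (donor, hydrogen, acceptor) atom labels.
    """
    acceptors_by_hydrogen = defaultdict(set)
    hydrogens_by_acceptor = defaultdict(set)
    contacts_by_donor = defaultdict(set)
    donors, acceptors = set(), set()

    for donor, hydrogen, acceptor in contacts:
        acceptors_by_hydrogen[hydrogen].add(acceptor)
        hydrogens_by_acceptor[acceptor].add(hydrogen)
        contacts_by_donor[donor].add((hydrogen, acceptor))
        donors.add(donor)
        acceptors.add(acceptor)

    found = set()
    # Type I: one hydrogen atom contacts two acceptor atoms.
    if any(len(a) > 1 for a in acceptors_by_hydrogen.values()):
        found.add("I")
    # Type II: one acceptor atom contacts two hydrogen atoms.
    if any(len(h) > 1 for h in hydrogens_by_acceptor.values()):
        found.add("II")
    # Type III: two hydrogens of one donor contact different acceptors.
    for pairs in contacts_by_donor.values():
        if len({h for h, _ in pairs}) > 1 and len({a for _, a in pairs}) > 1:
            found.add("III")
    # Type IV (interpretation): a donor atom also accepts a hydrogen.
    if donors & acceptors:
        found.add("IV")
    return found


# Hypothetical labels: NH and NH2 of one G both contacting C=O of the other.
print(bifurcation_types([("N1", "H1", "O6"), ("N2", "H21", "O6")]))
```

The example mirrors the G:G Wb case described in the text, where NH and NH2 of one G interact simultaneously with the carbonyl oxygen of the other, which this reading classifies as type II.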
Specifically, the U:G fS conformation features the base-pair conformation formed between the first and the last nucleotides in the five UNCG tetraloops in the 16 S and 23 S rRNAs; these include U338:G341 (Ec: U343:G346), U415:G418 (Ec: U420:G423), U1117:G1120 (Ec: U1135:G1138), and U1430:G1433 (Ec: U1450:G1453) in the 16 S and U1770:G1773 (Ec: U1692:G1695) in the 23 S rRNA. In fact, the UNCG tetraloops involve an additional hydrogen bond between 2′-OH of U and C=O of G, stabilizing their formation in RNA structure (Figure 13(a)). The same U:G fS conformation containing the type II BHBs is observed with U9:G12 in the UUCG tetraloop crystal structure (PDB, 1F7Y [44]). In contrast to U:G fS in the crystal structures, the solution structure for the UUCG tetraloop revealed the canonical U:G Wb conformation [8].

Figure 13. Bifurcated hydrogen bonds (BHBs) observed in the rRNAs: (a) U:G fS with and without type II BHBs; (b) an A-mediated higher-order interaction with type II, III, and IV BHBs. The dDA values are explicitly illustrated using broken lines and atoms are assigned different colors: C, black; N, cyan; O, oxygen; P, orange.

Table 3. Sugar puckering patterns for base-pairs observed in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

RNA     [3:3]   [3:2]   [2:3]   [2:2]   [3:o]   [o:2]   Total
16 S    565     17      20      4       1       1       608
23 S    953     64      50      22      1       0       1090
5 S     43      1       3       0       0       0       47
Total   1561    82      73      26      2       1       1745

The sugar puckering pattern for a given base-pair, X:Y, is represented as [m:n], where m and n are either 3, 2, or o: 3, C3′-endo puckering; 2, C2′-endo puckering; o, O4′-endo puckering.

The type II BHBs also commonly occur in higher-order interactions involving base-triples and base-quadruples. Together with the type II BHBs, the type III and IV BHBs frequently occur in higher-order interactions including the A-minor interactions [21,22]. For example, the two base-pairs in the 23 S rRNA, C2833:G2847 (Ec: G2816:C2830) and G2851:A2906 (Ec: G2834:A2883), interact with each other to simultaneously form the type II, III, and IV BHBs, which are formed with C2833 (Ec: G2816) at C=O, G2847 (Ec: C2830) at NH2, and A2906 (Ec: A2883) at 2′-OH, respectively (Figure 13(b)). In this respect, the type II, III, and IV bifurcated hydrogen bonding interactions play a significant role in
  • 17. stabilizing folded RNA structure, for example, by increasing the number of hydrogen bonds in long-range tertiary interactions.
Isostericity of base-pair conformations
Two or more base-pair types with a topologically identical arrangement of the two base-pairing nucleotides are structurally equivalent or isosteric [45]. The two best-known isosteric base-pairs, C:G WC and U:A WC, are also isosteric with G:A WC and U:C WC as well as with U:G WC (Figure 4) and C:A WC (Figure 6); the latter two conformations are theoretically possible with keto-enol and amino-imino tautomerism, respectively. For example, the A288:C364 (Ec: A282:U358) base-pair in the 23 S rRNA has a topological arrangement very similar to that of the U:A WC conformation (Figure 14(a)). In theory, all base-pair types other than G:G can form their corresponding WC conformations, while all the base-pair types can form the Wb conformation (Figures 2–11). In contrast, the U:G Wb conformation is not isosteric to the standard C:G WC and U:A WC conformations and occurs less frequently than the C:G WC and U:A WC conformations (Table 1). Besides the 124 U:G Wb conformations, our analysis identified three U:G Wb* and one C:G Wb conformations in the rRNAs. The C:G Wb conformation was previously observed with C84:G92 in the 1.6 Å resolution crystal structure for domain E in the Thermus flavus 5 S rRNA (PDB, 439D) [46]. Interestingly, while C:G Wb is isosteric to U:G Wb, U:G Wb* should be flipped horizontally to be isosteric to U:G Wb (Figure 14(b)). G647:G724 (Ec: G664:G741) in the 16 S rRNA also adopts a conformational arrangement similar to that of the U:G Wb conformation (Figure 14(b)). When two consecutive nucleotides on a single RNA strand are base-paired, they form the characteristic, non-canonical pS(*) conformation.
The first set of examples was observed in the adenosine platform motif in the Tetrahymena group I intron [47]. Several more examples of this type of base-pairing occur in the rRNAs at positions G175:U176 (Ec: A181:A182) and U624:A625 (Ec: U641:A642) in the 16 S, C1105:A1106 (Ec: A1008:A1009), G1119:U1120 (Ec: G1022:U1023), and G1235:A1236 (Ec: G1131:U1132) in the 23 S, and A51:A52 in the 5 S rRNAs. In all of these examples, the base of the leading 5′ nucleotide always moves into the major groove and that of the 3′ nucleotide into the minor groove.
Figure 14. Isostericity between base-pair conformations: (a) U:A WC and C:A WC; (b) U:G Wb, C:G Wb, U:G Wb*, and G:G Wb. The dDA values are explicitly illustrated using broken lines and atoms are assigned different colors: C, black; N, cyan; O, oxygen; P, orange.
A similar but not identical example involves the two consecutive bases A1193 and
  • 18. A1194 (Ec: A1089 and A1090) in the L11-binding region of the 23 S rRNA, which exchange with G and U, respectively. These two positions form a base-pair with the pS conformation [19,48,49], while position A1194 (Ec: A1090) forms a regular base-pair with position U1205 (Ec: U1101), forming a base-triple. Moreover, the consecutive GU bases in the AGUA/GAA motif also form the parallel sheared conformation, U:G pS* (Figure 4) [50]. In addition, the G:G H conformation (Figure 10) observed at positions G294:G549 (Ec: G299:G566) in the 16 S and G604:G607 (Ec: not homologous) in the 23 S rRNAs is isosteric to all other base-pairs in the H conformation (Figures 2–11). The G:G H conformation was originally observed in the NMR and crystal structures of the G-quadruplex DNAs formed by telomeric DNA sequences (PDB, 139D [51] and 1JPQ [52]). Furthermore, the frequent exchange between G:A and A:A (or sometimes C:A) at the ends of helices [14] can simply be explained by the topological isostericity of the G:A S, A:A S, and C:A S conformations (Figures 5, 6, and 8). The U:A rH and C:A rH conformations that are frequently observed in the rRNAs (Table 1) are also topologically similar to all other base-pairs in the rH conformation (Figures 2–11). These isosteric base-pairs covary or exchange with one another at similar positions in homologous RNA molecules from phylogenetically different organisms, without affecting the overall three-dimensional RNA structures. Therefore, the conformational isostericity of base-pairs can be applied to rationalize base-pair exchanges in an alignment of homologous RNA sequences.
Unusual base-pair conformations observed in other non-ribosomal crystal structures
Along with the 73 base-pair types and conformational arrangements in the rRNAs (Table 1), six additional conformational arrangements are identified in some non-rRNA crystal structures. (1) C:C rWC (Figure 9) is observed at positions C16:C59 in the E. coli Cys-tRNA crystal structure (PDB, 1B23 [53]).
This same conformation was observed in the telomeric C-rich sequences forming an unusually intercalated DNA structure known as the i-motif (PDB, 105D54), which is known to be stabilized by TMPyP4, a DNA-binding cationic porphyrin causing chromosomal destabilization.55 (2) U:G H* (Figure 4) is formed at positions G80:U96 in domain E of the T. flavus 5 S rRNA crystal structure (PDB, 361D56). (3) U:G rH* (Figure 4) is formed at positions U168:G188 in the crystal structures of the P4–P6 domain of the Tetrahymena group I intron (PDB, 1GID57 and 1HR258). (4) G:G sWb (Figure 10) is formed at positions G28:G40 in the UUCG tetraloop crystal structure (PDB, 1F7Y43). (5) G:A rH* (left of the two G:A rH* structures in Figure 5) is formed at positions G22:A46 in the (C13:G22)A46 base-triple in the crystal structure of the Saccharomyces cerevisiae Asp-tRNA complexed with Asp-tRNA synthetase (PDB, 1ASY59). This conformation can be protonated to have two hydrogen bonds and is isosteric to G:G rH at positions G22:G46 of the (C13:G22)G46 base-triple in the S. cerevisiae Phe-tRNA crystal structure (PDB, 6TNA60). (6) U:U rWC (Figure 11) is observed at positions U1301:U1339 (Ec: G1288:U1326) in the 23 S rRNA in the Deinococcus radiodurans 50 S crystal structure (PDB, 1LNR; pH 7.8),61 which is equivalent to U:C rWC (Figure 7) for positions C1394:U1432 (Ec: G1288:U1326) in the H. marismortui 23 S rRNA. Thus, as the number and diversity of RNA crystal and NMR structures increase, we expect to find more of the theoretically possible arrangements of base-pair types and conformations shown in Table 1 that have not yet been observed. Ultimately, this information will help us understand the biological significance of the rare non-canonical base-pair conformations.
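The use of isostericity to rationalize base-pair exchanges in an alignment, described earlier in this section, can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the procedure used in this work: the function names and the toy alignment are hypothetical, and the isosteric sets are limited to the examples stated explicitly in the text (G:A S, A:A S, and C:A S; the pairs shown in Figure 14). A real assignment of conformations requires three-dimensional structure, not sequence alone.

```python
from collections import Counter

# The ten order-independent base-pair groups named in the text.
GROUPS = ("C:G", "U:A", "U:G", "G:A", "C:A", "U:C", "A:A", "C:C", "G:G", "U:U")
_GROUP_OF = {frozenset(g.split(":")): g for g in GROUPS}

# Illustrative isosteric sets, limited to examples the text states explicitly.
ISOSTERIC_SETS = {
    "S": {"G:A", "A:A", "C:A"},    # G:A S, A:A S, C:A S
    "WC": {"U:A", "C:A"},          # Figure 14(a)
    "Wb": {"U:G", "C:G", "G:G"},   # Figure 14(b)
}


def base_pair_group(b1, b2):
    """Return the group name for two bases, ignoring their order."""
    return _GROUP_OF[frozenset((b1.upper(), b2.upper()))]


def column_pair_counts(alignment, i, j):
    """Tally base-pair groups formed by columns i and j of aligned sequences."""
    counts = Counter()
    for seq in alignment:
        a, b = seq[i].upper(), seq[j].upper()
        if a in "ACGU" and b in "ACGU":  # skip gaps and ambiguity codes
            counts[base_pair_group(a, b)] += 1
    return counts


def consistent_conformations(counts):
    """List conformations whose isosteric set contains every observed group."""
    observed = set(counts)
    if not observed:
        return []
    return sorted(c for c, groups in ISOSTERIC_SETS.items() if observed <= groups)


# Hypothetical toy alignment: columns 1 and 2 hold the paired positions.
toy = ["CGAG", "CAAG", "CCAG"]
print(consistent_conformations(column_pair_counts(toy, 1, 2)))
```

A column pair whose exchanges all fall within one set is compatible with that conformation being conserved; an empty result only means that none of the few sets encoded here explains the observed exchanges.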
Discussion

Distribution of base-pair conformations on rRNA secondary structures

The statistics for the distribution of base-pair conformations for the 1745 simple base-pairs observed in the rRNAs in the T. thermophilus 30 S and the H. marismortui 50 S crystal structures are summarized in Table 2. Overall, 96% of the base-pairs associated with secondary structure helices have the canonical WC and Wb conformations, with the highest percentage of the canonical conformations in internal regions, followed by two base-pair helices, helix ends, and lonepair helices. As expected, the majority (76%; 35 out of 46) of the non-canonical conformations occurring in helical regions are associated with the helix ends and lonepair helices (Table 2).

In contrast to the helical base-pairs with the canonical conformations, the vast majority (83%) of the base-pairs associated with the unpaired regions on the secondary structure models (available at http://www.rna.icmb.utexas.edu/) have non-canonical conformations. In particular, 93% of the base-pairs associated with the previously known structure motifs, such as GNRA and UNCG tetraloops,7–10 A platforms,13,62 AA.AG at helix.ends motifs,14 E-loops,47,63–65 tandem GAs,15,16 lonepair triloops,17,18 K-turns,3 H-turns,23 and sticky motifs (J.C.L. and R.R.G., unpublished results), have non-canonical conformations (Table 2). As shown in Table 2, a total of 273 base-pairs not associated with the previously reported motifs are observed in the unpaired regions of the rRNA secondary structure models in the T. thermophilus 30 S and the H. marismortui 50 S crystal structures. These base-pairs either extend secondary helices, probably by stabilizing the ends of regular secondary helices, or are involved in the organization and folding of RNA structure by mediating long-range
tertiary interactions. While 67 (24%) of these base-pairs have the canonical conformations, the majority (76%) have non-canonical base-pair conformations, providing further opportunities for identifying additional new RNA motifs.

Implications of non-canonical base-pair conformations in RNA structure

Bases on a single RNA chain are vertically projected from the backbone to minimize steric clashes between bases and sugar rings and, simultaneously, consecutive sugar rings in the backbone are helically twisted to minimize their steric collisions, intrinsically leading to the helical stacking of bases in the RNA chain. Thus, while maintaining the structural integrity in each RNA chain, the base-pairs within regular secondary helices are structurally constrained to adopt the canonical WC and Wb conformations (Table 2). In contrast to the internal base-pairs structurally locked in the WC and Wb conformations, the base-pairs outside or at the termini of regular helices are subject to conformational change, leading to diversity of base-pair conformations. Nonetheless, many non-helical base-pairs are locked in RNA structure motifs and adopt a consistently defined set of non-canonical base-pair conformations, suggesting that base-pair conformations are context-dependent. Thus, the more constrained the base-pairs, the less diverse their conformations.

The base-pair conformations adopted by known RNA motifs in rRNAs are shown in Table 4. Four RNA motifs (the GNRA and UNCG tetraloops, A platforms, E-loops, and K-turns) involve a specific set of base-pair types and consistently form a unique conformation for each base-pair (Table 4). For example, the GNRA tetraloops in the rRNAs form the G:A S conformation and are usually involved in long-range tertiary interactions (data not shown).
The A platforms occurring in internal loops associated with the 5′-CUAAG/UAUG-3′ sequence serve as a receptor for the GAAA tetraloop and consistently form four base-pairs, C:G WC, U:A rH, A:A pS, and G:U Wb. The E-loop motifs50,63–65 occur in internal or multi-stem loops and form a defined set of base-pair conformations, A:A rS, U:A rH and G:U pS*, and A:G S (or A:A S). Interestingly, the E-loop motif with the AGUA/GAC sequence has the A:C rS conformation for its leading base-pair, which has no hydrogen bonds between the two bases but is a topological isostere of the A:A rS conformation for the E-loop motif with the AGUA/GAA sequence. However, an additional non-canonical base-pair immediately outside of the E-loop motif forms different conformations, depending on its sequence context: the S conformation with the AGUA/GAA sequence and the sWb conformation with the AGUA/GAC sequence (data not shown).

Table 4. Base-pairs and their conformations formed in RNA motifs

Motif | Base-pairs | Conformations | Comments
Tetraloops7–10 | G:A | S | 5′-GNRA-3′: hairpin loops
 | U:G | fS | 5′-UNCG-3′: hairpin loops
A platforms13,62 | C:G | WC | 5′-CUAAG/UAUG-3′: internal loops
 | U:A | rH |
 | A:A | pS |
 | U:A | Wb |
E-loops51,63–65 | A:A (A:C) | rS | 5′-AGUA/GAA-3′: internal loops
 | G(U:A)(a) | U:A rH and G:U pS* |
 | A:G (A:A) | S |
K-turns3 | C:G | WC | 5′-AG/CNNNG-3′: internal loops
 | G:A | S |
Lonepair triloops17 | U:A | rH, WC | R1 LPTL (5′-UGNRA-3′): hairpin loops
 | C:A | rH, rWb | R2 LPTL (5′-UUYRA-3′): hairpin loops
 | C:G | WC, rWC |
 | G:A | S, H |
H-turns23 | G:A | S | 5′-GA/UA-3′: multi-stem loops
 | U:A | rH |
AA.AG at helix.ends14 | G:A (C:A) | S, WC, H*, rH, Sv(b) | G:A with G 3′ and A 5′ to helix
 | A:A | S, Wb |
 | G:A | WC | G:A with G 5′ and A 3′ to helix
Tandem GAs15–16 | G:A | S | 5′-GA/GA-3′: internal and multi-stem loops
 | A:A | S, rS | 5′-GA/AA-3′: internal and multi-stem loops
 | U:A | rH | 5′-GA/UA-3′: internal loops
A-mediated interactions(c) | A:G | rS, rpS, fS, pfS | Unpaired As in long-range tertiary interactions
 | C:Y | fS, pfS |

N = {A, C, G, U}, R = {A, G}, and Y = {C, U}.
(a) The G(U:A) base-triple with the U:A rS and G:U pS* conformations is sandwiched between A:A rS and A:G S.
(b) Sv represents an S-like conformation with two bases vertically arranged. It may be an intermediate between G:A S and G:A rS.
(c) The A-mediated interactions include the A-minor motifs.21,22

The K-turns3 occur in asymmetric internal loops
and form two discrete base-pairs, C:G WC and G:A S, with usually three to four intervening unpaired nucleotides leading to a sharp turn in the backbone.

In contrast, the remaining five RNA motifs form base-pairs, each of which is capable of having several different conformations, depending on the structure context. For example, the R1 and R2 LPTLs occur in hairpin loops with the UGNRA and UUYRA sequences, respectively, and allow only the U:A rH (or sometimes C:A rH) conformation for their lonepair, due to their constrained sequence and structure. These two groups of LPTLs are involved in long-range tertiary interactions by recruiting an unpaired A between the fourth base in the triloop and the 3′ base of the lonepair; the three-dimensional structure of the resulting hairpin loop mimics that of the GNRA tetraloops.17 However, lonepairs in the other LPTLs containing variable loop sequences are not restricted to any specific conformation; they depend on the base-pair type and their structural context. For instance, the R3 LPTLs containing the UAA triloop sequence occur in the multi-stem loop and the third position of the triloop is involved in a long-range tertiary interaction, although their lonepair conformation is dependent on the lonepair type.17 The H-turns23 form two base-pairs in multi-stem loops, G:A S and U:A rH, but either of the two base-pairs may fail to form, probably because of insufficient structural constraints upon forming a sharp hook-turn of one strand.

The AA.AG at helix.ends motif14 involves a single base-pair, G:A (with G 3′ and A 5′ to a regular secondary helix) or A:A, and usually forms G:A S and A:A S, stabilizing helix ends by preventing any potential structural perturbation from being further propagated into helical stems. Our analysis also revealed that some AA.AG at helix.ends base-pairs exchange with C:A, forming C:A S isosteric to G:A S (data not shown).
In addition, some other exceptional conformations can be formed, depending on the structural context (Table 4). Specifically, the G:A S and A:A S conformations occur 100% in hairpin loops, 82% in internal loops (with 8% WC), and 61% in multi-stem loops (with 13% WC).14 In addition, several AA.AG at helix.ends motifs in internal and multi-stem loops are not formed in the rRNA crystal structures (data not shown). Consequently, the AA.AG at helix.ends base-pairs are highly constrained in hairpin loops, constrained in internal loops, and relatively unconstrained in multi-stem loops. In contrast, the eight G:A base-pairs with the reversed orientation (G 5′ and A 3′ to helix) always adopt the G:A WC conformation with the long dCC of 12.6 Å in the rRNAs (Figure 12(a)),14 which were recently rediscovered as the cis Watson–Crick A/G base-pairs,66 and are involved in helical stacking (data not shown).

The tandem GA motifs15,16 occur in internal or multi-stem loops and are composed of the GA/GA, GA/AA, and GA/UA sequences, wherein the GA/UA sequence always exists as part of the E-loops and forms U:A rH and A:G S. Both of the tandem GA base-pairs usually adopt G:A S (or A:A S) in 2×2 internal loops. In large internal and multi-stem loops, the helix-side base-pair forms G:A S (or A:A S) and the loop-side base-pair is frequently not formed. Interestingly, however, the loop-side A:A base-pairs form the A:A rS conformation (data not shown).

The A-mediated interactions involve the N1 and N3 positions of A, which are intrinsically nucleophilic due to the electron-donating amino group. Consequently, many unpaired A bases in the rRNA secondary structure models are involved in tertiary interactions with other sections of the RNA chain. In particular, such unpaired A bases frequently interact in the minor groove with C:G (or sometimes U:G and U:A) within helical stems.
Due to the lack of any structural constraint, however, the A-mediated tertiary interactions lead to diverse conformations depending on the topology of an unpaired A in the minor groove of a helical stem (Table 4). The most common A-mediated tertiary interaction employs an unpaired A at the N3 position and the G of a C:G, forming either G:A rS (known as type I A-minor motif21,22) or G:A rpS (Figure 5). The alternative use of the N1 position of the unpaired A results in the G:A fS or G:A pfS conformation (Figure 5). On the other hand, when the unpaired A interacts with a pyrimidine (C or U), its amino group is hydrogen-bonded to the pyrimidine carbonyl group in the minor groove, leading to either C:A fS or C:A pfS (Figure 6) and U:A fS (Figure 3).

In this regard, each of the wide variety of rare non-canonical conformations observed in RNA structure should not be ignored. Although rare, each of them, or clusters of them, may have structural and biological relevance by organizing nearby local structures, or may play a critical role in RNA folding by mediating long-range tertiary interactions.

Evaluation of the existing naming systems

This new Lee–Gutell (LG) system is based on the topology of base–base interactions and unambiguously describes all possible arrangements and orientations of the two bases that are hydrogen-bonded to each other, even without the explicit inclusion of the base-backbone interactions between the 2′-OH group of one nucleotide and the base of the other. Table 5 compares the LG system with the existing systems, including the common designation (CD) system,5,6,24,27,28 the Leontis–Westhof (LW) system,25 and the Saenger system.67 While the majority of base-pair conformations correspond between the first three systems, the LG system has several advantages over the existing systems. First, the LG system is simple, systematic, and convenient to use; instead of the long names, it uses short names.
The CD system, based on the interacting chemical groups, is not easy to use except for some traditional names such as Watson–Crick,
  • 21. Table 5. Correspondence between the ten base-pair groups described here and 14 theoretically possible base-pair conformations and the three primary naming systems This work Common designation5,6,24,27,28 Leontis & Westhof25 Saenger67 C:G WC GC Watson–Crick CG cis WC/WC XIX (WC) Wb GC NH-COa – – sWC – – – sWb # # # rWC CG reverse Watson–Crick CG trans WC/WC XXII (rWC) rWb – – – H(*) GC Hoogsteen (–) CC G cis WC/H (–) – rH(*) GC NH2-COa (–) CC G trans WC/H (–) – S(*) – – – rS(*) GC N7-NH2 a (–) CG trans H/H (GC trans S/S) – fS(*) – (GC N3-NH2, NH2-N3) GC trans WC/S (CG trans WC/S) – pfS(*) GC NH-CO (–) GC cis WC/S (CG cis WC/S) – pS(*) – (GC N3-NH2 a – (CG cis H/S) – rpS(*) GC CO-NH2 a (GC NH2-COa ) – (CG cis S/S or GC cis S/Sb ) – U:A WC AU Watson–Crick UA cis WC/WC XX (WC) Wb – – – sWC AU NH2K2-COa – – sWb # # # rWC AU reverse Watson–Crick UA trans WC/WC XXI (rWC) rWb – – – H(*) AU Hoogsteen (AU NH2-4-COa ) UA cis WC/H (AU cis WC/H) XXIII (H) rH(*) AU reverse Hoogsteen (–) UA trans WC/H (–) XXIV (rH) S(*) – AU trans H/S or AU cis WC/S – rS(*) – UA trans H/H or UA trans S/Sb (–) – fS(*) AU NH2K2-CO (AU N3-NH2 a ) UA cis WC/S (–) – pfS(*) – AU cis WC/S (UA cis WC/S) – pS – AU cis H/S (–) – rpS(*) – UA cis S/Sb (AU cis S/Sb ) – U:G WC – – – Wb(*) GU wobble (GU NH-4-COa ) UG cis WC/WC (–) XXVIII sWC # # # sWb – – – rWC – – – rWb GU reverse wobble UG trans WC/WC XXVII H(*) GU CO-NHa (–) UG cis WC/H (–) – rH(*) GU N7-NHa (–) UG trans WC/S (GU trans WC/S) – S(*) – – (UG trans H/S) – rS(*) – – (GU trans S/S) – fS(*) GU N3-NH, NH2-CO (–) GU trans WC/S (UG trans WC/S) – pfS(*) – (GU NH2-4-CO) GU cis WC/S (UG cis WC/S) – pS(*) – (GU NH2K2-CO) GU cis H/S (–) – rpS(*) – GU cis S/Sb (UG cis S/S) – G:A WC GA imino GA cis WC/WC VIII Wb – – – sWC # # # sWb # # # rWC – – – rWb – – – H(*) GA N7–N1,CO-NH2 (GAC N7–N1,CO-NH2) GA cis WC/H (AC G cis WC/H) IX rH(*) – – (AG trans WC/H) – S(*) GA sheared or GA NH2-N7a (–) AG trans H/S or AG cis WC/S (–) XI rS(*) GA NH2-N3a (–) GA 
transH/H or GA trans S/S (AG trans S/S) – fS(*) GA N3-NH2, NH2-N1 (–) AG trans WC/S (–) X pfS(*) GA NH2-N1 or GA N3-NH2 a (–) AG cis WC/S (GA cis WC/S) – pS(*) – AG cis H/S (–) – rpS(*) – GA cis S/Sb or GA cis H/H (AG cis S/S) – C:A WC – – – Wb(*) AC wobble (AC N1-NH2 a ) CC A cis WC/WC (–) – sWC # # # sWb AC N1-NH2 a – – rWC – – – rWb AC reverse wobble CA trans WC/WC XXVI H(*) AC N7-NH2 (NH2-2-COa ) – – rH(*) AC reverse Hoogsteen (–) CA trans WC/H (–) XXV S(*) – AC trans H/S or AC cis WC/S (CA trans H/S or CA cis WC/S) – rS(*) – CA trans H/H (–) – (continued on next page) Diversity of Base-pair Conformations 1245
  • 22. Table 5 (continued) This work Common designation5,6,24,27,28 Leontis & Westhof25 Saenger67 fS(*) – AC trans WC/S (CA trans WC/S) – pfS(*) – (AC N3-NH2 a ) – – pS(*) – AC cis H/S (CA cis H/S) – rpS(*) # # CA cis S/Sb (AC cis S/Sb ) # U:C WC UC 4-CO-NH2 UC cis WC/WC XVIII Wb – – – sWC UC 2-CO-NH2 a – – sWb # # # rWC UC 2-CO-NH2 a UC trans WC/WC XVII rWb – – – H(*) – – (CU cis WC/H) – rH(*) – – – S(*) – CU trans H/S (–) – rS(*) UC 2-CO-NH2 a (–) UC trans H/S (–) – fS(*) – (UC NH-CO) CU trans WC/S (UC trans WC/S) – pfS(*) – CU cis WC/S (UC cis WC/S) – pS – CU cis H/S (–) – rpS(*) – UC cis S/Sb (CU cis S/Sb ) – A:A WC – – – Wb AA N1-NH2 AA cis WC/WC – sWC # # # sWb # # # rWC # # # rWb AA N1-NH2, sym AA trans WC/WC I H AA N7-NH2 a – – rH AA N7-NH2 AA trans WC/H V S AA sheared or AA N3-NH2 a AA trans H/S or AA cis WC/S – rS AA N7-NH2, sym AA trans H/H II fS – AA transWC/S – pfS – – – pS – AA cis H/S – rpS # AA cis S/Sb # C:C WC – – – Wb CC N3-CO, NH2-N3 CC C cis WC/WC – sWC # # # sWb CC CO-NH2 a – – rWC CC CO-NH2, sym CC trans WC/WC – rWb CC N3-NH2, sym – XIV rWC – CC trans WC/WC XV H – CC cis WC/H – rH – CC trans WC/H – S – CC trans H/S or CC cis WC/S – rS # # # fS – – – pfS – CC trans WC/S – pS – CC cis H/S – rpS # CC cis S/Sb # G:G WC # # # Wb GG CO-NHa GG cis WC/WC – sWC # # # sWb – – – rWC # # # rWb GG N1-CO, sym GG trans WC/WC III H GG N1-CO, N7-NH2 GG cis WC/H VI rH GG N7-NH GG trans WC/H VII S GG NH2-N7a GG trans H/S – rS GG N3-NH2, sym GG trans S/S IV fS – – – pfS GG N3-NH2 a GG cis WC/S – pS – – – rpS – GG cis S/S – U:U WC – – – Wb UU NH-CO UU cis WC/WC XVI sWC # # # sWb – – – rWC – – – rWb(*) UU 2-CO-NH,sym (UU 4-CO-NH2, sym) UU trans WC/WC XII, XIII H – UU cis WC/H – rH UU 4-CO-C5H, NH-4-CO UU trans WC/H – 1246 Diversity of Base-pair Conformations
wobble, Hoogsteen, and reverse Hoogsteen. This system also needs the explicit designation of the number of hydrogen bonds to avoid confusing names between different conformations. In contrast, the LW system requires the explicit designation of the relative orientations (cis or trans) of the glycosidic bonds. Second, the LG system describes the topological arrangements of the two bases in a given base-pair, regardless of the presence or absence of the hydrogen bond between the 2′-OH group of one nucleotide and the base of the other, which is required for many base-pairs in the LW system (e.g. cis WC/S, trans WC/S, and trans H/S). However, our analysis has revealed many cis WC/S, trans WC/S, and trans H/S conformations that do not have the 2′-OH-mediated hydrogen bond, suggesting that these conformations fluctuate. In particular, the G:A S conformation has two names in the LW system, AG cis WC/S and AG trans H/S, both of which are topologically equivalent to one another in the crystal structures. Third, the LG system is not dependent on the order of two paired nucleotides, but instead it is based on the base-pair groups (Table 1). The alternative names with an asterisk are used for the different relative orientation of the two bases, instead of switching the order of the two nucleotides. For example, while the GU trans WC/S conformation in the LW system can be described as U:G fS or G:U fS with the LG system, the UG trans WC/S conformation in the LW system can be described as U:G fS* or G:U fS* (Figure 4). Fourth, the LG system describes more base-pair conformations. The LW system describes 74 of the 121 major conformations (exclusive of the alternative conformations) defined by the LG system (Table 5). For example, several conformations including U:G Wb*, C:G Wb, and C:A Wb* in the LG system are not described with the LW system. In addition, the sWC (and sWb) conformations are unique to the LG system.
In fact, the LW system describes 84 of the 119 conformations involved in simple base-pairs and higher-order interactions, inclusive of the alternative conformations, which are described with the LG system and are present in the set of crystal structures analyzed here (data not shown). Fifth, the LG system also provides formal names for higher-order interactions involved in base-triples and quadruples (Table 1), while the LW system does not. Sixth, the LG system may be used to trace the intermediates for the topological changes of base-pair conformations. For example, the intermediate conformations may be derived for the topological transition between the three topologically related conformations, C:A S, C:A rH and C:A sWb (Figure 6). Together, the established topological isostericity between base-pairs can be associated with the base-pair exchange patterns in an alignment of homologous RNA sequences to predict base-pair conformations.

Materials and Methods

Analysis and classification of base-pair conformations

Base-pairs in the rRNAs in the crystal structures of the T. thermophilus 30 S (PDB, 1FJF1) and H. marismortui 50 S (PDB, 1FFK2 and 1JJ23) ribosomal subunits were visually identified and characterized using the RasMol program.30,31 The base-pairs were then (1) divided into ten base-pair groups, C:G, U:A, U:G, G:A, C:A, U:C, A:A, C:C, G:G, and U:U, and (2) classified into 14 major families based on the topological arrangement of the two bases and two glycosidic bonds of a given base-pair (Table 1). Two bases that form base-backbone hydrogen bonding interactions with the 2′-OH group and no direct donor–acceptor interactions between the two bases are not considered as a base-pair with our classification system. The wavy lines in base-pair conformation Figures 2–11 represent hydrogen bonds once the base is protonated.
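The two-step classification described above (ten base-pair groups, then 14 families) can be written down as a small naming helper. The sketch below is an illustration only and was not part of the original analysis, which was done by visual inspection: the function names are hypothetical, the family abbreviations are taken from the row labels of Table 5, and the asterisk for the alternative orientation has to be supplied by the caller because it depends on the relative orientation of the two bases, which cannot be derived from the nucleotide letters.

```python
# Ten base-pair groups, written in the order used in the text.
BASE_PAIR_GROUPS = ("C:G", "U:A", "U:G", "G:A", "C:A",
                    "U:C", "A:A", "C:C", "G:G", "U:U")

# Fourteen family abbreviations, as they label the rows of Table 5.
FAMILIES = ("WC", "Wb", "sWC", "sWb", "rWC", "rWb", "H",
            "rH", "S", "rS", "fS", "pfS", "pS", "rpS")

# Lookup that ignores nucleotide order: {"G", "U"} -> "U:G".
_GROUP_OF = {frozenset(g.split(":")): g for g in BASE_PAIR_GROUPS}


def lg_name(b1, b2, family, alternative=False):
    """Build an LG-style name such as 'U:G fS' or 'U:G fS*'.

    The group is chosen independently of the order of b1 and b2;
    `alternative` adds the asterisk for the other relative orientation.
    """
    if family not in FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    group = _GROUP_OF[frozenset((b1.upper(), b2.upper()))]
    return f"{group} {family}{'*' if alternative else ''}"


def all_lg_names():
    """Enumerate the 10 x 14 grid of group/family combinations."""
    return [f"{g} {f}" for g in BASE_PAIR_GROUPS for f in FAMILIES]


print(lg_name("G", "U", "fS"))                    # same name for either order
print(lg_name("U", "G", "fS", alternative=True))  # asterisk form
print(len(all_lg_names()))
```

The enumeration covers every cell of the grid on which Table 5 is laid out; as the table's hash signs indicate, not every cell corresponds to a conformation that is likely to form hydrogen bonds.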
Table 5 (continued)

This work | Common designation5,6,24,27,28 | Leontis & Westhof25 | Saenger67
S | – | – | –
rS | – | – | –
fS | – | UU trans WC/S | –
pfS | – | UU cis WC/S | –
pS | – | – | –
rpS | – | UU cis S/S(b) | –

The hash sign (#) represents the conformations that are not likely to form a hydrogen bond(s) between two base-pairing bases, and the long dash mark (–) represents the conformations that were not assigned by the three other naming systems. The conformations available (at http://prion.bchs.uh.edu/bp_type/) are simply represented by using acronyms; CO for carbonyl, NH for imino, NH2 for amino, and sym for symmetric.
(a) Base-pair conformations with a single hydrogen bond are explicitly designated.
(b) Base-pair conformations proposed by the LW system, which have either base–backbone or backbone–backbone hydrogen bonding interactions between two nucleotides. These interactions were not considered as base-pairs and are not included in our classification system for simplicity (Figures 2–11).

Since the base-pairs that can form bifurcated hydrogen bonding interactions usually maintain the same topological arrangement in the presence and absence of bifurcated hydrogen bonds (BHBs), the BHBs are not shown in Figures 2–11. In addition, the base-pairs that can theoretically form their keto-enol and amino-imino tautomers were depicted in Figures 2–11. Hydrogen bonds were typically considered when the distance between the hydrogen bond donor and acceptor, dDA, is less than 3.5 Å. While it was not possible to measure the angles for hydrogen bonding interactions due to the lack of hydrogen atoms in the crystal structures, hydrogen bonds
were considered to form linear and nearly linear hydrogen bonding interactions. Base-pair positions for the rRNAs are represented using the T. thermophilus numbering for the 16 S rRNA and the H. marismortui numbering for the 23 S rRNA, with the E. coli numbering in parentheses†.

† The Tables and Figures shown here are also available at http://www.rna.icmb.utexas.edu/ANALYSIS/BPC/.

Acknowledgements

We thank Jamie J. Cannone for proofreading the manuscript. This work was supported by the National Institutes of Health (GM067317), the Welch Foundation (F-1427), start-up funds from the Institute for Cellular and Molecular Biology at the University of Texas at Austin, and Ibis Therapeutics, a division of Isis Pharmaceuticals.

References

1. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr, Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al. (2000). Structure of the 30 S ribosomal subunit. Nature, 407, 327–339.
2. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
3. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
4. Gutell, R. R., Lee, J. C. & Cannone, J. J. (2002). The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Struct. Biol. 12, 301–310.
5. Watson, J. D. & Crick, F. H. C. (1953). Molecular structure of nucleic acids: a structure for deoxyribonucleic acid. Nature, 171, 737–738.
6. Crick, F. H. (1966). Codon-anticodon pairing: the wobble hypothesis. J. Mol. Biol. 19, 548–555.
7. Woese, C. R., Winker, S. & Gutell, R. R. (1990). Architecture of ribosomal RNA: constraints on the sequence of “tetra-loops”. Proc. Natl Acad. Sci. USA, 87, 8467–8471.
8. Cheong, C., Varani, G. & Tinoco, I., Jr (1990). Solution structure of an unusually stable RNA hairpin, 5′GGAC(UUCG)GUCC. Nature, 346, 680–682.
9. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191–194.
10. Jucker, F. M. & Pardi, A. (1995). GNRA tetraloops make a U-turn. RNA, 1, 219–222.
11. Jucker, F. M. & Pardi, A. (1995). Solution structure of the CUUG hairpin loop: a novel RNA tetraloop motif. Biochemistry, 34, 14416–14427.
12. Gautheret, D., Konings, D. & Gutell, R. R. (1995). G·U base pairing motifs in ribosomal RNA. RNA, 1, 807–814.
13. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Szewczak, A. A. et al. (1996). RNA tertiary structure mediation by adenosine platforms. Science, 273, 1676–1677.
14. Elgavish, T., Cannone, J. J., Lee, J. C., Harvey, S. C. & Gutell, R. R. (2001). AA.AG at helix ends: A:A and A:G base-pairs at the ends of 16 S and 23 S rRNA helices. J. Mol. Biol. 310, 735–753.
15. SantaLucia, J., Kierzek, R. & Turner, D. H. (1990). Effects of GA mismatches on the structure and thermodynamics of RNA internal loop. Biochemistry, 29, 8813–8819.
16. Gautheret, D., Konings, D. & Gutell, R. R. (1994). A major family of motifs involving GA mismatches in ribosomal RNA. J. Mol. Biol. 242, 1–8.
17. Lee, J. C., Cannone, J. J. & Gutell, R. R. (2003). Lonepair triloop: a new motif in RNA structure. J. Mol. Biol. 325, 65–83.
18. Nagaswamy, U. & Fox, G. E. (2002). Frequent occurrence of the T-loop RNA folding motif in ribosomal RNAs. RNA, 8, 1112–1119.
19. Quigley, G. J. & Rich, A. (1976). Structural domains of transfer RNA molecules. Science, 194, 796–806.
20. Gutell, R. R., Cannone, J. J., Konings, D. & Gautheret, D. (2000). Predicting U-turns in ribosomal RNA with comparative sequence analysis. J. Mol. Biol. 300, 791–803.
21. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. (2001). RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA, 98, 4899–4903.
22. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna, J. A. (2001). A universal mode of helix packing in RNA. Nature Struct. Biol. 8, 339–343.
23. Szép, S., Wang, J. & Moore, P. B. (2003). The crystal structure of a 26-nucleotide RNA containing a hook-turn. RNA, 9, 44–51.
24. Nagaswamy, U., Larios-Sanz, M., Hury, J., Collins, S., Zhang, Z., Zhao, Q. & Fox, G. E. (2002). NCIR: a database of non-canonical interactions in known RNA structures. Nucl. Acids Res. 30, 395–397.
25. Leontis, N. B., Stombaugh, J. & Westhof, E. (2002). The non-Watson–Crick base pairs and their associated isostericity matrices. Nucl. Acids Res. 30, 3497–3531.
26. Lemieux, S. & Major, F. (2002). RNA canonical and non-canonical base pairing types: a recognition method and complete repertoire. Nucl. Acids Res. 30, 4250–4263.
27. Walberer, B. J., Cheng, A. C. & Frankel, A. D. (2003). Structural diversity and isomorphism of hydrogen-bonded base interactions in nucleic acids. J. Mol. Biol. 327, 767–780.
28. Donohue, J. (1956). Hydrogen-bonded helical configurations of polynucleotides. Proc. Natl Acad. Sci. USA, 42, 60–65.
29. Donohue, J. & Trueblood, K. N. (1960). Base-pairing in DNA. J. Mol. Biol. 2, 363–371.
30. Sayle, R. A. & Milner-White, E. J. (1995). RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20, 374–376.
31. Bernstein, H. J. (2000). Recent changes to RasMol: recombining the variants. Trends Biochem. Sci. 25, 453–455.
32. Dickerson, R. E., Bansal, M., Calladine, C. R., Diekmann, S., Hunter, W. N., Kennard, O. et al. (1998). Definitions and nomenclature of nucleic acid structure parameters. EMBO J. 8, 1–4.
33. Murray, L. J. W., Arendall, W. B., III, Richardson, D. C. & Richardson, J. S. (2003). RNA backbone is rotameric. Proc. Natl Acad. Sci. USA, 100, 13904–13909.
34. Puglisi, J. D., Wyatt, J. R. & Tinoco, I., Jr (1990). Solution conformation of an RNA hairpin loop. Biochemistry, 29, 4215–4226.
35. SantaLucia, J., Jr, Kierzek, R. & Turner, D. H. (1991). Stabilities of consecutive AC, CC, GG, UC, and UU mismatches in RNA internal loops: evidence for stable hydrogen-bonded UU and CC+ pairs. Biochemistry, 30, 8242–8251.
36. Biala, E. & Strazewski, P. (2002). Internal mismatched RNA: pH and solvent dependence of the thermal unfolding of tRNAAla acceptor stem microhairpins. J. Am. Chem. Soc. 124, 3540–3545.
37. Leitner, D., Schröder, W. & Weisz, K. (2000). Influence of sequence-dependent cytosine protonation and methylation on DNA triplex stability. Biochemistry, 39, 5886–5892.
38. Pioletti, M., Schlünzen, F., Harms, J., Zarivach, R., Glühmann, M., Avila, H. et al. (2001). Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J. 20, 1829–1839.
39. Gutell, R. R. (1993). The simplicity behind the elucidation of complex structure in ribosomal RNA. In The Translational Apparatus (Nierhaus, J. H. et al., eds), pp. 477–488, Plenum Press, New York.
40. Gutell, R. R. (1996). Comparative sequence analysis and the structure of 16 S and 23 S rRNA. In Ribosomal RNA: Structure, Evolution, Processing, and Function in Protein Synthesis (Dahlberg, A. E. & Zimmerman, R. A., eds), pp. 111–128, CRC Press, Boca Raton, FL.
41. Gutell, R. R. & Woese, C. R. (1990). Higher-order structural elements in ribosomal RNAs: pseudoknots and the use of non-canonical pairs. Proc. Natl Acad. Sci. USA, 87, 663–667.
42. Fabiola, G. F., Krishnaswamy, S., Nagarajan, V. & Pattabhi, V. (1997). C–H/O hydrogen bonds in beta-sheets. Acta Crystallog. sect. D, 53, 316–320.
43. Lu, M. & Steitz, T. A. (2000). Structure of Escherichia coli ribosomal protein L25 complexed with a 5 S rRNA fragment at 1.8 Å resolution. Proc. Natl Acad. Sci. USA, 97, 2023–2028.
44. Ennifar, E., Nikulin, A., Tishchenko, S., Serganov, A., Nevskaya, N., Garber, M. et al. (2000). The crystal structure of UUCG tetraloop. J. Mol. Biol. 304, 35–42.
45. Gautheret, D. & Gutell, R. R. (1997). Inferring the conformation of RNA base pairs and triples from patterns of sequence variation. Nucl. Acids Res. 25, 1559–1564.
46. Perbandt, M., Vallazza, M., Lippmann, C., Betzel, C. & Erdmann, V. A. (2000). Structure of an RNA duplex with an unusual GC pair in wobble-like conformation at 1.6 Å resolution. Acta Crystallog. sect. D, D57, 219–224.
47. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Szewczak, A. A. et al. (1996). RNA tertiary structure mediation by adenosine platforms. Science, 273, 1696–1699.
48. Conn, G. L., Draper, D. E., Lattman, E. E. & Gittis, A. G. (1999). Crystal structure of a conserved ribosomal protein–RNA complex. Science, 284, 1171.
49. Wimberly, B. T., Guymon, R., McCutcheon, J. P., White, S. W. & Ramakrishnan, V. (1999). A detailed view of a ribosomal active site: the structure of the L11-RNA complex. Cell, 97, 491.
50. Wimberly, B., Varani, G. & Tinoco, I., Jr (1993). The conformation of loop E of eukaryotic 5 S ribosomal RNA. Biochemistry, 32, 1078–1087.
51. Wang, Y. & Patel, D. J. (1993). Solution structure of a parallel-stranded G-quadruplex DNA. J. Mol. Biol. 234, 1171–1183.
52. Haider, S., Parkinson, G. N. & Neidle, S. (2002). Crystal structure of the potassium form of an Oxytricha nova G-quadruplex. J. Mol. Biol. 320, 189–200.
53. Nissen, P., Kjeldgaard, M., Thirup, S. & Nyborg, J. (1999). The crystal structure of Cys-tRNACys-EF-Tu-GDPNP reveals general and specific features in the ternary complex and in tRNA. Struct. Fold. Des. 7, 143–156.
54. Phan, A. T., Gueron, M. & Leroy, J.-L. (2000). The solution structure and internal motions of a fragment of the cytidine-rich strand of the human telomere. J. Mol. Biol. 299, 123–144.
55. Fedoroff, O. Y., Rangan, A., Chemeris, V. V. & Hurley, L. H. (2000). Cationic porphyrin promote the formation of i-motif DNA and bind peripherally by a non-intercalative mechanism. Biochemistry, 39, 15083–15090.
56. Perbandt, M., Nolte, A., Lorenz, S., Bald, R., Betzel, C. & Erdmann, V. A. (1998). Crystal structure of domain E of Thermus flavus 5 S rRNA: a helical RNA structure including a hairpin loop. FEBS Letters, 429, 211–215.
57. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Kundrot, C. E. et al. (1996). Crystal structure of a group I ribozyme domain: principles of RNA packing. Science, 273, 1678–1685.
58. Juneau, K., Podell, E. R., Harrington, D. J. & Cech, T. R. (2001). Structural basis of the enhanced stability of a mutant ribozyme domain and a detailed view of RNA–solvent interactions. Structure, 9, 221–231.
59. Ruff, M., Krishnaswamy, S., Boeglin, M., Poterszman, A., Mitschler, A., Podjarny, A. et al. (1991). Class II aminoacyl transfer RNA synthetases: crystal structure of yeast aspartyl-tRNA synthetase complexed with tRNA. Science, 252, 1682–1689.
60. Sussman, J. L., Holbrook, S. R., Warrant, R. W., Church, G. M. & Kim, S. H. (1978). Crystal structure of yeast phenylalanine tRNA. I. Crystallographic refinement. J. Mol. Biol. 123, 607–630.
61. Harms, J., Schluenzen, F., Zarivach, R., Bashan, A., Gat, S., Agmon, I. et al. (2001). High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell, 107, 679–688.
62. Adams, P. L., Stahley, M. R., Kosek, A. B., Wang, J. & Strobel, S. (2004). Crystal structure of a self-splicing group I intron with both exons. Nature, 435, 45–50.
63. Gutell, R. R., Cannone, J. J., Shang, Z., Du, Y. & Serra, M. J. (2000). A story: unpaired adenosine bases in ribosomal RNAs. J. Mol. Biol. 304, 335–354.
64. Leontis, N. B. & Westhof, E. (1998). A common motif organizes the structure of multi-helix loops in 16 S and 23 S ribosomal RNAs. J. Mol. Biol. 283, 571–583.
65. Wimberly, B. (1994). A common RNA loop motif as a docking module and its function in the hammerhead ribozyme. Struct. Biol. 1, 820–827.
66. Sponer, J., Mokdad, A., Sponer, J. E., Spacková, N., Leszczynski, J. & Leontis, N. B. (2003). Recent unique tertiary and neighbor interactions determine conservation patterns of cis Watson–Crick A/G base-pairs. J. Mol. Biol. 330, 967–978.
67. Saenger, W. (1984). Principles of Nucleic Acid Structure, pp. 120–121, Springer-Verlag, New York.

Edited by D. E. Draper

(Received 7 June 2004; received in revised form 20 September 2004; accepted 24 September 2004)