This document discusses long non-coding RNAs (lncRNAs). It begins by describing the discovery of lncRNAs in the 1980s-2000s through cDNA sequencing. It then states that lncRNAs are the largest class of transcripts in mouse and human genomes. The document discusses that lncRNAs were once thought to be useless but are now known to have regulatory functions. It provides details on the characteristics, locations in the genome, functions, mechanisms of action, roles in human disease, and implications in human carcinomas of lncRNAs.
2. DISCOVERY
Many initial lncRNAs, such as XIST and H19, were
discovered in the 1980s and 1990s by searching
cDNA libraries for clones of interest
Early 2000s: development of large scale cDNA
sequencing leads to discovery of surprising number
of lncRNA transcripts.
Mid 2000s: the number of detected lncRNA
transcripts increases exponentially.
2
3. Long non-protein-coding RNAs (lncRNAs) are
proposed to be the largest transcript class in the
mouse and human transcriptomes
The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and
other non-coding RNA transcribed in one cell or a population of cells
In addition to protein-coding genes, the human genome
makes a large amount of noncoding RNAs, including
microRNAs and long noncoding RNAs (lncRNAs). Both
microRNAs and lncRNAs have been shown to have a critical
role in the regulation of cellular processes such as cell growth
and apoptosis, as well as cancer progression and metastasis
Such RNAs were previously disregarded as useless,
but recent functional studies have revealed that they
have multiple regulatory functions.
In addition to mRNA and microRNA, lncRNA is a major
hotspot in functional genomics research.
3
4. LNCRNAS - A NEW LAYER OF GENOME REGULATORY
INFORMATION
It is now well appreciated that less than two percent of the
human genome codes for proteins and the majority of the
genome gives rise to non-protein-coding RNAs (ncRNAs)
The focus of this review is long ncRNAs (known as lncRNAs),
which constitute the biggest class of ncRNAs with
approximately 10,000 lncRNA genes so far annotated in
humans.
.Based on large-scale sequencing and prediction from
chromatin-state maps of full length cDNA libraries in
FANTOM2 and 3 as well as human transcriptomes, more than
4,600 LncRNAs in mouse and over 3,300 LncRNAs in human
have been identified with a total of approximately 23,000
LncRNAs in a mammalian genome.
http://fantom3.gsc.riken.jp/
The number of lncRNAs may still rise if next-generation
sequencing studies focus on cell types that are not yet
completely characterized. 4
5. lncRNAs are RNA polymerase II (RNAPII) transcripts that
lack an open reading frame and are longer than 200
nucleotides.
Majority of long ncRNAs have been shown to be transcribed
through RNA polymerase II, although some long ncRNAs are
generated by RNA polymerase III
This size cut-off distinguishes lncRNAs from small RNAs
such as microRNAs, piwi-interacting RNAs (piRNAs), small
nucleolar RNAs (snoRNAs) and small interfering RNAs
(siRNAs) and arises from RNA preparation methods that
capture RNA molecules above this size.
?Two important questions are whether all lncRNAs are
functional and how they could exert a function.
Several lncRNAs have been shown to function through their
product, but this is not the only possible mode of action.
5
6. FUNCTION
The function of the majority of lncRNAs is still unknown.
the number of characterized lncRNAs is growing and
they play roles in negatively or positively regulating gene
expression in development, differentiation and human
disease
lncRNAs may regulate protein-coding (pc) gene
expression at both the posttranscriptional and
transcriptional level.
Posttranscriptional regulation could occur by lncRNAs
acting as competing endogenous RNAs to regulate
microRNA levels as well as by modulating mRNA
stability and translation by homologous base pairing, or
as in the example of NEAT1 that is involved in nuclear
retention of mRNAs . 6
7. FUNCTION
Long ncRNAs that regulate transcription are
divergent molecules.
Paucity of Introns (nuclear localization)
Low GC content (low expression level)
Predicted ORFs have poor start codon and
contexts (activation of nonsense-mediated decay
pathway)
Significant similarity between lncRNA and 3’-UTR of
mRNA (structural feature + sequence composition)
7
8. The most mRNAs, which ultimately localized to the cytoplasm
after processing, most lncRNAs are permanently localized in
the nucleus. There are also some lncRNAs selectively
localized in the cytoplasm
8
9. LOCATION IN GENOME
LncRNAs can be categorized according to their proximity to
protein coding genes in the genome, using this criteria
lncRNAs are generally placed into five categories:
sense
antisense
bidirectional
intronic
intergenic
9
10. 10
Sense - The lncRNA sequence
overlaps with the sense strand of
a protein coding gene.
Antisense - The lncRNA
sequence overlaps with the
antisense strand of a protein
coding gene.
Bidirectional - The lncRNA
sequence is located on the
opposite strand from a protein
coding gene whose transcription
is initiated less than 1000 base
pairs away.
Intronic - The lncRNA sequence
is derived entirely from within an
intron of another transcript. This
may be either a true independent
transcript or a product of pre-
mRNA processing
Intergenic - The lncRNA
sequence is not located near any
other protein coding loci
11. MODES OF TRANSCRIPTIONAL REGULATION
BY LNCRNAS
Regulation of transcription is considered:
to be an interplay of tissue and developmental specific
transcription factors (TFs)
chromatin modifying factors acting on enhancer
promoter sequences to facilitate the assembly of the
transcription machinery at gene promoters.
Transcriptional regulation by lncRNAs could work either
in cis or in trans, and could negatively or positively
control pc gene expression.
lncRNAs work in cis when their effects are restricted to
the chromosome from which they are transcribed, and
work in trans when they affect genes on other
chromosomes.
11
12. 12
LncRNAs fall into one of three categories.
(A) Long intervening non-coding RNAs (lincRNAs) are transcribed from
regions far away from protein-coding genes.
(B) Natural Antisense Transcripts (NATs) are transcribed from the
opposite strand of a protein-coding gene.
(C) Intronic lncRNAs (shown in green) are transcribed from within
introns of protein-coding genes.
13. TRANSCRIPTION OF LONG NCRNAS
Many of long ncRNAs represent tissue-specific pattern of
expression
the expression of these long ncRNAs should:
1.these long ncRNAs contain trimethyl marks of histone H3-
lysine (K) 4 at their promoter regions and trimethyl marks of
histone H3-K36 along the length of the transcribed region,
which are observed in usual transcripts by RNA polymerase II.
Trimethyl marks are designated as “chromatin signature (a
K4-K36 domain)
2. long ncRNAs generally possess the 50CAP (7-
methylguanosine cap) structure at the 50 edge and also poly
(A) tail at their 30 end as well
3. the long ncRNAs have well-defined transcription factor
binding sites like NF-kB in their promoter regions
13
15. Transcriptional Regulation Through Targeting Basic
Transcription Factors and RNA Polymerase II by Long
ncRNAs
***************
1. Alu RNA
SINE retrotransposon elements including Alu repeats generate
numerous species of long ncRNAs
Alu RNAs and SINE B2 RNAs exert transcriptional repression under
the heat-shock condition
SINE B2 and Alu RNA directly target RNA polymerase II.
SINE B2 turns out to have similar repressive effect on the
transcription as well
repetitive sequence that occupies the half of the human genome
could be transcribed, and their transcripts, the long ncRNAs, exert
transcriptional repression
15
16. 2. Dehydrofolate Reductase ncRNA
In mammalian cells, expression of dehydrofolate reductase (DHFR)
is repressed
a transcript of a minor promoter located upstream of a major
promoter is involved in the repression of DHFR
The alternative promoters within the same gene have been observed
in various loci.
It could be a general mechanism that the transcripts from the
alternative promoters have a regulatory role in transcription of the
promoter
16
17. 17
Transcriptional repression of dehydrofolate reductase (DHFR)
gene by the ncRNA transcribed from the minor promoter of
the DHFR gene. The DHFR ncRNA represses the DHFR
gene expression by blocking the preinitiation complex through
targeting TFIIB and RNA polymerase II
18. REGULATION IN TRANS
Some significant examples of lncRNAs that act in trans are those
that can influence the general transcriptional output of a cell by
directly affecting RNAPII activity
EXAMPLE:
The 331 nucleotide 7SK lncRNA, which represses transcription
elongation by preventing the PTEFβ transcription factor from
phosphorylating the RNAPII carboxy-terminal domain
THE 178 nucleotide B2 lncRNA, a general repressor of RNAPII
activity upon heat shock.
** The B2 lncRNA acts by binding RNAPII and inhibiting
phosphorylation of its CTD by TFIIH, thus disturbing the ability of
RNAPII to bind DNA
18
19. REGULATION IN TRANS
Regulation in trans can also act locus-specifically.
While the ability of lncRNAs to act locus-specifically to
regulate a set of genes was first demonstrated for imprinted
genes where lncRNA expression was shown to silence from
one to ten flanking genes in cis, lncRNAs that lie outside
imprinted gene clusters, such as the HOTAIR lncRNA, were
later found also to have locus-specific action.
HOTAIR is expressed from the HOXC cluster and was shown
to repress transcription in trans across 40 kb of
the HOXD cluster
HOTAIR interacts with Polycomb repressive complex 2
(PRC2) and is required for repressive histone H3 lysine-27
trimethylation (H3K27me3) of the HOXD cluster.
19
20. REGULATION IN CIS
20
In contrast to trans-acting
lncRNAs, which act via their
RNA product, cis-acting
lncRNAs have the possibility
to act in two fundamentally
different modes.
1. depends on a lncRNA
product
EXAMPLE: general cis-
regulation is induction of X
inactivation by the Xist lncRNA
in female mammals. Xist is
expressed from one of the two
X chromosomes and induces
silencing of the whole
chromosome
21. It was also proposed to act in plants.
In Arabidopsis thaliana, the COLDAIR lncRNA is initiated
from an intron of the FLC pc gene and silences it by targeting
repressive chromatin marks to the locus to control flowering
time
2. regulation by lncRNAs involves the process of transcription
itself, which is a priori cis-acting
Several lines of evidence suggest that the mere process of
lncRNA transcription can affect gene expression if RNAPII
traverses a regulatory element or changes general chromatin
organization of the locus.
21
25. MECHANISMS BY WHICH LNCRNA TRANSCRIPTION
SILENCES GENE EXPRESSION
Transcription-mediated silencing, also referred to as
‘transcriptional interference’ (TI)
TI has been reported in unicellular and multicellular organisms
Mechanistic details are still largely unclear, but TI could
theoretically act at several stages in transcription: by
influencing enhancer or promoter activity or by blocking
RNAPII elongation, splicing or polyadenylation.
25
28. 28
Type Subclasses Symbol
Small ncRNA
(18 to 200 nt in size)
Transfer RNAs tRNAs
MicroRNAs miRNAs
Ribosomal 5S and 5.8S RNAs rRNAs
Piwi interacting RNAs piRNAs
Tiny transcription initiation RNAs tiRNAs
Small interfering RNAs siRNA
Promoter-associated short RNAs PASRs
Termini-associated short RNAs TASRs
Antisense termini associated short RNAs aTASRs
Small nucleolar RNAs snoRNAs
Transcription start site antisense RNAs TSSa-RNAs
Small nuclear RNAs snRNAs
Retrotransposon-derived RNAs RE-RNAs
3’UTR-derived RNAs uaRNAs
x-ncRNA x-ncRNA
Human Y RNA hY RNA
Unusually small RNAs usRNAs
Small NF90-associated RNAs snaRs
Vault RNAs vtRNAs
29. 29
Type Subclasses Symbol
Long ncRNA
(lncRNAs, 200 nt
to >100 kb in size)
Ribosomal 18S and 28S RNAs rRNAs
Long or large intergenic ncRNAs lincRNAs
Transcribed ultraconserved regions T-UCRs
Pseudogenes None
GAA-repeat containing RNAs GRC-RNAs
Long intronic ncRNAs None
Antisense RNAs aRNAs
Promoter-associated long RNAs PALRs
Promoter upstream transcripts PROMPTs
Stable excised intron RNAs None
Long stress-induced non-coding
transcripts
LSINCTs
30. 30
accumulating reports of
misregulated lncRNA
expression across numerous
cancer types suggest that
aberrant lncRNA expression
may be a major contributor to
tumorigenesis
This surge in publications
reflects the increasing attention
to this subject and a number of
useful lncRNA databases have
been created
transcribed ultraconserved
regions (T-UCRs), within
human carcinomas.
32. With advancements in cancer transcriptome
profiling and accumulating evidence supporting
lncRNA function, a number of differentially
expressed lncRNAs have been associated with
cancer
LncRNAs have been implicated to regulate a range
of biological functions and the disruption of some of
these functions, such as genomic imprinting and
transcriptional regulation, plays a critical role in
cancer development.
32
33. MPRINTED LNCRNA GENES
Imprinting is a process whereby the copy of a gene inherited
from one parent is epigenetically silenced
imprinted regions often include multiple maternal and
paternally expressed genes with a high frequency of ncRNA
genes
The imprinted ncRNA genes are implicated in the imprinting
of the region by a variety of mechanisms including enhancer
competition and chromatin remodeling
A key feature of cancer is the loss of this imprinting
resulting in altered gene expression
Two of the best known imprinted genes are in fact lncRNAs.
33
34. The H19 gene encodes a 2.3 kb lncRNA that is expressed
exclusively from the maternal allele
H19 and its reciprocally imprinted protein-coding neighbor the
Insulin-Like Growth Factor 2 or IGF2 gene at 11p15.5.
The expression of H19 is high during vertebrate embryo
development, but is down regulated in most tissues shortly
after birth with the exception of skeletal tissue and cartilage
loss of imprinting at the H19 locus resulted in high H19
expression in cancers of the esophagus, colon, liver, bladder
and with hepatic metastases
H19 has been implicated as having both oncogenic and tumor
suppression properties in cancer
H19 is upregulated in a number of human cancers, including
hepatocellular, bladder and breast carcinomas, suggesting an
oncogenic function for this lncRNA
34
35. In colon cancer H19 was shown to be directly activated by the
oncogenic transcription factor c-Myc
H19 transcripts also serve as a precursor for miR-675, a
miRNA involved in the regulation of developmental genes
miR-675 is processed from the first exon of H19 and
functionally down regulates the tumor suppressor gene
retinoblastoma (RB1) in human colorectal cancer, further
implying an oncogenic role for H19
There is evidence suggesting H19 may also play a role in
tumor suppression
35
36. Using a mouse model for colorectal cancer, it was shown that
mice lacking H19 manifested an increased polyp count
compared to wildtype
a mouse teratocarcinoma model demonstrated larger tumor
growth when the embryo lacked H19
in a hepatocarcinoma model, mice developed cancer much
earlier when H19 was absent
36
37. XIST - X-INACTIVE-SPECIFIC TRANSCRIPT
XIST is arguably an archetype for the study of functional
lncRNAs in mammalian cells, having been studied for nearly
two decades
In female cells, the XIST transcript plays a critical role in X-
chromosome inactivation by physically coating one of the two
X-chromosomes, and is necessary for the cis-inactivation of
the over one thousand X-linked genes
Like the lncRNAs HOTAIR and ANRIL, XIST associates with
polycombrepressor proteins, suggesting a common pathway
of inducing silencing utilized by diverse lncRNAs
37
38. In mice, X inactivation in the extra embryonic tissues is
nonrandom , and the initial expression of Xist is always
paternal in origin, followed later by random X inactivation in
the epiblast associated with random mono-allelic expression
It is unclear, however, how much of this regulation is
conserved in humans, who do not show imprinted X
inactivation
While XIST expression levels are correlated with outcome in
some cancers, such as the therapeutic response in ovarian
cancer, the actual role that XIST may play in human
carcinomas, if any, is not entirely clear.
38
39. For tumors in which two active X chromosomes are observed,
as has been frequently observed in breast cancer, the most
common mechanism involves loss of the inactive X and
duplication of the active X, often resulting in heterogeneous
XIST expression in these tumors
many cancers do show different onsets and progressions in
males and females
XIST expression will increase with the number of inactive X
chromosomes.
It has been shown that approximately 15% of human X linked
genes continue to be expressed from the inactive X
chromosome
39
41. Many of the described lncRNAs are expressed in a variety of
cancers, however a select few thus far have been associated
with a single cancer type:
HOTAIR, has only been described in breast cancer
three lncRNAs PCGEM1, DD3 and PCNCR1 have been
associated solely with prostate cancer
the liver associated lncRNA HULC is highly expressed in
primary liver tumors, and in colorectal carcinomas that
metastasized to the liver, but not in the primary colon tumors
or in non-liver metastases
41
42. 42
The HOTAIR lncRNA is transcribed from the HOXC locus
binding of the PRC2 and LSD1 complex to the HOXD locus
the HOTAIR-PRC2-LSD1 complex is redirected to the HOXD locus on
chromosome 2 where genes involved in metastasis suppression are
silenced through H3K27 methylation and H3K4 demethylation.
This drives breast cancer cells to develop gene expression patterns that
more closely resemble embryonic fibroblasts than epithelial cells.
mechanism of HOTAIR
43. 43
(1) The kinase PRKACB functions as an activator of CREB.
(2) Phosphorylated (activated) CREB forms part of the RNA pol II
transcriptional machinery to activate HULC expression.
(3) Abundant HULC RNA acts as a molecular sponge to sequester and
inactivate the repressive function miR-372.
(4) PRKACB levels increase, as transcripts are normally translationally
repressed by high miR-372 levels
44. 44
The long ncRNAs involving
in transcriptional regulation
through chromosomal
modification
(a) Evf2 activates
transcription by removing
the methylase MeCP2 on
CpG regions and releasing
HDAC activity from the
target gene.
(b) HOTAIR activates
transcription by binding
PRC2 and histone
methylation of HOXD locus
46. RNA POLYMERASE III TRANSCRIPTION OF LNCRNA
The lncRNAs described thus far are products of RNA pol II
transcription, yet many ncRNAs are transcribed by RNA
polymerase III (RNA pol III)
RNA pol III is frequently deregulated in cancer cells
resulting in increased activity
Aberrant RNA pol III function may have consequences to the
expression of lncRNAs transcribed by this polymerase.
46
47. EXAMPLE
the lncRNA BC200 is a small cytoplasmic lncRNA in the
neurons of primate nervous systems and human cancers, but
not in non-neuronal organs
Unlike the majority of lncRNAs described thus far, BC200 is
transcribed by RNA pol III and shares unique homology with
human Alu elements
The BC200 RNA has been characterized as a negative
regulator of eIF4A-dependent translation initiation
Due to the fact that many whole transcriptome sequencing
methods were developed to enrich for poly(A) purified
transcripts, RNA pol III transcripts may have been excluded
from analysis
This suggests that other, yet unidentified RNA pol III lncRNAs
over-expressed in cancer may be participating in
tumorigenesis 47
48. LONG INTERGENIC NON-CODING RNAS, ONE
OF LIFE’S BARE ESSENTIALS (2013)
Researchers from the Rinn Lab set out to investigate the
functional relevance of lincRNAs under a number
of physiological conditions, given that they are a key player in
the regulation of gene expression.
The scientists focused on a subgroup of lncRNAs called long
intergenic noncoding RNAs (lincRNAs)
Here’s what they discovered:
Knockout of 3 of the lincRNAs of interest produces a lethal
phenotype, a true TKO (at least in the boxing world).
Several other knockouts also produced severe growth and homeotic
defects.
Further detailed characterization was also done on a particular
lincRNA knockout (linc-Brn1b), which led developmental defects in
the brain. 48
53. The Reference Database For Functional Long Noncoding
RNAs
http://www.lncrnadb.org/
the LncRNADisease database
http://www.cuilab.cn/lncrnadisease
the latest version of this long non-coding RNA database
contains 113,513 human annotated lncRNAs
http://www.lncipedia.org/
http://deepbase.sysu.edu.cn/chipbase/lncrna.php
53