INTRODUCTION
A surprising revelation from the Human Genome Project was that less than 2% of the human genome is composed of protein-coding genes[1]. Although the remainder was originally thought of as junk DNA, the ENCODE project revealed that 76% of the genome is actively transcribed[2]. Transcription alone does not demonstrate that these molecules are functional, but careful examination has shown that a great deal of this non-coding RNA (ncRNA) is functionally relevant to physiology and disease. One important class of ncRNA is long ncRNA (lncRNA). NcRNA species greater than 200 nucleotides in length are all grouped together as lncRNAs. This arbitrary size limit likely ignores the diversity present among lncRNAs.
The GENCODE consortium (version 18) has annotated 13562 lncRNAs[3]. Around 2/3 of lncRNAs are intergenic [long intergenic ncRNAs (lincRNAs)]; the rest are overlapping, antisense, or intronic to protein-coding genes. Although some lncRNAs have been found to be transcribed by RNA Pol III, the majority of lncRNAs are thought to be transcribed by RNA Pol II and to be polyadenylated, spliced and 5’-capped[4]. LncRNAs show lower evolutionary conservation than protein-coding genes and are less conserved than other ncRNAs, e.g., microRNAs (miRNAs). This may be because lncRNAs have evolved rapidly or because selection pressure maintains only short critical sequences or secondary structures of the lncRNAs.
LncRNAs are functionally very diverse, interacting with ncRNAs, mRNAs, proteins and genomic DNA and acting as tethers, guides, decoys, and scaffolds[5]. Most lncRNAs have been found to be critical regulators of gene expression. LncRNAs have been found to act at nearly every level of gene regulation: epigenetic, transcriptional, posttranscriptional, and translational.
Here, we will discuss lncRNAs in stem cells, where they have been shown to be critical regulators of pluripotency and self-renewal. Next, we will examine the role of lncRNAs in regulating the epigenome. Finally, we will summarize the suspected involvement of lncRNAs in tumorigenesis.
LNCRNAS IN STEM CELLS
One area of biology where lncRNAs are emerging as major players is in stem cell biology. Several studies have found that lncRNAs can regulate pluripotency and differentiation in embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). Furthermore, lncRNAs are also emerging as important regulators of adult stem cells.
LncRNAs in Embryonic Stem Cells
Guttman et al[6] examined lincRNA expression and function in ESCs. They first profiled lincRNA expression in ESCs and then performed knockdown of 147 lincRNAs that were expressed in ESCs (out of 226 lincRNAs they could detect in ESCs). They found that a majority of these lincRNAs (93%) were able to influence gene expression patterns in ESCs. Furthermore, they found that 26 lincRNAs were able to significantly impact expression levels of the critical pluripotency factor Nanog. This suggests that lincRNAs are heavily involved in maintaining pluripotency and preventing the differentiation of ESCs. Next, they examined differentiation of ESCs and identified 13 lincRNAs associated with endoderm differentiation, 7 lincRNAs associated with ectoderm differentiation, 5 with neuroectoderm, 7 with mesoderm, and 2 with trophectoderm differentiation. This indicates that lincRNAs are critical regulators of lineage-specific differentiation of ESCs. Next, they examined whether lincRNAs in ESCs were regulated by pluripotency-associated transcription factors (Oct4, Sox2, Nanog, cMyc, nMyc, KLF4, ZFX, Smad and TCF3). At least one pluripotency factor was associated with the promoter region of 75% of the 226 lincRNAs expressed in ESCs, suggesting that these lincRNAs serve as direct gene targets of pluripotency transcription factors. Finally, they found that 74 (30%) of these lincRNAs were associated with chromatin-modifying proteins known to be important in ESCs. Sheik Mohamed et al[7] also examined the role of lncRNAs in regulating pluripotency of ESCs. They examined high-confidence binding sites of OCT4 and Nanog as determined by paired end-tag sequencing of ChIP-PET DNA. Ten percent of the OCT4 binding sites (105/1083) and 11% of the Nanog binding sites (335/3006) occupied proximal regions of lncRNA genes. This strongly suggests that regulation of lncRNAs is a critical aspect of pluripotency.
Ng et al[8] examined pluripotency-associated lncRNAs by examining expression following differentiation of ESCs to neural progenitor cells (NPCs). Using custom lncRNA microarray technology, they revealed over 934 lncRNAs that were differentially expressed. They identified 36 lncRNAs that were downregulated greater than five-fold in NPCs. Next, they chose 3 candidate lncRNAs for functional characterization: lncRNA ES1, ES2 and ES3. They found that knockdown of any of these 3 lncRNAs resulted in ESC differentiation. They performed RNA immunoprecipitation experiments and found that ES1 and ES2 were associated with SUZ12 [Polycomb Group (PcG)] and the pluripotency factor SOX2 in the nucleus.
These studies reveal that maintenance of pluripotency and lineage-specific differentiation are carefully regulated by lncRNA networks, many of which are direct targets of master pluripotency transcription factors. Additional functional studies of the lncRNAs should shed more light on the mechanism and interacting partners of these lncRNAs in ESCs.
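The binding-site percentages quoted from Sheik Mohamed et al[7] in this section can be reproduced from the raw counts given in the text. The short Python sketch below is only an arithmetic check; the variable and function names are illustrative and do not come from the cited study.

```python
# Counts quoted in the text: (sites proximal to lncRNA genes, total high-confidence sites).
binding_sites = {
    "OCT4": (105, 1083),
    "Nanog": (335, 3006),
}


def percent(part, total):
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total


# Percentage of each factor's binding sites that lie near lncRNA genes.
fractions = {
    factor: percent(near_lncrna, total)
    for factor, (near_lncrna, total) in binding_sites.items()
}

for factor, pct in fractions.items():
    print(f"{factor}: {pct:.1f}% of binding sites are proximal to lncRNA genes")
```

Rounded to whole numbers, these values match the 10% and 11% stated in the text.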
LncRNAs in iPSCs
Loewer et al[9] examined lincRNA expression in iPSCs. They discovered that the lincRNA-ST8SIA3 [or Regulator of Reprogramming (ROR)] was upregulated in iPSCs. They found that pluripotency transcription factors (OCT4, SOX2 and NANOG) directly regulated expression of ROR. Furthermore, they found that ROR knockdown could inhibit reprogramming and iPSC colony formation, whereas overexpression of ROR enhanced iPSC colony formation. Wang et al[10] examined ROR function in ESCs and found that ROR was able to regulate self-renewal and differentiation of ESCs. They also identified a possible mechanism through which ROR may act: serving as a competitive endogenous RNA (sponge) for miR-145, which is known to target OCT4, SOX2 and KLF4 and to regulate ESC differentiation[11].
LncRNAs in adult stem cells
There are several reports of lncRNAs regulating adult stem cell self-renewal and differentiation. Zuo et al[12] examined osteoblast differentiation of mesenchymal stem cells and found that 116 lncRNAs were differentially expressed upon differentiation. Kretz et al[13] found that the lncRNA ANCR was downregulated during differentiation of epidermal cells and that ANCR regulates gene expression to suppress differentiation of epidermal progenitors. Ramos et al[14] found that the lncRNAs Six3os and Dlx1as were regulators of glial-neuronal lineage specification from adult neural stem cells. Finally, Yildirim et al[15] found that the lncRNA Xist was essential for long-term survival of hematopoietic stem cells.
Adipocytes and adipogenesis
Much of the information on lncRNA regulation of adipogenesis comes from the recent discoveries by Sun et al[16], who identified 20 candidate lncRNAs upregulated during adipogenesis using RNA-seq to analyze gene expression of in vitro cultured mouse preadipocytes and brown/white adipocytes, alongside primary isolated mature adipocytes. Using RNAi-based loss-of-function screening and unique score metrics, they found that loss of many of these lncRNAs resulted in either partial or near-complete reversion of mature adipocytes to a precursor phenotype. In another study, Pang et al[17] found that an antisense lncRNA for the PU.1 gene promotes adipogenesis by forming an mRNA/lncRNA duplex with PU.1 mRNA and blocking translation.
EPIGENETIC REGULATION
LncRNA is emerging as a major player in chromatin remodeling and epigenetic regulation of gene expression. Evidence indicates that lncRNAs play important roles in the critical physiological processes of X-chromosome inactivation (XCI) and genomic imprinting. Furthermore, dysregulation of lncRNAs has been implicated in epigenetic reprogramming in human cancer.
XCI
One of the X-chromosomes in female mammalian cells is epigenetically inactivated during early development to ensure balanced expression of X-chromosomal genes, a process known as dosage compensation. The lncRNA X-inactive specific transcript (Xist) was discovered because, surprisingly, it is expressed only from the inactive X-chromosome. Xist was not expressed in male cells and was only shown to be expressed in cells with at least two X-chromosomes. It was revealed that Xist plays a central role in directing XCI by coating one X-chromosome, leading to its epigenetic silencing[18].
XCI initiation begins with one of the X-chromosomes randomly activating Xist expression in early development. Xist then recruits Polycomb repressive complex 2 (PRC2) to the Xist promoter region in the future inactive chromosome. The Xist promoter is then methylated in cis, resulting in further activation of Xist[19]. The transcribed Xist RNA transfers to specific loci in cis across the X-chromosome using a targeting mechanism based on the three-dimensional conformation of the X-chromosome[20]. The attachment of Xist to the future inactive X-chromosome (Xi) is mediated by a transcription factor, YY1, which can bind both DNA and RNA simultaneously through different motifs[21]. Xist finally spreads and coats the Xi, and PRC2 is recruited by interacting with the Repeat A region of Xist[19]. Finally, Xi chromatin is extensively ubiquitylated at histone H2A, resulting in the formation of heterochromatin. This process is suppressed at the active X-chromosome through the action of Tsix, the lncRNA transcribed antisense to Xist, which through its complementary sequence can inhibit Xist expression.
Genomic imprinting
Genomic imprinting is an epigenetic process by which certain autosomal genes express only the maternal or paternal allele. Imprinted genes are usually clustered on chromosomes and these imprinted gene clusters contain both protein-coding genes and lncRNA genes. The Igf2r locus contains three imprinted, maternally expressed protein-coding genes, Igf2r, Slc22a2, Slc22a3, and an imprinted, paternally expressed lncRNA gene, Airn (antisense Igf2r RNA non-coding), which is antisense to Igf2r RNA and required for the silencing of Igf2r, Slc22a2 and Slc22a3 on the paternal chromosome[22].
The gene product of Airn, Air lncRNA, represses genes from the Igf2r locus via multiple different silencing mechanisms. In mouse placenta, the Air RNA recruits the H3K9 histone methyltransferase G9a to the Slc22a3 promoter chromatin, resulting in H3K9 methylation and Slc22a3 transcriptional silencing[23]. In contrast, Air silences the Igf2r gene through a transcriptional interference mechanism, in which Airn transcriptional overlap of the weaker Igf2r promoter results in silencing of the Igf2r gene[24]. The silenced Igf2r promoter is then subject to DNA methylation that, along with Airn expression, maintains Igf2r silencing[25].
LncRNAs may also function as scaffolds, allowing multiple chromatin-modifying complexes to regulate target genes. For example, the lncRNA HOTAIR in the mammalian HOXC locus can bind both PRC2 and LSD1/CoREST/REST complexes, mediating histone H3K27 trimethylation and H3K4 demethylation of target genes[26]. This complex has been shown to target and silence the HOXD gene cluster[27].
LNCRNA IN CANCER
Profiling of normal and tumor tissues has revealed that lncRNAs are dysregulated in many human cancers including prostate[28], colorectal[29], breast[30], bladder[31], liver[32], and brain cancer[33]. They have been found to function as oncogenes and tumor suppressors and regulate many of the hallmarks of cancer.
Tumor Suppressor lncRNAs
The lncRNA Xist was found to be a potent tumor suppressor of hematologic malignancies in vivo. Yildirim et al[15] found that knockdown of Xist in hematopoietic cells in mice resulted in aggressive myeloproliferative neoplasm and myelodysplastic syndrome. They also demonstrated that Xist was critical for hematopoietic stem cell survival. The authors hypothesized that X reactivation via Xist silencing could lead to genomic instability and cancer.
Huarte et al[34] examined lincRNAs regulated by the tumor suppressor p53. They identified lincRNA-p21 as a direct p53 target. Furthermore, they found that lincRNA-p21 is critical in regulating many of the genes that are repressed in response to p53 activity and they found that lincRNA-p21 associates with hnRNP-K. The lincRNA-p21/hnRNP-K interaction was found to be necessary for hnRNP-K genomic localization at sites of gene repression.
Oncogenic lncRNAs
Gupta et al[30] found that lncRNA HOTAIR overexpression was a strong predictor of breast tumor metastasis. Furthermore, they found that HOTAIR overexpression could promote breast tumor cell invasion and that knockdown of HOTAIR could inhibit invasiveness. Prensner et al[35] examined lincRNA expression in prostate tumors and found that the lincRNA PCAT-1 was overexpressed in high-grade and metastatic tumors. They found that PCAT-1 expression promoted proliferation of prostate cancer cells.
Epigenetic reprogramming in cancer
Since lncRNAs play such a large role in epigenetic regulation and chromatin remodeling in normal physiology it is possible that they may play a role in the epigenetic reprogramming that is a hallmark of human cancer. One hundred and seventy ncRNAs including HOTAIR are differentially expressed among normal human breast tissue, primary breast tumors, and metastatic breast tumors[30]. Forced expression of HOTAIR in epithelial cancer cells altered the localization of PRC2 on chromatin. Genome-wide studies revealed PRC2 localization more resembling occupancy in embryonic fibroblasts. This correlated with increased tumor cell invasiveness. Conversely, knockdown of HOTAIR was shown to inhibit cancer cell invasiveness in cells with high levels of PRC2 expression. Similar findings were reported in colorectal cancer[36].
It has also been shown that the tumor suppressor gene p15 is silenced by its natural antisense RNA, the lncRNA ANRIL[37,38]. ANRIL directs PRC2 to the p15 locus, inhibiting p15 expression[38]. Similarly, the expression of the tumor suppressor gene p21 was shown to be epigenetically repressed by its antisense RNA[39]. These reports suggest that tumor suppressor gene silencing may be a result of an imbalance in bidirectional transcription.
Prensner et al[28] found that the lncRNA SChLAP1 was overexpressed in prostate tumors, where it is critical for tumor cell metastasis. They found that SChLAP1 antagonized localization of the SWI/SNF chromatin-remodeling complex and inhibited the tumor-suppressive action of SWI/SNF.
Microvascular invasion and angiogenesis
Angiogenesis is an important hallmark of human cancer. New vasculature is required to provide nutrients and oxygen for tumor growth and proliferation, as well as an avenue for metastatic spread. Although a variety of genes that can stimulate angiogenesis have been discovered, recently there has been a focus on investigating the roles of ncRNA in the angiogenic switch.
Yuan et al[40] discovered that the lncRNA MVIH (long noncoding RNA associated with microvascular invasion in hepatocellular carcinoma) was overexpressed in hepatocellular carcinoma. MVIH overexpression was associated with frequent microvascular invasion and a higher tumor node metastasis stage, as well as a decrease in recurrence-free survival. Further data showed that MVIH could promote tumor-induced angiogenesis by inhibiting the secretion of phosphoglycerate kinase 1. This indicates that dysregulation of a lncRNA can promote tumor growth through angiogenesis.
Angiogenic signaling can also be activated in response to hypoxia. Normal hypoxia can induce gene expression in response to oxygen sensing, accompanied by adaptive responses to the hypoxic environment. However, during tumor growth, hypoxia can alter gene expression to promote angiogenesis, cell proliferation and even metastasis. Ferdin et al[41] found a tight link between long noncoding transcripts from ultraconserved regions, termed transcribed-ultraconserved regions (T-UCRs), and oxygen deprivation. Several T-UCRs were upregulated during hypoxia (several were induced directly by hypoxia-inducible factor); these hypoxia-induced noncoding ultraconserved transcripts (HINCUTs) were also found to be overexpressed in colon cancer patients. The authors also hypothesized that protein modification through addition of O-linked β-N-acetylglucosamine was involved in sensing oxygen tension, as one specific HINCUT lncRNA (HINCUT-1) was part of a retained intron of the host protein-coding gene O-linked N-acetylglucosamine transferase.
Another potential angiogenic lncRNA, maternally expressed gene 3 (MEG3), was found to be silenced in pituitary adenomas. MEG3 knockout mice showed enhanced angiogenic signaling and increased microvascular density in embryonic brains[42]. Since MEG3 stimulates p53 pathways and regulates p53 target genes, it is highly desirable to examine MEG3 expression in other cancers.
Metabolism imbalance
Metabolic imbalance is a key hallmark of human cancer. One mechanism by which lncRNAs may contribute to tumorigenesis is through regulating tumor cell metabolism. However, very little is known regarding the role of lncRNAs in metabolism. One report demonstrated that a particular lncRNA on paternal chromosome 15q11-q13 was critical to maintaining energy balance[43]. The loss of noncoding RNAs at this site results in a genetic disease called Prader-Willi syndrome (PWS). The imprinted PWS locus encodes a long noncoding RNA transcript that is processed into SNORD116 small nucleolar RNAs and the spliced exons of the host gene 116HG. The authors found that 116HG is concentrated subnuclearly with the transcriptional activator RBBP5 at active genes involved in metabolism. 116HG-deficient mice showed dramatic increases in energy expenditure. Although the affected tissues are mainly in the brain and the syndrome is characterized by disorders of neurodevelopment and obesity in children, whether it is also a genetic risk factor for certain cancers is unknown and needs to be examined.
Another well-known lncRNA, ANRIL, is implicated as a risk factor for breast cancer and is overexpressed in prostate cancer[44]. Recently, knockdown of ANRIL was shown to result in downregulation of 3 genes, ADIPOR1, VAMP3 and C11ORF10, all of which have a well-established role in the metabolism of fatty acids and glucose and in inflammation[45].
Extracellular matrix
Extracellular matrix (ECM) is an important component of both normal and malignant tissues. In addition to providing structural support, the ECM supports communication between nearby cells in order to coordinate signaling and tissue function. The role of lncRNA in the ECM during malignancy has only recently been investigated. In one study, Zhuang et al[46] investigated the expression of lncRNA during type I collagen stimulation in a 3-D culture. Type I collagen has been shown to be enriched in the tumor microenvironment and can promote tumor growth. In this study, the lncRNA HOTAIR was upregulated in lung adenocarcinoma cells grown in 3-D cell culture supplemented with collagen. Furthermore, the authors observed loss of acini, a sign of hyperproliferation and poor differentiation. This might be the first attempt to determine the function of a lincRNA at the level of organotypic culture, and it is a very useful strategy for further characterization of lncRNA molecules that function in the tumor microenvironment.
Tumor microenvironment
It is known that the tumor microenvironment plays a role in tumor formation, invasive progression and metastatic dissemination[47]. The tumor microenvironment is composed of mesenchymal stromal cells, adipocytes, fibroblasts, endothelial cells and immune cells. It is abundant in proinflammatory cytokines that are derived from both cancer cells and nearby endothelial and adipocyte-like cells; these secreted cytokines not only stimulate inflammation but also recruit mesenchymal stromal cells and preadipocytes to the tumor. Furthermore, the tumor microenvironment can influence tumor cell metabolism, and metabolic changes can in turn affect inflammatory signaling; this feedback can further alter the tumor microenvironment and promote tumor invasion. LncRNAs have been found to be critical regulators of mesenchymal stem cells[12], endothelial cells[48], adipocytes[16] and immune cells[49,50]. It remains untested what role lncRNAs may have in promoting tumorigenesis through signaling within the tumor microenvironment.
Cancer stem cells
As was discussed earlier, lncRNAs play critical roles in regulating pluripotency in ESCs. The two defining characteristics of stem cells, self-renewal and differentiation capacity, are hijacked by some neoplastic cells in what are often referred to as cancer stem cells. Many of the signaling pathways that are critical in ESCs (OCT4, SOX2, KLF4 and PcG[51-54]) have been found activated in cancer stem cells. Since the regulation of OCT4, SOX2, KLF4 involves feedback loops with lncRNAs[10], and as lncRNAs are known to be dysregulated in cancer, it seems likely that lncRNAs may be involved in regulating stem cell signaling in cancer cells. Examining the functions of lncRNAs in cancer stem cells may reveal new therapeutic targets and mechanisms for overcoming chemoresistance.
Many lncRNAs have been shown to interact with chromatin-modifying complexes. In particular, the PcG is known to regulate pluripotency and self-renewal of ESCs. As polycomb signaling has been implicated in some types of cancer[54], it is likely that lncRNAs may serve to target polycomb complexes to sites in chromatin to silence differentiation programs in cancer stem cells.
It is already known that ncRNAs, specifically miRNAs, play critical roles in regulating cancer stem cells, e.g., the miR-200 family[55], let-7[56] and miR-140[57]. One potential function of lncRNAs is to competitively inhibit miRNAs as a molecular sponge (also referred to as competing endogenous RNAs) (Figure 1). Kallen et al[58] recently discovered that the lncRNA H19 can serve as a sponge for the let-7 family of miRNAs in muscle tissues. There is potential for this or other lncRNAs to regulate miRNA signaling in cancer stem cells, and future studies will provide further evidence of the importance of ncRNA, both lncRNA and miRNA, in regulating cancer stem cell signaling.
Figure 1 Diverse functions of long non-coding RNA in human cancer cells.
Long non-coding RNAs (lncRNAs) are known to be active regulators of gene expression through transcriptional, post-transcriptional, and translational mechanisms. LncRNA dysregulation in cancer may provide oncogenic signaling or loss of tumor suppressive function. LncRNAs are known to serve as tethers and scaffolds that direct chromatin-modifying enzymes such as Polycomb group proteins to regulatory regions of DNA. This may result in upregulated transcription of oncogenic protein-coding genes or loss of transcription of tumor suppressor protein-coding genes. LncRNAs may also serve as competitive endogenous RNAs (sponges) for microRNAs (miRNAs), inhibiting miRNA function and thereby preventing miRNAs from interacting with target mRNAs. Finally, lncRNAs can interact directly or indirectly with mRNAs in the nucleus and cytoplasm, regulating mRNA splicing, mRNA stability, and mRNA translation.