1
Emerging Roles of Long Noncoding RNAs in Breast Cancer Epigenetics and Epitranscriptomics. Front Cell Dev Biol 2022; 10:922351. [PMID: 35865634 PMCID: PMC9294602 DOI: 10.3389/fcell.2022.922351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Accepted: 05/30/2022] [Indexed: 11/13/2022] Open
Abstract
Breast carcinogenesis is a multistep process that involves both genetic and epigenetic changes. Epigenetics refers to reversible changes in gene expression that are not accompanied by changes in gene sequence. In breast cancer (BC), dysregulated epigenetic changes, such as DNA methylation and histone modifications, are accompanied by epitranscriptomic changes, in particular adenosine-to-inosine modifications within RNA molecules. The factors that trigger these phenomena are largely unknown, but there is evidence for widespread participation of long noncoding RNAs (lncRNAs), which have already been linked to virtually every aspect of BC biology, making them promising biomarkers and therapeutic targets in BC patients. Here, we provide a systematic review of known and possible roles of lncRNAs in epigenetic and epitranscriptomic processes, along with methods and tools to study them, followed by a brief overview of current challenges regarding the use of lncRNAs in medical applications.
2
MEF2C shapes the microtranscriptome during differentiation of skeletal muscles. Sci Rep 2021; 11:3476. [PMID: 33568691 PMCID: PMC7875991 DOI: 10.1038/s41598-021-82706-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/22/2020] [Accepted: 01/20/2021] [Indexed: 01/04/2023] Open
Abstract
Myocyte enhancer factor 2C (MEF2C) is a transcription factor that regulates heart and skeletal muscle differentiation and growth. Several protein-encoding genes have been identified as targets of this factor; however, little is known about its contribution to the composition and dynamics of the microtranscriptome in myogenic programs. In this report, we aimed to address this question. Deep sequencing of small RNAs from human muscle cells revealed a set of microRNAs (miRNAs), including several muscle-specific miRNAs, that are sensitive to MEF2C depletion. As expected, in cells with MEF2C knockdown we found mostly downregulated miRNAs; nevertheless, as many as one-third of the altered miRNAs were upregulated. The majority of these changes are driven by transcription efficiency. Moreover, we found that MEF2C affects nontemplated 3′-end nucleotide addition to miRNAs, mainly oligouridylation. The rate of these modifications is associated with the level of TUT4, which mediates RNA 3′-uridylation. Finally, we found that a quarter of the miRNAs that changed significantly upon differentiation of human skeletal myoblasts are inversely altered in MEF2C-deficient cells. We concluded that MEF2C is an essential factor regulating both the quantity and quality of the microtranscriptome, leaving an imprint on the stability and perhaps specificity of many miRNAs during the differentiation of muscle cells.
3
lncEvo: automated identification and conservation study of long noncoding RNAs. BMC Bioinformatics 2021; 22:59. [PMID: 33563213 PMCID: PMC7871587 DOI: 10.1186/s12859-021-03991-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/09/2020] [Accepted: 02/01/2021] [Indexed: 12/04/2022] Open
Abstract
Background: Long noncoding RNAs (lncRNAs) represent a large class of transcripts with two common features: they exceed an arbitrary length threshold of 200 nt and are assumed not to encode proteins. Although a growing body of evidence indicates that the vast majority of lncRNAs are potentially nonfunctional, hundreds of them have already been shown to perform essential gene regulatory functions or to be linked to a number of cellular processes, including those associated with the etiology of human diseases. To better understand the biology of lncRNAs, it is essential to study their evolution in more depth. In contrast to protein-encoding transcripts, however, they do not show the strong sequence conservation that usually results from purifying selection; therefore, software that is typically used to resolve the evolutionary relationships of protein-encoding genes and transcripts is not applicable to the study of lncRNAs. Results: To tackle this issue, we developed lncEvo, a computational pipeline that consists of three modules: (1) transcriptome assembly from RNA-Seq data, (2) prediction of lncRNAs, and (3) a conservation study, that is, a genome-wide comparison of lncRNA transcriptomes between two species of interest, including a search for orthologs. Importantly, one can choose to apply lncEvo solely for transcriptome assembly or lncRNA prediction, without calling the conservation-related part. Conclusions: lncEvo is an all-in-one tool built with the Nextflow framework, utilizing state-of-the-art software and algorithms with customizable trade-offs between speed and sensitivity, ease of use and built-in reporting functionalities. The source code of the pipeline is freely available for academic and nonacademic use under the MIT license at https://gitlab.com/spirit678/lncrna_conservation_nf.
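The two defining features named in the Background (a length above 200 nt and no apparent protein-coding capacity) can be illustrated with a minimal filter. This is only a sketch and not lncEvo's implementation: the `Transcript` class, the `coding_score` field (standing in for the output of some external coding-potential tool) and the 0.5 cutoff are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Arbitrary length threshold mentioned in the abstract (in nucleotides).
MIN_LNCRNA_LENGTH = 200


@dataclass
class Transcript:
    name: str
    sequence: str
    # Hypothetical coding-potential score in [0, 1]; higher means
    # "more likely to encode a protein". Not a value produced by lncEvo.
    coding_score: float


def is_candidate_lncrna(transcript, max_coding_score=0.5):
    """Return True if the transcript exceeds 200 nt and scores as noncoding.

    The 0.5 cutoff is an assumption made for this example only.
    """
    long_enough = len(transcript.sequence) > MIN_LNCRNA_LENGTH
    looks_noncoding = transcript.coding_score < max_coding_score
    return long_enough and looks_noncoding


if __name__ == "__main__":
    transcripts = [
        Transcript("t1", "A" * 350, 0.10),  # long and noncoding
        Transcript("t2", "A" * 150, 0.05),  # too short
        Transcript("t3", "A" * 900, 0.95),  # long but likely coding
    ]
    candidates = [t.name for t in transcripts if is_candidate_lncrna(t)]
    print(candidates)
```

In a real pipeline the coding-potential score would come from a dedicated tool, and the filter would typically run on assembled transcripts rather than on hand-written records.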
4
Abstract
A large portion of the human genome is transcribed into long noncoding RNAs (lncRNAs) that can range from 200 nucleotides to several kilobases in length. The number of identified lncRNAs is still growing, but only a handful of them have been functionally characterized. However, it is known that the functions of lncRNAs are closely related to their subcellular localization. Cytoplasmic lncRNAs can regulate mRNA stability, affect translation and act as miRNA sponges, while nuclear-retained lncRNAs have been reported to be involved in transcriptional control, chromosome scaffolding, modulation of alternative splicing and chromatin remodelling. Through these processes, lncRNAs have diverse regulatory roles in cell biology and diseases. OIP5-AS1 (also known as Cyrano), a poorly characterized lncRNA expressed antisense to the OIP5 oncogene, is deregulated in multiple cancers. We showed that one of the OIP5-AS1 splicing forms (ENST00000501665.2) is retained in the cell nucleus, where it associates with chromatin, thus narrowing down the spectrum of its possible mechanisms of action. Its knockdown with antisense LNA gapmeRs inhibited expression of its sense partner, OIP5, strongly suggesting a functional coupling between OIP5 and ENST00000501665.2. A subsequent bioinformatics analysis followed by RAP-MS and RNA immunoprecipitation experiments suggested its possible mode of action; in particular, we found that ENST00000501665.2 directly binds to a number of nuclear proteins, including SMARCA4, a component of the SWI/SNF chromatin remodelling complex, whose binding motif is located in the promoter of the OIP5 oncogene.
5
SyntDB: defining orthologues of human long noncoding RNAs across primates. Nucleic Acids Res 2020; 48:D238-D245. [PMID: 31728519 PMCID: PMC7145678 DOI: 10.1093/nar/gkz941] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/14/2019] [Revised: 10/04/2019] [Accepted: 11/12/2019] [Indexed: 12/12/2022] Open
Abstract
SyntDB (http://syntdb.amu.edu.pl/) is a collection of data on long noncoding RNAs (lncRNAs) and their evolutionary relationships in twelve primate species, including humans. This is the first database dedicated to primate lncRNAs, thousands of which are uniquely stored in SyntDB. The lncRNAs were predicted with our computational pipeline using publicly available RNA-Seq data spanning diverse tissues and organs. Most of the species included in SyntDB still lack lncRNA annotations in public resources. In addition to providing users with unique sets of lncRNAs and their characteristics, SyntDB provides data on orthology relationships between the lncRNAs of humans and other primates, which are not available on this scale elsewhere. Keeping in mind that only a small fraction of currently known human lncRNAs have been functionally characterized and that lncRNA conservation is frequently used to identify the most relevant lncRNAs for functional studies, we believe that SyntDB will contribute to ongoing research aimed at deciphering the biological roles of lncRNAs.
6
Towards a deeper annotation of human lncRNAs. Biochim Biophys Acta Gene Regul Mech 2019; 1863:194385. [PMID: 31128317 DOI: 10.1016/j.bbagrm.2019.05.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/03/2018] [Revised: 05/13/2019] [Accepted: 05/14/2019] [Indexed: 01/05/2023]
Abstract
A substantial fraction of the human transcriptome is composed of the so-called long noncoding RNAs (lncRNAs), yet the available catalogs of known lncRNAs are far from complete. Moreover, functional studies of these RNAs are challenged by several factors, such as their tissue-specific expression and functional heterogeneity, resulting in only ca. 1% of them being well characterized. Here, we describe a set of 41,400 novel lncRNAs discovered with RNA-Seq data from 1463 samples encompassing diverse tissues and cell lines. We utilized publicly available transcriptomic and genomic data to provide their characteristics, such as tissue specificity, cellular abundance, polyA status, cellular localization, evolutionary conservation and transcript stability, which allowed us to speculate on their possible biological roles. We also pinpointed 24 novel lncRNAs as candidates for breast cancer biomarkers. The results bring us closer to a comprehensive annotation of human lncRNAs, though vast amounts of further work are needed to validate the predictions and fully decipher their biology. This article is part of a Special Issue entitled: ncRNA in control of gene expression edited by Kotb Abdelmohsen.
7
Abstract
Long non-coding RNAs (lncRNAs) are a class of potent regulators of gene expression that are found in a wide array of eukaryotes; however, our knowledge about these molecules in plants is very limited. In particular, a number of plant species with important roles in biotechnology, agriculture and basic research still lack comprehensively identified and annotated sets of lncRNAs. To address these shortcomings, we previously created a database of lncRNAs in 10 model species, called CANTATAdb, and now we are expanding this online resource to encompass 39 species, including three algae. The lncRNAs were identified computationally using publicly available RNA sequencing (RNA-Seq) data. Expression values, coding potential calculations and other types of information were used to provide annotations for the identified lncRNAs. The data are freely available for searching, browsing and downloading from an online database called CANTATAdb 2.0 (http://cantata.amu.edu.pl, http://yeti.amu.edu.pl/CANTATA/).
8
Natural antisense transcripts in diseases: From modes of action to targeted therapies. Wiley Interdiscip Rev RNA 2018; 9:e1461. [PMID: 29341438 PMCID: PMC5838512 DOI: 10.1002/wrna.1461] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 08/24/2017] [Revised: 11/28/2017] [Accepted: 11/29/2017] [Indexed: 12/16/2022]
Abstract
Antisense transcription is a widespread phenomenon in mammalian genomes, leading to the production of RNA molecules referred to as natural antisense transcripts (NATs). NATs apply diverse transcriptional and post-transcriptional regulatory mechanisms to carry out a wide variety of biological roles that are important for the normal functioning of living cells, but their dysfunctions can be associated with human diseases. In this review, we attempt to provide a molecular basis for the involvement of NATs in the etiology of human disorders such as cancers and neurodegenerative and cardiovascular diseases. We also discuss the pros and cons of oligonucleotide-based therapies targeted against NATs, and we comment on state-of-the-art progress in this promising area of clinical research. This article is categorized under: RNA in Disease and Development > RNA in Disease; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs; RNA Interactions with Proteins and Other Molecules > Small Molecule-RNA Interactions.
9
Retroposition as a source of antisense long non-coding RNAs with possible regulatory functions. Acta Biochim Pol 2016; 63:825-833. [PMID: 27801428 DOI: 10.18388/abp.2016_1354] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 06/04/2016] [Revised: 06/30/2016] [Accepted: 07/19/2016] [Indexed: 11/10/2022]
Abstract
Long non-coding RNAs (lncRNAs) are a class of intensely studied, yet enigmatic molecules that make up a substantial portion of the human transcriptome. In this work, we link the origins and functions of some lncRNAs to retroposition, a process resulting in the creation of intronless copies (retrocopies) of the so-called parental genes. We found 35 human retrocopies transcribed in antisense and giving rise to 58 lncRNA transcripts. These lncRNAs share sequence similarity with the corresponding parental genes but in the sense/antisense orientation, meaning they have the potential to interact with each other and to form RNA:RNA duplexes. We took a closer look at these duplexes and found that 10 of the lncRNAs might regulate parental gene expression and processing at the pre-mRNA and mRNA levels. Further analysis of the co-expression and expression correlation provided support for the existence of functional coupling between lncRNAs and their mate parental gene transcripts.
10
lncRNA-RNA Interactions across the Human Transcriptome. PLoS One 2016; 11:e0150353. [PMID: 26930590 PMCID: PMC4773119 DOI: 10.1371/journal.pone.0150353] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 08/11/2015] [Accepted: 02/14/2016] [Indexed: 01/21/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) represent a numerous class of non-protein-coding transcripts longer than 200 nucleotides. It is possible that a fraction of lncRNAs are not functional and represent mere transcriptional noise, but a growing body of evidence shows that they are engaged in a plethora of molecular functions and contribute considerably to the observed diversification of eukaryotic transcriptomes and proteomes. Still, only ca. 1% of lncRNAs have well-established functions, and much remains to be done to decipher their biological roles. One of the least studied aspects of lncRNA biology is their engagement in gene expression regulation through RNA-RNA interactions. By hybridizing with mate RNA molecules, lncRNAs could potentially participate in modulation of pre-mRNA splicing, RNA editing, mRNA stability control, translation activation, or abrogation of miRNA-induced repression. Here, we implemented a similarity-search-based method for transcriptome-wide identification of RNA-RNA interactions, which enabled us to find 18,871,097 lncRNA-RNA base-pairings in humans. Further analyses showed that the interactions could be involved in the processing, stability control and functions of 57,303 transcripts. Extensive use of RNA-Seq data provided support for approximately one-third of the interactions, at least in terms of the two RNA components being co-expressed. The results suggest that lncRNA-RNA interactions are broadly used to regulate and diversify the human transcriptome.
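The general idea behind finding base-pairing candidates with a similarity search can be sketched as follows: a stretch of one RNA that matches the reverse complement of another RNA is, by construction, able to pair with it. The snippet below is a toy illustration under that assumption, not the method used in the study; it considers only perfect Watson-Crick pairs (no G:U wobble, no gaps), and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Watson-Crick complement table for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def longest_complementary_stretch(lncrna, target):
    """Return the longest stretch of `lncrna` that can pair perfectly with `target`.

    The stretch is found as the longest common substring between `lncrna`
    and the reverse complement of `target`.
    """
    target_rc = reverse_complement(target)
    matcher = SequenceMatcher(None, lncrna, target_rc, autojunk=False)
    match = matcher.find_longest_match(0, len(lncrna), 0, len(target_rc))
    return lncrna[match.a:match.a + match.size]


if __name__ == "__main__":
    lncrna = "GGGAUCCAUGGCAAA"
    target = "CCGCCAUGGAUUU"
    print(longest_complementary_stretch(lncrna, target))
```

At transcriptome scale, an indexed similarity-search tool would replace the quadratic longest-common-substring step used here for brevity.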
11
Abstract
RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.
12
HuntMi: an efficient and taxon-specific approach in pre-miRNA identification. BMC Bioinformatics 2013; 14:83. [PMID: 23497112 PMCID: PMC3686668 DOI: 10.1186/1471-2105-14-83] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 07/02/2012] [Accepted: 02/21/2013] [Indexed: 12/16/2022] Open
Abstract
Background: Machine learning techniques are known to be a powerful way of distinguishing microRNA hairpins from pseudo hairpins and have been applied in a number of recognised miRNA search tools. However, many current methods based on machine learning suffer from drawbacks, including a failure to address the class imbalance problem properly. This may lead to overlearning the majority class and/or incorrect assessment of classification performance. Moreover, those tools are effective for a narrow range of species, usually the model ones. This study aims at improving the performance of the miRNA classification procedure, extending its usability and reducing computational time. Results: We present HuntMi, a stand-alone machine learning miRNA classification tool. We developed a novel method of dealing with the class imbalance problem called ROC-select, which is based on thresholding the score function produced by traditional classifiers. We also introduced new features to the data representation. Several classification algorithms in combination with ROC-select were tested, and random forest was selected for the best balance between sensitivity and specificity. Reliable assessment of classification performance is guaranteed by using large, strongly imbalanced, and taxon-specific datasets in a 10-fold cross-validation procedure. As a result, HuntMi achieves considerably better performance than any other miRNA classification tool and can be applied in miRNA search experiments in a wide range of species. Conclusions: Our results indicate that HuntMi represents an effective and flexible tool for identification of new microRNAs in animals, plants and viruses. The ROC-select strategy proves to be superior to other methods of dealing with the class imbalance problem and can possibly be used in other machine learning classification tasks. The HuntMi software as well as the datasets used in the research are freely available at http://lemur.amu.edu.pl/share/HuntMi/.
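Thresholding a classifier's score function, as described in the Results, can be illustrated with a small example. The abstract does not state which criterion ROC-select optimizes, so the snippet below assumes one common choice for imbalanced data: it scans candidate thresholds and keeps the one that maximizes the geometric mean of sensitivity and specificity. The function name and the toy data are hypothetical, and this is not the HuntMi implementation.

```python
from math import sqrt


def pick_threshold(scores, labels):
    """Pick the score threshold maximizing sqrt(sensitivity * specificity).

    `scores` are classifier outputs (higher means "more likely positive"),
    `labels` are 1 for positives and 0 for negatives. A sample is called
    positive when its score is >= the threshold. The geometric-mean
    criterion is an assumption made for this sketch.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_gmean = None, -1.0
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        gmean = sqrt((true_pos / positives) * (true_neg / negatives))
        if gmean > best_gmean:
            best_threshold, best_gmean = threshold, gmean
    return best_threshold, best_gmean


if __name__ == "__main__":
    # Imbalanced toy data: 2 positives, 6 negatives.
    scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.2, 0.1, 0.05]
    labels = [1, 1, 0, 0, 0, 0, 0, 0]
    print(pick_threshold(scores, labels))
```

Choosing the threshold on the ROC curve rather than using a fixed 0.5 cutoff lets the minority class keep reasonable sensitivity without resampling the training data.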
13
14
[MicroRNA databases]. Postepy Biochem 2012; 58:91-99. [PMID: 23214133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
MicroRNAs (miRNAs) are small RNAs that play key roles in the regulation of cellular processes and could therefore contribute substantially to solving many problems in medicine, biotechnology, and other biological sciences. As a result, the number of research projects and publications on miRNAs is constantly growing, accompanied by increasing amounts of new data, and databases need to be created to store them. There are currently 51 dedicated miRNA databases, which makes it quite difficult for users to find relevant data. Moreover, problems such as insufficient documentation, low data quality or flaws in the graphical interface make things even worse. However, there are positive signs, including the standardization of database interfaces, a tendency to create integrated systems that collect data from a number of databases and present them in a uniform format, and the emergence of systems for automated data search and download.
15
Abstract
Despite accumulating data on animal and plant microRNAs and their functions, existing public miRNA resources usually collect miRNAs from a very limited number of species. Many microRNAs, including those from model organisms, remain undiscovered. As a result, there is a continuous need to search for new microRNAs. We present miRNEST (http://mirnest.amu.edu.pl), a comprehensive database of animal, plant and virus microRNAs. The core part of the database is built from our miRNA predictions conducted on Expressed Sequence Tags of 225 animal and 202 plant species. The miRNA search was based on sequence similarity, and as many as 10,004 miRNA candidates in 221 animal and 199 plant species were discovered. Of these, only 299 have already been deposited in miRBase. Additionally, miRNEST has been integrated with external miRNA data from the literature and 13 databases, including miRNA sequences, small RNA sequencing data, expression, polymorphism and target data, as well as links to external miRNA resources, whenever applicable. All this makes miRNEST a considerable miRNA resource in terms of the number of species covered (544), one that integrates scattered miRNA data into a uniform format with a user-friendly web interface.