1. Tian H, Tang L, Yang Z, Xiang Y, Min Q, Yin M, You H, Xiao Z, Shen J. Current understanding of functional peptides encoded by lncRNA in cancer. Cancer Cell Int 2024; 24:252. [PMID: 39030557 PMCID: PMC11265036 DOI: 10.1186/s12935-024-03446-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 07/09/2024] [Indexed: 07/21/2024] Open
Abstract
Dysregulated gene expression and imbalanced transcriptional regulation are typical features of cancer, and RNA plays a key role in these processes. The human transcriptome contains many RNAs that are more than 200 nucleotides long and lack a long open reading frame (ORF, > 100 aa). They are usually regarded as long non-coding RNAs (lncRNAs), which play important roles in cancer regulation, including chromatin remodeling, transcriptional regulation, translational regulation, and acting as miRNA sponges. With the advancement of ribosome profiling and sequencing technologies, increasing evidence has revealed that some ORFs in lncRNAs can also encode peptides and participate in the regulation of tumors in multiple organs, which opens a new chapter in lncRNA and oncology research. In this review, we discuss the biological function of lncRNAs in tumors, the current methods to evaluate their coding potential, and the role of functional small peptides encoded by lncRNAs in cancers. Investigating the small peptides encoded by lncRNAs and understanding the regulatory mechanisms of these functional peptides may contribute to a deeper understanding of cancer and the development of new targeted anticancer therapies.
Affiliation(s)
- Hua Tian
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- School of Nursing, Chongqing College of Humanities, Science & Technology, Chongqing, China
- Lu Tang
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- Zihan Yang
- Department of Pathology, The Affiliated Hospital of Southwest Medical University, Luzhou, China, 646000
- Yanxi Xiang
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- Qi Min
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- Mengshuang Yin
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- Huili You
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China
- Zhangang Xiao
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China.
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China.
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China.
- Gulin Traditional Chinese Medicine Hospital, Luzhou, China.
- Department of Pharmacology, School of Pharmacy, Sichuan College of Traditional Chinese Medicine, Mianyang, China.
- Jing Shen
- Laboratory of Molecular Pharmacology, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China.
- Cell Therapy and Cell Drugs of Luzhou Key Laboratory, Luzhou, 646000, China.
- South Sichuan Institute of Translational Medicine, Luzhou, 646000, China.
2. Palos K, Yu L, Railey CE, Nelson Dittrich AC, Nelson ADL. Linking discoveries, mechanisms, and technologies to develop a clearer perspective on plant long noncoding RNAs. The Plant Cell 2023; 35:1762-1786. [PMID: 36738093 PMCID: PMC10226578 DOI: 10.1093/plcell/koad027] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/02/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 05/30/2023]
Abstract
Long noncoding RNAs (lncRNAs) are a large and diverse class of genes in eukaryotic genomes that contribute to a variety of regulatory processes. Functionally characterized lncRNAs play critical roles in plants, ranging from regulating flowering to controlling lateral root formation. However, findings from the past decade have revealed that thousands of lncRNAs are present in plant transcriptomes, and characterization has lagged far behind identification. In this setting, distinguishing function from noise is challenging. However, the plant community has been at the forefront of discovery in lncRNA biology, providing many functional and mechanistic insights that have increased our understanding of this gene class. In this review, we examine the key discoveries and insights made in plant lncRNA biology over the past two and a half decades. We describe how discoveries made in the pregenomics era have informed efforts to identify and functionally characterize lncRNAs in the subsequent decades. We provide an overview of the functional archetypes into which characterized plant lncRNAs fit and speculate on new avenues of research that may uncover yet more archetypes. Finally, this review discusses the challenges facing the field and some exciting new molecular and computational approaches that may help inform lncRNA comparative and functional analyses.
Affiliation(s)
- Kyle Palos
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
- Li’ang Yu
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
- Caylyn E Railey
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
- Plant Biology Graduate Field, Cornell University, Ithaca, NY 14853, USA
3. Petrosino G, Ponte G, Volpe M, Zarrella I, Ansaloni F, Langella C, Di Cristina G, Finaurini S, Russo MT, Basu S, Musacchia F, Ristoratore F, Pavlinic D, Benes V, Ferrante MI, Albertin C, Simakov O, Gustincich S, Fiorito G, Sanges R. Identification of LINE retrotransposons and long non-coding RNAs expressed in the octopus brain. BMC Biol 2022; 20:116. [PMID: 35581640 PMCID: PMC9115989 DOI: 10.1186/s12915-022-01303-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/06/2021] [Accepted: 04/21/2022] [Indexed: 01/07/2023] Open
Abstract
Background: Transposable elements (TEs) widely contribute to the evolution of genomes, allowing genomic innovations, generating germinal and somatic heterogeneity, and giving birth to long non-coding RNAs (lncRNAs). These features have been associated with the evolution, functioning, and complexity of the nervous system to such an extent that somatic retrotransposition of long interspersed element (LINE) L1 has been proposed to be associated with human cognition. Among invertebrates, octopuses are fascinating animals whose nervous system reaches a high level of complexity, achieving sophisticated cognitive abilities. Sequencing of the Octopus bimaculoides genome revealed a striking expansion of TEs, which were proposed to have contributed to the evolution of its complex nervous system. We recently found a similar expansion in the genome of Octopus vulgaris. However, a specific search for the existence and the transcription of full-length, transpositionally competent TEs has not been performed in this genus. Results: Here, we report the identification of LINE elements competent for retrotransposition in Octopus vulgaris and Octopus bimaculoides and show evidence suggesting that they might be transcribed and determine germline and somatic polymorphisms, especially in the brain. Transcription and translation measured for one of these elements resulted in specific signals in neurons belonging to areas associated with behavioral plasticity. We also report the transcription of thousands of lncRNAs and the pervasive inclusion of TE fragments in the transcriptomes of both Octopus species, further testifying to the crucial activity of TEs in the evolution of the octopus genomes. Conclusions: The neural transcriptome of the octopus shows the transcription of thousands of putative lncRNAs and of a full-length LINE element belonging to the RTE class. We speculate that a convergent evolutionary process involving retrotransposon activity in the brain has been important for the evolution of sophisticated cognitive abilities in this genus. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01303-5.
Affiliation(s)
- Giuseppe Petrosino
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy; Institute of Molecular Biology (IMB), Mainz, Germany
- Giovanna Ponte
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Massimiliano Volpe
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy; Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Via Enrico Melen 83, 16152, Genova, Italy; Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
- Ilaria Zarrella
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Federico Ansaloni
- Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Via Enrico Melen 83, 16152, Genova, Italy
- Concetta Langella
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Giulia Di Cristina
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy; Institute of Zoology, University of Cologne, Cologne, Germany
- Sara Finaurini
- Neurobiology Sector, Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136, Trieste, Italy
- Monia T Russo
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Swaraj Basu
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy; Strand Life Sciences, Bengaluru, India
- Francesco Musacchia
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Filomena Ristoratore
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Dinko Pavlinic
- Scientific Core Facilities & Technologies, GeneCore, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117, Heidelberg, Germany; Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland
- Vladimir Benes
- Scientific Core Facilities & Technologies, GeneCore, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117, Heidelberg, Germany
- Maria I Ferrante
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy
- Oleg Simakov
- Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, 9040495, Japan; Department of Molecular Evolution and Development, Wien University, Althanstraße 14 (UZA I), 1090, Wien, Austria
- Stefano Gustincich
- Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Via Enrico Melen 83, 16152, Genova, Italy; Neurobiology Sector, Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136, Trieste, Italy
- Graziano Fiorito
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy.
- Remo Sanges
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, SZN, 80121, Naples, Italy; Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Via Enrico Melen 83, 16152, Genova, Italy; Neurobiology Sector, Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136, Trieste, Italy.
4. Micropeptides translated from putative long non-coding RNAs. Acta Biochim Biophys Sin (Shanghai) 2022; 54:292-300. [PMID: 35538037 PMCID: PMC9827906 DOI: 10.3724/abbs.2022010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) transcribed in mammals and other eukaryotes were thought to have no protein-coding capability. However, recent studies have suggested that many lncRNAs are mis-annotated and actually contain coding sequences that are translated into functional peptides by the ribosomal machinery; these functional peptides are called micropeptides or small peptides. Here we review the rapidly advancing field of micropeptides translated from putative lncRNAs, describe the strategies for their identification, and elucidate their critical roles in many fundamental biological processes. We also discuss the prospects of micropeptide research and the potential applications of micropeptides.
5. Klapproth C, Sen R, Stadler PF, Findeiß S, Fallmann J. Common Features in lncRNA Annotation and Classification: A Survey. Noncoding RNA 2021; 7:77. [PMID: 34940758 PMCID: PMC8708962 DOI: 10.3390/ncrna7040077] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/12/2021] [Revised: 12/03/2021] [Accepted: 12/06/2021] [Indexed: 12/29/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are widely recognized as important regulators of gene expression. Their molecular functions range from miRNA sponging to chromatin-associated mechanisms, leading to effects in disease progression and establishing them as diagnostic and therapeutic targets. Still, only a few representatives of this diverse class of RNAs are well studied, while the vast majority is poorly described beyond the existence of their transcripts. In this review we survey common in silico approaches for lncRNA annotation. We focus on the well-established sets of features used for classification and discuss their specific advantages and weaknesses. While the available tools perform very well for the task of distinguishing coding sequence from other RNAs, we find that current methods are not well suited to distinguish lncRNAs or parts thereof from other non-protein-coding input sequences. We conclude that the distinction of lncRNAs from intronic sequences and untranslated regions of coding mRNAs remains a pressing research gap.
Affiliation(s)
- Christopher Klapproth
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Rituparno Sen
- Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz-Center for Infection Research (HZI), D-97080 Würzburg, Germany
- Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, University Leipzig, D-04103 Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá CO-111321, Colombia
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Sven Findeiß
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Jörg Fallmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
6. Xie D, Tong M, Xia B, Feng G, Wang L, Li A, Luo G, Wan H, Zhang Z, Zhang H, Yang YG, Zhou Q, Wang M, Wang XJ. Long noncoding RNA lnc-NAP sponges mmu-miR-139-5p to modulate Nanog functions in mouse ESCs and embryos. RNA Biol 2021; 18:875-887. [PMID: 32991228 PMCID: PMC8081037 DOI: 10.1080/15476286.2020.1827591] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/18/2020] [Revised: 08/30/2020] [Accepted: 09/20/2020] [Indexed: 12/30/2022] Open
Abstract
The pluripotency of embryonic stem cells (ESCs) is controlled by a multilayer regulatory network whose key factors include the core pluripotency genes Oct4, Sox2 and Nanog, and multiple microRNAs (miRNAs). Recently, long noncoding RNAs (lncRNAs) have been discovered as a new class of regulators in ESCs, and some lncRNAs can function as competing endogenous RNAs (ceRNAs) that regulate mRNAs by competitively binding to miRNAs. Here, we identify mmu-miR-139-5p as a new regulator of Nanog that targets the Nanog 3' untranslated region (UTR) to repress Nanog expression in mouse ESCs and embryos. This repression can be relieved by an ESC-specific ceRNA named lnc-NAP. The expression of lnc-NAP is activated by OCT4, SOX2, and NANOG through promoter binding. Downregulation of lnc-NAP reduces Nanog abundance, which leads to decreased pluripotency of mouse ESCs and embryonic lethality. These results reveal lnc-NAP as a new regulator of Nanog in mouse ESCs and uncover a feed-forward regulatory loop of Nanog involving lnc-NAP.
MESH Headings
- 3' Untranslated Regions/genetics
- Animals
- Cell Differentiation/genetics
- Embryo, Mammalian/cytology
- Embryo, Mammalian/embryology
- Embryo, Mammalian/metabolism
- Embryonic Stem Cells/cytology
- Embryonic Stem Cells/metabolism
- Gene Expression Regulation, Developmental
- Mice, Inbred C57BL
- Mice, Inbred DBA
- Mice, Inbred NOD
- Mice, SCID
- MicroRNAs/genetics
- Nanog Homeobox Protein/genetics
- Nanog Homeobox Protein/metabolism
- Octamer Transcription Factor-3/genetics
- Octamer Transcription Factor-3/metabolism
- Promoter Regions, Genetic/genetics
- Protein Binding
- RNA, Long Noncoding/genetics
- RNA-Seq/methods
- Reverse Transcriptase Polymerase Chain Reaction/methods
- SOXB1 Transcription Factors/genetics
- SOXB1 Transcription Factors/metabolism
- Mice
Affiliation(s)
- Dongfang Xie
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Man Tong
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Baolong Xia
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Guihai Feng
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Leyun Wang
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Ang Li
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Genomic and Precision Medicine, Collaborative Innovation Center of Genetics and Development, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Guanzheng Luo
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Haifeng Wan
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Zeyu Zhang
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Hao Zhang
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Yun-Gui Yang
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Genomic and Precision Medicine, Collaborative Innovation Center of Genetics and Development, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Qi Zhou
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Meng Wang
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- Xiu-Jie Wang
- Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
7. Li Y, Wang L. RNA Coding Potential Prediction Using Alignment-Free Logistic Regression Model. Methods Mol Biol 2021; 2254:27-39. [PMID: 33326068 DOI: 10.1007/978-1-0716-1158-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
Abstract
CPAT (Coding-Potential Assessment Tool) is a logistic regression model-based classifier that can accurately and quickly distinguish protein-coding from noncoding RNAs using purely linguistic features calculated from the RNA sequences. CPAT takes as input the nucleotide sequences or genomic coordinates of RNAs and outputs probabilities p (0 ≤ p ≤ 1), which measure the likelihood of protein coding. Users can run CPAT online (http://lilab.research.bcm.edu/cpat/) or on a local computer after installation. CPAT provides prebuilt logistic models to recognize RNAs originating from the human (Homo sapiens), mouse (Mus musculus), zebrafish (Danio rerio), and fly (Drosophila melanogaster) genomes. Instructions on how to train models for other genomes are given on the CPAT website (http://rna-cpat.sourceforge.net/) and in this chapter.
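As a rough illustration of the kind of alignment-free classifier this abstract describes, the following Python sketch scores a sequence by passing sequence-derived features through a logistic function to obtain a probability between 0 and 1. It is not CPAT's code, model, or API: the two features used here (length of the longest forward-strand ORF and the fraction of the transcript it covers), the function names, and all coefficients are hypothetical placeholders chosen only to show the shape of the computation.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt, stop codon included) of the longest ATG..stop ORF
    found in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def coding_probability(seq: str,
                       intercept: float = -3.0,
                       w_orf_len: float = 0.01,
                       w_coverage: float = 2.0) -> float:
    """Logistic score in [0, 1]. The coefficients are hypothetical
    placeholders, not values from any trained model."""
    orf_len = longest_orf_length(seq)
    coverage = orf_len / len(seq) if seq else 0.0
    z = intercept + w_orf_len * orf_len + w_coverage * coverage
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Toy sequences: one with a short ORF, one with none.
    for s in ("CCAUGAAAUAGCC", "CCCCCC"):
        print(s, longest_orf_length(s), round(coding_probability(s), 3))
```

In a real setting the coefficients would be fitted on labeled coding and noncoding transcripts, and a probability cutoff would be chosen per species; both steps are outside this sketch.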
Affiliation(s)
- Ying Li
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN, USA
- Liguo Wang
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN, USA.
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN, USA.
8. Qiao X, Liu J, Zhu L, Song R, Zhong M, Guo Y. Long noncoding RNA CEBPA-DT promotes cisplatin chemo-resistance through CEBPA/BCL2 mediated apoptosis in oral squamous cellular cancer. Int J Med Sci 2021; 18:3728-3737. [PMID: 34790046 PMCID: PMC8579301 DOI: 10.7150/ijms.64253] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/25/2021] [Accepted: 09/20/2021] [Indexed: 12/13/2022] Open
Abstract
Intrinsic or acquired resistance to chemotherapy drugs, including cisplatin (CDDP), remains the major limitation on therapeutic efficacy in cancers. Recently, increasing evidence has suggested that long noncoding RNAs (lncRNAs) play a critical role in various biological processes of tumors and have been implicated in resistance to various drugs. However, the role of lncRNAs in cisplatin resistance is poorly understood. Here, we found that the expression of lncRNA CEBPA-DT/CEBPA/BCL2 was upregulated in cisplatin-resistant OSCC cells (Cal27-CisR and HSC4-CisR) compared with their parental cells (Cal27 and HSC4). CEBPA-DT overexpression could upregulate both cytoplasmic and nuclear CEBPA expression. Downregulation of CEBPA-DT enhances cisplatin sensitivity and facilitates cell apoptosis in cisplatin-resistant OSCC cells. In addition, we identified that CEBPA-DT regulates cisplatin chemosensitivity through CEBPA/BCL2-mediated cell apoptosis. Knockdown of CEBPA and BCL2 could alleviate the increase in cisplatin resistance induced by CEBPA-DT overexpression. Our findings indicate that downregulation of lncRNA CEBPA-DT may be a potential therapy to overcome cisplatin resistance in OSCC.
Affiliation(s)
- Xue Qiao
- Department of Central Laboratory, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease; Department of Oral Biology, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease
- Jiayi Liu
- Department of Oral Pathology, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease
- Li Zhu
- Department of Central Laboratory, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease; Department of Oral Biology, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease
- Rongbo Song
- Department of Central Laboratory, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease; Department of Oral Biology, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease
- Ming Zhong
- Department of Central Laboratory, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease; Department of Stomatology, Xiang'an Hospital of Xiamen University, Xiamen, China
- Yan Guo
- Department of Central Laboratory, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease; Department of Oral Biology, School and Hospital of Stomatology, China Medical University, Liaoning Province Key Laboratory of Oral Disease
9. Li J, Zhang X, Liu C. The computational approaches of lncRNA identification based on coding potential: Status quo and challenges. Comput Struct Biotechnol J 2020; 18:3666-3677. [PMID: 33304463 PMCID: PMC7710504 DOI: 10.1016/j.csbj.2020.11.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/30/2020] [Revised: 11/15/2020] [Accepted: 11/16/2020] [Indexed: 12/13/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) make up a large proportion of the transcriptome in eukaryotes and have been shown to have many regulatory functions in various biological processes. When studying lncRNAs, the first step is to distinguish them accurately and specifically within large transcriptome datasets of complicated composition, which contain mRNAs, lncRNAs, small RNAs and their primary transcripts. Given such huge and progressively expanding transcriptome data, in silico approaches based on machine learning and probability statistics provide a practicable scheme for effectively and rapidly filtering out lncRNA targets. In this review, we mainly discuss the characteristics of the algorithms and features of currently developed approaches. We also outline the traits of some state-of-the-art tools for ease of operation. Finally, we point out the underlying challenges in lncRNA identification with the advent of new experimental data.
Affiliation(s)
- Jing Li
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Xuan Zhang
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Changning Liu
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- The Innovative Academy of Seed Design, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
10. Zhou B, Yang H, Yang C, Bao YL, Yang SM, Liu J, Xiao YF. Translation of noncoding RNAs and cancer. Cancer Lett 2020; 497:89-99. [PMID: 33038492 DOI: 10.1016/j.canlet.2020.10.002] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 04/24/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 02/07/2023]
Abstract
The human genome contains thousands of noncoding RNAs (ncRNAs), which are thought to lack open reading frames (ORFs) and to be untranslatable. Some ncRNAs reportedly have important functions, including epigenetic regulation, chromatin remodeling, protein modification, and RNA degradation, but the functions of most ncRNAs remain elusive. Through the application and development of ribosome profiling and sequencing technologies, an increasing number of studies have discovered the translation of ncRNAs. Although ncRNAs were initially defined as noncoding, a number of them actually contain ORFs that are translated into peptides. Here, we summarize the available methods, tools, and databases for identifying and validating ncRNA-encoded peptides/proteins, and we compile and synthesize recent findings regarding ncRNA-encoded small peptides/proteins in cancer. Importantly, ncRNA-encoded peptides/proteins have application prospects in cancer research, but some potential challenges remain unresolved. The aim of this review is to provide a theoretical basis that might promote the discovery of more peptides/proteins encoded by ncRNAs and aid the further development of novel diagnostic and prognostic cancer markers and therapeutic targets.
Collapse
Affiliation(s)
- Bo Zhou
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Huan Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Chuan Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Yu-Lu Bao
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Shi-Ming Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Jiao Liu
- Department of Endoscope, General Hospital of Northern Theater Command, Shenyang, 110016, Liaoning, China.
| | - Yu-Feng Xiao
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China.
| |
|
11
|
Choi SW, Kim HW, Nam JW. The small peptide world in long noncoding RNAs. Brief Bioinform 2019; 20:1853-1864. [PMID: 30010717 PMCID: PMC6917221 DOI: 10.1093/bib/bby055] [Citation(s) in RCA: 173] [Impact Index Per Article: 43.3] [Received: 04/09/2018] [Revised: 05/08/2018] [Indexed: 02/07/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) are a group of transcripts that are longer than 200 nucleotides (nt) without coding potential. Over the past decade, tens of thousands of novel lncRNAs have been annotated in animal and plant genomes because of advanced high-throughput RNA sequencing technologies and with the aid of coding transcript classifiers. Further, a considerable number of reports have revealed the existence of stable, functional small peptides (also known as micropeptides), translated from lncRNAs. In this review, we discuss the methods of lncRNA classification, the investigations regarding their coding potential and the functional significance of the peptides they encode.
Affiliation(s)
- Seo-Won Choi
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| | - Hyun-Woo Kim
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| | - Jin-Wu Nam
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| |
|
12
|
Guo TY, Huang L, Yao W, Du X, Li QQ, Ma ML, Li QF, Liu HL, Zhang JB, Pan ZX. The potential biological functions of circular RNAs during the initiation of atresia in pig follicles. Domest Anim Endocrinol 2020; 72:106401. [PMID: 32278256 DOI: 10.1016/j.domaniend.2019.106401] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/02/2019] [Revised: 09/18/2019] [Accepted: 09/29/2019] [Indexed: 11/18/2022]
Abstract
The specific expression profile and function of circular RNAs (circRNAs) in mammalian ovarian follicles, especially during the atresia process, are unclear. In this study, genome-wide deep circRNA sequencing was applied to screen circRNAs in healthy and early atretic antral follicles in pig ovaries. A total of 40,567 distinct circRNAs were identified in follicles, among which 197 circRNAs (108 upregulated and 89 downregulated) were significantly shifted during the early atresia process. Most differentially expressed circRNAs (DECs) lacked protein-coding potential. Annotation analysis of the DECs revealed 162 known host genes, or noncoding RNAs, and 10 intergenic regions. The key pathways in which these host genes are involved include the focal adhesion-PI3K-Akt-mTOR signaling pathway, vascular endothelial growth factor A (VEGFA)-vascular endothelial growth factor receptor 2 signaling pathway and transforming growth factor-beta signaling pathway. Further comparison analysis between host genes of DECs and the differentially expressed linear messenger RNA transcripts revealed the cotranscription of circRNAs and their linear mRNAs in inhibin beta units (INHBA and INHBB), glutathione S-transferase (GSTA1), and VEGFA. In addition, we predicted 196 pairs of potential circRNA-micro RNA (miRNA) interactions among 77 DECs and 101 porcine miRNAs. We have identified 16 functional miRNAs by comparing the 101 miRNAs to the functional miRNAs reported in mammal ovarian follicle atresia and granulosa cell apoptosis studies. Our study adds new knowledge to circRNA distribution profiles in pig ovarian follicles, offers a valuable reference for transcriptomic profiles in the initiation of follicular atresia, highlights warranted circRNAs for further functional investigation, and provides possible biomarkers for ovarian dysfunctions.
Affiliation(s)
- T Y Guo
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - L Huang
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - W Yao
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - X Du
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - Q Q Li
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - M L Ma
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - Q F Li
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - H L Liu
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - J B Zhang
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095
| | - Z X Pan
- College of Animal Science and Technology, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095; National Experimental Teaching Demonstration Center of Animal Science, Nanjing Agriculture University, Nanjing, Jiangsu, P. R. China 210095.
| |
|
13
|
Fan T, Zhang Q, Hu Y, Wang Z, Huang Y. Genome-wide identification of lncRNAs during hickory (Carya cathayensis) flowering. Funct Integr Genomics 2020; 20:591-607. [PMID: 32215772 DOI: 10.1007/s10142-020-00737-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/22/2019] [Revised: 02/04/2020] [Accepted: 02/26/2020] [Indexed: 12/14/2022]
Abstract
Non-coding RNAs with lengths greater than 200 bp are known as long non-coding RNAs (lncRNAs), and these RNAs play an important role in gene regulation and plant development. However, to date, little is known regarding the role played by lncRNAs during flowering in hickory (Carya cathayensis). Here, we performed whole transcriptome RNA-sequencing of samples from hickory female and male floral buds, in which three samples (H0311PF, H0318PF, and H0402PF) represent pre-flowering, flowering, and post-flowering, respectively, while eight male samples were collected from May 8th to June 13th, as this time course is the key stage for male floral bud differentiation. We identified 2163 lncRNAs in hickory during flowering, comprising 213 intronic, 1488 intergenic, and 462 antisense lncRNAs. We found that 510 and 648 lncRNAs were differentially expressed in female and male floral buds, respectively, and some of the lncRNAs were expressed in a tightly tissue-specific or stage-specific manner. To further understand the roles of the lncRNAs, we predicted the function of the lncRNAs in cis- and trans-acting modes. The results showed that 924 lncRNAs were cis-correlated with 1536 protein-coding genes, while 1207 lncRNAs were co-expressed (trans-acting) with 7432 protein-coding genes (R > 0.95 or R < -0.95). These lncRNAs were all enriched in flower development-associated biological processes, e.g., circadian rhythm, vernalization response, response to gibberellin, inflorescence development, and floral organ development. To further understand the relationships between lncRNAs and floral-core genes, we built a co-expression lncRNA-mRNA flowering network. We classified these floral genes into different pathways (photoperiod, vernalization, gibberellin, autonomous, and sucrose pathway) according to their particular functions. We found a set of lncRNAs that were preferentially expressed in these pathways. The network showed that some lncRNAs (i.e., XLOC_038669, XLOC_017938) functioned in a particular flowering time pathway, while others (i.e., XLOC_011251, XLOC_04110) were involved in multiple pathways. Furthermore, some lncRNAs (i.e., XLOC_038669, XLOC_009597, and XLOC_049539) played roles in single or multiple pathways via interaction with each other. This study provides a genome-wide survey of hickory flower-related lncRNAs and will contribute to further understanding of the molecular mechanism underpinning flowering in hickory.
Affiliation(s)
- Tongqiang Fan
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, Hangzhou, 311300, People's Republic of China
| | - Qixiang Zhang
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, Hangzhou, 311300, People's Republic of China
| | - Yuanyuan Hu
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, Hangzhou, 311300, People's Republic of China
| | - Zhengjia Wang
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, Hangzhou, 311300, People's Republic of China.
| | - Youjun Huang
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, Hangzhou, 311300, People's Republic of China.
| |
|
14
|
Floriano JF, Willis G, Catapano F, de Lima PR, Reis FVDS, Barbosa AMP, Rudge MVC, Emanueli C. Exosomes Could Offer New Options to Combat the Long-Term Complications Inflicted by Gestational Diabetes Mellitus. Cells 2020; 9:E675. [PMID: 32164322 PMCID: PMC7140615 DOI: 10.3390/cells9030675] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/04/2020] [Revised: 02/20/2020] [Accepted: 02/29/2020] [Indexed: 02/08/2023] Open
Abstract
Gestational diabetes mellitus (GDM) is a complex clinical condition that promotes pelvic floor myopathy, thus predisposing sufferers to urinary incontinence (UI). GDM usually regresses after birth. Nonetheless, a GDM history is associated with higher risk of subsequently developing type 2 diabetes, cardiovascular diseases (CVD) and UI. Some aspects of the pathophysiology of GDM remain unclear and the associated pathologies (outcomes) are poorly addressed, simultaneously raising public health costs and diminishing women's quality of life. Exosomes are small extracellular vesicles produced and actively secreted by cells as part of their intercellular communication system. Exosomes are heterogeneous in their cargo and, depending on the cell sources and environment, they can mediate both pathogenetic and therapeutic functions. With the advancement in knowledge of exosomes, new perspectives have emerged to support the mechanistic understanding, prediction/diagnosis and, ultimately, treatment of the post-GDM outcomes. Here, we will review recent advances in knowledge of the role of exosomes in GDM and related areas and discuss the possibilities for translating exosomes as therapeutic agents in the GDM clinical setting.
Affiliation(s)
- Juliana Ferreira Floriano
- Botucatu Medical School, Sao Paulo State University, 18618687 Botucatu, Brazil; (J.F.F.); (P.R.d.L.); (F.V.D.S.R.); (A.M.P.B.)
| | - Gareth Willis
- Division of Newborn Medicine/Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA;
| | - Francesco Catapano
- National Heart and Lung Institute, Imperial College London, London W12 0NN, UK;
| | - Patrícia Rodrigues de Lima
- Botucatu Medical School, Sao Paulo State University, 18618687 Botucatu, Brazil; (J.F.F.); (P.R.d.L.); (F.V.D.S.R.); (A.M.P.B.)
| | | | - Angélica Mercia Pascon Barbosa
- Botucatu Medical School, Sao Paulo State University, 18618687 Botucatu, Brazil; (J.F.F.); (P.R.d.L.); (F.V.D.S.R.); (A.M.P.B.)
| | - Marilza Vieira Cunha Rudge
- Botucatu Medical School, Sao Paulo State University, 18618687 Botucatu, Brazil; (J.F.F.); (P.R.d.L.); (F.V.D.S.R.); (A.M.P.B.)
| | - Costanza Emanueli
- National Heart and Lung Institute, Imperial College London, London W12 0NN, UK;
| |
|
15
|
Martone J, Mariani D, Desideri F, Ballarino M. Non-coding RNAs Shaping Muscle. Front Cell Dev Biol 2020; 7:394. [PMID: 32117954 PMCID: PMC7019099 DOI: 10.3389/fcell.2019.00394] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/29/2019] [Accepted: 12/26/2019] [Indexed: 12/19/2022] Open
Abstract
In 1957, Francis Crick speculated that RNA, beyond its protein-coding capacity, could have its own function. Decade after decade, this theory was dramatically boosted by the discovery of new classes of non-coding RNAs (ncRNAs), including long ncRNAs (lncRNAs) and circular RNAs (circRNAs), which play a fundamental role in the fine spatio-temporal control of multiple layers of gene expression. Recently, many of these molecules have been identified in a plethora of different tissues, and they have emerged to be more cell-type specific than protein-coding genes. These findings shed light on how ncRNAs are involved in the precise tuning of gene regulatory mechanisms governing tissue homeostasis. In this review, we discuss the recent findings on the mechanisms used by lncRNAs and circRNAs to sustain skeletal and cardiac muscle formation, paying particular attention to the technological developments that, over the last few years, have aided their genome-wide identification and study. Together with lncRNAs and circRNAs, the emerging contribution of Piwi-interacting RNAs and transfer RNA-derived fragments to myogenesis will also be discussed, with a glimpse of the impact of their dysregulation in muscle disorders, such as myopathies, muscle atrophy, and rhabdomyosarcoma degeneration.
Affiliation(s)
- Julie Martone
- Department of Biology and Biotechnology Charles Darwin, Sapienza University of Rome, Rome, Italy
| | - Davide Mariani
- Center for Human Technologies, Italian Institute of Technology, Genoa, Italy
| | - Fabio Desideri
- Department of Biology and Biotechnology Charles Darwin, Sapienza University of Rome, Rome, Italy
| | - Monica Ballarino
- Department of Biology and Biotechnology Charles Darwin, Sapienza University of Rome, Rome, Italy
| |
|
16
|
Zhou B, Yang Y, Zhan J, Dou X, Wang J, Zhou Y. Predicting functional long non-coding RNAs validated by low throughput experiments. RNA Biol 2019; 16:1555-1564. [PMID: 31345106 PMCID: PMC6779387 DOI: 10.1080/15476286.2019.1644590] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/08/2018] [Revised: 06/17/2019] [Accepted: 07/10/2019] [Indexed: 01/05/2023] Open
Abstract
High-throughput techniques have uncovered hundreds and thousands of long non-coding RNAs (lncRNAs). Among them, only a tiny fraction has experimentally validated functions (EVlncRNAs) by low-throughput methods. What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional is an active subject of debate. Here, we developed the first method to distinguish EVlncRNAs from HTlncRNAs and mRNAs by using Support Vector Machines and found that EVlncRNAs can be well separated from HTlncRNAs and mRNAs with 0.6 for Matthews correlation coefficient, 64% for sensitivity, and 81% for precision for the independent human test set. The most useful features for classification are related to sequence conservations at RNA (for separating from HTlncRNAs) and protein (for separating from mRNA) levels. The method is found to be robust as the human-RNA-trained model is applicable to independent mouse RNAs with similar accuracy and to a lesser extent to plant RNAs. The method can recover newly discovered EVlncRNAs with high sensitivity. Its application to randomly selected 2000 human HTlncRNAs indicates that the majority of HTlncRNAs is probably non-functional but a large portion (nearly 30%) are likely functional. In other words, there is an ample number of lncRNAs whose specific biological roles are yet to be discovered. The method developed here is expected to speed up and reduce the cost of the discovery by prioritizing potentially functional lncRNAs prior to experimental validation. EVlncRNA-pred is available as a web server at http://biophy.dzu.edu.cn/lncrnapred/index.html . All datasets used in this study can be obtained from the same website.
Affiliation(s)
- Bailing Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
| | - Yuedong Yang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
| | - Jian Zhan
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
| | - Xianghua Dou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
| | - Jihua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
| | - Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
| |
|
17
|
Rai MI, Alam M, Lightfoot DA, Gurha P, Afzal AJ. Classification and experimental identification of plant long non-coding RNAs. Genomics 2019; 111:997-1005. [DOI: 10.1016/j.ygeno.2018.04.014] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 02/08/2018] [Revised: 04/13/2018] [Accepted: 04/17/2018] [Indexed: 02/07/2023]
|
18
|
Long non-coding RNA: Classification, biogenesis and functions in blood cells. Mol Immunol 2019; 112:82-92. [DOI: 10.1016/j.molimm.2019.04.011] [Citation(s) in RCA: 199] [Impact Index Per Article: 39.8] [Received: 01/24/2019] [Revised: 04/16/2019] [Accepted: 04/23/2019] [Indexed: 12/20/2022]
|
19
|
Antonov IV, Mazurov E, Borodovsky M, Medvedeva YA. Prediction of lncRNAs and their interactions with nucleic acids: benchmarking bioinformatics tools. Brief Bioinform 2019; 20:551-564. [PMID: 29697742 DOI: 10.1093/bib/bby032] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 11/04/2017] [Revised: 03/26/2018] [Indexed: 01/22/2023] Open
Abstract
The genomes of mammalian species are pervasively transcribed, producing as many noncoding as protein-coding RNAs. There is a growing body of evidence supporting their functional role. Long noncoding RNA (lncRNA) can bind both nucleic acids and proteins through several mechanisms. A reliable computational prediction of the most probable mechanism of lncRNA interaction can facilitate experimental validation of its function. In this study, we benchmarked computational tools capable of discriminating lncRNA from mRNA and predicting lncRNA interactions with other nucleic acids. We assessed the performance of 9 tools for distinguishing protein-coding from noncoding RNAs, as well as 19 tools for prediction of RNA-RNA and RNA-DNA interactions. Our conclusions about the considered tools were based on their performances on the entire genome/transcriptome level, as it is the most common task nowadays. We found that FEELnc and CPAT distinguish between coding and noncoding mammalian transcripts in the most accurate manner. ASSA, RIBlast and LASTAL, as well as Triplexator, turned out to be the best predictors of RNA-RNA and RNA-DNA interactions, respectively. We showed that the normalization of the predicted interaction strength to the transcript length and GC content may improve the accuracy of inferring RNA interactions. Yet, all the current tools have difficulty making accurate predictions of short trans RNA-RNA interactions (stretches of sparse contacts). Overall, there is still room for improvement in each category, especially for predictions of RNA interactions.
Affiliation(s)
- Ivan V Antonov
- Institute of Bioengineering, Research Center of Biotechnology, Russian Academy of Science, Moscow, Russian Federation.,Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation
| | | | - Mark Borodovsky
- Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation
| | - Yulia A Medvedeva
- Institute of Bioengineering, Research Center of Biotechnology, Russian Academy of Science, Moscow, Russian Federation.,Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation.,Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Science, Moscow, Russian Federation
| |
|
20
|
Kang YJ, Yang DC, Kong L, Hou M, Meng YQ, Wei L, Gao G. CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res 2017; 45:W12-W16. [PMID: 28521017 PMCID: PMC5793834 DOI: 10.1093/nar/gkx428] [Citation(s) in RCA: 790] [Impact Index Per Article: 158.0] [Received: 03/01/2017] [Accepted: 05/03/2017] [Indexed: 12/19/2022] Open
Abstract
With advances in next-generation sequencing technologies, numerous novel transcripts in a large number of organisms have been identified. With the goal of fast, accurate assessment of the coding ability of RNA transcripts, we upgraded the coding potential calculator CPC1 to CPC2. CPC2 runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts. Moreover, the model of CPC2 is species-neutral, making it feasible for ever-growing non-model organism transcriptomes. A mobile-friendly web server, as well as a downloadable standalone package, is freely available at http://cpc2.cbi.pku.edu.cn.
Affiliation(s)
- Yu-Jian Kang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - De-Chang Yang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Lei Kong
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Mei Hou
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Yu-Qi Meng
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Liping Wei
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Ge Gao
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| |
|
21
|
Identification and Expression Analysis of Long Noncoding RNAs in Fat-Tail of Sheep Breeds. G3-GENES GENOMES GENETICS 2019; 9:1263-1276. [PMID: 30787031 PMCID: PMC6469412 DOI: 10.1534/g3.118.201014] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/11/2022]
Abstract
Emerging evidence suggests that long non-coding RNAs (lncRNAs) participate in the regulation of a diverse range of biological processes. However, most studies have been focused on a few established model organisms and little is known about lncRNAs in fat-tail development in sheep. Here, the first profile of lncRNAs in sheep fat-tail, along with their possible roles in fat deposition, was investigated, based on a comparative transcriptome analysis between fat-tailed (Lori-Bakhtiari) and thin-tailed (Zel) Iranian sheep breeds. Among all identified lncRNA candidates, 358 and 66 transcripts were considered novel intergenic (lincRNAs) and novel intronic (ilncRNAs), corresponding to 302 and 58 gene loci, respectively. Our results indicated that a low percentage of the novel lncRNAs were conserved. Also, synteny analysis identified 168 novel lincRNAs with the same syntenic region in human, bovine and chicken. Only seven lncRNAs were identified as differentially expressed genes between fat and thin tailed breeds. Q-RT-PCR results were consistent with the RNA-Seq data and validated the findings. Target prediction analysis revealed that the novel lncRNAs may act in cis or trans and regulate the expression of genes that are involved in lipid metabolism. A gene regulatory network including lncRNA-mRNA interactions was constructed and three significant modules were found, with genes relevant to lipid metabolism, insulin and calcium signaling pathway. Moreover, integrated analysis with the AnimalQTLdb database further suggested six lincRNAs and one ilncRNA as candidates of sheep fat-tail development. Our results highlighted the putative contributions of lncRNAs in regulating expression of genes associated with fat-tail development in sheep.
|
22
|
Ruy PDC, Monteiro-Teles NM, Miserani Magalhães RD, Freitas-Castro F, Dias L, Aquino Defina TP, Rosas De Vasconcelos EJ, Myler PJ, Kaysel Cruz A. Comparative transcriptomics in Leishmania braziliensis: disclosing differential gene expression of coding and putative noncoding RNAs across developmental stages. RNA Biol 2019; 16:639-660. [PMID: 30689499 DOI: 10.1080/15476286.2019.1574161] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/04/2023] Open
Abstract
Leishmaniasis is a worldwide public health problem caused by protozoan parasites of the genus Leishmania. Leishmania braziliensis is the most important species responsible for tegumentary leishmaniases in Brazil. An understanding of the molecular mechanisms underlying the success of this parasite is urgently needed. An in-depth study on the modulation of gene expression across the life cycle stages of L. braziliensis covering coding and noncoding RNAs (ncRNAs) was missing and is presented herein. Analyses of differentially expressed (DE) genes revealed that most prominent differences were observed between the transcriptomes of insect and mammalian proliferative forms (6,576 genes). Gene ontology (GO) analysis indicated stage-specific enriched biological processes. A computational pipeline and 5 ncRNA predictors allowed the identification of 11,372 putative ncRNAs. Most of the DE ncRNAs were found between the transcriptomes of insect and mammalian proliferative stages (38%). Of the DE ncRNAs, 295 were DE in all three stages and displayed a wide range of lengths, chromosomal distributions and locations; many of them had a distinct expression profile compared to that of their protein-coding neighbors. Thirty-five putative ncRNAs were submitted to northern blotting analysis, and one or more hybridization-positive signals were observed in 22 of these ncRNAs. This work presents an overview of the L. braziliensis transcriptome and its adjustments throughout development. In addition to determining the general features of the transcriptome at each life stage and the profile of protein-coding transcripts, we identified and characterized a variety of noncoding transcripts. The novel putative ncRNAs uncovered in L. braziliensis might be regulatory elements to be further investigated.
Affiliation(s)
- Patrícia De Cássia Ruy
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | - Natália Melquie Monteiro-Teles
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | - Rubens Daniel Miserani Magalhães
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | - Felipe Freitas-Castro
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | - Leandro Dias
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | - Tania Paula Aquino Defina
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| | | | - Peter J Myler
- b Center for Infectious Disease Research , Seattle, Washington , USA
| | - Angela Kaysel Cruz
- a Cell and Molecular Biology Department, Ribeirão Preto Medical School , University of São Paulo, Ribeirão Preto , São Paulo , Brazil
| |
|
23
|
Lorenzi L, Avila Cobos F, Decock A, Everaert C, Helsmoortel H, Lefever S, Verboom K, Volders PJ, Speleman F, Vandesompele J, Mestdagh P. Long noncoding RNA expression profiling in cancer: Challenges and opportunities. Genes Chromosomes Cancer 2019; 58:191-199. [PMID: 30461116 DOI: 10.1002/gcc.22709] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 07/03/2018] [Revised: 11/06/2018] [Accepted: 11/18/2018] [Indexed: 12/11/2022] Open
Abstract
In recent years, technological advances in transcriptome profiling revealed that the repertoire of human RNA molecules is more diverse and extended than originally thought. This diversity and complexity mainly derive from a large ensemble of noncoding RNAs. Because of their key roles in cellular processes important for normal development and physiology, disruption of noncoding RNA expression is intrinsically linked to human disease, including cancer. Therefore, studying the noncoding portion of the transcriptome offers the prospect of identifying novel therapeutic and diagnostic targets. Although evidence of the relevance of noncoding RNAs in cancer is accumulating, we still face many challenges when it comes to accurately profiling their expression levels. Some of these challenges are inherent to the technologies employed, whereas others are associated with characteristics of the noncoding RNAs themselves. In this review, we discuss the challenges related to long noncoding RNA expression profiling, highlight how cancer long noncoding RNAs provide new opportunities for cancer diagnosis and treatment, and reflect on future developments.
Affiliation(s)
- Lucía Lorenzi
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Francisco Avila Cobos
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Anneleen Decock
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Celine Everaert
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Hetty Helsmoortel
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Steve Lefever
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Karen Verboom
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Pieter-Jan Volders
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Frank Speleman
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Jo Vandesompele
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Pieter Mestdagh
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium; Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
|
24
|
Methods in Metagenomics and Environmental Biotechnology. Nanoscience and Biotechnology for Environmental Applications 2019. [DOI: 10.1007/978-3-319-97922-9_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
|
25
|
Chen Q, Liu X, Hu Y, Sun B, Hu Y, Wang X, Tang H, Wang Y. Transcriptomic Profiling of Fruit Development in Black Raspberry Rubus coreanus. Int J Genomics 2018; 2018:8084032. [PMID: 29805970 PMCID: PMC5901860 DOI: 10.1155/2018/8084032] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/06/2017] [Revised: 02/09/2018] [Accepted: 02/20/2018] [Indexed: 12/19/2022]
Abstract
The wild Rubus species R. coreanus, which is widely distributed in southwest China, shows great promise as a genetic resource for breeding. One of its outstanding properties is adaptation to high temperature and humidity. To facilitate its use in selection and breeding programs, we assembled de novo 179,738,287 R. coreanus reads (125 bp in length) generated by RNA sequencing from fruits at three representative developmental stages. We also used the recently released draft genome of R. occidentalis to perform reference-guided assembly. We inferred a final 95,845-transcript reference for R. coreanus. Of these genetic resources, 66,597 (69.5%) were annotated. Based on these results, we carried out a comprehensive analysis of differentially expressed genes. Flavonoid biosynthesis, phenylpropanoid biosynthesis, plant hormone signal transduction, and cutin, suberin, and wax biosynthesis pathways were significantly enriched throughout the ripening process. We identified 23 transcripts involved in the flavonoid biosynthesis pathway whose expression perfectly paralleled changes in the metabolites. Additionally, we identified 119 nucleotide-binding site leucine-rich repeat (NBS-LRR) protein-coding genes, involved in pathogen resistance, of which 74 were in the completely conserved domain. These results provide, for the first time, genome-wide genetic information for understanding developmental regulation of R. coreanus fruits. They have the potential for use in breeding through functional genetic approaches in the near future.
Affiliation(s)
- Qing Chen
- College of Horticulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Xunju Liu
- College of Horticulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Yueyang Hu
- College of Horticulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Bo Sun
- College of Horticulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Yaodong Hu
- Science and Technology Management Division, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Xiaorong Wang
- Institute of Pomology and Olericulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Haoru Tang
- College of Horticulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
- Yan Wang
- Institute of Pomology and Olericulture, Sichuan Agricultural University, Chengdu, Sichuan 611130, China
|
26
|
Hoang NV, Furtado A, Thirugnanasambandam PP, Botha FC, Henry RJ. De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts. Heliyon 2018; 4:e00583. [PMID: 29862346 PMCID: PMC5968133 DOI: 10.1016/j.heliyon.2018.e00583] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/14/2017] [Revised: 03/02/2018] [Accepted: 03/16/2018] [Indexed: 12/31/2022]
Abstract
Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.
Affiliation(s)
- Nam V. Hoang
- College of Agriculture and Forestry, Hue University, Hue, Vietnam
- Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- Prathima P. Thirugnanasambandam
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- ICAR - Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India
- Frederik C. Botha
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- Sugar Research Australia, Indooroopilly, Queensland, Australia
- Robert J. Henry
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, Queensland, 4072, Australia
|
27
|
Das M, Renganathan A, Dighe SN, Bhaduri U, Shettar A, Mukherjee G, Kondaiah P, Satyanarayana Rao MR. DDX5/p68 associated lncRNA LOC284454 is differentially expressed in human cancers and modulates gene expression. RNA Biol 2018; 15:214-230. [PMID: 29227193 PMCID: PMC5798960 DOI: 10.1080/15476286.2017.1397261] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/27/2017] [Revised: 10/04/2017] [Accepted: 10/22/2017] [Indexed: 12/21/2022]
Abstract
Long non-coding RNAs (lncRNAs) are emerging as important players in regulation of gene expression in higher eukaryotes. The DDX5/p68 RNA helicase, which is involved in splicing of precursor mRNAs, also interacts with lncRNAs such as SRA and mrhl to modulate gene expression. We performed RIP-seq analysis in HEK293T cells to identify the complete repertoire of DDX5/p68 interacting transcripts including 73 single exonic (SE) lncRNAs. The LOC284454 lncRNA is the second top hit in the list of SE lncRNAs, and we have characterized its molecular features and cellular functions in detail. The RNA is located in the same primary transcript harboring the miR-23a∼27a∼24-2 cluster. LOC284454 is a stable, nuclear-restricted and chromatin-associated lncRNA. The sequence is conserved only in primates among 26 different species and is expressed in multiple human tissues. Expression of LOC284454 is significantly reduced in breast, prostate, uterus and kidney cancer and also in breast cancer cell lines (MCF7 and T47D). Global gene expression studies upon loss and gain of function of LOC284454 revealed perturbation of genes in cancer-related pathways. Focal adhesion and cell migration pathway genes are downregulated upon overexpression, and these genes are significantly upregulated in breast cancer cell lines as well as breast cancer tissue samples, suggesting a functional role of LOC284454 lncRNA in breast cancer pathobiology.
Affiliation(s)
- Monalisa Das
- Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advance Scientific Research, Bangalore, Karnataka, India
- Arun Renganathan
- Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advance Scientific Research, Bangalore, Karnataka, India
- Shrinivas Nivrutti Dighe
- Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advance Scientific Research, Bangalore, Karnataka, India
- Utsa Bhaduri
- Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advance Scientific Research, Bangalore, Karnataka, India
- Abhijith Shettar
- Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore, Karnataka, India
- Paturu Kondaiah
- Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore, Karnataka, India
|
28
|
Abernathy J, Overturf K. Expression of Antisense Long Noncoding RNAs as Potential Regulators in Rainbow Trout with Different Tolerance to Plant-Based Diets. Anim Biotechnol 2018; 30:87-94. [PMID: 29300121 DOI: 10.1080/10495398.2017.1401546] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023]
Abstract
Reformulation of aquafeeds in salmonid diets to include more plant proteins is critical for sustainable aquaculture. However, increasing plant proteins can lead to stunted growth and enteritis. Toward an understanding of the regulatory mechanisms behind plant protein utilization, directional RNA sequencing of liver tissues from a rainbow trout strain selected for growth on an all plant-protein diet and a control strain, both fed a plant diet for 12 weeks, was utilized to construct long noncoding RNAs. Antisense long noncoding RNAs were selected for differential expression and functional analyses since they have been shown to have regulatory actions within a genome. A total of 142 unique antisense long noncoding RNAs were differentially expressed between strains, 60 of which could be mapped to a gene. Genes underlying these noncoding RNAs are implicated in lipid metabolism and immunity. Six noncoding transcripts were also found to overlap with differentially expressed protein-coding genes, all of which were co-expressed. Associating variation in regulatory elements between rainbow trout strains with differing tolerance to plant-protein diets will assist in future studies toward increased gains throughout carnivorous aquaculture.
Affiliation(s)
- Jason Abernathy
- USDA, Agricultural Research Service, Harry K. Dupree Stuttgart National Aquaculture Research Center, Stuttgart, AR, USA
- Ken Overturf
- USDA, Agricultural Research Service, Hagerman Fish Culture Experiment Station, Hagerman, ID, USA
|
29
|
Heikkinen LK, Kesäniemi JE, Knott KE. De novo transcriptome assembly and developmental mode specific gene expression of Pygospio elegans. Evol Dev 2017; 19:205-217. [PMID: 28869352 DOI: 10.1111/ede.12230] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023]
Abstract
Species with multiple different larval developmental modes are interesting models for the study of mechanisms underlying developmental mode transitions and life history evolution. Pygospio elegans, a small, tube-dwelling polychaete worm commonly found in estuarine and marine habitats around the northern hemisphere, is one species with variable developmental modes. To provide new genomic resources for studying P. elegans and to address the differences in gene expression between individuals producing offspring with different larval developmental modes, we performed whole transcriptome Illumina RNA sequencing of adult worms from two populations and prepared a de novo assembly of the P. elegans transcriptome. The transcriptome comprises 66,233 unigenes, of which 33,807 contain predicted coding sequences, 26,448 have at least one functional annotation, and 3,076 are classified as putative long non-coding RNAs. We found more than 8,000 unigenes significantly differentially expressed between adult worms from populations producing either planktonic or benthic larvae. This comprehensive transcriptome resource for P. elegans adds to the available genomic data for annelids and can be used to uncover mechanisms allowing developmental variation in this and potentially other marine invertebrate species.
Affiliation(s)
- Liisa K Heikkinen
- Department of Biological and Environmental Science, University of Jyvaskyla, Jyvaskyla, Finland
- Jenni E Kesäniemi
- Department of Biological and Environmental Science, University of Jyvaskyla, Jyvaskyla, Finland
- K Emily Knott
- Department of Biological and Environmental Science, University of Jyvaskyla, Jyvaskyla, Finland
|
30
|
Arnone B, Chen JY, Qin G. Characterization and analysis of long non-coding rna (lncRNA) in In Vitro- and Ex Vivo-derived cardiac progenitor cells. PLoS One 2017. [PMID: 28640894 PMCID: PMC5481004 DOI: 10.1371/journal.pone.0180096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Recent advancements in cell-based therapies for the treatment of cardiovascular disease (CVD) show continuing promise for the use of transplanted stem and cardiac progenitor cells (CPCs) to promote cardiac restitution. However, a detailed understanding of the molecular mechanisms that control the development of these cells remains incomplete and is critical for optimizing their use in such therapy. Long non-coding (lnc) RNA has recently emerged as a crucial class of regulatory molecules involved in directing a variety of critical biological processes including development, homeostasis and disease. As such, a rising body of evidence suggests that they also play key regulatory roles in CPC development, though many questions remain regarding the expression landscape and specific identity of lncRNA involved in this process. To address this, we performed whole transcriptome sequencing of two murine CPC populations–Nkx2-5 EmGFP reporter-sorted embryonic stem (ES) cell-derived and ex vivo, cardiosphere-derived–in an effort to characterize their lncRNA profiles and potentially identify novel CPC regulators. The resulting sequencing data revealed an enrichment in both CPC populations for a panel of previously-identified lncRNA genes associated with cardiac differentiation. Additionally, a total of 1,678 differentially expressed and as-of-yet unannotated, putative lncRNA genes were found to be enriched for in the two CPC populations relative to undifferentiated ES cells.
Affiliation(s)
- Baron Arnone
- Department of Biomedical Engineering, School of Medicine & School of Engineering, UAB, Birmingham, AL, United States of America
- Jake Y. Chen
- Informatics Institute, School of Medicine, UAB, Birmingham, AL, United States of America
- Gangjian Qin
- Department of Biomedical Engineering, School of Medicine & School of Engineering, UAB, Birmingham, AL, United States of America
|
31
|
Freitas Castro F, Ruy PC, Nogueira Zeviani K, Freitas Santos R, Simões Toledo J, Kaysel Cruz A. Evidence of putative non-coding RNAs from Leishmania untranslated regions. Mol Biochem Parasitol 2017; 214:69-74. [PMID: 28385563 DOI: 10.1016/j.molbiopara.2017.04.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/07/2016] [Revised: 03/29/2017] [Accepted: 04/01/2017] [Indexed: 11/28/2022]
Abstract
Non-coding RNAs (ncRNAs) are regulatory elements present in a wide range of organisms, including trypanosomatids. ncRNAs transcribed from the untranslated regions (UTRs) of coding genes have been described in the transcriptomes of several eukaryotes, including Trypanosoma brucei. To uncover novel putative ncRNAs in two Leishmania species, we examined an L. major cDNA library and an L. donovani non-polysomal RNA library. Using a combination of computational analysis and experimental approaches, we classified 26 putative ncRNAs in L. major, of which 5 arise from intergenic regions and 21 from UTRs. In L. donovani, we classified 37 putative ncRNAs, of which 7 arise from intergenic regions and 30 from UTRs. Our results suggest, for the first time, that UTR-transcripts may be a common feature in Leishmania, similar to those previously shown in T. brucei and other eukaryotes.
Affiliation(s)
- Felipe Freitas Castro
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
- Patricia C Ruy
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
- Karina Nogueira Zeviani
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
- Ramon Freitas Santos
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
- Juliano Simões Toledo
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
- Angela Kaysel Cruz
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
|
32
|
Li J, Gao Z, Wang X, Liu H, Zhang Y, Liu Z. Identification and functional analysis of long intergenic noncoding RNA genes in porcine pre-implantation embryonic development. Sci Rep 2016; 6:38333. [PMID: 27922056 PMCID: PMC5138625 DOI: 10.1038/srep38333] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 09/02/2016] [Accepted: 11/08/2016] [Indexed: 12/21/2022]
Abstract
Genome-wide transcriptome studies have identified thousands of long intergenic noncoding RNAs (lincRNAs), some of which play important roles in pre-implantation embryonic development (PED). The pig is an ideal model for reproduction; however, porcine lincRNAs are still poorly characterized, and it is unknown whether they are associated with porcine PED. Here we reconstructed 195,531 transcripts in 122,007 loci, and identified 7,618 novel lincRNAs from 4,776 loci based on published RNA-seq data. These lincRNAs show low exon number, short length, low expression level, tissue-specific expression and cis-acting behavior, which is consistent with previous reports in other species. By weighted co-expression network analysis, we identified 5 developmental stage-specific co-expression modules. Gene ontology enrichment analysis of these specific co-expression modules suggested that many lincRNAs are associated with cell cycle regulation, transcription and metabolism to regulate the process of zygotic genome activation. Furthermore, we identified hub lincRNAs in each co-expression module, and found that two lincRNAs, TCONS_00166370 and TCONS_00020255, may play a vital role in porcine PED. This study systematically analyzes lincRNAs in pig and provides the first catalog of lincRNAs that might function as gene regulatory factors of porcine PED.
Affiliation(s)
- Jingyu Li
- College of Life Science, North-east Agricultural University, Harbin, 150030, China; Chong Qing Reproductive and Genetics Institute, Chongqing Obstetrics and Gynecology Hospital, 64 Jing Tang ST, Yu Zhong District, Chongqing, 400013, China
- Zhengling Gao
- College of Life Science, North-east Agricultural University, Harbin, 150030, China
- Xingyu Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150080, China
- Hongbo Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150080, China
- Yan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150080, China
- Zhonghua Liu
- College of Life Science, North-east Agricultural University, Harbin, 150030, China
|
33
|
Zhao J, Song X, Wang K. lncScore: alignment-free identification of long noncoding RNA from assembled novel transcripts. Sci Rep 2016; 6:34838. [PMID: 27708423 PMCID: PMC5052565 DOI: 10.1038/srep34838] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 07/07/2016] [Accepted: 09/21/2016] [Indexed: 12/21/2022]
Abstract
RNA-Seq based transcriptome assembly has been widely used to identify novel lncRNAs. However, the best-performing transcript reconstruction methods merely identified 21% of full-length protein-coding transcripts from H. sapiens. Those partial-length protein-coding transcripts are more likely to be classified as lncRNAs due to their incomplete CDS, leading to a higher false positive rate for lncRNA identification. Furthermore, potential sequencing or assembly errors that gain or abolish stop codons also complicate ORF-based prediction of lncRNAs. Therefore, it remains a challenge to identify lncRNAs from the assembled transcripts, particularly the partial-length ones. Here, we present a novel alignment-free tool, lncScore, which uses a logistic regression model with 11 carefully selected features. lncScore outperforms other state-of-the-art alignment-free tools (e.g. CPAT, CNCI, and PLEK) in accurately distinguishing lncRNAs from mRNAs, especially partial-length mRNAs, in the human and mouse datasets. In addition, lncScore also performed well on transcripts from five other species (Zebrafish, Fly, C. elegans, Rat, and Sheep). To speed up the prediction, multithreading is implemented within lncScore, and it took only 2 minutes to classify 64,756 transcripts and 54 seconds to train a new model with 21,000 transcripts using 12 threads, which is much faster than other tools. lncScore is available at https://github.com/WGLab/lncScore.
Affiliation(s)
- Jian Zhao
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
- Xiaofeng Song
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Kai Wang
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
- Division of Bioinformatics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
- Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
- Department of Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA
|
34
|
The Tetraodon nigroviridis reference transcriptome: developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome. Sci Rep 2016; 6:33210. [PMID: 27628538 PMCID: PMC5024134 DOI: 10.1038/srep33210] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/15/2016] [Accepted: 07/28/2016] [Indexed: 01/03/2023]
Abstract
Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during maternal to zygotic transition and annotate 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates, and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs is microsyntenic between teleosts and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies.
|
35
|
Yuan F, Lyu MJA, Leng BY, Zhu XG, Wang BS. The transcriptome of NaCl-treated Limonium bicolor leaves reveals the genes controlling salt secretion of salt gland. Plant Mol Biol 2016; 91:241-56. [PMID: 26936070 DOI: 10.1007/s11103-016-0460-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 06/05/2015] [Accepted: 02/24/2016] [Indexed: 05/20/2023]
Abstract
Limonium bicolor, a typical recretohalophyte that lives in saline environments, excretes excessive salt to the environment through epidermal salt glands to avoid salt stress. The aim of this study was to screen for L. bicolor genes involved in salt secretion by high-throughput RNA sequencing. We established the experimental procedure of salt secretion using detached mature leaves, in which the optimal salt concentration was determined as 200 mM NaCl. The detached salt secretion system combined with Illumina deep sequencing were applied. In total, 27,311 genes were annotated using an L. bicolor database, and 2040 of these genes were differentially expressed, of which 744 were up-regulated and 1260 were down-regulated with the NaCl versus the control treatment. A gene ontology enrichment analysis indicated that genes related to ion transport, vesicles, reactive oxygen species scavenging, the abscisic acid-dependent signaling pathway and transcription factors were found to be highly expressed under NaCl treatment. We found that 102 of these genes were likely to be involved in salt secretion, which was confirmed using salt-secretion mutants. The present study identifies the candidate genes in the L. bicolor salt gland that are highly associated with salt secretion. In addition, a salt-transporting pathway is presented to explain how Na(+) is excreted by the salt gland in L. bicolor. These findings will shed light on the molecular mechanism of salt secretion from the salt glands of plants.
Affiliation(s)
- Fang Yuan
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
- Ming-Ju Amy Lyu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
- Bing-Ying Leng
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
- Xin-Guang Zhu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
- Bao-Shan Wang
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
|
36
|
Qin D, Xu C. Study strategies for long non-coding RNAs and their roles in regulating gene expression. Cell Mol Biol Lett 2016. [PMID: 26204411 DOI: 10.1515/cmble-2015-0021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
Long non-coding RNAs (lncRNAs) have attracted considerable attention recently due to their involvement in numerous key cellular processes and in the development of various disorders. New high-throughput methods enable their study on a genome-wide scale. Numerous lncRNAs have been identified and characterized as important members of the biological regulatory network, with significant roles in regulating gene expression at the epigenetic, transcriptional and post-transcriptional levels. This paper summarizes the diverse mechanisms of action of these lncRNAs and looks at the study strategies in this field. A major challenge in future study is to establish more effective bioinformatics and experimental methods to explore the functions, detailed mechanisms of action and structures deciding the functional diversity of lncRNAs, since the vast majority remain unresolved.
|
37
|
Zhao XY, Lin JD. Long Noncoding RNAs: A New Regulatory Code in Metabolic Control. Trends Biochem Sci 2015; 40:586-596. [PMID: 26410599 DOI: 10.1016/j.tibs.2015.08.002] [Citation(s) in RCA: 131] [Impact Index Per Article: 16.4] [Received: 06/29/2015] [Revised: 08/05/2015] [Accepted: 08/06/2015] [Indexed: 12/27/2022]
Abstract
Long noncoding RNAs (lncRNAs) are emerging as an integral part of the regulatory information encoded in the genome. lncRNAs possess the unique capability to interact with nucleic acids and proteins, and exert discrete effects on numerous biological processes. Recent studies have delineated multiple lncRNA pathways that control metabolic tissue development and function. The expansion of the regulatory code that links nutrient and hormonal signals to tissue metabolism gives new insights into the genetic and pathogenic mechanisms underlying metabolic disease. This review discusses lncRNA biology with a focus on their role in the development, signaling, and function of key metabolic tissues.
Affiliation(s)
- Xu-Yun Zhao
- Life Sciences Institute and Department of Cell and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jiandie D Lin
- Life Sciences Institute and Department of Cell and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA.
|
38
|
Tripathi KP, Evangelista D, Zuccaro A, Guarracino MR. Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA. PLoS One 2015; 10:e0140268. [PMID: 26581084 PMCID: PMC4651556 DOI: 10.1371/journal.pone.0140268] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/09/2015] [Accepted: 09/22/2015] [Indexed: 12/20/2022] Open
Abstract
RNA-seq measures RNA transcript counts by high-throughput sequencing with extraordinary accuracy. It provides a quantitative means to explore the transcriptome of an organism of interest. However, translating these extremely large datasets into biological knowledge remains difficult, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for the BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery) tools. It offers a statistical report on the enrichment of functional and Gene Ontology (GO) annotations. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and information related to protein-protein interactions. It clusters the transcripts based on functional annotations and generates a tabular report of functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non coding RNA (ncRNA) by ab initio methods) helps to identify non-coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is useful software for both NGS and array data. It helps users to characterize the de novo assembled reads obtained from NGS experiments for non-referenced organisms, and it also performs functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and microarray experiments. It generates easy-to-read tables and interactive charts for better understanding of the data. The pipeline is modular in nature and provides an opportunity to add new plugins in the future. The web application is freely available at: http://www-labgtp.na.icar.cnr.it/Transcriptator.
Affiliation(s)
- Kumar Parijat Tripathi
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
| | - Daniela Evangelista
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
| | - Antonio Zuccaro
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
| | - Mario Rosario Guarracino
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
|
39
|
Petrella V, Aceto S, Musacchia F, Colonna V, Robinson M, Benes V, Cicotti G, Bongiorno G, Gradoni L, Volf P, Salvemini M. De novo assembly and sex-specific transcriptome profiling in the sand fly Phlebotomus perniciosus (Diptera, Phlebotominae), a major Old World vector of Leishmania infantum. BMC Genomics 2015; 16:847. [PMID: 26493315 PMCID: PMC4619268 DOI: 10.1186/s12864-015-2088-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 07/07/2015] [Accepted: 10/15/2015] [Indexed: 12/17/2022] Open
Abstract
Background: The phlebotomine sand fly Phlebotomus perniciosus (Diptera: Psychodidae, Phlebotominae) is a major Old World vector of the protozoan Leishmania infantum, the etiological agent of visceral and cutaneous leishmaniases in humans and dogs, which are worldwide re-emerging diseases of great public health concern affecting 101 countries. Despite the growing interest in this sand fly species in recent years, the development of genomic resources has been limited so far. To increase the available sequence data for P. perniciosus and to start studying the molecular basis of sexual differentiation in sand flies, we performed whole transcriptome Illumina RNA sequencing (RNA-seq) of adult males and females and de novo transcriptome assembly. Results: We assembled 55,393 high quality transcripts, of which 29,292 were unique, starting from adult whole body male and female pools. 11,736 transcripts had at least one functional annotation, including full-length low abundance salivary transcripts; 981 transcripts were classified as putative long non-coding RNAs, and 244 transcripts encoded putative novel proteins specific to the Phlebotominae sub-family. Differential expression analysis identified 8590 transcripts significantly biased between sexes. Among them, some show relaxation of selective constraints when compared to their orthologs in the New World sand fly species Lutzomyia longipalpis. Conclusions: In this paper, we present a comprehensive transcriptome resource for the sand fly species P. perniciosus built from short-read RNA-seq, and we provide insights into sex-specific gene expression at the adult stage. Our analysis represents a first step towards the identification of sex-specific genes and pathways and a foundation for forthcoming investigations into this important vector species, including the study of the evolution of sex-biased genes and of sexual differentiation in phlebotomine sand flies.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2088-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- V Petrella
- Department of Biology, University of Naples Federico II, Naples, Italy
| | - S Aceto
- Department of Biology, University of Naples Federico II, Naples, Italy
| | - F Musacchia
- Stazione Zoologica "Anton Dohrn", Naples, Italy
| | - V Colonna
- National Research Council, Institute of Genetics and Biophysics, Naples, Italy
| | - M Robinson
- Institute of Molecular Life Science, University of Zurich, Zurich, Switzerland; SIB-Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - V Benes
- Genomics Core Facility, EMBL, Heidelberg, Germany
| | - G Cicotti
- Institute for High Performance Computing and Networking, ICAR-CNR, Naples, Italy
| | - G Bongiorno
- Department of Infectious, Parasitic and Immunomediated Diseases, Istituto Superiore di Sanità, Rome, Italy
| | - L Gradoni
- Department of Infectious, Parasitic and Immunomediated Diseases, Istituto Superiore di Sanità, Rome, Italy
| | - P Volf
- Department of Parasitology, Charles University, Prague, Czech Republic
| | - M Salvemini
- Department of Biology, University of Naples Federico II, Naples, Italy.
|
40
|
Housman G, Ulitsky I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2015; 1859:31-40. [PMID: 26265145 DOI: 10.1016/j.bbagrm.2015.07.017] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 03/31/2015] [Revised: 06/18/2015] [Accepted: 07/19/2015] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between coding and noncoding transcripts and assess the conclusions from their recent genome-wide applications. We conclude that the model most consistent with the available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events results in stable and functional peptides. The outcomes of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Gali Housman
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel.
|
41
|
Arruda WC, Souza DS, Ralha CG, Walter MEMT, Raiol T, Brigido MM, Stadler PF. Knowledge-based reasoning to annotate noncoding RNA using multi-agent system. J Bioinform Comput Biol 2015. [PMID: 26223200 DOI: 10.1142/s0219720015500213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Noncoding RNAs (ncRNAs) have been the focus of intense research over the last few years. Since the characteristics and signals of ncRNAs are not entirely known, researchers use different computational tools together with their biological knowledge to predict putative ncRNAs. In this context, this work presents ncRNA-Agents, a multi-agent system that annotates ncRNAs based on the output of different tools, using inference rules to simulate biologists' reasoning. Experiments with data from the fungus Saccharomyces cerevisiae allowed us to measure the performance of ncRNA-Agents, which showed better sensitivity than Infernal, a widely used tool for annotating ncRNA. In addition, analysis of data from the fungi Schizosaccharomyces pombe and Paracoccidioides brasiliensis identified novel putative ncRNAs, which demonstrates the usefulness of our approach. ncRNA-Agents can be found at: http://www.biomol.unb.br/ncrna-agents.
Affiliation(s)
- Wosley C Arruda
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Daniel S Souza
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Célia G Ralha
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Maria Emilia M T Walter
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Tainá Raiol
- Leônidas and Maria Deane Research Center (Fiocruz Amazônia), Rua Teresina, 476 Adrianópolis, Manaus-AM, CEP: 69027-070, Brazil
| | - Marcelo M Brigido
- Department of Cellular Biology, Institute of Biology, University of Brasília, Campus Universitário Darcy Ribeiro, Prédio do Instituto de Biologia, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Peter F Stadler
- Department of Computer Science and the Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107, Leipzig, Germany
|
42
|
Yuan F, Lyu MJA, Leng BY, Zheng GY, Feng ZT, Li PH, Zhu XG, Wang BS. Comparative transcriptome analysis of developmental stages of the Limonium bicolor leaf generates insights into salt gland differentiation. Plant, Cell & Environment 2015; 38:1637-57. [PMID: 25651944 DOI: 10.1111/pce.12514] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 01/07/2014] [Revised: 01/22/2015] [Accepted: 01/26/2015] [Indexed: 05/20/2023]
Abstract
With the expansion of saline land worldwide, it is essential to establish a model halophyte to study the salt-tolerance mechanism. The salt glands in the epidermis of Limonium bicolor (a recretohalophyte) play a pivotal role in salt tolerance by secreting excess salts from tissues. Despite the importance of salt secretion, nothing is known about the molecular mechanisms of salt gland development. In this study, we applied RNA sequencing to profile early leaf development using five distinct developmental stages, which were quantified by successive collections of the first true leaves of L. bicolor with precise spatial and temporal resolution. Specific gene expression patterns were identified for each developmental stage. In particular, we found that genes controlling salt gland differentiation in L. bicolor may evolve in a trichome formation, which was also confirmed by mutants with increased salt gland densities. Genes involved in the special ultrastructure of salt glands were also elucidated. Twenty-six genes were proposed to participate in salt gland differentiation. Our dataset sheds light on the molecular processes underpinning salt gland development and thus represents a first step towards the bioengineering of active salt-secretion capacity in crops.
Affiliation(s)
- Fang Yuan
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Ji'nan, Shandong, 250014, China
| | - Ming-Ju Amy Lyu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
- Graduate School of Chinese Academy of Sciences, Beijing, 100039, China
| | - Bing-Ying Leng
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Ji'nan, Shandong, 250014, China
| | - Guang-Yong Zheng
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
| | - Zhong-Tao Feng
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Ji'nan, Shandong, 250014, China
| | - Ping-Hua Li
- College of Agriculture, Shandong Agricultural University, Tai'an, 271018, China
| | - Xin-Guang Zhu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
- State Key Laboratory of Hybrid Rice, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China
| | - Bao-Shan Wang
- Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Ji'nan, Shandong, 250014, China
|
43
|
Chakraborty S, Britton M, Wegrzyn J, Butterfield T, Martínez-García PJ, Reagan RL, Rao BJ, Leslie CA, Aradhaya M, Neale D, Woeste K, Dandekar AM. YeATS - a tool suite for analyzing RNA-seq derived transcriptome identifies a highly transcribed putative extensin in heartwood/sapwood transition zone in black walnut. F1000Res 2015; 4:155. [PMID: 26870317 PMCID: PMC4732554 DOI: 10.12688/f1000research.6617.2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Accepted: 10/30/2015] [Indexed: 11/20/2022] Open
Abstract
The transcriptome provides a functional footprint of the genome by enumerating the molecular components of cells and tissues. The field of transcript discovery has been revolutionized through high-throughput mRNA sequencing (RNA-seq). Here, we present a methodology that replicates and improves existing methodologies, and implements a workflow for error estimation and correction followed by genome annotation and transcript abundance estimation for RNA-seq derived transcriptome sequences (YeATS - Yet Another Tool Suite for analyzing RNA-seq derived transcriptome). A unique feature of YeATS is the upfront determination of the errors in the sequencing or transcript assembly process by analyzing open reading frames of transcripts. YeATS identifies transcripts that have not been merged, result in broken open reading frames or contain long repeats as erroneous transcripts. We present the YeATS workflow using a representative sample of the transcriptome from the tissue at the heartwood/sapwood transition zone in black walnut. A novel feature of the transcriptome that emerged from our analysis was the identification of a highly abundant transcript that had no known homologous genes (GenBank accession: KT023102). The amino acid composition of the longest open reading frame of this gene classifies this as a putative extensin. Also, we corroborated the transcriptional abundance of proline-rich proteins, dehydrins, senescence-associated proteins, and the DNAJ family of chaperone proteins. Thus, YeATS presents a workflow for analyzing RNA-seq data with several innovative features that differentiate it from existing software.
Affiliation(s)
| | - Monica Britton
- UC Davis Genome Center Bioinformatics Core Facility, University of California, Davis, CA, 95616, USA
| | - Jill Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA
| | | | | | - Russell L Reagan
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | - Basuthkar J Rao
- Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhaba Road, Mumbai, 400, India
| | - Charles A Leslie
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | | | - David Neale
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | - Keith Woeste
- USDA Forest Service Hardwood Tree Improvement and Regeneration Center, Purdue University, West Lafayette, IN, 47907, USA
| | - Abhaya M Dandekar
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
|
44
|
Chakraborty S, Britton M, Wegrzyn J, Butterfield T, Martínez-García PJ, Reagan RL, Rao BJ, Leslie CA, Aradhaya M, Neale D, Woeste K, Dandekar AM. YeATS - a tool suite for analyzing RNA-seq derived transcriptome identifies a highly transcribed putative extensin in heartwood/sapwood transition zone in black walnut. F1000Res 2015; 4:155. [PMID: 26870317 DOI: 10.12688/f1000research.6617.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 06/11/2015] [Indexed: 11/20/2022] Open
Abstract
The transcriptome provides a functional footprint of the genome by enumerating the molecular components of cells and tissues. The field of transcript discovery has been revolutionized through high-throughput mRNA sequencing (RNA-seq). Here, we present a methodology that replicates and improves existing methodologies, and implements a workflow for error estimation and correction followed by genome annotation and transcript abundance estimation for RNA-seq derived transcriptome sequences (YeATS - Yet Another Tool Suite for analyzing RNA-seq derived transcriptome). A unique feature of YeATS is the upfront determination of the errors in the sequencing or transcript assembly process by analyzing open reading frames of transcripts. YeATS identifies transcripts that have not been merged, result in broken open reading frames or contain long repeats as erroneous transcripts. We present the YeATS workflow using a representative sample of the transcriptome from the tissue at the heartwood/sapwood transition zone in black walnut. A novel feature of the transcriptome that emerged from our analysis was the identification of a highly abundant transcript that had no known homologous genes (GenBank accession: KT023102). The amino acid composition of the longest open reading frame of this gene classifies this as a putative extensin. Also, we corroborated the transcriptional abundance of proline-rich proteins, dehydrins, senescence-associated proteins, and the DNAJ family of chaperone proteins. Thus, YeATS presents a workflow for analyzing RNA-seq data with several innovative features that differentiate it from existing software.
Affiliation(s)
| | - Monica Britton
- UC Davis Genome Center Bioinformatics Core Facility, University of California, Davis, CA, 95616, USA
| | - Jill Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA
| | | | | | - Russell L Reagan
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | - Basuthkar J Rao
- Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhaba Road, Mumbai, 400, India
| | - Charles A Leslie
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | | | - David Neale
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
| | - Keith Woeste
- USDA Forest Service Hardwood Tree Improvement and Regeneration Center, Purdue University, West Lafayette, IN, 47907, USA
| | - Abhaya M Dandekar
- Plant Sciences Department, University of California, Davis, CA, 95616, USA
|
45
|
Mallory AC, Shkumatava A. LncRNAs in vertebrates: advances and challenges. Biochimie 2015; 117:3-14. [PMID: 25812751 DOI: 10.1016/j.biochi.2015.03.014] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 02/17/2015] [Accepted: 03/17/2015] [Indexed: 01/06/2023]
Abstract
Beyond the handful of classic and well-characterized long noncoding RNAs (lncRNAs), more recently, hundreds of thousands of lncRNAs have been identified in multiple species including bacteria, plants and vertebrates, and the number of newly annotated lncRNAs continues to increase as more transcriptomes are analyzed. In vertebrates, the expression of many lncRNAs is highly regulated, displaying discrete temporal and spatial expression patterns, suggesting roles in a wide range of developmental processes and setting them apart from classic housekeeping ncRNAs. In addition, the deregulation of a subset of these lncRNAs has been linked to the development of several diseases, including cancers, as well as developmental anomalies. However, the majority of vertebrate lncRNA functions remain enigmatic. As such, a major task at hand is to decipher the biological roles of lncRNAs and uncover the regulatory networks upon which they impinge. This review focuses on our emerging understanding of lncRNAs in vertebrate animals, highlighting some recent advances in their functional analyses across several species and emphasizing the current challenges researchers face to characterize lncRNAs and identify their in vivo functions.
Affiliation(s)
- Allison C Mallory
- Institut Curie, 26 Rue d'Ulm, 75248 Paris Cedex 05, France; CNRS UMR3215, 75248 Paris Cedex 05, France; INSERM U934, 75248 Paris Cedex 05, France.
| | - Alena Shkumatava
- Institut Curie, 26 Rue d'Ulm, 75248 Paris Cedex 05, France; CNRS UMR3215, 75248 Paris Cedex 05, France; INSERM U934, 75248 Paris Cedex 05, France.
|
46
|
Jalali S, Kapoor S, Sivadas A, Bhartiya D, Scaria V. Computational approaches towards understanding human long non-coding RNA biology. Bioinformatics 2015; 31:2241-51. [DOI: 10.1093/bioinformatics/btv148] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 10/16/2014] [Accepted: 03/10/2015] [Indexed: 12/18/2022] Open
|
47
|
Mbandi SK, Hesse U, van Heusden P, Christoffels A. Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms. BMC Bioinformatics 2015; 16:58. [PMID: 25880035 PMCID: PMC4344733 DOI: 10.1186/s12859-015-0492-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/30/2014] [Accepted: 02/06/2015] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: De novo transcriptome assembly of short transcribed fragments (transfrags) produced from sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique transfrag annotations and propagation of mis-assemblies. RESULTS: Here, we propose a structured framework that consists of a few steps in a pipeline architecture for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences, 2) error-tolerant CDS prediction, 3) identification of coding potential, and 4) complementing BLAST with a multiple domain architecture annotation that reduces non-specific domain annotation. We demonstrate that, independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is inferred on RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers, in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags than the multiple k-mer assemblies. However, Trinity identified a number of predicted coding sequences and gene loci comparable to the Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4% and 6% are lost when orphan transfrags are excluded, and this represents only a tiny fraction of annotation derived from functional transference by sequence similarity. The median length of bona fide transfrags ranged from 1.5 kb (Trinity) to 2 kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with gene ontology terms ranged from 33% to 50%, which is also high for domain-based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic, untranslated (5' and 3') regions and non-coding gene loci. CONCLUSIONS: IFRAT simplifies post-assembly processing, providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms.
Affiliation(s)
- Stanley Kimbung Mbandi
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.
| | - Uljana Hesse
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.
| | - Peter van Heusden
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.
| | - Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.
|
48
|
Musacchia F, Basu S, Petrosino G, Salvemini M, Sanges R. Annocript: a flexible pipeline for the annotation of transcriptomes able to identify putative long noncoding RNAs. Bioinformatics 2015; 31:2199-201. [DOI: 10.1093/bioinformatics/btv106] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 10/26/2014] [Accepted: 02/11/2015] [Indexed: 11/14/2022] Open
|
49
|
Fan XN, Zhang SW. lncRNA-MFDL: identification of human long non-coding RNAs by fusing multiple features and using deep learning. Molecular BioSystems 2015; 11:892-7. [DOI: 10.1039/c4mb00650j] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Indexed: 12/31/2022]
Abstract
By fusing multiple features and using deep learning algorithms, a lncRNA-MFDL predictor was developed to identify lncRNAs, which is much more effective and robust.
Affiliation(s)
- Xiao-Nan Fan
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
| | - Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
|
50
|
Novel long noncoding RNAs (lncRNAs) in myogenesis: a miR-31 overlapping lncRNA transcript controls myoblast differentiation. Mol Cell Biol 2014; 35:728-36. [PMID: 25512605 DOI: 10.1128/mcb.01394-14] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Indexed: 11/20/2022] Open
Abstract
Transcriptome analysis allowed the identification of new long noncoding RNAs differentially expressed during murine myoblast differentiation. These transcripts were classified on the basis of their expression under proliferating versus differentiated conditions, muscle-restricted activation, and subcellular localization. Several species displayed preferential expression in dystrophic (mdx) versus wild-type muscles, indicating their possible link with regenerative processes. One of the identified transcripts, lnc-31, although it originates from the same nuclear precursor as miR-31, is produced by a mutually exclusive pathway. We show that lnc-31 and its human homologue hsa-lnc-31 are expressed in proliferating myoblasts, where they counteract differentiation. In line with this, both species are more abundant in mdx muscles and in human Duchenne muscular dystrophy (DMD) myoblasts than in their normal counterparts. Altogether, these data suggest a crucial role for lnc-31 in controlling the differentiation commitment of precursor myoblasts and indicate that its function is maintained in evolution despite the poor sequence conservation with the human counterpart.
|