1
Traubenik S, Charon C, Blein T. From environmental responses to adaptation: the roles of plant lncRNAs. PLANT PHYSIOLOGY 2024; 195:232-244. [PMID: 38246143 DOI: 10.1093/plphys/kiae034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 12/18/2023] [Accepted: 01/02/2024] [Indexed: 01/23/2024]
Abstract
As sessile organisms, plants are continuously exposed to heterogeneous and changing environments and constantly need to adapt their growth strategies. They have evolved complex mechanisms to recognize various stress factors, activate appropriate signaling pathways, and respond accordingly by reprogramming the expression of multiple genes at the transcriptional, post-transcriptional, and even epigenome levels to tolerate stressful conditions such as drought, high temperature, nutrient deficiency, and pathogenic interactions. Apart from protein-coding genes, long non-coding RNAs (lncRNAs) have emerged as key players in plant adaptation to environmental stresses. They are transcripts larger than 200 nucleotides without protein-coding potential. Still, they appear to regulate a wide range of processes, including epigenetic modifications and chromatin reorganization, as well as transcriptional and post-transcriptional modulation of gene expression, allowing plant adaptation to various environmental stresses. LncRNAs can positively or negatively modulate stress responses, affecting processes such as hormone signaling, temperature tolerance, and nutrient deficiency adaptation. Moreover, they also seem to play a role in stress memory, wherein prior exposure to mild stress enhances plant ability to adapt to subsequent stressful conditions. In this review, we summarize the contribution of lncRNAs in plant adaptation to biotic and abiotic stresses, as well as stress memory. The complex evolutionary conservation of lncRNAs is also discussed and provides insights into future research directions in this field.
Affiliation(s)
- Soledad Traubenik
- Université Paris-Saclay, CNRS, INRAE, Université Evry, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
- Université Paris Cité, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
- Céline Charon
- Université Paris-Saclay, CNRS, INRAE, Université Evry, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
- Université Paris Cité, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
- Thomas Blein
- Université Paris-Saclay, CNRS, INRAE, Université Evry, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
- Université Paris Cité, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), 91190 Gif-sur-Yvette, France
2
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
- Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
3
da Cunha Agostini L, Almeida TC, da Silva GN. ANRIL, H19 and TUG1: a review about critical long non-coding RNAs in cardiovascular diseases. Mol Biol Rep 2023; 51:31. [PMID: 38155319 DOI: 10.1007/s11033-023-09007-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 10/30/2023] [Indexed: 12/30/2023]
Abstract
Cardiovascular diseases are the leading cause of death worldwide. They are non-transmissible diseases that affect the cardiovascular system and have different etiologies such as smoking, lipid disorders, diabetes, stress, sedentary lifestyle and genetic factors. To date, lncRNAs have been associated with increased susceptibility to the development of cardiovascular diseases such as hypertension, acute myocardial infarction, stroke, angina and heart failure. In this way, lncRNAs are becoming a very promising point for the prevention and diagnosis of cardiovascular diseases. Therefore, this review highlights the most important and recent discoveries about the mechanisms of action of the lncRNAs ANRIL, H19 and TUG1 and their clinical relevance in these pathologies. This may contribute to early detection of cardiovascular diseases in order to prevent the pathological phenotype from becoming established.
Affiliation(s)
- Lívia da Cunha Agostini
- Programa de Pós-Graduação em Ciências Farmacêuticas (CiPharma), Escola de Farmácia, Universidade Federal de Ouro Preto, Morro do Cruzeiro, s/nº, Ouro Prêto, Minas Gerais, CEP 35402-163, Brazil
- Tamires Cunha Almeida
- Escola Superior Instituto Butantan (ESIB), Laboratório de Dor e Sinalização, Instituto Butantan, São Paulo, São Paulo, Brazil
- Glenda Nicioli da Silva
- Programa de Pós-Graduação em Ciências Farmacêuticas (CiPharma), Escola de Farmácia, Universidade Federal de Ouro Preto, Morro do Cruzeiro, s/nº, Ouro Prêto, Minas Gerais, CEP 35402-163, Brazil.
- Departamento de Análises Clínicas (DEACL), Escola de Farmácia, Universidade Federal de Ouro Preto, Ouro Prêto, Brazil.
4
Hazra S, Moulick D, Mukherjee A, Sahib S, Chowardhara B, Majumdar A, Upadhyay MK, Yadav P, Roy P, Santra SC, Mandal S, Nandy S, Dey A. Evaluation of efficacy of non-coding RNA in abiotic stress management of field crops: Current status and future prospective. PLANT PHYSIOLOGY AND BIOCHEMISTRY : PPB 2023; 203:107940. [PMID: 37738864 DOI: 10.1016/j.plaphy.2023.107940] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/14/2023] [Revised: 07/23/2023] [Accepted: 08/04/2023] [Indexed: 09/24/2023]
Abstract
Abiotic stresses are responsible for the major losses in crop yield all over the world. Stresses generate harmful ROS which can impair cellular processes in plants. Therefore, plants have evolved antioxidant systems in defence against the stress-induced damages. The frequency of occurrence of abiotic stressors has increased several-fold due to the climate change experienced in recent times and projected for the future. This has particularly aggravated the risk of yield losses and threatened global food security. Non-coding RNAs are the part of the eukaryotic genome that does not code for any proteins. However, they have been recently found to have a crucial role in the responses of plants to both abiotic and biotic stresses. There are different types of ncRNAs, for example, miRNAs and lncRNAs, which have the potential to regulate the expression of stress-related genes at the levels of transcription, post-transcription, and translation of proteins. The lncRNAs are also able to impart their epigenetic effects on the target genes through the alteration of the status of histone modification and organization of the chromatin. The current review attempts to deliver a comprehensive account of the role of ncRNAs in the regulation of plants' abiotic stress responses through ROS homeostasis. The potential applications of ncRNAs in amelioration of abiotic stresses in field crops have also been evaluated.
Affiliation(s)
- Swati Hazra
- Sharda School of Agricultural Sciences, Sharda University, Greater Noida, Uttar Pradesh 201310, India.
- Debojyoti Moulick
- Department of Environmental Science, University of Kalyani, Nadia, West Bengal 741235, India.
- Synudeen Sahib
- S. S. Cottage, Njarackal, P.O.: Perinad, Kollam, 691601, Kerala, India.
- Bhaben Chowardhara
- Department of Botany, Faculty of Science and Technology, Arunachal University of Studies, Arunachal Pradesh 792103, India.
- Arnab Majumdar
- Department of Earth Sciences, Indian Institute of Science Education and Research (IISER) Kolkata, West Bengal 741246, India.
- Munish Kumar Upadhyay
- Department of Civil Engineering, Indian Institute of Technology Kanpur, Uttar Pradesh 208016, India.
- Poonam Yadav
- Institute of Environment and Sustainable Development, Banaras Hindu University, Varanasi, Uttar Pradesh 221005, India.
- Priyabrata Roy
- Department of Molecular Biology and Biotechnology, University of Kalyani, West Bengal 741235, India.
- Subhas Chandra Santra
- Department of Environmental Science, University of Kalyani, Nadia, West Bengal 741235, India.
- Sayanti Mandal
- Department of Biotechnology, Dr. D. Y. Patil Arts, Commerce & Science College (affiliated to Savitribai Phule Pune University), Sant Tukaram Nagar, Pimpri, Pune, Maharashtra-411018, India.
- Samapika Nandy
- School of Pharmacy, Graphic Era Hill University, Bell Road, Clement Town, Dehradun, 248002, Uttarakhand, India; Department of Botany, Vedanta College, 33A Shiv Krishna Daw Lane, Kolkata-700054, India.
- Abhijit Dey
- Department of Life Sciences, Presidency University, Kolkata, West Bengal 700073, India.
5
Klapproth C, Zötzsche S, Kühnl F, Fallmann J, Stadler P, Findeiß S. Tailored machine learning models for functional RNA detection in genome-wide screens. NAR Genom Bioinform 2023; 5:lqad072. [PMID: 37608800 PMCID: PMC10440787 DOI: 10.1093/nargab/lqad072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 06/28/2023] [Accepted: 07/30/2023] [Indexed: 08/24/2023] Open
Abstract
The in silico prediction of non-coding and protein-coding genetic loci has received considerable attention in comparative genomics aiming in particular at the identification of properties of nucleotide sequences that are informative of their biological role in the cell. We present here a software framework for the alignment-based training, evaluation and application of machine learning models with user-defined parameters. Instead of focusing on the one-size-fits-all approach of pervasive in silico annotation pipelines, we offer a framework for the structured generation and evaluation of models based on arbitrary features and input data, focusing on stable and explainable results. Furthermore, we showcase the usage of our software package in a full-genome screen of Drosophila melanogaster and evaluate our results against the well-known but much less flexible program RNAz.
Affiliation(s)
- Christopher Klapproth
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- ScaDS.AI Leipzig (Center for Scalable Data Analytics and Artificial Intelligence), Humboldtstraße 25, D-04105 Leipzig, Germany
- Siegfried Zötzsche
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Felix Kühnl
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Jörg Fallmann
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Peter F Stadler
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
- University of Vienna, Institute for Theoretical Chemistry, Währingerstraße 17, A-1090 Vienna, Austria
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe NM 87501, USA
- Universidad Nacional de Colombia, Facultad de Ciencias, Bogotá, D.C., Colombia
- Sven Findeiß
- Leipzig University, Department of Computer Science and Interdisciplinary Center of Bioinformatics, Bioinformatics Group, Härtelstrasse 16-18, D-04107 Leipzig, Germany
6
Zhang N, Xu K, Liu S, Yan R, Liu Z, Wu Y, Peng Y, Zhang X, Yukawa Y, Wu J. RNA Polymerase III-Dependent BoNR8 and AtR8 lncRNAs Contribute to Hypocotyl Elongation in Response to Light and Abscisic Acid. PLANT & CELL PHYSIOLOGY 2023; 64:646-659. [PMID: 36961744 DOI: 10.1093/pcp/pcad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 03/24/2023] [Indexed: 06/16/2023]
Abstract
Hypocotyl elongation is inhibited by light and promoted by darkness. The plant hormone abscisic acid (ABA) also inhibits hypocotyl elongation. However, details of the molecular mechanism that regulates the integrated effects of light and ABA signaling on hypocotyl elongation remain unclear. Long non-coding RNAs (lncRNAs; >200 nt) do not encode proteins but play many physiological roles in organisms. Until now, only a few lncRNAs related to hypocotyl elongation have been reported. The lncRNAs BoNR8 (272 nt) and AtR8 (259 nt), both of which are transcribed by RNA polymerase III, are homologous lncRNAs that are abundantly present in cabbage and Arabidopsis, respectively. These lncRNAs shared 77% sequence identity, and their predicted RNA secondary structures were similar; the non-conserved nucleotides in both sequences were positioned mainly in the stem-loop regions of the secondary structures. A previous study showed that BoNR8 regulated seed germination along with ABA and that AtR8 may be involved in innate immune function in Arabidopsis. Our results show that the expression levels of BoNR8 and AtR8 were differentially affected by light and ABA and that overexpression (OX) of both BoNR8 and AtR8 in Arabidopsis regulated hypocotyl elongation depending on light and ABA. The expression levels of light-related genes PHYB, COP1, HY5 and PIF4 and ABA-related genes ABI3 and ABI5 were altered in the AtR8-OX and BoNR8-OX lines, and, in an ABI3-defective mutant, hypocotyl elongation was greatly increased under dark conditions with the addition of ABA. These results indicate that BoNR8 and AtR8 regulate hypocotyl elongation together with ABI3 and key downstream light signaling genes.
Affiliation(s)
- Nan Zhang
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Kai Xu
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Shengyi Liu
- Department of Immunology, Nagoya University Graduate School of Medicine, Nagoya, 466- 850 Japan
- Rong Yan
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Ziguang Liu
- Key Laboratory of Combining Farming and Animal Husbandry, Institute of Animal Husbandry of Heilongjiang Academy of Agricultural Sciences, Ministry of Agriculture and Rural Affairs, Harbin 150040, China
- Ying Wu
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Yifang Peng
- College of Life Science, Agriculture and Forestry, Qiqihar University, Qiqihar, Heilongjiang 161006, China
- Xiaoxu Zhang
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Yasushi Yukawa
- Graduate School of Science, Nagoya City University, Nagoya, 467-8501 Japan
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
- Juan Wu
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education, College of Life Sciences, Northeast Forestry University, Harbin 150040, China
7
Mattick JS, Amaral PP, Carninci P, Carpenter S, Chang HY, Chen LL, Chen R, Dean C, Dinger ME, Fitzgerald KA, Gingeras TR, Guttman M, Hirose T, Huarte M, Johnson R, Kanduri C, Kapranov P, Lawrence JB, Lee JT, Mendell JT, Mercer TR, Moore KJ, Nakagawa S, Rinn JL, Spector DL, Ulitsky I, Wan Y, Wilusz JE, Wu M. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat Rev Mol Cell Biol 2023; 24:430-447. [PMID: 36596869 PMCID: PMC10213152 DOI: 10.1038/s41580-022-00566-8] [Citation(s) in RCA: 355] [Impact Index Per Article: 355.0] [Accepted: 11/16/2022] [Indexed: 01/05/2023]
Abstract
Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term 'lncRNAs' encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRNA classification and annotation difficult. Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function. In this Consensus Statement, we address the definition and nomenclature of lncRNAs and their conservation, expression, phenotypic visibility, structure and functions. We also discuss research challenges and provide recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.
Affiliation(s)
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW, Australia.
- UNSW RNA Institute, UNSW, Sydney, NSW, Australia.
- Paulo P Amaral
- INSPER Institute of Education and Research, São Paulo, Brazil
- Piero Carninci
- RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Human Technopole, Milan, Italy
- Susan Carpenter
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
- Howard Y Chang
- Center for Personal Dynamics Regulomes, Stanford University School of Medicine, Stanford, CA, USA
- Department of Dermatology, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
- Ling-Ling Chen
- CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Runsheng Chen
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Caroline Dean
- John Innes Centre, Norwich Research Park, Norwich, UK
- Marcel E Dinger
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW, Australia
- UNSW RNA Institute, UNSW, Sydney, NSW, Australia
- Katherine A Fitzgerald
- Division of Innate Immunity, Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Mitchell Guttman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Tetsuro Hirose
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Maite Huarte
- Department of Gene Therapy and Regulation of Gene Expression, Center for Applied Medical Research, University of Navarra, Pamplona, Spain
- Institute of Health Research of Navarra, Pamplona, Spain
- Rory Johnson
- School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
- Chandrasekhar Kanduri
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Philipp Kapranov
- Institute of Genomics, School of Medicine, Huaqiao University, Xiamen, China
- Jeanne B Lawrence
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Jeannie T Lee
- Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Joshua T Mendell
- Howard Hughes Medical Institute, UT Southwestern Medical Center, Dallas, TX, USA
- Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX, USA
- Timothy R Mercer
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
- Kathryn J Moore
- Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
- Shinichi Nakagawa
- RNA Biology Laboratory, Faculty of Pharmaceutical Sciences, Hokkaido University, Sapporo, Japan
- John L Rinn
- Department of Biochemistry, University of Colorado Boulder, Boulder, CO, USA
- BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, USA
- Howard Hughes Medical Institute, University of Colorado Boulder, Boulder, CO, USA
- David L Spector
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
- Yue Wan
- Laboratory of RNA Genomics and Structure, Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Department of Biochemistry, National University of Singapore, Singapore, Singapore
- Jeremy E Wilusz
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX, USA
- Mian Wu
- Translational Research Institute, Henan Provincial People's Hospital, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
8
Mattick JS. RNA out of the mist. Trends Genet 2023; 39:187-207. [PMID: 36528415 DOI: 10.1016/j.tig.2022.11.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/26/2022] [Revised: 11/08/2022] [Accepted: 11/27/2022] [Indexed: 12/23/2022]
Abstract
RNA has long been regarded primarily as the intermediate between genes and proteins. It was a surprise then to discover that eukaryotic genes are mosaics of mRNA sequences interrupted by large tracts of transcribed but untranslated sequences, and that multicellular organisms also express many long 'intergenic' and antisense noncoding RNAs (lncRNAs). The identification of small RNAs that regulate mRNA translation and half-life did not disturb the prevailing view that animals and plant genomes are full of evolutionary debris and that their development is mainly supervised by transcription factors. Gathering evidence to the contrary involved addressing the low conservation, expression, and genetic visibility of lncRNAs, demonstrating their cell-specific roles in cell and developmental biology, and their association with chromatin-modifying complexes and phase-separated domains. The emerging picture is that most lncRNAs are the products of genetic loci termed 'enhancers', which marshal generic effector proteins to their sites of action to control cell fate decisions during development.
Affiliation(s)
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW 2052, Australia; UNSW RNA Institute, UNSW, Sydney, NSW 2052, Australia.
9
Walter Costa MB. Evolutionary Conservation of RNA Secondary Structure. Methods Mol Biol 2023; 2586:121-146. [PMID: 36705902 DOI: 10.1007/978-1-0716-2768-6_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Noncoding RNAs, ncRNAs, naturally fold into structures, which allow them to perform their functions in the cell. Evolutionarily close species share structures and functions. This occurs because of shared selective pressures, resulting in conserved groups. Previous efforts in finding functional RNAs have been made in detecting conserved structures in genomes or alignments. It may occur that, within a conserved group, species-specific structures arise after species split due to positive selection. Detecting positive selection in ncRNAs is a hard problem in biology as well as bioinformatics. To detect positive selection, one should find species-specific structures within a conserved set. This chapter provides protocols to detect and analyze positive selection in ncRNA structures with the SSS-test and other free software.
Affiliation(s)
- Maria Beatriz Walter Costa
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, University of Leipzig Medical Center, Leipzig, Germany
10
Haridevamuthu B, Guru A, Velayutham M, Snega Priya P, Arshad A, Arockiaraj J. Long non‐coding RNA, a supreme post‐transcriptional immune regulator of bacterial or virus‐driven immune evolution in teleost. REVIEWS IN AQUACULTURE 2023; 15:163-178. [DOI: 10.1111/raq.12709] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/23/2022] [Accepted: 06/18/2022] [Indexed: 10/16/2023]
Abstract
The global aquaculture boom, fuelled by a reduction in wild populations and the detection of novel viruses, has created a demanding market; hence, there is a pressing need to investigate the immune system of fish further. As the most diverse community of vertebrates and a central contributor to the progressing global aquaculture market, teleosts continue to draw vast scientific interest. Recent breakthroughs in multi‐omics technologies have provided a platform to understand the role of long non‐coding RNA (lncRNA) in the host immune system during infection. Emerging evidence shows that teleost lncRNA might have a regulatory role in immune responses, mostly through lncRNA–microRNA (miRNA) sponging. Teleost lncRNA shares a functionally active short sequence complementary to the target miRNA, which is conserved among several fish species. A recent report suggests that rhabdovirus exploits a lncRNA in teleost to dodge the host immune mechanism and negatively regulate the immune system. This observation reveals the essentiality of lncRNA in pathogen‐driven immunity in teleost. Reports available on the function of teleost lncRNA are still in early stages and experimental verifications are a limiting factor. Unravelling the lncRNA‐mediated immune regulation in fishes could be used against invading pathogens to strengthen aquaculture production. This review elaborates on the experimentally identified and functionally characterized lncRNA and its regulatory role in the teleost immune response during infection and pathogen‐driven host immune evolution, which could eventually lead to achieving high standards in aquaculture productivity.
Affiliation(s)
- B. Haridevamuthu
- Department of Biotechnology, College of Science and Humanities SRM Institute of Science and Technology Chennai Tamil Nadu India
- Ajay Guru
- Department of Biotechnology, College of Science and Humanities SRM Institute of Science and Technology Chennai Tamil Nadu India
- Manikandan Velayutham
- Department of Biotechnology, College of Science and Humanities SRM Institute of Science and Technology Chennai Tamil Nadu India
- P. Snega Priya
- Department of Biotechnology, College of Science and Humanities SRM Institute of Science and Technology Chennai Tamil Nadu India
- Aziz Arshad
- International Institute of Aquaculture and Aquatic Sciences (I‐AQUAS) Universiti Putra Malaysia Port Dickson Malaysia
- Jesu Arockiaraj
- Department of Biotechnology, College of Science and Humanities SRM Institute of Science and Technology Chennai Tamil Nadu India
11
Reinhardt F, Stadler PF. ExceS-A: an exon-centric split aligner. J Integr Bioinform 2022; 19:jib-2021-0040. [PMID: 35254744 PMCID: PMC9069663 DOI: 10.1515/jib-2021-0040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Accepted: 01/12/2022] [Indexed: 11/25/2022] Open
Abstract
Spliced alignments are a key step in the construction of high-quality homology-based annotations of protein sequences. The exon/intron structure, which is computed as part of spliced alignment procedures, often conveys important information for distinguishing paralogous members of gene families. Here we present an exon-centric pipeline for spliced alignment that is intended in particular for applications that involve exon-by-exon comparisons of coding sequences. We show that the simple, blat-based approach has advantages over established tools, in particular for genes with very large introns and for applications to fragmented genome assemblies.
Affiliation(s)
- Franziska Reinhardt
- Bioinformatics Group, Institute of Computer Science, Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
| | - Peter F Stadler
- Bioinformatics Group, Institute of Computer Science, Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany; Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany; Institute of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Colombia; Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
|
12
|
Chen L, Zhu QH. The evolutionary landscape and expression pattern of plant lincRNAs. RNA Biol 2022; 19:1190-1207. [PMID: 36382947 PMCID: PMC9673970 DOI: 10.1080/15476286.2022.2144609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Long intergenic non-coding RNAs (lincRNAs) are important regulators of cellular processes, including development and stress response. Many lincRNAs have been bioinformatically identified in plants, but their evolutionary dynamics and expression characteristics are still elusive. Here, we systematically identified thousands of lincRNAs in 26 plant species, including 6 non-flowering plants, investigated the conservation of the identified lincRNAs at different levels of plant lineages based on sequence and/or synteny homology, and explored the characteristics of the conserved lincRNAs during plant evolution and their co-expression relationship with protein-coding genes (PCGs). In addition to confirming features well documented in the literature for lincRNAs, such as species specificity, fewer exons, tissue-specific expression patterns and low expression levels, we revealed that histone modification signals and/or binding sites of transcription factors were enriched in the conserved lincRNAs, implying their biological functionality, as demonstrated by identifying conserved lincRNAs related to flower development in both the Brassicaceae and grass families and ancient lincRNAs potentially functioning in meristem development of non-flowering plants. Compared to PCGs, lincRNAs are more likely to be associated with transposable elements (TEs), but with different characteristics in different evolutionary lineages, for instance in the types of TEs and in the level of association, which varies among lincRNAs with different degrees of conservation. Together, these results provide a comprehensive view of the evolutionary landscape of plant lincRNAs and offer new insights into the conservation and functionality of plant lincRNAs.
Affiliation(s)
- Li Chen
- School of Life Sciences, Westlake University, Hangzhou, China; Institute for Biology, Plant Cell and Molecular Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Qian-Hao Zhu
- CSIRO Agriculture and Food, Canberra, ACT 2601, Australia
|
13
|
A Novel Regulatory Player in the Innate Immune System: Long Non-Coding RNAs. Int J Mol Sci 2021; 22:ijms22179535. [PMID: 34502451 PMCID: PMC8430513 DOI: 10.3390/ijms22179535] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/01/2021] [Revised: 08/30/2021] [Accepted: 08/31/2021] [Indexed: 12/12/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) represent crucial transcriptional and post-transcriptional gene regulators during antimicrobial responses in the host innate immune system. Studies have shown that lncRNAs are expressed in a highly tissue- and cell-specific manner and are involved in the differentiation and function of innate immune cells, as well as in inflammatory and antiviral processes, through versatile molecular mechanisms. These lncRNAs function via interactions with DNA, RNA, or protein, acting either in cis or in trans and relying on their specific sequences or on their transcription and processing. The dysregulation of lncRNA function is associated with various human non-infectious diseases, such as inflammatory bowel disease, cardiovascular diseases, and diabetes mellitus. Here, we provide an overview of the regulation and mechanisms of lncRNA function in the development and differentiation of innate immune cells, and during the activation or repression of innate immune responses. These insights might be beneficial for the development of therapeutic strategies targeting inflammatory and innate immune-mediated diseases.
|
14
|
Policarpo R, Sierksma A, De Strooper B, d'Ydewalle C. From Junk to Function: LncRNAs in CNS Health and Disease. Front Mol Neurosci 2021; 14:714768. [PMID: 34349622 PMCID: PMC8327212 DOI: 10.3389/fnmol.2021.714768] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 05/25/2021] [Accepted: 06/25/2021] [Indexed: 12/26/2022] Open
Abstract
Recent advances in RNA sequencing technologies helped to uncover the existence of tens of thousands of long non-coding RNAs (lncRNAs) that arise from the dark matter of the genome. These lncRNAs were originally thought to be transcriptional noise but an increasing number of studies demonstrate that these transcripts can modulate protein-coding gene expression by a wide variety of transcriptional and post-transcriptional mechanisms. The spatiotemporal regulation of lncRNA expression is particularly evident in the central nervous system, suggesting that they may directly contribute to specific brain processes, including neurogenesis and cellular homeostasis. Not surprisingly, lncRNAs are therefore gaining attention as putative novel therapeutic targets for disorders of the brain. In this review, we summarize the recent insights into the functions of lncRNAs in the brain, their role in neuronal maintenance, and their potential contribution to disease. We conclude this review by postulating how these RNA molecules can be targeted for the treatment of yet incurable neurological disorders.
Affiliation(s)
- Rafaela Policarpo
- VIB-KU Leuven Center For Brain & Disease Research, Leuven, Belgium; Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven, Leuven, Belgium; Neuroscience Discovery, Janssen Research & Development, Janssen Pharmaceutica N.V., Beerse, Belgium
| | - Annerieke Sierksma
- VIB-KU Leuven Center For Brain & Disease Research, Leuven, Belgium; Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven, Leuven, Belgium
| | - Bart De Strooper
- VIB-KU Leuven Center For Brain & Disease Research, Leuven, Belgium; Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven, Leuven, Belgium; UK Dementia Research Institute, University College London, London, United Kingdom
| | - Constantin d'Ydewalle
- Neuroscience Discovery, Janssen Research & Development, Janssen Pharmaceutica N.V., Beerse, Belgium
|
15
|
Zhang C, Niu K, Lian P, Hu Y, Shuai Z, Gao S, Ge S, Xu T, Xiao Q, Chen Z. Pathological Bases and Clinical Application of Long Noncoding RNAs in Cardiovascular Diseases. Hypertension 2021; 78:16-29. [PMID: 34058852 DOI: 10.1161/hypertensionaha.120.16752] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/24/2022]
Abstract
Increasing evidence suggests that noncoding RNAs (ncRNAs) have vital roles in cardiovascular tissue homeostasis and disease. As a main subgroup of ncRNAs, long ncRNAs (lncRNAs) have been reported to play important roles in lipid metabolism, inflammation, vascular injury, and angiogenesis. They have also been implicated in many human diseases, including atherosclerosis, arterial remodeling, hypertension, myocardial injury, cardiac remodeling, and heart failure. Importantly, lncRNAs have been reported to be dysregulated in the development and progression of cardiovascular diseases (CVDs). A variety of studies have demonstrated that lncRNAs can influence gene expression at the transcriptional, post-transcriptional, translational, and post-translational levels. In particular, emerging evidence has confirmed that the crosstalk among lncRNAs, mRNAs, and miRNAs is an important underlying regulatory mechanism of lncRNAs. Nevertheless, the biological functions and molecular mechanisms of lncRNAs in CVDs have not yet been fully explored. In this review, we comprehensively summarize the main findings about lncRNAs and CVDs, highlighting the most recent discoveries in the field of lncRNAs and their pathophysiological functions in CVDs, with the aim of dissecting the intrinsic association between lncRNAs and common risk factors of CVDs, including hypertension, high glucose, and high fat. Finally, we discuss the potential of lncRNAs as biomarkers, therapeutic targets, and specific diagnostic and prognostic indicators of CVDs.
Affiliation(s)
- Chengxin Zhang
- From the Department of Cardiovascular Surgery, First Affiliated Hospital of Anhui Medical University, P.R. China (C.Z., Z.S., S. Ge, Q.X.)
| | - Kaiyuan Niu
- Clinical Pharmacology, William Harvey Research Institute (WHRI), Barts and The London School of Medicine and Dentistry, Queen Mary University of London, United Kingdom (K.N., Q.X.)
- Department of Otolaryngology, the third affiliated hospital of Anhui Medical University, China (K.N.)
| | - Panpan Lian
- Center for Translational Medicine and Jiangsu Key Laboratory of Molecular Medicine, Medical School of Nanjing University, P.R. China (P.L.)
| | - Ying Hu
- Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, School of Pharmacy, Anhui Medical University, P.R. China (Y.H., T.X.)
| | - Ziqiang Shuai
- From the Department of Cardiovascular Surgery, First Affiliated Hospital of Anhui Medical University, P.R. China (C.Z., Z.S., S. Ge, Q.X.)
| | - Shan Gao
- Department of Pharmacology, Basic Medical College, Anhui Medical University, P.R. China (S. Gao, Q.X.)
| | - Shenglin Ge
- From the Department of Cardiovascular Surgery, First Affiliated Hospital of Anhui Medical University, P.R. China (C.Z., Z.S., S. Ge, Q.X.)
| | - Tao Xu
- Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, School of Pharmacy, Anhui Medical University, P.R. China (Y.H., T.X.)
| | - Qingzhong Xiao
- From the Department of Cardiovascular Surgery, First Affiliated Hospital of Anhui Medical University, P.R. China (C.Z., Z.S., S. Ge, Q.X.)
- Clinical Pharmacology, William Harvey Research Institute (WHRI), Barts and The London School of Medicine and Dentistry, Queen Mary University of London, United Kingdom (K.N., Q.X.)
- Department of Pharmacology, Basic Medical College, Anhui Medical University, P.R. China (S. Gao, Q.X.)
| | - Zhaolin Chen
- Division of Life Sciences and Medicine, Department of Pharmacy, The First Affiliated Hospital of USTC, University of Science and Technology of China, Anhui Provincial Hospital, P.R. China (Z.C.)
|
16
|
Decoding LncRNAs. Cancers (Basel) 2021; 13:cancers13112643. [PMID: 34072257 PMCID: PMC8199187 DOI: 10.3390/cancers13112643] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/07/2021] [Revised: 05/23/2021] [Accepted: 05/25/2021] [Indexed: 02/07/2023] Open
Abstract
Non-coding RNAs (ncRNAs) were long considered unimportant additions to the transcriptome. Yet, in light of numerous studies, it has become clear that ncRNAs play important roles in development, health and disease. Long ignored, long non-coding RNAs (lncRNAs), ncRNAs made of more than 200 nucleotides, have gained attention due to their involvement as drivers or suppressors of a myriad of tumours. The detailed understanding of some of their functions, structures and interactomes has been the result of interdisciplinary efforts, as in many cases new methods need to be created or adapted to characterise these molecules. Unlike most reviews on lncRNAs, we summarize the achievements in lncRNA studies by taking into consideration the approaches for the identification of lncRNA functions, interactomes, and structural arrangements. We also provide information about recent data on the involvement of lncRNAs in diseases and present applications of these molecules, especially in medicine.
|
17
|
Comparative genomics in the search for conserved long noncoding RNAs. Essays Biochem 2021; 65:741-749. [PMID: 33885137 PMCID: PMC8564735 DOI: 10.1042/ebc20200069] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/18/2020] [Revised: 02/15/2021] [Accepted: 03/15/2021] [Indexed: 12/23/2022]
Abstract
Long noncoding RNAs (lncRNAs) have emerged as prominent regulators of gene expression in eukaryotes. The identification of lncRNA orthologs is essential in efforts to decipher their roles across model organisms, as homologous genes tend to have similar molecular and biological functions. The relatively high sequence plasticity of lncRNA genes compared with protein-coding genes makes the identification of their orthologs a challenging task. This is why comparative genomics of lncRNAs requires the development of specific and, sometimes, complex approaches. Here, we briefly review current advancements and challenges associated with four levels of lncRNA conservation: genomic sequences, splicing signals, secondary structures and syntenic transcription.
|
18
|
Liau WS, Samaddar S, Banerjee S, Bredy TW. On the functional relevance of spatiotemporally-specific patterns of experience-dependent long noncoding RNA expression in the brain. RNA Biol 2021; 18:1025-1036. [PMID: 33397182 DOI: 10.1080/15476286.2020.1868165] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/25/2022] Open
Abstract
The majority of transcriptionally active RNA derived from the mammalian genome does not code for protein. Long noncoding RNA (lncRNA) is the most abundant form of noncoding RNA found in the brain and is involved in many aspects of cellular metabolism. Beyond their fundamental role in the nucleus as decoys for RNA-binding proteins associated with alternative splicing or as guides for the epigenetic regulation of protein-coding gene expression, recent findings indicate that activity-induced lncRNAs also regulate neural plasticity. In this review, we discuss how lncRNAs may exert molecular control over brain function beyond their known roles in the nucleus. We propose that subcellular localization is a critical feature of experience-dependent lncRNA activity in the brain, and that lncRNA-mediated control over RNA metabolism at the synapse serves to regulate local mRNA stability and translation, thereby influencing neuronal function, learning and memory.
Affiliation(s)
- Wei-Siang Liau
- Cognitive Neuroepigenetics Laboratory, Queensland Brain Institute, The University of Queensland, Brisbane, Australia
| | - Timothy W Bredy
- Cognitive Neuroepigenetics Laboratory, Queensland Brain Institute, The University of Queensland, Brisbane, Australia
|
19
|
Alzheimer-related genes show accelerated evolution. Mol Psychiatry 2021; 26:5790-5796. [PMID: 32203153 PMCID: PMC8758480 DOI: 10.1038/s41380-020-0680-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2017] [Revised: 01/27/2020] [Accepted: 02/03/2020] [Indexed: 12/27/2022]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disorder of unknown cause with complex genetic and environmental traits. While AD is extremely prevalent in elderly humans, it hardly occurs in non-primate mammals, and even non-human primates develop only an incomplete form of the disease. This specificity of AD to humans clearly implies a phylogenetic aspect. Still, the evolutionary dimension of the AD pathomechanism remains difficult to prove and has not been established so far. To analyze the evolutionary age and dynamics of AD-associated genes, we established the AD-associated genome-wide RNA profile comprising both protein-coding and non-protein-coding transcripts. We then applied a systematic analysis of the conservation of splice sites as a measure of gene structure, based on multiple alignments across vertebrates of homologs of AD-associated genes. Here, we show that nearly all AD-associated genes are evolutionarily old and did not originate later in evolution than non-AD-associated genes. However, the gene structures of loci that exhibit AD-associated changes in their expression evolve faster than the genome at large. While protein-coding loci exhibit an enhanced rate of small changes in gene structure, non-coding loci show even larger changes. The accelerated evolution of AD-associated genes indicates a more rapid functional adaptation of these genes. In particular, AD-associated non-coding genes play an important, as yet largely unexplored, role in AD. This phylogenetic trait indicates that recent adaptive evolution of the human brain is causally involved in basic principles of neurodegeneration. It highlights the necessity for a paradigmatic change in our disease concepts and for reconsidering the appropriateness of current animal models for developing disease-modifying strategies that can be translated to humans.
|
20
|
Abstract
Recent advances in sequencing technologies have uncovered the existence of thousands of long noncoding RNAs (lncRNAs) with dysregulated expression in cancer. As a result, there is burgeoning interest in understanding their function and biological significance in both homeostasis and disease. RNA interference (RNAi) enables sequence-specific gene silencing and can, in principle, be employed to silence virtually any gene. However, when applied to lncRNAs, it is important to consider current limitations in their annotation and current principles regarding lncRNA regulation and function when assessing their phenotype in cancer cell lines. In this chapter we describe the analysis of lncRNA splicing variant expression, including subcellular localization, transfection of siRNAs in cancer cell lines, and validation of gene silencing by quantitative PCR and single molecule in situ hybridization. All protocols can be performed in a laboratory with essential equipment for cell culture, molecular biology, and imaging.
|
21
|
Ramírez-Colmenero A, Oktaba K, Fernandez-Valverde SL. Evolution of Genome-Organizing Long Non-coding RNAs in Metazoans. Front Genet 2020; 11:589697. [PMID: 33329735 PMCID: PMC7734150 DOI: 10.3389/fgene.2020.589697] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/31/2020] [Accepted: 11/09/2020] [Indexed: 12/28/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) have important regulatory functions across eukarya. It is now clear that many of these functions are related to the regulation of gene expression through their capacity to recruit epigenetic modifiers and establish chromatin interactions. Several lncRNAs have recently been shown to participate in modulating chromatin within the spatial organization of the genome in the three-dimensional space of the nucleus. The identification of lncRNA candidates is challenging, as is their functional characterization. Conservation signatures of lncRNAs are different from those of protein-coding genes, making it difficult to identify lncRNAs under selection, and the homology between lncRNAs may not be readily apparent. Here, we review the evidence for these higher-order genome organization functions of lncRNAs in animals and the evolutionary signatures they display.
Affiliation(s)
- América Ramírez-Colmenero
- Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
| | - Katarzyna Oktaba
- Unidad Irapuato, Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
| | - Selene L Fernandez-Valverde
- Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
|
22
|
Abstract
Many small nucleolar RNAs and many of the hairpin precursors of miRNAs are processed from long non-protein-coding host genes. In contrast to their highly conserved and heavily structured payload, the host genes feature poorly conserved sequences. Nevertheless, there is mounting evidence that the host genes have biological functions beyond their primary task of carrying a ncRNA as payload. So far, no connections between the function of the host genes and the function of their payloads have been reported. Here we investigate whether there is evidence for an association of host gene function or mechanisms with the type of payload. To assess this hypothesis, we test whether the miRNA host genes (MIRHGs), snoRNA host genes (SNHGs), and other lncRNA host genes can be distinguished based on sequence and/or structure features unrelated to their payload. A positive answer would imply a functional and mechanistic correlation between host genes and their payload, provided the classification does not depend on the presence and type of the payload. A negative answer would indicate that, to the extent that secondary functions are acquired, they are not strongly constrained by the prior, primary function of the payload. We find that the three classes can be distinguished reliably when the classifier is allowed to extract features from the payloads. They become virtually indistinguishable, however, as soon as only the sequence and structure of parts of the host gene distal from the snoRNA or miRNA payload are used for classification. This indicates that the functions of MIRHGs and SNHGs are largely independent of the functions of their payloads. Furthermore, there is no evidence that the MIRHGs and SNHGs form coherent classes of long non-coding RNAs distinguished by features other than their payloads.
|
23
|
Target Enrichment Enables the Discovery of lncRNAs with Somatic Mutations or Altered Expression in Paraffin-Embedded Colorectal Cancer Samples. Cancers (Basel) 2020; 12:cancers12102844. [PMID: 33019720 PMCID: PMC7650602 DOI: 10.3390/cancers12102844] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/24/2020] [Revised: 09/20/2020] [Accepted: 09/23/2020] [Indexed: 12/25/2022] Open
Abstract
Simple Summary: Alterations in long noncoding RNAs and their mutations have been increasingly recognized in tumorigenesis and cancer progression, attracting particular interest as potential novel cancer biomarkers and therapeutic targets. The use of adjuvant chemotherapy in stage II colorectal cancer patients is challenging, and new biomarkers are required to identify patients with a high probability of relapse. We focused on the translational potential of non-coding RNAs in colorectal cancer. In this study, we aim to validate a new tool that couples target enrichment and RNAseq for transcriptomics studies of lncRNAs in formalin-fixed paraffin-embedded (FFPE) tissue samples. Our results show that this new approach efficiently detects lncRNAs and differences in their expression between healthy and tumor FFPE tissues, as well as somatic mutations in expressed lncRNAs, identifying novel lncRNAs as potential candidates for colorectal cancer. This new approach could represent a promising avenue that would reduce costs and enable more efficient translational research.
Abstract: Long non-coding RNAs (lncRNAs) play important roles in cancer and are potential new biomarkers or targets for therapy. However, given the low and tissue-specific expression of lncRNAs, linking these molecules to particular cancer types and processes through transcriptional profiling is challenging. Formalin-fixed, paraffin-embedded (FFPE) tissues are abundant resources for research but are prone to nucleic acid degradation, thereby complicating the study of lncRNAs. Here, we designed and validated a probe-based enrichment strategy to efficiently profile lncRNA expression in FFPE samples, and we applied it for the detection of lncRNAs associated with colorectal cancer (CRC). Our approach efficiently enriched targeted lncRNAs from FFPE samples, while preserving their relative abundance, and enabled the detection of tumor-specific mutations. We identified 379 lncRNAs differentially expressed between CRC tumors and matched healthy tissues and found tumor-specific lncRNA variants. Our results show that numerous lncRNAs are differentially expressed and/or accumulate variants in CRC tumors, thereby suggesting a role in CRC progression. More generally, our approach unlocks the study of lncRNAs in FFPE samples, thus enabling the retrospective use of abundant, well documented material available in hospital biobanks.
|
24
|
Genome-wide detection and sequence conservation analysis of long non-coding RNA during hair follicle cycle of yak. BMC Genomics 2020; 21:681. [PMID: 32998696 PMCID: PMC7528256 DOI: 10.1186/s12864-020-07082-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/24/2020] [Accepted: 09/18/2020] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: Long non-coding RNA (lncRNA), as an important regulator, has been shown to play an indispensable role in hair follicle (HF) growth. However, the function and expression profile of lncRNAs in the HF cycle of yak are as yet unknown. Only a few functional lncRNAs have been identified, partly due to the low sequence conservation and lack of identified conserved properties in lncRNAs. Here, lncRNA-seq was employed to detect the expression profile of lncRNAs during the HF cycle of yak, and the sequence conservation of two datasets between yak and cashmere goat during the HF cycle was analyzed.
RESULTS: A total of 2884 lncRNAs were identified in 5 phases (Jan., Mar., Jun., Aug., and Oct.) during the HF cycle of yak. Differential expression analysis between 3 phases (Jan., Mar., and Oct.) then yielded 198 differentially expressed lncRNAs (DELs) in the Oct.-vs-Jan. group, 280 DELs in the Jan.-vs-Mar. group, and 340 DELs in the Mar.-vs-Oct. group. Subsequently, the nearest genes of the lncRNAs were taken as potential target genes and used to explore the function of the DELs by GO and KEGG enrichment analysis. Several critical pathways involved in HF development were enriched, such as the Wnt signaling pathway, the VEGF signaling pathway, and signaling pathways regulating pluripotency of stem cells. To further screen key lncRNAs influencing the HF cycle, 24 DELs with differing degrees of sequence conservation were obtained via a comparative analysis of a subset of the DELs against previously published lncRNA-seq data of cashmere goat in the HF cycle using NCBI BLAST-2.9.0+, and 3 of these DELs were randomly selected for further detailed analysis of their sequence conservation properties.
CONCLUSIONS: This study revealed the expression pattern and potential function of lncRNAs during the HF cycle of yak, which expands the knowledge about the role of lncRNAs in the HF cycle. The findings related to sequence conservation properties of lncRNAs in the HF cycle between the two species may provide valuable insights into the study of lncRNA functionality and mechanism.
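The "nearest gene as potential target" step mentioned in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the `Feature` record, the function names, and the coordinates used with them are hypothetical, and distance is simply taken as the gap between the closest ends of two features on the same chromosome (0 when they overlap).

```python
from typing import NamedTuple, Optional, Sequence


class Feature(NamedTuple):
    """Hypothetical minimal genomic feature (1-based, inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


def gap(a: Feature, b: Feature) -> int:
    """Gap in bp between the closest ends of two features; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nearest_gene(lnc: Feature, genes: Sequence[Feature]) -> Optional[Feature]:
    """Return the gene on the same chromosome with the smallest gap to `lnc`.

    Returns None when no gene is annotated on that chromosome.
    """
    same_chrom = [g for g in genes if g.chrom == lnc.chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda g: gap(lnc, g))
```

A real analysis would more likely use an interval tool on the annotation files and might also consider strand or a maximum distance cutoff; the sketch ignores both.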
|
25
|
Xu B, Meng Y, Jin Y. RNA structures in alternative splicing and back-splicing. WILEY INTERDISCIPLINARY REVIEWS-RNA 2020; 12:e1626. [PMID: 32929887 DOI: 10.1002/wrna.1626] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/27/2020] [Revised: 08/14/2020] [Accepted: 08/22/2020] [Indexed: 12/12/2022]
Abstract
Alternative splicing greatly expands the transcriptomic and proteomic diversity related to physiological and developmental processes in higher eukaryotes. Splicing of long noncoding RNAs, as well as back- and trans-splicing, has further expanded the regulatory repertoire of alternative splicing. RNA structures have been shown to play an important role in regulating alternative splicing and back-splicing. The application of novel sequencing technologies has made it possible to identify genome-wide RNA structures and interaction networks, which might provide new insights into RNA splicing regulation from in vitro to in vivo. The emerging transcription-folding-splicing paradigm is changing our understanding of RNA alternative splicing regulation. Here, we review the insights into the roles and mechanisms of RNA structures in alternative splicing and back-splicing, as well as how disruption of these structures affects alternative splicing and then leads to human diseases. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems.
Affiliation(s)
- Bingbing Xu
- MOE Laboratory of Biosystems Homeostasis & Protection and Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Zhejiang, Hangzhou, China
| | - Yijun Meng
- College of Life and Environmental Sciences, Hangzhou Normal University, Zhejiang, Hangzhou, China
| | - Yongfeng Jin
- MOE Laboratory of Biosystems Homeostasis & Protection and Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Zhejiang, Hangzhou, China
|
26
|
Jones AN, Pisignano G, Pavelitz T, White J, Kinisu M, Forino N, Albin D, Varani G. An evolutionarily conserved RNA structure in the functional core of the lincRNA Cyrano. RNA (NEW YORK, N.Y.) 2020; 26:1234-1246. [PMID: 32457084 PMCID: PMC7430676 DOI: 10.1261/rna.076117.120] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/27/2020] [Accepted: 05/18/2020] [Indexed: 05/08/2023]
Abstract
The wide prevalence and regulated expression of long noncoding RNAs (lncRNAs) highlight their functional roles, but the molecular basis for their activities and structure-function relationships remains to be investigated, with few exceptions. Among the relatively few lncRNAs conserved over significant evolutionary distances is the long intergenic noncoding RNA (lincRNA) Cyrano (orthologous to human OIP5-AS1), which contains a region of 300 highly conserved nucleotides within tetrapods, which in turn contains a functional stretch of 26 nt of deep conservation. This region binds to and facilitates the degradation of the microRNA miR-7, a short ncRNA with multiple cellular functions, including modulation of oncogenic expression. We probed the secondary structure of Cyrano in vitro and in cells using chemical and enzymatic probing, and validated the results using comparative sequence analysis. At the center of the functional core of Cyrano is a cloverleaf structure maintained over the >400 million years of divergent evolution that separates fish and primates. This strikingly conserved motif provides interaction sites for several RNA-binding proteins and masks a conserved recognition site for miR-7. Conservation in this region strongly suggests that the function of Cyrano depends on the formation of this RNA structure, which could modulate the rate and efficiency of degradation of miR-7.
Affiliation(s)
- Alisha N Jones
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Giuseppina Pisignano
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
  - Tumor Biology and Experimental Therapeutics Program, Institute of Oncology Research (IOR) and Oncology Institute of Southern Switzerland (IOSI), Bellinzona CH-6500, Switzerland
  - Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, United Kingdom
- Thomas Pavelitz
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Jessica White
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Martin Kinisu
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Nicholas Forino
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Dreycey Albin
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
- Gabriele Varani
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195, USA
|
27
|
Ntini E, Marsico A. Functional impacts of non-coding RNA processing on enhancer activity and target gene expression. J Mol Cell Biol 2020; 11:868-879. [PMID: 31169884 PMCID: PMC6884709 DOI: 10.1093/jmcb/mjz047] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/14/2019] [Revised: 04/03/2019] [Accepted: 04/04/2019] [Indexed: 01/06/2023]
Abstract
Tight regulation of gene expression is orchestrated by enhancers. Through recent research advancements, it is becoming clear that enhancers are not solely distal regulatory elements harboring transcription factor binding sites and decorated with specific histone marks, but they rather display signatures of active transcription, showing distinct degrees of transcription unit organization. Thereby, a substantial fraction of enhancers give rise to different species of non-coding RNA transcripts with an unprecedented range of potential functions. In this review, we bring together data from recent studies indicating that non-coding RNA transcription from active enhancers, as well as enhancer-produced long non-coding RNA transcripts, may modulate or define the functional regulatory potential of the cognate enhancer. In addition, we summarize supporting evidence that RNA processing of the enhancer-associated long non-coding RNA transcripts may constitute an additional layer of regulation of enhancer activity, which contributes to the control and final outcome of enhancer-targeted gene expression.
Affiliation(s)
- Evgenia Ntini
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Free University Berlin, Berlin, Germany
- Annalisa Marsico
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Free University Berlin, Berlin, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München, München, Germany
|
28
|
Corona-Gomez JA, Garcia-Lopez IJ, Stadler PF, Fernandez-Valverde SL. Splicing conservation signals in plant long noncoding RNAs. RNA (NEW YORK, N.Y.) 2020; 26:784-793. [PMID: 32241834 PMCID: PMC7297117 DOI: 10.1261/rna.074393.119] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/17/2019] [Accepted: 03/28/2020] [Indexed: 05/12/2023]
Abstract
Long noncoding RNAs (lncRNAs) have recently emerged as prominent regulators of gene expression in eukaryotes. LncRNAs often drive the modification and maintenance of gene activation or gene silencing states via chromatin conformation rearrangements. In plants, lncRNAs have been shown to participate in gene regulation, and are essential to processes such as vernalization and photomorphogenesis. Despite their prominent functions, only a little over a dozen lncRNAs have been experimentally and functionally characterized. As in their animal counterparts, the rates of sequence divergence are much higher in plant lncRNAs than in protein-coding mRNAs, making it difficult to identify lncRNA conservation using traditional sequence comparison methods. Beyond this, little is known about the evolutionary patterns of lncRNAs in plants. Here, we characterized the splicing conservation of lncRNAs in Brassicaceae. We generated a whole-genome alignment of 16 Brassica species and used it to identify syntenic lncRNA orthologs. Using a scoring system trained on transcriptomes from A. thaliana and B. oleracea, we identified splice sites across the whole alignment and measured their conservation. Our analysis revealed that 17.9% (112/627) of all intergenic lncRNAs display splicing conservation in at least one exon, an estimate that is substantially higher than previous estimates of lncRNA conservation in this group. Our findings agree with similar studies in vertebrates, demonstrating that splicing conservation can be evidence of stabilizing selection. We provide conclusive evidence for the existence of evolutionarily deeply conserved lncRNAs in plants and describe a generally applicable computational workflow to identify functional lncRNAs in plants.
Affiliation(s)
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, University Leipzig, D-04107 Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University Leipzig, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, 11001 Sede Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, New Mexico 87501, USA
|
29
|
Dai Y, Ma W, Zhang T, Yang J, Zang C, Liu K, Wang X, Wang J, Wu Z, Zhang X, Li C, Li J, Wang X, Guo J, Li L. Long Noncoding RNA Expression Profiling During the Neuronal Differentiation of Glial Precursor Cells from Rat Dorsal Root Ganglia. BIOTECHNOL BIOPROC E 2020. [DOI: 10.1007/s12257-019-0317-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]
|
30
|
Abou Alezz M, Celli L, Belotti G, Lisa A, Bione S. GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation. Front Genet 2020; 11:488. [PMID: 32499820 PMCID: PMC7242645 DOI: 10.3389/fgene.2020.00488] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/13/2019] [Accepted: 04/20/2020] [Indexed: 12/16/2022]
Abstract
Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the regulatory mechanisms of long non-coding gene expression are still poorly understood. The characterization of the genomic features of lncRNAs is crucial to get insight into their function. In this study, we exploited recent annotations by GENCODE to characterize the genomic and splicing features of long non-coding genes in comparison with protein-coding ones, both in human and mouse. Our analysis highlighted differences between the two classes of genes in terms of their gene architecture. Significant differences in splice-site usage were observed between long non-coding and protein-coding genes (PCGs). While non-canonical GC-AG splice junctions represent about 0.8% of total splice sites in PCGs, we identified a significant enrichment of GC-AG splice sites in long non-coding genes, both in human (3.0%) and mouse (1.9%). In addition, we found a positional bias, with GC-AG splice sites enriched in the first intron in both classes of genes. Moreover, GC-AG introns were significantly shorter and had weaker donor and acceptor sites than GT-AG introns. Genes containing at least one GC-AG intron were found to be conserved in many species and more prone to alternative splicing, and a functional analysis pointed toward their enrichment in specific biological processes such as DNA repair. Our study shows for the first time that GC-AG introns are mainly associated with lncRNAs and are preferentially located in the first intron. Additionally, we discovered their regulatory potential, indicating the existence of a new mechanism regulating the expression of non-coding genes and PCGs.
Affiliation(s)
- Monah Abou Alezz
  - Computational Biology Unit, Institute of Molecular Genetics Luigi Luca Cavalli-Sforza, National Research Council, Pavia, Italy
- Ludovica Celli
  - Computational Biology Unit, Institute of Molecular Genetics Luigi Luca Cavalli-Sforza, National Research Council, Pavia, Italy
- Giulia Belotti
  - Computational Biology Unit, Institute of Molecular Genetics Luigi Luca Cavalli-Sforza, National Research Council, Pavia, Italy
- Antonella Lisa
  - Computational Biology Unit, Institute of Molecular Genetics Luigi Luca Cavalli-Sforza, National Research Council, Pavia, Italy
- Silvia Bione
  - Computational Biology Unit, Institute of Molecular Genetics Luigi Luca Cavalli-Sforza, National Research Council, Pavia, Italy
|
31
|
Rusconi F, Battaglioli E, Venturin M. Psychiatric Disorders and lncRNAs: A Synaptic Match. Int J Mol Sci 2020; 21:ijms21093030. [PMID: 32344798 PMCID: PMC7246907 DOI: 10.3390/ijms21093030] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/23/2020] [Revised: 04/15/2020] [Accepted: 04/21/2020] [Indexed: 12/15/2022]
Abstract
Psychiatric disorders represent a heterogeneous class of multifactorial mental diseases whose origin entails a pathogenic integration of genetic and environmental influences. Incidence of these pathologies is dangerously high, as more than 20% of the Western population is affected. Despite the diverse origins of specific molecular dysfunctions, these pathologies entail disruption of fine synaptic regulation, which is fundamental to behavioral adaptation to the environment. The synapses, as functional units of cognition, represent major evolutionary targets. Consistently, fine synaptic tuning occurs at several levels, involving a novel class of molecular regulators known as long non-coding RNAs (lncRNAs). Non-coding RNAs operate mainly in mammals as epigenetic modifiers and enhancers of proteome diversity. The prominent evolutionary expansion of the gene number of lncRNAs in mammals, particularly in primates and humans, and their preferential neuronal expression does represent a driving force that enhanced the layering of synaptic control mechanisms. In the last few years, remarkable alterations of the expression of lncRNAs have been reported in psychiatric conditions such as schizophrenia, autism, and depression, suggesting unprecedented mechanistic insights into disruption of fine synaptic tuning underlying severe behavioral manifestations of psychosis. In this review, we integrate literature data from rodent pathological models and human evidence that proposes the biology of lncRNAs as a promising field of neuropsychiatric investigation.
Affiliation(s)
- Francesco Rusconi
  - Correspondence: (F.R.); (M.V.); Tel.: +39-02-503-30445 (F.R.); +39-02-503-30443 (M.V.)
- Marco Venturin
  - Correspondence: (F.R.); (M.V.); Tel.: +39-02-503-30445 (F.R.); +39-02-503-30443 (M.V.)
|
32
|
Abstract
Long non-coding RNAs (lncRNAs) represent a major fraction of the transcriptome in multicellular organisms. Although a handful of well-studied lncRNAs are broadly recognized as biologically meaningful, the fraction of such transcripts out of the entire collection of lncRNAs remains a subject of vigorous debate. Here we review the evidence for and against biological functionalities of lncRNAs and attempt to arrive at potential modes of lncRNA functionality that would reconcile the contradictory conclusions. Finally, we discuss different strategies of phenotypic analyses that could be used to investigate such modes of lncRNA functionality.
Affiliation(s)
- Fan Gao
  - Institute of Genomics, School of Biomedical Sciences, Huaqiao University, 201 Pan-Chinese S & T Building, 668 Jimei Road, Xiamen, 361021, China
- Ye Cai
  - Institute of Genomics, School of Biomedical Sciences, Huaqiao University, 201 Pan-Chinese S & T Building, 668 Jimei Road, Xiamen, 361021, China
- Philipp Kapranov
  - Institute of Genomics, School of Biomedical Sciences, Huaqiao University, 201 Pan-Chinese S & T Building, 668 Jimei Road, Xiamen, 361021, China
- Dongyang Xu
  - Institute of Genomics, School of Biomedical Sciences, Huaqiao University, 201 Pan-Chinese S & T Building, 668 Jimei Road, Xiamen, 361021, China
|
33
|
Budak H, Kaya SB, Cagirici HB. Long Non-coding RNA in Plants in the Era of Reference Sequences. FRONTIERS IN PLANT SCIENCE 2020; 11:276. [PMID: 32226437 PMCID: PMC7080850 DOI: 10.3389/fpls.2020.00276] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 12/06/2018] [Accepted: 02/21/2020] [Indexed: 05/04/2023]
Abstract
The discovery of non-coding RNAs (ncRNAs), and the subsequent elucidation of their functional roles, was largely delayed due to the misidentification of non-protein-coding parts of DNA as "junk DNA," which forced ncRNAs into the shadows of their protein-coding counterparts. However, over the past decade, insight into the important regulatory roles of ncRNAs has led to rapid progress in their identification and characterization. Of the different types of ncRNAs, long non-coding RNAs (lncRNAs) have attracted considerable attention due to their mRNA-like structures and gene regulatory functions in plant stress responses. While RNA sequencing has been commonly used for mining lncRNAs, a lack of widespread conservation at the sequence level, together with relatively low and highly tissue-specific expression patterns, challenges high-throughput in silico identification approaches. The complex folding characteristics of lncRNA molecules also complicate target predictions, as the knowledge about the interaction interfaces between lncRNAs and potential targets is insufficient. Progress in characterizing lncRNAs and their targets from different species may hold the key to efficient identification of this class of ncRNAs from transcriptomic and potentially genomic resources. In wheat and barley, two of the most important crops, the knowledge about lncRNAs is very limited. However, recently published high-quality genomes of these crops are considered promising resources for the identification of not only lncRNAs, but any class of molecules. Considering the increasing demand for food, these resources should be used efficiently to discover the molecular mechanisms underlying development and a/biotic stress responses. As our understanding of lncRNAs expands, interactions among ncRNA classes, as well as interactions with the coding sequences, will likely define novel functional networks that may be modulated for crop improvement.
Affiliation(s)
- Hikmet Budak
  - Montana BioAgriculture, Inc., Bozeman, MT, United States
  - *Correspondence: Hikmet Budak,
- Sezgi Biyiklioglu Kaya
  - Engineering and Natural Sciences, Molecular Biology, Genetics and Bioengineering Program, Sabancı University, Istanbul, Turkey
- Halise Busra Cagirici
  - Engineering and Natural Sciences, Molecular Biology, Genetics and Bioengineering Program, Sabancı University, Istanbul, Turkey
|
34
|
Emerging Roles of Long Non-Coding RNAs as Drivers of Brain Evolution. Cells 2019; 8:cells8111399. [PMID: 31698782 PMCID: PMC6912723 DOI: 10.3390/cells8111399] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 10/08/2019] [Revised: 11/01/2019] [Accepted: 11/03/2019] [Indexed: 01/09/2023]
Abstract
Mammalian genomes encode tens of thousands of long-noncoding RNAs (lncRNAs), which are capable of interactions with DNA, RNA and protein molecules, thereby enabling a variety of transcriptional and post-transcriptional regulatory activities. Strikingly, about 40% of lncRNAs are expressed specifically in the brain with precisely regulated temporal and spatial expression patterns. In stark contrast to the highly conserved repertoire of protein-coding genes, thousands of lncRNAs have newly appeared during primate nervous system evolution with hundreds of human-specific lncRNAs. Their evolvable nature and the myriad of potential functions make lncRNAs ideal candidates for drivers of human brain evolution. The human brain displays the largest relative volume of any animal species and the most remarkable cognitive abilities. In addition to brain size, structural reorganization and adaptive changes represent crucial hallmarks of human brain evolution. lncRNAs are increasingly reported to be involved in neurodevelopmental processes suggested to underlie human brain evolution, including proliferation, neurite outgrowth and synaptogenesis, as well as in neuroplasticity. Hence, evolutionary human brain adaptations are proposed to be essentially driven by lncRNAs, which will be discussed in this review.
|
35
|
Walter Costa MB, Höner Zu Siederdissen C, Dunjić M, Stadler PF, Nowick K. SSS-test: a novel test for detecting positive selection on RNA secondary structure. BMC Bioinformatics 2019; 20:151. [PMID: 30898084 PMCID: PMC6429701 DOI: 10.1186/s12859-019-2711-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/19/2018] [Accepted: 03/03/2019] [Indexed: 12/23/2022]
Abstract
Background: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. Comprehensive assessments of selection acting on RNAs therefore must also encompass structure. Selection pressures acting on the structure of non-coding genes can be detected within multiple sequence alignments. Approaches of this type, however, have so far focused on negative selection. Thus, a computational method for identifying ncRNAs under positive selection is needed.
Results: We introduce the SSS-test (test for Selection on Secondary Structure) to identify positive selection and thus adaptive evolution. Benchmarks with biological as well as synthetic controls yield coherent signals for both negative and positive selection, demonstrating the functionality of the test. A survey of a lncRNA collection comprising 15,443 families resulted in 110 candidates that appear to be under positive selection in human. In 26 lncRNAs that have been associated with psychiatric disorders we identified local structures that have signs of positive selection in the human lineage.
Conclusions: It is feasible to assay positive selection acting on RNA secondary structures on a genome-wide scale. The detection of human-specific positive selection in lncRNAs associated with cognitive disorder provides a set of candidate genes for further experimental testing and may provide insights into the evolution of cognitive abilities in humans.
Availability: The SSS-test and related software is available at: https://github.com/waltercostamb/SSS-test. The databases used in this work are available at: http://www.bioinf.uni-leipzig.de/Software/SSS-test/.
Affiliation(s)
- Maria Beatriz Walter Costa
  - Embrapa Agroenergia, Parque Estação Biológica (PqEB), Asa Norte, Brasília, DF, 70770-901, Brazil
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, 04107, Germany
- Christian Höner Zu Siederdissen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, 04107, Germany
- Marko Dunjić
  - Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195, Germany
  - Center for Human Molecular Genetics, Faculty of Biology, University of Belgrade, Studentski trg 16, PO box 43, Belgrade, 11000, Serbia
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, 04107, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig & Competence Center for Scalable Data Services and Solutions Dresden-Leipzig & Leipzig Research Center for Civilization Diseases, University Leipzig, Leipzig, 04107, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, 04103, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, Vienna, A-1090, Austria
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870, Denmark
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, Bogotá, D.C., COL-111321, Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Katja Nowick
  - Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195, Germany
  - TFome Research Group, Bioinformatics Group, Interdisciplinary Center of Bioinformatics, Department of Computer Science, University of Leipzig, Härtelstraße 16-18, Leipzig, 04107, Germany
  - Paul-Flechsig-Institute for Brain Research, University of Leipzig, Liebigstraße 19, Haus C, Leipzig, 04103, Germany
  - Bioinformatics, Faculty of Agricultural Sciences, Institute of Animal Science, University of Hohenheim, Garbenstraße 13, Stuttgart, 70593, Germany
|
36
|
Pegueroles C, Iraola-Guzmán S, Chorostecki U, Ksiezopolska E, Saus E, Gabaldón T. Transcriptomic analyses reveal groups of co-expressed, syntenic lncRNAs in four species of the genus Caenorhabditis. RNA Biol 2019; 16:320-329. [PMID: 30691342 PMCID: PMC6380332 DOI: 10.1080/15476286.2019.1572438] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/29/2018] [Revised: 12/18/2018] [Accepted: 01/13/2019] [Indexed: 01/24/2023]
Abstract
Long non-coding RNAs (lncRNAs) are a heterogeneous class of genes that do not code for proteins. Since lncRNAs (or a fraction thereof) are expected to be functional, many efforts have been dedicated to cataloging lncRNAs in numerous organisms, but our knowledge of lncRNAs in non-vertebrate species remains very limited. Here, we annotated lncRNAs using transcriptomic data from the same larval stage of four Caenorhabditis species. The number of annotated lncRNAs in self-fertile nematodes was lower than in out-crossing species. We used a combination of approaches to identify putatively homologous lncRNAs: synteny, sequence conservation, and structural conservation. We classified a total of 1,532 out of 7,635 genes from the four species into families of lncRNAs with conserved synteny and expression at the larval stage, suggesting that a large fraction of the predicted lncRNAs may be species specific. Although both sequence and local secondary structure seem to be poorly conserved, sequences within families frequently shared BLASTn hits and short sequence motifs, which were more likely to be unpaired in the predicted structures. We provide the first multi-species catalog of lncRNAs in nematodes and identify groups of lncRNAs with conserved synteny and expression that share exposed motifs.
Affiliation(s)
- Cinta Pegueroles
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Susana Iraola-Guzmán
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Uciel Chorostecki
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Ewa Ksiezopolska
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Ester Saus
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Toni Gabaldón
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
37
|
Springer MS, Emerling CA, Gatesy J, Randall J, Collin MA, Hecker N, Hiller M, Delsuc F. Odontogenic ameloblast-associated (ODAM) is inactivated in toothless/enamelless placental mammals and toothed whales. BMC Evol Biol 2019; 19:31. [PMID: 30674270 PMCID: PMC6343362 DOI: 10.1186/s12862-019-1359-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 11/18/2018] [Accepted: 01/11/2019] [Indexed: 11/10/2022]
Abstract
Background: The gene for odontogenic ameloblast-associated (ODAM) is a member of the secretory calcium-binding phosphoprotein gene family. ODAM is primarily expressed in dental tissues including the enamel organ and the junctional epithelium, and may also have pleiotropic functions that are unrelated to teeth. Here, we leverage the power of natural selection to test competing hypotheses that ODAM is tooth-specific versus pleiotropic. Specifically, we compiled and screened complete protein-coding sequences, plus sequences for flanking intronic regions, for ODAM in 165 placental mammals to determine if this gene contains inactivating mutations in lineages that either lack teeth (baleen whales, pangolins, anteaters) or lack enamel on their teeth (aardvarks, sloths, armadillos), as would be expected if the only essential functions of ODAM are related to tooth development and the adhesion of the gingival junctional epithelium to the enamel tooth surface.
Results: We discovered inactivating mutations in all species of placental mammals that either lack teeth or lack enamel on their teeth. A surprising result is that ODAM is also inactivated in a few additional lineages including all toothed whales that were examined. We hypothesize that ODAM inactivation is related to the simplified outer enamel surface of toothed whales. An alternate hypothesis is that ODAM inactivation in toothed whales may be related to altered antimicrobial functions of the junctional epithelium in aquatic habitats. Selection analyses on ODAM sequences revealed that the composite dN/dS value for pseudogenic branches is close to 1.0 as expected for a neutrally evolving pseudogene. dN/dS values on transitional branches were used to estimate ODAM inactivation times. In the case of pangolins, ODAM was inactivated ~65 million years ago, which is older than the oldest pangolin fossil (Eomanis, 47 Ma) and suggests an even more ancient loss or simplification of teeth in this lineage.
Conclusion: Our results validate the hypothesis that the only essential functions of ODAM that are maintained by natural selection are related to tooth development and/or the maintenance of a healthy junctional epithelium that attaches to the enamel surface of teeth.
Affiliation(s)
- Mark S Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, 92521, USA
- Christopher A Emerling
  - Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
  - Department of Biology, Whittier College, Whittier, CA, 90602, USA
- John Gatesy
  - Division of Vertebrate Zoology and Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY, 10024, USA
- Jason Randall
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, 92521, USA
- Matthew A Collin
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, 92521, USA
- Nikolai Hecker
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Michael Hiller
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Frédéric Delsuc
  - Institut des Sciences de l'Évolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
|
38
|
Lorenzi L, Avila Cobos F, Decock A, Everaert C, Helsmoortel H, Lefever S, Verboom K, Volders PJ, Speleman F, Vandesompele J, Mestdagh P. Long noncoding RNA expression profiling in cancer: Challenges and opportunities. Genes Chromosomes Cancer 2019; 58:191-199. [PMID: 30461116 DOI: 10.1002/gcc.22709] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 07/03/2018] [Revised: 11/06/2018] [Accepted: 11/18/2018] [Indexed: 12/11/2022]
Abstract
In recent years, technological advances in transcriptome profiling revealed that the repertoire of human RNA molecules is more diverse and extended than originally thought. This diversity and complexity mainly derive from a large ensemble of noncoding RNAs. Because of their key roles in cellular processes important for normal development and physiology, disruption of noncoding RNA expression is intrinsically linked to human disease, including cancer. Therefore, studying the noncoding portion of the transcriptome offers the prospect of identifying novel therapeutic and diagnostic targets. Although evidence of the relevance of noncoding RNAs in cancer is accumulating, we still face many challenges when it comes to accurately profiling their expression levels. Some of these challenges are inherent to the technologies employed, whereas others are associated with characteristics of the noncoding RNAs themselves. In this review, we discuss the challenges related to long noncoding RNA expression profiling, highlight how cancer long noncoding RNAs provide new opportunities for cancer diagnosis and treatment, and reflect on future developments.
Affiliation(s)
- Lucía Lorenzi
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Francisco Avila Cobos
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Anneleen Decock
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Celine Everaert
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Hetty Helsmoortel
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Steve Lefever
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Karen Verboom
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Pieter-Jan Volders
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Frank Speleman
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Jo Vandesompele
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| | - Pieter Mestdagh
- Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium.,Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
| |
|
39
|
Kirsch R, Seemann SE, Ruzzo WL, Cohen SM, Stadler PF, Gorodkin J. Identification and characterization of novel conserved RNA structures in Drosophila. BMC Genomics 2018; 19:899. [PMID: 30537930 PMCID: PMC6288889 DOI: 10.1186/s12864-018-5234-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/22/2018] [Accepted: 11/08/2018] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Comparative genomics approaches have facilitated the discovery of many novel non-coding and structured RNAs (ncRNAs). The increasing availability of related genomes now makes it possible to systematically search for compensatory base changes, and thus for conserved secondary structures, even in genomic regions that are poorly alignable in the primary sequence. The wealth of available transcriptome data can add valuable insight into expression and possible function for new ncRNA candidates. Earlier work identifying ncRNAs in Drosophila melanogaster made use of sequence-based alignments and employed a sliding window approach, inevitably biasing identification toward RNAs encoded in the more conserved parts of the genome. RESULTS: To search for conserved RNA structures (CRSs) that may not be highly conserved in sequence and to assess the expression of CRSs, we conducted a genome-wide structural alignment screen of 27 insect genomes including D. melanogaster and integrated this with an extensive set of tiling array data. The structural alignment screen revealed ∼30,000 novel candidate CRSs at an estimated false discovery rate of less than 10%. With more than one quarter of all individual CRS motifs showing sequence identities below 60%, the predicted CRSs largely complement the findings of sliding window approaches applied previously. While a sixth of the CRSs were ubiquitously expressed, we found that most were expressed in specific developmental stages or cell lines. Notably, the most statistically significant enrichments of CRSs were observed in pupae, mainly in exons of untranslated regions, promoters, enhancers, and long ncRNAs. Interestingly, cell lines were found to express a different set of CRSs than were found in vivo. Only a small fraction of intergenic CRSs were co-expressed with the adjacent protein coding genes, which suggests that most intergenic CRSs are independent genetic units.
CONCLUSIONS: This study provides a more comprehensive view of the ncRNA transcriptome in fly as well as evidence for differential expression of CRSs during development and in cell lines.
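The idea of searching for compensatory base changes mentioned in this abstract can be illustrated with a small sketch. This is not the structural alignment screen used in the study; `covariation_support` is a hypothetical helper written for illustration. It assumes a gap-free toy alignment given as equal-length strings and two column indices that are proposed to base-pair.

```python
# Canonical Watson-Crick and G-U wobble pairs (RNA alphabet).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _rna(base):
    """Normalise one alignment symbol to upper-case RNA (T becomes U)."""
    return base.upper().replace("T", "U")


def covariation_support(alignment, i, j):
    """Summarise evidence that alignment columns i and j form a conserved pair.

    Returns (fraction_paired, n_pair_types, n_compensatory), where
    n_compensatory counts pairs of observed pair types that differ at BOTH
    positions (for example A-U versus G-C), i.e. compensatory changes.
    """
    if not alignment:
        return 0.0, 0, 0
    pairs = [(_rna(row[i]), _rna(row[j])) for row in alignment]
    paired = [p for p in pairs if p in CANONICAL]
    kinds = sorted(set(paired))
    compensatory = sum(
        1
        for a in range(len(kinds))
        for b in range(a + 1, len(kinds))
        if kinds[a][0] != kinds[b][0] and kinds[a][1] != kinds[b][1]
    )
    return len(paired) / len(pairs), len(kinds), compensatory


if __name__ == "__main__":
    toy = ["GAAAC", "AAAAU", "CAAAG", "GAAAA"]  # invented toy sequences
    print(covariation_support(toy, 0, 4))
```

A high paired fraction combined with several compensatory pair types is the kind of signal that points to a conserved structure even when the primary sequence has diverged.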
Affiliation(s)
- Rebecca Kirsch
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- Department of Veterinary and Animal Science, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, Leipzig, D-04107 Germany
| | - Stefan E. Seemann
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- Department of Veterinary and Animal Science, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
| | - Walter L. Ruzzo
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- School of Computer Science and Engineering, University of Washington, Box 352350, Seattle, 98195-2350 WA USA
- Department of Genome Sciences, University of Washington, Box 355065, Seattle, 98195-5065 WA USA
- Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Seattle, 98109-1024 WA USA
| | - Stephen M. Cohen
- Department of Cellular and Molecular Medicine, University of Copenhagen, Blegdamsvej 3, Copenhagen N, DK-2200 Denmark
| | - Peter F. Stadler
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, Leipzig, D-04107 Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103 Germany
- Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, Bogotá, COL-111321 D.C. Colombia
- Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, Vienna, A-1090 Austria
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501 USA
| | - Jan Gorodkin
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
- Department of Veterinary and Animal Science, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
| |
|
41
|
Peng Y, Chang L, Wang Y, Wang R, Hu L, Zhao Z, Geng L, Liu Z, Gong Y, Li J, Li X, Zhang C. Genome-wide differential expression of long noncoding RNAs and mRNAs in ovarian follicles of two different chicken breeds. Genomics 2018; 111:1395-1403. [PMID: 30268779 DOI: 10.1016/j.ygeno.2018.09.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/28/2018] [Revised: 08/23/2018] [Accepted: 09/17/2018] [Indexed: 01/27/2023]
Abstract
Bashang long-tail chickens are an indigenous dual-purpose breed in China (meat and eggs) but have low egg laying performance. To improve this performance, a genome-wide analysis of mRNAs and long noncoding RNAs (lncRNAs) from Bashang long-tail chickens and Hy-Line brown layers was performed. A total of 16,354 mRNAs and 8691 lncRNAs were obtained from ovarian follicles. Between the breeds, 160 mRNAs and 550 lncRNAs were found to be significantly differentially expressed. Integrated network analysis suggested that some differentially expressed genes were involved in ovarian follicular development through oocyte meiosis, progesterone-mediated oocyte maturation, and the cell cycle. The impact of lncRNAs on cis and trans target genes was also analyzed, indicating that some lncRNAs may play important roles in ovarian follicular development. The current results provide a catalog of chicken ovarian follicular lncRNAs and genes for further study to understand their roles in the regulation of egg laying performance.
Affiliation(s)
- Yongdong Peng
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Li Chang
- College of Animal Science and Technology, Agricultural University of Hebei Province, Baoding 071001, Hebei, People's Republic of China; Qinhuangdao Animal Disease Control Center, Qinhuangdao 066001, Hebei, People's Republic of China
| | - Yaqi Wang
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Ruining Wang
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Lulu Hu
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Ziya Zhao
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Liying Geng
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Zhengzhu Liu
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Yuanfang Gong
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Jingshi Li
- College of Life Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China
| | - Xianglong Li
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China.
| | - Chuansheng Zhang
- College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, Hebei, People's Republic of China.
| |
|
42
|
Gärtner F, Höner zu Siederdissen C, Müller L, Stadler PF. Coordinate systems for supergenomes. Algorithms Mol Biol 2018; 13:15. [PMID: 30258487 PMCID: PMC6151955 DOI: 10.1186/s13015-018-0133-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/14/2017] [Accepted: 09/07/2018] [Indexed: 01/05/2023]
Abstract
Background: Genome sequences and genome annotation data have become available at ever increasing rates in response to the rapid progress in sequencing technologies. As a consequence, the demand for methods supporting comparative, evolutionary analysis is also growing. In particular, efficient tools to visualize -omics data simultaneously for multiple species are sorely lacking. A first and crucial step in this direction is the construction of a common coordinate system. Since genomes differ not only by rearrangements but also by large insertions, deletions, and duplications, the use of a single reference genome is insufficient, in particular when the number of species becomes large. Results: The computational problem then becomes to determine an order and orientations of optimal local alignments that are as co-linear as possible with all the genome sequences. We first review the most prominent approaches to model the problem formally and then proceed to show that it can be phrased as a particular variant of the Betweenness Problem. It is NP-hard in general. As exact solutions are beyond reach for the problem sizes of practical interest, we introduce a collection of heuristic simplifiers to resolve ordering conflicts. Conclusion: Benchmarks on real-life data ranging from bacterial to fly genomes demonstrate the feasibility of computing good common coordinate systems.
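To make the Betweenness Problem mentioned in this abstract concrete, here is a minimal sketch of its textbook form: given triples (a, b, c), find a linear order in which b lies between a and c for as many triples as possible. This is an assumption-laden illustration, not the particular variant or the heuristic simplifiers developed in the paper; `satisfied` and `best_order` are hypothetical names, and the brute-force search is only feasible for a handful of items, which is consistent with the abstract's remark that exact solutions are out of reach at practical sizes.

```python
from itertools import permutations


def satisfied(order, triples):
    """Count triples (a, b, c) for which b lies between a and c in `order`."""
    pos = {item: k for k, item in enumerate(order)}
    return sum(
        1
        for a, b, c in triples
        if pos[a] < pos[b] < pos[c] or pos[c] < pos[b] < pos[a]
    )


def best_order(items, triples):
    """Exhaustively search all orders; runtime grows factorially with len(items)."""
    return max(permutations(items), key=lambda order: satisfied(order, triples))


if __name__ == "__main__":
    # Toy instance: four alignment blocks with two betweenness constraints.
    blocks = ["A", "B", "C", "D"]
    constraints = [("A", "B", "C"), ("B", "C", "D")]
    order = best_order(blocks, constraints)
    print(order, satisfied(order, constraints))
```

Because the search space is factorial in the number of blocks, any realistic instance needs heuristics rather than enumeration.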
|
43
|
Identification and functional analysis of long non-coding RNAs in human and mouse early embryos based on single-cell transcriptome data. Oncotarget 2016; 7:61215-61228. [PMID: 27542205 PMCID: PMC5308646 DOI: 10.18632/oncotarget.11304] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 01/21/2016] [Accepted: 08/08/2016] [Indexed: 11/25/2022]
Abstract
Epigenetic regulation has an important role in fertilization and proper embryonic development, and several human diseases are associated with epigenetic modification disorders, such as Rett syndrome, Beckwith-Wiedemann syndrome and Angelman syndrome. However, the dynamics and functions of long non-coding RNAs (lncRNAs), one type of epigenetic regulator, in human pre-implantation development have not yet been demonstrated. In this study, a comprehensive analysis of human and mouse early-stage embryonic lncRNAs was performed based on public single-cell RNA sequencing data. Expression profile analysis revealed that lncRNAs are expressed in a developmental stage-specific manner during human early-stage embryonic development, whereas a more temporal-specific expression pattern was identified in mouse embryos. Weighted gene co-expression network analysis suggested that lncRNAs involved in human early-stage embryonic development are associated with several important functions and processes, such as oocyte maturation, zygotic genome activation and mitochondrial functions. We also found that the network of lncRNAs involved in zygotic genome activation was highly preserved between human and mouse embryos, whereas in other stages no strong correlation between human and mouse embryos was observed. This study provides insight into the molecular mechanism underlying lncRNA involvement in human pre-implantation embryonic development.
|
44
|
Lagarde J, Johnson R. Capturing a Long Look at Our Genetic Library. Cell Syst 2018; 6:153-155. [PMID: 29494803 DOI: 10.1016/j.cels.2018.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Long-read sequencing, coupled to cDNA capture, provides an unrivaled view of the transcriptome of chromosome 21, revealing surprises about the splicing of long noncoding RNAs.
Affiliation(s)
- Julien Lagarde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Rory Johnson
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010 Bern, Switzerland; Department of Biomedical Research (DBMR), University of Bern, 3008 Bern, Switzerland.
| |
|
45
|
Hawkes EJ, Hennelly SP, Novikova IV, Irwin JA, Dean C, Sanbonmatsu KY. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures. Cell Rep 2016; 16:3087-3096. [PMID: 27653675 DOI: 10.1016/j.celrep.2016.08.045] [Citation(s) in RCA: 91] [Impact Index Per Article: 15.2] [Received: 01/21/2016] [Revised: 06/03/2016] [Accepted: 08/12/2016] [Indexed: 01/07/2023]
Abstract
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. We investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
Affiliation(s)
- Emily J Hawkes
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Scott P Hennelly
- Los Alamos National Laboratory, Los Alamos, NM 87545, USA; New Mexico Consortium, Los Alamos, NM 87544, USA
| | - Irina V Novikova
- Los Alamos National Laboratory, Los Alamos, NM 87545, USA; Pacific Northwest National Laboratory, Environmental Molecular Sciences Laboratory, Richland, WA 99354, USA
| | - Judith A Irwin
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Caroline Dean
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Karissa Y Sanbonmatsu
- Los Alamos National Laboratory, Los Alamos, NM 87545, USA; New Mexico Consortium, Los Alamos, NM 87544, USA.
| |
|
46
|
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany.,Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
| | - Jan Gorodkin
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
| | - Ivo L Hofacker
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark.,Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria.,Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
| | - Peter F Stadler
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark.,Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria.,Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany.,Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany.,Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany.,Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA
| |
|
47
|
Abstract
PURPOSE OF REVIEW: Long noncoding RNAs (lncRNAs) have emerged as powerful regulators of nearly all biological processes. Their cell-type and tissue-specific expression in health and disease provides new avenues for diagnosis and therapy. This review highlights the role of lncRNAs that are involved in cardiovascular disease (CVD) with a special focus on cell types involved in cardiac injury and remodeling, vascular injury, angiogenesis, inflammation, and lipid metabolism. RECENT FINDINGS: Almost 98% of the genome does not encode proteins. LncRNAs are among the most abundant types of RNA in the noncoding genome. Accumulating studies have uncovered novel lncRNA-mediated regulation of CVD-associated genes, signaling pathways, and pathophysiological responses. Targeting lncRNAs in vivo using short antisense oligonucleotides or by gene editing has provided important insights into disease pathogenesis through epigenetic, transcriptional, or translational mechanisms. Although cross-species conservation remains a major obstacle, there is increasing appreciation that altered expression of lncRNAs is associated with stage-specific CVD, including in human patient cohorts, providing new opportunities for diagnosis and therapy. SUMMARY: A better understanding of lncRNAs will not only fundamentally improve our understanding of key signaling pathways in CVD, but also aid in the development of effective new therapies and RNA-based biomarkers.
Affiliation(s)
- Stefan Haemmig
- Department of Medicine, Cardiovascular Division, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Viorel Simion
- Department of Medicine, Cardiovascular Division, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Dafeng Yang
- Department of Medicine, Cardiovascular Division, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Cardiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Yihuan Deng
- Department of Medicine, Cardiovascular Division, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Cardiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Mark W. Feinberg
- Department of Medicine, Cardiovascular Division, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
| |
|
48
|
Deng P, Liu S, Nie X, Weining S, Wu L. Conservation analysis of long non-coding RNAs in plants. SCIENCE CHINA-LIFE SCIENCES 2017; 61:190-198. [PMID: 29101587 DOI: 10.1007/s11427-017-9174-9] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 06/30/2017] [Accepted: 07/20/2017] [Indexed: 11/26/2022]
Abstract
Long non-coding RNAs (lncRNAs) are gene regulators that have vital roles in development and adaptation to the environment in eukaryotes. However, structural and evolutionary analyses of plant lncRNAs are limited. In this study, we performed an analysis of lncRNAs in five monocot and five dicot species. Our results showed that plant lncRNA genes were generally shorter and had fewer exons than protein-coding genes. The numbers of lncRNAs were positively correlated with the numbers of protein-coding genes in different plant species, despite a wide range of variation. Sequence conservation analysis showed that the majority of lncRNAs had high sequence conservation at the intra-species and sub-species levels, reminiscent of protein-coding genes. At the inter-species level, a subset of lncRNAs were highly diverged at the nucleotide level, but conserved by position. Interestingly, we found that plant lncRNAs have identical splicing signals, and that those which can act as precursors or targets of miRNAs show conserved identity across species. We also revealed that most of the lowly expressed lncRNAs were tissue-specific, while those that were highly conserved were constitutively transcribed. Meanwhile, we characterized a subset of rice lncRNAs that were co-expressed with their adjacent protein-coding genes, suggesting they may play cis-regulatory roles. These results will contribute to understanding the biological significance and evolution of lncRNAs in plants.
Affiliation(s)
- Pingchuan Deng
- State Key Laboratory of Rice Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
| | - Shu Liu
- State Key Laboratory of Rice Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
| | - Xiaojun Nie
- State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy and Yangling Branch of China Wheat Improvement Center, Northwest A&F University, Yangling, 712100, China
| | - Song Weining
- State Key Laboratory of Crop Stress Biology in Arid Areas, College of Agronomy and Yangling Branch of China Wheat Improvement Center, Northwest A&F University, Yangling, 712100, China
| | - Liang Wu
- State Key Laboratory of Rice Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China.
| |
|
49
|
Tracing the Evolutionary History of the CAP Superfamily of Proteins Using Amino Acid Sequence Homology and Conservation of Splice Sites. J Mol Evol 2017; 85:137-157. [DOI: 10.1007/s00239-017-9813-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/09/2017] [Accepted: 10/11/2017] [Indexed: 11/26/2022]
|
50
|
Schneider HW, Raiol T, Brigido MM, Walter MEMT, Stadler PF. A Support Vector Machine based method to distinguish long non-coding RNAs from protein coding transcripts. BMC Genomics 2017; 18:804. [PMID: 29047334 PMCID: PMC5648457 DOI: 10.1186/s12864-017-4178-4] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 02/14/2017] [Accepted: 10/05/2017] [Indexed: 12/31/2022]
Abstract
Background: In recent years, a rapidly increasing number of RNA transcripts has been generated by thousands of sequencing projects around the world, creating enormous volumes of transcript data to be analyzed. An important problem to be addressed when analyzing these data is distinguishing between long non-coding RNAs (lncRNAs) and protein coding transcripts (PCTs). Thus, we present a Support Vector Machine (SVM) based method to distinguish lncRNAs from PCTs, using features based on frequencies of nucleotide patterns and ORF lengths in transcripts. Methods: The proposed method is based on SVM and uses the first ORF relative length and frequencies of nucleotide patterns selected by PCA as features. FASTA files were used as input to calculate all possible features. These features were divided into two sets: (i) 336 frequencies of nucleotide patterns; and (ii) 4 features derived from ORFs. PCA was applied to the first set to identify 6 groups of frequencies that could most contribute to the distinction. Twenty-four experiments using the 6 groups from the first set and the features from the second set were built to create the best model to distinguish lncRNAs from PCTs. Results: This method was trained and tested with human (Homo sapiens), mouse (Mus musculus) and zebrafish (Danio rerio) data, achieving 98.21%, 98.03% and 96.09% accuracy, respectively. Our method was compared to other tools available in the literature (CPAT, CPC, iSeeRNA, lncRNApred, lncRScan-SVM and FEELnc), and showed an improvement in accuracy of ≈3.00%. In addition, to validate our model, the mouse data were classified with the human model, and vice versa, achieving ≈97.80% accuracy in both cases, showing that the model is not overfit. The SVM models were validated with data from rat (Rattus norvegicus), pig (Sus scrofa) and fruit fly (Drosophila melanogaster), and obtained more than 84.00% accuracy in all these organisms.
Our results also showed that 81.2% of human pseudogenes and 91.7% of mouse pseudogenes were classified as non-coding. Moreover, our method was capable of re-annotating two uncharacterized sequences of the Swiss-Prot database with high probability of being lncRNAs. Finally, in order to use the method to annotate transcripts derived from RNA-seq, previously identified lncRNAs of human, gorilla (Gorilla gorilla) and rhesus macaque (Macaca mulatta) were analyzed, of which 98.62%, 80.8% and 91.9%, respectively, were successfully classified. Conclusions: The SVM method proposed in this work presents high performance in distinguishing lncRNAs from PCTs, as shown in the results. To build the model, besides using features known in the literature regarding ORFs, we used PCA to identify features among nucleotide pattern frequencies that contribute the most to distinguishing lncRNAs from PCTs in reference data sets. Interestingly, models created with two evolutionarily distant species could distinguish lncRNAs of even more distant species.
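The two feature families named in this abstract (nucleotide pattern frequencies and the relative length of the first ORF) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the abstract does not say which patterns make up the 336 frequencies, so the sketch assumes all di-, tri- and tetranucleotides (16 + 64 + 256 = 336), and it assumes the "first ORF" starts at the first ATG and ends at the first in-frame stop codon. The function names are hypothetical, and the PCA selection and SVM training steps are omitted.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer in seq, using overlapping windows."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
            total += 1
    return {kmer: (n / total if total else 0.0) for kmer, n in counts.items()}


def first_orf_relative_length(seq):
    """Length of the first ATG-initiated ORF (stop included) over transcript length."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return 0.0
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i + 3 - start) / len(seq)
    return 0.0  # assumption: an ORF without an in-frame stop is not counted


def feature_vector(seq):
    """336 assumed k-mer frequencies (k = 2, 3, 4) followed by one ORF feature."""
    features = []
    for k in (2, 3, 4):
        table = kmer_frequencies(seq, k)
        features.extend(table[kmer] for kmer in sorted(table))
    features.append(first_orf_relative_length(seq))
    return features


if __name__ == "__main__":
    transcript = "CCATGAAATAGGG"  # invented toy transcript
    print(len(feature_vector(transcript)), first_orf_relative_length(transcript))
```

Vectors of this kind could then be passed to any standard SVM implementation; which dimensions to keep would depend on a feature selection step such as the PCA described in the abstract.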
Affiliation(s)
- Hugo W Schneider
- Department of Computer Science, University of Brasilia, ICC Central, Instituto de Ciências Exatas, Campus Universitario Darcy Ribeiro, Asa Norte, CEP: 70910-900, Brasilia, Brazil.
| | - Taina Raiol
- Gerência Regional de Brasilia (GEREB), Oswaldo Cruz Foundation (Fiocruz), Av. L3 Norte, Campus Universitário Darcy Ribeiro, Gleba A, Asa Norte, CEP: 70910-900, Brasília, Brazil
| | - Marcelo M Brigido
- Laboratory of Molecular Biology, University of Brasilia, Instituto de Ciencias Biologicas, Campus Universitario Darcy Ribeiro, Asa Norte, CEP: 70910-900, Brasilia, Brazil
| | - Maria Emilia M T Walter
- Department of Computer Science, University of Brasilia, ICC Central, Instituto de Ciências Exatas, Campus Universitario Darcy Ribeiro, Asa Norte, CEP: 70910-900, Brasilia, Brazil
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Hartelstrasse 16-18, Leipzig, D-04107, Germany
| |
|