1
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
- Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
- Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
2
Lin BC, Katneni U, Jankowska KI, Meyer D, Kimchi-Sarfaty C. In silico methods for predicting functional synonymous variants. Genome Biol 2023; 24:126. [PMID: 37217943 PMCID: PMC10204308 DOI: 10.1186/s13059-023-02966-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/03/2022] [Accepted: 05/10/2023] [Indexed: 05/24/2023] [Open Access]
Abstract
Single nucleotide variants (SNVs) contribute to human genomic diversity. Synonymous SNVs were previously considered to be "silent," but mounting evidence has revealed that these variants can cause RNA and protein changes and are implicated in over 85 human diseases and cancers. Recent improvements in computational platforms have led to the development of numerous machine-learning tools, which can be used to advance synonymous SNV research. In this review, we discuss tools that should be used to investigate synonymous variants. We provide supportive examples from seminal studies that demonstrate how these tools have driven new discoveries of functional synonymous SNVs.
Affiliation(s)
- Brian C Lin
- Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Upendra Katneni
- Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Katarzyna I Jankowska
- Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Douglas Meyer
- Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Chava Kimchi-Sarfaty
- Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA.
3
Lokras A, Chakravarty A, Rades T, Christensen D, Franzyk H, Thakur A, Foged C. Simultaneous quantification of multiple RNA cargos co-loaded into nanoparticle-based delivery systems. Int J Pharm 2022; 626:122171. [PMID: 36070841 DOI: 10.1016/j.ijpharm.2022.122171] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/01/2022] [Revised: 08/26/2022] [Accepted: 08/31/2022] [Indexed: 11/17/2022]
Abstract
Robust, sensitive, and versatile analytical methods are essential for quantification of RNA drug cargos loaded into nanoparticle-based delivery systems. However, simultaneous quantification of multiple RNA cargos co-loaded into nanoparticles remains a challenge. Here, we developed and validated the use of ion-pair reversed-phase high-performance liquid chromatography combined with UV detection (IP-RP-HPLC-UV) for simultaneous quantification of single- and double-stranded RNA cargos. Complete extraction of RNA cargo from the nanoparticle carrier was achieved using a phenol:chloroform:isoamyl alcohol mixture. Separations were performed using either a C18 or a PLRP-S column, eluted with 0.1 M triethylammonium acetate (TEAA) solution as ion-pairing reagent (eluent A), and 0.1 M TEAA containing 25% (v/v) CH3CN as eluent B. These methods were applied to quantify mRNA and polyinosinic:polycytidylic acid co-loaded into lipid-polymer hybrid nanoparticles, and single-stranded oligodeoxynucleotide donors and Alt-R CRISPR single guide RNAs co-loaded into lipid nanoparticles. The developed methods were sensitive (limit of RNA quantification < 60 ng), linear (R² > 0.997), and accurate (≈ 100% recovery of RNA spiked in nanoparticles). Hence, the present study may facilitate convenient quantification of multiple RNA cargos co-loaded into nanoparticle-based delivery systems.
Affiliation(s)
- Abhijeet Lokras
- Department of Pharmacy, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen Ø, Denmark
- Akash Chakravarty
- Department of Pharmacy, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen Ø, Denmark
- Thomas Rades
- Department of Pharmacy, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen Ø, Denmark
- Dennis Christensen
- Department of Infectious Disease Immunology, Statens Serum Institut, Artillerivej 5, 2300 Copenhagen S, Denmark
- Henrik Franzyk
- Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 162, 2100 Copenhagen Ø, Denmark
- Aneesh Thakur
- Department of Pharmacy, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen Ø, Denmark
- Camilla Foged
- Department of Pharmacy, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen Ø, Denmark.
4
Jiang W, Wagner J, Du W, Plitzko J, Baumeister W, Beck F, Guo Q. A transformation clustering algorithm and its application in polyribosomes structural profiling. Nucleic Acids Res 2022; 50:9001-9011. [PMID: 35811088 PMCID: PMC9458451 DOI: 10.1093/nar/gkac547] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/18/2022] [Revised: 06/01/2022] [Accepted: 06/10/2022] [Indexed: 12/26/2022] [Open Access]
Abstract
Improvements in cryo-electron tomography sample preparation, electron-microscopy instrumentation, and image processing algorithms have advanced the structural analysis of macromolecules in situ. Beyond such analyses of individual macromolecules, the study of their interactions with functionally related neighbors in crowded cellular habitats, i.e. 'molecular sociology', is of fundamental importance in biology. Here we present a NEighboring Molecule TOpology Clustering (NEMO-TOC) algorithm. We optimized this algorithm for the detection and profiling of polyribosomes, which play both constitutive and regulatory roles in gene expression. Our results suggest a model where polysomes are formed by connecting multiple nonstochastic blocks, in which translation is likely synchronized.
Affiliation(s)
- Wenhong Jiang
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, School of Life Sciences, Peking University, Beijing 100871, China
- Jonathan Wagner
- Department of Structural Molecular Biology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Wenjing Du
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, School of Life Sciences, Peking University, Beijing 100871, China
- Juergen Plitzko
- CryoEM Technology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Wolfgang Baumeister
- Department of Structural Molecular Biology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Florian Beck
- CryoEM Technology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Qiang Guo
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, School of Life Sciences, Peking University, Beijing 100871, China
- Changping Laboratory, Beijing, China
5
Tsybulskyi V, Meyer IM. ShapeSorter: a fully probabilistic method for detecting conserved RNA structure features supported by SHAPE evidence. Nucleic Acids Res 2022; 50:e85. [PMID: 35641016 PMCID: PMC9410897 DOI: 10.1093/nar/gkac405] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2022] [Accepted: 05/09/2022] [Indexed: 12/20/2022] [Open Access]
Abstract
There is an increased interest in the determination of RNA structures in vivo as it is now possible to probe them in a high-throughput manner, e.g. using SHAPE protocols. By now, there exists a range of computational methods that integrate experimental SHAPE-probing evidence into computational RNA secondary structure prediction. The state of the art in this field is currently provided by computational methods that employ the minimum-free energy strategy for predicting RNA secondary structures with SHAPE-probing evidence. These methods, however, rely on the assumption that transcripts in vivo fold into the thermodynamically most stable configuration and ignore evolutionary evidence for conserved RNA structure features. We here present a new computational method, ShapeSorter, that predicts RNA structure features without employing the thermodynamic strategy. Instead, ShapeSorter employs a fully probabilistic framework to identify RNA structure features that are supported by evolutionary and SHAPE-probing evidence. Our method can capture RNA structure heterogeneity, pseudo-knotted RNA structures as well as transient and mutually exclusive RNA structure features. Moreover, it estimates P-values for the predicted RNA structure features, which allows for easy filtering and ranking. We investigate the merits of our method in a comprehensive performance benchmarking and conclude that ShapeSorter has a significantly superior performance for predicting base-pairs than the existing state-of-the-art methods.
Affiliation(s)
- Volodymyr Tsybulskyi
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Freie Universität Berlin, Department of Mathematics and Computer Science, Institute of Computer Science, Takustraße 9, 14195 Berlin, Germany
- Irmtraud M Meyer
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Institute of Chemistry and Biochemistry, Thielallee 63, 14195 Berlin, Germany
- Freie Universität Berlin, Department of Mathematics and Computer Science, Institute of Computer Science, Takustraße 9, 14195 Berlin, Germany
6
Zeng Z, Aptekmann AA, Bromberg Y. Decoding the effects of synonymous variants. Nucleic Acids Res 2021; 49:12673-12691. [PMID: 34850938 PMCID: PMC8682775 DOI: 10.1093/nar/gkab1159] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/23/2021] [Revised: 11/02/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022] [Open Access]
Abstract
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Affiliation(s)
- Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Ariel A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
7
Thermal adaptation of mRNA secondary structure: stability versus lability. Proc Natl Acad Sci U S A 2021; 118:2113324118. [PMID: 34728561 DOI: 10.1073/pnas.2113324118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 09/16/2021] [Indexed: 12/30/2022] [Open Access]
Abstract
Macromolecular function commonly involves rapidly reversible alterations in three-dimensional structure (conformation). To allow these essential conformational changes, macromolecules must possess higher order structures that are appropriately balanced between rigidity and flexibility. Because of the low stabilization free energies (marginal stabilities) of macromolecule conformations, temperature changes have strong effects on conformation and, thereby, on function. As is well known for proteins, during evolution, temperature-adaptive changes in sequence foster retention of optimal marginal stability at a species' normal physiological temperatures. Here, we extend this type of analysis to messenger RNAs (mRNAs), a class of macromolecules for which the stability-lability balance has not been elucidated. We employ in silico methods to determine secondary structures and estimate changes in free energy of folding (ΔGfold) for 25 orthologous mRNAs that encode the enzyme cytosolic malate dehydrogenase in marine mollusks with adaptation temperatures spanning an almost 60 °C range. The change in free energy that occurs during formation of the ensemble of mRNA secondary structures is significantly correlated with adaptation temperature: ΔGfold values are all negative and their absolute values increase with adaptation temperature. A principal mechanism underlying these adaptations is a significant increase in synonymous guanine + cytosine substitutions with increasing temperature. These findings open up an avenue of exploration in molecular evolution and raise interesting questions about the interaction between temperature-adaptive changes in mRNA sequence and in the proteins they encode.
8
Martín AL, Mounir M, Meyer IM. CoBold: a method for identifying different functional classes of transient RNA structure features that can impact RNA structure formation in vivo. Nucleic Acids Res 2021; 49:e19. [PMID: 33095878 PMCID: PMC7913772 DOI: 10.1093/nar/gkaa900] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/15/2020] [Revised: 09/16/2020] [Accepted: 09/30/2020] [Indexed: 11/14/2022] [Open Access]
Abstract
RNA structure formation in vivo happens co-transcriptionally while the transcript is being made. The corresponding co-transcriptional folding pathway typically involves transient RNA structure features that are not part of the final, functional RNA structure. These transient features can play important functional roles of their own and also influence the formation of the final RNA structure in vivo. We here present CoBold, a computational method for identifying different functional classes of transient RNA structure features that can either aid or hinder the formation of a known reference RNA structure. Our method takes as input either a single RNA or a corresponding multiple-sequence alignment as well as a known reference RNA secondary structure and identifies different classes of transient RNA structure features that could aid or prevent the formation of the given RNA structure. We make CoBold available via a web-server which includes dedicated data visualisation.
Affiliation(s)
- Adrián López Martín
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Mohamed Mounir
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Irmtraud M Meyer
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Institute of Chemistry and Biochemistry, Thielallee 63, 14195 Berlin, Germany
9
Mikhailov KV, Efeykin BD, Panchin AY, Knorre DA, Logacheva MD, Penin AA, Muntyan MS, Nikitin MA, Popova OV, Zanegina ON, Vyssokikh MY, Spiridonov SE, Aleoshin VV, Panchin YV. Coding palindromes in mitochondrial genes of Nematomorpha. Nucleic Acids Res 2019; 47:6858-6870. [PMID: 31194871 PMCID: PMC6649704 DOI: 10.1093/nar/gkz517] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/28/2019] [Revised: 05/29/2019] [Accepted: 06/01/2019] [Indexed: 12/11/2022] [Open Access]
Abstract
Inverted repeats are common DNA elements, but they rarely overlap with protein-coding sequences due to the ensuing conflict with the structure and function of the encoded protein. We discovered numerous perfect inverted repeats of considerable length (up to 284 bp) embedded within the protein-coding genes in mitochondrial genomes of four Nematomorpha species. Strikingly, both arms of the inverted repeats encode conserved regions of the amino acid sequence. We confirmed enzymatic activity of the respiratory complex I encoded by inverted repeat-containing genes. The nucleotide composition of inverted repeats suggests strong selection at the amino acid level in these regions. We conclude that the inverted repeat-containing genes are transcribed and translated into functional proteins. The survey of available mitochondrial genomes reveals that several other organisms possess similar albeit shorter embedded repeats. Mitochondrial genomes of Nematomorpha demonstrate an extraordinary evolutionary compromise where protein function and stringent secondary structure elements within the coding regions are preserved simultaneously.
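As a concrete illustration of what a perfect inverted repeat means at the sequence level, the following minimal Python sketch scans a DNA string for the longest pair of arms in which the second arm is the exact reverse complement of the first. It is a brute-force toy example and not the authors' detection pipeline; the function names, the `max_spacer` parameter, and the example sequence are all hypothetical.

```python
# Toy illustration only: names, parameters, and the example sequence are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_inverted_repeat(seq: str, max_spacer: int = 10):
    """Return (left_start, arm_length, spacer_length) of the longest perfect
    inverted repeat whose arms are separated by at most max_spacer bases.

    For every possible inner boundary the arms are extended outward for as
    long as the two bases are complementary. Ties keep the first hit found.
    """
    best = (0, 0, 0)
    n = len(seq)
    for left_end in range(n):                 # last base of the left arm
        for spacer in range(max_spacer + 1):
            right_start = left_end + 1 + spacer   # first base of the right arm
            arm = 0
            while (
                left_end - arm >= 0
                and right_start + arm < n
                and COMPLEMENT.get(seq[left_end - arm]) == seq[right_start + arm]
            ):
                arm += 1
            if arm > best[1]:
                best = (left_end - arm + 1, arm, spacer)
    return best


# Hypothetical sequence: arm GACCTG, 3-base spacer, then its reverse complement.
example = "TTGACCTGAAACAGGTCTT"
start, arm, spacer = longest_inverted_repeat(example)
left_arm = example[start:start + arm]
right_arm = example[start + arm + spacer:start + 2 * arm + spacer]
print(start, arm, spacer, left_arm, right_arm)
```

When both arms lie inside a coding region, as the abstract describes, each arm must also read as a valid stretch of codons, which is what makes such repeats unusual.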
Affiliation(s)
- Kirill V Mikhailov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Boris D Efeykin
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Alexander Y Panchin
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Dmitry A Knorre
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, Moscow 119991, Russian Federation
- Maria D Logacheva
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow 143028, Russian Federation
- Aleksey A Penin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Maria S Muntyan
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail A Nikitin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Olga V Popova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Olga N Zanegina
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail Y Vyssokikh
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Sergei E Spiridonov
- Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Vladimir V Aleoshin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Yuri V Panchin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
10
Zeng Z, Bromberg Y. Predicting Functional Effects of Synonymous Variants: A Systematic Review and Perspectives. Front Genet 2019; 10:914. [PMID: 31649718 PMCID: PMC6791167 DOI: 10.3389/fgene.2019.00914] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 06/28/2019] [Accepted: 08/29/2019] [Indexed: 12/13/2022] [Open Access]
Abstract
Recent advances in high-throughput experimentation have put the exploration of genome sequences at the forefront of precision medicine. In an effort to interpret the sequencing data, numerous computational methods have been developed for evaluating the effects of genome variants. Interestingly, despite the fact that every person has as many synonymous (sSNV) as non-synonymous single nucleotide variants, our ability to predict their effects is limited. The paucity of experimentally tested sSNV effects appears to be the limiting factor in development of such methods. Here, we summarize the details and evaluate the performance of nine existing computational methods capable of predicting sSNV effects. We used a set of observed and artificially generated variants to approximate large scale performance expectations of these tools. We note that the distribution of these variants across amino acid and codon types suggests purifying evolutionary selection retaining generated variants out of the observed set; i.e., we expect the generated set to be enriched for deleterious variants. Closer inspection of the relationship between the observed variant frequencies and the associated prediction scores identifies predictor-specific scoring thresholds of reliable effect predictions. Notably, across all predictors, the variants scoring above these thresholds were significantly more often generated than observed, which confirms our assumption that the generated set is enriched for deleterious variants. Finally, we find that while the methods differ in their ability to identify severe sSNV effects, no predictor appears capable of definitively recognizing subtle effects of such variants on a large scale.
Affiliation(s)
- Zishuo Zeng
- Institute for Quantitative Biomedicine, Rutgers University, Piscataway, NJ, United States
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
- Department of Genetics, Rutgers University, Human Genetics Institute, Piscataway, NJ, United States
11
Nowick K, Walter Costa MB, Höner Zu Siederdissen C, Stadler PF. Selection Pressures on RNA Sequences and Structures. Evol Bioinform Online 2019; 15:1176934319871919. [PMID: 31496634 PMCID: PMC6716170 DOI: 10.1177/1176934319871919] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/21/2019] [Accepted: 07/29/2019] [Indexed: 12/31/2022] [Open Access]
Abstract
With the discovery of increasingly more functional noncoding RNAs (ncRNAs), it becomes imperative to consider them more strongly as important players during species evolution. Although tests for negative selection of ncRNAs have existed since the beginning of this century, the SSS-test is the first one that also investigates positive selection. When analyzing selection in ncRNAs, it should be taken into account that selection pressures can independently act on sequence and structure. We applied the SSS-test to explore the evolution of ncRNAs in primates and identified more than 100 long noncoding RNAs (lncRNAs) that might evolve under positive selection in humans. With this test, it is now possible to include ncRNAs more thoroughly in evolutionary studies.
Affiliation(s)
- Katja Nowick
- Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Berlin, Germany
- Christian Höner Zu Siederdissen
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Department of Theoretical Chemistry, Universität Wien, Wien, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Santa Fe Institute, Santa Fe, NM, USA
12
Kiening M, Ochsenreiter R, Hellinger HJ, Rattei T, Hofacker I, Frishman D. Conserved Secondary Structures in Viral mRNAs. Viruses 2019; 11:E401. [PMID: 31035717 PMCID: PMC6563262 DOI: 10.3390/v11050401] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/30/2019] [Revised: 04/23/2019] [Accepted: 04/26/2019] [Indexed: 12/29/2022] [Open Access]
Abstract
RNA secondary structure in untranslated and protein coding regions has been shown to play an important role in regulatory processes and the viral replication cycle. While structures in non-coding regions have been investigated extensively, a thorough overview of the structural repertoire of protein coding mRNAs, especially for viruses, is lacking. Secondary structure prediction of large molecules, such as long mRNAs, remains a challenging task, as the contingent of structures a sequence can theoretically fold into grows exponentially with sequence length. We applied a structure prediction pipeline to Viral Orthologous Groups that first identifies the local boundaries of potentially structured regions and subsequently predicts their functional importance. Using this procedure, the orthologous groups were split into structurally homogeneous subgroups, which we call subVOGs. This is the first compilation of potentially functional conserved RNA structures in viral coding regions, covering the complete RefSeq viral database. We were able to recover structural elements from previous studies and discovered a variety of novel structured regions. The subVOGs are available through our web resource RNASIV (RNA structure in viruses; http://rnasiv.bio.wzw.tum.de).
Collapse
Affiliation(s)
- Michael Kiening
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, D-85354 Freising, Germany
- Roman Ochsenreiter
  - University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstr. 29, 1090 Vienna, Austria
- Hans-Jörg Hellinger
  - Division of Computational Systems Biology, Department of Microbiology and Ecosystem Science, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
- Thomas Rattei
  - Division of Computational Systems Biology, Department of Microbiology and Ecosystem Science, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
- Ivo Hofacker
  - University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstr. 29, 1090 Vienna, Austria
  - University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, D-85354 Freising, Germany
  - St Petersburg State Polytechnic University, St Petersburg 195251, Russia

13
Endoh T, Sugimoto N. Conformational Dynamics of the RNA G-Quadruplex and its Effect on Translation Efficiency. Molecules 2019; 24:molecules24081613. [PMID: 31022854 PMCID: PMC6514569 DOI: 10.3390/molecules24081613] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 04/12/2019] [Revised: 04/22/2019] [Accepted: 04/22/2019] [Indexed: 11/16/2022] Open
Abstract
During translation, intracellular mRNA folds co-transcriptionally and must refold following the passage of the ribosome. The mRNAs can be entrapped in metastable structures during these folding events. In the present study, we evaluated the conformational dynamics of the kinetically favored, metastable, hairpin-like structure, which disturbs the thermodynamically favored G-quadruplex structure, and its effect on co-transcriptional translation in prokaryotic cells. We found that nascent mRNA forms a metastable hairpin-like structure during co-transcriptional folding instead of the G-quadruplex structure. When translation progressed co-transcriptionally before the metastable hairpin-like structure transitioned to the G-quadruplex, the function of the G-quadruplex as a roadblock to the ribosome was sequestered. This suggested that kinetically formed RNA structures had a dominant effect on gene expression in prokaryotes. The results of this study indicate that it is critical to consider the conformational dynamics of RNA folding to understand the contributions of mRNA structures in controlling gene expression.
Affiliation(s)
- Tamaki Endoh
  - Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-Minamimachi, Chuo-ku, Kobe 650-0047, Japan
- Naoki Sugimoto
  - Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-Minamimachi, Chuo-ku, Kobe 650-0047, Japan
  - Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, 7-1-20 Minatojima-Minamimachi, Chuo-ku, Kobe 650-0047, Japan

14
Blanchard EL, Loomis KH, Bhosle SM, Vanover D, Baumhof P, Pitard B, Zurla C, Santangelo PJ. Proximity Ligation Assays for In Situ Detection of Innate Immune Activation: Focus on In Vitro-Transcribed mRNA. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 14:52-66. [PMID: 30579042 PMCID: PMC6304375 DOI: 10.1016/j.omtn.2018.11.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/04/2018] [Revised: 11/08/2018] [Accepted: 11/09/2018] [Indexed: 01/04/2023]
Abstract
The characterization of innate immune activation is crucial for vaccine and therapeutic development, including RNA-based vaccines, a promising approach. Current measurement methods quantify type I interferon and inflammatory cytokine production, but they do not allow for the isolation of individual pathways, do not provide kinetic activation or spatial information within tissues, and cannot be translated into clinical studies. Here we demonstrated the use of proximity ligation assays (PLAs) to detect pattern recognition receptor (PRR) activation in cells and in tissue samples. First, we validated PLA's sensitivity and specificity using well-characterized soluble agonists. Next, we characterized PRR activation from in vitro-transcribed (IVT) mRNAs, as well as the effect of sequence and base modifications in vitro. Finally, we established the measurement of PRR activation in tissue sections via PLA upon IVT mRNA intramuscular (i.m.) injection in mice. Overall, our results indicate that PLA is a valuable, versatile, and sensitive tool to monitor PRR activation for vaccine, adjuvant, and therapeutic screening.
Affiliation(s)
- Emmeline L Blanchard
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA
- Kristin H Loomis
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA
- Sushma M Bhosle
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA
- Daryll Vanover
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA
- Bruno Pitard
  - In-Cell-Art, 21 rue de la Noue Bras de Fer, 44200 Nantes, France
- Chiara Zurla
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA
- Philip J Santangelo
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, UA Whitaker Building, Atlanta, GA 30332, USA

15
Abrahams L, Hurst LD. Adenine Enrichment at the Fourth CDS Residue in Bacterial Genes Is Consistent with Error Proofing for +1 Frameshifts. Mol Biol Evol 2018; 34:3064-3080. [PMID: 28961919 PMCID: PMC5850271 DOI: 10.1093/molbev/msx223] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open
Abstract
Beyond selection for optimal protein functioning, coding sequences (CDSs) are under selection at the RNA and DNA levels. Here, we identify a possible signature of “dual-coding,” namely extensive adenine (A) enrichment at bacterial CDS fourth sites. In 99.07% of studied bacterial genomes, fourth site A use is greater than expected given genomic A-starting codon use. Arguing for nucleotide level selection, A-starting serine and arginine second codons are heavily utilized when compared with their non-A starting synonyms. Several models have the ability to explain some of this trend. In part, A-enrichment likely reduces 5′ mRNA stability, promoting translation initiation. However, T/U, which may also reduce stability, is avoided. Further, +1 frameshifts on the initiating ATG encode a stop codon (TGA) provided A is the fourth residue, acting either as a frameshift “catch and destroy” or a frameshift stop and adjust mechanism and hence implicated in translation initiation. Consistent with both, genomes lacking TGA stop codons exhibit weaker fourth site A-enrichment. Sequences lacking a Shine–Dalgarno sequence and those without upstream leader genes, which may be more error prone during initiation, have greater utilization of A, again suggesting a role in initiation. The frameshift correction model is consistent with the notion that many genomic features are error-mitigation factors and provides the first evidence for site-specific out of frame stop codon selection. We conjecture that the NTG universal start codon may have evolved as a consequence of TGA being a stop codon and the ability of NTGA to rapidly terminate or adjust a ribosome.
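The frameshift argument in this abstract can be checked with a few lines of code. The sketch below is only an illustration and is not taken from the paper: the function names and the example sequences are hypothetical, and the standard stop codons TAA, TAG, and TGA are assumed. It reads the first codon in the +1 frame (CDS positions 2 to 4) and reports whether it is a stop codon.

```python
# Illustrative sketch only: helper names and example sequences are hypothetical.
# Assumes the standard stop codons (TAA, TAG, TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def plus_one_frame_codon(cds: str) -> str:
    """Return the first codon read after a +1 frameshift, i.e. CDS positions 2-4."""
    return cds[1:4].upper()


def fourth_site_traps_frameshift(cds: str) -> bool:
    """True if a +1 frameshift on the start codon immediately meets a stop codon."""
    return plus_one_frame_codon(cds) in STOP_CODONS


if __name__ == "__main__":
    # Vary only the fourth residue after an ATG start codon.
    for fourth in "ACGT":
        cds = "ATG" + fourth + "CTGAA"
        print(fourth, plus_one_frame_codon(cds), fourth_site_traps_frameshift(cds))
```

With an NTG start codon, only A at the fourth site yields TGA in the +1 frame; C, G, and T give TGC, TGG, and TGT, none of which is a stop codon.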
Affiliation(s)
- Liam Abrahams
  - Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom

16
Kaushik K, Sivadas A, Vellarikkal SK, Verma A, Jayarajan R, Pandey S, Sethi T, Maiti S, Scaria V, Sivasubbu S. RNA secondary structure profiling in zebrafish reveals unique regulatory features. BMC Genomics 2018; 19:147. [PMID: 29448945 PMCID: PMC5815192 DOI: 10.1186/s12864-018-4497-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/02/2017] [Accepted: 01/28/2018] [Indexed: 01/08/2023] Open
Abstract
Background: RNA is known to play diverse roles in gene regulation. The clues for this regulatory function of RNA are embedded in its ability to fold into intricate secondary and tertiary structures. Results: We report the transcriptome-wide RNA secondary structure in zebrafish at single nucleotide resolution using Parallel Analysis of RNA Structure (PARS). This study provides the secondary structure map of zebrafish coding and non-coding RNAs. The single nucleotide pairing probabilities of 54,083 distinct transcripts in the zebrafish genome were documented. We identified RNA secondary structural features embedded in functional units of zebrafish mRNAs. Translation start and stop sites were demarcated by weak structural signals. The coding regions were characterized by the three-nucleotide periodicity of secondary structure and display a codon base specific structural constraint. The splice sites of transcripts were also delineated by distinct signature signals. Relatively higher structural signals were observed at 3′ untranslated regions (UTRs) compared to the coding DNA sequence (CDS) and 5′ UTRs. The 3′ ends of transcripts were also marked by unique structure signals. Secondary structural signals in long non-coding RNAs were also explored to better understand their molecular function. Conclusions: Our study presents the first PARS-enabled transcriptome-wide secondary structure map of zebrafish, which documents pairing probability of RNA at single nucleotide precision. Our findings open avenues for exploring structural features in zebrafish RNAs and their influence on gene expression. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4497-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Kriti Kaushik
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Ambily Sivadas
  - G.N. Ramachandran Knowledge Centre for Genome Informatics, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Shamsudheen Karuthedath Vellarikkal
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Ankit Verma
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
- Rijith Jayarajan
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
- Satyaprakash Pandey
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Tavprithesh Sethi
  - Indraprastha Institute of Information Technology, Delhi, 110020, India
- Souvik Maiti
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Vinod Scaria
  - G.N. Ramachandran Knowledge Centre for Genome Informatics, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India
- Sridhar Sivasubbu
  - Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), New Delhi, 110025, India

17
Au HHT, Elspass VM, Jan E. Functional Insights into the Adjacent Stem-Loop in Honey Bee Dicistroviruses That Promotes Internal Ribosome Entry Site-Mediated Translation and Viral Infection. J Virol 2018; 92:e01725-17. [PMID: 29093099 PMCID: PMC5752952 DOI: 10.1128/jvi.01725-17] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/02/2017] [Accepted: 10/30/2017] [Indexed: 12/19/2022] Open
Abstract
All viruses must successfully harness the host translational apparatus and divert it towards viral protein synthesis. Dicistroviruses use an unusual internal ribosome entry site (IRES) mechanism whereby the IRES adopts a three-pseudoknot structure that accesses the ribosome tRNA binding sites to directly recruit the ribosome and initiate translation from a non-AUG start site. A subset of dicistroviruses, including the honey bee Israeli acute paralysis virus (IAPV), encode an extra stem-loop (SLVI) 5'-adjacent to the IGR IRES. Previously, the function of this additional stem-loop was unknown. Here, we provide mechanistic and functional insights into the role of SLVI in IGR IRES translation and in virus infection. Biochemical analyses of a series of mutant IRESs demonstrated that SLVI does not function in ribosome recruitment but is required for proper ribosome positioning on the IRES to direct translation. Using a chimeric infectious clone derived from the related Cricket paralysis virus, we showed that the integrity of SLVI is important for optimal viral translation and viral yield. Based on structural models of ribosome-IGR IRES complexes, the SLVI is predicted to be in the vicinity of the ribosome E site. We propose that SLVI of IAPV IGR IRES functionally mimics interactions of an E-site tRNA with the ribosome to direct positioning of the tRNA-like domain of the IRES in the A site. IMPORTANCE: Viral internal ribosome entry sites are RNA elements and structures that allow some positive-sense monopartite RNA viruses to hijack the host ribosome to start viral protein synthesis. We demonstrate that a unique stem-loop structure is essential for optimal viral protein synthesis and for virus infection. Biochemical evidence shows that this viral stem-loop RNA structure impacts a fundamental property of the ribosome to start protein synthesis.
Affiliation(s)
- Hilda H T Au
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada
- Valentina M Elspass
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada
- Eric Jan
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada

18
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Ivo L Hofacker
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Peter F Stadler
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA

19
Savisaar R, Hurst LD. Both Maintenance and Avoidance of RNA-Binding Protein Interactions Constrain Coding Sequence Evolution. Mol Biol Evol 2017; 34:1110-1126. [PMID: 28138077 PMCID: PMC5400389 DOI: 10.1093/molbev/msx061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/07/2023] Open
Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom

20
Faure G, Ogurtsov AY, Shabalina SA, Koonin EV. Adaptation of mRNA structure to control protein folding. RNA Biol 2017; 14:1649-1654. [PMID: 28722509 DOI: 10.1080/15476286.2017.1349047] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 10/19/2022] Open
Abstract
Comparison of mRNA and protein structures shows that highly structured mRNAs typically encode compact protein domains, suggesting that mRNA structure controls protein folding. This function is apparently performed by distinct structural elements in the mRNA, which implies 'fine tuning' of mRNA structure under selection for optimal protein folding. We find that, during evolution, changes in the mRNA folding energy follow amino acid replacements, reinforcing the notion of an intimate connection between the structures of an mRNA and the protein it encodes, and the double encoding of protein sequence and folding in the mRNA.
Affiliation(s)
- Guilhem Faure
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Aleksey Y Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Svetlana A Shabalina
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

21
Endoh T, Sugimoto N. Conformational Dynamics of mRNA in Gene Expression as New Pharmaceutical Target. CHEM REC 2017; 17:817-832. [DOI: 10.1002/tcr.201700016] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/04/2017] [Indexed: 11/05/2022]
Affiliation(s)
- Tamaki Endoh
  - Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
- Naoki Sugimoto
  - Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
  - Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan

22
Meyer IM. In silico methods for co-transcriptional RNA secondary structure prediction and for investigating alternative RNA structure expression. Methods 2017; 120:3-16. [PMID: 28433606 DOI: 10.1016/j.ymeth.2017.04.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/17/2017] [Revised: 03/16/2017] [Accepted: 04/14/2017] [Indexed: 01/26/2023] Open
Abstract
RNA transcripts are the primary products of active genes in any living organism, including many viruses. Their cellular destiny not only depends on primary sequence signals, but can also be determined by RNA structure. Recent experimental evidence shows that many transcripts can be assigned more than a single functional RNA structure throughout their cellular life and that structure formation happens co-transcriptionally, i.e. as the transcript is synthesised in the cell. Moreover, functional RNA structures are not limited to non-coding transcripts, but can also feature in coding transcripts. The picture that now emerges is that RNA structures constitute an additional layer of information that can be encoded in any RNA transcript (and on top of other layers of information such as protein-context) in order to exert a wide range of functional roles. Moreover, different encoded RNA structures can be expressed at different stages of a transcript's life in order to alter the transcript's behaviour depending on its actual cellular context. Similar to the concept of alternative splicing for protein-coding genes, where a single transcript can yield different proteins depending on cellular context, it is thus appropriate to propose the notion of alternative RNA structure expression for any given transcript. This review introduces several computational strategies that my group developed to detect different aspects of RNA structure expression in vivo. Two aspects are of particular interest to us: (1) RNA secondary structure features that emerge during co-transcriptional folding and (2) functional RNA structure features that are expressed at different times of a transcript's life and potentially mutually exclusive.
Affiliation(s)
- Irmtraud M Meyer
  - Laboratory of Bioinformatics of RNA Structure and Transcriptome Regulation, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125 Berlin-Buch, Germany
  - Institute of Chemistry and Biochemistry, Free University, Thielallee 63, 14195 Berlin, Germany

23
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK

24
Pancsa R, Tompa P. Coding Regions of Intrinsic Disorder Accommodate Parallel Functions. Trends Biochem Sci 2016; 41:898-906. [DOI: 10.1016/j.tibs.2016.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/19/2016] [Revised: 08/16/2016] [Accepted: 08/19/2016] [Indexed: 02/01/2023]

25
Mazloomian A, Meyer IM. Genome-wide identification and characterization of tissue-specific RNA editing events in D. melanogaster and their potential role in regulating alternative splicing. RNA Biol 2016; 12:1391-401. [PMID: 26512413 PMCID: PMC4829317 DOI: 10.1080/15476286.2015.1107703] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 02/05/2023] Open
Abstract
RNA editing is a widespread mechanism that plays a crucial role in diversifying gene products. Its abundance and importance in regulating cellular processes were revealed using new sequencing technologies. The majority of these editing events, however, cannot be associated with regulatory mechanisms. We use tissue-specific high-throughput libraries of D. melanogaster to study RNA editing. We introduce an analysis pipeline that utilises large input data and explicitly captures ADAR's requirement for double-stranded regions. It combines probabilistic and deterministic filters and can identify RNA editing events with a low estimated false positive rate. Analyzing ten different tissue types, we predict 2879 editing sites and provide their detailed characterization. Our analysis pipeline accurately distinguishes genuine editing sites from SNPs and sequencing and mapping artifacts. Our editing sites are 3 times more likely to occur in exons with multiple splicing acceptor/donor sites than in exons with unique splice sites (p-value < 2 × 10^-15). Furthermore, we identify 244 edited regions where RNA editing and alternative splicing are likely to influence each other. For 96 out of these 244 regions, we find evolutionary evidence for conserved RNA secondary structures near splice sites, suggesting a potential regulatory mechanism where RNA editing may alter splicing patterns via changes in local RNA structure.
Affiliation(s)
- Alborz Mazloomian
  - Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Irmtraud M Meyer
  - Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada

26
Nitsche A, Stadler PF. Evolutionary clues in lncRNAs. WILEY INTERDISCIPLINARY REVIEWS-RNA 2016; 8. [PMID: 27436689 DOI: 10.1002/wrna.1376] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 02/24/2016] [Revised: 06/06/2016] [Accepted: 06/09/2016] [Indexed: 12/13/2022]
Abstract
The diversity of long non-coding RNAs (lncRNAs) in the human transcriptome is in stark contrast to the sparse exploration of their functions concomitant with their conservation and evolution. The pervasive transcription of the largely non-coding human genome makes the evolutionary age and conservation patterns of lncRNAs a topic of interest. Yet it is a fairly unexplored field, and these properties are not as easy to determine as for protein-coding genes. Although there are a few experimentally studied cases, which are conserved at the sequence level, most lncRNAs exhibit weak or untraceable primary sequence conservation. Recent studies shed light on the interspecies conservation of secondary structures among lncRNA homologs by using diverse computational methods. This highlights the importance of structure on functionality of lncRNAs as opposed to the poor impact of primary sequence changes. Further clues in the evolution of lncRNAs are given by selective constraints on non-coding gene structures (e.g., promoters or splice sites) as well as the conservation of prevalent spatio-temporal expression patterns. However, a rapid evolutionary turnover is observable throughout the heterogeneous group of lncRNAs. This still gives rise to questions about its functional meaning. WIREs RNA 2017, 8:e1376. doi: 10.1002/wrna.1376
Affiliation(s)
- Anne Nitsche
- Bioinformatics Group, Department of Computer Science, University Leipzig, Leipzig, Germany; Institute de Biologie Moléculaire et Cellulaire, Université de Strasbourg, Cedex, France
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology - IZI, Leipzig, Germany; Center for Non-Coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark; Department of Theoretical Chemistry, University of Vienna, Wien, Austria; Santa Fe Institute, Santa Fe, NM, USA
|
27
|
Abstract
A comprehensive understanding of RNA structure will provide fundamental insights into the cellular function of both coding and non-coding RNAs. Although many RNA structures have been analysed by traditional biophysical and biochemical methods, the low-throughput nature of these approaches has prevented investigation of the vast majority of cellular transcripts. Triggered by advances in sequencing technology, genome-wide approaches for probing the transcriptome are beginning to reveal how RNA structure affects each step of protein expression and RNA stability. In this Review, we discuss the emerging relationships between RNA structure and the regulation of gene expression.
|
28
|
Lai D, Meyer IM. e-RNA: a collection of web servers for comparative RNA structure prediction and visualisation. Nucleic Acids Res 2014; 42:W373-6. [PMID: 24810851 PMCID: PMC4086097 DOI: 10.1093/nar/gku292] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/30/2023] Open
Abstract
e-RNA offers a free and open-access collection of five published RNA sequence analysis tools, each solving specific problems not readily addressed by other available tools. Given multiple sequence alignments, Transat detects all conserved helices, including those expected in a final structure, but also transient, alternative and pseudo-knotted helices. RNA-Decoder uses unique evolutionary models to detect conserved RNA secondary structure in alignments which may be partly protein-coding. SimulFold simultaneously co-estimates the potentially pseudo-knotted conserved structure, alignment and phylogenetic tree for a set of homologous input sequences. CoFold predicts the minimum-free energy structure for an input sequence while taking the effects of co-transcriptional folding into account, thereby greatly improving the prediction accuracy for long sequences. R-chie is a program to visualise RNA secondary structures as arc diagrams, allowing for easy comparison and analysis of conserved base-pairs and quantitative features. The web site server dispatches user jobs to a cluster, where up to 100 jobs can be processed in parallel. Upon job completion, users can retrieve their results via a bookmarked or emailed link. e-RNA is located at http://www.e-rna.org.
Affiliation(s)
- Daniel Lai
- Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada
| | - Irmtraud M Meyer
- Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada
|
29
|
Gu W, Li M, Xu Y, Wang T, Ko JH, Zhou T. The impact of RNA structure on coding sequence evolution in both bacteria and eukaryotes. BMC Evol Biol 2014; 14:87. [PMID: 24758737 PMCID: PMC4021280 DOI: 10.1186/1471-2148-14-87] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 01/30/2014] [Accepted: 04/07/2014] [Indexed: 12/03/2022] Open
Abstract
Background: Many studies have found that functional RNA secondary structures are selectively conserved among species. However, the effect of RNA structure selection on coding sequence evolution remains unknown. To address this problem, we systematically investigated the relationship between nucleotide conservation level and its structural sensitivity in four model organisms, Escherichia coli, yeast, fly, and mouse. Results: We define structurally sensitive sites as those with putative local structure-disruptive mutations. Using both the Mantel-Haenszel procedure and association test, we found that structurally sensitive nucleotide sites evolved more slowly than non-sensitive sites in all four organisms. Furthermore, we observed that this association is more obvious in highly expressed genes and in regions near the start codon. Conclusion: We conclude that structurally sensitive sites in mRNA sequences normally have less nucleotide divergence in all species we analyzed. This study extends our understanding of the impact of RNA structure on coding sequence evolution, and is helpful to the development of a codon model with RNA structure information.
Affiliation(s)
- Wanjun Gu
- Research Center for Learning Sciences, Southeast University, Nanjing, Jiangsu 210096, China.
|
30
|
Bai C, Wang X, Zhang J, Sun A, Wei D, Yang S. Optimisation of the mRNA secondary structure to improve the expression of interleukin-24 (IL-24) in Escherichia coli. Biotechnol Lett 2014; 36:1711-6. [PMID: 24752814 DOI: 10.1007/s10529-014-1535-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 03/14/2014] [Accepted: 04/10/2014] [Indexed: 02/04/2023]
Abstract
Interleukin-24 (IL-24) is a novel cytokine selectively inhibiting proliferation of cancer cells but with little effect on normal cells. However, IL-24 is difficult to express in Escherichia coli. In this study, we optimised the secondary structure of the translation initiation region using a computational approach to obtain non-fusion recombinant IL-24 (nrIL-24). The Gibbs free energy of the region was raised from -22 to -9.07 kcal mol⁻¹ (that is, made less negative), potentially promoting a loose secondary structure formation and improving the translation initiation efficiency. As a result, the expression of nrIL-24 was increased to 26% of the total cellular protein from being barely detectable initially. nrIL-24 showed a concentration-dependent inhibition of A375 cells but had little effect on normal human cells. These results demonstrate that this method of increasing nrIL-24 expression is effective and efficient.
Affiliation(s)
- Chaogang Bai
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, People's Republic of China
|
31
|
Mao Y, Liu H, Liu Y, Tao S. Deciphering the rules by which dynamics of mRNA secondary structure affect translation efficiency in Saccharomyces cerevisiae. Nucleic Acids Res 2014; 42:4813-22. [PMID: 24561808 PMCID: PMC4005662 DOI: 10.1093/nar/gku159] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 01/09/2023] Open
Abstract
Messenger RNA (mRNA) secondary structure decreases the elongation rate, as ribosomes must unwind every structure they encounter during translation. Therefore, the strength of mRNA secondary structure is assumed to be reduced in highly translated mRNAs. However, previous studies in vitro reported a positive correlation between mRNA folding strength and protein abundance. The counterintuitive finding suggests that mRNA secondary structure affects translation efficiency in an undetermined manner. Here, we analyzed the folding behavior of mRNA during translation and its effect on translation efficiency. We simulated the translation process based on a novel computational model, taking into account the interactions among ribosomes, codon usage and mRNA secondary structures. We showed that mRNA secondary structure shortens ribosomal distance through the dynamics of folding strength. Notably, when adjacent ribosomes are close, mRNA secondary structures between them disappear, and codon usage determines the elongation rate. More importantly, our results showed that the combined effect of mRNA secondary structure and codon usage in highly translated mRNAs causes a short ribosomal distance in structural regions, which in turn eliminates the structures during translation, leading to a high elongation rate. Together, these findings reveal how the dynamics of mRNA secondary structure coupled with codon usage affect translation efficiency.
Affiliation(s)
- Yuanhui Mao
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi 712100, China, Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi 712100, China and College of Enology, Northwest A&F University, Yangling, Shaanxi 712100, China
|
32
|
Mao Y, Li Q, Wang W, Liang P, Tao S. Number variation of high stability regions is correlated with gene functions. Genome Biol Evol 2013; 5:484-93. [PMID: 23407773 PMCID: PMC3622296 DOI: 10.1093/gbe/evt020] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022] Open
Abstract
Various regulatory elements in messenger RNAs (mRNAs) carrying secondary structure play important roles in a wide range of expression processes. Numerous recent works have focused on the discovery of these functional elements that contain the conserved mRNA structures. However, to date, regions with high structural stability have been largely overlooked. In this study, we defined high stability regions (HSRs) in the coding sequences (CDSs) in bacteria based on the normalized folding free energy. We found that CDSs had a high number of HSRs, and these HSRs showed high structural context robustness compared with random sequences, indicating a direct selective constraint imposed on HSRs. A reduced ribosome speed was detected near the start position of HSRs, implying a possibility that an HSR acts as an obstacle to drive translational pausing that coordinates protein synthesis. Interestingly, we found that genes with high HSR density were enriched in the processes of translation, protein folding, and cell division. In addition, essential genes exhibited higher HSR density than nonessential genes. Overall, our study presented the previously unappreciated correlation between the number variation of HSRs and cellular processes.
Affiliation(s)
- Yuanhui Mao
- College of Life Sciences and State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, Shaanxi, China
|
33
|
Genome-wide patterns of codon bias are shaped by natural selection in the purple sea urchin, Strongylocentrotus purpuratus. G3-GENES GENOMES GENETICS 2013; 3:1069-83. [PMID: 23637123 PMCID: PMC3704236 DOI: 10.1534/g3.113.005769] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/25/2023]
Abstract
Codon usage bias has been documented in a wide diversity of species, but the relative contributions of mutational bias and various forms of natural selection remain unclear. Here, we describe for the first time genome-wide patterns of codon bias at 4623 genes in the purple sea urchin, Strongylocentrotus purpuratus. Preferred codons were identified at 18 amino acids that exclusively used G or C at third positions, which contrasted with the strong AT bias of the genome (overall GC content is 36.9%). The GC content of third positions and coding regions exhibited significant correlations with the magnitude of codon bias. In contrast, the GC content of introns and flanking regions was indistinguishable from the genome-wide background, which suggested a limited contribution of mutational bias to synonymous codon usage. Five distinct clusters of genes were identified that had significantly different synonymous codon usage patterns. A significant correlation was observed between codon bias and mRNA expression supporting translational selection, but this relationship was driven by only one highly biased cluster that represented only 8.6% of all genes. In all five clusters preferred codons were evolutionarily conserved to a similar degree despite differences in their synonymous codon usage distributions and magnitude of codon bias. The third positions of preferred codons in two codon usage groups also paired significantly more often in stems than in loops of mRNA secondary structure predictions, which suggested that codon bias might also affect mRNA stability. Our results suggest that mutational bias has played a minor role in determining codon bias in S. purpuratus and that preferred codon usage may be heterogeneous across different genes and subject to different forms of natural selection.
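The contrast drawn in this abstract between GC content at third codon positions and overall GC content can be illustrated with a minimal Python sketch. The function names `gc_content` and `gc3_content` and the toy sequence are assumptions made for illustration; they are not taken from the paper, whose own analysis covered 4623 genes.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C bases over the whole sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds: str) -> float:
    """Fraction of codons whose third position is G or C.

    Assumes `cds` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete final codon
    thirds = cds[2:usable:3]           # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Hypothetical toy coding sequence: codons ATG, GCC, AAA
toy_cds = "ATGGCCAAA"
print(f"GC  = {gc_content(toy_cds):.3f}")
print(f"GC3 = {gc3_content(toy_cds):.3f}")
```

A gene whose GC3 is well above the GC content of its introns and flanking regions would be the kind of pattern the abstract interprets as evidence against a purely mutational origin of codon bias.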
|
34
|
Chursov A, Frishman D, Shneider A. Conservation of mRNA secondary structures may filter out mutations in Escherichia coli evolution. Nucleic Acids Res 2013; 41:7854-60. [PMID: 23783573 PMCID: PMC3763529 DOI: 10.1093/nar/gkt507] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/22/2023] Open
Abstract
Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins.
Affiliation(s)
- Andrey Chursov
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftzentrum Weihenstephan, Maximus-von-Imhof-Forum 3, D-85354, Freising, Germany, Helmholtz Center Munich-German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany and Cure Lab, Inc., 43 Rybury Hillway, Needham, MA 02492, USA
|
35
|
Zhu JYA, Steif A, Proctor JR, Meyer IM. Transient RNA structure features are evolutionarily conserved and can be computationally predicted. Nucleic Acids Res 2013; 41:6273-85. [PMID: 23625966 PMCID: PMC3695514 DOI: 10.1093/nar/gkt319] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/06/2023] Open
Abstract
Functional RNA structures tend to be conserved during evolution. This finding is, for example, exploited by comparative methods for RNA secondary structure prediction that currently provide the state of the art in terms of prediction accuracy. We here provide strong evidence that homologous RNA genes not only fold into similar final RNA structures, but that their folding pathways also share common transient structural features that have been evolutionarily conserved. For this, we compile and investigate a non-redundant data set of 32 sequences with known transient and final RNA secondary structures and devise a dedicated computational analysis pipeline.
Affiliation(s)
- Jing Yun A Zhu
- Centre for High-Throughput Biology, University of British Columbia, 2125 East Mall, Vancouver, British Columbia V6T 1Z4, Canada
|
36
|
Ayoub NA, Garb JE, Kuelbs A, Hayashi CY. Ancient properties of spider silks revealed by the complete gene sequence of the prey-wrapping silk protein (AcSp1). Mol Biol Evol 2013; 30:589-601. [PMID: 23155003 PMCID: PMC3563967 DOI: 10.1093/molbev/mss254] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Indexed: 01/19/2023] Open
Abstract
Spider silk fibers have impressive mechanical properties and are primarily composed of highly repetitive structural proteins (termed spidroins) encoded by a single gene family. Most characterized spidroin genes are incompletely known because of their extreme size (typically >9 kb) and repetitiveness, limiting understanding of the evolutionary processes that gave rise to their unusual gene architectures. The only complete spidroin genes characterized thus far form the dragline in the Western black widow, Latrodectus hesperus. Here, we describe the first complete gene sequence encoding the aciniform spidroin AcSp1, the primary component of spider prey-wrapping fibers. L. hesperus AcSp1 contains a single enormous (∼19 kb) exon. The AcSp1 repeat sequence is exceptionally conserved between two widow species (∼94% identity) and between widows and distantly related orb-weavers (∼30% identity), consistent with a history of strong purifying selection on its amino acid sequence. Furthermore, the 16 repeats (each 371-375 amino acids long) found in black widow AcSp1 are, on average, >99% identical at the nucleotide level. A combination of stabilizing selection on amino acid sequence, selection on silent sites, and intragenic recombination likely explains the extreme homogenization of AcSp1 repeats. In addition, phylogenetic analyses of spidroin paralogs support a gene duplication event occurring concomitantly with specialization of the aciniform glands and the tubuliform glands, which synthesize egg-case silk. With repeats that are dramatically different in length and amino acid composition from dragline spidroins, our L. hesperus AcSp1 expands the knowledge base for developing silk-based biomimetic technologies.
Affiliation(s)
- Nadia A Ayoub
- Department of Biology, Washington and Lee University, USA.
|
37
|
Mao Y, Wang W, Cheng N, Li Q, Tao S. Universally increased mRNA stability downstream of the translation initiation site in eukaryotes and prokaryotes. Gene 2013; 517:230-5. [PMID: 23313297 DOI: 10.1016/j.gene.2012.12.062] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/14/2012] [Accepted: 12/03/2012] [Indexed: 11/26/2022]
Abstract
Local secondary structures in coding sequences have important functions across various translational processes. To date, however, the local structures and their functions in the early stage of translation elongation remain poorly understood. Here, we surveyed the structural stability in the first 180 nucleotides of the coding sequence of 27 species using a computational method. We found that the structural stability in the 30-80 nucleotide interval was significantly higher than that in other regions in eukaryotes and most prokaryotes. No significant correlation between local translation efficiency and structural stability was observed, suggesting that this structural region has undergone selection pressure directly to maintain high stability. Furthermore, ribosomes were blocked by this region, providing an opportunity for co-translational regulation. Remarkably, in eukaryotes, we found that mRNAs with higher structural stability in the 30-80 nucleotide interval tended to encode secreted proteins. Overall, our results revealed a previously unappreciated correlation between structural stability and protein localization.
Affiliation(s)
- Yuanhui Mao
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China
|
38
|
Shabalina SA, Spiridonov NA, Kashina A. Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity. Nucleic Acids Res 2013; 41:2073-94. [PMID: 23293005 PMCID: PMC3575835 DOI: 10.1093/nar/gks1205] [Citation(s) in RCA: 187] [Impact Index Per Article: 17.0] [Indexed: 01/10/2023] Open
Abstract
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. The RNA level selection is acting on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions.
Affiliation(s)
- Svetlana A Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20984, USA.
|
39
|
Salari R, Kimchi-Sarfaty C, Gottesman MM, Przytycka TM. Sensitive measurement of single-nucleotide polymorphism-induced changes of RNA conformation: application to disease studies. Nucleic Acids Res 2012; 41:44-53. [PMID: 23125360 PMCID: PMC3592397 DOI: 10.1093/nar/gks1009] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 02/01/2023] Open
Abstract
Single-nucleotide polymorphisms (SNPs) are often linked to critical phenotypes such as diseases or responses to vaccines, medications and environmental factors. However, the specific molecular mechanism by which a causal SNP acts is usually not obvious. Changes in RNA secondary structure emerge as a possible explanation, necessitating the development of methods to measure the impact of single-nucleotide variation on RNA structure. Despite the recognition of the importance of considering the changes in the Boltzmann ensemble of RNA conformers in this context, a formal method to perform such a comparison directly was lacking. Here, we solved this problem and designed an efficient method to compute the relative entropy between the Boltzmann ensembles of the native and a mutant structure. On the basis of this theoretical progress, we developed a software tool, remuRNA, and investigated examples of its application. Comparing the impact of common SNPs naturally occurring in populations with the impact of random point mutations, we found that structural changes introduced by common SNPs are smaller than those introduced by random point mutations. This suggests a natural selection against mutations that significantly change RNA structure and demonstrates, surprisingly, that randomly inserted point mutations provide inadequate estimation of the effects of random mutations. Subsequently, we applied remuRNA to determine which of the disease-associated non-coding SNPs are potentially related to RNA structural changes.
Affiliation(s)
- Raheleh Salari
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA
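The quantity named in this abstract, the relative entropy between two Boltzmann ensembles, can be sketched on a toy example. The Python below is only an illustration under stated assumptions: it enumerates a handful of hypothetical structures with made-up free energies, whereas remuRNA is described as computing the quantity efficiently over the full ensemble. The function names and energy values are not from the paper.

```python
import math

# Thermal energy RT in kcal/mol at 37 °C (R = 1.9872e-3 kcal mol^-1 K^-1).
RT = 1.9872e-3 * 310.15


def boltzmann(energies):
    """Boltzmann probabilities for a list of free energies in kcal/mol."""
    weights = [math.exp(-e / RT) for e in energies]
    z = sum(weights)  # partition function of the enumerated ensemble
    return [w / z for w in weights]


def relative_entropy(p, q):
    """Relative entropy D(P || Q) in nats; both must cover the same states."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical free energies of the same three structures when formed by
# the wild-type sequence and by a single-nucleotide mutant.
wild_type_energies = [-5.0, -4.2, -3.0]
mutant_energies = [-3.5, -4.4, -3.1]

p_wt = boltzmann(wild_type_energies)
p_mut = boltzmann(mutant_energies)
print(f"D(wt || mutant) = {relative_entropy(p_wt, p_mut):.3f} nats")
```

A value near zero means the mutation barely shifts the ensemble; larger values indicate that probability mass has moved between structures, which is the kind of change the abstract associates with structurally disruptive SNPs.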
|
40
|
Pervouchine DD, Khrameeva EE, Pichugina MY, Nikolaienko OV, Gelfand MS, Rubtsov PM, Mironov AA. Evidence for widespread association of mammalian splicing and conserved long-range RNA structures. RNA (NEW YORK, N.Y.) 2012; 18:1-15. [PMID: 22128342 PMCID: PMC3261731 DOI: 10.1261/rna.029249.111] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 05/07/2023]
Abstract
Pre-mRNA structure impacts many cellular processes, including splicing in genes associated with disease. The contemporary paradigm of RNA structure prediction is biased toward secondary structures that occur within short ranges of pre-mRNA, although long-range base-pairings are known to be at least as important. Recently, we developed an efficient method for detecting conserved RNA structures on the genome-wide scale, one that does not require multiple sequence alignments and works equally well for the detection of local and long-range base-pairings. Using an enhanced method that detects base-pairings at all possible combinations of splice sites within each gene, we now report RNA structures that could be involved in the regulation of splicing in mammals. Statistically, we demonstrate strong association between the occurrence of conserved RNA structures and alternative splicing, where local RNA structures are generally more frequent at alternative donor splice sites, while long-range structures are more associated with weak alternative acceptor splice sites. As an example, we validated the RNA structure in the human SF1 gene using minigenes in the HEK293 cell line. Point mutations that disrupted the base-pairing of two complementary boxes between exons 9 and 10 of this gene altered the splicing pattern, while the compensatory mutations that reestablished the base-pairing reverted splicing to that of the wild-type. There is statistical evidence for a Dscam-like class of mammalian genes, in which mutually exclusive RNA structures control mutually exclusive alternative splicing. In sum, we propose that long-range base-pairings carry an important, yet unconsidered part of the splicing code, and that, even by modest estimates, there must be thousands of such potentially regulatory structures conserved throughout the evolutionary history of mammals.
Affiliation(s)
- Dmitri D Pervouchine
- Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, 119992, GSP-2 Russia.
|
41
|
Abstract
To detect positive Darwinian selection it is thought essential to compare two sequences. Despite its defects, "the comparative method rules." However, genes evolving rapidly under positive selection conflict more with internal forces (the genome phenotype) than genes evolving slowly under negative selection. In particular, there is conflict with stem-loop potential. The conflict between protein-encoding potential (primary information) and stem-loop potential (secondary information) permits detection of positive selection in a single sequence. The degree to which secondary information is compromised provides a measure of the speed of transmission of primary information. Thus, the sovereignty of the comparative method is challenged not only by its own defects, but also by the availability of a single-sequence method. However, while of limited utility for positive selection, the comparative method casts new light on Darwin's great question — the origin of species. Comparison of rates of synonymous and non-synonymous mutation suggests that branching into new species begins with synonymous mutations.
Affiliation(s)
- DONALD R. FORSDYKE
- Department of Biochemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
|
42
|
Chursov A, Walter MC, Schmidt T, Mironov A, Shneider A, Frishman D. Sequence-structure relationships in yeast mRNAs. Nucleic Acids Res 2011; 40:956-62. [PMID: 21954438 PMCID: PMC3273797 DOI: 10.1093/nar/gkr790] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/27/2023] Open
Abstract
It is generally accepted that functionally important RNA structure is more conserved than sequence due to compensatory mutations that may alter the sequence without disrupting the structure. For small RNA molecules sequence–structure relationships are relatively well understood. However, structural bioinformatics of mRNAs is still in its infancy due to a virtual absence of experimental data. This report presents the first quantitative assessment of sequence–structure divergence in the coding regions of mRNA molecules based on recently published transcriptome-wide experimental determination of their base pairing patterns. Structural resemblance in paralogous mRNA pairs quickly drops as sequence identity decreases from 100% to 85–90%. Structures of mRNAs sharing sequence identity below roughly 85% are essentially uncorrelated. This outcome is in dramatic contrast to small functional non-coding RNAs where sequence and structure divergence are correlated at very low levels of sequence similarity. The fact that very similar mRNA sequences can have vastly different secondary structures may imply that the particular global shape of base paired elements in coding regions does not play a major role in modulating gene expression and translation efficiency. Apparently, the need to maintain stable three-dimensional structures of encoded proteins places a much higher evolutionary pressure on mRNA sequences than on their RNA structures.
Affiliation(s)
- Andrey Chursov
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Maximus-von-Imhof-Forum 3, D-85354, Freising, Germany
|
43
|
Findeiss S, Engelhardt J, Prohaska SJ, Stadler PF. Protein-coding structured RNAs: A computational survey of conserved RNA secondary structures overlapping coding regions in drosophilids. Biochimie 2011; 93:2019-23. [PMID: 21835221 DOI: 10.1016/j.biochi.2011.07.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 05/28/2011] [Accepted: 07/19/2011] [Indexed: 11/15/2022]
Abstract
Functional RNA elements can also be embedded within exonic sequences coding for functional proteins. While not uncommon in viruses, only a few examples of this type have been described in some detail for eukaryotic genomes. Here we use RNAz and RNAcode, two comparative genomics methods that measure signatures of stabilizing selection acting on RNA secondary structure and peptide sequence, respectively, to survey the fruit fly genomes. We estimate that there might be on the order of 1000 loci that are subject to dual selection pressure. The genome-wide screens used also expose the limitations of the currently available methods.
Affiliation(s)
- Sven Findeiss
- Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria.
|
44
|
Kahali B, Ahmad S, Ghosh TC. Selective constraints in yeast genes with differential expressivity: codon pair usage and mRNA stability perspectives. Gene 2011; 481:76-82. [PMID: 21554930 DOI: 10.1016/j.gene.2011.04.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/06/2011] [Revised: 04/18/2011] [Accepted: 04/19/2011] [Indexed: 01/22/2023]
Abstract
Protein translation has been elucidated to be dictated by evolutionary constraints, namely, variations in tRNA availabilities and/or variations in codon-anticodon binding that are manifested in biased codon usage. Taking advantage of publicly available mRNA expression and protein abundance data for Saccharomyces cerevisiae, we have performed a comprehensive analysis of the diverse factors guiding translation leading to desired protein levels irrespective of the corresponding high or low mRNA levels. It has been elucidated in this study that different combinations of most abundant/non abundant tRNA isoacceptors are selected for in S. cerevisiae that help in achieving the optimum speed and accuracy in the protein translation process. This is also accompanied by the strategic location of codon pairs in coherence with mRNA secondary structure folding stability for the above mentioned combinations of tRNA isoacceptors. We thus find that codon pair contextual effects, in addition to tRNA abundance and mRNA folding stability during the translation elongation process, play plausible roles in maintaining translation accuracy and speed that can achieve desired protein levels.
Affiliation(s)
- Bratati Kahali
- Bioinformatics Centre, Bose Institute, C.I.T. Scheme VII M, Kolkata, India.
|
45
|
Martínez-Pérez F, Bendena WG, Chang BSW, Tobe SS. Influence of codon usage bias on FGLamide-allatostatin mRNA secondary structure. Peptides 2011; 32:509-17. [PMID: 20950662 DOI: 10.1016/j.peptides.2010.10.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/28/2010] [Revised: 10/06/2010] [Accepted: 10/06/2010] [Indexed: 02/07/2023]
Abstract
The FGLamide allatostatins (ASTs) are invertebrate neuropeptides which inhibit juvenile hormone biosynthesis in Dictyoptera and related orders. They also show myomodulatory activity. FGLamide AST nucleotide frequencies and codon bias were investigated with respect to possible effects on mRNA secondary structure. 367 putative FGLamide ASTs and their potential endoproteolytic cleavage sites were identified from 40 species of crustaceans, chelicerates and insects. Among these, 55% comprised only 11 amino acids. An FGLamide AST consensus was identified to be (X)(1→16)Y(S/A/N/G)FGLGKR, with a strong bias for the codons UUU encoding Phe and AAA encoding Lys, which can form strong Watson-Crick pairing in all peptides analyzed. The physical distance between these codons favors a loop structure from Ser/Ala-Phe to Lys-Arg. Other loops and hairpin loops were also inferred from the codon frequencies in the N-terminal motif, and the first amino acids from the C-terminal motif, or the dibasic potential endoproteolytic cleavage site. Our results indicate that nucleotide frequencies and codon usage bias in FGLamide ASTs tend to favor mRNA folds in the codon sequence in the C-terminal active peptide core and at the dibasic potential endoproteolytic cleavage site.
Affiliation(s)
- Francisco Martínez-Pérez
- Department of Cell and Systems Biology, University of Toronto, 110 St. George St., Toronto, ON M5S 3G5, Canada
|
46
|
Computational RNomics: Structure identification and functional prediction of non-coding RNAs in silico. SCIENCE CHINA-LIFE SCIENCES 2010; 53:548-62. [DOI: 10.1007/s11427-010-0101-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/20/2009] [Accepted: 06/28/2009] [Indexed: 01/05/2023]
|
47
|
Haque A, Buratti E, Baralle FE. Functional properties and evolutionary splicing constraints on a composite exonic regulatory element of splicing in CFTR exon 12. Nucleic Acids Res 2009; 38:647-59. [PMID: 19910374 PMCID: PMC2811005 DOI: 10.1093/nar/gkp1040] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 01/21/2023]
Abstract
In general, splicing regulatory elements are defined as Enhancers or Silencers depending on their positive or negative effect upon exon inclusion. Often, these sequences are present separately from each other in exonic/intronic sequences. The Composite Exonic Splicing Regulatory Elements (CERES) represent an extreme physical overlap of enhancer/silencer activity. As a result, when CERES elements are mutated the consequences on the splicing process are difficult to predict. Here, we show that the functional activity of the CERES2 sequence in CFTR exon 12 is regulated by the binding, in very close proximity to each other, of several SR and hnRNP proteins. Moreover, our results show that practically the entire exon 12 sequence context participates in its definition. The consequences of this situation can be observed at the evolutionary level by comparing changes in conservation of different splicing elements in different species. In conclusion, our study highlights how it is increasingly difficult to define many exonic sequences by simply breaking them down into isolated enhancer/silencer or even neutral elements. The real picture is close to one of continuous competition between positive and negative factors where affinity for the target sequences and other dynamic factors decide the inclusion or exclusion of the exon.
Affiliation(s)
- Ariful Haque
- International Centre for Genetic Engineering and Biotechnology, 34149 Trieste, Italy
|
48
|
Soldà G, Makunin IV, Sezerman OU, Corradin A, Corti G, Guffanti A. An Ariadne's thread to the identification and annotation of noncoding RNAs in eukaryotes. Brief Bioinform 2009; 10:475-89. [PMID: 19383843 DOI: 10.1093/bib/bbp022] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
Non-protein coding RNAs (ncRNAs) have emerged as a vast and heterogeneous portion of eukaryotic transcriptomes. Several ncRNA families, either short (<200 nucleotides, nt) or long (>200 nt), have been described and implicated in a variety of biological processes, from translation to gene expression regulation and nuclear trafficking. Most probably, other families are still to be discovered. Computational methods for ncRNA research require different approaches from the ones normally used in the prediction of protein-coding genes. Indeed, primary sequence alone is often insufficient to infer ncRNA functionality, whereas secondary structure and local conservation of portions of the transcript could provide useful information for both the prediction and the functional annotation of ncRNAs. Here we present an overview of computational methods and bioinformatics resources currently available for studying ncRNA genes, introducing the common themes as well as the different approaches required for long and short ncRNA identification and annotation.
Affiliation(s)
- Giulia Soldà
- Department of Biology and Genetics for Medical Sciences, University of Milano, 20133 Milan, Italy.
|
49
|
Schwartz S, Gal-Mark N, Kfir N, Oren R, Kim E, Ast G. Alu exonization events reveal features required for precise recognition of exons by the splicing machinery. PLoS Comput Biol 2009; 5:e1000300. [PMID: 19266014 PMCID: PMC2639721 DOI: 10.1371/journal.pcbi.1000300] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 09/24/2008] [Accepted: 01/23/2009] [Indexed: 12/12/2022]
Abstract
Despite decades of research, the question of how the mRNA splicing machinery precisely identifies short exonic islands within the vast intronic oceans remains to a large extent obscure. In this study, we analyzed Alu exonization events, aiming to understand the requirements for correct selection of exons. Comparison of exonizing Alus to their non-exonizing counterparts is informative because Alus in these two groups have retained high sequence similarity but are perceived differently by the splicing machinery. We identified and characterized numerous features used by the splicing machinery to discriminate between Alu exons and their non-exonizing counterparts. Of these, the most novel is secondary structure: Alu exons in general and their 5′ splice sites (5′ss) in particular are characterized by decreased stability of local secondary structures with respect to their non-exonizing counterparts. We detected numerous further differences between Alu exons and their non-exonizing counterparts, among others in terms of exon–intron architecture and strength of splicing signals, enhancers, and silencers. Support vector machine analysis revealed that these features allow a high level of discrimination (AUC = 0.91) between exonizing and non-exonizing Alus. Moreover, the computationally derived probabilities of exonization significantly correlated with the biological inclusion level of the Alu exons, and the model could also be extended to general datasets of constitutive and alternative exons. This indicates that the features detected and explored in this study provide the basis not only for precise exon selection but also for the fine-tuned regulation thereof, manifested in cases of alternative splicing.
A typical human gene consists of 9 exons around 150 nucleotides in length, separated by introns that are ∼3,000 nucleotides long. The challenge of the splicing machinery is to precisely identify and ligate the exons, while removing the introns. We aimed to understand how the splicing machinery meets this momentous challenge, based on Alu exonization events. Alus are transposable elements, of which approximately one million copies exist in the human genome, a large portion of which lie within introns. Throughout evolution, some intronic Alus accumulated mutations and became recognized by the splicing machinery as exons, a process termed exonization. Such Alus remain highly similar to their non-exonizing counterparts but are perceived as different by the splicing machinery. By comparing exonizing Alus to their non-exonizing counterparts, we were able to identify numerous features in which they differ and which presumably lead to the recognition only of the former by the splicing machinery. Our findings reveal insights regarding the role of local RNA secondary structures, exon–intron architecture constraints, and splicing regulatory signals. We integrated these features in a computational model, which was able to successfully mimic the function of the splicing machinery and discriminate between true Alu exons and their intronic counterparts, highlighting the functional importance of these features.
Affiliation(s)
- Schraga Schwartz
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
|
50
|
Biro JC. Discovery of proteomic code with mRNA assisted protein folding. Int J Mol Sci 2008; 9:2424-2446. [PMID: 19330085 PMCID: PMC2635648 DOI: 10.3390/ijms9122424] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/18/2008] [Revised: 11/24/2008] [Accepted: 12/02/2008] [Indexed: 01/18/2023]
Abstract
The 3x redundancy of the Genetic Code is usually explained as a necessity to increase the mutation-resistance of the genetic information. However, recent bioinformatic observations indicate that the redundant Genetic Code contains more biological information than previously known, information that is additional to the 64/20 definition of amino acids. It might define the physico-chemical and structural properties of amino acids, the codon boundaries, the amino acid co-locations (interactions) in the coded proteins and the free folding energy of mRNAs. This additional information, which seems to be necessary to determine the 3D structure of coding nucleic acids as well as the coded proteins, is known as the Proteomic Code and mRNA Assisted Protein Folding.
Affiliation(s)
- Jan C Biro
- Homulus Foundation, 612 S Flower St, Los Angeles, CA 90017, USA. Tel. +1-213-627-6134
|