1
Gerling N, Mendez JA, Gomez E, Ruiz-Garcia J. The separation between mRNA-ends is more variable than expected. FEBS Open Bio 2024. [PMID: 39226224 DOI: 10.1002/2211-5463.13877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Accepted: 07/29/2024] [Indexed: 09/05/2024]
Abstract
Effective circularization of mRNA molecules is a key step for the efficient initiation of translation. Research has shown that the intrinsic separation of the ends of mRNA molecules is rather small, suggesting that intramolecular arrangements could provide this effective circularization. Considering that the innate proximity of RNA ends might have important unknown biological implications, we aimed to determine whether the close proximity of the ends of mRNA molecules is a conserved feature across organisms and gain further insights into the functional effects of the proximity of RNA ends. To do so, we studied the secondary structure of 274 full native mRNA molecules from 17 different organisms to calculate the contour length (CL) of the external loop as an index of their end-to-end separation. Our computational predictions show bigger variations (from 0.59 to 31.8 nm) than previously reported and also than those observed in random sequences. Our results suggest that separations larger than 18.5 nm are not favored, whereas short separations could be related to phenotypical stability. Overall, our work implies the existence of a biological mechanism responsible for the increase in the observed variability, suggesting that the CL features of the exterior loop could be relevant for the initiation of translation and that a short CL could contribute to the stability of phenotypes.
Affiliation(s)
- Nancy Gerling
  - Institute of Physics, Biological Physics Laboratory, San Luis Potosi, Mexico
- J Alfredo Mendez
  - Institute of Physics, Laboratory of Molecular Biophysics, San Luis Potosi, Mexico
- Eduardo Gomez
  - Cold Atoms Laboratory, Institute of Physics, Universidad Autónoma de San Luis Potosí, San Luis Potosí, Mexico
- Jaime Ruiz-Garcia
  - Institute of Physics, Biological Physics Laboratory, San Luis Potosi, Mexico

2
Allan MF, Aruda J, Plung JS, Grote SL, des Taillades YJM, de Lajarte AA, Bathe M, Rouskin S. Discovery and Quantification of Long-Range RNA Base Pairs in Coronavirus Genomes with SEARCH-MaP and SEISMIC-RNA. Research Square 2024:rs.3.rs-4814547. [PMID: 39149495 PMCID: PMC11326378 DOI: 10.21203/rs.3.rs-4814547/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
RNA molecules perform a diversity of essential functions for which their linear sequences must fold into higher-order structures. Techniques including crystallography and cryogenic electron microscopy have revealed 3D structures of ribosomal, transfer, and other well-structured RNAs; while chemical probing with sequencing facilitates secondary structure modeling of any RNAs of interest, even within cells. Ongoing efforts continue increasing the accuracy, resolution, and ability to distinguish coexisting alternative structures. However, no method can discover and quantify alternative structures with base pairs spanning arbitrarily long distances - an obstacle for studying viral, messenger, and long noncoding RNAs, which may form long-range base pairs. Here, we introduce the method of Structure Ensemble Ablation by Reverse Complement Hybridization with Mutational Profiling (SEARCH-MaP) and software for Structure Ensemble Inference by Sequencing, Mutation Identification, and Clustering of RNA (SEISMIC-RNA). We use SEARCH-MaP and SEISMIC-RNA to discover that the frameshift stimulating element of SARS coronavirus 2 base-pairs with another element 1 kilobase downstream in nearly half of RNA molecules, and that this structure competes with a pseudoknot that stimulates ribosomal frameshifting. Moreover, we identify long-range base pairs involving the frameshift stimulating element in other coronaviruses including SARS coronavirus 1 and transmissible gastroenteritis virus, and model the full genomic secondary structure of the latter. These findings suggest that long-range base pairs are common in coronaviruses and may regulate ribosomal frameshifting, which is essential for viral RNA synthesis. We anticipate that SEARCH-MaP will enable solving many RNA structure ensembles that have eluded characterization, thereby enhancing our general understanding of RNA structures and their functions. 
SEISMIC-RNA, software for analyzing mutational profiling data at any scale, could power future studies on RNA structure and is available on GitHub and the Python Package Index.
Affiliation(s)
- Matthew F. Allan
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA 02139
  - Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA 02139
- Justin Aruda
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115
  - Harvard Program in Biological and Biomedical Sciences, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA 02115
- Jesse S. Plung
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115
  - Harvard Program in Virology, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA 02115
- Scott L. Grote
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115
- Albéric A. de Lajarte
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115
- Mark Bathe
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA 02139
- Silvi Rouskin
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115

3
Allan MF, Aruda J, Plung JS, Grote SL, Martin des Taillades YJ, de Lajarte AA, Bathe M, Rouskin S. Discovery and Quantification of Long-Range RNA Base Pairs in Coronavirus Genomes with SEARCH-MaP and SEISMIC-RNA. bioRxiv 2024:2024.04.29.591762. [PMID: 38746332 PMCID: PMC11092567 DOI: 10.1101/2024.04.29.591762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
RNA molecules perform a diversity of essential functions for which their linear sequences must fold into higher-order structures. Techniques including crystallography and cryogenic electron microscopy have revealed 3D structures of ribosomal, transfer, and other well-structured RNAs; while chemical probing with sequencing facilitates secondary structure modeling of any RNAs of interest, even within cells. Ongoing efforts continue increasing the accuracy, resolution, and ability to distinguish coexisting alternative structures. However, no method can discover and quantify alternative structures with base pairs spanning arbitrarily long distances - an obstacle for studying viral, messenger, and long noncoding RNAs, which may form long-range base pairs. Here, we introduce the method of Structure Ensemble Ablation by Reverse Complement Hybridization with Mutational Profiling (SEARCH-MaP) and software for Structure Ensemble Inference by Sequencing, Mutation Identification, and Clustering of RNA (SEISMIC-RNA). We use SEARCH-MaP and SEISMIC-RNA to discover that the frameshift stimulating element of SARS coronavirus 2 base-pairs with another element 1 kilobase downstream in nearly half of RNA molecules, and that this structure competes with a pseudoknot that stimulates ribosomal frameshifting. Moreover, we identify long-range base pairs involving the frameshift stimulating element in other coronaviruses including SARS coronavirus 1 and transmissible gastroenteritis virus, and model the full genomic secondary structure of the latter. These findings suggest that long-range base pairs are common in coronaviruses and may regulate ribosomal frameshifting, which is essential for viral RNA synthesis. We anticipate that SEARCH-MaP will enable solving many RNA structure ensembles that have eluded characterization, thereby enhancing our general understanding of RNA structures and their functions. 
SEISMIC-RNA, software for analyzing mutational profiling data at any scale, could power future studies on RNA structure and is available on GitHub and the Python Package Index.

4
Tieng FYF, Abdullah-Zawawi MR, Md Shahri NAA, Mohamed-Hussein ZA, Lee LH, Mutalib NSA. A Hitchhiker's guide to RNA-RNA structure and interaction prediction tools. Brief Bioinform 2023; 25:bbad421. [PMID: 38040490 PMCID: PMC10753535 DOI: 10.1093/bib/bbad421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/31/2023] [Revised: 10/16/2023] [Accepted: 10/26/2023] [Indexed: 12/03/2023]
Abstract
RNA biology has risen to prominence after a remarkable discovery of diverse functions of noncoding RNA (ncRNA). Most untranslated transcripts often exert their regulatory functions into RNA-RNA complexes via base pairing with complementary sequences in other RNAs. An interplay between RNAs is essential, as it possesses various functional roles in human cells, including genetic translation, RNA splicing, editing, ribosomal RNA maturation, RNA degradation and the regulation of metabolic pathways/riboswitches. Moreover, the pervasive transcription of the human genome allows for the discovery of novel genomic functions via RNA interactome investigation. The advancement of experimental procedures has resulted in an explosion of documented data, necessitating the development of efficient and precise computational tools and algorithms. This review provides an extensive update on RNA-RNA interaction (RRI) analysis via thermodynamic- and comparative-based RNA secondary structure prediction (RSP) and RNA-RNA interaction prediction (RIP) tools and their general functions. We also highlighted the current knowledge of RRIs and the limitations of RNA interactome mapping via experimental data. Then, the gap between RSP and RIP, the importance of RNA homologues, the relationship between pseudoknots, and RNA folding thermodynamics are discussed. It is hoped that these emerging prediction tools will deepen the understanding of RNA-associated interactions in human diseases and hasten treatment processes.
Affiliation(s)
- Francis Yew Fu Tieng
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Nur Alyaa Afifah Md Shahri
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Institute of Systems Biology (INBIOSIS), UKM, Selangor 43600, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, UKM, Selangor 43600, Malaysia
- Learn-Han Lee
  - Sunway Microbiomics Centre, School of Medical and Life Sciences, Sunway University, Sunway City 47500, Malaysia
  - Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Nurul-Syakima Ab Mutalib
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
  - Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
  - Faculty of Health Sciences, UKM, Kuala Lumpur 50300, Malaysia

5
Waldl M, Spicher T, Lorenz R, Beckmann IK, Hofacker IL, Löhneysen SV, Stadler PF. Local RNA folding revisited. J Bioinform Comput Biol 2023; 21:2350016. [PMID: 37522173 DOI: 10.1142/s0219720023500166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Most of the functional RNA elements located within large transcripts are local. Local folding therefore serves a practically useful approximation to global structure prediction. Due to the sensitivity of RNA secondary structure prediction to the exact definition of sequence ends, accuracy can be increased by averaging local structure predictions over multiple, overlapping sequence windows. These averages can be computed efficiently by dynamic programming. Here we revisit the local folding problem, present a concise mathematical formalization that generalizes previous approaches and show that correct Boltzmann samples can be obtained by local stochastic backtracing in McCaskill's algorithms but not from local folding recursions. Corresponding new features are implemented in the ViennaRNA package to improve the support of local folding. Applications include the computation of maximum expected accuracy structures from RNAplfold data and a mutual information measure to quantify the sensitivity of individual sequence positions.
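The window averaging mentioned in this abstract can be illustrated with a small sketch. It assumes a hypothetical callable `local_pair_probs(start, end)` that returns base-pair probabilities for a single window; this is a stand-in for a real folding routine, not the ViennaRNA API. Each pair's probability is averaged over all fixed-length windows that contain both positions.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple

Pair = Tuple[int, int]


def average_over_windows(
    n: int,
    window: int,
    local_pair_probs: Callable[[int, int], Dict[Pair, float]],
) -> Dict[Pair, float]:
    """Average pair probabilities over all length-`window` windows covering each pair.

    `local_pair_probs(start, end)` is a hypothetical stand-in for a folding
    routine; it returns {(i, j): probability} with 0-based positions i < j
    inside the half-open window [start, end).
    """
    last_start = max(n - window, 0)
    totals: Dict[Pair, float] = defaultdict(float)
    for start in range(0, last_start + 1):
        end = min(start + window, n)
        for pair, p in local_pair_probs(start, end).items():
            totals[pair] += p

    averaged: Dict[Pair, float] = {}
    for (i, j), total in totals.items():
        # Window starts s that contain both i and j satisfy j - window < s <= i.
        first = max(0, j - window + 1)
        last = min(i, last_start)
        averaged[(i, j)] = total / (last - first + 1)
    return averaged


def toy_probs(start: int, end: int) -> Dict[Pair, float]:
    # Toy model: pair (2, 5) has probability 0.5 in every window that contains it.
    return {(2, 5): 0.5} if start <= 2 and 5 < end else {}


def end_sensitive_probs(start: int, end: int) -> Dict[Pair, float]:
    # Toy model: pair (2, 5) only forms when the window starts exactly at 0.
    return {(2, 5): 1.0} if start == 0 and 5 < end else {}
```

With the second toy model the pair forms in only one of the three windows that cover it, so its averaged probability drops to one third. This is the kind of dependence on sequence ends that the averaging is meant to smooth out.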
Affiliation(s)
- Maria Waldl
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Irene K Beckmann
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Sarah Von Löhneysen
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Peter F Stadler
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany

6
Kofman C, Watkins AM, Kim D, Willi JA, Wooldredge A, Karim A, Das R, Jewett MC. Computationally-guided design and selection of high performing ribosomal active site mutants. Nucleic Acids Res 2022; 50:13143-13154. [PMID: 36484094 PMCID: PMC9825160 DOI: 10.1093/nar/gkac1036] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/21/2022] [Revised: 10/13/2022] [Accepted: 10/22/2022] [Indexed: 12/14/2022]
Abstract
Understanding how modifications to the ribosome affect function has implications for studying ribosome biogenesis, building minimal cells, and repurposing ribosomes for synthetic biology. However, efforts to design sequence-modified ribosomes have been limited because point mutations in the ribosomal RNA (rRNA), especially in the catalytic active site (peptidyl transferase center; PTC), are often functionally detrimental. Moreover, methods for directed evolution of rRNA are constrained by practical considerations (e.g. library size). Here, to address these limitations, we developed a computational rRNA design approach for screening guided libraries of mutant ribosomes. Our method includes in silico library design and selection using a Rosetta stepwise Monte Carlo method (SWM), library construction and in vitro testing of combined ribosomal assembly and translation activity, and functional characterization in vivo. As a model, we apply our method to making modified ribosomes with mutant PTCs. We engineer ribosomes with as many as 30 mutations in their PTCs, highlighting previously unidentified epistatic interactions, and show that SWM helps identify sequences with beneficial phenotypes as compared to random library sequences. We further demonstrate that some variants improve cell growth in vivo, relative to wild type ribosomes. We anticipate that SWM design and selection may serve as a powerful tool for rRNA engineering.
Affiliation(s)
- Camila Kofman
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
- Andrew M Watkins
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
  - Prescient Design, Genentech, South San Francisco, CA 94080, USA
- Do Soon Kim
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
  - Inceptive Nucleics, Inc., Palo Alto, CA 94304, USA
- Jessica A Willi
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
- Alexandra C Wooldredge
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
- Ashty S Karim
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
  - Department of Physics, Stanford University, Stanford, CA 94305, USA
- Michael C Jewett
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA
  - Robert H. Lurie Comprehensive Cancer Center and Simpson Querrey Institute, Northwestern University, Chicago, IL 60611, USA

7
Ross CJ, Ulitsky I. Discovering functional motifs in long noncoding RNAs. Wiley Interdisciplinary Reviews. RNA 2022; 13:e1708. [PMID: 34981665 DOI: 10.1002/wrna.1708] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/01/2021] [Revised: 11/19/2021] [Accepted: 12/04/2021] [Indexed: 12/27/2022]
Abstract
Long noncoding RNAs (lncRNAs) are products of pervasive transcription that closely resemble messenger RNAs on the molecular level, yet function through largely unknown modes of action. The current model is that the function of lncRNAs often relies on specific, typically short, conserved elements, connected by linkers in which specific sequences and/or structures are less important. This notion has fueled the development of both computational and experimental methods focused on the discovery of functional elements within lncRNA genes, based on diverse signals such as evolutionary conservation, predicted structural elements, or the ability to rescue loss-of-function phenotypes. In this review, we outline the main challenges that the different methods need to overcome, describe the recently developed approaches, and discuss their respective limitations. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Affiliation(s)
- Caroline Jane Ross
  - Biological Regulation and Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
- Igor Ulitsky
  - Biological Regulation and Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel

8
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Affiliation(s)
- Sharon Aviran
  - Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA.
- Danny Incarnato
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands.

9
Abstract
Recent events have pushed RNA research into the spotlight. Continued discoveries of RNA with unexpected diverse functions in healthy and diseased cells, such as the role of RNA as both the source and countermeasure to a severe acute respiratory syndrome coronavirus 2 infection, are igniting a new passion for understanding this functionally and structurally versatile molecule. Although RNA structure is key to function, many foundational characteristics of RNA structure are misunderstood, and the default state of RNA is often thought of and depicted as a single floppy strand. The purpose of this perspective is to help adjust mental models, equipping the community to better use the fundamental aspects of RNA structural information in new mechanistic models, enhance experimental design to test these models, and refine data interpretation. We discuss six core observations focused on the inherent nature of RNA structure and how to incorporate these characteristics to better understand RNA structure. We also offer some ideas for future efforts to make validated RNA structural information available and readily used by all researchers.
Affiliation(s)
- Quentin Vicens
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, School of Medicine, Aurora, CO 80045
  - RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, CO 80045
- Jeffrey S. Kieft
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, School of Medicine, Aurora, CO 80045
  - RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, CO 80045

10
Bose T, Fridkin G, Davidovich C, Krupkin M, Dinger N, Falkovich A, Peleg Y, Agmon I, Bashan A, Yonath A. Origin of life: protoribosome forms peptide bonds and links RNA and protein dominated worlds. Nucleic Acids Res 2022; 50:1815-1828. [PMID: 35137169 PMCID: PMC8886871 DOI: 10.1093/nar/gkac052] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 06/17/2021] [Revised: 12/13/2021] [Accepted: 01/25/2022] [Indexed: 12/15/2022]
Abstract
Although the mode of action of the ribosomes, the multi-component universal effective protein-synthesis organelles, has been thoroughly explored, their mere appearance remained elusive. Our earlier comparative structural studies suggested that a universal internal small RNA pocket-like segment called by us the protoribosome, which is still embedded in the contemporary ribosome, is a vestige of the primordial ribosome. Herein, after constructing such pockets, we show using the "fragment reaction" and its analyses by MALDI-TOF and LC-MS mass spectrometry techniques, that several protoribosome constructs are indeed capable of mediating peptide-bond formation. These findings present strong evidence supporting our hypothesis on origin of life and on ribosome's construction, thus suggesting that the protoribosome may be the missing link between the RNA dominated world and the contemporary nucleic acids/proteins life.
Affiliation(s)
- Tanaya Bose
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Gil Fridkin
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
  - Department of Organic Chemistry, Israel Institute for Biological Research, P.O. Box 19, Ness Ziona 7410001, Israel
- Chen Davidovich
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Miri Krupkin
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Nikita Dinger
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Alla H Falkovich
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Yoav Peleg
  - Department of Life Sciences Core Facilities (LSCF), Weizmann Institute of Science, Rehovot, Israel
- Ilana Agmon
  - Institute for Advanced Studies in Theoretical Chemistry, Schulich Faculty of Chemistry-Technion-Israel Institute of Technology, Haifa 3200003, Israel
  - Fritz Haber Research Center for Molecular Dynamics, Hebrew University, Jerusalem 9190401, Israel
- Anat Bashan
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel
- Ada Yonath
  - Department of Chemical and Structural Biology, Weizmann Institute of Science 7610001 Rehovot, Israel

11
Wang S, Chan KWK, Tan MJA, Flory C, Luo D, Lescar J, Forwood JK, Vasudevan SG. A conserved arginine in NS5 binds genomic 3' stem-loop RNA for primer-independent initiation of flavivirus RNA replication. RNA (New York, N.Y.) 2022; 28:177-193. [PMID: 34759006 PMCID: PMC8906541 DOI: 10.1261/rna.078949.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/14/2021] [Accepted: 10/15/2021] [Indexed: 06/13/2023]
Abstract
The commitment to replicate the RNA genome of flaviviruses without a primer involves RNA-protein interactions that have been shown to include the recognition of the stem-loop A (SLA) in the 5' untranslated region (UTR) by the nonstructural protein NS5. We show that DENV2 NS5 arginine 888, located within the carboxy-terminal 18 residues, is completely conserved in all flaviviruses and interacts specifically with the top-loop of 3'SL in the 3'UTR which contains the pentanucleotide 5'-CACAG-3' previously shown to be critical for flavivirus RNA replication. We present virological and biochemical data showing the importance of this Arg 888 in virus viability and de novo initiation of RNA polymerase activity in vitro. Based on our binding studies, we hypothesize that ternary complex formation of NS5 with 3'SL, followed by dimerization, leads to the formation of the de novo initiation complex that could be regulated by the reversible zipping and unzipping of cis-acting RNA elements.
Affiliation(s)
- Sai Wang
  - Program in Emerging Infectious Diseases, Duke-NUS Medical School, 169857 Singapore
- Kitti Wing Ki Chan
  - Program in Emerging Infectious Diseases, Duke-NUS Medical School, 169857 Singapore
- Min Jie Alvin Tan
  - Program in Emerging Infectious Diseases, Duke-NUS Medical School, 169857 Singapore
- Charlotte Flory
  - Program in Emerging Infectious Diseases, Duke-NUS Medical School, 169857 Singapore
- Dahai Luo
  - Lee Kong Chian School of Medicine, Nanyang Technological University, 636921 Singapore
- Julian Lescar
  - School of Biological Sciences, Nanyang Technological University, 637551 Singapore
- Jade K Forwood
  - School of Biomedical Sciences, Charles Sturt University, Wagga Wagga, New South Wales 2650, Australia
- Subhash G Vasudevan
  - Program in Emerging Infectious Diseases, Duke-NUS Medical School, 169857 Singapore
  - Department of Microbiology and Immunology, National University of Singapore, 117545 Singapore
  - Institute for Glycomics, Griffith University, Gold Coast Campus, QLD 4222, Australia

12
Bhandari BK, Lim CS, Remus DM, Chen A, van Dolleweerd C, Gardner PP. Analysis of 11,430 recombinant protein production experiments reveals that protein yield is tunable by synonymous codon changes of translation initiation sites. PLoS Comput Biol 2021; 17:e1009461. [PMID: 34610008 PMCID: PMC8519471 DOI: 10.1371/journal.pcbi.1009461] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/02/2021] [Revised: 10/15/2021] [Accepted: 09/19/2021] [Indexed: 12/16/2022]
Abstract
Recombinant protein production is a key process in generating proteins of interest in the pharmaceutical industry and biomedical research. However, about 50% of recombinant proteins fail to be expressed in a variety of host cells. Here we show that the accessibility of translation initiation sites modelled using the mRNA base-unpairing across the Boltzmann's ensemble significantly outperforms alternative features. This approach accurately predicts the successes or failures of expression experiments, which utilised Escherichia coli cells to express 11,430 recombinant proteins from over 189 diverse species. On this basis, we develop TIsigner that uses simulated annealing to modify up to the first nine codons of mRNAs with synonymous substitutions. We show that accessibility captures the key propensity beyond the target region (initiation sites in this case), as a modest number of synonymous changes is sufficient to tune the recombinant protein expression levels. We build a stochastic simulation model and show that higher accessibility leads to higher protein production and slower cell growth, supporting the idea of protein cost, where cell growth is constrained by protein circuits during overexpression.
Affiliation(s)
- Bikash K. Bhandari
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Chun Shen Lim
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Daniela M. Remus
  - Callaghan Innovation Protein Science and Engineering, University of Canterbury, Christchurch, New Zealand
- Augustine Chen
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Craig van Dolleweerd
  - Biomolecular Interaction Center, University of Canterbury, Christchurch, New Zealand
- Paul P. Gardner
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
  - Biomolecular Interaction Center, University of Canterbury, Christchurch, New Zealand

13
Improving RNA Branching Predictions: Advances and Limitations. Genes (Basel) 2021; 12:genes12040469. [PMID: 33805944 PMCID: PMC8064352 DOI: 10.3390/genes12040469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2021] [Revised: 03/15/2021] [Accepted: 03/18/2021] [Indexed: 11/16/2022] Open
Abstract
Minimum free energy prediction of RNA secondary structures is based on the Nearest Neighbor Thermodynamics Model. While such predictions are typically good, the accuracy can vary widely even for short sequences, and the branching thermodynamics are an important factor in this variance. Recently, the simplest model for multiloop energetics—a linear function of the number of branches and unpaired nucleotides—was found to be the best. Subsequently, a parametric analysis demonstrated that per family accuracy can be improved by changing the weightings in this linear function. However, the extent of improvement was not known due to the ad hoc method used to find the new parameters. Here we develop a branch-and-bound algorithm that finds the set of optimal parameters with the highest average accuracy for a given set of sequences. Our analysis shows that the previous ad hoc parameters are nearly optimal for tRNA and 5S rRNA sequences on both training and testing sets. Moreover, cross-family improvement is possible but more difficult because competing parameter regions favor different families. The results also indicate that restricting the unpaired nucleotide penalty to small values is warranted. This reduction makes analyzing longer sequences using the present techniques more feasible.
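The "linear function of the number of branches and unpaired nucleotides" mentioned in this abstract can be written as a three-parameter penalty. The sketch below is only an illustration of that functional form: the parameter names `a`, `b`, `c` and the numeric values used are hypothetical and are not the fitted parameters from the paper or from any energy table.

```python
def multiloop_penalty(branches, unpaired, a, b, c):
    """Linear multiloop initiation penalty.

    a: constant offset, b: weight per unpaired nucleotide,
    c: weight per branching helix. All values are illustrative assumptions.
    """
    return a + b * unpaired + c * branches


def preferred_loop(loops, a, b, c):
    """Return the (branches, unpaired) pair with the lowest penalty."""
    return min(loops, key=lambda loop: multiloop_penalty(loop[0], loop[1], a, b, c))


# Two hypothetical candidate multiloops: (branches, unpaired nucleotides).
candidates = [(3, 8), (5, 1)]
# With no unpaired-nucleotide penalty, the loop with fewer branches wins.
few_branches = preferred_loop(candidates, a=3.0, b=0.0, c=0.5)
# With a larger unpaired-nucleotide weight, the compact loop wins instead.
compact = preferred_loop(candidates, a=3.0, b=0.5, c=0.5)
```

Changing the weights flips which candidate is favoured, which is the sense in which re-weighting this linear function can alter the predicted branching.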
|
14
|
Rivas E. Evolutionary conservation of RNA sequence and structure. WILEY INTERDISCIPLINARY REVIEWS-RNA 2021; 12:e1649. [PMID: 33754485 PMCID: PMC8250186 DOI: 10.1002/wrna.1649] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/17/2020] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 12/22/2022]
Abstract
An RNA structure prediction from a single‐sequence RNA folding program is not evidence for an RNA whose structure is important for function. Random sequences have plausible and complex predicted structures not easily distinguishable from those of structural RNAs. How to tell when an RNA has a conserved structure is a question that requires looking at the evolutionary signature left by the conserved RNA. This question is important not just for long noncoding RNAs which usually lack an identified function, but also for RNA binding protein motifs which can be single stranded RNAs or structures. Here we review recent advances using sequence and structural analysis to determine when RNA structure is conserved or not. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test to measure false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, or misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments which increase power but destroy covariation. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.
Affiliation(s)
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
|
15
|
Li P, Zhou X, Xu K, Zhang QC. RASP: an atlas of transcriptome-wide RNA secondary structure probing data. Nucleic Acids Res 2021; 49:D183-D191. [PMID: 33068412 PMCID: PMC7779053 DOI: 10.1093/nar/gkaa880] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/14/2020] [Revised: 09/13/2020] [Accepted: 09/26/2020] [Indexed: 02/06/2023] Open
Abstract
RNA molecules fold into complex structures that are important across many biological processes. Recent technological developments have enabled transcriptome-wide probing of RNA secondary structure using nucleases and chemical modifiers. These approaches have been widely applied to capture RNA secondary structure in many studies, but gathering and presenting such data from very different technologies in a comprehensive and accessible way has been challenging. Existing RNA structure probing databases usually focus on low-throughput or very specific datasets. Here, we present a comprehensive RNA structure probing database called RASP (RNA Atlas of Structure Probing) by collecting 161 deduplicated transcriptome-wide RNA secondary structure probing datasets from 38 papers. RASP covers 18 species across animals, plants, bacteria, fungi, and also viruses, and categorizes 18 experimental methods including DMS-seq, SHAPE-Seq, SHAPE-MaP, and icSHAPE. In particular, RASP curates up-to-date datasets from several RNA secondary structure probing studies of the RNA genome of SARS-CoV-2, the RNA virus that caused the ongoing COVID-19 pandemic. RASP also provides a user-friendly interface to query, browse, and visualize RNA structure profiles, offering a shortcut to accessing RNA secondary structures grounded in experimental data. The database is freely available at http://rasp.zhanglab.net.
MESH Headings
- Animals
- COVID-19/epidemiology
- COVID-19/prevention & control
- COVID-19/virology
- Computational Biology/methods
- Computational Biology/statistics & numerical data
- Databases, Genetic/statistics & numerical data
- Genome, Viral/genetics
- High-Throughput Nucleotide Sequencing/methods
- High-Throughput Nucleotide Sequencing/statistics & numerical data
- Humans
- Nucleic Acid Conformation
- Pandemics
- RNA/chemistry
- RNA/genetics
- RNA Probes/genetics
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Fungal/chemistry
- RNA, Fungal/genetics
- RNA, Plant/chemistry
- RNA, Plant/genetics
- RNA, Viral/chemistry
- RNA, Viral/genetics
- SARS-CoV-2/genetics
- SARS-CoV-2/physiology
- Transcriptome
Affiliation(s)
- Pan Li
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Xiaolin Zhou
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Kui Xu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, 100084, China
|
16
|
Vila-Sanjurjo A, Smith PM, Elson JL. Heterologous Inferential Analysis (HIA) and Other Emerging Concepts: In Understanding Mitochondrial Variation In Pathogenesis: There is no More Low-Hanging Fruit. Methods Mol Biol 2021; 2277:203-245. [PMID: 34080154 DOI: 10.1007/978-1-0716-1270-5_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Here we summarize our latest efforts to elucidate the role of mtDNA variants affecting the mitochondrial translation machinery, namely variants mapping to the mt-rRNA and mt-tRNA genes. Evidence is accumulating to suggest that the cellular response to interference with mitochondrial translation is different from that occurring as a result of mutations in genes encoding OXPHOS proteins. As a result, it appears safe to state that a complete view of mitochondrial disease will not be obtained until we understand the effect of mt-rRNA and mt-tRNA variants on mitochondrial protein synthesis. Despite the identification of a large number of potentially pathogenic variants in the mitochondrially encoded rRNA (mt-rRNA) genes, we lack direct methods to firmly establish their pathogenicity. In the absence of such methods, we have devised an indirect approach named heterologous inferential analysis (HIA) that can be used to make predictions concerning the disruptive potential of a large subset of mt-rRNA variants. We have used HIA to explore the mutational landscape of 12S and 16S mt-rRNA genes. Our HIA studies include a thorough classification of all rare variants reported in the literature as well as others obtained from studies performed in collaboration with physicians. HIA has also been used with non-mammalian mt-rRNA genes to elucidate how mitotypes influence the interaction of the individual and the environment. Regarding mt-tRNA variations, rapidly growing evidence shows that the spectrum of mutations causing mitochondrial disease might differ between the different mitochondrial haplogroups seen in human populations.
Affiliation(s)
- Antón Vila-Sanjurjo
- Departamento de Bioloxía, Facultade de Ciencias, Centro de Investigacións en Ciencias Avanzadas (CICA), Universidade da Coruña, A Coruña, Spain
- Paul M Smith
- Department of Paediatrics, Royal Aberdeen Children's Hospital, Aberdeen, UK
- Joanna L Elson
- Biosciences Institute Newcastle, Newcastle University, Newcastle upon Tyne, UK
- Human Metabolomics, North-West University, Potchefstroom, South Africa
|
17
|
Huber RG, Marzinek JK, Boon PLS, Yue W, Bond PJ. Computational modelling of flavivirus dynamics: The ins and outs. Methods 2021; 185:28-38. [PMID: 32526282 PMCID: PMC7278654 DOI: 10.1016/j.ymeth.2020.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/18/2019] [Revised: 04/24/2020] [Accepted: 06/04/2020] [Indexed: 02/06/2023] Open
Abstract
Enveloped viruses such as the flaviviruses represent a significant burden to human health around the world, with hundreds of millions of people each year affected by dengue alone. In an effort to improve our understanding of the molecular basis for the infective mechanisms of these viruses, extensive computational modelling approaches have been applied to elucidate their conformational dynamics. Multiscale protocols have been developed to simulate flavivirus envelopes in close accordance with biophysical data, in particular derived from cryo-electron microscopy, enabling high-resolution refinement of their structures and elucidation of the conformational changes associated with adaptation both to host environments and to immunological factors such as antibodies. Likewise, integrative modelling efforts combining data from biophysical experiments and from genome sequencing with chemical modification are providing unparalleled insights into the architecture of the previously unresolved nucleocapsid complex. Collectively, this work provides the basis for the future rational design of new antiviral therapeutics and vaccine development strategies targeting enveloped viruses.
Affiliation(s)
- Roland G Huber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, Matrix #07-01, 138671, Singapore
- Jan K Marzinek
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, Matrix #07-01, 138671, Singapore
- Priscilla L S Boon
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, Matrix #07-01, 138671, Singapore; NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore (NUS), University Hall, Tan Chin Tuan Wing #04-02, 119077, Singapore; Department of Biological Sciences (DBS), National University of Singapore (NUS), 16 Science Drive 4, Building S3, Singapore
- Wan Yue
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome #02-01, 138672, Singapore
- Peter J Bond
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, Matrix #07-01, 138671, Singapore; Department of Biological Sciences (DBS), National University of Singapore (NUS), 16 Science Drive 4, Building S3, Singapore
|
18
|
RNA Secondary Structures with Limited Base Pair Span: Exact Backtracking and an Application. Genes (Basel) 2020; 12:genes12010014. [PMID: 33374382 PMCID: PMC7823788 DOI: 10.3390/genes12010014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/24/2020] [Revised: 12/18/2020] [Accepted: 12/21/2020] [Indexed: 11/24/2022] Open
Abstract
The accuracy of RNA secondary structure prediction decreases with the span of a base pair, i.e., the number of nucleotides that it encloses. The dynamic programming algorithms for RNA folding can be easily specialized to consider only base pairs with a limited span L, reducing the memory requirements to O(nL), and further to O(n) by interleaving backtracking. However, the latter is an approximation that precludes the retrieval of the globally optimal structure. The ViennaRNA package has therefore so far not provided a tool for computing the optimal, span-restricted minimum energy structure. Here, we report on an efficient backtracking algorithm that reconstructs the globally optimal structure from the locally optimal fragments that are produced by the interleaved backtracking implemented in RNALfold. An implementation is integrated into the ViennaRNA package. The forward and the backtracking recursions of RNALfold are both easily constrained to structural components with sufficiently negative z-scores. This provides a convenient method for identifying hyper-stable structural elements. A screen of the C. elegans genome shows that such features are more abundant in real genomic sequences when compared to a di-nucleotide shuffled background model.
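The effect of a span limit on the folding recursion can be illustrated with a deliberately simplified sketch. The code below is not RNALfold and does not use the nearest-neighbor energy model: it is a Nussinov-style recursion that maximises the number of base pairs, with the added rule that a pair (k, j) is allowed only if j - k does not exceed `max_span`. Defining span as j - k, the `min_loop` default, and the full O(n^2) table (rather than the O(nL) banded storage the abstract describes) are assumptions made to keep the example short.

```python
# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs_with_span(seq, max_span, min_loop=3):
    """Maximum number of nested base pairs when every pair (k, j) must satisfy
    j - k <= max_span and enclose at least min_loop unpaired positions."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for length in range(1, n):
        for i in range(n - length):
            j = i + length
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if j - k <= max_span and (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

For the hairpin-like sequence `GGGAAACCC`, an unrestricted run finds three nested pairs, while tightening `max_span` progressively removes the outer pairs, which is the trade-off the span restriction introduces.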
|
19
|
Bossanyi MA, Carpentier V, Glouzon JPS, Ouangraoua A, Anselmetti Y. aliFreeFoldMulti: alignment-free method to predict secondary structures of multiple RNA homologs. NAR Genom Bioinform 2020; 2:lqaa086. [PMID: 33575631 PMCID: PMC7671329 DOI: 10.1093/nargab/lqaa086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2020] [Accepted: 10/19/2020] [Indexed: 11/18/2022] Open
Abstract
Predicting RNA structure is crucial for understanding RNA’s mechanism of action. Comparative approaches for the prediction of RNA structures can be classified into four main strategies. The first three (align-and-fold, align-then-fold and fold-then-align) exploit multiple sequence alignments to improve the accuracy of conserved RNA-structure prediction. Align-and-fold methods generally perform better, but are also typically slower than the other alignment-based methods. The fourth strategy, alignment-free, consists of predicting the conserved RNA structure without relying on sequence alignment. This strategy has the advantage of being the fastest, while predicting accurate structures through the use of latent representations of the candidate structures for each sequence. This paper presents aliFreeFoldMulti, an extension of the aliFreeFold algorithm. This algorithm predicts a representative secondary structure of multiple RNA homologs by using a vector representation of their suboptimal structures. aliFreeFoldMulti improves on aliFreeFold by additionally computing the conserved structure for each sequence. aliFreeFoldMulti is assessed by comparing its prediction performance and time efficiency with a set of leading RNA-structure prediction methods. aliFreeFoldMulti has the lowest computing times and the highest maximum accuracy scores. Its average structure prediction accuracy is comparable to that of the other methods, except TurboFoldII, which is the best in terms of average accuracy but has the highest computing times. We present aliFreeFoldMulti as an illustration of the potential of alignment-free approaches to provide fast and accurate RNA-structure prediction methods.
Affiliation(s)
- Marc-André Bossanyi
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Valentin Carpentier
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Jean-Pierre S Glouzon
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Aïda Ouangraoua
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
- Yoann Anselmetti
- CoBIUS lab, Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, QC J1K 2R1, Canada
|
20
|
Kirkpatrick A, Patton K, Tetali P, Mitchell C. Markov Chain-Based Sampling for Exploring RNA Secondary Structure under the Nearest Neighbor Thermodynamic Model and Extended Applications. MATHEMATICAL AND COMPUTATIONAL APPLICATIONS 2020; 25. [PMID: 35924027 PMCID: PMC9344895 DOI: 10.3390/mca25040067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Ribonucleic acid (RNA) secondary structures and branching properties are important for determining functional ramifications in biology. While energy minimization of the Nearest Neighbor Thermodynamic Model (NNTM) is commonly used to identify such properties (number of hairpins, maximum ladder distance, etc.), it is difficult to know whether the resultant values fall within expected dispersion thresholds for a given energy function. The goal of this study was to construct a Markov chain capable of examining the dispersion of RNA secondary structures and branching properties obtained from NNTM energy function minimization independent of a specific nucleotide sequence. Plane trees are studied as a model for RNA secondary structure, with energy assigned to each tree based on the NNTM, and a corresponding Gibbs distribution is defined on the trees. Through a bijection between plane trees and 2-Motzkin paths, a Markov chain converging to the Gibbs distribution is constructed, and fast mixing time is established by estimating the spectral gap of the chain. The spectral gap estimate is obtained through a series of decompositions of the chain and also by building on known mixing time results for other chains on Dyck paths. The resulting algorithm can be used as a tool for exploring the branching structure of RNA, especially for long sequences, and to examine branching structure dependence on energy model parameters. Full exposition is provided for the mathematical techniques used with the expectation that these techniques will prove useful in bioinformatics, computational biology, and additional extended applications.
Affiliation(s)
- Anna Kirkpatrick
- School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Kalen Patton
- School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Prasad Tetali
- School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Cassie Mitchell
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Correspondence:
|
21
|
Gumna J, Zok T, Figurski K, Pachulska-Wieczorek K, Szachniuk M. RNAthor - fast, accurate normalization, visualization and statistical analysis of RNA probing data resolved by capillary electrophoresis. PLoS One 2020; 15:e0239287. [PMID: 33002005 PMCID: PMC7529196 DOI: 10.1371/journal.pone.0239287] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/07/2020] [Accepted: 09/03/2020] [Indexed: 12/18/2022] Open
Abstract
RNAs adopt specific structures to perform their functions, which are critical to fundamental cellular processes. For decades, these structures have been determined and modeled with strong support from computational methods. Still, the accuracy of the latter depends on the availability of experimental data, for example, chemical probing information that can define pseudo-energy constraints for RNA folding algorithms. At the same time, diverse computational tools have been developed to facilitate analysis and visualization of data from RNA structure probing experiments followed by capillary electrophoresis or next-generation sequencing. RNAthor, a new software tool for the fully automated normalization of SHAPE and DMS probing data resolved by capillary electrophoresis, has recently joined this collection. RNAthor automatically identifies unreliable probing data. It normalizes the reactivity information to a uniform scale and uses it in RNA secondary structure prediction. Our web server also provides tools for fast and easy RNA probing data visualization and statistical analysis that facilitate the comparison of multiple data sets. RNAthor is freely available at http://rnathor.cs.put.poznan.pl/.
Affiliation(s)
- Julita Gumna
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Kacper Figurski
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- * E-mail: (KPW); (MS)
|
22
|
The challenge of RNA branching prediction: a parametric analysis of multiloop initiation under thermodynamic optimization. J Struct Biol 2020; 210:107475. [PMID: 32032754 DOI: 10.1016/j.jsb.2020.107475] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/20/2019] [Revised: 09/25/2019] [Accepted: 01/30/2020] [Indexed: 12/29/2022]
Abstract
Prediction of RNA base pairings yields insight into molecular structure, and therefore function. The most common methods predict an optimal structure under the standard thermodynamic model. One component of this model is the equation which governs the cost of branching, where three or more helical "arms" radiate out from a multiloop (also known as a junction). The multiloop initiation equation has three parameters; changing those values can significantly alter the predicted structure. We give a complete analysis of the prediction accuracy, stability, and robustness for all possible parameter combinations for a diverse set of tRNA sequences, and also for 5S rRNA. We find that the accuracy can often be substantially improved on a per sequence basis. However, simultaneous improvement within families, and most especially between families, remains a challenge.
|
23
|
He Q, Huang FW, Barrett C, Reidys CM. Genetic robustness of let-7 miRNA sequence-structure pairs. RNA (NEW YORK, N.Y.) 2019; 25:1592-1603. [PMID: 31548338 PMCID: PMC6859847 DOI: 10.1261/rna.065763.118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/20/2018] [Accepted: 08/20/2019] [Indexed: 05/13/2023]
Abstract
Genetic robustness, the preservation of evolved phenotypes against genotypic mutations, is one of the central concepts in evolution. In recent years a large body of work has focused on the origins, mechanisms, and consequences of robustness in a wide range of biological systems. In particular, research on ncRNAs studied the ability of sequences to maintain folded structures against single-point mutations. In these studies, the structure is merely a reference. However, recent work revealed evidence that structure itself contributes to the genetic robustness of ncRNAs. We follow this line of thought and consider sequence-structure pairs as the unit of evolution and introduce the spectrum of extended mutational robustness (EMR spectrum) as a measurement of genetic robustness. Our analysis of the miRNA let-7 family captures key features of structure-modulated evolution and facilitates the study of robustness against multiple-point mutations.
Affiliation(s)
- Qijun He
- Biocomplexity Institute and Initiative
- Christian M Reidys
- Biocomplexity Institute and Initiative
- Department of Mathematics, University of Virginia, Charlottesville, Virginia 22904, USA
|
24
|
Mapping the RNA structural landscape of viral genomes. Methods 2019; 183:57-67. [PMID: 31711930 DOI: 10.1016/j.ymeth.2019.11.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 08/01/2019] [Revised: 10/13/2019] [Accepted: 11/05/2019] [Indexed: 12/26/2022] Open
Abstract
Functional RNA structures are prevalent in viral genomes, and have been shown to play roles in almost every aspect of their biology. However, the majority of viral RNA remains structurally uncharacterized. This is likely to remain true as the cost of sequencing decreases much faster than the cost of structural characterizations. Because of this, there is a need for rapid, inexpensive methods to highlight regions of viral RNA which are ideal candidates for structure-function analyses. The ScanFold method was developed as a single sequence alternative to traditional RNA structural motif discovery pipelines, which rely heavily on well curated sequence alignments to identify conserved RNA structures. ScanFold focuses on identifying (based on their more stable than expected folding energies) the most likely functional structures encoded within a single large RNA sequence, while allowing predicted motifs to be tested for evidence of structural conservation later. Decoupling these processes can be a benefit to researchers studying viruses lacking the ideal phylogenetic depth to yield evidence of structural conservation. Here, we demonstrate how the most significant ScanFold predicted structures correspond to higher base pairing probabilities, SHAPE reactivities, and predict known functional structures within the ZIKV and HIV-1 genomes with accuracy. Best practices and examples are also shown to aid users in utilizing ScanFold for their own systems of interest. ScanFold is available as a Webserver (https://mosslabtools.bb.iastate.edu/scanfold) or can be downloaded (https://github.com/moss-lab/ScanFold) and run locally.
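The notion of "more stable than expected folding energies" in this abstract is commonly quantified as a z-score of the native folding energy against energies of shuffled versions of the same sequence. The sketch below shows only that arithmetic and is not ScanFold's implementation: the function name is hypothetical, and the energies are assumed to have been computed beforehand by a folding program of the reader's choice.

```python
from statistics import mean, stdev


def stability_z_score(native_energy, shuffled_energies):
    """z-score of a native folding energy against a shuffled background.

    Energies are assumed to be in kcal/mol and precomputed elsewhere.
    More negative values mean the native sequence folds more stably than
    its shuffled counterparts.
    """
    if len(shuffled_energies) < 2:
        raise ValueError("need at least two shuffled energies")
    spread = stdev(shuffled_energies)  # sample standard deviation
    if spread == 0:
        raise ValueError("shuffled energies have zero variance")
    return (native_energy - mean(shuffled_energies)) / spread


# Illustrative numbers only, not data from the paper.
z = stability_z_score(-10.0, [-4.0, -6.0, -5.0, -5.0])
```

In practice the background would contain many more shuffled sequences than the four used here, and the choice of shuffling model (mono- or di-nucleotide) affects the resulting score.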
|
25
|
Glouzon JPS, Ouangraoua A. aliFreeFold: an alignment-free approach to predict secondary structure from homologous RNA sequences. Bioinformatics 2019; 34:i70-i78. [PMID: 29949960 PMCID: PMC6022685 DOI: 10.1093/bioinformatics/bty234] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022] Open
Abstract
Motivation: Predicting the conserved secondary structure of homologous ribonucleic acid (RNA) sequences is crucial for understanding RNA functions. However, fast and accurate RNA structure prediction is challenging, especially when the number and the divergence of homologous RNAs increase. To address this challenge, we propose aliFreeFold, based on a novel alignment-free approach which computes a representative structure from a set of homologous RNA sequences using sub-optimal secondary structures generated for each sequence. It is based on a vector representation of sub-optimal structures capturing structure conservation signals by weighting structural motifs according to their conservation across the sub-optimal structures. Results: We demonstrate that aliFreeFold provides a good balance between speed and accuracy regarding predictions of representative structures for sets of homologous RNA compared to traditional methods based on sequence and structure alignment. We show that aliFreeFold is capable of uncovering conserved structural features quickly and effectively thanks to its weighting scheme that gives more (resp. less) importance to common (resp. uncommon) structural motifs. The weighting scheme is also shown to be capable of capturing the conservation signal as the number of homologous RNAs increases. These results demonstrate the ability of aliFreeFold to efficiently and accurately provide interesting structural representatives of RNA families. Availability and implementation: aliFreeFold was implemented in C++. Source code and Linux binary are freely available at https://github.com/UdeS-CoBIUS/aliFreeFold. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Aïda Ouangraoua
- Department of Computer Science, University of Sherbrooke, Sherbrooke, QC, Canada
|
26
|
Giannetti CA, Busan S, Weidmann CA, Weeks KM. SHAPE Probing Reveals Human rRNAs Are Largely Unfolded in Solution. Biochemistry 2019; 58:3377-3385. [PMID: 31305988 DOI: 10.1021/acs.biochem.9b00076] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/24/2023]
Abstract
Chemical probing experiments, coupled with empirically determined free energy change relationships, can enable accurate modeling of the secondary structures of diverse and complex RNAs. A current frontier lies in modeling large and structurally heterogeneous transcripts, including complex eukaryotic RNAs. To validate and improve on experimentally driven approaches for modeling large transcripts, we obtained high-quality SHAPE data for the protein-free human 18S and 28S ribosomal RNAs (rRNAs). To our surprise, SHAPE-directed structure models for the human rRNAs poorly matched accepted structures. Analysis of predicted rRNA structures based on low-SHAPE and low-entropy (lowSS) metrics revealed that, whereas ∼75% of Escherichia coli rRNA sequences form well-determined lowSS secondary structure, only ∼40% of the human rRNAs do. Critically, regions of the human rRNAs that specifically fold into well-determined lowSS structures were modeled to high accuracy using SHAPE data. This work reveals that eukaryotic rRNAs are more unfolded than prokaryotic rRNAs and indeed are largely unfolded overall, likely reflecting increased protein dependence for eukaryotic ribosome structure. In addition, those regions and substructures that are well-determined can be identified de novo and successfully modeled by SHAPE-directed folding.
Affiliation(s)
- Catherine A Giannetti
- Department of Chemistry , The University of North Carolina , Chapel Hill , North Carolina 27599-3290 , United States
| | - Steven Busan
- Department of Chemistry , The University of North Carolina , Chapel Hill , North Carolina 27599-3290 , United States
| | - Chase A Weidmann
- Department of Chemistry , The University of North Carolina , Chapel Hill , North Carolina 27599-3290 , United States
| | - Kevin M Weeks
- Department of Chemistry , The University of North Carolina , Chapel Hill , North Carolina 27599-3290 , United States
| |
|
27
|
Su C, Weir JD, Zhang F, Yan H, Wu T. ENTRNA: a framework to predict RNA foldability. BMC Bioinformatics 2019; 20:373. [PMID: 31269893 PMCID: PMC6610807 DOI: 10.1186/s12859-019-2948-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/23/2018] [Accepted: 06/12/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role. RESULTS In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. In addition to free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database, respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNAs, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 of these RNAs. CONCLUSION In this article, we reformulate the RNA design problem as a foldability prediction problem, which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research.
Affiliation(s)
- Congzhe Su
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| | - Jeffery D. Weir
- Department of Operational Sciences, Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson AFB, Dayton, OH 45433 USA
| | - Fei Zhang
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Hao Yan
- Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
| | - Teresa Wu
- School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
| |
|
28
|
Andrews RJ, Moss WN. Computational approaches for the discovery of splicing regulatory RNA structures. Biochim Biophys Acta Gene Regul Mech 2019; 1862:194380. [PMID: 31048028 DOI: 10.1016/j.bbagrm.2019.04.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/26/2019] [Revised: 04/15/2019] [Accepted: 04/16/2019] [Indexed: 12/14/2022]
Abstract
Global RNA structure and local functional motifs mediate interactions important in determining the rates and patterns of mRNA splicing. In this review, we overview approaches for the computational prediction of RNA secondary structure with a special emphasis on the discovery of motifs important to RNA splicing. The process of identifying and modeling potential splicing regulatory structures is illustrated using a recently-developed approach for RNA structural motif discovery, the ScanFold pipeline, which is applied to the identification of a known splicing regulatory structure in influenza virus.
Affiliation(s)
- Ryan J Andrews
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, 2437 Pammel Drive, Ames, IA 50011, USA
| | - Walter N Moss
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, 2437 Pammel Drive, Ames, IA 50011, USA.
| |
|
29
|
Tanzer A, Hofacker IL, Lorenz R. RNA modifications in structure prediction - Status quo and future challenges. Methods 2018; 156:32-39. [PMID: 30385321 DOI: 10.1016/j.ymeth.2018.10.019] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 08/12/2018] [Revised: 10/12/2018] [Accepted: 10/26/2018] [Indexed: 01/01/2023] Open
Abstract
Chemical modifications of RNA nucleotides change their identity and characteristics and thus alter genetic and structural information encoded in the genomic DNA. tRNA and rRNA are probably the most heavily modified genes, and often depend on derivatization or isomerization of their nucleobases in order to correctly fold into their functional structures. Recent RNomics studies, however, report transcriptome-wide RNA modification and suggest a more general regulation of structuredness of RNAs by this so-called epitranscriptome. Modification seems to require specific substrate structures, which in turn are stabilized or destabilized and thus promote or inhibit refolding events of regulatory RNA structures. In this review, we revisit RNA modifications and the related structures from a computational point of view. We discuss known substrate structures, their properties such as sub-motifs as well as consequences of modifications on base pairing patterns and possible refolding events. Given that efficient RNA structure prediction methods for canonical base pairs were established several decades ago, we review to what extent these methods allow the inclusion of modified nucleotides to model and study epitranscriptomic effects on RNA structures.
Affiliation(s)
- Andrea Tanzer
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria
| | - Ivo L Hofacker
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria; Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Waehringerstrasse 29, 1090 Vienna, Austria
| | - Ronny Lorenz
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Waehringerstrasse 17, 1090 Vienna, Austria
| |
|
30
|
Ben-Bassat I, Chor B, Orenstein Y. A deep neural network approach for learning intrinsic protein-RNA binding preferences. Bioinformatics 2018; 34:i638-i646. [DOI: 10.1093/bioinformatics/bty600] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Indexed: 11/13/2022] Open
Affiliation(s)
- Ilan Ben-Bassat
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
| | - Benny Chor
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
| | - Yaron Orenstein
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| |
|
31
|
Zhu Y, Xie Z, Li Y, Zhu M, Chen YPP. Research on folding diversity in statistical learning methods for RNA secondary structure prediction. Int J Biol Sci 2018; 14:872-882. [PMID: 29989089 PMCID: PMC6036747 DOI: 10.7150/ijbs.24595] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/27/2017] [Accepted: 02/21/2018] [Indexed: 12/24/2022] Open
Abstract
How to improve the prediction accuracy of RNA secondary structure is currently a hot topic. The existing prediction methods for a single sequence do not fully consider the folding diversity which may occur among RNAs with different functions or sources. This paper explores the relationship between folding diversity and prediction accuracy, and puts forward a new method to improve the prediction accuracy of RNA secondary structure. Our research investigates the following: 1. The folding feature based on stochastic context-free grammar is proposed. By using dimension reduction and clustering techniques, some public data sets are analyzed. The results show that there is significant folding diversity among different RNA families. 2. To assign folding rules to RNAs without structural information, a classification method based on production probability is proposed. The experimental results show that the classification method proposed in this paper can effectively classify the RNAs of unknown structure. 3. Based on the existing prediction methods of statistical learning models, an RNA secondary structure prediction framework is proposed, namely "Cluster - Training - Parameter Selection - Prediction". The results show that, with information on folding diversity, prediction accuracy can be significantly improved.
Affiliation(s)
- Yu Zhu
- College of Computer Science, Sichuan University, China
| | - ZhaoYang Xie
- College of Computer Science, Sichuan University, China
| | - YiZhou Li
- College of Chemistry, Sichuan University, China
| | - Min Zhu
- Vice Dean of College of Computer Science, Sichuan University
| | - Yi-Ping Phoebe Chen
- Department of Computer Science and Information Technology, La Trobe University, Australia
| |
|
32
|
Lackey L, Coria A, Woods C, McArthur E, Laederach A. Allele-specific SHAPE-MaP assessment of the effects of somatic variation and protein binding on mRNA structure. RNA 2018; 24:513-528. [PMID: 29317542 PMCID: PMC5855952 DOI: 10.1261/rna.064469.117] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/06/2017] [Accepted: 01/04/2018] [Indexed: 05/22/2023]
Abstract
The impact of inherited and somatic mutations on messenger RNA (mRNA) structure remains poorly understood. Recent technological advances that leverage next-generation sequencing to obtain experimental structure data, such as SHAPE-MaP, can reveal structural effects of mutations, especially when these data are incorporated into structure modeling. Here, we analyze the ability of SHAPE-MaP to detect the relatively subtle structural changes caused by single-nucleotide mutations. We find that allele-specific sorting greatly improved our detection ability. Thus, we used SHAPE-MaP with a novel combination of clone-free robotic mutagenesis and allele-specific sorting to perform a rapid, comprehensive survey of noncoding somatic and inherited riboSNitches in two cancer-associated mRNAs, TPT1 and LCP1. Using rigorous thermodynamic modeling of the Boltzmann suboptimal ensemble, we identified a subset of mutations that change TPT1 and LCP1 RNA structure, with approximately 14% of all variants identified as riboSNitches. To confirm that these in vitro structures were biologically relevant, we tested how dependent TPT1 and LCP1 mRNA structures were on their environments. We performed SHAPE-MaP on TPT1 and LCP1 mRNAs in the presence or absence of cellular proteins and found that both mRNAs have similar overall folds in all conditions. RiboSNitches identified within these mRNAs in vitro likely exist under biological conditions. Overall, these data reveal a robust mRNA structural landscape where differences in environmental conditions and most sequence variants do not significantly alter RNA structural ensembles. Finally, predicting riboSNitches in mRNAs from sequence alone remains particularly challenging; these data will provide the community with benchmarks for further algorithmic development.
Affiliation(s)
- Lela Lackey
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
| | - Aaztli Coria
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
| | - Chanin Woods
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
| | - Evonne McArthur
- School of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
| | - Alain Laederach
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
| |
|
33
|
Moss WN. RNA2DMut: a web tool for the design and analysis of RNA structure mutations. RNA 2018; 24:273-286. [PMID: 29183923 PMCID: PMC5824348 DOI: 10.1261/rna.063933.117] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/07/2017] [Accepted: 11/25/2017] [Indexed: 06/07/2023]
Abstract
With the widespread application of high-throughput sequencing, novel RNA sequences are being discovered at an astonishing rate. The analysis of function, however, lags behind. In both the cis- and trans-regulatory functions of RNA, secondary structure (2D base-pairing) plays essential regulatory roles. In order to test RNA function, it is essential to be able to design and analyze mutations that can affect structure. This was the motivation for the creation of the RNA2DMut web tool. With RNA2DMut, users can enter in RNA sequences to analyze, constrain mutations to specific residues, or limit changes to purines/pyrimidines. The sequence is analyzed at each base to determine the effect of every possible point mutation on 2D structure. The metrics used in RNA2DMut rely on the calculation of the Boltzmann structure ensemble and do not require a robust 2D model of RNA structure for designing mutations. This tool can facilitate a wide array of uses involving RNA: for example, in designing and evaluating mutants for biological assays, interrogating RNA-protein interactions, identifying key regions to alter in SELEX experiments, and improving RNA folding and crystallization properties for structural biology. Additional tools are available to help users introduce other mutations (e.g., indels and substitutions) and evaluate their effects on RNA structure. Example calculations are shown for five RNAs that require 2D structure for their function: the MALAT1 mascRNA, an influenza virus splicing regulatory motif, the EBER2 viral noncoding RNA, the Xist lncRNA repA region, and human Y RNA 5. RNA2DMut can be accessed at https://rna2dmut.bb.iastate.edu/.
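The per-base scan described in this abstract starts from an enumeration of every possible point mutation. A minimal sketch of that enumeration step follows; it is not the RNA2DMut implementation. The function name and parameters are hypothetical, and reading "limit changes to purines/pyrimidines" as "allow only purine-to-purine and pyrimidine-to-pyrimidine substitutions" is an assumption.

```python
PURINES = {"A", "G"}

def point_mutants(seq, positions=None, same_class_only=False):
    """Yield (index, ref, alt, mutant_sequence) for each single-base substitution.

    positions: optional iterable of 0-based indices to which mutations are restricted.
    same_class_only: assumed interpretation of the purine/pyrimidine option;
        keeps only substitutions that preserve the base class.
    """
    seq = seq.upper()
    if positions is None:
        positions = range(len(seq))
    for i in positions:
        ref = seq[i]
        for alt in "ACGU":
            if alt == ref:
                continue
            if same_class_only and (ref in PURINES) != (alt in PURINES):
                continue
            yield i, ref, alt, seq[:i] + alt + seq[i + 1:]

# Toy sequence (made up): 4 bases x 3 alternatives = 12 unconstrained mutants.
all_mutants = list(point_mutants("GGAC"))
class_preserving = list(point_mutants("GGAC", same_class_only=True))
restricted = list(point_mutants("GGAC", positions=[3]))
```

Scoring each mutant's effect on the Boltzmann structure ensemble would require a folding engine and is deliberately left out of this sketch.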
Affiliation(s)
- Walter N Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, Iowa 50011, USA
| |
|
34
|
A method to improve prediction of secondary structure for large single RNA sequences. Biochem Biophys Res Commun 2018; 496:523-528. [PMID: 29339162 DOI: 10.1016/j.bbrc.2018.01.086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2018] [Accepted: 01/11/2018] [Indexed: 11/20/2022]
Abstract
The function of a particular RNA molecule within an organic system is principally determined by its structure. The physical methods currently available for structure determination are time-consuming and expensive. Hence, computational methods for structure prediction are often used. Predicting the structure of a large single RNA sequence still requires considerable research. In the present work, a method is introduced to improve the secondary structure that the Mfold program predicts for a large single RNA sequence using the concept of minimum free energy. The Mfold program contains a constraint option that allows some helices to be forced in the predicted structure. In our method, some of the hairpins that form first and that a statistical study expects to be present in the real structure are forced in the Mfold-predicted structure. The results show that the Mfold-predicted structure moves closer to the real structure, which provides evidence for kinetic RNA folding.
|
35
|
Zuber J, Sun H, Zhang X, McFadyen I, Mathews DH. A sensitivity analysis of RNA folding nearest neighbor parameters identifies a subset of free energy parameters with the greatest impact on RNA secondary structure prediction. Nucleic Acids Res 2017; 45:6168-6176. [PMID: 28334976 PMCID: PMC5449625 DOI: 10.1093/nar/gkx170] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/16/2016] [Accepted: 03/10/2017] [Indexed: 01/02/2023] Open
Abstract
Nearest neighbor parameters for estimating the folding energy changes of RNA secondary structures are used in structure prediction and analysis. Despite their widespread application, a comprehensive analysis of the impact of each parameter on the precision of calculations had not been conducted. To identify the parameters with greatest impact, a sensitivity analysis was performed on the 291 parameters that compose the 2004 version of the free energy nearest neighbor rules. Perturbed parameter sets were generated by perturbing each parameter independently. Then the effect of each individual parameter change on predicted base-pair probabilities and secondary structures as compared to the standard parameter set was observed for a set of sequences including structured ncRNA, mRNA and randomized sequences. The results identify for the first time the parameters with the greatest impact on secondary structure prediction, and the subset which should be prioritized for further study in order to improve the precision of structure prediction. In particular, bulge loop initiation, multibranch loop initiation, AU/GU internal loop closure and AU/GU helix end parameters were particularly important. An analysis of parameter usage during folding free energy calculations of stochastic samples of secondary structures revealed a correlation between parameter usage and impact on structure prediction precision.
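The procedure of perturbing each parameter independently and comparing against the standard parameter set is a one-at-a-time sensitivity analysis. The sketch below shows the pattern on a toy linear energy model; the parameter names, values, candidate structures, and perturbation size are all made up for illustration and are not the 2004 nearest neighbor rules or the authors' protocol.

```python
def predict(params, candidates):
    """Return the candidate with the lowest energy under a toy linear model,
    where energy = sum(parameter value * number of times the feature occurs)."""
    def energy(features):
        return sum(params[name] * count for name, count in features.items())
    return min(candidates, key=lambda c: energy(candidates[c]))

def one_at_a_time_sensitivity(params, candidates, delta):
    """Perturb each parameter by +delta and -delta, one at a time,
    and record the perturbations that change the predicted structure."""
    baseline = predict(params, candidates)
    changed = {}
    for name in params:
        for sign in (+1, -1):
            perturbed = dict(params)  # copy so perturbations stay independent
            perturbed[name] += sign * delta
            if predict(perturbed, candidates) != baseline:
                changed.setdefault(name, []).append(sign * delta)
    return baseline, changed

# Hypothetical parameters (kcal/mol) and feature counts per candidate structure.
params = {"stack": -2.0, "bulge_init": 3.0, "multibranch_init": 4.0}
candidates = {
    "A": {"stack": 4, "bulge_init": 1},
    "B": {"stack": 5, "multibranch_init": 1},
    "C": {"stack": 2},
}
baseline, changed = one_at_a_time_sensitivity(params, candidates, delta=1.5)
```

Parameters whose perturbation flips the prediction are the ones a study of this kind would flag as high impact; a realistic analysis would compare base-pair probabilities as well as the single predicted structure.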
Affiliation(s)
- Jeffrey Zuber
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Hongying Sun
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Xiaoju Zhang
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Iain McFadyen
- Computational Sciences, Moderna Therapeutics, Cambridge, MA 02141, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA.,Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
36
|
Rogers E, Murrugarra D, Heitsch C. Conditioning and Robustness of RNA Boltzmann Sampling under Thermodynamic Parameter Perturbations. Biophys J 2017. [PMID: 28629618 DOI: 10.1016/j.bpj.2017.05.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023] Open
Abstract
Understanding how RNA secondary structure prediction methods depend on the underlying nearest-neighbor thermodynamic model remains a fundamental challenge in the field. Minimum free energy (MFE) predictions are known to be "ill conditioned" in that small changes to the thermodynamic model can result in significantly different optimal structures. Hence, the best practice is now to sample from the Boltzmann distribution, which generates a set of suboptimal structures. Although the structural signal of this Boltzmann sample is known to be robust to stochastic noise, the conditioning and robustness under thermodynamic perturbations have yet to be addressed. We present here a mathematically rigorous model for conditioning inspired by numerical analysis, and also a biologically inspired definition for robustness under thermodynamic perturbation. We demonstrate the strong correlation between conditioning and robustness and use its tight relationship to define quantitative thresholds for well versus ill conditioning. These resulting thresholds demonstrate that the majority of the sequences are at least sample robust, which verifies the assumption of sampling's improved conditioning over the MFE prediction. Furthermore, because we find no correlation between conditioning and MFE accuracy, the presence of both well- and ill-conditioned sequences indicates the continued need for both thermodynamic model refinements and alternate RNA structure prediction methods beyond the physics-based ones.
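Sampling from the Boltzmann distribution means drawing each structure with probability proportional to exp(-ΔG/RT). The sketch below illustrates this on an explicitly enumerated toy ensemble; the structures and free energies are made up, and practical samplers work from a partition-function recursion rather than a full enumeration.

```python
import math
import random

# Gas constant (kcal/(mol*K)) times temperature (37 degrees C in kelvin).
RT = 0.0019872 * 310.15

def boltzmann_probabilities(energies, rt=RT):
    """Convert free energies (kcal/mol) into Boltzmann probabilities."""
    e_min = min(energies)  # shift by the minimum so the exponentials cannot overflow
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function of the shifted ensemble
    return [w / z for w in weights]

def sample_structures(structures, energies, n, seed=0):
    """Draw n structures with probability proportional to their Boltzmann weight."""
    rng = random.Random(seed)
    probs = boltzmann_probabilities(energies)
    return rng.choices(structures, weights=probs, k=n)

# Hypothetical three-structure ensemble in dot-bracket notation.
structures = ["((...))", "(.....)", "......."]
energies = [-2.0, -1.2, 0.0]

probs = boltzmann_probabilities(energies)
sample = sample_structures(structures, energies, n=1000)
mfe_structure = structures[energies.index(min(energies))]
```

The minimum free energy structure is only the single most probable member of the ensemble, which is why a small change to an energy value can swap the MFE prediction while leaving the sampled distribution largely intact.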
Affiliation(s)
- Emily Rogers
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia
| | - David Murrugarra
- Department of Mathematics, University of Kentucky, Lexington, Kentucky
| | - Christine Heitsch
- School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia.
| |
|
37
|
Bell DR, Cheng SY, Salazar H, Ren P. Capturing RNA Folding Free Energy with Coarse-Grained Molecular Dynamics Simulations. Sci Rep 2017; 7:45812. [PMID: 28393861 PMCID: PMC5385882 DOI: 10.1038/srep45812] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 08/15/2016] [Accepted: 03/06/2017] [Indexed: 01/25/2023] Open
Abstract
We introduce a coarse-grained RNA model for molecular dynamics simulations, RACER (RnA CoarsE-gRained). RACER achieves accurate native structure prediction for a number of RNAs (average RMSD of 2.93 Å) and the sequence-specific variation of free energy is in excellent agreement with experimentally measured stabilities (R² = 0.93). Using RACER, we identified hydrogen-bonding (or base pairing), base stacking, and electrostatic interactions as essential driving forces for RNA folding. Also, we found that separating pairing vs. stacking interactions allowed RACER to distinguish folded vs. unfolded states. In RACER, base pairing and stacking interactions each provide an approximate stability of 3-4 kcal/mol for an A-form helix. RACER was developed based on PDB structural statistics and experimental thermodynamic data. In contrast with previous work, RACER implements a novel effective vdW potential energy function, which led us to re-parameterize hydrogen bond and electrostatic potential energy functions. Further, RACER is validated and optimized using a simulated annealing protocol to generate potential energy vs. RMSD landscapes. Finally, RACER is tested using extensive equilibrium pulling simulations (0.86 ms total) on eleven RNA sequences (hairpins and duplexes).
Affiliation(s)
- David R. Bell
- Department of Biomedical Engineering, University of Texas at Austin, Austin, Texas 78712, United States
| | - Sara Y. Cheng
- Department of Physics, University of Texas at Austin, Austin, Texas 78712, United States
| | - Heber Salazar
- Department of Biomedical Engineering, University of Texas at Austin, Austin, Texas 78712, United States
| | - Pengyu Ren
- Department of Biomedical Engineering, University of Texas at Austin, Austin, Texas 78712, United States
| |
|
38
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. Quant Biol 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
| | | | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
| |
|
39
|
Spiraling Complexity: A Test of the Snowball Effect in a Computational Model of RNA Folding. Genetics 2016; 206:377-388. [PMID: 28007889 DOI: 10.1534/genetics.116.196030] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/19/2016] [Accepted: 03/03/2017] [Indexed: 01/07/2023] Open
Abstract
Genetic incompatibilities can emerge as a byproduct of genetic divergence. According to Dobzhansky and Muller, an allele that fixes in one population may be incompatible with an allele at a different locus in another population when the two alleles are brought together in hybrids. Orr showed that the number of Dobzhansky-Muller incompatibilities (DMIs) should accumulate faster than linearly (i.e., snowball) as two lineages diverge. Several studies have attempted to test the snowball effect using data from natural populations. One limitation of these studies is that they have focused on predictions of the Orr model, but not on its underlying assumptions. Here, we use a computational model of RNA folding to test both predictions and assumptions of the Orr model. Two populations are allowed to evolve in allopatry on a holey fitness landscape. We find that the number of inviable introgressions (an indicator for the number of DMIs) snowballs, but does so more slowly than expected. We show that this pattern is explained, in part, by the fact that DMIs can disappear after they have arisen, contrary to the assumptions of the Orr model. This occurs because DMIs become progressively more complex (i.e., involve alleles at more loci) as a result of later substitutions. We also find that most DMIs involve >2 loci, i.e., they are complex. Reproductive isolation does not snowball because DMIs do not act independently of each other. We conclude that the RNA model supports the central prediction of the Orr model that the number of DMIs snowballs, but challenges other predictions, as well as some of its underlying assumptions.
|
40
|
Abstract
Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.
41
Kawaguchi R, Kiryu H. Parallel computation of genome-scale RNA secondary structure to detect structural constraints on human genome. BMC Bioinformatics 2016; 17:203. [PMID: 27153986 PMCID: PMC4858847 DOI: 10.1186/s12859-016-1067-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 12/04/2015] [Accepted: 04/29/2016] [Indexed: 02/08/2023]
Abstract
Background: RNA secondary structure around splice sites is known to assist normal splicing by promoting spliceosome recognition. However, analyzing the structural properties of entire intronic regions or pre-mRNA sequences has been difficult hitherto, owing to serious experimental and computational limitations, such as low read coverage and numerical problems. Results: Our novel software, “ParasoR”, is designed to run on a computer cluster and enables the exact computation of various structural features of long RNA sequences under the constraint of maximal base-pairing distance. ParasoR divides dynamic programming (DP) matrices into smaller pieces, such that each piece can be computed by a separate computer node without losing the connectivity information between the pieces. ParasoR directly computes the ratios of DP variables to avoid the reduction of numerical precision caused by the cancellation of a large number of Boltzmann factors. The structural preferences of mRNAs computed by ParasoR show a high concordance with those determined by high-throughput sequencing analyses. Using ParasoR, we investigated the global structural preferences of transcribed regions in the human genome. A genome-wide folding simulation indicated that transcribed regions are significantly more structural than intergenic regions after removing repeat sequences and k-mer frequency bias. In particular, we observed a highly significant preference for base pairing over entire intronic regions as compared to their antisense sequences, as well as to intergenic regions. A comparison between pre-mRNAs and mRNAs showed that coding regions become more accessible after splicing, indicating constraints for translational efficiency. Such changes are correlated with gene expression levels, as well as GC content, and are enriched among genes associated with cytoskeleton and kinase functions.
Conclusions: We have shown that ParasoR is very useful for analyzing the structural properties of long RNA sequences such as mRNAs, pre-mRNAs, and long non-coding RNAs whose lengths can be more than a million bases in the human genome. In our analyses, transcribed regions including introns are indicated to be subject to various types of structural constraints that cannot be explained from simple sequence composition biases. ParasoR is freely available at https://github.com/carushi/ParasoR. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1067-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Risa Kawaguchi
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan.
- Hisanori Kiryu
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan.
42
Lorenz R, Hofacker IL, Stadler PF. RNA folding with hard and soft constraints. Algorithms Mol Biol 2016; 11:8. [PMID: 27110276 PMCID: PMC4842303 DOI: 10.1186/s13015-016-0070-z] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 01/26/2016] [Accepted: 04/01/2016] [Indexed: 12/21/2022]
Abstract
Background: A large class of RNA secondary structure prediction programs uses an elaborate energy model grounded in extensive thermodynamic measurements and exact dynamic programming algorithms. External experimental evidence can in principle be incorporated by means of hard constraints that restrict the search space or by means of soft constraints that distort the energy model. In particular, recent advances in coupling chemical and enzymatic probing with sequencing techniques, as well as comparative approaches, provide an increasing amount of experimental data to be combined with secondary structure prediction. Results: Responding to the increasing need for a versatile and user-friendly inclusion of external evidence into diverse flavors of RNA secondary structure prediction tools, we implemented a generic layer of constraint handling into the ViennaRNA Package. It makes explicit use of the conceptual separation of the “folding grammar” defining the search space and the actual energy evaluation, which allows constraints to be interleaved in a natural way between recursion steps and evaluation of the standard energy function. Conclusions: The extension of the ViennaRNA Package provides a generic way to include diverse types of constraints into RNA folding algorithms. The computational overhead incurred is negligible in practice. A wide variety of application scenarios can be accommodated by the new framework, including the incorporation of structure probing data, non-standard base pairs and chemical modifications, as well as structure-dependent ligand binding. Electronic supplementary material: The online version of this article (doi:10.1186/s13015-016-0070-z) contains supplementary material, which is available to authorized users.
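The distinction between hard and soft constraints can be illustrated with a toy folding recursion. The sketch below is not the ViennaRNA implementation or its API: it swaps the thermodynamic energy model for a simple Nussinov-style score (+1 per canonical or GU pair), and the function name, parameters and bonus values are hypothetical. A hard constraint removes candidate pairs from the search space, whereas a soft constraint only shifts the score of structures that pair a given position.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fold_with_constraints(seq, unpaired=frozenset(), bonus=None, min_loop=3):
    """Best toy score for seq under hard and soft constraints.

    unpaired: positions that must stay single-stranded (hard constraint).
    bonus:    position -> score added whenever that position is paired
              (soft constraint; negative values act as penalties).
    """
    bonus = bonus or {}
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) not in CANONICAL:
                    continue
                if k in unpaired or j in unpaired:
                    continue  # hard constraint: pair is outside the search space
                left = best[i][k - 1] if k > i else 0.0
                inner = best[k + 1][j - 1]
                # soft constraint: adjust the score, but keep the pair allowed
                pair_score = 1.0 + bonus.get(k, 0.0) + bonus.get(j, 0.0)
                score = max(score, left + inner + pair_score)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print(fold_with_constraints(seq))                    # unconstrained
    print(fold_with_constraints(seq, unpaired={0}))      # hard constraint
    print(fold_with_constraints(seq, bonus={0: -2.0}))   # soft penalty
```

The constraint checks sit between the recursion step (enumerating k) and the score evaluation, which loosely mirrors the separation of folding grammar and energy evaluation that the abstract describes.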
43
Lorenz R, Wolfinger MT, Tanzer A, Hofacker IL. Predicting RNA secondary structures from sequence and probing data. Methods 2016; 103:86-98. [PMID: 27064083 DOI: 10.1016/j.ymeth.2016.04.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 12/07/2015] [Revised: 03/29/2016] [Accepted: 04/04/2016] [Indexed: 01/08/2023]
Abstract
RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information.
Affiliation(s)
- Ronny Lorenz
  - University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria.
- Michael T Wolfinger
  - University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria; Medical University of Vienna, Center for Anatomy and Cell Biology, Währingerstraße 13, 1090 Vienna, Austria.
- Andrea Tanzer
  - University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria.
- Ivo L Hofacker
  - University of Vienna, Faculty of Chemistry, Department of Theoretical Chemistry, Währingerstrasse 17, 1090 Vienna, Austria; University of Vienna, Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, Währingerstr. 29, 1090 Vienna, Austria.
44
Rogers E, Heitsch C. New insights from cluster analysis methods for RNA secondary structure prediction. WILEY INTERDISCIPLINARY REVIEWS-RNA 2016; 7:278-94. [PMID: 26971529 DOI: 10.1002/wrna.1334] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 10/08/2015] [Revised: 12/03/2015] [Accepted: 12/17/2015] [Indexed: 01/12/2023]
Abstract
A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing.
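The contrast between a single minimum free energy (MFE) prediction and Boltzmann sampling can be sketched as follows. The candidate structures and their free energies are invented for illustration, and the function names are hypothetical; a real workflow would obtain the ensemble from a folding package rather than from a hand-written table. Each structure s with free energy E(s) receives probability exp(-E(s)/RT)/Z, where Z is the sum of these weights over the ensemble.

```python
import math
import random

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees Celsius)
RT = 0.0019872 * 310.15

# Hypothetical toy ensemble: dot-bracket structure -> free energy (kcal/mol)
TOY_ENERGIES = {
    "((((....))))": -5.2,
    "(((......)))": -4.9,
    ".((......)).": -3.0,
    "............": 0.0,
}


def boltzmann_probabilities(energies):
    """Map each structure to its Boltzmann probability exp(-E/RT) / Z."""
    e_min = min(energies.values())  # shift energies for numerical stability
    weights = {s: math.exp(-(e - e_min) / RT) for s, e in energies.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


def sample_structures(energies, n, seed=0):
    """Draw n structures from the Boltzmann distribution of the toy ensemble."""
    probs = boltzmann_probabilities(energies)
    rng = random.Random(seed)
    names = list(probs)
    return rng.choices(names, weights=[probs[s] for s in names], k=n)


if __name__ == "__main__":
    for structure, prob in boltzmann_probabilities(TOY_ENERGIES).items():
        print(structure, round(prob, 3))
    print(sample_structures(TOY_ENERGIES, 5))
```

In this toy ensemble the MFE structure is the most probable one but still leaves a substantial share of the probability to a near-optimal alternative, which is the kind of information that a single-structure prediction discards.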
Affiliation(s)
- Emily Rogers
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0765, USA
- Christine Heitsch
  - School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USA
45
Abstract
Protein-RNA interactions play important roles in a wide variety of cellular processes, ranging from transcriptional and posttranscriptional regulation of genes to host defense against pathogens. In this chapter we present the computational approach catRAPID to predict protein-RNA interactions and discuss how it could be used to find trends in ribonucleoprotein networks. We envisage that the combination of computational and experimental approaches will be crucial to unravel the role of coding and noncoding RNAs in protein networks.
46
Bader El Din NG, El Hefnawy MM, Omran MH, Dawood RM, El Abd Y, Ibrahim MK, El Awady MK. Spontaneous clearance of chronic hepatitis C infection is associated with an internal ribosomal entry site IIId stem loop structure variant. Indian J Med Microbiol 2015; 33 Suppl:143-8. [PMID: 25657135 DOI: 10.4103/0255-0857.148835] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/04/2022]
Abstract
AIM: To investigate whether mutations in the hepatitis C virus (HCV) internal ribosome entry site (IRES) can inhibit translation of the viral polyprotein. MATERIALS AND METHODS: A 26-year-old male patient infected with HCV 10 years ago was followed up. After 9 years of chronic infection, the patient resolved the infection for a period of 9 months, after which he experienced a viral recurrence characterized by high viral load and diverse HCV quasispecies. The IRES structures of the viral strains that disappeared were compared with those of the currently active strains using structural mutational analysis. RESULTS: A novel mutation at position 254, combined with a rarely observed mutation at position 253, was observed in the stem of the IIId subdomain; the new conformation had an octa-apical loop (AGUGUUGG) and a shift in the 3′ GU from the loop to the stem. CONCLUSIONS: These mutations were found to be highly deleterious, and they affected the direct binding of the IIId loop to the 40S ribosomal subunit, with subsequent inhibition of translation of the viral polyprotein and clearance of the virus.
Affiliation(s)
- N G Bader El Din
  - Department of Microbial Biotechnology, National Research Center, Tahrir, Dokki, Cairo, Egypt
47
Kun Á, Szathmáry E. Fitness Landscapes of Functional RNAs. Life (Basel) 2015; 5:1497-517. [PMID: 26308059 PMCID: PMC4598650 DOI: 10.3390/life5031497] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 05/25/2015] [Revised: 07/26/2015] [Accepted: 08/03/2015] [Indexed: 11/16/2022]
Abstract
The notion of fitness landscapes, a map between genotype and fitness, was proposed more than 80 years ago. For most of this time data were available for only a few alleles, and thus we had only a restricted view of the whole fitness landscape. Recent advances in genetics and molecular biology allow a more detailed view of these landscapes. Here we review experimental and theoretical studies of fitness landscapes of functional RNAs, especially aptamers and ribozymes. We find that RNA structures can be divided into critical structures, connecting structures, neutral structures and forbidden structures. Such characterisation, coupled with theoretical sequence-to-structure predictions, allows us to construct the whole fitness landscape. Fitness landscapes can then be used to study evolution, and in our case the development of the RNA world.
Affiliation(s)
- Ádám Kun
  - Parmenides Center for the Conceptual Foundations of Science, Kirchplatz 1, 82049 Munich/Pullach, Germany.
  - MTA-ELTE-MTMT Ecology Research Group, Pázmány Péter sétány 1/C, 1117 Budapest, Hungary.
  - Department of Plant Systematics, Ecology and Theoretical Biology, Institute of Biology, Eötvös University, Pázmány Péter sétány 1/C, 1117 Budapest, Hungary.
- Eörs Szathmáry
  - Parmenides Center for the Conceptual Foundations of Science, Kirchplatz 1, 82049 Munich/Pullach, Germany.
  - Department of Plant Systematics, Ecology and Theoretical Biology, Institute of Biology, Eötvös University, Pázmány Péter sétány 1/C, 1117 Budapest, Hungary.
  - MTA-ELTE Theoretical Biology and Evolutionary Ecology Research Group, Pázmány Péter sétány 1/C, 1117 Budapest, Hungary.
48
Wu Y, Shi B, Ding X, Liu T, Hu X, Yip KY, Yang ZR, Mathews DH, Lu ZJ. Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data. Nucleic Acids Res 2015; 43:7247-59. [PMID: 26170232 PMCID: PMC4551937 DOI: 10.1093/nar/gkv706] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Received: 02/04/2015] [Accepted: 06/30/2015] [Indexed: 12/30/2022]
Abstract
Recently, several experimental techniques have emerged for probing RNA structures based on high-throughput sequencing. However, most secondary structure prediction tools that incorporate probing data are designed and optimized for particular types of experiments. For example, RNAstructure-Fold is optimized for SHAPE data, while SeqFold is optimized for PARS data. Here, we report a new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm. We first demonstrated that RME substantially improved secondary structure prediction with perfect restraints (base pair information of known structures). Next, we collected structure-probing data from diverse experiments (e.g. SHAPE, PARS and DMS-seq) and transformed them into a unified set of pairing probabilities with a posterior probabilistic model. By using the probability scores as restraints in RME, we compared its secondary structure prediction performance with two other well-known tools, RNAstructure-Fold (based on a free energy minimization algorithm) and SeqFold (based on a sampling algorithm). For SHAPE data, RME and RNAstructure-Fold performed better than SeqFold, because they markedly altered the energy model with the experimental restraints. For high-throughput data (e.g. PARS and DMS-seq) with lower probing efficiency, the secondary structure prediction performances of the tested tools were comparable, with performance improvements for only a portion of the tested RNAs. However, when the effects of tertiary structure and protein interactions were removed, RME showed the highest prediction accuracy in the DMS-accessible regions by incorporating in vivo DMS-seq data.
Affiliation(s)
- Yang Wu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Binbin Shi
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Xinqiang Ding
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tong Liu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Xihao Hu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Kevin Y Yip
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Zheng Rong Yang
  - School of Biosciences, University of Exeter, UK Exeter EX4 4QD, UK
- David H Mathews
  - Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
- Zhi John Lu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
49
Bai Y, Dai X, Harrison A, Johnston C, Chen M. Toward a next-generation atlas of RNA secondary structure. Brief Bioinform 2015; 17:63-77. [PMID: 25922372 DOI: 10.1093/bib/bbv026] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 01/09/2015] [Indexed: 12/23/2022]
Abstract
RNA structure plays a crucial role in gene maturation, regulation and function. Determining the form and frequency of RNA folds is essential for a better understanding of how RNA exerts its functions. Low-throughput studies have focused on RNA primary sequences and expression levels, but with an emphasis on relatively small numbers of transcripts. However, with the recent advent of high-throughput technologies, it is realistic to begin analyzing RNA secondary structures on a genome-wide scale. Here, we review genome-wide RNA secondary structure profiles as well as advances in computational structure predictions. We further discuss the novel characteristics of RNA secondary structure across messenger RNAs. Probing RNA secondary structure by high-throughput sequencing will enable us to build atlases of RNA secondary structures, an important step in helping us to understand the versatility of RNA functions in diverse cellular processes.
50
Gupta N, Wu CH, Wu GY. Secondary Structural Elements of the HCV X-region Involved in Viral Replication. J Clin Transl Hepatol 2015; 3:1-8. [PMID: 26356238 PMCID: PMC4542080 DOI: 10.14218/jcth.2015.00003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2015] [Revised: 02/26/2015] [Accepted: 03/01/2015] [Indexed: 12/16/2022]
Abstract
BACKGROUND AND AIMS: The noncoding regions in the 3'-untranslated region (UTR) of the hepatitis C virus (HCV) genome contain secondary structures that are important for replication. The aim of this study was to identify detailed conformational elements of the X-region involved in HCV replication. METHODS: Ribonucleic acid (RNA) structural analogs X94, X12, and X12c were constructed to have identical conformation but 94%, 12%, and 0% sequence identity, respectively, to the X region of HCV genotype 2a. The effects of the structural analogs on replication of HCV genotype 1b and 2a RNA were studied by quantitative reverse transcriptase polymerase chain reaction. RESULTS: In replicon BB7 cells, a constitutive replication model, HCV RNA levels decreased to 55%, 52%, 53%, and 54% after transfection with expression plasmids generating RNA structural analogs 5B-46, X-94, X-12, and X-12c, respectively (p<0.001 for all). In an HCV genotype 2a infection model, RNA analogs 5B-46, X-94, and X-12 in hepatic cells inhibited replication to 11%, 9%, and 12%, respectively. Because the X-12 analog was only 12% identical to the corresponding sequence of HCV genotype 2a, the sequence per se or antisense effects were unlikely to be involved. CONCLUSIONS: The data suggest that the conformation of secondary structures in the 3'-UTR of the HCV RNA genome is required for HCV replication. Stable expression of RNA analogs predicted to have identical stem-loop structures might inhibit HCV infection of hepatocytes in the liver and may represent a novel approach to designing anti-HCV agents.
Affiliation(s)
- George Y. Wu
  - Correspondence to: George Y. Wu, Department of Medicine, Division of Gastroenterology-Hepatology, University of Connecticut Health Center, 263 Farmington Ave, Farmington, CT 06030-1845, USA. Tel: +1-800-535-6232; +1-860-679-7692, Fax: +1-860-679-3159.