1
Yamamura K, Asai K, Iwakiri J. Consistent features observed in structural probing data of eukaryotic RNAs. NAR Genom Bioinform 2025; 7:lqaf001. [PMID: 39885881 PMCID: PMC11780854 DOI: 10.1093/nargab/lqaf001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 12/25/2024] [Accepted: 01/09/2025] [Indexed: 02/01/2025] Open
Abstract
Understanding RNA structure is crucial for elucidating its regulatory mechanisms. With the recent commercialization of messenger RNA vaccines, the profound impact of RNA structure on stability and translation efficiency has become increasingly evident, underscoring the importance of understanding RNA structure. Chemical probing of RNA has emerged as a powerful technique for investigating RNA structure in living cells. This approach utilizes chemical probes that selectively react with accessible regions of RNA, and by measuring reactivity, the openness and potential of RNA for protein binding or base pairing can be inferred. Extensive experimental data generated using RNA chemical probing have significantly contributed to our understanding of RNA structure in cells. However, it is crucial to acknowledge potential biases in chemical probing data to ensure an accurate interpretation. In this study, we comprehensively analyzed transcriptome-scale RNA chemical probing data in eukaryotes and report common features. Notably, in all experiments, the number of bases modified in probing was small, the bases showing the top 10% reactivity well reflected the known secondary structure, bases with high reactivity were more likely to be exposed to solvent and low reactivity did not reflect solvent exposure, which is important information for the analysis of RNA chemical probing data.
Affiliation(s)
- Kazuteru Yamamura
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
- Kiyoshi Asai
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
- Junichi Iwakiri
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
2
Nabeel-Shah S, Pu S, Burns JD, Braunschweig U, Ahmed N, Burke GL, Lee H, Radovani E, Zhong G, Tang H, Marcon E, Zhang Z, Hughes TR, Blencowe BJ, Greenblatt JF. C2H2-zinc-finger transcription factors bind RNA and function in diverse post-transcriptional regulatory processes. Mol Cell 2024; 84:3810-3825.e10. [PMID: 39303720 DOI: 10.1016/j.molcel.2024.08.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 06/13/2024] [Accepted: 08/30/2024] [Indexed: 09/22/2024]
Abstract
Cys2-His2 zinc-finger proteins (C2H2-ZNFs) constitute the largest class of DNA-binding transcription factors (TFs) yet remain largely uncharacterized. Although certain family members, e.g., GTF3A, have been shown to bind both DNA and RNA, the extent to which C2H2-ZNFs interact with, and regulate, RNA-associated processes is not known. Using UV crosslinking and immunoprecipitation (CLIP), we observe that 148 of 150 analyzed C2H2-ZNFs bind directly to RNA in human cells. By integrating CLIP sequencing (CLIP-seq) RNA-binding maps for 50 of these C2H2-ZNFs with data from chromatin immunoprecipitation sequencing (ChIP-seq), protein-protein interaction assays, and transcriptome profiling experiments, we observe that the RNA-binding profiles of C2H2-ZNFs are generally distinct from their DNA-binding preferences and that they regulate a variety of post-transcriptional processes, including pre-mRNA splicing, cleavage and polyadenylation, and m6A modification of mRNA. Our results thus define a substantially expanded repertoire of C2H2-ZNFs that bind RNA and provide an important resource for elucidating post-transcriptional regulatory programs.
Affiliation(s)
- Syed Nabeel-Shah
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Shuye Pu
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- James D Burns
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Nujhat Ahmed
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Giovanni L Burke
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Hyunmin Lee
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Computer Sciences, University of Toronto, Toronto, ON M5S 1A8, Canada
- Ernest Radovani
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Guoqing Zhong
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Hua Tang
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Edyta Marcon
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON M5S 1A8, Canada
- Zhaolei Zhang
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Computer Sciences, University of Toronto, Toronto, ON M5S 1A8, Canada
- Timothy R Hughes
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Benjamin J Blencowe
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Jack F Greenblatt
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
3
von Löhneysen S, Spicher T, Varenyk Y, Yao HT, Lorenz R, Hofacker I, Stadler PF. Phylogenetic and Chemical Probing Information as Soft Constraints in RNA Secondary Structure Prediction. J Comput Biol 2024; 31:549-563. [PMID: 38935442 DOI: 10.1089/cmb.2024.0519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
Extrinsic, experimental information can be incorporated into thermodynamics-based RNA folding algorithms in the form of pseudo-energies. Evolutionary conservation of RNA secondary structure elements is detectable in alignments of phylogenetically related sequences and provides evidence for the presence of certain base pairs that can also be converted into pseudo-energy contributions. We show that the centroid base pairs computed from a consensus folding model such as RNAalifold result in a substantial improvement of the prediction accuracy for single sequences. Evidence for specific base pairs turns out to be more informative than a position-wise profile for the conservation of the pairing status. A comparison with chemical probing data, furthermore, strongly suggests that phylogenetic base pairing data are more informative than position-specific data on (un)pairedness as obtained from chemical probing experiments. In this context we demonstrate, in addition, that the conversion of signal from probing data into pseudo-energies is possible using thermodynamic structure predictions as a reference instead of known RNA structures.
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - UniVie Doctoral School Computer Science (DoCS), University of Vienna, Vienna, Austria
- Yuliia Varenyk
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Hua-Ting Yao
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ivo Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, New Mexico, USA
4
von Löhneysen S, Mörl M, Stadler PF. Limits of experimental evidence in RNA secondary structure prediction. Front Bioinform 2024; 4:1346779. [PMID: 38456157 PMCID: PMC10918467 DOI: 10.3389/fbinf.2024.1346779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 01/09/2024] [Indexed: 03/09/2024] Open
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, Leipzig, Germany
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
  - Competence Center for Scalable Data Analytics and Artificial Intelligence, School of Embedded and Compositive Artificial Intelligence (SECAI), Leipzig University, Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Center for Non-Coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, United States
5
Sharma H, Valentine MNZ, Toki N, Sueki HN, Gustincich S, Takahashi H, Carninci P. Decryption of sequence, structure, and functional features of SINE repeat elements in SINEUP non-coding RNA-mediated post-transcriptional gene regulation. Nat Commun 2024; 15:1400. [PMID: 38383605 PMCID: PMC10881587 DOI: 10.1038/s41467-024-45517-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 01/26/2024] [Indexed: 02/23/2024] Open
Abstract
RNA structure folding largely influences RNA regulation by providing flexibility and functional diversity. In silico and in vitro analyses are limited in their ability to capture the intricate relationships between dynamic RNA structure and RNA functional diversity present in the cell. Here, we investigate sequence, structure and functional features of mouse and human SINE-transcribed retrotransposons embedded in SINEUPs long non-coding RNAs, which positively regulate target gene expression post-transcriptionally. In-cell secondary structure probing reveals that functional SINEs-derived RNAs contain conserved short structure motifs essential for SINEUP-induced translation enhancement. We show that SINE RNA structure dynamically changes between the nucleus and cytoplasm and is associated with compartment-specific binding to RBP and related functions. Moreover, RNA-RNA interaction analysis shows that the SINE-derived RNAs interact directly with ribosomal RNAs, suggesting a mechanism of translation regulation. We further predict the architecture of 18 SINE RNAs in three dimensions guided by experimental secondary structure data. Overall, we demonstrate that the conservation of short key features involved in interactions with RBPs and ribosomal RNA drives the convergent function of evolutionarily distant SINE-transcribed RNAs.
Affiliation(s)
- Harshita Sharma
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Matthew N Z Valentine
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Naoko Toki
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Hiromi Nishiyori Sueki
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Hazuki Takahashi
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Piero Carninci
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
  - Human Technopole, Milan, 20157, Italy
6
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Universidad Nacional de Colombia, Bogotá, Colombia
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, USA
7
Greenwood T, Heitsch CE. How Parameters Influence SHAPE-Directed Predictions. Methods Mol Biol 2024; 2726:105-124. [PMID: 38780729 DOI: 10.1007/978-1-0716-3519-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters which define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
8
Waldl M, Spicher T, Lorenz R, Beckmann IK, Hofacker IL, von Löhneysen S, Stadler PF. Local RNA folding revisited. J Bioinform Comput Biol 2023; 21:2350016. [PMID: 37522173 DOI: 10.1142/s0219720023500166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Most of the functional RNA elements located within large transcripts are local. Local folding therefore serves as a practically useful approximation to global structure prediction. Due to the sensitivity of RNA secondary structure prediction to the exact definition of sequence ends, accuracy can be increased by averaging local structure predictions over multiple, overlapping sequence windows. These averages can be computed efficiently by dynamic programming. Here we revisit the local folding problem, present a concise mathematical formalization that generalizes previous approaches and show that correct Boltzmann samples can be obtained by local stochastic backtracing in McCaskill's algorithms but not from local folding recursions. Corresponding new features are implemented in the ViennaRNA package to improve the support of local folding. Applications include the computation of maximum expected accuracy structures from RNAplfold data and a mutual information measure to quantify the sensitivity of individual sequence positions.
Affiliation(s)
- Maria Waldl
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Irene K Beckmann
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Sarah von Löhneysen
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Peter F Stadler
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
9
Wu KE, Zou JY, Chang H. Machine learning modeling of RNA structures: methods, challenges and future perspectives. Brief Bioinform 2023; 24:bbad210. [PMID: 37280185 DOI: 10.1093/bib/bbad210] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2023] [Revised: 05/12/2023] [Accepted: 05/17/2023] [Indexed: 06/08/2023] Open
Abstract
The three-dimensional structure of RNA molecules plays a critical role in a wide range of cellular processes encompassing functions from riboswitches to epigenetic regulation. These RNA structures are incredibly dynamic and can indeed be described aptly as an ensemble of structures that shifts in distribution depending on different cellular conditions. Thus, the computational prediction of RNA structure poses a unique challenge, even as computational protein folding has seen great advances. In this review, we focus on a variety of machine learning-based methods that have been developed to predict RNA molecules' secondary structure, as well as more complex tertiary structures. We survey commonly used modeling strategies, and how many are inspired by or incorporate thermodynamic principles. We discuss the shortcomings that various design decisions entail and propose future directions that could build off these methods to yield more robust, accurate RNA structure predictions.
Affiliation(s)
- Kevin E Wu
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- James Y Zou
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Howard Chang
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
10
Kolberg T, von Löhneysen S, Ozerova I, Wellner K, Hartmann R, Stadler P, Mörl M. Led-Seq: ligation-enhanced double-end sequence-based structure analysis of RNA. Nucleic Acids Res 2023; 51:e63. [PMID: 37114986 PMCID: PMC10287922 DOI: 10.1093/nar/gkad312] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Revised: 03/21/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023] Open
Abstract
Structural analysis of RNA is an important and versatile tool to investigate the function of this type of molecules in the cell as well as in vitro. Several robust and reliable procedures are available, relying on chemical modification inducing RT stops or nucleotide misincorporations during reverse transcription. Others are based on cleavage reactions and RT stop signals. However, these methods address only one side of the RT stop or misincorporation position. Here, we describe Led-Seq, a new approach based on lead-induced cleavage of unpaired RNA positions, where both resulting cleavage products are investigated. The RNA fragments carrying 2', 3'-cyclic phosphate or 5'-OH ends are selectively ligated to oligonucleotide adapters by specific RNA ligases. In a deep sequencing analysis, the cleavage sites are identified as ligation positions, avoiding possible false positive signals based on premature RT stops. With a benchmark set of transcripts in Escherichia coli, we show that Led-Seq is an improved and reliable approach based on metal ion-induced phosphodiester hydrolysis to investigate RNA structures in vivo.
Affiliation(s)
- Tim Kolberg
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
- Iuliia Ozerova
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
- Karolin Wellner
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
- Roland K Hartmann
  - Institute for Pharmaceutical Chemistry, Philipps University Marburg, Marbacher Weg 6, 35037 Marburg, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
11
Banijamali E, Baronti L, Becker W, Sajkowska-Kozielewicz JJ, Huang T, Palka C, Kosek D, Sweetapple L, Müller J, Stone MD, Andersson ER, Petzold K. RNA:RNA interaction in ternary complexes resolved by chemical probing. RNA 2023; 29:317-329. [PMID: 36617673 PMCID: PMC9945442 DOI: 10.1261/rna.079190.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]
Abstract
RNA regulation can be performed by a second targeting RNA molecule, such as in the microRNA regulation mechanism. Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) probes the structure of RNA molecules and can resolve RNA:protein interactions, but RNA:RNA interactions have not yet been addressed with this technique. Here, we apply SHAPE to investigate RNA-mediated binding processes in RNA:RNA and RNA:RNA-RBP complexes. We use RNA:RNA binding by SHAPE (RABS) to investigate microRNA-34a (miR-34a) binding its mRNA target, the silent information regulator 1 (mSIRT1), both with and without the Argonaute protein, constituting the RNA-induced silencing complex (RISC). We show that the seed of the mRNA target must be bound to the microRNA loaded into RISC to enable further binding of the compensatory region by RISC, while the naked miR-34a is able to bind the compensatory region without seed interaction. The method presented here provides complementary structural evidence for the commonly performed luciferase-assay-based evaluation of microRNA binding-site efficiency and specificity on the mRNA target site and could therefore be used in conjunction with it. The method can be applied to any nucleic acid-mediated RNA- or RBP-binding process, such as splicing, antisense RNA binding, or regulation by RISC, providing important insight into the targeted RNA structure.
Affiliation(s)
- Elnaz Banijamali
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Lorenzo Baronti
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Walter Becker
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Ting Huang
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Christina Palka
  - Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, USA
- David Kosek
  - Department of Cell and Molecular Biology, Karolinska Institute, 17177 Stockholm, Sweden
- Lara Sweetapple
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Juliane Müller
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Michael D Stone
  - Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, USA
- Emma R Andersson
  - Department of Cell and Molecular Biology, Karolinska Institute, 17177 Stockholm, Sweden
- Katja Petzold
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
  - Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, Stellenbosch 7600, South Africa
12
Chandler-Bostock R, Bingham RJ, Clark S, Scott AJP, Wroblewski E, Barker A, White SJ, Dykeman EC, Mata CP, Bohon J, Farquhar E, Twarock R, Stockley PG. Genome-regulated Assembly of a ssRNA Virus May Also Prepare It for Infection. J Mol Biol 2022; 434:167797. [PMID: 35998704 DOI: 10.1016/j.jmb.2022.167797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 08/09/2022] [Accepted: 08/15/2022] [Indexed: 11/17/2022]
Abstract
Many single-stranded, positive-sense RNA viruses regulate assembly of their infectious virions by forming multiple, cognate coat protein (CP)-genome contacts at sites termed Packaging Signals (PSs). We have determined the secondary structures of the bacteriophage MS2 ssRNA genome (gRNA) frozen in defined states using constraints from X-ray synchrotron footprinting (XRF). Comparison of the footprints from phage and transcript confirms the presence of multiple PSs in contact with CP dimers in the former. This is also true for a virus-like particle (VLP) assembled around the gRNA in vitro in the absence of the single-copy Maturation Protein (MP) found in phage. Since PS folds are present at many sites across gRNA transcripts, it appears that this genome has evolved to facilitate this mechanism of assembly regulation. There are striking differences between the gRNA-CP contacts seen in phage and the VLP, suggesting that the latter are inappropriate surrogates for aspects of phage structure/function. Roughly 50% of potential PS sites in the gRNA are not in contact with the protein shell of phage. However, many of these sit adjacent to, albeit not in contact with, PS-binding sites on CP dimers. We hypothesize that these act as PSs transiently during assembly but subsequently dissociate. Combining the XRF data with PS locations from an asymmetric cryo-EM reconstruction suggests that the genome positions of such dissociations are non-random and may facilitate infection. The loss of many PS-CP interactions towards the 3' end of the gRNA would allow this part of the genome to transit more easily through the narrow basal body of the pilus extruding machinery. This is the known first step in phage infection. In addition, each PS-CP dissociation event leaves the protein partner trapped in a non-lowest free-energy conformation. This destabilizes the protein shell which must disassemble during infection, further facilitating this stage of the life-cycle.
Affiliation(s)
- Richard J Bingham
  - Departments of Mathematics and Biology & York Cross-Disciplinary Centre for Systems Analysis, University of York, York, UK
- Sam Clark
  - Departments of Mathematics and Biology & York Cross-Disciplinary Centre for Systems Analysis, University of York, York, UK
- Andrew J P Scott
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
- Emma Wroblewski
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
- Amy Barker
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
- Simon J White
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
- Eric C Dykeman
  - Departments of Mathematics and Biology & York Cross-Disciplinary Centre for Systems Analysis, University of York, York, UK
- Carlos P Mata
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
- Jen Bohon
  - CWRU Center for Synchrotron Biosciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Erik Farquhar
  - CWRU Center for Synchrotron Biosciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Reidun Twarock
  - Departments of Mathematics and Biology & York Cross-Disciplinary Centre for Systems Analysis, University of York, York, UK
- Peter G Stockley
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
13
RNA secondary structure packages evaluated and improved by high-throughput experiments. Nat Methods 2022; 19:1234-1242. [PMID: 36192461 PMCID: PMC9839360 DOI: 10.1038/s41592-022-01605-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/29/2020] [Accepted: 08/10/2022] [Indexed: 01/17/2023]
Abstract
Despite the popularity of computer-aided study and design of RNA molecules, little is known about the accuracy of commonly used structure modeling packages in tasks sensitive to ensemble properties of RNA. Here, we demonstrate that the EternaBench dataset, a set of more than 20,000 synthetic RNA constructs designed on the RNA design platform Eterna, provides incisive discriminative power in evaluating current packages in ensemble-oriented structure prediction tasks. We find that CONTRAfold and RNAsoft, packages with parameters derived through statistical learning, achieve consistently higher accuracy than more widely used packages in their standard settings, which derive parameters primarily from thermodynamic experiments. We hypothesized that training a multitask model with the varied data types in EternaBench might improve inference on ensemble-based prediction tasks. Indeed, the resulting model, named EternaFold, demonstrated improved performance that generalizes to diverse external datasets including complete messenger RNAs, viral genomes probed in human cells and synthetic designs modeling mRNA vaccines.
14
Yang SL, Ponti RD, Wan Y, Huber RG. Computational and Experimental Approaches to Study the RNA Secondary Structures of RNA Viruses. Viruses 2022; 14:1795. [PMID: 36016417 PMCID: PMC9415818 DOI: 10.3390/v14081795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/12/2022] [Accepted: 08/13/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
Most pandemics of recent decades can be traced to RNA viruses, including HIV, SARS, influenza, dengue, Zika, and SARS-CoV-2. These RNA viruses impose considerable social and economic burdens on our society, resulting in a high number of deaths and high treatment costs. As these RNA viruses utilize an RNA genome, which is important for different stages of the viral life cycle, including replication, translation, and packaging, studying how the genome folds is important to understand virus function. In this review, we summarize recent advances in computational and high-throughput RNA structure-mapping approaches and their use in understanding structures within RNA virus genomes. In particular, we focus on the genome structures of the dengue, Zika, and SARS-CoV-2 viruses due to recent significant outbreaks of these viruses around the world.
Affiliation(s)
- Siwy Ling Yang
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Riccardo Delli Ponti
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
- Yue Wan
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Roland G. Huber
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
15
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Affiliation(s)
- Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
16
Morandi E, van Hemert MJ, Incarnato D. SHAPE-guided RNA structure homology search and motif discovery. Nat Commun 2022; 13:1722. [PMID: 35361788 PMCID: PMC8971488 DOI: 10.1038/s41467-022-29398-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/10/2021] [Accepted: 03/11/2022] [Indexed: 01/13/2023] [Open Access]
Abstract
The rapidly growing popularity of RNA structure probing methods is leading to increasingly large amounts of available RNA structure information. This demands the development of efficient tools for the identification of RNAs sharing regions of structural similarity by direct comparison of their reactivity profiles, hence enabling the discovery of conserved structural features. We here introduce SHAPEwarp, a largely sequence-agnostic SHAPE-guided algorithm for the identification of structurally-similar regions in RNA molecules. Analysis of Dengue, Zika and coronavirus genomes recapitulates known regulatory RNA structures and identifies novel highly-conserved structural elements. This work represents a preliminary step towards the model-free search and identification of shared and conserved RNA structural features within transcriptomes.
Affiliation(s)
- Edoardo Morandi
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, The Netherlands
- Martijn J van Hemert
- Department of Medical Microbiology, Molecular Virology Laboratory, Leiden University Medical Center, Leiden, The Netherlands
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, The Netherlands
17
Kis Z. Stability Modelling of mRNA Vaccine Quality Based on Temperature Monitoring throughout the Distribution Chain. Pharmaceutics 2022; 14:430. [PMID: 35214162 PMCID: PMC8877932 DOI: 10.3390/pharmaceutics14020430] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/11/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 11/22/2022] [Open Access]
Abstract
The vaccine distribution chains in several low- and middle-income countries are not adequate to facilitate the rapid delivery of high volumes of thermosensitive COVID-19 mRNA vaccines at the required low and ultra-low temperatures. COVID-19 mRNA vaccines are currently distributed along with temperature monitoring devices to track and identify deviations from predefined conditions throughout the distribution chain. These temperature readings can feed into computational models to quantify mRNA vaccine critical quality attributes (CQAs) and the remaining vaccine shelf life more accurately. Here, a kinetic modelling approach is proposed to quantify the stability-related CQAs and the remaining shelf life of mRNA vaccines. The CQA and shelf-life values can be computed based on the conditions under which the vaccines have been distributed from the manufacturing facilities via the distribution network to the vaccination centres. This approach helps to quantify the degree to which temperature excursions impact vaccine quality and can also reduce vaccine wastage. In addition, vaccine stock management can be improved due to the information obtained on the remaining shelf life of mRNA vaccines. This model-based quantification of mRNA vaccine quality and remaining shelf life can improve the deployment of COVID-19 mRNA vaccines to low- and middle-income countries.
Affiliation(s)
- Zoltán Kis
- Department of Chemical and Biological Engineering, The University of Sheffield, Mappin St., Sheffield S1 3JD, UK
- The Sargent Centre for Process Systems Engineering, Department of Chemical Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
18
Tagashira M, Asai K. ConsAlifold: considering RNA structural alignments improves prediction accuracy of RNA consensus secondary structures. Bioinformatics 2022; 38:710-719. [PMID: 34694364 DOI: 10.1093/bioinformatics/btab738] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 08/24/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023] [Open Access]
Abstract
MOTIVATION: By detecting homology among RNAs, the probabilistic consideration of RNA structural alignments has improved the prediction accuracy of significant RNA prediction problems. Predicting an RNA consensus secondary structure from an RNA sequence alignment is a fundamental research objective because in the detection of conserved base-pairings among RNA homologs, predicting an RNA consensus secondary structure is more convenient than predicting an RNA structural alignment.
RESULTS: We developed and implemented ConsAlifold, a dynamic programming-based method that predicts the consensus secondary structure of an RNA sequence alignment. ConsAlifold considers RNA structural alignments. ConsAlifold achieves moderate running time and the best prediction accuracy of RNA consensus secondary structures among available prediction methods.
AVAILABILITY AND IMPLEMENTATION: ConsAlifold, data and Python scripts for generating both figures and tables are freely available at https://github.com/heartsh/consalifold.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Masaki Tagashira
- Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan
- Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
- Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan
- Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
19
Zambrano RAI, Hernandez-Perez C, Takahashi MK. RNA Structure Prediction, Analysis, and Design: An Introduction to Web-Based Tools. Methods Mol Biol 2022; 2518:253-269. [PMID: 35666450 DOI: 10.1007/978-1-0716-2421-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Understanding RNA structure has become critical in the study of RNA in their roles as mediators of biological processes. To aid in these studies, computational algorithms that utilize thermodynamics have been developed to predict RNA secondary structure. Due to the importance of intermolecular interactions, the algorithms have been expanded to determine and predict RNA-RNA hybridization. This chapter discusses popular webservers with the tools for RNA secondary structure prediction, RNA-RNA hybridization, and design. We address key features that distinguish common-functioning programs and their purposes for the interests of the user. Ultimately, we hope this review elucidates web-based tools researchers may take advantage of in their investigations of RNA structure and function.
Affiliation(s)
- Melissa K Takahashi
- Department of Biology, California State University Northridge, Northridge, CA, USA
20
Zhao YH, Zhou T, Wang JX, Li Y, Fang MF, Liu JN, Li ZH. Evolution and structural variations in chloroplast tRNAs in gymnosperms. BMC Genomics 2021; 22:750. [PMID: 34663228 PMCID: PMC8524817 DOI: 10.1186/s12864-021-08058-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/20/2020] [Accepted: 10/06/2021] [Indexed: 11/22/2022] [Open Access]
Abstract
Background: Chloroplast transfer RNAs (tRNAs) can participate in various vital processes. Gymnosperms have important ecological and economic value, and they are the dominant species in forest ecosystems in the Northern Hemisphere. However, the evolution and structural changes in chloroplast tRNAs in gymnosperms remain largely unclear.
Results: In this study, we determined the nucleotide evolution, phylogenetic relationships, and structural variations in 1779 chloroplast tRNAs in gymnosperms. The numbers and types of tRNA genes present in the chloroplast genomes of different gymnosperms did not differ greatly, where the average number of tRNAs was 33 and the frequencies of occurrence for various types of tRNAs were generally consistent. Nearly half of the anticodons were absent. Molecular sequence variation analysis identified the conserved secondary structures of tRNAs. About a quarter of the tRNA genes were found to contain precoded 3′ CCA tails. A few tRNAs have undergone novel structural changes that are closely related to their minimum free energy, and these structural changes affect the stability of the tRNAs. Phylogenetic analysis showed that tRNAs have evolved from multiple common ancestors. The transition rate was higher than the transversion rate in gymnosperm chloroplast tRNAs. More loss events than duplication events have occurred in gymnosperm chloroplast tRNAs during their evolutionary process.
Conclusions: These findings provide novel insights into the molecular evolution and biological characteristics of chloroplast tRNAs in gymnosperms.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08058-3.
Affiliation(s)
- Yu-He Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
- Tong Zhou
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
- Jiu-Xia Wang
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
- Yan Li
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
- Min-Feng Fang
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
- Jian-Ni Liu
- State Key Laboratory of Continental Dynamics, Department of Geology, Early Life Institute, Northwest University, Xi'an, 710069, China
- Zhong-Hu Li
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi'an, 710069, China
21
Wayment-Steele HK, Kim DS, Choe CA, Nicol JJ, Wellington-Oguri R, Watkins AM, Parra Sperberg RA, Huang PS, Eterna Participants, Das R. Theoretical basis for stabilizing messenger RNA through secondary structure design. Nucleic Acids Res 2021; 49:10604-10617. [PMID: 34520542 PMCID: PMC8499941 DOI: 10.1093/nar/gkab764] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 03/10/2021] [Revised: 08/17/2021] [Accepted: 08/27/2021] [Indexed: 01/08/2023] [Open Access]
Abstract
RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Here, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall hydrolysis rate. To characterize the stabilization achievable through structure design, we compare AUP optimization by conventional mRNA design methods to results from more computationally sophisticated algorithms and crowdsourcing through the OpenVaccine challenge on the Eterna platform. We find that rational design on Eterna and the more sophisticated algorithms lead to constructs with low AUP, which we term 'superfolder' mRNAs. These designs exhibit a wide diversity of sequence and structure features that may be desirable for translation, biophysical size, and immunogenicity. Furthermore, their folding is robust to temperature, computer modeling method, choice of flanking untranslated regions, and changes in target protein sequence, as illustrated by rapid redesign of superfolder mRNAs for B.1.351, P.1 and B.1.1.7 variants of the prefusion-stabilized SARS-CoV-2 spike protein. Increases in in vitro mRNA half-life by at least two-fold appear immediately achievable.
MESH Headings
- Algorithms
- Base Pairing
- Base Sequence
- COVID-19/prevention & control
- Humans
- Hydrolysis
- RNA Stability
- RNA, Double-Stranded/chemistry
- RNA, Double-Stranded/genetics
- RNA, Double-Stranded/immunology
- RNA, Messenger/chemistry
- RNA, Messenger/genetics
- RNA, Messenger/immunology
- RNA, Viral/chemistry
- RNA, Viral/genetics
- RNA, Viral/immunology
- SARS-CoV-2/genetics
- SARS-CoV-2/immunology
- Spike Glycoprotein, Coronavirus/genetics
- Spike Glycoprotein, Coronavirus/immunology
- Thermodynamics
Affiliation(s)
- Hannah K Wayment-Steele
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA
- Eterna Massive Open Laboratory
- Do Soon Kim
- Eterna Massive Open Laboratory
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Christian A Choe
- Eterna Massive Open Laboratory
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Andrew M Watkins
- Eterna Massive Open Laboratory
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Po-Ssu Huang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Rhiju Das
- Eterna Massive Open Laboratory
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Department of Physics, Stanford University, Stanford, CA 94305, USA
22
Andrews RJ, O’Leary CA, Tompkins VS, Peterson JM, Haniff H, Williams C, Disney MD, Moss WN. A map of the SARS-CoV-2 RNA structurome. NAR Genom Bioinform 2021; 3:lqab043. [PMID: 34046592 PMCID: PMC8140738 DOI: 10.1093/nargab/lqab043] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 02/17/2021] [Revised: 04/06/2021] [Accepted: 04/28/2021] [Indexed: 12/11/2022] [Open Access]
Abstract
SARS-CoV-2 has exploded throughout the human population. To facilitate efforts to gain insights into SARS-CoV-2 biology and to target the virus therapeutically, it is essential to have a roadmap of likely functional regions embedded in its RNA genome. In this report, we used a bioinformatics approach, ScanFold, to deduce the local RNA structural landscape of the SARS-CoV-2 genome with the highest likelihood of being functional. We recapitulate previously-known elements of RNA structure and provide a model for the folding of an essential frameshift signal. Our results find that SARS-CoV-2 is greatly enriched in unusually stable and likely evolutionarily ordered RNA structure, which provides a large reservoir of potential drug targets for RNA-binding small molecules. Results are enhanced via the re-analyses of publicly-available genome-wide biochemical structure probing datasets that are broadly in agreement with our models. Additionally, ScanFold was updated to incorporate experimental data as constraints in the analysis to facilitate comparisons between ScanFold and other RNA modelling approaches. Ultimately, ScanFold was able to identify eight highly structured/conserved motifs in SARS-CoV-2 that agree with experimental data, without explicitly using these data. All results are made available via a public database (the RNAStructuromeDB: https://structurome.bb.iastate.edu/sars-cov-2) and model comparisons are readily viewable at https://structurome.bb.iastate.edu/sars-cov-2-global-model-comparisons.
Affiliation(s)
- Ryan J Andrews
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Collin A O’Leary
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Van S Tompkins
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Jake M Peterson
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Hafeez S Haniff
- Department of Chemistry, The Scripps Research Institute, Jupiter, FL 33458, USA
- Matthew D Disney
- Department of Chemistry, The Scripps Research Institute, Jupiter, FL 33458, USA
- Walter N Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
23
Cao J, Xue Y. Characteristic chemical probing patterns of loop motifs improve prediction accuracy of RNA secondary structures. Nucleic Acids Res 2021; 49:4294-4307. [PMID: 33849076 PMCID: PMC8096282 DOI: 10.1093/nar/gkab250] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/07/2020] [Revised: 03/24/2021] [Accepted: 04/10/2021] [Indexed: 12/14/2022] [Open Access]
Abstract
RNA structures play a fundamental role in nearly every aspect of cellular physiology and pathology. Gaining insights into the functions of RNA molecules requires accurate predictions of RNA secondary structures. However, the existing thermodynamic folding models remain less accurate than desired, even when chemical probing data, such as selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) reactivities, are used as restraints. Unlike most SHAPE-directed algorithms that only consider SHAPE restraints for base pairing, we extract two-dimensional structural features encoded in SHAPE data and establish robust relationships between characteristic SHAPE patterns and loop motifs of various types (hairpin, internal, and bulge) and lengths (2-11 nucleotides). Such characteristic SHAPE patterns are closely related to the sugar pucker conformations of loop residues. Based on these patterns, we propose a computational method, SHAPELoop, which refines the predicted results of the existing methods, thereby further improving their prediction accuracy. In addition, SHAPELoop can provide information about local or global structural rearrangements (including pseudoknots) and help researchers to easily test their hypothesized secondary structures.
Affiliation(s)
- Jingyi Cao
- School of Life Sciences, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
- Yi Xue
- School of Life Sciences, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
24
Rivas E. Evolutionary conservation of RNA sequence and structure. Wiley Interdiscip Rev RNA 2021; 12:e1649. [PMID: 33754485 PMCID: PMC8250186 DOI: 10.1002/wrna.1649] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/17/2020] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 12/22/2022]
Abstract
An RNA structure prediction from a single-sequence RNA folding program is not evidence for an RNA whose structure is important for function. Random sequences have plausible and complex predicted structures not easily distinguishable from those of structural RNAs. How to tell when an RNA has a conserved structure is a question that requires looking at the evolutionary signature left by the conserved RNA. This question is important not just for long noncoding RNAs which usually lack an identified function, but also for RNA binding protein motifs which can be single stranded RNAs or structures. Here we review recent advances using sequence and structural analysis to determine when RNA structure is conserved or not. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test to measure false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, or misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments which increase power but destroy covariation. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.
Affiliation(s)
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
25
Královičová J, Borovská I, Pengelly R, Lee E, Abaffy P, Šindelka R, Grutzner F, Vořechovský I. Restriction of an intron size en route to endothermy. Nucleic Acids Res 2021; 49:2460-2487. [PMID: 33550394 PMCID: PMC7969005 DOI: 10.1093/nar/gkab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/11/2020] [Revised: 01/11/2021] [Accepted: 01/15/2021] [Indexed: 11/15/2022] [Open Access]
Abstract
Ca2+-insensitive and -sensitive E1 subunits of the 2-oxoglutarate dehydrogenase complex (OGDHC) regulate tissue-specific NADH and ATP supply by mutually exclusive OGDH exons 4a and 4b. Here we show that their splicing is enforced by distant lariat branch points (dBPs) located near the 5' splice site of the intervening intron. dBPs restrict the intron length and prevent transposon insertions, which can introduce or eliminate dBP competitors. The size restriction was imposed by a single dominant dBP in anamniotes that expanded into a conserved constellation of four dBP adenines in amniotes. The amniote clusters exhibit taxon-specific usage of individual dBPs, reflecting accessibility of their extended motifs within a stable RNA hairpin rather than U2 snRNA:dBP base-pairing. The dBP expansion took place in early terrestrial species and was followed by a uridine enrichment of large downstream polypyrimidine tracts in mammals. The dBP-protected megatracts permit reciprocal regulation of exon 4a and 4b by uridine-binding proteins, including TIA-1/TIAR and PUF60, which promote U1 and U2 snRNP recruitment to the 5' splice site and BP, respectively, but do not significantly alter the relative dBP usage. We further show that codons for residues critically contributing to protein binding sites for Ca2+ and other divalent metals confer the exon inclusion order that mirrors the Irving-Williams affinity series, linking the evolution of auxiliary splicing motifs in exons to metallome constraints. Finally, we hypothesize that the dBP-driven selection for Ca2+-dependent ATP provision by E1 facilitated evolution of endothermy by optimizing the aerobic scope in target tissues.
Collapse
Affiliation(s)
- Jana Královičová
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Ivana Borovská
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Reuben Pengelly
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
| | - Eunice Lee
- School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
| | - Pavel Abaffy
- Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
| | - Radek Šindelka
- Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
| | - Frank Grutzner
- School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
| | - Igor Vořechovský
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
| |
|
26
|
Calonaci N, Jones A, Cuturello F, Sattler M, Bussi G. Machine learning a model for RNA structure prediction. NAR Genom Bioinform 2021; 2:lqaa090. [PMID: 33575634 PMCID: PMC7671377 DOI: 10.1093/nargab/lqaa090] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/13/2020] [Revised: 10/06/2020] [Accepted: 10/20/2020] [Indexed: 01/04/2023] Open
Abstract
RNA function crucially depends on its structure. Thermodynamic models currently used for secondary structure prediction rely on computing the partition function of folding ensembles, and can thus estimate minimum free-energy structures and ensemble populations. These models sometimes fail in identifying native structures unless complemented by auxiliary experimental data. Here, we build a set of models that combine thermodynamic parameters, chemical probing data (DMS and SHAPE) and co-evolutionary data (direct coupling analysis) through a network that outputs perturbations to the ensemble free energy. Perturbations are trained to increase the ensemble populations of a representative set of known native RNA structures. In the chemical probing nodes of the network, a convolutional window combines neighboring reactivities, enlightening their structural information content and the contribution of local conformational ensembles. Regularization is used to limit overfitting and improve transferability. The most transferable model is selected through a cross-validation strategy that estimates the performance of models on systems on which they are not trained. With the selected model we obtain increased ensemble populations for native structures and more accurate predictions in an independent validation set. The flexibility of the approach allows the model to be easily retrained and adapted to incorporate arbitrary experimental information.
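As a rough illustration of the "convolutional window" idea in this abstract, the sketch below combines each nucleotide's reactivity with those of its neighbors through a weighted sliding window to produce one per-nucleotide energy perturbation. This is not the authors' trained network: the function name `windowed_penalties`, the example weights, the bias term, and the zero-padding at the sequence ends are all assumptions made for illustration.

```python
def windowed_penalties(reactivities, weights, bias=0.0):
    """Combine neighboring reactivities with a sliding window.

    Hypothetical sketch: for position i the output is
        bias + sum_k weights[k] * reactivities[i + k - half]
    where positions outside the sequence contribute nothing
    (an assumed zero-padding choice, not taken from the paper).
    """
    if len(weights) % 2 == 0:
        raise ValueError("window length must be odd so it can be centered")
    half = len(weights) // 2
    n = len(reactivities)
    penalties = []
    for i in range(n):
        total = bias
        for k, w in enumerate(weights):
            j = i + k - half  # index of the neighbor covered by weight k
            if 0 <= j < n:
                total += w * reactivities[j]
        penalties.append(total)
    return penalties


if __name__ == "__main__":
    # Illustrative reactivity profile and window weights (made up).
    profile = [1.0, 2.0, 3.0]
    print(windowed_penalties(profile, [0.5, 1.0, 0.5]))
```

In a trained model the weights would be fitted parameters; here they are fixed by hand only to show how neighboring positions enter the result.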
Affiliation(s)
- Nicola Calonaci
- International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
| | - Alisha Jones
- Institute of Structural Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
| | - Francesca Cuturello
- International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
| | - Michael Sattler
- Institute of Structural Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
| | - Giovanni Bussi
- International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
| |
|
27
|
Rivas E. RNA structure prediction using positive and negative evolutionary information. PLoS Comput Biol 2020; 16:e1008387. [PMID: 33125376 PMCID: PMC7657543 DOI: 10.1371/journal.pcbi.1008387] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/22/2020] [Revised: 11/11/2020] [Accepted: 09/24/2020] [Indexed: 12/22/2022] Open
Abstract
Knowing the structure of conserved structural RNAs is important to elucidate their function and mechanism of action. However, predicting a conserved RNA structure remains unreliable, even when using a combination of thermodynamic stability and evolutionary covariation information. Here we present a method to predict a conserved RNA structure that combines the following three features. First, it uses significant covariation due to RNA structure and removes spurious covariation due to phylogeny. Second, it uses negative evolutionary information: basepairs that have variation but no significant covariation are prevented from occurring. Lastly, it uses a battery of probabilistic folding algorithms that incorporate all positive covariation into one structure. The method, named CaCoFold (Cascade variation/covariation Constrained Folding algorithm), predicts a nested structure guided by a maximal subset of positive basepairs, and recursively incorporates all remaining positive basepairs into alternative helices. The alternative helices can be compatible with the nested structure such as pseudoknots, or overlapping such as competing structures, base triplets, or other 3D non-antiparallel interactions. We present evidence that CaCoFold predictions are consistent with structures modeled from crystallography. The availability of deeper comparative sequence alignments and recent advances in statistical analysis of RNA sequence covariation have made it possible to identify a reliable set of conserved base pairs, as well as a reliable set of non-basepairs (positions that vary without covarying). Predicting an overall consensus secondary structure consistent with a set of individual inferred pairs and non-pairs remains a problem. 
Current RNA structure prediction algorithms that predict nested secondary structures cannot use the full set of inferred covarying pairs, because covariation analysis also identifies important non-nested pairing interactions such as pseudoknots, base triples, and alternative structures. Moreover, although algorithms for incorporating negative constraints exist, negative information from covariation analysis (inferred non-pairs) has not been systematically exploited. Here I introduce an efficient approximate RNA structure prediction algorithm that incorporates all inferred pairs and excludes all non-pairs. Using this, and an improved visualization tool, I show that the method correctly identifies many non-nested structures in agreement with known crystal structures, and improves many curated consensus secondary structure annotations in RNA sequence alignment databases.
Affiliation(s)
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
| |
|
28
|
Li B, Cao Y, Westhof E, Miao Z. Advances in RNA 3D Structure Modeling Using Experimental Data. Front Genet 2020; 11:574485. [PMID: 33193680 PMCID: PMC7649352 DOI: 10.3389/fgene.2020.574485] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 06/19/2020] [Accepted: 09/02/2020] [Indexed: 12/26/2022] Open
Abstract
RNA is a unique bio-macromolecule that can both record genetic information and perform biological functions in a variety of molecular processes, including transcription, splicing, translation, and even regulating protein function. RNAs adopt specific three-dimensional conformations to enable their functions. Experimental determination of high-resolution RNA structures using x-ray crystallography is both laborious and demands expertise, thus, hindering our comprehension of RNA structural biology. The computational modeling of RNA structure was a milestone in the birth of bioinformatics. Although computational modeling has been greatly improved over the last decade showing many successful cases, the accuracy of such computational modeling is not only length-dependent but also varies according to the complexity of the structure. To increase credibility, various experimental data were integrated into computational modeling. In this review, we summarize the experiments that can be integrated into RNA structure modeling as well as the computational methods based on these experimental data. We also demonstrate how computational modeling can help the experimental determination of RNA structure. We highlight the recent advances in computational modeling which can offer reliable structure models using high-throughput experimental data.
Affiliation(s)
- Bing Li
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
| | - Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
| | - Zhichao Miao
- Translational Research Institute of Brain and Brain-Like Intelligence, Department of Anesthesiology, Shanghai Fourth People’s Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
- Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| |
|
29
|
Greenwood T, Heitsch CE. On the Problem of Reconstructing a Mixture of RNA Structures. Bull Math Biol 2020; 82:133. [PMID: 33029669 DOI: 10.1007/s11538-020-00804-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/23/2020] [Accepted: 09/08/2020] [Indexed: 01/02/2023]
Abstract
A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data, and not the sequence, which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov-Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.
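For readers unfamiliar with the Nussinov-Jacobson model named in this abstract, the following is a minimal sketch of its core idea: a dynamic program that maximizes the number of nested base pairs in a sequence. It is a textbook-style illustration, not the authors' analysis; the function name `nussinov_max_pairs`, the inclusion of G-U wobble pairs, and the minimum hairpin loop of three unpaired nucleotides are assumptions chosen for the example.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style recursion).

    Assumptions for this sketch: Watson-Crick and G-U pairs are allowed,
    and a pair (k, j) requires at least `min_loop` unpaired bases
    between k and j.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

Because the model scores only the pair count, many distinct structures tie for the optimum, which gives some intuition for why sequence alone cannot determine relative abundances in a mixture.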
|
30
|
Ohyama T, Takahashi H, Sharma H, Yamazaki T, Gustincich S, Ishii Y, Carninci P. An NMR-based approach reveals the core structure of the functional domain of SINEUP lncRNAs. Nucleic Acids Res 2020; 48:9346-9360. [PMID: 32697302 PMCID: PMC7498343 DOI: 10.1093/nar/gkaa598] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/27/2019] [Revised: 06/30/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) are attracting widespread attention for their emerging regulatory, transcriptional, epigenetic, structural and various other functions. Comprehensive transcriptome analysis has revealed that retrotransposon elements (REs) are transcribed and enriched in lncRNA sequences. However, the functions of lncRNAs and the molecular roles of the embedded REs are largely unknown. The secondary and tertiary structures of lncRNAs and their embedded REs are likely to have essential functional roles, but experimental determination and reliable computational prediction of large RNA structures have been extremely challenging. We report here the nuclear magnetic resonance (NMR)-based secondary structure determination of the 167-nt inverted short interspersed nuclear element (SINE) B2, which is embedded in antisense Uchl1 lncRNA and upregulates the translation of sense Uchl1 mRNAs. By using NMR 'fingerprints' as a sensitive probe in the domain survey, we successfully divided the full-length inverted SINE B2 into minimal units made of two discrete structured domains and one dynamic domain without altering their original structures after careful boundary adjustments. This approach allowed us to identify a structured domain in nucleotides 31-119 of the inverted SINE B2. This approach will be applicable to determining the structures of other regulatory lncRNAs.
Affiliation(s)
- Takako Ohyama
- NMR Division, RIKEN SPring-8 Center (RSC), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Hazuki Takahashi
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Harshita Sharma
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Toshio Yamazaki
- NMR Division, RIKEN SPring-8 Center (RSC), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Stefano Gustincich
- Central RNA Laboratory, Instituto Italiano di Tecnologia (IIT), 16163 Genova, Italy
| | - Yoshitaka Ishii
- NMR Division, RIKEN SPring-8 Center (RSC), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- School of Life Science and Technology, Tokyo Institute of Technology, 4259 Midori-ku, Yokohama, Kanagawa 226-8503, Japan
| | - Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| |
|
31
|
Saaidi A, Allouche D, Regnier M, Sargueil B, Ponty Y. IPANEMAP: integrative probing analysis of nucleic acids empowered by multiple accessibility profiles. Nucleic Acids Res 2020; 48:8276-8289. [PMID: 32735675 PMCID: PMC7470984 DOI: 10.1093/nar/gkaa607] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/16/2019] [Revised: 07/03/2020] [Accepted: 07/29/2020] [Indexed: 11/13/2022] Open
Abstract
The manual production of reliable RNA structure models from chemical probing experiments benefits from the integration of information derived from multiple protocols and reagents. However, the interpretation of multiple probing profiles remains a complex task, hindering the quality and reproducibility of modeling efforts. We introduce IPANEMAP, the first automated method for the modeling of RNA structure from multiple probing reactivity profiles. Input profiles can result from experiments based on diverse protocols, reagents, or collections of variants, and are jointly analyzed to predict the dominant conformations of an RNA. IPANEMAP combines sampling, clustering and multi-optimization to produce secondary structure models that are both stable and well supported by experimental evidence. The analysis of multiple reactivity profiles, both publicly available and produced in our study, demonstrates the good performance of IPANEMAP, even in a mono-probing setting. It confirms the potential of integrating multiple sources of probing data, informing the design of informative probing assays.
Affiliation(s)
- Afaf Saaidi
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| | - Delphine Allouche
- CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
| | - Mireille Regnier
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| | - Bruno Sargueil
- CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
| | - Yann Ponty
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| |
|
32
|
Li TJX, Reidys CM. On an enhancement of RNA probing data using information theory. Algorithms Mol Biol 2020; 15:15. [PMID: 32782456 PMCID: PMC7413225 DOI: 10.1186/s13015-020-00176-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/16/2019] [Accepted: 07/31/2020] [Indexed: 12/21/2022] Open
Abstract
Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via chemical probing data. We employ an information-theoretic approach to solve the problem, via considering a variant of the Rényi-Ulam game. Our framework is centered around the ensemble tree, a hierarchical bi-partition of the input ensemble, that is constructed by recursively querying about whether or not a base pair of maximum information entropy is contained in the target. These queries are answered via relating local with global probing data, employing the modularity in RNA secondary structures. We present that leaves of the tree are comprised of sub-samples exhibiting a distinguished structure with high probability. In particular, for a Boltzmann ensemble incorporating probing data, which is well established in the literature, the probability of our framework correctly identifying the target in the leaf is greater than 90%.
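The query step described in this abstract, choosing "a base pair of maximum information entropy", can be sketched as follows. Given a sample of structures, each represented as a set of base pairs, the code estimates how often each pair occurs and selects the pair whose presence is most uncertain, that is, whose frequency is closest to one half. This is only an illustration of the selection criterion, not the authors' ensemble-tree implementation; the names `binary_entropy` and `max_entropy_pair` and the toy sample are hypothetical.

```python
import math
from collections import Counter


def binary_entropy(p):
    """Entropy in bits of a yes/no event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def max_entropy_pair(sample):
    """Return the base pair whose presence is most uncertain in the sample.

    `sample` is a list of structures; each structure is a set of (i, j)
    index pairs. Ties are broken by the smallest pair for determinism.
    """
    if not sample:
        raise ValueError("sample must contain at least one structure")
    counts = Counter(bp for structure in sample for bp in set(structure))
    if not counts:
        raise ValueError("no base pairs found in the sample")
    n = len(sample)
    # sorted() fixes the iteration order; max() keeps the first maximum.
    return max(sorted(counts), key=lambda bp: binary_entropy(counts[bp] / n))


if __name__ == "__main__":
    # Toy sample of four structures (made up for illustration).
    toy = [{(0, 8), (1, 7)}, {(0, 8)}, {(0, 8), (2, 6)}, {(0, 8), (1, 7)}]
    print(max_entropy_pair(toy))
```

Splitting the sample on the selected pair (structures that contain it versus those that do not) gives one bi-partition step; repeating the step on each half would yield a tree of the kind the abstract describes.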
|
33
|
Mautner S, Montaseri S, Miladi M, Raden M, Costa F, Backofen R. ShaKer: RNA SHAPE prediction using graph kernel. Bioinformatics 2020; 35:i354-i359. [PMID: 31510707 PMCID: PMC6612843 DOI: 10.1093/bioinformatics/btz395] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 02/05/2023] Open
Abstract
Summary: SHAPE experiments are used to probe the structure of RNA molecules. We present ShaKer to predict SHAPE data for RNA using a graph-kernel-based machine learning approach that is trained on experimental SHAPE information. While other available methods require a manually curated reference structure, ShaKer predicts reactivity data based on sequence input only and by sampling the ensemble of possible structures. Thus, ShaKer is well placed to enable experiment-driven, transcriptome-wide SHAPE data prediction to enable the study of RNA structuredness and to improve RNA structure and RNA–RNA interaction prediction. For performance evaluation, we use accuracy and accessibility compared with experimental SHAPE data and competing methods. We show that ShaKer outperforms its competitors and is able to predict high-quality SHAPE annotations even when no reference structure is provided. Availability and implementation: ShaKer is freely available at https://github.com/BackofenLab/ShaKer.
Affiliation(s)
- Stefan Mautner
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
| | - Soheila Montaseri
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
| | - Milad Miladi
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
| | - Martin Raden
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
| | - Fabrizio Costa
- Department Computer Science, University of Exeter, Exeter, UK
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany; Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
| |
|
34
|
Kuksa PP, Li F, Kannan S, Gregory BD, Leung YY, Wang LS. HiPR: High-throughput probabilistic RNA structure inference. Comput Struct Biotechnol J 2020; 18:1539-1547. [PMID: 32637050 PMCID: PMC7327253 DOI: 10.1016/j.csbj.2020.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2020] [Revised: 05/15/2020] [Accepted: 06/01/2020] [Indexed: 11/20/2022] Open
Abstract
Recent high-throughput structure-sensitive genome-wide sequencing-based assays have enabled large-scale studies of RNA structure, and robust transcriptome-wide computational prediction of individual RNA structures across RNA classes from these assays has potential to further improve the prediction accuracy. Here, we describe HiPR, a novel method for RNA structure prediction at single-nucleotide resolution that combines high-throughput structure probing data (DMS-seq, DMS-MaPseq) with a novel probabilistic folding algorithm. On validation data spanning a variety of RNA classes, HiPR often increases accuracy for predicting RNA structures, giving researchers new tools to study RNA structure.
Affiliation(s)
- Pavel P. Kuksa
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Fan Li
- Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA
| | - Sampath Kannan
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Brian D. Gregory
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Yuk Yee Leung
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Li-San Wang
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
| |
|
35
|
Karner H, Webb CH, Carmona S, Liu Y, Lin B, Erhard M, Chan D, Baldi P, Spitale RC, Sun S. Functional Conservation of LncRNA JPX Despite Sequence and Structural Divergence. J Mol Biol 2019; 432:283-300. [PMID: 31518612 DOI: 10.1016/j.jmb.2019.09.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/05/2019] [Revised: 08/29/2019] [Accepted: 09/02/2019] [Indexed: 02/02/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been identified in all eukaryotes and are most abundant in the human genome. However, the functional importance and mechanisms of action for human lncRNAs are largely unknown. Using comparative sequence, structural, and functional analyses, we characterize the evolution and molecular function of human lncRNA JPX. We find that human JPX and its mouse homolog, lncRNA Jpx, have deep divergence in their nucleotide sequences and RNA secondary structures. Despite such differences, both lncRNAs demonstrate robust binding to CTCF, a protein that is central to Jpx's role in X chromosome inactivation. In addition, our functional rescue experiment using Jpx-deletion mutant cells shows that human JPX can functionally complement the loss of Jpx in mouse embryonic stem cells. Our findings support a model for functional conservation of lncRNAs independent from sequence and structural divergence. This study provides mechanistic insight into the evolution of lncRNA function.
Affiliation(s)
- Heather Karner
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Chiu-Ho Webb
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Sarah Carmona
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Yu Liu
- Department of Computer Science, Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, CA 92697, USA
| | - Benjamin Lin
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Micaela Erhard
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Dalen Chan
- Department of Pharmaceutical Sciences, College of Health Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Pierre Baldi
- Department of Computer Science, Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, CA 92697, USA
| | - Robert C Spitale
- Department of Pharmaceutical Sciences, College of Health Sciences, University of California Irvine, Irvine, CA 92697, USA
| | - Sha Sun
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA.
| |
|
36
|
Incarnato D, Morandi E, Simon LM, Oliviero S. RNA Framework: an all-in-one toolkit for the analysis of RNA structures and post-transcriptional modifications. Nucleic Acids Res 2019; 46:e97. [PMID: 29893890 PMCID: PMC6144828 DOI: 10.1093/nar/gky486] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 03/17/2018] [Accepted: 05/23/2018] [Indexed: 12/24/2022] Open
Abstract
RNA is emerging as a key regulator of a plethora of biological processes. While its study has remained elusive for decades, the recent advent of high-throughput sequencing technologies provided the unique opportunity to develop novel techniques for the study of RNA structure and post-transcriptional modifications. Nonetheless, most of the required downstream bioinformatics analysis steps are not easily reproducible, thus making the application of these techniques a prerogative of few laboratories. Here we introduce RNA Framework, an all-in-one toolkit for the analysis of most NGS-based RNA structure probing and post-transcriptional modification mapping experiments. To prove the extreme versatility of RNA Framework, we applied it both to an in-house generated DMS-MaPseq dataset and to a series of experiments available in the literature. Notably, when starting from publicly available datasets, our software easily allows replicating the authors' findings. Collectively, RNA Framework provides the most complete and versatile toolkit to date for a rapid and streamlined analysis of the RNA epistructurome. RNA Framework is available for download at: http://www.rnaframework.com.
Affiliation(s)
- Danny Incarnato
- Italian Institute for Genomic Medicine (IIGM), Via Nizza 52, 10126 Torino, Italy; Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Via Accademia Albertina 13, Torino, Italy
| | - Edoardo Morandi
- Italian Institute for Genomic Medicine (IIGM), Via Nizza 52, 10126 Torino, Italy; Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Via Accademia Albertina 13, Torino, Italy
| | - Lisa Marie Simon
- Italian Institute for Genomic Medicine (IIGM), Via Nizza 52, 10126 Torino, Italy; Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Via Accademia Albertina 13, Torino, Italy
| | - Salvatore Oliviero
- Italian Institute for Genomic Medicine (IIGM), Via Nizza 52, 10126 Torino, Italy; Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Via Accademia Albertina 13, Torino, Italy
| |
|
37
|
Katz N, Cohen R, Solomon O, Kaufmann B, Atar O, Yakhini Z, Goldberg S, Amit R. Synthetic 5' UTRs Can Either Up- or Downregulate Expression upon RNA-Binding Protein Binding. Cell Syst 2019; 9:93-106.e8. [PMID: 31129060 DOI: 10.1016/j.cels.2019.04.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/21/2018] [Revised: 02/07/2019] [Accepted: 04/26/2019] [Indexed: 01/08/2023]
Abstract
The construction of complex gene-regulatory networks requires both inhibitory and upregulatory modules. However, the vast majority of RNA-based regulatory "parts" are inhibitory. Using a synthetic biology approach combined with SHAPE-seq, we explored the regulatory effect of RNA-binding protein (RBP)-RNA interactions in bacterial 5' UTRs. By positioning a library of RNA hairpins upstream of a reporter gene and co-expressing them with the matching RBP, we observed a set of regulatory responses, including translational stimulation, translational repression, and cooperative behavior. Our combined approach revealed three distinct states in vivo: in the absence of RBPs, the RNA molecules can be found in either a molten state that is amenable to translation or a structured phase that inhibits translation. In the presence of RBPs, the RNA molecules are in a semi-structured phase with partial translational capacity. Our work provides new insight into RBP-based regulation and a blueprint for designing complete gene-regulatory circuits at the post-transcriptional level.
Affiliation(s)
- Noa Katz
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
| | - Roni Cohen
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
| | - Oz Solomon
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel; School of Computer Science, Interdisciplinary Center, 46150 Herzeliya, Israel
| | - Beate Kaufmann
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
| | - Orna Atar
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
| | - Zohar Yakhini
- Department of Computer Science, Technion - Israel Institute of Technology, 32000 Haifa, Israel; School of Computer Science, Interdisciplinary Center, 46150 Herzeliya, Israel
| | - Sarah Goldberg
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
| | - Roee Amit
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel; Russell Berrie Nanotechnology Institute, Technion - Israel Institute of Technology, 32000 Haifa, Israel.
| |
|
38
|
Abstract
RNA performs and regulates a diverse range of cellular processes, with new functional roles being uncovered at a rapid pace. Interest is growing in how these functions are linked to RNA structures that form in the complex cellular environment. A growing suite of technologies that use advances in RNA structural probes, high-throughput sequencing and new computational approaches to interrogate RNA structure at unprecedented throughput are beginning to provide insights into RNA structures at new spatial, temporal and cellular scales.
Affiliation(s)
- Eric J Strobel
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
| | - Angela M Yu
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Julius B Lucks
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
| |
|
39
|
Spasic A, Assmann SM, Bevilacqua PC, Mathews DH. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Res 2019; 46:314-323. [PMID: 29177466 PMCID: PMC5758915 DOI: 10.1093/nar/gkx1057] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 05/10/2017] [Accepted: 10/30/2017] [Indexed: 12/22/2022] Open
Abstract
RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight about RNA function. The accuracy of structure prediction is known to be improved using experimental mapping data that provide information about the pairing status of single nucleotides, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data focused on predicting structures for sequences assuming that they populate a single structure. Most RNAs populate multiple structures, however, where the ensemble of strands populates structures with different sets of canonical base pairs. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Philip C Bevilacqua
- Department of Chemistry, Department of Biochemistry & Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
40
|
Frezza E, Courban A, Allouche D, Sargueil B, Pasquali S. The interplay between molecular flexibility and RNA chemical probing reactivities analyzed at the nucleotide level via an extensive molecular dynamics study. Methods 2019; 162-163:108-127. [PMID: 31145972 DOI: 10.1016/j.ymeth.2019.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/30/2018] [Revised: 05/22/2019] [Accepted: 05/22/2019] [Indexed: 12/20/2022] Open
Abstract
Determination of the tridimensional structure of ribonucleic acid molecules is fundamental for understanding their function in the cell. A common method to investigate RNA structures of large molecules is the use of chemical probes such as SHAPE (2'-hydroxyl acylation analyzed by primer extension) reagents, DMS (dimethyl sulfate) and CMCT (1-cyclohexyl-3-(2-morpholinoethyl) carbodiimide metho-p-toluene sulfate), the reaction of which is dependent on the local structural properties of each nucleotide. In order to understand the interplay between local flexibility, sugar pucker, canonical pairing and chemical reactivity of the probes, we performed all-atom molecular dynamics simulations on a set of RNA molecules for which both tridimensional structure and chemical probing data are available and we analyzed the correlations between geometrical parameters and the chemical reactivity. Our study confirms that SHAPE reactivity is guided by the local flexibility of the different chemical moieties but suggests that a combination of multiple parameters is needed to better understand the implications of the reactivity at the molecular level. This is also the case for DMS and CMCT for which the reactivity appears to be more complex than commonly accepted.
Affiliation(s)
- Elisa Frezza
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
| | - Antoine Courban
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France
| | - Delphine Allouche
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France
| | - Bruno Sargueil
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
| | - Samuela Pasquali
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
| |
|
41
|
Andrews RJ, Roche J, Moss WN. ScanFold: an approach for genome-wide discovery of local RNA structural elements-applications to Zika virus and HIV. PeerJ 2018; 6:e6136. [PMID: 30627482 PMCID: PMC6317755 DOI: 10.7717/peerj.6136] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 08/08/2018] [Accepted: 11/15/2018] [Indexed: 12/24/2022] Open
Abstract
In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, genome replication. Methods for the identification of functional RNA structures in genomes typically rely on scanning analysis windows, where multiple partially-overlapping windows are used to predict RNA structures and folding metrics to deduce regions likely to form functional structure. Separate structural models are produced for each window, where the step size can greatly affect the returned model. This makes deducing unique local structures challenging, as the same nucleotides in each window can be alternatively base paired. We are presenting here a new approach where all base pairs from analysis windows are considered and weighted by favorable folding. This results in unique base pairing throughout the genome and the generation of local regions/structures that can be ranked by their propensity to form unusually thermodynamically stable folds. We applied this approach to the Zika virus (ZIKV) and HIV-1 genomes. ZIKV is linked to a variety of neurological ailments including microcephaly and Guillain-Barré syndrome and its (+)-sense RNA genome encodes two, previously described, functionally essential structured RNA regions. HIV, the cause of AIDS, contains multiple functional RNA motifs in its genome, which have been extensively studied. Our approach is able to successfully identify and model the structures of known functional motifs in both viruses, while also finding additional regions likely to form functional structures. All data have been archived at the RNAStructuromeDB (www.structurome.bb.iastate.edu), a repository of RNA folding data for humans and their pathogens.
Affiliation(s)
- Ryan J. Andrews
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
| | - Julien Roche
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
| | - Walter N. Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
| |
|
42
|
Rendleman J, Cheng Z, Maity S, Kastelic N, Munschauer M, Allgoewer K, Teo G, Zhang YBM, Lei A, Parker B, Landthaler M, Freeberg L, Kuersten S, Choi H, Vogel C. New insights into the cellular temporal response to proteostatic stress. eLife 2018; 7:39054. [PMID: 30272558 PMCID: PMC6185107 DOI: 10.7554/elife.39054] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 06/08/2018] [Accepted: 09/28/2018] [Indexed: 12/13/2022] Open
Abstract
Maintaining a healthy proteome involves all layers of gene expression regulation. By quantifying temporal changes of the transcriptome, translatome, proteome, and RNA-protein interactome in cervical cancer cells, we systematically characterize the molecular landscape in response to proteostatic challenges. We identify shared and specific responses to misfolded proteins and to oxidative stress, two conditions that are tightly linked. We reveal new aspects of the unfolded protein response, including many genes that escape global translation shutdown. A subset of these genes supports rerouting of energy production in the mitochondria. We also find that many genes change at multiple levels, in either the same or opposing directions, and at different time points. We highlight a variety of putative regulatory pathways, including the stress-dependent alternative splicing of aminoacyl-tRNA synthetases, and protein-RNA binding within the 3’ untranslated region of molecular chaperones. These results illustrate the potential of this information-rich resource.
Affiliation(s)
- Justin Rendleman
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Zhe Cheng
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Shuvadeep Maity
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Nicolai Kastelic
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Mathias Munschauer
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Kristina Allgoewer
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Guoshou Teo
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Yun Bin Matteo Zhang
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Amy Lei
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Brian Parker
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| | - Markus Landthaler
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Integrative Research Institute for the Life Sciences, Institute of Biology, Humboldt University, Berlin, Germany
| | | | | | - Hyungwon Choi
- National University of Singapore, Singapore; Institute of Molecular and Cell Biology, Agency for Science, Technology and Research, Singapore
| | - Christine Vogel
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States
| |
|
43
|
Lotfi M, Zare-Mirakabad F, Montaseri S. RNA design using simulated SHAPE data. Genes Genet Syst 2018; 92:257-265. [PMID: 28757510 DOI: 10.1266/ggs.16-00067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
It has long been established that in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that can fold to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms to solve this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this new experimental constraint, we report here a new method for accurate design of RNA sequences based on their secondary structures using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAifold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to the above algorithms on RNA-SSD datasets, where they can predict up to 33 appropriate sequences for RNA secondary structures out of 34.
Affiliation(s)
- Mohadeseh Lotfi
- Faculty of Mathematics and Computer Science, Amirkabir University of Technology
| | | | - Soheila Montaseri
- School of Mathematics, Statistics and Computer Science, College of Science, Enghelab Avenue, University of Tehran
| |
|
44
|
Wright PR, Mann M, Backofen R. Structure and Interaction Prediction in Prokaryotic RNA Biology. Microbiol Spectr 2018; 6:10.1128/microbiolspec.rwr-0001-2017. [PMID: 29676245 PMCID: PMC11633574 DOI: 10.1128/microbiolspec.rwr-0001-2017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/27/2017] [Indexed: 01/01/2023] Open
Abstract
Many years of research in RNA biology have soundly established the importance of RNA-based regulation far beyond most early traditional presumptions. Importantly, the advances in "wet" laboratory techniques have produced unprecedented amounts of data that require efficient and precise computational analysis schemes and algorithms. Hence, many in silico methods that attempt topological and functional classification of novel putative RNA-based regulators are available. In this review, we technically outline thermodynamics-based standard RNA secondary structure and RNA-RNA interaction prediction approaches that have proven valuable to the RNA research community in the past and present. For these, we highlight their usability with a special focus on prokaryotic organisms and also briefly mention recent advances in whole-genome interactomics and how this may influence the field of predictive RNA research.
Affiliation(s)
| | | | - Rolf Backofen
- Bioinformatics Group
- Center for Biological Signaling Studies (BIOSS), University of Freiburg, Freiburg, Germany
| |
|
45
|
Montaseri S, Zare-Mirakabad F, Ganjtabesh M. Evaluating the quality of SHAPE data simulated by k-mers for RNA structure prediction. J Bioinform Comput Biol 2017; 15:1750023. [PMID: 29113564 DOI: 10.1142/s0219720017500238] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Finding an effective measure to predict a more accurate RNA secondary structure is a challenging problem. In the last decade, an experimental method, known as selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), was proposed to measure the tendency of forming a base pair for almost all nucleotides in an RNA sequence. These SHAPE reactivities are then utilized to improve the accuracy of RNA structure prediction. Due to a significant impact of SHAPE reactivity and in order to reduce the experimental costs, we propose a new model called HL-k-mer. This model simulates the SHAPE reactivity for each nucleotide in an RNA sequence. This is done by fetching the SHAPE reactivities for all sub-sequences of length k (k-mers) appearing in helix and loop regions. For evaluating the quality of simulated SHAPE data, ESD-Fold method is used based on the SHAPE data simulated by the HL-k-mer model ([Formula: see text]). Also, for further evaluation of simulated SHAPE data, three different methods are employed. We also extend this model to simulate the SHAPE data for the RNA pseudoknotted structure. The results indicate that the average accuracies of prediction using the SHAPE data simulated by our models (for [Formula: see text]) are higher compared to the experimental SHAPE data.
Affiliation(s)
- Soheila Montaseri
- 1 Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran
| | - Fatemeh Zare-Mirakabad
- 2 Department of Computer Science, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
| | - Mohammad Ganjtabesh
- 1 Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran; 3 School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, P.O. Box: 19395-5746, Iran
| |
|
46
|
Tan Z, Sharma G, Mathews DH. Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data. Biophys J 2017; 113:330-338. [PMID: 28735622 DOI: 10.1016/j.bpj.2017.06.039] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/01/2017] [Revised: 06/07/2017] [Accepted: 06/19/2017] [Indexed: 10/19/2022] Open
Abstract
Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated through the incorporation of experimentally obtained structural information, for instance using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data is currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefits not only the prediction for the specific sequence that was mapped but also other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize basepairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided basepairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improves structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package.
Affiliation(s)
- Zhen Tan
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
| | - Gaurav Sharma
- Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Electrical and Computer Engineering, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
| | - David H Mathews
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
| |
|
47
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. Quantitative Biology 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS: We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS: To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
| | | | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
| |
|
48
|
Deng F, Ledda M, Vaziri S, Aviran S. Data-directed RNA secondary structure prediction using probabilistic modeling. RNA (New York, N.Y.) 2016; 22:1109-1119. [PMID: 27251549 PMCID: PMC4931104 DOI: 10.1261/rna.055756.115] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 12/21/2015] [Accepted: 04/26/2016] [Indexed: 06/05/2023]
Abstract
Structure dictates the function of many RNAs, but secondary RNA structure analysis is either labor intensive and costly or relies on computational predictions that are often inaccurate. These limitations are alleviated by integration of structure probing data into prediction algorithms. However, existing algorithms are optimized for a specific type of probing data. Recently, new chemistries combined with advances in sequencing have facilitated structure probing at unprecedented scale and sensitivity. These novel technologies and anticipated wealth of data highlight a need for algorithms that readily accommodate more complex and diverse input sources. We implemented and investigated a recently outlined probabilistic framework for RNA secondary structure prediction and extended it to accommodate further refinement of structural information. This framework utilizes direct likelihood-based calculations of pseudo-energy terms per considered structural context and can readily accommodate diverse data types and complex data dependencies. We use real data in conjunction with simulations to evaluate performances of several implementations and to show that proper integration of structural contexts can lead to improvements. Our tests also reveal discrepancies between real data and simulations, which we show can be alleviated by refined modeling. We then propose statistical preprocessing approaches to standardize data interpretation and integration into such a generic framework. We further systematically quantify the information content of data subsets, demonstrating that high reactivities are major drivers of SHAPE-directed predictions and that better understanding of less informative reactivities is key to further improvements. Finally, we provide evidence for the adaptive capability of our framework using mock probe simulations.
Affiliation(s)
- Fei Deng
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
| | - Mirko Ledda
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
| | - Sana Vaziri
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
| | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
| |
|
49
|
Kutchko KM, Laederach A. Transcending the prediction paradigm: novel applications of SHAPE to RNA function and evolution. Wiley Interdisciplinary Reviews-RNA 2016; 8. [PMID: 27396578 PMCID: PMC5179297 DOI: 10.1002/wrna.1374] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/04/2016] [Revised: 04/29/2016] [Accepted: 05/23/2016] [Indexed: 12/31/2022]
Abstract
Selective 2′‐hydroxyl acylation analyzed by primer extension (SHAPE) provides information on RNA structure at single‐nucleotide resolution. It is most often used in conjunction with RNA secondary structure prediction algorithms as a probabilistic or thermodynamic restraint. With the recent advent of ultra‐high‐throughput approaches for collecting SHAPE data, the applications of this technology are extending beyond structure prediction. In this review, we discuss recent applications of SHAPE data in the transcriptomic context and how this new experimental paradigm is changing our understanding of these experiments and RNA folding in general. SHAPE experiments probe both the secondary and tertiary structure of an RNA, suggesting that model‐free approaches for within and comparative RNA structure analysis can provide significant structural insight without the need for a full structural model. New methods incorporating SHAPE at different nucleotide resolutions are required to parse these transcriptomic data sets to transcend secondary structure modeling with global structural metrics. These ‘multiscale’ approaches provide deeper insights into RNA global structure, evolution, and function in the cell. WIREs RNA 2017, 8:e1374. doi: 10.1002/wrna.1374
Affiliation(s)
- Katrina M Kutchko
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Alain Laederach
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
50
|
Abstract
Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.
|