1
|
Yamamura K, Asai K, Iwakiri J. Consistent features observed in structural probing data of eukaryotic RNAs. NAR Genom Bioinform 2025; 7:lqaf001. [PMID: 39885881 PMCID: PMC11780854 DOI: 10.1093/nargab/lqaf001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 12/25/2024] [Accepted: 01/09/2025] [Indexed: 02/01/2025] Open
Abstract
Understanding RNA structure is crucial for elucidating its regulatory mechanisms. With the recent commercialization of messenger RNA vaccines, the profound impact of RNA structure on stability and translation efficiency has become increasingly evident, underscoring the importance of understanding RNA structure. Chemical probing of RNA has emerged as a powerful technique for investigating RNA structure in living cells. This approach utilizes chemical probes that selectively react with accessible regions of RNA, and by measuring reactivity, the openness and potential of RNA for protein binding or base pairing can be inferred. Extensive experimental data generated using RNA chemical probing have significantly contributed to our understanding of RNA structure in cells. However, it is crucial to acknowledge potential biases in chemical probing data to ensure an accurate interpretation. In this study, we comprehensively analyzed transcriptome-scale RNA chemical probing data in eukaryotes and report common features. Notably, in all experiments, the number of bases modified in probing was small, the bases showing the top 10% reactivity well reflected the known secondary structure, bases with high reactivity were more likely to be exposed to solvent and low reactivity did not reflect solvent exposure, which is important information for the analysis of RNA chemical probing data.
Affiliation(s)
- Kazuteru Yamamura
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
- Kiyoshi Asai
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
- Junichi Iwakiri
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
|
2
|
Tong Y, Childs-Disney JL, Disney MD. Targeting RNA with small molecules, from RNA structures to precision medicines: IUPHAR review: 40. Br J Pharmacol 2024; 181:4152-4173. [PMID: 39224931 DOI: 10.1111/bph.17308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 06/10/2024] [Accepted: 07/09/2024] [Indexed: 09/04/2024] Open
Abstract
RNA plays important roles in regulating both health and disease biology in all kingdoms of life. Notably, RNA can form intricate three-dimensional structures, and their biological functions are dependent on these structures. Targeting the structured regions of RNA with small molecules has gained increasing attention over the past decade, because it provides both chemical probes to study fundamental biology processes and lead medicines for diseases with unmet medical needs. Recent advances in RNA structure prediction and determination and RNA biology have accelerated the rational design and development of RNA-targeted small molecules to modulate disease pathology. However, challenges remain in advancing RNA-targeted small molecules towards clinical applications. This review summarizes strategies to study RNA structures, to identify small molecules recognizing these structures, and to augment the functionality of RNA-binding small molecules. We focus on recent advances in developing RNA-targeted small molecules as potential therapeutics in a variety of diseases, encompassing different modes of actions and targeting strategies. Furthermore, we present the current gaps between early-stage discovery of RNA-binding small molecules and their clinical applications, as well as a roadmap to overcome these challenges in the near future.
Affiliation(s)
- Yuquan Tong
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Jessica L Childs-Disney
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Matthew D Disney
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
|
3
|
von Löhneysen S, Spicher T, Varenyk Y, Yao HT, Lorenz R, Hofacker I, Stadler PF. Phylogenetic and Chemical Probing Information as Soft Constraints in RNA Secondary Structure Prediction. J Comput Biol 2024; 31:549-563. [PMID: 38935442 DOI: 10.1089/cmb.2024.0519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
Extrinsic, experimental information can be incorporated into thermodynamics-based RNA folding algorithms in the form of pseudo-energies. Evolutionary conservation of RNA secondary structure elements is detectable in alignments of phylogenetically related sequences and provides evidence for the presence of certain base pairs that can also be converted into pseudo-energy contributions. We show that the centroid base pairs computed from a consensus folding model such as RNAalifold result in a substantial improvement of the prediction accuracy for single sequences. Evidence for specific base pairs turns out to be more informative than a position-wise profile for the conservation of the pairing status. A comparison with chemical probing data, furthermore, strongly suggests that phylogenetic base pairing data are more informative than position-specific data on (un)pairedness as obtained from chemical probing experiments. In this context we demonstrate, in addition, that the conversion of signal from probing data into pseudo-energies is possible using thermodynamic structure predictions as a reference instead of known RNA structures.
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - UniVie Doctoral School Computer Science (DoCS), University of Vienna, Vienna, Austria
- Yuliia Varenyk
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Hua-Ting Yao
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ivo Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, New Mexico, USA
|
4
|
Greenwood T, Heitsch CE. How Parameters Influence SHAPE-Directed Predictions. Methods Mol Biol 2024; 2726:105-124. [PMID: 38780729 DOI: 10.1007/978-1-0716-3519-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters which define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
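The conversion of probing data into pseudoenergies mentioned in this abstract can be illustrated with a minimal sketch. It assumes the widely used log-linear form, pseudoenergy = m * ln(reactivity + 1) + b, which the abstract itself does not specify. The function name and the two (m, b) pairs are illustrative assumptions, not values taken from this chapter.

```python
import math


def shape_pseudoenergy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free-energy term (kcal/mol) for one nucleotide.

    Assumed log-linear form: m * ln(reactivity + 1) + b.
    The slope m and intercept b are the tunable parameters; the defaults
    here are illustrative assumptions.
    """
    # Missing data is often encoded as None or a negative sentinel;
    # such positions contribute no pseudoenergy in this sketch.
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


# Compare two hypothetical parameter choices on the same reactivities.
for m, b in [(2.6, -0.8), (1.8, -0.6)]:
    energies = [round(shape_pseudoenergy(r, m, b), 2) for r in (0.0, 0.5, 2.0)]
    print(f"m={m}, b={b}: {energies}")
```

Under this form, an unreactive nucleotide receives a bonus equal to b and the penalty grows with reactivity at a rate set by m, so changing either parameter shifts which positions are favored to pair.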
|
5
|
Kosek DM, Banijamali E, Becker W, Petzold K, Andersson ER. Efficient 3'-pairing renders microRNA targeting less sensitive to mRNA seed accessibility. Nucleic Acids Res 2023; 51:11162-11177. [PMID: 37819016 PMCID: PMC10639062 DOI: 10.1093/nar/gkad795] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/21/2022] [Revised: 09/15/2023] [Accepted: 09/19/2023] [Indexed: 10/13/2023] Open
Abstract
MicroRNAs (miRNAs) are short RNAs that post-transcriptionally regulate gene expression by binding to specific sites in mRNAs. Site recognition is primarily mediated by the seed region (nucleotides g2-g8 in the miRNA), but pairing beyond the seed (3'-pairing) is important for some miRNA:target interactions. Here, we use SHAPE, luciferase reporter assays and transcriptomics analyses to study the combined effect of 3'-pairing and secondary structures in mRNAs on repression efficiency. Using the interaction between miR-34a and its SIRT1 binding site as a model, we provide structural and functional evidence that 3'-pairing can compensate for low seed-binding site accessibility, enabling repression of sites that would otherwise be ineffective. We show that miRNA 3'-pairing regions can productively base-pair with nucleotides far upstream of the seed-binding site and that both hairpins and unstructured bulges within the target site are tolerated. We use SHAPE to show that sequences that overcome inaccessible seed-binding sites by strong 3'-pairing adopt the predicted structures and corroborate the model using luciferase assays and high-throughput modelling of 8177 3'-UTR targets for six miRNAs. Finally, we demonstrate that PHB2, a target of miR-141, is an inaccessible target rescued by efficient 3'-pairing. We propose that these results could refine predictions of effective target sites.
Affiliation(s)
- David M Kosek
  - Department of Cell and Molecular Biology, Karolinska Institute, Biomedicum 9B, Solnavägen 9, 17177 Stockholm, Sweden
- Elnaz Banijamali
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, Biomedicum 9B, Solnavägen 9, 17177 Stockholm, Sweden
- Walter Becker
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, Biomedicum 9B, Solnavägen 9, 17177 Stockholm, Sweden
- Katja Petzold
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, Biomedicum 9B, Solnavägen 9, 17177 Stockholm, Sweden
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Biomedical Centre D9:3, Husargatan 3, 752 37 Uppsala, Sweden
- Emma R Andersson
  - Department of Cell and Molecular Biology, Karolinska Institute, Biomedicum 9B, Solnavägen 9, 17177 Stockholm, Sweden
|
6
|
Waldl M, Spicher T, Lorenz R, Beckmann IK, Hofacker IL, Löhneysen SV, Stadler PF. Local RNA folding revisited. J Bioinform Comput Biol 2023; 21:2350016. [PMID: 37522173 DOI: 10.1142/s0219720023500166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Most of the functional RNA elements located within large transcripts are local. Local folding therefore serves as a practically useful approximation to global structure prediction. Due to the sensitivity of RNA secondary structure prediction to the exact definition of sequence ends, accuracy can be increased by averaging local structure predictions over multiple, overlapping sequence windows. These averages can be computed efficiently by dynamic programming. Here we revisit the local folding problem, present a concise mathematical formalization that generalizes previous approaches and show that correct Boltzmann samples can be obtained by local stochastic backtracing in McCaskill's algorithms but not from local folding recursions. Corresponding new features are implemented in the ViennaRNA package to improve the support of local folding. Applications include the computation of maximum expected accuracy structures from RNAplfold data and a mutual information measure to quantify the sensitivity of individual sequence positions.
Affiliation(s)
- Maria Waldl
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Irene K Beckmann
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Sarah von Löhneysen
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Peter F Stadler
  - Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany
|
7
|
Banijamali E, Baronti L, Becker W, Sajkowska-Kozielewicz JJ, Huang T, Palka C, Kosek D, Sweetapple L, Müller J, Stone MD, Andersson ER, Petzold K. RNA:RNA interaction in ternary complexes resolved by chemical probing. RNA (NEW YORK, N.Y.) 2023; 29:317-329. [PMID: 36617673 PMCID: PMC9945442 DOI: 10.1261/rna.079190.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]
Abstract
RNA regulation can be performed by a second targeting RNA molecule, such as in the microRNA regulation mechanism. Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) probes the structure of RNA molecules and can resolve RNA:protein interactions, but RNA:RNA interactions have not yet been addressed with this technique. Here, we apply SHAPE to investigate RNA-mediated binding processes in RNA:RNA and RNA:RNA-RBP complexes. We use RNA:RNA binding by SHAPE (RABS) to investigate microRNA-34a (miR-34a) binding its mRNA target, the silent information regulator 1 (mSIRT1), both with and without the Argonaute protein, constituting the RNA-induced silencing complex (RISC). We show that the seed of the mRNA target must be bound to the microRNA loaded into RISC to enable further binding of the compensatory region by RISC, while the naked miR-34a is able to bind the compensatory region without seed interaction. The method presented here provides complementary structural evidence for the commonly performed luciferase-assay-based evaluation of microRNA binding-site efficiency and specificity on the mRNA target site and could therefore be used in conjunction with it. The method can be applied to any nucleic acid-mediated RNA- or RBP-binding process, such as splicing, antisense RNA binding, or regulation by RISC, providing important insight into the targeted RNA structure.
Affiliation(s)
- Elnaz Banijamali
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Lorenzo Baronti
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Walter Becker
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Ting Huang
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Christina Palka
  - Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, USA
- David Kosek
  - Department of Cell and Molecular Biology, Karolinska Institute, 17177 Stockholm, Sweden
- Lara Sweetapple
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Juliane Müller
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
- Michael D Stone
  - Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, USA
- Emma R Andersson
  - Department of Cell and Molecular Biology, Karolinska Institute, 17177 Stockholm, Sweden
- Katja Petzold
  - Department of Medical Biochemistry and Biophysics, Karolinska Institute, 17177 Stockholm, Sweden
  - Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, Stellenbosch 7600, South Africa
|
8
|
RNA secondary structure packages evaluated and improved by high-throughput experiments. Nat Methods 2022; 19:1234-1242. [PMID: 36192461 PMCID: PMC9839360 DOI: 10.1038/s41592-022-01605-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/29/2020] [Accepted: 08/10/2022] [Indexed: 01/17/2023]
Abstract
Despite the popularity of computer-aided study and design of RNA molecules, little is known about the accuracy of commonly used structure modeling packages in tasks sensitive to ensemble properties of RNA. Here, we demonstrate that the EternaBench dataset, a set of more than 20,000 synthetic RNA constructs designed on the RNA design platform Eterna, provides incisive discriminative power in evaluating current packages in ensemble-oriented structure prediction tasks. We find that CONTRAfold and RNAsoft, packages with parameters derived through statistical learning, achieve consistently higher accuracy than more widely used packages in their standard settings, which derive parameters primarily from thermodynamic experiments. We hypothesized that training a multitask model with the varied data types in EternaBench might improve inference on ensemble-based prediction tasks. Indeed, the resulting model, named EternaFold, demonstrated improved performance that generalizes to diverse external datasets including complete messenger RNAs, viral genomes probed in human cells and synthetic designs modeling mRNA vaccines.
|
9
|
Yang SL, Ponti RD, Wan Y, Huber RG. Computational and Experimental Approaches to Study the RNA Secondary Structures of RNA Viruses. Viruses 2022; 14:1795. [PMID: 36016417 PMCID: PMC9415818 DOI: 10.3390/v14081795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/12/2022] [Accepted: 08/13/2022] [Indexed: 11/16/2022] Open
Abstract
Most pandemics of recent decades can be traced to RNA viruses, including HIV, SARS, influenza, dengue, Zika, and SARS-CoV-2. These RNA viruses impose considerable social and economic burdens on our society, resulting in a high number of deaths and high treatment costs. As these RNA viruses utilize an RNA genome, which is important for different stages of the viral life cycle, including replication, translation, and packaging, studying how the genome folds is important to understand virus function. In this review, we summarize recent advances in computational and high-throughput RNA structure-mapping approaches and their use in understanding structures within RNA virus genomes. In particular, we focus on the genome structures of the dengue, Zika, and SARS-CoV-2 viruses due to recent significant outbreaks of these viruses around the world.
Affiliation(s)
- Siwy Ling Yang
  - Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Riccardo Delli Ponti
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
- Yue Wan
  - Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Roland G. Huber
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
|
10
|
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Affiliation(s)
- Sharon Aviran
  - Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Danny Incarnato
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
|
11
|
Morandi E, van Hemert MJ, Incarnato D. SHAPE-guided RNA structure homology search and motif discovery. Nat Commun 2022; 13:1722. [PMID: 35361788 PMCID: PMC8971488 DOI: 10.1038/s41467-022-29398-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/10/2021] [Accepted: 03/11/2022] [Indexed: 01/13/2023] Open
Abstract
The rapidly growing popularity of RNA structure probing methods is leading to increasingly large amounts of available RNA structure information. This demands the development of efficient tools for the identification of RNAs sharing regions of structural similarity by direct comparison of their reactivity profiles, hence enabling the discovery of conserved structural features. We here introduce SHAPEwarp, a largely sequence-agnostic SHAPE-guided algorithm for the identification of structurally-similar regions in RNA molecules. Analysis of Dengue, Zika and coronavirus genomes recapitulates known regulatory RNA structures and identifies novel highly-conserved structural elements. This work represents a preliminary step towards the model-free search and identification of shared and conserved RNA structural features within transcriptomes.
Affiliation(s)
- Edoardo Morandi
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, The Netherlands
- Martijn J van Hemert
  - Department of Medical Microbiology, Molecular Virology Laboratory, Leiden University Medical Center, Leiden, The Netherlands
- Danny Incarnato
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, The Netherlands
|
12
|
Kis Z. Stability Modelling of mRNA Vaccine Quality Based on Temperature Monitoring throughout the Distribution Chain. Pharmaceutics 2022; 14:430. [PMID: 35214162 PMCID: PMC8877932 DOI: 10.3390/pharmaceutics14020430] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/11/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 11/22/2022] Open
Abstract
The vaccine distribution chains in several low- and middle-income countries are not adequate to facilitate the rapid delivery of high volumes of thermosensitive COVID-19 mRNA vaccines at the required low and ultra-low temperatures. COVID-19 mRNA vaccines are currently distributed along with temperature monitoring devices to track and identify deviations from predefined conditions throughout the distribution chain. These temperature readings can feed into computational models to quantify mRNA vaccine critical quality attributes (CQAs) and the remaining vaccine shelf life more accurately. Here, a kinetic modelling approach is proposed to quantify the stability-related CQAs and the remaining shelf life of mRNA vaccines. The CQA and shelf-life values can be computed based on the conditions under which the vaccines have been distributed from the manufacturing facilities via the distribution network to the vaccination centres. This approach helps to quantify the degree to which temperature excursions impact vaccine quality and can also reduce vaccine wastage. In addition, vaccine stock management can be improved due to the information obtained on the remaining shelf life of mRNA vaccines. This model-based quantification of mRNA vaccine quality and remaining shelf life can improve the deployment of COVID-19 mRNA vaccines to low- and middle-income countries.
Affiliation(s)
- Zoltán Kis
  - Department of Chemical and Biological Engineering, The University of Sheffield, Mappin St., Sheffield S1 3JD, UK
  - The Sargent Centre for Process Systems Engineering, Department of Chemical Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
|
13
|
Tagashira M, Asai K. ConsAlifold: considering RNA structural alignments improves prediction accuracy of RNA consensus secondary structures. Bioinformatics 2022; 38:710-719. [PMID: 34694364 DOI: 10.1093/bioinformatics/btab738] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 08/24/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: By detecting homology among RNAs, the probabilistic consideration of RNA structural alignments has improved the prediction accuracy of significant RNA prediction problems. Predicting an RNA consensus secondary structure from an RNA sequence alignment is a fundamental research objective because in the detection of conserved base-pairings among RNA homologs, predicting an RNA consensus secondary structure is more convenient than predicting an RNA structural alignment. RESULTS: We developed and implemented ConsAlifold, a dynamic programming-based method that predicts the consensus secondary structure of an RNA sequence alignment. ConsAlifold considers RNA structural alignments. ConsAlifold achieves moderate running time and the best prediction accuracy of RNA consensus secondary structures among available prediction methods. AVAILABILITY AND IMPLEMENTATION: ConsAlifold, data and Python scripts for generating both figures and tables are freely available at https://github.com/heartsh/consalifold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Masaki Tagashira
  - Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan
  - Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
- Kiyoshi Asai
  - Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan
  - Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
|
14
|
Zambrano RAI, Hernandez-Perez C, Takahashi MK. RNA Structure Prediction, Analysis, and Design: An Introduction to Web-Based Tools. Methods Mol Biol 2022; 2518:253-269. [PMID: 35666450 DOI: 10.1007/978-1-0716-2421-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Understanding RNA structure has become critical in the study of RNA in their roles as mediators of biological processes. To aid in these studies, computational algorithms that utilize thermodynamics have been developed to predict RNA secondary structure. Due to the importance of intermolecular interactions, the algorithms have been expanded to determine and predict RNA-RNA hybridization. This chapter discusses popular webservers with the tools for RNA secondary structure prediction, RNA-RNA hybridization, and design. We address key features that distinguish common-functioning programs and their purposes for the interests of the user. Ultimately, we hope this review elucidates web-based tools researchers may take advantage of in their investigations of RNA structure and function.
Affiliation(s)
- Melissa K Takahashi
  - Department of Biology, California State University Northridge, Northridge, CA, USA
|
15
|
De Bisschop G, Allouche D, Frezza E, Masquida B, Ponty Y, Will S, Sargueil B. Progress toward SHAPE Constrained Computational Prediction of Tertiary Interactions in RNA Structure. Noncoding RNA 2021; 7:71. [PMID: 34842779 PMCID: PMC8628965 DOI: 10.3390/ncrna7040071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/29/2021] [Revised: 10/29/2021] [Accepted: 11/02/2021] [Indexed: 01/04/2023] Open
Abstract
As more sequencing data accumulate and novel puzzling genetic regulations are discovered, the need for accurate automated modeling of RNA structure increases. RNA structure modeling from chemical probing experiments has made tremendous progress, however accurately predicting large RNA structures is still challenging for several reasons: RNA are inherently flexible and often adopt many energetically similar structures, which are not reliably distinguished by the available, incomplete thermodynamic model. Moreover, computationally, the problem is aggravated by the relevance of pseudoknots and non-canonical base pairs, which are hardly predicted efficiently. To identify nucleotides involved in pseudoknots and non-canonical interactions, we scrutinized the SHAPE reactivity of each nucleotide of the 188 nt long lariat-capping ribozyme under multiple conditions. Reactivities analyzed in the light of the X-ray structure were shown to report accurately the nucleotide status. Those that seemed paradoxical were rationalized by the nucleotide behavior along molecular dynamic simulations. We show that valuable information on intricate interactions can be deduced from probing with different reagents, and in the presence or absence of Mg2+. Furthermore, probing at increasing temperature was remarkably efficient at pointing to non-canonical interactions and pseudoknot pairings. The possibilities of following such strategies to inform structure modeling software are discussed.
Affiliation(s)
- Grégoire De Bisschop
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
  - Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada
- Delphine Allouche
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
  - Institut Necker-Enfants Malades (INEM), Inserm U1151, 156 rue de Vaugirard, CEDEX 15, 75015 Paris, France
- Elisa Frezza
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
- Benoît Masquida
  - Université de Strasbourg, CNRS UMR7156 GMGM, 67084 Strasbourg, France
- Yann Ponty
  - Ecole Polytechnique, CNRS UMR 7161, LIX, 91120 Palaiseau, France; (Y.P.); (S.W.)
- Sebastian Will
  - Ecole Polytechnique, CNRS UMR 7161, LIX, 91120 Palaiseau, France; (Y.P.); (S.W.)
- Bruno Sargueil
  - Université de Paris, CNRS, UMR 8038/CiTCoM, F-75006 Paris, France; (G.D.B.); (D.A.); (E.F.)
|
16
|
Wayment-Steele HK, Kim DS, Choe CA, Nicol JJ, Wellington-Oguri R, Watkins AM, Parra Sperberg RA, Huang PS, Participants E, Das R. Theoretical basis for stabilizing messenger RNA through secondary structure design. Nucleic Acids Res 2021; 49:10604-10617. [PMID: 34520542 PMCID: PMC8499941 DOI: 10.1093/nar/gkab764] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 03/10/2021] [Revised: 08/17/2021] [Accepted: 08/27/2021] [Indexed: 01/08/2023] Open
Abstract
RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Here, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall hydrolysis rate. To characterize the stabilization achievable through structure design, we compare AUP optimization by conventional mRNA design methods to results from more computationally sophisticated algorithms and crowdsourcing through the OpenVaccine challenge on the Eterna platform. We find that rational design on Eterna and the more sophisticated algorithms lead to constructs with low AUP, which we term 'superfolder' mRNAs. These designs exhibit a wide diversity of sequence and structure features that may be desirable for translation, biophysical size, and immunogenicity. Furthermore, their folding is robust to temperature, computer modeling method, choice of flanking untranslated regions, and changes in target protein sequence, as illustrated by rapid redesign of superfolder mRNAs for B.1.351, P.1 and B.1.1.7 variants of the prefusion-stabilized SARS-CoV-2 spike protein. Increases in in vitro mRNA half-life by at least two-fold appear immediately achievable.
MESH Headings
- Algorithms
- Base Pairing
- Base Sequence
- COVID-19/prevention & control
- Humans
- Hydrolysis
- RNA Stability
- RNA, Double-Stranded/chemistry
- RNA, Double-Stranded/genetics
- RNA, Double-Stranded/immunology
- RNA, Messenger/chemistry
- RNA, Messenger/genetics
- RNA, Messenger/immunology
- RNA, Viral/chemistry
- RNA, Viral/genetics
- RNA, Viral/immunology
- SARS-CoV-2/genetics
- SARS-CoV-2/immunology
- Spike Glycoprotein, Coronavirus/genetics
- Spike Glycoprotein, Coronavirus/immunology
- Thermodynamics
Affiliation(s)
- Hannah K Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - Eterna Massive Open Laboratory
- Do Soon Kim
  - Eterna Massive Open Laboratory
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Christian A Choe
  - Eterna Massive Open Laboratory
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Andrew M Watkins
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Po-Ssu Huang
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Rhiju Das
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
  - Department of Physics, Stanford University, Stanford, CA 94305, USA
|
17
|
Cao J, Xue Y. Characteristic chemical probing patterns of loop motifs improve prediction accuracy of RNA secondary structures. Nucleic Acids Res 2021; 49:4294-4307. [PMID: 33849076 PMCID: PMC8096282 DOI: 10.1093/nar/gkab250] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/07/2020] [Revised: 03/24/2021] [Accepted: 04/10/2021] [Indexed: 12/14/2022] Open
Abstract
RNA structures play a fundamental role in nearly every aspect of cellular physiology and pathology. Gaining insights into the functions of RNA molecules requires accurate predictions of RNA secondary structures. However, the existing thermodynamic folding models remain less accurate than desired, even when chemical probing data, such as selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) reactivities, are used as restraints. Unlike most SHAPE-directed algorithms that only consider SHAPE restraints for base pairing, we extract two-dimensional structural features encoded in SHAPE data and establish robust relationships between characteristic SHAPE patterns and loop motifs of various types (hairpin, internal, and bulge) and lengths (2-11 nucleotides). Such characteristic SHAPE patterns are closely related to the sugar pucker conformations of loop residues. Based on these patterns, we propose a computational method, SHAPELoop, which refines the predicted results of the existing methods, thereby further improving their prediction accuracy. In addition, SHAPELoop can provide information about local or global structural rearrangements (including pseudoknots) and help researchers to easily test their hypothesized secondary structures.
Affiliation(s)
- Jingyi Cao
  - School of Life Sciences, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
- Yi Xue
  - School of Life Sciences, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
|
18
|
Rivas E. Evolutionary conservation of RNA sequence and structure. Wiley Interdiscip Rev RNA 2021; 12:e1649. [PMID: 33754485 PMCID: PMC8250186 DOI: 10.1002/wrna.1649] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/17/2020] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 12/22/2022]
Abstract
An RNA structure prediction from a single-sequence RNA folding program is not evidence for an RNA whose structure is important for function. Random sequences have plausible and complex predicted structures not easily distinguishable from those of structural RNAs. How to tell when an RNA has a conserved structure is a question that requires looking at the evolutionary signature left by the conserved RNA. This question is important not just for long noncoding RNAs which usually lack an identified function, but also for RNA binding protein motifs which can be single stranded RNAs or structures. Here we review recent advances using sequence and structural analysis to determine when RNA structure is conserved or not. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test to measure false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, or misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments which increase power but destroy covariation.
This article is categorized under:
- RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry
- RNA Evolution and Genomics > Computational Analyses of RNA
- RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution
Affiliation(s)
- Elena Rivas
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
|
19
|
Královičová J, Borovská I, Pengelly R, Lee E, Abaffy P, Šindelka R, Grutzner F, Vořechovský I. Restriction of an intron size en route to endothermy. Nucleic Acids Res 2021; 49:2460-2487. [PMID: 33550394 PMCID: PMC7969005 DOI: 10.1093/nar/gkab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/11/2020] [Revised: 01/11/2021] [Accepted: 01/15/2021] [Indexed: 11/15/2022] Open
Abstract
Ca2+-insensitive and -sensitive E1 subunits of the 2-oxoglutarate dehydrogenase complex (OGDHC) regulate tissue-specific NADH and ATP supply by mutually exclusive OGDH exons 4a and 4b. Here we show that their splicing is enforced by distant lariat branch points (dBPs) located near the 5' splice site of the intervening intron. dBPs restrict the intron length and prevent transposon insertions, which can introduce or eliminate dBP competitors. The size restriction was imposed by a single dominant dBP in anamniotes that expanded into a conserved constellation of four dBP adenines in amniotes. The amniote clusters exhibit taxon-specific usage of individual dBPs, reflecting accessibility of their extended motifs within a stable RNA hairpin rather than U2 snRNA:dBP base-pairing. The dBP expansion took place in early terrestrial species and was followed by a uridine enrichment of large downstream polypyrimidine tracts in mammals. The dBP-protected megatracts permit reciprocal regulation of exon 4a and 4b by uridine-binding proteins, including TIA-1/TIAR and PUF60, which promote U1 and U2 snRNP recruitment to the 5' splice site and BP, respectively, but do not significantly alter the relative dBP usage. We further show that codons for residues critically contributing to protein binding sites for Ca2+ and other divalent metals confer the exon inclusion order that mirrors the Irving-Williams affinity series, linking the evolution of auxiliary splicing motifs in exons to metallome constraints. Finally, we hypothesize that the dBP-driven selection for Ca2+-dependent ATP provision by E1 facilitated evolution of endothermy by optimizing the aerobic scope in target tissues.
Affiliation(s)
- Jana Královičová
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
  - Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
- Ivana Borovská
  - Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
- Reuben Pengelly
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
- Eunice Lee
  - School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
- Pavel Abaffy
  - Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
- Radek Šindelka
  - Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
- Frank Grutzner
  - School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
- Igor Vořechovský
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
|
20
|
Pietrosanto M, Adinolfi M, Guarracino A, Ferrè F, Ausiello G, Vitale I, Helmer-Citterich M. Relative Information Gain: Shannon entropy-based measure of the relative structural conservation in RNA alignments. NAR Genom Bioinform 2021; 3:lqab007. [PMID: 33615214 PMCID: PMC7884220 DOI: 10.1093/nargab/lqab007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/21/2020] [Revised: 12/18/2020] [Accepted: 01/26/2021] [Indexed: 12/21/2022] Open
Abstract
Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marta Adinolfi
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Andrea Guarracino
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 6, 40126 Bologna, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Ilio Vitale
  - IIGM - Italian Institute for Genomic Medicine, c/o IRCSS Candiolo, 10060 Torino, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
21
|
Calonaci N, Jones A, Cuturello F, Sattler M, Bussi G. Machine learning a model for RNA structure prediction. NAR Genom Bioinform 2021; 2:lqaa090. [PMID: 33575634 PMCID: PMC7671377 DOI: 10.1093/nargab/lqaa090] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/13/2020] [Revised: 10/06/2020] [Accepted: 10/20/2020] [Indexed: 01/04/2023] Open
Abstract
RNA function crucially depends on its structure. Thermodynamic models currently used for secondary structure prediction rely on computing the partition function of folding ensembles, and can thus estimate minimum free-energy structures and ensemble populations. These models sometimes fail in identifying native structures unless complemented by auxiliary experimental data. Here, we build a set of models that combine thermodynamic parameters, chemical probing data (DMS and SHAPE) and co-evolutionary data (direct coupling analysis) through a network that outputs perturbations to the ensemble free energy. Perturbations are trained to increase the ensemble populations of a representative set of known native RNA structures. In the chemical probing nodes of the network, a convolutional window combines neighboring reactivities, enlightening their structural information content and the contribution of local conformational ensembles. Regularization is used to limit overfitting and improve transferability. The most transferable model is selected through a cross-validation strategy that estimates the performance of models on systems on which they are not trained. With the selected model we obtain increased ensemble populations for native structures and more accurate predictions in an independent validation set. The flexibility of the approach allows the model to be easily retrained and adapted to incorporate arbitrary experimental information.
Affiliation(s)
- Nicola Calonaci
  - International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
- Alisha Jones
  - Institute of Structural Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Francesca Cuturello
  - International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
- Michael Sattler
  - Institute of Structural Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Giovanni Bussi
  - International School for Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
|
22
|
|
23
|
Rivas E. RNA structure prediction using positive and negative evolutionary information. PLoS Comput Biol 2020; 16:e1008387. [PMID: 33125376 PMCID: PMC7657543 DOI: 10.1371/journal.pcbi.1008387] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/22/2020] [Revised: 11/11/2020] [Accepted: 09/24/2020] [Indexed: 12/22/2022] Open
Abstract
Knowing the structure of conserved structural RNAs is important to elucidate their function and mechanism of action. However, predicting a conserved RNA structure remains unreliable, even when using a combination of thermodynamic stability and evolutionary covariation information. Here we present a method to predict a conserved RNA structure that combines the following three features. First, it uses significant covariation due to RNA structure and removes spurious covariation due to phylogeny. Second, it uses negative evolutionary information: basepairs that have variation but no significant covariation are prevented from occurring. Lastly, it uses a battery of probabilistic folding algorithms that incorporate all positive covariation into one structure. The method, named CaCoFold (Cascade variation/covariation Constrained Folding algorithm), predicts a nested structure guided by a maximal subset of positive basepairs, and recursively incorporates all remaining positive basepairs into alternative helices. The alternative helices can be compatible with the nested structure such as pseudoknots, or overlapping such as competing structures, base triplets, or other 3D non-antiparallel interactions. We present evidence that CaCoFold predictions are consistent with structures modeled from crystallography. The availability of deeper comparative sequence alignments and recent advances in statistical analysis of RNA sequence covariation have made it possible to identify a reliable set of conserved base pairs, as well as a reliable set of non-basepairs (positions that vary without covarying). Predicting an overall consensus secondary structure consistent with a set of individual inferred pairs and non-pairs remains a problem. 
Current RNA structure prediction algorithms that predict nested secondary structures cannot use the full set of inferred covarying pairs, because covariation analysis also identifies important non-nested pairing interactions such as pseudoknots, base triples, and alternative structures. Moreover, although algorithms for incorporating negative constraints exist, negative information from covariation analysis (inferred non-pairs) has not been systematically exploited. Here I introduce an efficient approximate RNA structure prediction algorithm that incorporates all inferred pairs and excludes all non-pairs. Using this, and an improved visualization tool, I show that the method correctly identifies many non-nested structures in agreement with known crystal structures, and improves many curated consensus secondary structure annotations in RNA sequence alignment databases.
Affiliation(s)
- Elena Rivas
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
|
24
|
Li B, Cao Y, Westhof E, Miao Z. Advances in RNA 3D Structure Modeling Using Experimental Data. Front Genet 2020; 11:574485. [PMID: 33193680 PMCID: PMC7649352 DOI: 10.3389/fgene.2020.574485] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 06/19/2020] [Accepted: 09/02/2020] [Indexed: 12/26/2022] Open
Abstract
RNA is a unique bio-macromolecule that can both record genetic information and perform biological functions in a variety of molecular processes, including transcription, splicing, translation, and even regulating protein function. RNAs adopt specific three-dimensional conformations to enable their functions. Experimental determination of high-resolution RNA structures using x-ray crystallography is both laborious and demands expertise, thus, hindering our comprehension of RNA structural biology. The computational modeling of RNA structure was a milestone in the birth of bioinformatics. Although computational modeling has been greatly improved over the last decade showing many successful cases, the accuracy of such computational modeling is not only length-dependent but also varies according to the complexity of the structure. To increase credibility, various experimental data were integrated into computational modeling. In this review, we summarize the experiments that can be integrated into RNA structure modeling as well as the computational methods based on these experimental data. We also demonstrate how computational modeling can help the experimental determination of RNA structure. We highlight the recent advances in computational modeling which can offer reliable structure models using high-throughput experimental data.
Affiliation(s)
- Bing Li
  - Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Yang Cao
  - Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Eric Westhof
  - Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
- Zhichao Miao
  - Translational Research Institute of Brain and Brain-Like Intelligence, Department of Anesthesiology, Shanghai Fourth People’s Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
  - Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
|
25
|
Greenwood T, Heitsch CE. On the Problem of Reconstructing a Mixture of RNA Structures. Bull Math Biol 2020; 82:133. [PMID: 33029669 DOI: 10.1007/s11538-020-00804-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/23/2020] [Accepted: 09/08/2020] [Indexed: 01/02/2023]
Abstract
A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data-and not the sequence-which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov-Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.
|
26
|
Saaidi A, Allouche D, Regnier M, Sargueil B, Ponty Y. IPANEMAP: integrative probing analysis of nucleic acids empowered by multiple accessibility profiles. Nucleic Acids Res 2020; 48:8276-8289. [PMID: 32735675 PMCID: PMC7470984 DOI: 10.1093/nar/gkaa607] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/16/2019] [Revised: 07/03/2020] [Accepted: 07/29/2020] [Indexed: 11/13/2022] Open
Abstract
The manual production of reliable RNA structure models from chemical probing experiments benefits from the integration of information derived from multiple protocols and reagents. However, the interpretation of multiple probing profiles remains a complex task, hindering the quality and reproducibility of modeling efforts. We introduce IPANEMAP, the first automated method for the modeling of RNA structure from multiple probing reactivity profiles. Input profiles can result from experiments based on diverse protocols, reagents, or collection of variants, and are jointly analyzed to predict the dominant conformations of an RNA. IPANEMAP combines sampling, clustering and multi-optimization, to produce secondary structure models that are both stable and well-supported by experimental evidences. The analysis of multiple reactivity profiles, both publicly available and produced in our study, demonstrates the good performances of IPANEMAP, even in a mono probing setting. It confirms the potential of integrating multiple sources of probing data, informing the design of informative probing assays.
Affiliation(s)
- Afaf Saaidi
  - CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
- Delphine Allouche
  - CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
- Mireille Regnier
  - CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
- Bruno Sargueil
  - CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
- Yann Ponty
  - CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
|
27
|
Li TJX, Reidys CM. On an enhancement of RNA probing data using information theory. Algorithms Mol Biol 2020; 15:15. [PMID: 32782456 PMCID: PMC7413225 DOI: 10.1186/s13015-020-00176-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/16/2019] [Accepted: 07/31/2020] [Indexed: 12/21/2022] Open
Abstract
Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via chemical probing data. We employ an information-theoretic approach to solve the problem, via considering a variant of the Rényi-Ulam game. Our framework is centered around the ensemble tree, a hierarchical bi-partition of the input ensemble, that is constructed by recursively querying about whether or not a base pair of maximum information entropy is contained in the target. These queries are answered via relating local with global probing data, employing the modularity in RNA secondary structures. We present that leaves of the tree are comprised of sub-samples exhibiting a distinguished structure with high probability. In particular, for a Boltzmann ensemble incorporating probing data, which is well established in the literature, the probability of our framework correctly identifying the target in the leaf is greater than 90%.
|
28
|
Twittenhoff C, Brandenburg VB, Righetti F, Nuss AM, Mosig A, Dersch P, Narberhaus F. Lead-seq: transcriptome-wide structure probing in vivo using lead(II) ions. Nucleic Acids Res 2020; 48:e71. [PMID: 32463449 PMCID: PMC7337928 DOI: 10.1093/nar/gkaa404] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/22/2019] [Revised: 04/08/2020] [Accepted: 05/06/2020] [Indexed: 12/24/2022] Open
Abstract
The dynamic conformation of RNA molecules within living cells is key to their function. Recent advances in probing the RNA structurome in vivo, including the use of SHAPE (Selective 2'-Hydroxyl Acylation analyzed by Primer Extension) or kethoxal reagents or DMS (dimethyl sulfate), provided unprecedented insights into the architecture of RNA molecules in the living cell. Here, we report the establishment of lead probing in a global RNA structuromics approach. In order to elucidate the transcriptome-wide RNA landscape in the enteric pathogen Yersinia pseudotuberculosis, we combined lead(II) acetate-mediated cleavage of single-stranded RNA regions with high-throughput sequencing. This new approach, termed 'Lead-seq', provides structural information independent of base identity. We show that the method recapitulates secondary structures of tRNAs, RNase P RNA, tmRNA, 16S rRNA and the rpsT 5'-untranslated region, and that it reveals global structural features of mRNAs. The application of Lead-seq to Y. pseudotuberculosis cells grown at two different temperatures unveiled the first temperature-responsive in vivo RNA structurome of a bacterial pathogen. The translation of candidate genes derived from this approach was confirmed to be temperature regulated. Overall, this study establishes Lead-seq as complementary approach to interrogate intracellular RNA structures on a global scale.
Affiliation(s)
- Aaron M Nuss
- Department of Molecular Infection Biology, Helmholtz Centre for Infection Research, 381214 Braunschweig, Germany
- Axel Mosig
- Department of Biophysics, Ruhr University Bochum, 44780 Bochum, Germany
- Petra Dersch
- Department of Molecular Infection Biology, Helmholtz Centre for Infection Research, 381214 Braunschweig, Germany
- Institute of Infectiology, Center for Molecular Biology of Inflammation, University of Münster, 48149 Münster, Germany
- Franz Narberhaus
- Microbial Biology, Ruhr University Bochum, 44780 Bochum, Germany
29
Kuksa PP, Li F, Kannan S, Gregory BD, Leung YY, Wang LS. HiPR: High-throughput probabilistic RNA structure inference. Comput Struct Biotechnol J 2020; 18:1539-1547. [PMID: 32637050 PMCID: PMC7327253 DOI: 10.1016/j.csbj.2020.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2020] [Revised: 05/15/2020] [Accepted: 06/01/2020] [Indexed: 11/20/2022] Open
Abstract
Recent high-throughput structure-sensitive genome-wide sequencing-based assays have enabled large-scale studies of RNA structure, and robust transcriptome-wide computational prediction of individual RNA structures across RNA classes from these assays has potential to further improve the prediction accuracy. Here, we describe HiPR, a novel method for RNA structure prediction at single-nucleotide resolution that combines high-throughput structure probing data (DMS-seq, DMS-MaPseq) with a novel probabilistic folding algorithm. On validation data spanning a variety of RNA classes, HiPR often increases accuracy for predicting RNA structures, giving researchers new tools to study RNA structure.
Affiliation(s)
- Pavel P. Kuksa
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Fan Li
- Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA
- Sampath Kannan
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Brian D. Gregory
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yuk Yee Leung
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Li-San Wang
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
30
Willmott D, Murrugarra D, Ye Q. Improving RNA secondary structure prediction via state inference with deep recurrent neural networks. Computational and Mathematical Biophysics 2020. [DOI: 10.1515/cmb-2020-0002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/24/2022] Open
Abstract
The problem of determining which nucleotides of an RNA sequence are paired or unpaired in the secondary structure of an RNA, which we call RNA state inference, can be studied by different machine learning techniques. Successful state inference of RNA sequences can be used to generate auxiliary information for data-directed RNA secondary structure prediction. Typical tools for state inference, such as hidden Markov models, exhibit poor performance in RNA state inference, owing in part to their inability to recognize nonlocal dependencies. Bidirectional long short-term memory (LSTM) neural networks have emerged as a powerful tool that can model global nonlinear sequence dependencies and have achieved state-of-the-art performances on many different classification problems.
This paper presents a practical approach to RNA secondary structure inference centered around a deep learning method for state inference. State predictions from a deep bidirectional LSTM are used to generate synthetic SHAPE data that can be incorporated into RNA secondary structure prediction via the Nearest Neighbor Thermodynamic Model (NNTM). This method produces predicted secondary structures for a diverse test set of 16S ribosomal RNA that are, on average, 25 percentage points more accurate than undirected MFE structures. Accuracy is highly dependent on the success of our state inference method, and investigating the global features of our state predictions reveals that accuracy of both our state inference and structure inference methods are highly dependent on the similarity of pairing patterns of the sequence to the training dataset. Availability of a large training dataset is critical to the success of this approach. Code available at https://github.com/dwillmott/rna-state-inf.
Affiliation(s)
- David Murrugarra
- Department of Mathematics, University of Kentucky, Lexington, KY 40506-0027 USA
- Qiang Ye
- Department of Mathematics, University of Kentucky, Lexington, KY 40506-0027 USA
31
Karner H, Webb CH, Carmona S, Liu Y, Lin B, Erhard M, Chan D, Baldi P, Spitale RC, Sun S. Functional Conservation of LncRNA JPX Despite Sequence and Structural Divergence. J Mol Biol 2019; 432:283-300. [PMID: 31518612 DOI: 10.1016/j.jmb.2019.09.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/05/2019] [Revised: 08/29/2019] [Accepted: 09/02/2019] [Indexed: 02/02/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been identified in all eukaryotes and are most abundant in the human genome. However, the functional importance and mechanisms of action for human lncRNAs are largely unknown. Using comparative sequence, structural, and functional analyses, we characterize the evolution and molecular function of human lncRNA JPX. We find that human JPX and its mouse homolog, lncRNA Jpx, have deep divergence in their nucleotide sequences and RNA secondary structures. Despite such differences, both lncRNAs demonstrate robust binding to CTCF, a protein that is central to Jpx's role in X chromosome inactivation. In addition, our functional rescue experiment using Jpx-deletion mutant cells shows that human JPX can functionally complement the loss of Jpx in mouse embryonic stem cells. Our findings support a model for functional conservation of lncRNAs independent from sequence and structural divergence. This study provides mechanistic insight into the evolution of lncRNA function.
Affiliation(s)
- Heather Karner
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
- Chiu-Ho Webb
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
- Sarah Carmona
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
- Yu Liu
- Department of Computer Science, Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, CA 92697, USA
- Benjamin Lin
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
- Micaela Erhard
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA
- Dalen Chan
- Department of Pharmaceutical Sciences, College of Health Sciences, University of California Irvine, Irvine, CA 92697, USA
- Pierre Baldi
- Department of Computer Science, Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, CA 92697, USA
- Robert C Spitale
- Department of Pharmaceutical Sciences, College of Health Sciences, University of California Irvine, Irvine, CA 92697, USA
- Sha Sun
- Department of Developmental and Cell Biology, School of Biological Sciences, University of California Irvine, Irvine, CA 92697, USA.
32
Katz N, Cohen R, Solomon O, Kaufmann B, Atar O, Yakhini Z, Goldberg S, Amit R. Synthetic 5' UTRs Can Either Up- or Downregulate Expression upon RNA-Binding Protein Binding. Cell Syst 2019; 9:93-106.e8. [PMID: 31129060 DOI: 10.1016/j.cels.2019.04.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/21/2018] [Revised: 02/07/2019] [Accepted: 04/26/2019] [Indexed: 01/08/2023]
Abstract
The construction of complex gene-regulatory networks requires both inhibitory and upregulatory modules. However, the vast majority of RNA-based regulatory "parts" are inhibitory. Using a synthetic biology approach combined with SHAPE-seq, we explored the regulatory effect of RNA-binding protein (RBP)-RNA interactions in bacterial 5' UTRs. By positioning a library of RNA hairpins upstream of a reporter gene and co-expressing them with the matching RBP, we observed a set of regulatory responses, including translational stimulation, translational repression, and cooperative behavior. Our combined approach revealed three distinct states in vivo: in the absence of RBPs, the RNA molecules can be found in either a molten state that is amenable to translation or a structured phase that inhibits translation. In the presence of RBPs, the RNA molecules are in a semi-structured phase with partial translational capacity. Our work provides new insight into RBP-based regulation and a blueprint for designing complete gene-regulatory circuits at the post-transcriptional level.
Affiliation(s)
- Noa Katz
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
- Roni Cohen
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
- Oz Solomon
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel; School of Computer Science, Interdisciplinary Center, 46150 Herzeliya, Israel
- Beate Kaufmann
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
- Orna Atar
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
- Zohar Yakhini
- Department of Computer Science, Technion - Israel Institute of Technology, 32000 Haifa, Israel; School of Computer Science, Interdisciplinary Center, 46150 Herzeliya, Israel
- Sarah Goldberg
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel
- Roee Amit
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, 32000 Haifa, Israel; Russell Berrie Nanotechnology Institute, Technion - Israel Institute of Technology, 32000 Haifa, Israel.
33
Abstract
RNA performs and regulates a diverse range of cellular processes, with new functional roles being uncovered at a rapid pace. Interest is growing in how these functions are linked to RNA structures that form in the complex cellular environment. A growing suite of technologies that use advances in RNA structural probes, high-throughput sequencing and new computational approaches to interrogate RNA structure at unprecedented throughput are beginning to provide insights into RNA structures at new spatial, temporal and cellular scales.
Affiliation(s)
- Eric J Strobel
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Angela M Yu
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Julius B Lucks
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
34
Spasic A, Assmann SM, Bevilacqua PC, Mathews DH. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Res 2019; 46:314-323. [PMID: 29177466 PMCID: PMC5758915 DOI: 10.1093/nar/gkx1057] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 05/10/2017] [Accepted: 10/30/2017] [Indexed: 12/22/2022] Open
Abstract
RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight about RNA function. The accuracy of structure prediction is known to be improved using experimental mapping data that provide information about the pairing status of single nucleotides, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data focused on predicting structures for sequences assuming that they populate a single structure. Most RNAs populate multiple structures, however, where the ensemble of strands populates structures with different sets of canonical base pairs. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Philip C Bevilacqua
- Department of Chemistry, Department of Biochemistry & Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
35
Frezza E, Courban A, Allouche D, Sargueil B, Pasquali S. The interplay between molecular flexibility and RNA chemical probing reactivities analyzed at the nucleotide level via an extensive molecular dynamics study. Methods 2019; 162-163:108-127. [PMID: 31145972 DOI: 10.1016/j.ymeth.2019.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/30/2018] [Revised: 05/22/2019] [Accepted: 05/22/2019] [Indexed: 12/20/2022] Open
Abstract
Determination of the tridimensional structure of ribonucleic acid molecules is fundamental for understanding their function in the cell. A common method to investigate RNA structures of large molecules is the use of chemical probes such as SHAPE (2'-hydroxyl acylation analyzed by primer extension) reagents, DMS (dimethyl sulfate) and CMCT (1-cyclohexyl-3-(2-morpholinoethyl) carbodiimide metho-p-toluene sulfate), the reaction of which is dependent on the local structural properties of each nucleotide. In order to understand the interplay between local flexibility, sugar pucker, canonical pairing and chemical reactivity of the probes, we performed all-atom molecular dynamics simulations on a set of RNA molecules for which both tridimensional structure and chemical probing data are available and we analyzed the correlations between geometrical parameters and the chemical reactivity. Our study confirms that SHAPE reactivity is guided by the local flexibility of the different chemical moieties but suggests that a combination of multiple parameters is needed to better understand the implications of the reactivity at the molecular level. This is also the case for DMS and CMCT for which the reactivity appears to be more complex than commonly accepted.
Affiliation(s)
- Elisa Frezza
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
- Antoine Courban
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France
- Delphine Allouche
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France
- Bruno Sargueil
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
- Samuela Pasquali
- Faculté de Pharmacie de Paris, Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 - CNRS, Université Paris Descartes, 4 Avenue de l'Observatoire 75270 PARIS CEDEX 06, France.
36
Special Issue: Computational Analysis of RNA Structure and Function. Genes (Basel) 2019; 10:genes10010055. [PMID: 30654585 PMCID: PMC6357010 DOI: 10.3390/genes10010055] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/04/2019] [Accepted: 01/08/2019] [Indexed: 01/18/2023] Open
Abstract
RNA structure often plays a key role in determining the function of non-coding and coding transcripts [...].
37
Andrews RJ, Roche J, Moss WN. ScanFold: an approach for genome-wide discovery of local RNA structural elements-applications to Zika virus and HIV. PeerJ 2018; 6:e6136. [PMID: 30627482 PMCID: PMC6317755 DOI: 10.7717/peerj.6136] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 08/08/2018] [Accepted: 11/15/2018] [Indexed: 12/24/2022] Open
Abstract
In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, genome replication. Methods for the identification of functional RNA structures in genomes typically rely on scanning analysis windows, where multiple partially-overlapping windows are used to predict RNA structures and folding metrics to deduce regions likely to form functional structure. Separate structural models are produced for each window, where the step size can greatly affect the returned model. This makes deducing unique local structures challenging, as the same nucleotides in each window can be alternatively base paired. We are presenting here a new approach where all base pairs from analysis windows are considered and weighted by favorable folding. This results in unique base pairing throughout the genome and the generation of local regions/structures that can be ranked by their propensity to form unusually thermodynamically stable folds. We applied this approach to the Zika virus (ZIKV) and HIV-1 genomes. ZIKV is linked to a variety of neurological ailments including microcephaly and Guillain-Barré syndrome and its (+)-sense RNA genome encodes two, previously described, functionally essential structured RNA regions. HIV, the cause of AIDS, contains multiple functional RNA motifs in its genome, which have been extensively studied. Our approach is able to successfully identify and model the structures of known functional motifs in both viruses, while also finding additional regions likely to form functional structures. All data have been archived at the RNAStructuromeDB (www.structurome.bb.iastate.edu), a repository of RNA folding data for humans and their pathogens.
Affiliation(s)
- Ryan J. Andrews
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
- Julien Roche
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
- Walter N. Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, USA
38
Genome-wide probing RNA structure with the modified DMS-MaPseq in Arabidopsis. Methods 2018; 155:30-40. [PMID: 30503825 DOI: 10.1016/j.ymeth.2018.11.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/01/2018] [Revised: 11/20/2018] [Accepted: 11/27/2018] [Indexed: 11/20/2022] Open
Abstract
Transcripts have intrinsic propensity to form stable secondary structure that is fundamental to regulate RNA transcription, splicing, translation, RNA localization and turnover. Numerous methods that integrate chemical reactions with next-generation sequencing (NGS) have been applied to study in vivo RNA structure, providing new insights into RNA biology. Dimethyl sulfate (DMS) probing coupled with mutational profiling through NGS (DMS-MaPseq) is a newly developed method for revealing genome-wide or target-specific RNA structure. Herein, we present our experimental protocol of a modified DMS-MaPseq method for plant materials. The DMS treatment condition was optimized, and library preparation procedures were simplified. We also provided custom scripts for bioinformatic analysis of genome-wide DMS-MaPseq data. Bioinformatic results showed that our method could generate high-quality and reproducible data. Further, we assessed sequencing depth and coverage for genome-wide RNA structure profiling in Arabidopsis, and provided two examples of in vivo structure of mobile RNAs. We hope that our modified DMS-MaPseq method will serve as a powerful tool for analyzing in vivo RNA structurome in plants.
39
Mailler E, Paillart JC, Marquet R, Smyth RP, Vivet-Boudou V. The evolution of RNA structural probing methods: From gels to next-generation sequencing. Wiley Interdisciplinary Reviews: RNA 2018; 10:e1518. [PMID: 30485688 DOI: 10.1002/wrna.1518] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 06/04/2018] [Revised: 09/13/2018] [Accepted: 10/17/2018] [Indexed: 01/09/2023]
Abstract
RNA molecules are important players in all domains of life and the study of the relationship between their multiple flexible states and the associated biological roles has increased in recent years. For several decades, chemical and enzymatic structural probing experiments have been used to determine RNA structure. During this time, there has been a steady improvement in probing reagents and experimental methods, and today the structural biologist community has a large range of tools at its disposal to probe the secondary structure of RNAs in vitro and in cells. Early experiments used radioactive labeling and polyacrylamide gel electrophoresis as read-out methods. This was superseded by capillary electrophoresis, and more recently by next-generation sequencing. Today, powerful structural probing methods can characterize RNA structure on a genome-wide scale. In this review, we will provide an overview of RNA structural probing methodologies from a historical and technical perspective. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics, and Chemistry RNA Methods > RNA Analyses in vitro and In Silico RNA Methods > RNA Analyses in Cells.
Affiliation(s)
- Elodie Mailler
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, Strasbourg, France
- Roland Marquet
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, Strasbourg, France
- Redmond P Smyth
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, Strasbourg, France
- Valerie Vivet-Boudou
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, Strasbourg, France
40
A Functional riboSNitch in the 3' Untranslated Region of FKBP5 Alters MicroRNA-320a Binding Efficiency and Mediates Vulnerability to Chronic Post-Traumatic Pain. J Neurosci 2018; 38:8407-8420. [PMID: 30150364 DOI: 10.1523/jneurosci.3458-17.2018] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 12/06/2017] [Revised: 07/11/2018] [Accepted: 07/13/2018] [Indexed: 01/30/2023] Open
Abstract
Previous studies have shown that common variants of the gene coding for FK506-binding protein 51 (FKBP5), a critical regulator of glucocorticoid sensitivity, affect vulnerability to stress-related disorders. In a previous report, FKBP5 rs1360780 was identified as a functional variant because of its effect on gene methylation. Here we report evidence for a novel functional FKBP5 allele, rs3800373. This study assessed the association between rs3800373 and post-traumatic chronic pain in 1607 women and men from two ethnically diverse human cohorts. The molecular mechanism through which rs3800373 affects adverse outcomes was established via in silico, in vivo, and in vitro analyses. The rs3800373 minor allele predicted worse adverse outcomes after trauma exposure, such that individuals with the minor (risk) allele developed more severe post-traumatic chronic musculoskeletal pain. Among these individuals, peritraumatic circulating FKBP5 expression levels increased as cortisol and glucocorticoid receptor (NR3C1) mRNA levels increased, consistent with increased glucocorticoid resistance. Bioinformatic, in vitro, and mutational analyses indicate that the rs3800373 minor allele reduces the binding of a stress- and pain-associated microRNA, miR-320a, to FKBP5 via altering the FKBP5 mRNA 3'UTR secondary structure (i.e., is a riboSNitch). This results in relatively greater FKBP5 translation, unchecked by miR-320a. Overall, these results identify an important gene-miRNA interaction influencing chronic pain risk in vulnerable individuals and suggest that exogenous methods to achieve targeted reduction in poststress FKBP5 mRNA expression may constitute useful therapeutic strategies.
SIGNIFICANCE STATEMENT: FKBP5 is a critical regulator of the stress response. Previous studies have shown that dysregulation of the expression of this gene plays a role in the pathogenesis of chronic pain development as well as a number of comorbid neuropsychiatric disorders. In the current study, we identified a functional allele (rs3800373) in the 3'UTR of FKBP5 that influences vulnerability to chronic post-traumatic pain in two ethnic cohorts. Using multiple complementary experimental approaches, we show that the FKBP5 rs3800373 minor allele alters the secondary structure of FKBP5 mRNA, decreasing the binding of a stress- and pain-associated microRNA, miR-320a. This results in relatively greater FKBP5 translation, unchecked by miR-320a, increasing glucocorticoid resistance and increasing vulnerability to post-traumatic pain.
41
Wright PR, Mann M, Backofen R. Structure and Interaction Prediction in Prokaryotic RNA Biology. Microbiol Spectr 2018; 6:10.1128/microbiolspec.rwr-0001-2017. [PMID: 29676245 PMCID: PMC11633574 DOI: 10.1128/microbiolspec.rwr-0001-2017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/27/2017] [Indexed: 01/01/2023] Open
Abstract
Many years of research in RNA biology have soundly established the importance of RNA-based regulation far beyond most early traditional presumptions. Importantly, the advances in "wet" laboratory techniques have produced unprecedented amounts of data that require efficient and precise computational analysis schemes and algorithms. Hence, many in silico methods that attempt topological and functional classification of novel putative RNA-based regulators are available. In this review, we technically outline thermodynamics-based standard RNA secondary structure and RNA-RNA interaction prediction approaches that have proven valuable to the RNA research community in the past and present. For these, we highlight their usability with a special focus on prokaryotic organisms and also briefly mention recent advances in whole-genome interactomics and how this may influence the field of predictive RNA research.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group
- Center for Biological Signaling Studies (BIOSS), University of Freiburg, Freiburg, Germany
42
Abdelsayed MM, Ho BT, Vu MMK, Polanco J, Spitale RC, Lupták A. Multiplex Aptamer Discovery through Apta-Seq and Its Application to ATP Aptamers Derived from Human-Genomic SELEX. ACS Chem Biol 2017; 12:2149-2156. [PMID: 28661647 DOI: 10.1021/acschembio.7b00001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/29/2023]
Abstract
Laboratory-evolved RNAs bind a wide variety of targets and serve highly diverse functions, including as diagnostic and therapeutic aptamers. The majority of aptamers have been identified using in vitro selection (SELEX), a molecular evolution technique based on selecting target-binding RNAs from highly diverse pools through serial rounds of enrichment and amplification. In vitro selection typically yields multiple distinct motifs of highly variable abundance and target-binding affinities. The discovery of new aptamers is often limited by the difficulty of characterizing the selected motifs, because testing of individual sequences tends to be a tedious process. To facilitate the discovery of new aptamers within in vitro selected pools, we developed Apta-Seq, a multiplex analysis based on quantitative, ligand-dependent 2' acylation of solvent-accessible regions of the selected RNA pools, followed by reverse transcription (SHAPE) and deep sequencing. The method reveals, in a single sequencing experiment, the identity, structural features, and target dissociation constants for aptamers present in the selected pool. Application of Apta-Seq to a human genomic pool enriched for ATP-binding RNAs yielded three new aptamers, which together with previously identified human aptamers suggest that ligand-binding RNAs may be common in mammals.
Affiliation(s)
- Michael M. Abdelsayed
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, United States
- Bao T. Ho
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Michael M. K. Vu
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Julio Polanco
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, United States
- Robert C. Spitale
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Andrej Lupták
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
43
Tan Z, Sharma G, Mathews DH. Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data. Biophys J 2017; 113:330-338. [PMID: 28735622 DOI: 10.1016/j.bpj.2017.06.039] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/01/2017] [Revised: 06/07/2017] [Accepted: 06/19/2017] [Indexed: 10/19/2022] Open
Abstract
Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated through the incorporation of experimentally obtained structural information, for instance using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data are currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefit not only the prediction for the specific sequence that was mapped but also other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize basepairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided basepairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improve structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package.
Affiliation(s)
- Zhen Tan
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Gaurav Sharma
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Electrical and Computer Engineering, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York
- David H Mathews
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York
|
44
|
Abstract
The discoveries of myriad non-coding RNA molecules, each transiting through multiple flexible states in cells or virions, present major challenges for structure determination. Advances in high-throughput chemical mapping give new routes for characterizing entire transcriptomes in vivo, but the resulting one-dimensional data generally remain too information-poor to allow accurate de novo structure determination. Multidimensional chemical mapping (MCM) methods seek to address this challenge. Mutate-and-map (M2), RNA interaction groups by mutational profiling (RING-MaP and MaP-2D analysis) and multiplexed •OH cleavage analysis (MOHCA) measure how the chemical reactivities of every nucleotide in an RNA molecule change in response to modifications at every other nucleotide. A growing body of in vitro blind tests and compensatory mutation/rescue experiments indicate that MCM methods give consistently accurate secondary structures and global tertiary structures for ribozymes, ribosomal domains and ligand-bound riboswitch aptamers up to 200 nucleotides in length. Importantly, MCM analyses provide detailed information on structurally heterogeneous RNA states, such as ligand-free riboswitches that are functionally important but difficult to resolve with other approaches. The sequencing requirements of currently available MCM protocols scale at least quadratically with RNA length, precluding general application to transcriptomes or viral genomes at present. We propose a modify-cross-link-map (MXM) expansion to overcome this and other current limitations to resolving the in vivo 'RNA structurome'.
|
45
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. Quant Biol 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS: We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS: To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
|
46
|
Deng F, Ledda M, Vaziri S, Aviran S. Data-directed RNA secondary structure prediction using probabilistic modeling. RNA 2016; 22:1109-1119. [PMID: 27251549 PMCID: PMC4931104 DOI: 10.1261/rna.055756.115] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 12/21/2015] [Accepted: 04/26/2016] [Indexed: 06/05/2023]
Abstract
Structure dictates the function of many RNAs, but secondary RNA structure analysis is either labor intensive and costly or relies on computational predictions that are often inaccurate. These limitations are alleviated by integration of structure probing data into prediction algorithms. However, existing algorithms are optimized for a specific type of probing data. Recently, new chemistries combined with advances in sequencing have facilitated structure probing at unprecedented scale and sensitivity. These novel technologies and anticipated wealth of data highlight a need for algorithms that readily accommodate more complex and diverse input sources. We implemented and investigated a recently outlined probabilistic framework for RNA secondary structure prediction and extended it to accommodate further refinement of structural information. This framework utilizes direct likelihood-based calculations of pseudo-energy terms per considered structural context and can readily accommodate diverse data types and complex data dependencies. We use real data in conjunction with simulations to evaluate performances of several implementations and to show that proper integration of structural contexts can lead to improvements. Our tests also reveal discrepancies between real data and simulations, which we show can be alleviated by refined modeling. We then propose statistical preprocessing approaches to standardize data interpretation and integration into such a generic framework. We further systematically quantify the information content of data subsets, demonstrating that high reactivities are major drivers of SHAPE-directed predictions and that better understanding of less informative reactivities is key to further improvements. Finally, we provide evidence for the adaptive capability of our framework using mock probe simulations.
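The likelihood-based pseudo-energy idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exponential reactivity distributions, their mean values, and the names `exp_pdf` and `pseudo_energy` are assumptions chosen only for illustration. The sketch turns a reactivity into a pseudo-energy as the negative log-likelihood ratio of the paired versus unpaired context, scaled by RT.

```python
import math

# RT in kcal/mol at 37 degrees C (gas constant 1.987e-3 kcal/(mol K) times 310.15 K).
RT = 1.987e-3 * 310.15


def exp_pdf(x, mean):
    """Density of an exponential distribution with the given mean (assumed model)."""
    return math.exp(-x / mean) / mean


def pseudo_energy(reactivity, mean_paired=0.2, mean_unpaired=0.8):
    """Pseudo-energy (kcal/mol) for treating a nucleotide as paired.

    Negative values favor pairing, positive values penalize it.
    The two means are illustrative placeholders, not fitted parameters.
    """
    ratio = exp_pdf(reactivity, mean_paired) / exp_pdf(reactivity, mean_unpaired)
    return -RT * math.log(ratio)


if __name__ == "__main__":
    for r in (0.0, 0.3, 1.0, 2.0):
        print(f"reactivity={r:.1f}  pseudo-energy={pseudo_energy(r):+.2f} kcal/mol")
```

Under these assumed distributions, low reactivities yield a small pairing bonus while high reactivities yield a steep penalty, which is in line with the abstract's observation that high reactivities drive data-directed predictions.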
Affiliation(s)
- Fei Deng
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Mirko Ledda
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Sana Vaziri
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
|
47
|
Kutchko KM, Laederach A. Transcending the prediction paradigm: novel applications of SHAPE to RNA function and evolution. Wiley Interdiscip Rev RNA 2016; 8. [PMID: 27396578 PMCID: PMC5179297 DOI: 10.1002/wrna.1374] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/04/2016] [Revised: 04/29/2016] [Accepted: 05/23/2016] [Indexed: 12/31/2022]
Abstract
Selective 2′‐hydroxyl acylation analyzed by primer extension (SHAPE) provides information on RNA structure at single‐nucleotide resolution. It is most often used in conjunction with RNA secondary structure prediction algorithms as a probabilistic or thermodynamic restraint. With the recent advent of ultra‐high‐throughput approaches for collecting SHAPE data, the applications of this technology are extending beyond structure prediction. In this review, we discuss recent applications of SHAPE data in the transcriptomic context and how this new experimental paradigm is changing our understanding of these experiments and RNA folding in general. SHAPE experiments probe both the secondary and tertiary structure of an RNA, suggesting that model‐free approaches for within and comparative RNA structure analysis can provide significant structural insight without the need for a full structural model. New methods incorporating SHAPE at different nucleotide resolutions are required to parse these transcriptomic data sets to transcend secondary structure modeling with global structural metrics. These ‘multiscale’ approaches provide deeper insights into RNA global structure, evolution, and function in the cell. WIREs RNA 2017, 8:e1374. doi: 10.1002/wrna.1374
Affiliation(s)
- Katrina M Kutchko
  - Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Alain Laederach
  - Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
48
|
Abstract
Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.
|
49
|
Wu Y, Qu R, Huang Y, Shi B, Liu M, Li Y, Lu ZJ. RNAex: an RNA secondary structure prediction server enhanced by high-throughput structure-probing data. Nucleic Acids Res 2016; 44:W294-301. [PMID: 27137891 PMCID: PMC4987914 DOI: 10.1093/nar/gkw362] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/31/2016] [Accepted: 04/24/2016] [Indexed: 01/16/2023] Open
Abstract
Several high-throughput technologies have been developed to probe RNA base pairs and loops at the transcriptome level in multiple species. However, to obtain the final RNA secondary structure, extensive effort and considerable expertise are required to statistically process the probing data and combine them with free energy models. Therefore, we developed an RNA secondary structure prediction server that is enhanced by experimental data (RNAex). RNAex is a web interface that enables non-specialists to easily access cutting-edge structure-probing data and predict RNA secondary structures enhanced by in vivo and in vitro data. RNAex annotates the RNA editing, RNA modification and SNP sites on the predicted structures. It provides four user-selectable structure-folding methods: restrained MaxExpect, SeqFold, RNAstructure (Fold) and RNAfold. The performance of these four folding methods has been verified by previous publications on known structures. We re-mapped the raw sequencing data of the probing experiments to the whole genome for each species. RNAex thus enables users to predict secondary structures for both known and novel RNA transcripts in human, mouse, yeast and Arabidopsis. The RNAex web server is available at http://RNAex.ncrnalab.org/.
Affiliation(s)
- Yang Wu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Rihao Qu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Yiming Huang
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Binbin Shi
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Mengrong Liu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Yang Li
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Zhi John Lu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
|
50
|
Lorenz R, Hofacker IL, Stadler PF. RNA folding with hard and soft constraints. Algorithms Mol Biol 2016; 11:8. [PMID: 27110276 PMCID: PMC4842303 DOI: 10.1186/s13015-016-0070-z] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 01/26/2016] [Accepted: 04/01/2016] [Indexed: 12/21/2022] Open
Abstract
Background: A large class of RNA secondary structure prediction programs uses an elaborate energy model grounded in extensive thermodynamic measurements and exact dynamic programming algorithms. External experimental evidence can in principle be incorporated by means of hard constraints that restrict the search space or by means of soft constraints that distort the energy model. In particular, recent advances in coupling chemical and enzymatic probing with sequencing techniques, but also comparative approaches, provide an increasing amount of experimental data to be combined with secondary structure prediction. Results: Responding to the increasing needs for a versatile and user-friendly inclusion of external evidence into diverse flavors of RNA secondary structure prediction tools, we implemented a generic layer of constraint handling into the ViennaRNA Package. It makes explicit use of the conceptual separation of the “folding grammar” defining the search space and the actual energy evaluation, which allows constraints to be interleaved in a natural way between recursion steps and evaluation of the standard energy function. Conclusions: The extension of the ViennaRNA Package provides a generic way to include diverse types of constraints into RNA folding algorithms. The computational overhead incurred is negligible in practice. A wide variety of application scenarios can be accommodated by the new framework, including the incorporation of structure probing data, non-standard base pairs and chemical modifications, as well as structure-dependent ligand binding.
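The difference between hard and soft constraints described in this abstract can be sketched with a toy pair-counting fold. This is not the ViennaRNA Package API or its energy model: the function `fold_score`, its parameters `forbidden` and `bonus`, and the score of one unit per base pair are hypothetical and serve only to show that a hard constraint prunes the search space while a soft constraint reweights it.

```python
# Toy illustration only: a Nussinov-style pair-counting fold, not the
# ViennaRNA energy model or API. All names here are hypothetical.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fold_score(seq, forbidden=frozenset(), bonus=None, min_loop=3):
    """Return the best score of a nested structure for seq.

    forbidden: positions that may not pair (hard constraint, prunes the search).
    bonus: position -> value added to the score if that position pairs
           (soft constraint, reweights structures without excluding any).
    """
    bonus = bonus or {}
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # option 1: position i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # option 2: i pairs with k
                if (seq[i], seq[k]) not in CANONICAL:
                    continue
                if i in forbidden or k in forbidden:
                    continue  # hard constraint: this pair is not allowed at all
                pair = 1.0 + bonus.get(i, 0.0) + bonus.get(k, 0.0)  # soft constraint
                right = best[k + 1][j] if k < j else 0.0
                score = max(score, pair + best[i + 1][k - 1] + right)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print(fold_score(seq))                    # unconstrained
    print(fold_score(seq, forbidden={0}))     # position 0 must stay unpaired
    print(fold_score(seq, bonus={0: -0.5}))   # pairing position 0 is penalized
```

In this toy model the unconstrained hairpin scores 3.0, forbidding position 0 removes one pair outright (2.0), and a soft penalty of 0.5 on position 0 keeps all three pairs but lowers the score to 2.5.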
|