1
Mittal A, Ali SE, Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc 2024; 4:e70054. [PMID: 39540715 DOI: 10.1002/cpz1.70054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
The structures of many non-coding RNAs (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. Here, we provide protocols for the use of four programs in the RNAstructure suite to predict conserved structures: Multilign, TurboFold, Dynalign, and PARTS. TurboFold iteratively aligns multiple homologous sequences and estimates the pairing probabilities for the conserved structure. Dynalign, PARTS, and Multilign are dynamic programming algorithms that simultaneously align sequences and identify the common secondary structure. Dynalign uses a pair of homologs and finds the lowest free energy common structure. PARTS uses a pair of homologs and estimates pairing probabilities from the base pairing probabilities estimated for each sequence. Multilign uses two or more homologs and finds the lowest free energy common structure using multiple pairwise calculations with Dynalign. It scales linearly with the number of sequences. We outline the strengths of each program. These programs can be run through web servers, on the command line, or with graphical user interfaces. © 2024 Wiley Periodicals LLC.
Basic Protocol 1: Predicting a structure conserved in three or more sequences with the RNAstructure web server
Basic Protocol 2: Predicting a structure conserved in two sequences with the RNAstructure web server
Alternative Protocol 1: Predicting a structure conserved in multiple sequences in the RNAstructure graphical user interface
Alternative Protocol 2: Predicting a structure conserved in two sequences with Dynalign in the RNAstructure graphical user interface
Alternative Protocol 3: Running TurboFold on the command line
Affiliation(s)
- Abhinav Mittal
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Sara E Ali
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
2
Liu D, Liu Z, Xia Y, Wang Z, Song J, Yu DJ. TransC-ac4C: Identification of N4-Acetylcytidine (ac4C) Sites in mRNA Using Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1403-1412. [PMID: 38607721 DOI: 10.1109/tcbb.2024.3386972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2024]
Abstract
N4-acetylcytidine (ac4C) is a post-transcriptional modification of mRNA that plays a critical role in the stability and regulation of mRNA translation. In the past few years, numerous approaches employing convolutional neural networks (CNNs) and Transformers have been proposed for the identification of ac4C sites, with each class of approach possessing distinct characteristics. CNN-based methods excel at extracting local features and positional information, whereas Transformer-based methods stand out in establishing long-range dependencies and generating global representations. Given the importance of both local and global features in mRNA ac4C site identification, we propose a novel method termed TransC-ac4C, which combines a CNN and a Transformer to enhance feature extraction and improve identification accuracy. Five feature encoding strategies (One-hot, NCP, ND, EIIP, and K-mer) are employed to generate the mRNA sequence representations, so that the sequence attributes and the physical and chemical properties of the sequences are embedded. To strengthen the relevance of the features, we construct a novel feature fusion method. First, the CNN processes each of the five features separately; the outputs are stitched together and fed to the Transformer layer. In this way, the CNN extracts local features and the Transformer subsequently establishes global long-range dependencies among the extracted features. We evaluate the model with 5-fold cross-validation, and the evaluation indicators are significantly improved. Prediction accuracy on the two datasets reaches 81.42% and 80.69%, respectively. These results demonstrate the competitiveness and generalization performance of our model.
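To make the encoding step concrete, the following is a minimal sketch of three of the five encodings named in the abstract (One-hot, NCP, and ND), concatenated per position. It is an illustration only, not the authors' code: the NCP triplets and the prefix-frequency definition of ND are the conventions commonly used in the sequence-encoding literature and are assumptions here, and the function names (`one_hot`, `ncp`, `nucleotide_density`, `encode`) are hypothetical. The sketch concatenates raw features and does not reproduce the CNN-based fusion described above.

```python
from typing import List

ALPHABET = "ACGU"

# Assumed NCP triplets (ring structure, functional group, hydrogen bonding),
# following a common convention; not taken from the abstract.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def one_hot(seq: str) -> List[List[float]]:
    """One 4-dimensional indicator vector per nucleotide, ordered A, C, G, U."""
    return [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq]


def ncp(seq: str) -> List[List[float]]:
    """Nucleotide chemical property: one 3-dimensional vector per nucleotide."""
    return [[float(v) for v in NCP[base]] for base in seq]


def nucleotide_density(seq: str) -> List[float]:
    """ND (assumed definition): frequency of the nucleotide at position i
    within the prefix seq[0..i], using 1-based prefix length."""
    counts = {b: 0 for b in ALPHABET}
    densities = []
    for i, base in enumerate(seq, start=1):
        counts[base] += 1
        densities.append(counts[base] / i)
    return densities


def encode(seq: str) -> List[List[float]]:
    """Concatenate One-hot (4) + NCP (3) + ND (1) into 8 values per position."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    oh, cp, nd = one_hot(seq), ncp(seq), nucleotide_density(seq)
    return [oh[i] + cp[i] + [nd[i]] for i in range(len(seq))]
```

For a sequence of length n, `encode` returns an n × 8 matrix; EIIP and K-mer channels could be appended in the same way before the matrix is passed to a model.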
3
Tang M, Hwang K, Kang SH. StemP: A Fast and Deterministic Stem-Graph Approach for RNA Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3278-3291. [PMID: 37028040 DOI: 10.1109/tcbb.2023.3253049] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
We propose a new deterministic methodology to predict the secondary structure of RNA sequences. What information about stems is important for structure prediction, and is it enough? The proposed simple deterministic algorithm uses minimum stem length, Stem-Loop score, and co-existence of stems to give good structure predictions for short RNA and tRNA sequences. The main idea is to consider all possible stems with a certain stem-loop energy and strength to predict RNA secondary structure. We use graph notation, where stems are represented as vertices and co-existence between stems as edges. This full Stem-graph represents all possible folding structures, and we pick the sub-graph(s) that give the best matching energy for structure prediction. The Stem-Loop score adds structure information and speeds up the computation. The proposed method can predict secondary structure even with pseudoknots. One of the strengths of this approach is the simplicity and flexibility of the algorithm, and it gives a deterministic answer. Numerical experiments are performed on various sequences from the Protein Data Bank and the Gutell Lab using a laptop, and results take only a few seconds.
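The stem-graph idea can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the StemP implementation: stems are maximal runs of canonical or G-U pairs of at least `min_len`, two stems may co-exist when they share no nucleotide (crossing stems are allowed, so pseudoknots are not excluded), and the total number of pairs stands in for the paper's energy and Stem-Loop score, which the abstract does not specify. All function names are hypothetical, and the exhaustive search is only practical for short sequences.

```python
from itertools import combinations

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Return maximal stems as (i, j, k): k stacked pairs (i+t, j-t), t < k."""
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip starts that can be extended outward: they are not maximal.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            k = 0
            # Extend inward while bases pair and the enclosed loop stays
            # at least min_loop nucleotides long.
            while i + k < j - k - min_loop and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            if k >= min_len:
                stems.append((i, j, k))
    return stems


def bases(stem):
    """Set of nucleotide indices used by a stem."""
    i, j, k = stem
    return set(range(i, i + k)) | set(range(j - k + 1, j + 1))


def compatible(a, b):
    """Edge rule (assumed): two stems co-exist if they share no nucleotide."""
    return not (bases(a) & bases(b))


def best_subgraph(stems):
    """Exhaustively pick the mutually compatible stem set with the most pairs.

    The number of pairs is a stand-in score, not the paper's energy."""
    best, best_score = (), 0
    for r in range(1, len(stems) + 1):
        for combo in combinations(stems, r):
            if all(compatible(a, b) for a, b in combinations(combo, 2)):
                score = sum(s[2] for s in combo)
                if score > best_score:
                    best, best_score = combo, score
    return list(best)
```

For example, `find_stems("GGGAAAACCC")` yields the single hairpin stem `(0, 9, 3)`, which `best_subgraph` then selects. Because every step is an enumeration with fixed tie-breaking, the result is deterministic, which is the property the abstract emphasizes.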
4
Tagashira M, Asai K. ConsAlifold: considering RNA structural alignments improves prediction accuracy of RNA consensus secondary structures. Bioinformatics 2022; 38:710-719. [PMID: 34694364 DOI: 10.1093/bioinformatics/btab738] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 08/24/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION: By detecting homology among RNAs, the probabilistic consideration of RNA structural alignments has improved the prediction accuracy of significant RNA prediction problems. Predicting an RNA consensus secondary structure from an RNA sequence alignment is a fundamental research objective because in the detection of conserved base-pairings among RNA homologs, predicting an RNA consensus secondary structure is more convenient than predicting an RNA structural alignment.
RESULTS: We developed and implemented ConsAlifold, a dynamic programming-based method that predicts the consensus secondary structure of an RNA sequence alignment. ConsAlifold considers RNA structural alignments. ConsAlifold achieves moderate running time and the best prediction accuracy of RNA consensus secondary structures among available prediction methods.
AVAILABILITY AND IMPLEMENTATION: ConsAlifold, data and Python scripts for generating both figures and tables are freely available at https://github.com/heartsh/consalifold.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Masaki Tagashira
- Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan; Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
- Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8561, Japan; Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan
5
Zhao J, Kennedy SD, Turner DH. Nuclear Magnetic Resonance Spectra and AMBER OL3 and ROC-RNA Simulations of UCUCGU Reveal Force Field Strengths and Weaknesses for Single-Stranded RNA. J Chem Theory Comput 2022; 18:1241-1254. [PMID: 34990548 DOI: 10.1021/acs.jctc.1c00643] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/16/2022]
Abstract
Single-stranded regions of RNA are important for folding of sequences into 3D structures and for design of therapeutics targeting RNA. Prediction of ensembles of 3D structures for single-stranded regions often involves classical mechanical approximations of interactions defined by quantum mechanical calculations on small model systems. Nuclear magnetic resonance (NMR) spectra and molecular dynamics (MD) simulations of short single strands provide tests for how well the approximations model many of the interactions. Here, the NMR spectra for UCUCGU at 2, 15, and 30 °C are compared to simulations with the AMBER force fields, OL3 and ROC-RNA. This is the first such comparison to an oligoribonucleotide containing an internal guanosine nucleotide (G). G is particularly interesting because of its many H-bonding groups, large dipole moment, and proclivity for both syn and anti conformations. Results reveal formation of a G amino to phosphate non-bridging oxygen H-bond. The results also demonstrate dramatic differences in details of the predicted structures. The variations emphasize the dependence of predictions on individual parameters and their balance with the rest of the force field. The NMR data can serve as a benchmark for future force fields.
6
Yu AM, Gasper PM, Cheng L, Lai LB, Kaur S, Gopalan V, Chen AA, Lucks JB. Computationally reconstructing cotranscriptional RNA folding from experimental data reveals rearrangement of non-native folding intermediates. Mol Cell 2021; 81:870-883.e10. [PMID: 33453165 DOI: 10.1016/j.molcel.2020.12.017] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 05/29/2020] [Revised: 12/08/2020] [Accepted: 12/10/2020] [Indexed: 11/16/2022]
Abstract
The series of RNA folding events that occur during transcription can critically influence cellular RNA function. Here, we present reconstructing RNA dynamics from data (R2D2), a method to uncover details of cotranscriptional RNA folding. We model the folding of the Escherichia coli signal recognition particle (SRP) RNA and show that it requires specific local structural fluctuations within a key hairpin to engender efficient cotranscriptional conformational rearrangement into the functional structure. All-atom molecular dynamics simulations suggest that this rearrangement proceeds through an internal toehold-mediated strand-displacement mechanism, which can be disrupted with a point mutation that limits local structural fluctuations and rescued with compensating mutations that restore these fluctuations. Moreover, a cotranscriptional folding intermediate could be cleaved in vitro by recombinant E. coli RNase P, suggesting potential cotranscriptional processing. These results from experiment-guided multi-scale modeling demonstrate that even an RNA with a simple functional structure can undergo complex folding and processing during synthesis.
Affiliation(s)
- Angela M Yu
- Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY 10065, USA; Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60201, USA
- Paul M Gasper
- Department of Chemistry and the RNA Institute, University at Albany, Albany, NY 12222, USA
- Luyi Cheng
- Interdisciplinary Biological Sciences Graduate Program, Northwestern University, Evanston, IL 60201, USA
- Lien B Lai
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA; Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
- Simi Kaur
- Department of Chemistry and the RNA Institute, University at Albany, Albany, NY 12222, USA
- Venkat Gopalan
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA; Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
- Alan A Chen
- Department of Chemistry and the RNA Institute, University at Albany, Albany, NY 12222, USA.
- Julius B Lucks
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60201, USA.
7
Emami N, Pakchin PS, Ferdousi R. Computational predictive approaches for interaction and structure of aptamers. J Theor Biol 2020; 497:110268. [PMID: 32311376 DOI: 10.1016/j.jtbi.2020.110268] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/10/2020] [Revised: 03/27/2020] [Accepted: 04/02/2020] [Indexed: 02/07/2023]
Abstract
Aptamers are short single-stranded sequences that bind to their specific targets with high affinity and specificity. Usually, aptamers are selected experimentally via systematic evolution of ligands by exponential enrichment (SELEX), an evolutionary process that consists of multiple cycles of selection and amplification. The SELEX process is expensive and time-consuming, and its success rates are relatively low. To overcome these difficulties, several computational techniques that bring together different disciplines and branches of technology have been developed in aptamer science in recent years. This paper presents a complementary review of computational predictive approaches for aptamers. In general, these approaches fall into two main categories: interaction-based prediction and structure-based prediction. The available software packages and toolkits in this area are also reviewed. The aim of describing computational methods and tools in aptamer science is to help aptamer scientists take advantage of these techniques to develop more accurate and more sensitive aptamers.
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Parvin Samadi Pakchin
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran; Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.
8
Abstract
RNA performs and regulates a diverse range of cellular processes, with new functional roles being uncovered at a rapid pace. Interest is growing in how these functions are linked to RNA structures that form in the complex cellular environment. A growing suite of technologies that use advances in RNA structural probes, high-throughput sequencing and new computational approaches to interrogate RNA structure at unprecedented throughput are beginning to provide insights into RNA structures at new spatial, temporal and cellular scales.
Affiliation(s)
- Eric J Strobel
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Angela M Yu
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Julius B Lucks
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
9
Adams RL, Huston NC, Tavares RCA, Pyle AM. Sensitive detection of structural features and rearrangements in long, structured RNA molecules. Methods Enzymol 2019; 623:249-289. [PMID: 31239050 DOI: 10.1016/bs.mie.2019.04.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]
Abstract
Technical innovations in structural probing have drastically advanced the field of RNA structure analysis. These advances have led to parallel approaches developed in separate labs for analyzing RNA structure and dynamics. With the wealth of methodologies available, it can be difficult to determine which is best suited for a given application. Here, using a long, highly structured viral RNA as an example (the positive strand genome of Hepatitis C Virus), we present a semi-comprehensive analysis and describe the major approaches for analyzing the architecture of RNA that is modified with structure-sensitive probes. Additionally, we present an updated method for generating in vitro transcribed and folded RNA that maintains native secondary structures in long RNA molecules. We anticipate that the methods described here will streamline the use of current approaches and help investigators who are unfamiliar with structure probing, obviating the need for time-consuming and expensive optimization.
Affiliation(s)
- Rebecca L Adams
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States
- Nicholas C Huston
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States
- Rafael C A Tavares
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States; Department of Chemistry, Yale University, New Haven, CT, United States
- Anna M Pyle
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, United States; Department of Chemistry, Yale University, New Haven, CT, United States; Howard Hughes Medical Institute, Chevy Chase, MD, United States.
10
Mathews DH. How to benchmark RNA secondary structure prediction accuracy. Methods 2019; 162-163:60-67. [PMID: 30951834 DOI: 10.1016/j.ymeth.2019.04.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/31/2018] [Revised: 03/24/2019] [Accepted: 04/01/2019] [Indexed: 11/18/2022]
Abstract
RNA secondary structure prediction is widely used. As new methods are developed, these are often benchmarked for accuracy against existing methods. This review discusses good practices for performing these benchmarks, including the choice of benchmarking structures, metrics to quantify accuracy, the importance of allowing flexibility for pairs in the accepted structure, and the importance of statistical testing for significance.
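As an illustration of the accuracy metrics and pair flexibility mentioned above, here is a minimal sketch that scores a predicted structure against an accepted one using sensitivity and positive predictive value (PPV). The flexibility rule used here, counting a pair i-j as matched when i-j, (i±1)-j, or i-(j±1) appears in the other structure, is a commonly used convention and is an assumption; the abstract does not spell out the exact rule. Function names are hypothetical.

```python
def matches(pair, reference, flexible=True):
    """True if pair (i, j) is in reference, optionally allowing one
    nucleotide of slippage on either side of the pair."""
    i, j = pair
    candidates = {(i, j)}
    if flexible:
        candidates |= {(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)}
    return bool(candidates & reference)


def sensitivity_ppv(predicted, accepted, flexible=True):
    """Return (sensitivity, PPV) for two collections of base pairs.

    sensitivity = fraction of accepted pairs that are predicted
    PPV         = fraction of predicted pairs that are in the accepted structure
    """
    predicted, accepted = set(predicted), set(accepted)
    found = sum(matches(p, predicted, flexible) for p in accepted)
    correct = sum(matches(p, accepted, flexible) for p in predicted)
    sensitivity = found / len(accepted) if accepted else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Example: one exact pair, one pair shifted by a nucleotide, one wrong pair.
accepted = {(1, 20), (2, 19), (3, 18)}
predicted = {(1, 20), (2, 18), (5, 30)}
strict = sensitivity_ppv(predicted, accepted, flexible=False)
relaxed = sensitivity_ppv(predicted, accepted, flexible=True)
```

Comparing `strict` and `relaxed` shows how much of an apparent error is slippage by a single nucleotide. Testing whether a difference between two methods is statistically significant, which the review also stresses, would be a separate step applied to per-sequence scores.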
Affiliation(s)
- David H Mathews
- Center for RNA Biology, Department of Biochemistry & Biophysics, and Department of Biostatistics & Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, United States.
11
Dans PD, Gallego D, Balaceanu A, Darré L, Gómez H, Orozco M. Modeling, Simulations, and Bioinformatics at the Service of RNA Structure. Chem 2019. [DOI: 10.1016/j.chempr.2018.09.015] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/17/2022]
12
Mlýnský V, Bussi G. Molecular Dynamics Simulations Reveal an Interplay between SHAPE Reagent Binding and RNA Flexibility. J Phys Chem Lett 2018; 9:313-318. [PMID: 29265824 PMCID: PMC5830694 DOI: 10.1021/acs.jpclett.7b02921] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 11/03/2017] [Accepted: 12/21/2017] [Indexed: 05/10/2023]
Abstract
The function of RNA molecules usually depends on their overall fold and on the presence of specific structural motifs. Chemical probing methods are routinely used in combination with nearest-neighbor models to determine RNA secondary structure. Among the available methods, SHAPE is relevant due to its capability to probe all RNA nucleotides and the possibility to be used in vivo. However, the structural determinants for SHAPE reactivity and its mechanism of reaction are still unclear. Here molecular dynamics simulations and enhanced sampling techniques are used to predict the accessibility of nucleotide analogs and larger RNA structural motifs to SHAPE reagents. We show that local RNA reconformations are crucial in allowing reagents to reach the 2'-OH group of a particular nucleotide and that sugar pucker is a major structural factor influencing SHAPE reactivity.
Affiliation(s)
- Vojtěch Mlýnský
- Scuola Internazionale Superiore di Studi Avanzati, SISSA, via Bonomea 265, 34136 Trieste, Italy
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, SISSA, via Bonomea 265, 34136 Trieste, Italy