1
Fukunaga T, Hamada M. Computational approaches for alternative and transient secondary structures of ribonucleic acids. Brief Funct Genomics 2018; 18:182-191. [PMID: 30689706 DOI: 10.1093/bfgp/ely042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Transient and alternative structures of ribonucleic acids (RNAs) play essential roles in various regulatory processes, such as translation regulation in living cells. Because experimental analyses for RNA structures are difficult and time-consuming, computational approaches based on RNA secondary structures are promising. In this article, we review computational methods for detecting and analyzing transient/alternative secondary structures of RNAs, including static approaches based on probabilistic distributions of RNA secondary structures and dynamic approaches such as kinetic folding and folding pathway predictions.
2
Clote P, Bayegan AH. RNA folding kinetics using Monte Carlo and Gillespie algorithms. J Math Biol 2017; 76:1195-1227. [PMID: 28780735 DOI: 10.1007/s00285-017-1169-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/02/2016] [Revised: 07/09/2017] [Indexed: 11/26/2022]
Abstract
RNA secondary structure folding kinetics is known to be important for the biological function of certain processes, such as the hok/sok system in E. coli. Although linear algebra provides an exact computational solution of secondary structure folding kinetics with respect to the Turner energy model for tiny ([Formula: see text]20 nt) RNA sequences, the folding kinetics for larger sequences can only be approximated by binning structures into macrostates in a coarse-grained model, or by repeatedly simulating secondary structure folding with either the Monte Carlo algorithm or the Gillespie algorithm. Here we investigate the relation between the Monte Carlo algorithm and the Gillespie algorithm. We prove that asymptotically, the expected time for a K-step trajectory of the Monte Carlo algorithm is equal to [Formula: see text] times that of the Gillespie algorithm, where [Formula: see text] denotes the Boltzmann expected network degree. If the network is regular (i.e. every node has the same degree), then the mean first passage time (MFPT) computed by the Monte Carlo algorithm is equal to the MFPT computed by the Gillespie algorithm multiplied by [Formula: see text]; however, this is not true for non-regular networks. In particular, RNA secondary structure folding kinetics, as computed by the Monte Carlo algorithm, is not equal to the folding kinetics as computed by the Gillespie algorithm, although the mean first passage times are roughly correlated. Simulation software for RNA secondary structure folding according to the Monte Carlo and Gillespie algorithms is publicly available, as is our software to compute the expected degree of the network of secondary structures of a given RNA sequence; see http://bioinformatics.bc.edu/clote/RNAexpNumNbors .
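The contrast between the two simulation schemes can be illustrated with a minimal sketch on a toy network. Everything specific here is an assumption made for illustration and is not taken from the paper or its software: the six-state ring, the energies, the value of `RT`, the Metropolis rate rule, and all function names. Because the ring is regular with degree 2, the Monte Carlo step count in this toy setting should come out at roughly twice the Gillespie time, since each Monte Carlo step proposes one of two neighbors uniformly while the Gillespie dwell time depends on the summed rates.

```python
import math
import random

RT = 0.6  # assumed thermal energy in arbitrary units (illustrative only)

# Hypothetical toy landscape: six states on a ring, so every node has degree 2.
ENERGY = [0.0, -1.0, 0.5, -2.0, 0.3, -0.5]
NEIGHBORS = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}


def metropolis_rate(e_from, e_to):
    """Metropolis rule: downhill moves have rate 1, uphill moves are damped."""
    return min(1.0, math.exp(-(e_to - e_from) / RT))


def monte_carlo_first_passage(start, target, rng):
    """Number of Monte Carlo steps (accepted or rejected) until target is hit."""
    state, steps = start, 0
    while state != target:
        proposal = rng.choice(NEIGHBORS[state])  # uniform neighbor proposal
        if rng.random() < metropolis_rate(ENERGY[state], ENERGY[proposal]):
            state = proposal
        steps += 1
    return steps


def gillespie_first_passage(start, target, rng):
    """Continuous time until target is hit; no move is ever rejected."""
    state, elapsed = start, 0.0
    while state != target:
        rates = [metropolis_rate(ENERGY[state], ENERGY[n]) for n in NEIGHBORS[state]]
        elapsed += rng.expovariate(sum(rates))  # exponential waiting time
        state = rng.choices(NEIGHBORS[state], weights=rates)[0]
    return elapsed


def estimate_mfpt(start, target, samples, seed=0):
    """Average both first passage measures over independent trajectories."""
    rng = random.Random(seed)
    mc = sum(monte_carlo_first_passage(start, target, rng) for _ in range(samples)) / samples
    gil = sum(gillespie_first_passage(start, target, rng) for _ in range(samples)) / samples
    return mc, gil


if __name__ == "__main__":
    mc, gil = estimate_mfpt(start=0, target=3, samples=5000)
    print(f"Monte Carlo steps: {mc:.2f}  Gillespie time: {gil:.2f}  ratio: {mc / gil:.2f}")
```

On a non-regular network the same code can be reused by changing `NEIGHBORS`, in which case no single constant factor should be expected to relate the two estimates.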
Affiliation(s)
- Peter Clote
- Department of Biology, Boston College, Chestnut Hill, MA, 02467, USA.
- Amir H Bayegan
- Department of Biology, Boston College, Chestnut Hill, MA, 02467, USA.
3
Abstract
We describe the first dynamic programming algorithm that computes the expected degree for the network, or graph G = (V, E) of all secondary structures of a given RNA sequence a = a1, …, an. Here, the nodes V correspond to all secondary structures of a, while an edge exists between nodes s, t if the secondary structure t can be obtained from s by adding, removing or shifting a base pair. Since secondary structure kinetics programs implement the Gillespie algorithm, which simulates a random walk on the network of secondary structures, the expected network degree may provide a better understanding of kinetics of RNA folding when allowing defect diffusion, helix zippering, and related conformation transformations. We determine the correlation between expected network degree, contact order, conformational entropy, and expected number of native contacts for a benchmarking dataset of RNAs. Source code is available at http://bioinformatics.bc.edu/clotelab/RNAexpNumNbors.
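The quantity being computed can be made concrete with a brute-force sketch for very short sequences. This is not the dynamic programming algorithm of the paper: it enumerates every structure explicitly, uses only base pair addition and removal as moves (shift moves are omitted), and scores each structure with an assumed energy of -1 per base pair rather than a realistic energy model. The function names, `RT`, and the minimum loop size of 3 are likewise assumptions for illustration.

```python
import math

# Watson-Crick and GU wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
RT = 0.6  # assumed thermal energy, arbitrary units


def candidate_pairs(seq, min_loop=3):
    """All positions (i, j) that may pair, with at least min_loop unpaired bases between."""
    return [(i, j) for i in range(len(seq)) for j in range(i + min_loop + 1, len(seq))
            if (seq[i], seq[j]) in PAIRS]


def compatible(p, q):
    """True if two base pairs share no position and do not cross (no pseudoknot)."""
    (i, j), (k, l) = sorted([p, q])
    if len({i, j, k, l}) < 4:
        return False
    return not (i < k < j < l)


def addable(structure, cands):
    """Candidate pairs that can be added to the structure in one move."""
    return [p for p in cands if p not in structure and all(compatible(p, q) for q in structure)]


def all_structures(seq):
    """Enumerate every secondary structure reachable from the open chain."""
    cands = candidate_pairs(seq)
    seen, frontier = {frozenset()}, [frozenset()]
    while frontier:
        s = frontier.pop()
        for p in addable(s, cands):
            t = s | {p}
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def expected_degree(seq):
    """Boltzmann-weighted mean number of neighbors under add/remove moves."""
    cands = candidate_pairs(seq)
    z = weighted = 0.0
    for s in all_structures(seq):
        weight = math.exp(len(s) / RT)          # energy is -1 per base pair
        degree = len(s) + len(addable(s, cands))  # removals plus additions
        z += weight
        weighted += weight * degree
    return weighted / z
```

The enumeration grows exponentially with sequence length, which is the reason a dynamic programming formulation is needed for realistic inputs.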
4
Dykeman EC. An implementation of the Gillespie algorithm for RNA kinetics with logarithmic time update. Nucleic Acids Res 2015; 43:5708-15. [PMID: 25990741 PMCID: PMC4499123 DOI: 10.1093/nar/gkv480] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/14/2014] [Accepted: 05/01/2015] [Indexed: 12/17/2022] Open
Abstract
In this paper I outline a fast method called KFOLD for implementing the Gillespie algorithm to stochastically sample the folding kinetics of an RNA molecule at single base-pair resolution. In the same fashion as the KINFOLD algorithm, which also uses the Gillespie algorithm to predict folding kinetics, KFOLD stochastically chooses a new RNA secondary structure state that is accessible from the current state by a single base-pair addition/deletion following the Gillespie procedure. However, unlike KINFOLD, the KFOLD algorithm utilizes the fact that many of the base-pair addition/deletion reactions and their corresponding rates do not change between each step in the algorithm. This allows KFOLD to achieve a substantial speed-up in the time required to compute a prediction of the folding pathway and, for a fixed number of base-pair moves, to scale logarithmically with sequence size. This increase in speed opens up the possibility of studying the kinetics of much longer RNA sequences at single base-pair resolution while also allowing for the RNA folding statistics of smaller RNA sequences to be computed much more quickly.
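One generic way to obtain logarithmic cost per Gillespie step when most rates stay unchanged is to keep the rates in a binary indexed (Fenwick) tree, so that both updating a single rate and sampling a move proportional to its rate take O(log n). The sketch below illustrates that general idea only; the abstract does not say which data structure KFOLD uses, and the class and function names here are hypothetical.

```python
import random


class RateTree:
    """Fenwick tree over reaction rates: O(log n) update and O(log n) sampling."""

    def __init__(self, rates):
        self.n = len(rates)
        self.rates = [0.0] * self.n
        self.tree = [0.0] * (self.n + 1)
        for index, rate in enumerate(rates):
            self.update(index, rate)

    def update(self, index, rate):
        """Set the rate of one reaction, touching only O(log n) tree nodes."""
        delta = rate - self.rates[index]
        self.rates[index] = rate
        i = index + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def total(self):
        """Sum of all rates."""
        s, i = 0.0, self.n
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s

    def sample(self, u):
        """Index of the reaction whose cumulative-rate interval contains u (0 <= u < total)."""
        pos, step = 0, 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] <= u:
                pos = nxt
                u -= self.tree[nxt]
            step >>= 1
        return min(pos, self.n - 1)  # guard against rounding at the upper edge


def gillespie_step(tree, rng):
    """Draw the waiting time and the next reaction from the current rates."""
    total = tree.total()
    waiting_time = rng.expovariate(total)
    reaction = tree.sample(rng.random() * total)
    return reaction, waiting_time


if __name__ == "__main__":
    tree = RateTree([1.0, 2.0, 3.0, 4.0])
    reaction, dt = gillespie_step(tree, random.Random(1))
    tree.update(reaction, 0.0)  # only the rates affected by the move are rewritten
    print(reaction, dt, tree.total())
```

After a move, only the handful of reactions whose rates actually changed need a call to `update`, which is where the saving over recomputing every rate comes from.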
Affiliation(s)
- Eric C Dykeman
- York Centre for Complex Systems Analysis, Department of Mathematics and Biology, University of York, Deramore Lane, York, YO10 5GE, UK
5
Abstract
In this article, we introduce the software suite Hermes, which provides fast, novel algorithms for RNA secondary structure kinetics. Using the fast Fourier transform to efficiently compute the Boltzmann probability that a secondary structure S of a given RNA sequence has base pair distance x (resp. y) from reference structure A (resp. B), Hermes computes the exact kinetics of folding from A to B in this coarse-grained model. In particular, Hermes computes the mean first passage time from the transition probability matrix by using matrix inversion, and also computes the equilibrium time from the rate matrix by using spectral decomposition. Due to the model granularity and the speed of Hermes, it is capable of determining secondary structure refolding kinetics for large RNA sequences, beyond the range of other methods. Comparative benchmarking of Hermes with other methods indicates that Hermes provides refolding kinetics of accuracy suitable for use in the computational design of RNA, an important area of synthetic biology. Source code and documentation for Hermes are available.
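The mean first passage time computation mentioned above can be sketched for a small Markov chain. With the target state removed from the transition probability matrix P, the vector of mean first passage times t satisfies (I - P') t = 1, so t is obtained by applying the inverse of (I - P') to the all-ones vector. The code below solves that linear system directly, which is equivalent to the inversion. It is a generic textbook construction in pure Python, not the Hermes implementation, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mean_first_passage_times(p, target):
    """Expected number of steps to reach target from every other state.

    p is a row-stochastic transition probability matrix given as a list of lists.
    """
    keep = [i for i in range(len(p)) if i != target]
    # Build I - P', where P' is p restricted to the non-target states.
    a = [[(1.0 if i == j else 0.0) - p[i][j] for j in keep] for i in keep]
    times = solve(a, [1.0] * len(keep))
    return dict(zip(keep, times))


if __name__ == "__main__":
    # Hypothetical three-state chain; state 2 is the absorbing target.
    p = [[0.50, 0.50, 0.00],
         [0.25, 0.50, 0.25],
         [0.00, 0.00, 1.00]]
    print(mean_first_passage_times(p, target=2))
```

For the three-state chain in the usage example, the equations t0 = 1 + 0.5 t0 + 0.5 t1 and t1 = 1 + 0.25 t0 + 0.5 t1 give t0 = 8 and t1 = 6 by hand, which is a convenient check on the solver.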
Affiliation(s)
- Evan Senter
- Department of Biology, Boston College, Chestnut Hill, Massachusetts
6
Mann M, Kucharík M, Flamm C, Wolfinger MT. Memory-efficient RNA energy landscape exploration. Bioinformatics 2014; 30:2584-91. [PMID: 24833804 PMCID: PMC4155248 DOI: 10.1093/bioinformatics/btu337] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/31/2014] [Revised: 04/25/2014] [Accepted: 05/08/2014] [Indexed: 02/01/2023] Open
Abstract
MOTIVATION: Energy landscapes provide a valuable means for studying the folding dynamics of short RNA molecules in detail by modeling all possible structures and their transitions. Higher abstraction levels based on a macro-state decomposition of the landscape enable the study of larger systems; however, they are still restricted by huge memory requirements of exact approaches. RESULTS: We present a highly parallelizable local enumeration scheme that enables the computation of exact macro-state transition models with highly reduced memory requirements. The approach is evaluated on RNA secondary structure landscapes using a gradient basin definition for macro-states. Furthermore, we demonstrate the need for exact transition models by comparing two barrier-based approaches, and perform a detailed investigation of gradient basins in RNA energy landscapes. AVAILABILITY AND IMPLEMENTATION: Source code is part of the C++ Energy Landscape Library available at http://www.bioinf.uni-freiburg.de/Software/.
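A gradient basin macro-state can be illustrated with a short sketch: each state is assigned to the local minimum reached by repeatedly stepping to its lowest-energy neighbor. The code below is a plain in-memory version for a small explicit graph and does not reproduce the memory-efficient local enumeration scheme or the Energy Landscape Library API. The function name, the dictionary-based inputs, and the tie-breaking rule (equal energies are ordered by state label) are assumptions.

```python
def gradient_basins(neighbors, energy):
    """Map every state to the local minimum reached by a gradient walk.

    neighbors: dict mapping each state to a list of adjacent states.
    energy:    dict or list giving the energy of each state.
    States must be mutually comparable (e.g. ints or strings) so ties can be broken.
    """

    def descend(state):
        # Steepest-descent step; (energy, label) gives a strict total order,
        # so the walk can never cycle.
        best = min(neighbors[state], key=lambda n: (energy[n], n), default=state)
        return best if (energy[best], best) < (energy[state], state) else state

    basin = {}
    for state in neighbors:
        path = [state]
        while True:
            current = path[-1]
            if current in basin:        # reuse a previously resolved walk
                minimum = basin[current]
                break
            nxt = descend(current)
            if nxt == current:          # no lower neighbor: local minimum
                minimum = current
                break
            path.append(nxt)
        for visited in path:
            basin[visited] = minimum
    return basin


if __name__ == "__main__":
    # Hypothetical five-state chain 0-1-2-3-4 with two local minima (0 and 4).
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    energies = [0.0, 1.0, 2.0, 1.0, -1.0]
    print(gradient_basins(chain, energies))
```

Transition rates between basins would then be accumulated over edges whose endpoints fall in different basins; that step is left out here.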
Affiliation(s)
- Martin Mann
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany, Institute for Theoretical Chemistry, University of Vienna, 1090 Vienna, Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, and Department of Biochemistry and Molecular Cell Biology, Max F. Perutz Laboratories, University of Vienna, A-1030 Vienna, Austria
- Marcel Kucharík
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany, Institute for Theoretical Chemistry, University of Vienna, 1090 Vienna, Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, and Department of Biochemistry and Molecular Cell Biology, Max F. Perutz Laboratories, University of Vienna, A-1030 Vienna, Austria
- Christoph Flamm
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany, Institute for Theoretical Chemistry, University of Vienna, 1090 Vienna, Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, and Department of Biochemistry and Molecular Cell Biology, Max F. Perutz Laboratories, University of Vienna, A-1030 Vienna, Austria
- Michael T Wolfinger
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany, Institute for Theoretical Chemistry, University of Vienna, 1090 Vienna, Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, and Department of Biochemistry and Molecular Cell Biology, Max F. Perutz Laboratories, University of Vienna, A-1030 Vienna, Austria