1
Lorenz R. RNA Secondary Structure Thermodynamics. Methods Mol Biol 2024; 2726:45-83. [PMID: 38780727 DOI: 10.1007/978-1-0716-3519-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Several different ways to predict RNA secondary structures have been suggested in the literature. Statistical methods, such as those that utilize stochastic context-free grammars (SCFGs), or approaches based on machine learning aim to predict the best representative structure for the underlying ensemble of possible conformations. Their parameters have therefore been trained on larger subsets of well-curated, known secondary structures. Physics-based methods, on the other hand, usually refrain from using optimized parameters. They model secondary structures from loops as individual building blocks which have been assigned a physical property instead: the free energy of the respective loop. Such free energies are either derived from experiments or from mathematical modeling. This rigorous use of physical properties then allows for the application of statistical mechanics to describe the entire state space of RNA secondary structures in terms of equilibrium probabilities. On that basis, and by using efficient algorithms, many more descriptors of the conformational state space of RNA molecules can be derived to investigate and explain the many functions of RNA molecules. Moreover, compared to other methods, physics-based models allow for a much easier extension with other properties that can be measured experimentally. For instance, small molecules or proteins can bind to an RNA and their binding affinity can be assessed experimentally. Under certain conditions, existing RNA secondary structure prediction tools can be used to model this RNA-ligand binding and to eventually shed light on its impact on structure formation and function.
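The statistical-mechanics step described in this abstract can be illustrated with a minimal sketch: given free energies for a set of secondary structures, equilibrium probabilities follow from Boltzmann weights. The three-structure ensemble and its energies below are invented for illustration and do not come from the chapter; a real tool would sum over all structures with dynamic programming rather than enumerate them.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_probabilities(energies, temperature_k=310.15):
    """Map {structure: free energy in kcal/mol} to equilibrium probabilities."""
    rt = R_KCAL * temperature_k
    # Boltzmann weight of each structure
    weights = {s: math.exp(-e / rt) for s, e in energies.items()}
    z = sum(weights.values())  # partition function of the toy ensemble
    return {s: w / z for s, w in weights.items()}

# Hypothetical ensemble in dot-bracket notation (energies are assumptions).
toy = {".......": 0.0, "(.....)": -1.2, "((...))": -2.5}
probs = boltzmann_probabilities(toy)
```

The structure with the lowest free energy receives the highest probability, and the probabilities sum to one by construction.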
Affiliation(s)
- Ronny Lorenz
  - Department of Theoretical Chemistry, University of Vienna, Vienna, Austria.
2
Hollar A, Bursey H, Jabbari H. Pseudoknots in RNA Structure Prediction. Curr Protoc 2023; 3:e661. [PMID: 36779804 DOI: 10.1002/cpz1.661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023]
Abstract
RNA molecules play active roles in the cell and are important for numerous applications in biotechnology and medicine. The function of an RNA molecule stems from its structure. RNA structure determination is time consuming, challenging, and expensive using experimental methods. Thus, much research has been directed at RNA structure prediction through computational means. Many of these methods focus primarily on the secondary structure of the molecule, ignoring the possibility of pseudoknotted structures. However, pseudoknots are known to play functional roles in many RNA molecules or in their method of interaction with other molecules. Improving the accuracy and efficiency of computational methods that predict pseudoknots is an ongoing challenge for single RNA molecules, RNA-RNA interactions, and RNA-protein interactions. To improve the accuracy of prediction, many methods focus on specific applications while restricting the length and the class of the pseudoknotted structures they can identify. In recent years, computational methods for structure prediction have begun to catch up with the impressive developments seen in biotechnology. Here, we provide a non-comprehensive overview of available pseudoknot prediction methods and their best-use cases. © 2023 Wiley Periodicals LLC.
Affiliation(s)
- Andrew Hollar
  - Department of Computer Science, University of Victoria, Victoria, Canada
- Hunter Bursey
  - Department of Computer Science, University of Victoria, Victoria, Canada
- Hosna Jabbari
  - Department of Computer Science, University of Victoria, Victoria, Canada
3
Gaither J, Lin YH, Bundschuh R. RBPBind: Quantitative prediction of Protein-RNA interactions. J Mol Biol 2022; 434:167515. [DOI: 10.1016/j.jmb.2022.167515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2021] [Revised: 02/21/2022] [Accepted: 02/22/2022] [Indexed: 10/19/2022]
4
Shatoff E, Bundschuh R. dsRBPBind: modeling the effect of RNA secondary structure on double-stranded RNA-protein binding. Bioinformatics 2022; 38:687-693. [PMID: 34668517 DOI: 10.1093/bioinformatics/btab724] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Revised: 09/15/2021] [Accepted: 10/15/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: RNA-binding proteins are fundamental to many cellular processes. Double-stranded RNA-binding proteins (dsRBPs) in particular are crucial for RNA interference, mRNA elongation, A-to-I editing, host defense, splicing and a multitude of other important mechanisms. Since dsRBPs require double-stranded RNA to bind, their binding affinity depends on the competition among all possible secondary structures of the target RNA molecule. Here, we introduce a quantitative model that allows calculation of the effective affinity of dsRBPs to any RNA given a principal affinity and the sequence of the RNA, while fully taking into account the entire secondary structure ensemble of the RNA. RESULTS: We implement our model within the ViennaRNA folding package while maintaining its O(N^3) time complexity. We validate our quantitative model by comparing with experimentally determined binding affinities and stoichiometries for transactivation response element RNA-binding protein (TRBP). We also find that the change in dsRBP binding affinity purely due to the presence of alternative RNA structures can be many orders of magnitude and that the predicted affinity of TRBP for pre-miRNA-like constructs correlates with experimentally measured processing rates. AVAILABILITY AND IMPLEMENTATION: Our modified version of the ViennaRNA package is available for download at http://bioserv.mps.ohio-state.edu/dsRBPBind, is free to use for research and educational purposes, and utilizes simple get/set methods for footprint size, concentration, cooperativity, principal dissociation constant and overlap.
Affiliation(s)
- Elan Shatoff
  - Department of Physics, The Ohio State University, Columbus, OH 43210, USA
  - Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
- Ralf Bundschuh
  - Department of Physics, The Ohio State University, Columbus, OH 43210, USA
  - Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Division of Hematology, Department of Internal Medicine, The Ohio State University, Columbus, OH 43210, USA
5
Zakh R, Churkin A, Totzeck F, Parr M, Tuller T, Etzion O, Dahari H, Roggendorf M, Frishman D, Barash D. A Mathematical Analysis of HDV Genotypes: From Molecules to Cells. Mathematics (Basel) 2021; 9:2063. [PMID: 34540628 PMCID: PMC8445514 DOI: 10.3390/math9172063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Hepatitis D virus (HDV) is classified according to eight genotypes. The various genotypes are included in the HDVdb database, where each HDV sequence is specified by its genotype. In this contribution, a mathematical analysis is performed on RNA sequences in HDVdb. The RNA folding predicted structures of the Genbank HDV genome sequences in HDVdb are classified according to their coarse-grain tree-graph representation. The analysis allows discarding in a simple and efficient way the vast majority of the sequences that exhibit a rod-like structure, which is important for the virus replication, to attempt to discover other biological functions by structure consideration. After the filtering, there remain only a small number of sequences that can be checked for their additional stem-loops besides the main one that is known to be responsible for virus replication. It is found that a few sequences contain an additional stem-loop that is responsible for RNA editing or other possible functions. These few sequences are grouped into two main classes, one that is well-known experimentally belonging to genotype 3 for patients from South America associated with RNA editing, and the other that is not known at present belonging to genotype 7 for patients from Cameroon. The possibility that another function besides virus replication reminiscent of the editing mechanism in HDV genotype 3 exists in HDV genotype 7 has not been explored before and is predicted by eigenvalue analysis. Finally, when comparing native and shuffled sequences, it is shown that HDV sequences belonging to all genotypes are accentuated in their mutational robustness and thermodynamic stability as compared to other viruses that were subjected to such an analysis.
Affiliation(s)
- Rami Zakh
  - Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
- Alexander Churkin
  - Department of Software Engineering, Sami Shamoon College of Engineering, Beer-Sheva 8410501, Israel
- Franziska Totzeck
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Marina Parr
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Tamir Tuller
  - Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv 6997801, Israel
- Ohad Etzion
  - Soroka University Medical Center, Ben-Gurion University, Beer-Sheva 8410501, Israel
- Harel Dahari
  - Stritch School of Medicine, Loyola University Chicago, Maywood, IL 60153, USA
- Michael Roggendorf
  - Institute of Virology, Technische Universität München, 81675 Munich, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Danny Barash
  - Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
6
Sohrabi-Jahromi S, Söding J. Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins. Bioinformatics 2021; 37:i308-i316. [PMID: 34252974 PMCID: PMC8275352 DOI: 10.1093/bioinformatics/btab300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Understanding how proteins recognize their RNA targets is essential to elucidate regulatory processes in the cell. Many RNA-binding proteins (RBPs) form complexes or have multiple domains that allow them to bind to RNA in a multivalent, cooperative manner. They can thereby achieve higher specificity and affinity than proteins with a single RNA-binding domain. However, current approaches to de novo discovery of RNA binding motifs do not take multivalent binding into account. RESULTS: We present Bipartite Motif Finder (BMF), which is based on a thermodynamic model of RBPs with two cooperatively binding RNA-binding domains. We show that bivalent binding is a common strategy among RBPs, yielding higher affinity and sequence specificity. We furthermore illustrate that the spatial geometry between the binding sites can be learned from bound RNA sequences. These discovered bipartite motifs are consistent with previously known motifs and binding behaviors. Our results demonstrate the importance of multivalent binding for RNA-binding proteins and highlight the value of bipartite motif models in representing the multivalency of protein-RNA interactions. AVAILABILITY AND IMPLEMENTATION: BMF source code is available at https://github.com/soedinglab/bipartite_motif_finder under a GPL license. The BMF web server is accessible at https://bmf.soedinglab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Salma Sohrabi-Jahromi
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
  - Campus-Institut Data Science (CIDAS), Göttingen 37077, Germany
7
Shatoff E, Bundschuh R. Single nucleotide polymorphisms affect RNA-protein interactions at a distance through modulation of RNA secondary structures. PLoS Comput Biol 2020; 16:e1007852. [PMID: 32379750 PMCID: PMC7237046 DOI: 10.1371/journal.pcbi.1007852] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/23/2019] [Revised: 05/19/2020] [Accepted: 04/06/2020] [Indexed: 11/19/2022] Open
Abstract
Single nucleotide polymorphisms are widely associated with disease, but the ways in which they cause altered phenotypes are often unclear, especially when they appear in non-coding regions. One way in which non-coding polymorphisms could cause disease is by affecting crucial RNA-protein interactions. While it is clear that changing a protein binding motif will alter protein binding, it has been shown that single nucleotide polymorphisms can affect RNA secondary structure, and here we show that single nucleotide polymorphisms can affect RNA-protein interactions from outside binding motifs through altered RNA secondary structure. By using a modified version of the Vienna Package and PAR-CLIP data for HuR (ELAVL1) in humans we characterize the genome-wide effect of single nucleotide polymorphisms on HuR binding and show that they can have a many-fold effect on the affinity of HuR binding to RNA transcripts from tens of bases away. We also find some evidence that the effect of single nucleotide polymorphisms on protein binding might be under selection, with the non-reference alleles tending to make it harder for a protein to bind.
Affiliation(s)
- Elan Shatoff
  - Department of Physics, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Columbus, Ohio, United States of America
- Ralf Bundschuh
  - Department of Physics, The Ohio State University, Columbus, Ohio, United States of America
  - Center for RNA Biology, The Ohio State University, Columbus, Ohio, United States of America
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, United States of America
  - Division of Hematology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
8
Abstract
Interactions between RNA and proteins are pervasive in biology, driving fundamental processes such as protein translation and participating in the regulation of gene expression. Modeling the energies of RNA-protein interactions is therefore critical for understanding and repurposing living systems but has been hindered by complexities unique to RNA-protein binding. Here, we bring together several advances to complete a calculation framework for RNA-protein binding affinities, including a unified free energy function for bound complexes, automated Rosetta modeling of mutations, and use of secondary structure-based energetic calculations to model unbound RNA states. The resulting Rosetta-Vienna RNP-ΔΔG method achieves root-mean-squared errors (RMSEs) of 1.3 kcal/mol on high-throughput MS2 coat protein-RNA measurements and 1.5 kcal/mol on an independent test set involving the signal recognition particle, human U1A, PUM1, and FOX-1. As a stringent test, the method achieves RMSE accuracy of 1.4 kcal/mol in blind predictions of hundreds of human PUM2-RNA relative binding affinities. Overall, these RMSE accuracies are significantly better than those attained by prior structure-based approaches applied to the same systems. Importantly, Rosetta-Vienna RNP-ΔΔG establishes a framework for further improvements in modeling RNA-protein binding that can be tested by prospective high-throughput measurements on new systems.
9
Sasse A, Laverty KU, Hughes TR, Morris QD. Motif models for RNA-binding proteins. Curr Opin Struct Biol 2018; 53:115-123. [PMID: 30172081 DOI: 10.1016/j.sbi.2018.08.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/05/2018] [Accepted: 08/07/2018] [Indexed: 01/24/2023]
Abstract
Identifying the binding preferences of RNA-binding proteins (RBPs) is important in understanding their contribution to post-transcriptional regulation. Here, we review the current state of the art of RNA motif identification tools for RBPs. New in vivo and in vitro data sets provide sufficient statistical power to enable detection of relatively long and complex sequence and sequence-structure binding preferences, and recent computational methods are geared towards quantitative identification of these patterns. We classify methods by their motif model's representational power and describe the underlying considerations for RNA-protein interactions. All classical motif identification algorithms apply physically motivated architectures, consisting of a motif and an occupancy model; we call these explicit motif models. Recent methods, such as convolutional neural networks and support vector machines, abandon the classical architecture and implicitly model RNA binding without defining a motif model. Although they achieve high accuracy on held-out data, they may be unsuitable to solve the ultimate goal of the field: using motifs trained on in vitro data to predict in vivo binding sites. For this task, methods need to separate intrinsic binding preferences from cellular effects arising from protein and RNA concentrations, cooperativity, and competition. To tackle this problem, we advocate for the use of a 'three-layer' architecture, consisting of motif model, occupancy model, and extrinsic factor model, which enables separation and adjustment to cellular conditions.
Affiliation(s)
- Alexander Sasse
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Kaitlin U Laverty
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Timothy R Hughes
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
  - Canadian Institute for Advanced Research, MaRS Centre, West Tower, 661 University Avenue, Suite 505, Toronto, ON M5G 1M1, Canada
- Quaid D Morris
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON M5T 3A1, Canada
10
Lorenz R, Hofacker IL, Stadler PF. RNA folding with hard and soft constraints. Algorithms Mol Biol 2016; 11:8. [PMID: 27110276 PMCID: PMC4842303 DOI: 10.1186/s13015-016-0070-z] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 01/26/2016] [Accepted: 04/01/2016] [Indexed: 12/21/2022] Open
Abstract
Background: A large class of RNA secondary structure prediction programs uses an elaborate energy model grounded in extensive thermodynamic measurements and exact dynamic programming algorithms. External experimental evidence can in principle be incorporated by means of hard constraints that restrict the search space or by means of soft constraints that distort the energy model. In particular, recent advances in coupling chemical and enzymatic probing with sequencing techniques, but also comparative approaches, provide an increasing amount of experimental data to be combined with secondary structure prediction. Results: Responding to the increasing need for a versatile and user-friendly inclusion of external evidence into diverse flavors of RNA secondary structure prediction tools, we implemented a generic layer of constraint handling into the ViennaRNA Package. It makes explicit use of the conceptual separation of the “folding grammar” defining the search space and the actual energy evaluation, which allows constraints to be interleaved in a natural way between recursion steps and evaluation of the standard energy function. Conclusions: The extension of the ViennaRNA Package provides a generic way to include diverse types of constraints into RNA folding algorithms. The computational overhead incurred is negligible in practice. A wide variety of application scenarios can be accommodated by the new framework, including the incorporation of structure probing data, non-standard base pairs and chemical modifications, as well as structure-dependent ligand binding. Electronic supplementary material: The online version of this article (doi:10.1186/s13015-016-0070-z) contains supplementary material, which is available to authorized users.
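The distinction between hard and soft constraints drawn in this abstract can be sketched on a small enumerated ensemble. This is an illustrative toy, not the ViennaRNA implementation: the function names, the three structures, their energies, and the pseudo-energy values are all assumptions made for the example. A hard constraint removes structures from the search space, while a soft constraint keeps every structure and shifts its energy.

```python
def unpaired_positions(structure):
    """Indices of unpaired nucleotides in a dot-bracket string."""
    return [i for i, c in enumerate(structure) if c == "."]

def apply_hard_constraint(energies, must_be_unpaired):
    """Hard constraint: drop every structure that pairs a forbidden position."""
    return {s: e for s, e in energies.items()
            if all(s[i] == "." for i in must_be_unpaired)}

def apply_soft_constraint(energies, unpaired_bonus):
    """Soft constraint: add a per-position pseudo-energy for each unpaired base."""
    return {s: e + sum(unpaired_bonus.get(i, 0.0) for i in unpaired_positions(s))
            for s, e in energies.items()}

def mfe(energies):
    """Structure with the minimum (pseudo-)free energy."""
    return min(energies, key=energies.get)

# Hypothetical ensemble; energies in kcal/mol are invented for illustration.
toy = {".......": 0.0, "(.....)": -1.2, "((...))": -2.5}
hard = apply_hard_constraint(toy, must_be_unpaired=[1])
soft = apply_soft_constraint(toy, unpaired_bonus={0: -2.0, 1: -2.0})
```

In this toy, forcing position 1 to stay unpaired excludes the two-pair hairpin, whereas the soft bonus for unpaired positions 0 and 1 leaves all three structures available and only changes which one has the lowest energy.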
11
Lin YH, Bundschuh R. RNA structure generates natural cooperativity between single-stranded RNA binding proteins targeting 5' and 3'UTRs. Nucleic Acids Res 2014; 43:1160-9. [PMID: 25550422 PMCID: PMC4333377 DOI: 10.1093/nar/gku1320] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/23/2022] Open
Abstract
In post-transcriptional regulation, an mRNA molecule is bound by many proteins and/or miRNAs to modulate its function. To enable combinatorial gene regulation, these binding partners of an RNA must communicate with each other, exhibiting cooperativity. Even in the absence of direct physical interactions between the binding partners, such cooperativity can be mediated through RNA secondary structures, since they affect the accessibility of the binding sites. Here we propose a quantitative measure of this structure-mediated cooperativity that can be numerically calculated for an arbitrary RNA sequence. Focusing on an RNA with two binding sites, we derive a characteristic difference of free energy differences, i.e. ΔΔG, as a measure of the effect of the occupancy of one binding site on the binding strength of another. We apply this measure to a large number of human and Caenorhabditis elegans mRNAs, and find that structure-mediated cooperativity is a generic feature. Interestingly, this cooperativity not only affects binding sites in close proximity along the sequence but also configurations in which one binding site is located in the 5′UTR and the other is located in the 3′UTR of the mRNA. Furthermore, we find that this end-to-end cooperativity is determined by the UTR sequences while the sequences of the coding regions are irrelevant.
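One way to read the "difference of free energy differences" is sketched below, under assumptions that are not taken from the paper: each protein binds only when its site is fully unpaired, and four restricted partition functions are already available (over all structures, over structures with site 1 unpaired, with site 2 unpaired, and with both unpaired). The function name and its arguments are hypothetical, and the paper's exact definition and sign convention may differ.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def structure_mediated_ddg(z_all, z_site1, z_site2, z_both, temperature_k=310.15):
    """Change in the binding free energy of site 1 caused by occupying site 2.

    Cost of opening site 1 alone:        -RT ln(z_site1 / z_all)
    Cost of opening site 1, site 2 open: -RT ln(z_both / z_site2)
    The difference of the two is returned (kcal/mol).
    """
    rt = R_KCAL * temperature_k
    return -rt * math.log((z_both * z_all) / (z_site1 * z_site2))

# Independent sites: the joint accessibility factorises, so the result is zero.
independent = structure_mediated_ddg(10.0, 5.0, 4.0, 2.0)
# Both sites tend to be open together: negative value under this sign convention.
cooperative = structure_mediated_ddg(10.0, 5.0, 4.0, 4.0)
```

Under this convention a negative value means that binding at one site makes the other site easier to bind, and a positive value means the opposite.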
Affiliation(s)
- Yi-Hsuan Lin
  - Department of Physics, The Ohio State University, 191W Woodruff Avenue, Columbus, OH 43210-1107, USA
- Ralf Bundschuh
  - Department of Physics, The Ohio State University, 191W Woodruff Avenue, Columbus, OH 43210-1107, USA
  - Department of Chemistry & Biochemistry, The Ohio State University, 100W 18th Avenue, Columbus, OH 43210-1340, USA
  - Division of Hematology, Department of Internal Medicine, The Ohio State University, 320W 10th Avenue, Columbus, OH 43210, USA
  - Center for RNA Biology, The Ohio State University, 484W 12th Avenue, Columbus, OH 43210-1292, USA
12
Qin J, Fricke M, Marz M, Stadler PF, Backofen R. Graph-distance distribution of the Boltzmann ensemble of RNA secondary structures. Algorithms Mol Biol 2014; 9:19. [PMID: 25285153 PMCID: PMC4181469 DOI: 10.1186/1748-7188-9-19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2013] [Accepted: 06/30/2014] [Indexed: 12/02/2022] Open
Abstract
Background: Large RNA molecules are often composed of multiple functional domains whose spatial arrangement strongly influences their function. Pre-mRNA splicing, for instance, relies on the spatial proximity of the splice junctions that can be separated by very long introns. Similar effects appear in the processing of RNA virus genomes. Albeit a crude measure, the distribution of spatial distances in thermodynamic equilibrium harbors useful information on the shape of the molecule that in turn can give insights into the interplay of its functional domains. Result: Spatial distance can be approximated by the graph-distance in RNA secondary structure. We show here that the equilibrium distribution of graph-distances between a fixed pair of nucleotides can be computed in polynomial time by means of dynamic programming. While a naïve implementation would yield recursions with a very high time complexity of O(n^6 D^5) for sequence length n and D distinct distance values, it is possible to reduce this to O(n^4) for practical applications in which predominantly small distances are of interest. Further reductions, however, seem to be difficult. Therefore, we introduced sampling approaches that are much easier to implement. They are also theoretically favorable for several real-life applications, in particular since these primarily concern long-range interactions in very large RNA molecules. Conclusions: The graph-distance distribution can be computed using a dynamic programming approach. Although a crude approximation of reality, our initial results indicate that the graph-distance can be related to the smFRET data. The additional file and the software of our paper are available from http://www.rna.uni-jena.de/RNAgraphdist.html.
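The graph-distance idea can be sketched for individual structures as a breadth-first search over a graph whose edges are backbone bonds and base pairs. This is a simplified illustration, not the dynamic programming algorithm of the paper: unit weights on both edge types, 0-based positions, and the two-structure ensemble with its probabilities are assumptions made for the example. Averaging over explicit structures in this way corresponds more closely to the sampling approach the abstract mentions.

```python
from collections import deque

def pair_table(structure):
    """Map each paired position of a dot-bracket string to its partner."""
    stack, partner = [], {}
    for i, c in enumerate(structure):
        if c == "(":
            stack.append(i)
        elif c == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner

def graph_distance(structure, start, end):
    """Shortest path between two positions using backbone and base-pair edges."""
    partner = pair_table(structure)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        if i == end:
            return dist[i]
        neighbours = [i - 1, i + 1]          # backbone neighbours
        if i in partner:
            neighbours.append(partner[i])    # base-pair neighbour
        for j in neighbours:
            if 0 <= j < len(structure) and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return None  # unreachable only if a position lies outside the structure

def distance_distribution(probabilities, start, end):
    """Probability of each graph-distance over {structure: probability}."""
    result = {}
    for structure, p in probabilities.items():
        d = graph_distance(structure, start, end)
        result[d] = result.get(d, 0.0) + p
    return result

# Hypothetical two-structure ensemble with assumed probabilities.
dist = distance_distribution({".......": 0.25, "((...))": 0.75}, 0, 6)
```

For the open chain the two ends are six backbone steps apart, while the closing base pair of the hairpin brings them to distance one, so the distribution places weight on exactly those two values.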
13
Lin YH, Bundschuh R. Interplay between single-stranded binding proteins on RNA secondary structure. Phys Rev E Stat Nonlin Soft Matter Phys 2013; 88:052707. [PMID: 24329296 DOI: 10.1103/physreve.88.052707] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/12/2013] [Indexed: 06/03/2023]
Abstract
RNA-protein interactions control the fate of cellular RNAs and play an important role in gene regulation. An interdependency between such interactions allows for the implementation of logic functions in gene regulation. We investigate the interplay between RNA-binding partners in the context of the statistical physics of RNA secondary structure and define a linear correlation function between the two partners as a measurement of the interdependency of their binding events. We demonstrate the emergence of a long-range power-law behavior of this linear correlation function. This suggests RNA secondary structure driven interdependency between binding sites as a general mechanism for combinatorial post-transcriptional gene regulation.
Affiliation(s)
- Yi-Hsuan Lin
  - Department of Physics, The Ohio State University, 191 West Woodruff Avenue, Columbus, Ohio 43210-1107, USA
- Ralf Bundschuh
  - Department of Physics, Department of Chemistry & Biochemistry, Division of Hematology, Center for RNA Biology, The Ohio State University, 191 West Woodruff Avenue, Columbus, Ohio 43210-1107, USA
14
Darling A, Stoye J. Distribution of Graph-Distances in Boltzmann Ensembles of RNA Secondary Structures. Lecture Notes in Computer Science 2013. [PMCID: PMC7114971 DOI: 10.1007/978-3-642-40453-5_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Abstract
Large RNA molecules often carry multiple functional domains whose spatial arrangement is an important determinant of their function. Pre-mRNA splicing, furthermore, relies on the spatial proximity of the splice junctions that can be separated by very long introns. Similar effects appear in the processing of RNA virus genomes. Albeit a crude measure, the distribution of spatial distances in thermodynamic equilibrium therefore provides useful information on the overall shape of the molecule and can provide insights into the interplay of its functional domains. Spatial distance can be approximated by the graph-distance in RNA secondary structure. We show here that the equilibrium distribution of graph-distances between arbitrary nucleotides can be computed in polynomial time by means of dynamic programming. A naive implementation would yield recursions with a very high time complexity of O(n^11). Although we were able to reduce this to O(n^6) for many practical applications, a further reduction seems difficult. We conclude, therefore, that sampling approaches, which are much easier to implement, are also theoretically favorable for most real-life applications, in particular since these primarily concern long-range interactions in very large RNA molecules.
Affiliation(s)
- Aaron Darling
  - ithree institute, University of Technology Sydney, 2007 Ultimo, NSW Australia
- Jens Stoye
  - Faculty of Technology, Bielefeld University, Universitätsstraße 25, 33615 Bielefeld, Germany
15
Dieterich C, Stadler PF. Computational biology of RNA interactions. Wiley Interdiscip Rev RNA 2012; 4:107-20. [PMID: 23139167 DOI: 10.1002/wrna.1147] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 12/19/2022]
Abstract
The biodiversity of the RNA world has been underestimated for decades. RNA molecules are key building blocks, sensors, and regulators of modern cells. The biological function of RNA molecules cannot be separated from their ability to bind to and interact with a wide space of chemical species, including small molecules, nucleic acids, and proteins. Computational chemists, physicists, and biologists have developed a rich tool set for modeling and predicting RNA interactions. These interactions are to some extent determined by the binding conformation of the RNA molecule. RNA binding conformations are approximated with often acceptable accuracy by sequence and secondary structure motifs. Secondary structure ensembles of a given RNA molecule can be efficiently computed in many relevant situations by employing a standard energy model for base pair interactions and dynamic programming techniques. The case of bi-molecular RNA-RNA interactions can be seen as an extension of this approach. However, unbiased transcriptome-wide scans for local RNA-RNA interactions are computationally challenging yet become efficient if the binding motif/mode is known and other external information can be used to confine the search space. Computational methods are less developed for proteins and small molecules, which bind to RNA with very high specificity. Binding descriptors of proteins are usually determined by in vitro high-throughput assays (e.g., microarrays or sequencing). Intriguingly, recent experimental advances, which are mostly based on light-induced cross-linking of binding partners, render in vivo binding patterns accessible yet require new computational methods for careful data interpretation. The grand challenge is to model the in vivo situation where a complex interplay of RNA binders competes for the same target RNA molecule. Evidently, bioinformaticians are just catching up with the impressive pace of these developments.
Affiliation(s)
- Christoph Dieterich
- Berlin Institute for Medical Systems Biology, Max Delbrück Centre for Molecular Medicine, Robert-Rössle-Straße 10, Berlin, Germany.
|
16
|
Integrating chemical footprinting data into RNA secondary structure prediction. PLoS One 2012; 7:e45160. [PMID: 23091593 PMCID: PMC3473038 DOI: 10.1371/journal.pone.0045160] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 04/17/2012] [Accepted: 08/16/2012] [Indexed: 01/20/2023] Open
Abstract
Chemical and enzymatic footprinting experiments, such as SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension), yield important information about RNA secondary structure. Indeed, since the 2′-hydroxyl is reactive at flexible (loop) regions, but unreactive at base-paired regions, SHAPE yields quantitative data about which RNA nucleotides are base-paired. Recently, low error rates in secondary structure prediction have been reported for three RNAs of moderate size, by including base stacking pseudo-energy terms derived from SHAPE data into the computation of minimum free energy secondary structure. Here, we describe a novel method, RNAsc (RNA soft constraints), which includes pseudo-energy terms for each nucleotide position, rather than only for base stacking positions. We prove that RNAsc is self-consistent, in the sense that the nucleotide-specific probabilities of being unpaired in the low energy Boltzmann ensemble always become more closely correlated with the input SHAPE data after application of RNAsc. From this mathematical perspective, the secondary structure predicted by RNAsc should be ‘correct’, inasmuch as the SHAPE data are ‘correct’. We benchmark RNAsc against the previously mentioned method for eight RNAs, for which both SHAPE data and native structures are known, to find the same accuracy in 7 out of 8 cases, and an improvement of 25% in one case. Furthermore, we present what appears to be the first direct comparison of SHAPE data and in-line probing data, by comparing yeast asp-tRNA SHAPE data from the literature with data from in-line probing experiments we have recently performed. With respect to several criteria, we find that SHAPE data appear to be more robust than in-line probing data, at least in the case of asp-tRNA.
|
17
|
Abstract
RNA localisation is an important mode of delivering proteins to their site of function. Cis-acting signals within the RNAs, which can be thought of as zip-codes, determine the site of localisation. There are few examples of fully characterised RNA signals, but the signals are thought to be defined through a combination of primary, secondary, and tertiary structures. In this chapter, we describe a selection of computational methods for predicting RNA secondary structure, identifying localisation signals, and searching for similar localisation signals on a genome-wide scale. The chapter is aimed at the biologist rather than presenting the details of each of the individual methods.
|
18
|
Kishore S, Luber S, Zavolan M. Deciphering the role of RNA-binding proteins in the post-transcriptional control of gene expression. Brief Funct Genomics 2010; 9:391-404. [PMID: 21127008 DOI: 10.1093/bfgp/elq028] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.6] [Indexed: 12/15/2022] Open
Abstract
Eukaryotic cells express a large variety of RNA-binding proteins (RBPs) with diverse affinity and specificity towards target RNAs that play a crucial role in almost every aspect of RNA metabolism. In addition, specific domains in RBPs impart catalytic activity or mediate protein-protein interactions, making RBPs versatile regulators of gene expression. In this review, we elaborate on recent experimental and computational approaches that have increased our understanding of RNA-protein interactions and their role in cellular function. We review aspects of gene expression that are modulated post-transcriptionally by RBPs, namely the stability of polymerase II-derived mRNA transcripts and their rate of translation into proteins. We further highlight the extensive regulatory networks of RBPs that implement a combinatorial control of gene expression. Taking cues from recent developments in the field, we argue that understanding spatio-temporal RNA-protein association on a transcriptome level will provide invaluable and unexpected insights into the regulatory codes that define growth, differentiation and disease.
|