1
Greenwood T, Heitsch CE. How Parameters Influence SHAPE-Directed Predictions. Methods Mol Biol 2024; 2726:105-124. [PMID: 38780729 DOI: 10.1007/978-1-0716-3519-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters that define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
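As a rough illustration of what "converting the data into pseudoenergies" can look like, the sketch below uses the widely cited log-linear form, pseudoenergy = m * ln(reactivity + 1) + b, with slope m = 2.6 kcal/mol and intercept b = -0.8 kcal/mol as commonly quoted defaults. The abstract does not state which conversion or parameter values the authors studied, so the function name `shape_pseudoenergy`, the defaults, the second parameter pair, and the treatment of negative or missing reactivities as "no data" are all assumptions made for this example.

```python
import math


def shape_pseudoenergy(reactivity, m=2.6, b=-0.8):
    """Return a pseudo free energy term (kcal/mol) for one nucleotide.

    Assumed log-linear form: m * ln(reactivity + 1) + b.
    Missing (None) or negative reactivities are treated as "no data"
    and contribute nothing; that convention is an assumption here.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


# Hypothetical normalized reactivities for four nucleotides.
reactivities = [0.0, 0.3, 1.0, None]

# Two illustrative (slope, intercept) settings; the second pair is arbitrary
# and only shows how the per-nucleotide terms move with the parameters.
for slope, intercept in [(2.6, -0.8), (1.8, -0.6)]:
    terms = [round(shape_pseudoenergy(r, slope, intercept), 3) for r in reactivities]
    print(f"m={slope}, b={intercept}: {terms}")
```

Low reactivities yield a negative (stabilizing) term and high reactivities a positive (penalizing) one, so changing m and b changes which pairings a folding algorithm is rewarded or penalized for, which is the kind of parameter sensitivity the abstract describes.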
2
Kolberg T, von Löhneysen S, Ozerova I, Wellner K, Hartmann R, Stadler P, Mörl M. Led-Seq: ligation-enhanced double-end sequence-based structure analysis of RNA. Nucleic Acids Res 2023; 51:e63. [PMID: 37114986 PMCID: PMC10287922 DOI: 10.1093/nar/gkad312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 03/21/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023] Open
Abstract
Structural analysis of RNA is an important and versatile tool to investigate the function of this type of molecule in the cell as well as in vitro. Several robust and reliable procedures are available, relying on chemical modification inducing RT stops or nucleotide misincorporations during reverse transcription. Others are based on cleavage reactions and RT stop signals. However, these methods address only one side of the RT stop or misincorporation position. Here, we describe Led-Seq, a new approach based on lead-induced cleavage of unpaired RNA positions, where both resulting cleavage products are investigated. The RNA fragments carrying 2',3'-cyclic phosphate or 5'-OH ends are selectively ligated to oligonucleotide adapters by specific RNA ligases. In a deep sequencing analysis, the cleavage sites are identified as ligation positions, avoiding possible false positive signals based on premature RT stops. With a benchmark set of transcripts in Escherichia coli, we show that Led-Seq is an improved and reliable approach based on metal ion-induced phosphodiester hydrolysis to investigate RNA structures in vivo.
Affiliation(s)
- Tim Kolberg
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
- Iuliia Ozerova
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
- Karolin Wellner
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
- Roland K Hartmann
  - Institute for Pharmaceutical Chemistry, Philipps University Marburg, Marbacher Weg 6, 35037 Marburg, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstr. 16–18, 04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
3
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Affiliation(s)
- Sharon Aviran
  - Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Danny Incarnato
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
4
Ganser LR, Chu CC, Bogerd HP, Kelly ML, Cullen BR, Al-Hashimi HM. Probing RNA Conformational Equilibria within the Functional Cellular Context. Cell Rep 2021; 30:2472-2480.e4. [PMID: 32101729 DOI: 10.1016/j.celrep.2020.02.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 09/02/2019] [Revised: 12/24/2019] [Accepted: 01/31/2020] [Indexed: 12/17/2022] Open
Abstract
Low-abundance short-lived non-native conformations referred to as excited states (ESs) are increasingly observed in vitro and implicated in the folding and biological activities of regulatory RNAs. We developed an approach for assessing the relative abundance of RNA ESs within the functional cellular context. Nuclear magnetic resonance (NMR) spectroscopy was used to estimate the degree to which substitution mutations bias conformational equilibria toward the inactive ES in vitro. The cellular activity of the ES-stabilizing mutants was used as an indirect measure of the conformational equilibria within the functional cellular context. Compensatory mutations that restore the ground-state conformation were used to control for changes in sequence. Using this approach, we show that the ESs of two regulatory RNAs from HIV-1, the transactivation response element (TAR) and the Rev response element (RRE), likely form in cells with abundances comparable to those measured in vitro, and their targeted stabilization may provide an avenue for developing anti-HIV therapeutics.
Affiliation(s)
- Laura R Ganser
  - Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA
- Chia-Chieh Chu
  - Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA
- Hal P Bogerd
  - Department of Molecular Genetics and Microbiology, Center for Virology, Duke University Medical Center, Durham, NC 27710, USA
- Megan L Kelly
  - Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA
- Bryan R Cullen
  - Department of Molecular Genetics and Microbiology, Center for Virology, Duke University Medical Center, Durham, NC 27710, USA
- Hashim M Al-Hashimi
  - Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA
5
Greenwood T, Heitsch CE. On the Problem of Reconstructing a Mixture of RNA Structures. Bull Math Biol 2020; 82:133. [PMID: 33029669 DOI: 10.1007/s11538-020-00804-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/23/2020] [Accepted: 09/08/2020] [Indexed: 01/02/2023]
Abstract
A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data, and not the sequence, which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov-Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.
6
Ganser LR, Kelly ML, Herschlag D, Al-Hashimi HM. The roles of structural dynamics in the cellular functions of RNAs. Nat Rev Mol Cell Biol 2020; 20:474-489. [PMID: 31182864 DOI: 10.1038/s41580-019-0136-0] [Citation(s) in RCA: 254] [Impact Index Per Article: 63.5] [Indexed: 12/21/2022]
Abstract
RNAs fold into 3D structures that range from simple helical elements to complex tertiary structures and quaternary ribonucleoprotein assemblies. The functions of many regulatory RNAs depend on how their 3D structure changes in response to a diverse array of cellular conditions. In this Review, we examine how the structural characterization of RNA as dynamic ensembles of conformations, which form with different probabilities and at different timescales, is improving our understanding of RNA function in cells. We discuss the mechanisms of gene regulation by microRNAs, riboswitches, ribozymes, post-transcriptional RNA modifications and RNA-binding proteins, and how the cellular environment and processes such as liquid-liquid phase separation may affect RNA folding and activity. The emerging RNA-ensemble-function paradigm is changing our perspective and understanding of RNA regulation, from in vitro to in vivo and from descriptive to predictive.
Affiliation(s)
- Laura R Ganser
  - Department of Biochemistry, Duke University School of Medicine, Durham, NC, USA
- Megan L Kelly
  - Department of Biochemistry, Duke University School of Medicine, Durham, NC, USA
- Daniel Herschlag
  - Department of Biochemistry, Stanford ChEM-H Chemistry, Engineering, and Medicine for Human Health, Stanford University, Stanford, CA, USA
  - Department of Chemical Engineering, Stanford ChEM-H Chemistry, Engineering, and Medicine for Human Health, Stanford University, Stanford, CA, USA
  - Department of Chemistry, Stanford ChEM-H Chemistry, Engineering, and Medicine for Human Health, Stanford University, Stanford, CA, USA
- Hashim M Al-Hashimi
  - Department of Biochemistry, Duke University School of Medicine, Durham, NC, USA
  - Department of Chemistry, Duke University, Durham, NC, USA
7
Munteanu A, Mukherjee N, Ohler U. SSMART: sequence-structure motif identification for RNA-binding proteins. Bioinformatics 2019; 34:3990-3998. [PMID: 29893814 DOI: 10.1093/bioinformatics/bty404] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/20/2017] [Accepted: 06/07/2018] [Indexed: 01/12/2023] Open
Abstract
Motivation: RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. Results: We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability and implementation: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alina Munteanu
  - Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
  - Department of Computer Science, Humboldt University, Berlin, Germany
- Neelanjan Mukherjee
  - Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Uwe Ohler
  - Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
  - Department of Computer Science, Humboldt University, Berlin, Germany
8
Wang F, Sun LZ, Sun T, Chang S, Xu X. Helix-Based RNA Landscape Partition and Alternative Secondary Structure Determination. ACS Omega 2019; 4:15407-15413. [PMID: 31572840 PMCID: PMC6761681 DOI: 10.1021/acsomega.9b01430] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/16/2019] [Accepted: 09/03/2019] [Indexed: 06/10/2023]
Abstract
RNA is a versatile macromolecule with the ability to fold into and interconvert between multiple functional conformations. The elucidation of the RNA folding landscape, especially the knowledge of alternative structures, is critical to uncover the physical mechanism of RNA functions. Here, we introduce a helix-based strategy for RNA folding landscape partition and alternative secondary structure determination. The benchmark test of 27 RNAs involving alternative stable structures shows that the model has the ability to divide the whole landscape into distinct partitions at the secondary structure level and predict the representative structures for each partition. Furthermore, the predicted structures and equilibrium populations of metastable conformations for the 2'dG-sensing riboswitch reveal the allosteric conformational switch on transcript length, which is consistent with the experimental study, indicating the importance of metastable states for RNA-based gene regulation. The model delivers a starting point for the landscape-based strategy toward the RNA folding mechanism and functions.
Affiliation(s)
- Fengfei Wang
  - Institute of Bioinformatics and Medical Engineering, School of Mathematics and Physics, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
- Li-Zhen Sun
  - Department of Applied Physics, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Tingting Sun
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou, Zhejiang 310023, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, School of Mathematics and Physics, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, School of Mathematics and Physics, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
9
Su C, Weir JD, Zhang F, Yan H, Wu T. ENTRNA: a framework to predict RNA foldability. BMC Bioinformatics 2019; 20:373. [PMID: 31269893 PMCID: PMC6610807 DOI: 10.1186/s12859-019-2948-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/23/2018] [Accepted: 06/12/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: RNA molecules play many crucial roles in living systems. The spatial complexity that exists in RNA structures determines their cellular functions. Therefore, understanding RNA folding conformations, in particular, RNA secondary structures, is critical for elucidating biological functions. Existing literature has focused on RNA design as either an RNA structure prediction problem or an RNA inverse folding problem where free energy has played a key role. RESULTS: In this research, we propose a Positive-Unlabeled data-driven framework termed ENTRNA. Other than free energy and commonly studied sequence and structural features, we propose a new feature, Sequence Segment Entropy (SSE), to measure the diversity of RNA sequences. ENTRNA is trained and cross-validated using 1024 pseudoknot-free RNAs and 1060 pseudoknotted RNAs from the RNASTRAND database respectively. To test the robustness of ENTRNA, the models are further blind tested on 206 pseudoknot-free and 93 pseudoknotted RNAs from the PDB database. For pseudoknot-free RNAs, ENTRNA has 86.5% sensitivity on the training dataset and 80.6% sensitivity on the testing dataset. For pseudoknotted RNAs, ENTRNA shows 81.5% sensitivity on the training dataset and 71.0% on the testing dataset. To test the applicability of ENTRNA to long, structurally complex RNA, we collect 5 laboratory synthetic RNAs ranging from 1618 to 1790 nucleotides. ENTRNA is able to predict the foldability of 4 RNAs. CONCLUSION: In this article, we reformulate the RNA design problem as a foldability prediction problem which is to predict the likelihood of the co-existence of a sequence-structure pair. This new construct has the potential for both RNA structure prediction and the inverse folding problem. In addition, this new construct enables us to explore data-driven approaches in RNA research.
Affiliation(s)
- Congzhe Su
  - School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
- Jeffery D. Weir
  - Department of Operational Sciences, Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson AFB, Dayton, OH 45433 USA
- Fei Zhang
  - Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
- Hao Yan
  - Biodesign Center for Molecular Design and Biomimetics, The Biodesign Institute & School of Molecular Sciences, Arizona State University, Tempe, AZ 85281 USA
- Teresa Wu
  - School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA
10
Sullivan R, Adams MC, Naik RR, Milam VT. Analyzing Secondary Structure Patterns in DNA Aptamers Identified via CompELS. Molecules 2019; 24:1572. [PMID: 31010064 PMCID: PMC6515186 DOI: 10.3390/molecules24081572] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/28/2019] [Revised: 04/09/2019] [Accepted: 04/15/2019] [Indexed: 12/12/2022] Open
Abstract
In contrast to sophisticated high-throughput sequencing tools for genomic DNA, analytical tools for comparing secondary structure features between multiple single-stranded DNA sequences are less developed. For single-stranded nucleic acid ligands called aptamers, secondary structure is widely thought to play a pivotal role in driving recognition-based binding activity between an aptamer sequence and its specific target. Here, we employ a competition-based aptamer screening platform called CompELS to identify DNA aptamers for a colloidal target. We then analyze predicted secondary structures of the aptamers and a large population of random sequences to identify sequence features and patterns. Our secondary structure analysis identifies patterns ranging from position-dependent score matrixes of individual structural elements to position-independent consensus domains resulting from global alignment.
Affiliation(s)
- Richard Sullivan
  - School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA
- Mary Catherine Adams
  - School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA
- Rajesh R Naik
  - 711 Human Performance Wing, Air Force Research Laboratory, Wright Patterson AFB, OH 45433, USA
- Valeria T Milam
  - School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, 313 Ferst Dr., Atlanta, GA 30332, USA
  - Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, 315 Ferst Dr., Atlanta, GA 30332-0363, USA
11
Barrett C, He Q, Huang FW, Reidys CM. A Boltzmann Sampler for 1-Pairs with Double Filtration. J Comput Biol 2019; 26:173-192. [PMID: 30653353 DOI: 10.1089/cmb.2018.0095] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022] Open
Abstract
Recently, a framework considering RNA sequences and their RNA secondary structures as pairs led to some information-theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. This pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing produces the RNA energy landscape, whose partition function was discovered by McCaskill. Dually, fixing the structure induces the energy landscape of sequences. The latter has been considered originally for designing more efficient inverse folding algorithms and subsequently enhanced by facilitating the sampling of sequences. We present here a partition function of sequence/structure pairs, with endowed Hamming distance and base pair distance filtration. This partition function is an augmentation of the previously mentioned (dual) partition function. We develop an efficient dynamic programming routine to recursively compute the partition function with this double filtration. Our framework is capable of dealing with RNA secondary structures as well as 1-structures, where a 1-structure is an RNA pseudoknot structure consisting of "building blocks" of genus 0 or 1. In particular, 0-structures, consisting of only "building blocks" of genus 0, are exactly RNA secondary structures. The time complexity for calculating the partition function of 1-pairs, that is, sequence/structure pairs where the structures are 1-structures, is O(h^3 b^3 n^6), where h, b, n denote the Hamming distance, base pair distance, and sequence length, respectively. The time complexity for the partition function of 0-pairs is O(h^2 b^2 n^3).
Affiliation(s)
- Christopher Barrett
  - Biocomplexity Initiative and Institute, University of Virginia, Charlottesville, Virginia
  - Department of Computer Science, University of Virginia, Charlottesville, Virginia
- Qijun He
  - Biocomplexity Initiative and Institute, University of Virginia, Charlottesville, Virginia
- Fenix W Huang
  - Biocomplexity Initiative and Institute, University of Virginia, Charlottesville, Virginia
- Christian M Reidys
  - Biocomplexity Initiative and Institute, University of Virginia, Charlottesville, Virginia
  - Department of Mathematics, University of Virginia, Charlottesville, Virginia
12
Yoon HR, Coria A, Laederach A, Heitsch C. Towards an understanding of RNA structural modalities: a riboswitch case study. Computational and Mathematical Biophysics 2019; 7:48-63. [PMID: 34113790 DOI: 10.1515/cmb-2019-0004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/07/2023] Open
Abstract
A riboswitch is a type of RNA molecule that regulates important biological functions by changing structure, typically under ligand-binding. We assess the extent to which these ligand-bound structural alternatives are present in the Boltzmann sample, a standard RNA secondary structure prediction method, for three riboswitch test cases. We use the cluster analysis tool RNAStructProfiling to characterize the different modalities present among the suboptimal structures sampled. We compare these modalities to the putative base pairing models obtained from independent experiments using NMR or fluorescence spectroscopy. We find, somewhat unexpectedly, that profiling the Boltzmann sample captures evidence of ligand-bound conformations for two of three riboswitches studied. Moreover, this agreement between predicted modalities and experimental models is consistent with the classification of riboswitches into thermodynamic versus kinetic regulatory mechanisms. Our results support cluster analysis of Boltzmann samples by RNAStructProfiling as a possible basis for de novo identification of thermodynamic riboswitches, while highlighting the challenges for kinetic ones.
Affiliation(s)
- Hee Rhang Yoon
  - School of Mathematics, Georgia Institute of Technology, Atlanta, GA, 30332
- Aaztli Coria
  - Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC, 27599
- Alain Laederach
  - Department of Biology, University of North Carolina, Chapel Hill, NC, 27599
- Christine Heitsch
  - School of Mathematics, Georgia Institute of Technology, Atlanta, GA, 30332
13
Schroeder SJ. Challenges and approaches to predicting RNA with multiple functional structures. RNA (New York, N.Y.) 2018; 24:1615-1624. [PMID: 30143552 PMCID: PMC6239171 DOI: 10.1261/rna.067827.118] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 06/08/2023]
Abstract
The revolution in sequencing technology demands new tools to interpret the genetic code. As in vivo transcriptome-wide chemical probing techniques advance, new challenges emerge in the RNA folding problem. The emphasis on one sequence folding into a single minimum free energy structure is fading as a new focus develops on generating RNA structural ensembles and identifying functional structural features in ensembles. This review describes an efficient combinatorially complete method and three free energy minimization approaches to predicting RNA structures with more than one functional fold, as well as two methods for analysis of a thermodynamics-based Boltzmann ensemble of structures. The review then highlights two examples of viral RNA 3'-UTR regions that fold into more than one conformation and have been characterized by single molecule fluorescence energy resonance transfer or NMR spectroscopy. These examples highlight the different approaches and challenges in predicting structure and function from sequence for RNA with multiple biological roles and folds. More well-defined examples and new metrics for measuring differences in RNA structures will guide future improvements in prediction of RNA structure and function from sequence.
Affiliation(s)
- Susan J Schroeder
  - Department of Chemistry and Biochemistry, Department of Microbiology and Plant Biology, University of Oklahoma, Norman, Oklahoma 73019, USA
14
Barrett C, He Q, Huang FW, Reidys CM. An Efficient Dual Sampling Algorithm with Hamming Distance Filtration. J Comput Biol 2018; 25:1179-1192. [DOI: 10.1089/cmb.2018.0075] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/28/2022] Open
Affiliation(s)
- Christopher Barrett
  - Biocomplexity Institute of Virginia Tech, Blacksburg, Virginia
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia
- Qijun He
  - Biocomplexity Institute of Virginia Tech, Blacksburg, Virginia
- Fenix W. Huang
  - Biocomplexity Institute of Virginia Tech, Blacksburg, Virginia
- Christian M. Reidys
  - Biocomplexity Institute of Virginia Tech, Blacksburg, Virginia
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia
  - Thermo Fisher Scientific Fellow in Advanced Systems for Information Biology, Thermo Fisher Scientific, Waltham, Massachusetts
15
Lin L, McKerrow WH, Richards B, Phonsom C, Lawrence CE. Characterization and visualization of RNA secondary structure Boltzmann ensemble via information theory. BMC Bioinformatics 2018; 19:82. [PMID: 29506466 PMCID: PMC5836418 DOI: 10.1186/s12859-018-2078-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/15/2017] [Accepted: 02/20/2018] [Indexed: 12/26/2022] Open
Abstract
Background: The nearest neighbor model and associated dynamic programming algorithms allow for the efficient estimation of the RNA secondary structure Boltzmann ensemble. However, because a given RNA secondary structure only contains a fraction of the possible helices that could form from a given sequence, the Boltzmann ensemble is multimodal. Several methods exist for clustering structures and finding those modes. However, less focus is given to exploring the underlying reasons for this multimodality: the presence of conflicting basepairs. Information theory, or more specifically mutual information, provides a method to identify those basepairs that are key to the secondary structure. Results: To this end we find most informative basepairs and visualize the effect of these basepairs on the secondary structure. Knowing whether a most informative basepair is present tells us not only the status of the particular pair but also provides a large amount of information about which other pairs are present or not present. We find that a few basepairs account for a large amount of the structural uncertainty. The identification of these pairs indicates small changes to sequence or stability that will have a large effect on structure. Conclusion: We provide a novel algorithm that uses mutual information to identify the key basepairs that lead to a multimodal Boltzmann distribution. We then visualize the effect of these pairs on the overall Boltzmann ensemble.
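To make the mutual-information idea concrete, the sketch below estimates the mutual information (in bits) between the presence of two basepairs across a sample of dot-bracket structures. It is a toy illustration only and is not the authors' algorithm: the helper names `base_pairs` and `mutual_information`, the pairwise formulation, and the four-structure sample are assumptions made for this example, and a real analysis would use a large Boltzmann sample.

```python
import math
from collections import Counter


def base_pairs(dot_bracket):
    """Return the set of (i, j) basepairs in a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def mutual_information(structures, pair_a, pair_b):
    """Empirical mutual information (bits) between two basepair indicators."""
    n = len(structures)
    joint = Counter()
    for s in structures:
        bp = base_pairs(s)
        joint[(pair_a in bp, pair_b in bp)] += 1
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # Marginals are summed from the joint counts.
        p_a = sum(c for (x, _), c in joint.items() if x == a) / n
        p_b = sum(c for (_, y), c in joint.items() if y == b) / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical sample with two mutually exclusive helices, equally populated.
sample = ["((...))...", "((...))...", "...((...))", "...((...))"]
print(mutual_information(sample, (0, 6), (3, 9)))
```

In this toy sample the two basepairs never co-occur and each appears in half of the structures, so knowing one fully determines the other, which is the sense in which a single basepair can carry a large share of the structural uncertainty.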
Affiliation(s)
- Luan Lin
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, 20993, MD, USA
- Wilson H McKerrow
  - Division of Applied Mathematics, Brown University, Providence, 02912, RI, USA
- Chukiat Phonsom
  - Department of Mathematics, University of Southern California, Los Angeles, 90089, CA, USA
- Charles E Lawrence
  - Division of Applied Mathematics, Brown University, Providence, 02912, RI, USA
|
16
|
Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes. Nat Commun 2018; 9:606. [PMID: 29426922 PMCID: PMC5807309 DOI: 10.1038/s41467-018-02923-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 06/09/2017] [Accepted: 01/09/2018] [Indexed: 11/23/2022] Open
Abstract
RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies. Different experimental and computational approaches can be used to study RNA structures. Here, the authors present a computational method for data-directed reconstruction of complex RNA structure landscapes, which predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data.
|
17
|
Rogers E, Murrugarra D, Heitsch C. Conditioning and Robustness of RNA Boltzmann Sampling under Thermodynamic Parameter Perturbations. Biophys J 2017. [PMID: 28629618 DOI: 10.1016/j.bpj.2017.05.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/23/2023] Open
Abstract
Understanding how RNA secondary structure prediction methods depend on the underlying nearest-neighbor thermodynamic model remains a fundamental challenge in the field. Minimum free energy (MFE) predictions are known to be "ill conditioned" in that small changes to the thermodynamic model can result in significantly different optimal structures. Hence, the best practice is now to sample from the Boltzmann distribution, which generates a set of suboptimal structures. Although the structural signal of this Boltzmann sample is known to be robust to stochastic noise, the conditioning and robustness under thermodynamic perturbations have yet to be addressed. We present here a mathematically rigorous model for conditioning inspired by numerical analysis, and also a biologically inspired definition for robustness under thermodynamic perturbation. We demonstrate the strong correlation between conditioning and robustness and use its tight relationship to define quantitative thresholds for well versus ill conditioning. These resulting thresholds demonstrate that the majority of the sequences are at least sample robust, which verifies the assumption of sampling's improved conditioning over the MFE prediction. Furthermore, because we find no correlation between conditioning and MFE accuracy, the presence of both well- and ill-conditioned sequences indicates the continued need for both thermodynamic model refinements and alternate RNA structure prediction methods beyond the physics-based ones.
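The contrast drawn in this abstract between an ill-conditioned MFE prediction and a more stable Boltzmann distribution can be seen in a toy calculation. The sketch below is not the conditioning model defined in the paper. It assumes a hypothetical three-structure ensemble with made-up free energies (kcal/mol), uses RT ≈ 0.616 kcal/mol (about 37 °C), and compares the distributions before and after a 0.1 kcal/mol perturbation by total variation distance, a measure chosen here purely for illustration.

```python
import math

RT = 0.616  # kcal/mol, roughly R * 310.15 K (assumed temperature of 37 °C)


def boltzmann(energies, rt=RT):
    """Boltzmann probabilities for a list of free energies (kcal/mol)."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def mfe_index(energies):
    """Index of the minimum free energy structure."""
    return min(range(len(energies)), key=energies.__getitem__)


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical energies: two near-degenerate structures and one poor one.
original = [-10.0, -9.9, -5.0]
# A 0.1 kcal/mol perturbation of the two competing structures.
perturbed = [-9.9, -10.0, -5.0]

mfe_changed = mfe_index(original) != mfe_index(perturbed)
shift = total_variation(boltzmann(original), boltzmann(perturbed))
```

With these made-up numbers the MFE structure switches from the first candidate to the second, while the Boltzmann probabilities move by less than 0.1 in total variation distance, since both competing structures carry substantial probability before and after the perturbation.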
Affiliation(s)
- Emily Rogers
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia
- David Murrugarra
  - Department of Mathematics, University of Kentucky, Lexington, Kentucky
- Christine Heitsch
  - School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia
|
18
|
Tremblay-Savard O, Reinharz V, Waldispühl J. Reconstruction of ancestral RNA sequences under multiple structural constraints. BMC Genomics 2016; 17:862. [PMID: 28185557 PMCID: PMC5123390 DOI: 10.1186/s12864-016-3105-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Secondary structures form the scaffold of multiple sequence alignments of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, inferring the ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. METHODS: In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families. RESULTS: We test our methodology on simulated data sets and show that achARNement outperforms classical maximum parsimony approaches in terms of accuracy, and also reduces the number of candidate sequences by several orders of magnitude. To conclude this study, we apply our algorithms to the Glm clan and the FinP-traJ clan from the Rfam database. CONCLUSIONS: Our results show that our methods reconstruct small sets of high-quality candidate ancestors with better agreement to the two target structures than classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement.
Affiliation(s)
- Olivier Tremblay-Savard
  - School of Computer Science, McGill University, Montreal, H3A 0E9, Canada
  - Department of Computer Science, University of Manitoba, Winnipeg, R3T 2N2, Canada
- Vladimir Reinharz
  - School of Computer Science, McGill University, Montreal, H3A 0E9, Canada
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montreal, H3A 0E9, Canada
|
19
|
Rogers E, Heitsch C. New insights from cluster analysis methods for RNA secondary structure prediction. Wiley Interdiscip Rev RNA 2016; 7:278-94. [PMID: 26971529 DOI: 10.1002/wrna.1334] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/08/2015] [Revised: 12/03/2015] [Accepted: 12/17/2015] [Indexed: 01/12/2023]
Abstract
A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing.
Affiliation(s)
- Emily Rogers
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0765, USA
- Christine Heitsch
  - School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USA
|
20
|
Kutchko KM, Sanders W, Ziehr B, Phillips G, Solem A, Halvorsen M, Weeks KM, Moorman N, Laederach A. Multiple conformations are a conserved and regulatory feature of the RB1 5' UTR. RNA 2015; 21:1274-85. [PMID: 25999316 PMCID: PMC4478346 DOI: 10.1261/rna.049221.114] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 12/07/2014] [Accepted: 03/27/2015] [Indexed: 05/22/2023]
Abstract
Folding to a well-defined conformation is essential for the function of structured ribonucleic acids (RNAs) like the ribosome and tRNA. Structured elements in the untranslated regions (UTRs) of specific messenger RNAs (mRNAs) are known to control expression. The importance of unstructured regions adopting multiple conformations, however, is still poorly understood. High-resolution SHAPE-directed Boltzmann suboptimal sampling of the Homo sapiens Retinoblastoma 1 (RB1) 5' UTR yields three distinct conformations compatible with the experimental data. Private single nucleotide variants (SNVs) identified in two patients with retinoblastoma each collapse the structural ensemble to a single but distinct well-defined conformation. The RB1 5' UTRs from Bos taurus (cow) and Trichechus manatus latirostris (manatee) are divergent in sequence from H. sapiens (human) yet maintain structural compatibility with high-probability base pairs. SHAPE chemical probing of the cow and manatee RB1 5' UTRs reveals that they also adopt multiple conformations. Luciferase reporter assays reveal that 5' UTR mutations alter RB1 expression. In a traditional model of disease, causative SNVs disrupt a key structural element in the RNA. For the subset of patients with heritable retinoblastoma-associated SNVs in the RB1 5' UTR, the absence of multiple structures is likely causative of the cancer. Our data therefore suggest that selective pressure will favor multiple conformations in eukaryotic UTRs to regulate expression.
Affiliation(s)
- Katrina M Kutchko
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Wes Sanders
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Ben Ziehr
  - Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Gabriela Phillips
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Amanda Solem
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Matthew Halvorsen
  - Institute for Genomic Medicine, Columbia University Medical Center, New York, New York 10032, USA
- Kevin M Weeks
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
- Nathaniel Moorman
  - Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Alain Laederach
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA
|