1
Mittal A, Ali SE, Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc 2024; 4:e70054. [PMID: 39540715 DOI: 10.1002/cpz1.70054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
The structures of many non-coding RNAs (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. Here, we provide protocols for the use of four programs in the RNAstructure suite to predict conserved structures: Multilign, TurboFold, Dynalign, and PARTS. TurboFold iteratively aligns multiple homologous sequences and estimates the pairing probabilities for the conserved structure. Dynalign, PARTS, and Multilign are dynamic programming algorithms that simultaneously align sequences and identify the common secondary structure. Dynalign uses a pair of homologs and finds the lowest free energy common structure. PARTS uses a pair of homologs and estimates pairing probabilities from the base pairing probabilities estimated for each sequence. Multilign uses two or more homologs and finds the lowest free energy common structure using multiple pairwise calculations with Dynalign. It scales linearly with the number of sequences. We outline the strengths of each program. These programs can be run through web servers, on the command line, or with graphical user interfaces. © 2024 Wiley Periodicals LLC.
Basic Protocol 1: Predicting a structure conserved in three or more sequences with the RNAstructure web server
Basic Protocol 2: Predicting a structure conserved in two sequences with the RNAstructure web server
Alternative Protocol 1: Predicting a structure conserved in multiple sequences in the RNAstructure graphical user interface
Alternative Protocol 2: Predicting a structure conserved in two sequences with Dynalign in the RNAstructure graphical user interface
Alternative Protocol 3: Running TurboFold on the command line.
Affiliation(s)
- Abhinav Mittal
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Sara E Ali
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
2
Allan MF, Aruda J, Plung JS, Grote SL, Martin des Taillades YJ, de Lajarte AA, Bathe M, Rouskin S. Discovery and Quantification of Long-Range RNA Base Pairs in Coronavirus Genomes with SEARCH-MaP and SEISMIC-RNA. bioRxiv 2024:2024.04.29.591762. [PMID: 38746332 PMCID: PMC11092567 DOI: 10.1101/2024.04.29.591762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
RNA molecules perform a diversity of essential functions for which their linear sequences must fold into higher-order structures. Techniques including crystallography and cryogenic electron microscopy have revealed 3D structures of ribosomal, transfer, and other well-structured RNAs; while chemical probing with sequencing facilitates secondary structure modeling of any RNAs of interest, even within cells. Ongoing efforts continue increasing the accuracy, resolution, and ability to distinguish coexisting alternative structures. However, no method can discover and quantify alternative structures with base pairs spanning arbitrarily long distances - an obstacle for studying viral, messenger, and long noncoding RNAs, which may form long-range base pairs. Here, we introduce the method of Structure Ensemble Ablation by Reverse Complement Hybridization with Mutational Profiling (SEARCH-MaP) and software for Structure Ensemble Inference by Sequencing, Mutation Identification, and Clustering of RNA (SEISMIC-RNA). We use SEARCH-MaP and SEISMIC-RNA to discover that the frameshift stimulating element of SARS coronavirus 2 base-pairs with another element 1 kilobase downstream in nearly half of RNA molecules, and that this structure competes with a pseudoknot that stimulates ribosomal frameshifting. Moreover, we identify long-range base pairs involving the frameshift stimulating element in other coronaviruses including SARS coronavirus 1 and transmissible gastroenteritis virus, and model the full genomic secondary structure of the latter. These findings suggest that long-range base pairs are common in coronaviruses and may regulate ribosomal frameshifting, which is essential for viral RNA synthesis. We anticipate that SEARCH-MaP will enable solving many RNA structure ensembles that have eluded characterization, thereby enhancing our general understanding of RNA structures and their functions. 
SEISMIC-RNA, software for analyzing mutational profiling data at any scale, could power future studies on RNA structure and is available on GitHub and the Python Package Index.
3
Gong T, Ju F, Bu D. Accurate prediction of RNA secondary structure including pseudoknots through solving minimum-cost flow with learned potentials. Commun Biol 2024; 7:297. [PMID: 38461362 PMCID: PMC10924946 DOI: 10.1038/s42003-024-05952-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 02/21/2024] [Indexed: 03/11/2024]
Abstract
Pseudoknots are key structure motifs of RNA and pseudoknotted RNAs play important roles in a variety of biological processes. Here, we present KnotFold, an accurate approach to the prediction of RNA secondary structure including pseudoknots. The key elements of KnotFold include a learned potential function and a minimum-cost flow algorithm to find the secondary structure with the lowest potential. KnotFold learns the potential from the RNAs with known structures using an attention-based neural network, thus avoiding the inaccuracy of hand-crafted energy functions. The specially designed minimum-cost flow algorithm used by KnotFold considers all possible combinations of base pairs and selects from them the optimal combination. The algorithm breaks the restriction of nested base pairs required by the widely used dynamic programming algorithms, thus enabling the identification of pseudoknots. Using 1,009 pseudoknotted RNAs as representatives, we demonstrate the successful application of KnotFold in predicting RNA secondary structures including pseudoknots with accuracy higher than the state-of-the-art approaches. We anticipate that KnotFold, with its superior accuracy, will greatly facilitate the understanding of RNA structures and functionalities.
Affiliation(s)
- Tiansu Gong
- Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, China
- University of Chinese Academy of Sciences, 100190, Beijing, China
- Fusong Ju
- Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, China
- University of Chinese Academy of Sciences, 100190, Beijing, China
- Dongbo Bu
- Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, China
- University of Chinese Academy of Sciences, 100190, Beijing, China
- Central China Artificial Intelligence Research Institute, Henan Academy of Sciences, Zhengzhou, 450046, Henan, China
4
Abstract
RNAstructure is a user-friendly program for the prediction and analysis of RNA secondary structure. It is available as a web server, a program with a graphical user interface, or a set of command line tools. The programs are available for Microsoft Windows, macOS, or Linux. This article provides protocols for prediction of RNA secondary structure (using the web server, the graphical user interface, or the command line) and high-affinity oligonucleotide binding sites to a structured RNA target (using the graphical user interface). © 2023 Wiley Periodicals LLC.
Basic Protocol 1: Predicting RNA secondary structure using the RNAstructure web server
Alternate Protocol 1: Predicting secondary structure and base pair probabilities using the RNAstructure graphical user interface
Alternate Protocol 2: Predicting secondary structure and base pair probabilities using the RNAstructure command line interface
Basic Protocol 2: Predicting binding affinities of oligonucleotides complementary to an RNA target using OligoWalk.
Affiliation(s)
- Sara E. Ali
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- Abhinav Mittal
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- David H. Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
5
Kensinger AH, Makowski JA, Pellegrene KA, Imperatore JA, Cunningham CL, Frye CJ, Lackey PE, Mihailescu MR, Evanseck JD. Structural, Dynamical, and Entropic Differences between SARS-CoV and SARS-CoV-2 s2m Elements Using Molecular Dynamics Simulations. ACS Phys Chem Au 2023; 3:30-43. [PMID: 36711027 PMCID: PMC9578647 DOI: 10.1021/acsphyschemau.2c00032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/25/2022] [Revised: 09/21/2022] [Accepted: 09/21/2022] [Indexed: 11/05/2022]
Abstract
The functional role of the highly conserved stem-loop II motif (s2m) in SARS-CoV and SARS-CoV-2 in the viral lifecycle remains enigmatic and an intense area of research. Structure and dynamics of the s2m are key to establishing a structure-function connection, yet a full set of atomistic resolution coordinates is not available for SARS-CoV-2 s2m. Our work constructs three-dimensional coordinates consistent with NMR solution phase data for SARS-CoV-2 s2m and provides a comparative analysis with its counterpart SARS-CoV s2m. We employed initial coordinates based on PDB ID 1XJR for SARS-CoV s2m and two models for SARS-CoV-2 s2m: one based on 1XJR in which we introduced the mutations present in SARS-CoV-2 s2m and the second based on the available SARS-CoV-2 NMR NOE data supplemented with knowledge-based methods. For each of the three systems, 3.5 μs molecular dynamics simulations were used to sample the structure and dynamics, and principal component analysis (PCA) reduced the ensembles to hierarchal conformational substates for detailed analysis. Dilute solution simulations of SARS-CoV s2m demonstrate that the GNRA-like terminal pentaloop is rigidly defined by base stacking uniquely positioned for possible kissing dimer formation. However, the SARS-CoV-2 s2m simulation did not retain the reported crystallographic SARS-CoV motifs and the terminal loop expands to a highly dynamic "nonaloop." Increased flexibility and structural disorganization are observed for the larger terminal loop, where an entropic penalty is computed to explain the experimentally observed reduction in kissing complex formation. Overall, both SARS-CoV and SARS-CoV-2 s2m elements have a similarly pronounced L-shape due to different motif interactions. Our study establishes the atomistic three-dimensional structure and uncovers dynamic differences that arise from s2m sequence changes, which sets the stage for the interrogation of different mechanistic pathways of suspected biological function.
Affiliation(s)
- Adam H. Kensinger
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Joseph A. Makowski
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Kendy A. Pellegrene
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Joshua A. Imperatore
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Caylee L. Cunningham
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Caleb J. Frye
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Patrick E. Lackey
- Department of Biochemistry and Chemistry, Westminster College, New Wilmington, Pennsylvania 16172, United States
- Mihaela Rita Mihailescu
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
- Jeffrey D. Evanseck
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, United States
6
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Affiliation(s)
- Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
7
Zuber J, Schroeder SJ, Sun H, Turner DH, Mathews DH. Nearest neighbor rules for RNA helix folding thermodynamics: improved end effects. Nucleic Acids Res 2022; 50:5251-5262. [PMID: 35524574 PMCID: PMC9122537 DOI: 10.1093/nar/gkac261] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/23/2021] [Revised: 03/29/2022] [Accepted: 04/08/2022] [Indexed: 12/26/2022]
Abstract
Nearest neighbor parameters for estimating the folding stability of RNA secondary structures are in widespread use. For helices, current parameters penalize terminal AU base pairs relative to terminal GC base pairs. We curated an expanded database of helix stabilities determined by optical melting experiments. Analysis of the updated database shows that terminal penalties depend on the sequence identity of the adjacent penultimate base pair. New nearest neighbor parameters that include this additional sequence dependence accurately predict the measured values of 271 helices in an updated database with a correlation coefficient of 0.982. This refined understanding of helix ends facilitates fitting terms for base pair stacks with GU pairs. Prior parameter sets treated 5′GGUC3′ paired to 3′CUGG5′ separately from other 5′GU3′/3′UG5′ stacks. The improved understanding of helix end stability, however, makes the separate treatment unnecessary. Introduction of the additional terms was tested with three optical melting experiments. The average absolute difference between measured and predicted free energy changes at 37°C for these three duplexes containing terminal adjacent AU and GU pairs improved from 1.38 to 0.27 kcal/mol. This confirms the need for the additional sequence dependence in the model.
Affiliation(s)
- Jeffrey Zuber
- Alnylam Pharmaceuticals, Inc., Cambridge, MA 02142, USA
- Susan J Schroeder
- Department of Chemistry and Biochemistry, and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK 73019, USA
- Hongying Sun
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA
- Douglas H Turner
- Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA; Department of Chemistry, University of Rochester, Rochester, NY 14627, USA
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester, Rochester, NY 14642, USA
8
Secondary structure prediction for RNA sequences including N6-methyladenosine. Nat Commun 2022; 13:1271. [PMID: 35277476 PMCID: PMC8917230 DOI: 10.1038/s41467-022-28817-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/12/2021] [Accepted: 02/10/2022] [Indexed: 01/22/2023]
Abstract
There is increasing interest in the roles of covalently modified nucleotides in RNA. There has been, however, an inability to account for modifications in secondary structure prediction because of a lack of software and thermodynamic parameters. We report the solution for these issues for N6-methyladenosine (m6A), allowing secondary structure prediction for an alphabet of A, C, G, U, and m6A. The RNAstructure software now works with user-defined nucleotide alphabets of any size. We also report a set of nearest neighbor parameters for helices and loops containing m6A, using experiments. Interestingly, N6-methylation decreases folding stability for adenosines in the middle of a helix, has little effect on folding stability for adenosines at the ends of helices, and increases folding stability for unpaired adenosines stacked on a helix. We demonstrate predictions for an N6-methylation-activated protein recognition site from MALAT1 and human transcriptome-wide effects of N6-methylation on the probability of adenosine being buried in a helix. RNA folding free energy nearest neighbor parameters were determined for sequences with the nucleotide m6A. The RNAstructure software package can accommodate modified nucleotides, enabling secondary structure prediction of sequences with m6A.
9
New RNA Structural Elements Identified in the Coding Region of the Coxsackie B3 Virus Genome. Viruses 2020; 12:v12111232. [PMID: 33143071 PMCID: PMC7692623 DOI: 10.3390/v12111232] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/06/2020] [Revised: 10/27/2020] [Accepted: 10/28/2020] [Indexed: 01/25/2023]
Abstract
Here we present a set of new structural elements formed within the open reading frame of the virus, which are highly probable, evolutionarily conserved and may interact with host proteins. This work focused on the coding regions of the CVB3 genome (particularly the VP4-, VP1-, 2C-, and 3D-coding regions), which, with the exception of the cis-acting replication element (CRE), have not yet been subjected to experimental analysis of their structures. The SHAPE technique, chemical modification with DMS and RNA cleavage with Pb2+ were performed in order to characterize the RNA structure. The experimental results were used to improve the computer prediction of the structural models, whereas a phylogenetic analysis was performed to check universality of the newly identified structural elements for twenty CVB3 genomes and 11 other enteroviruses. Some of the RNA motifs turned out to be conserved among different enteroviruses. We also observed that the 3'-terminal region of the genome tends to dimerize in a magnesium concentration-dependent manner. RNA affinity chromatography was used to confirm RNA-protein interactions hypothesized by database searches, leading to the discovery of several interactions, which may be important for virus propagation.
10
Gumna J, Zok T, Figurski K, Pachulska-Wieczorek K, Szachniuk M. RNAthor - fast, accurate normalization, visualization and statistical analysis of RNA probing data resolved by capillary electrophoresis. PLoS One 2020; 15:e0239287. [PMID: 33002005 PMCID: PMC7529196 DOI: 10.1371/journal.pone.0239287] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/07/2020] [Accepted: 09/03/2020] [Indexed: 12/18/2022]
Abstract
RNAs adopt specific structures to perform their functions, which are critical to fundamental cellular processes. For decades, these structures have been determined and modeled with strong support from computational methods. Still, the accuracy of the latter ones depends on the availability of experimental data, for example, chemical probing information that can define pseudo-energy constraints for RNA folding algorithms. At the same time, diverse computational tools have been developed to facilitate analysis and visualization of data from RNA structure probing experiments followed by capillary electrophoresis or next-generation sequencing. RNAthor, a new software tool for the fully automated normalization of SHAPE and DMS probing data resolved by capillary electrophoresis, has recently joined this collection. RNAthor automatically identifies unreliable probing data. It normalizes the reactivity information to a uniform scale and uses it in the RNA secondary structure prediction. Our web server also provides tools for fast and easy RNA probing data visualization and statistical analysis that facilitates the comparison of multiple data sets. RNAthor is freely available at http://rnathor.cs.put.poznan.pl/.
Affiliation(s)
- Julita Gumna
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Kacper Figurski
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- * E-mail: (KPW); (MS)
11
Andrzejewska A, Zawadzka M, Pachulska-Wieczorek K. On the Way to Understanding the Interplay between the RNA Structure and Functions in Cells: A Genome-Wide Perspective. Int J Mol Sci 2020; 21:E6770. [PMID: 32942713 PMCID: PMC7554983 DOI: 10.3390/ijms21186770] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/06/2020] [Revised: 09/04/2020] [Accepted: 09/11/2020] [Indexed: 12/22/2022]
Abstract
RNAs adopt specific structures in order to perform their biological activities. The structure of RNA is an important layer of gene expression regulation, and can impact a plethora of cellular processes, starting with transcription, RNA processing, and translation, and ending with RNA turnover. The development of high-throughput technologies has enabled a deeper insight into the sophisticated interplay between the structure of the cellular transcriptome and the living cell's environment. In this review, we present the current view on RNA structure in vivo resulting from the most recent transcriptome-wide studies in different organisms, including mammals, yeast, plants, and bacteria. We focus on the relationship between mRNA structure and translation, mRNA stability and degradation, protein binding, and RNA posttranscriptional modifications.
Affiliation(s)
- Katarzyna Pachulska-Wieczorek
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Department of Structure and Function of Retrotransposons, Noskowskiego 12/14, 61-704 Poznan, Poland
12
Abstract
RNA performs and regulates a diverse range of cellular processes, with new functional roles being uncovered at a rapid pace. Interest is growing in how these functions are linked to RNA structures that form in the complex cellular environment. A growing suite of technologies that use advances in RNA structural probes, high-throughput sequencing and new computational approaches to interrogate RNA structure at unprecedented throughput are beginning to provide insights into RNA structures at new spatial, temporal and cellular scales.
Affiliation(s)
- Eric J Strobel
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Angela M Yu
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Julius B Lucks
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
13
Spasic A, Assmann SM, Bevilacqua PC, Mathews DH. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Res 2019; 46:314-323. [PMID: 29177466 PMCID: PMC5758915 DOI: 10.1093/nar/gkx1057] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 05/10/2017] [Accepted: 10/30/2017] [Indexed: 12/22/2022]
Abstract
RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight about RNA function. The accuracy of structure prediction is known to be improved using experimental mapping data that provide information about the pairing status of single nucleotides, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data focused on predicting structures for sequences assuming that they populate a single structure. Most RNAs populate multiple structures, however, where the ensemble of strands populates structures with different sets of canonical base pairs. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Philip C Bevilacqua
- Department of Chemistry, Department of Biochemistry & Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
14
Mathews DH. How to benchmark RNA secondary structure prediction accuracy. Methods 2019; 162-163:60-67. [PMID: 30951834 DOI: 10.1016/j.ymeth.2019.04.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/31/2018] [Revised: 03/24/2019] [Accepted: 04/01/2019] [Indexed: 11/18/2022]
Abstract
RNA secondary structure prediction is widely used. As new methods are developed, these are often benchmarked for accuracy against existing methods. This review discusses good practices for performing these benchmarks, including the choice of benchmarking structures, metrics to quantify accuracy, the importance of allowing flexibility for pairs in the accepted structure, and the importance of statistical testing for significance.
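The accuracy metrics and pair flexibility named in this abstract can be illustrated with a minimal sketch. It assumes that accuracy is reported as sensitivity (fraction of accepted pairs that are predicted) and positive predictive value (fraction of predicted pairs that are accepted), and that "flexibility" means a pair i-j also counts as matched when one of its ends is shifted by a single nucleotide. The function name `score_pairs` and the `flexible` switch are hypothetical and do not come from the review or from any software package.

```python
def score_pairs(predicted, accepted, flexible=True):
    """Return (sensitivity, ppv) for two collections of base pairs.

    Each pair is a tuple of 1-indexed nucleotide positions (i, j).
    With flexible=True, a pair also matches when the other structure
    contains (i-1, j), (i+1, j), (i, j-1) or (i, j+1). This slippage
    rule is an assumption made for illustration.
    """
    predicted = {tuple(sorted(p)) for p in predicted}
    accepted = {tuple(sorted(p)) for p in accepted}

    def matches(pair, reference):
        i, j = pair
        if pair in reference:
            return True
        if not flexible:
            return False
        shifted = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        return any(s in reference for s in shifted)

    # Sensitivity: accepted pairs that are recovered by the prediction.
    found = sum(1 for p in accepted if matches(p, predicted))
    # PPV: predicted pairs that are supported by the accepted structure.
    correct = sum(1 for p in predicted if matches(p, accepted))

    sensitivity = found / len(accepted) if accepted else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    accepted = [(1, 20), (2, 19), (3, 18), (4, 17)]
    predicted = [(1, 20), (2, 19), (3, 17), (8, 12)]
    print(score_pairs(predicted, accepted, flexible=True))
    print(score_pairs(predicted, accepted, flexible=False))
```

Comparing the flexible and strict scores on the same toy structures shows how much a single slipped pair can change the reported accuracy. Any statistical test across a set of benchmark sequences would sit on top of per-sequence scores like these.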
Affiliation(s)
- David H Mathews
- Center for RNA Biology, Department of Biochemistry & Biophysics, and Department of Biostatistics & Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, United States
15
van Cruchten RTP, Wieringa B, Wansink DG. Expanded CUG repeats in DMPK transcripts adopt diverse hairpin conformations without influencing the structure of the flanking sequences. RNA 2019; 25:481-495. [PMID: 30700578 PMCID: PMC6426290 DOI: 10.1261/rna.068940.118] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/01/2018] [Accepted: 01/24/2019] [Indexed: 06/09/2023]
Abstract
Myotonic dystrophy type 1 (DM1) is a complex neuromuscular disorder caused by expansion of a CTG repeat in the 3'-untranslated region (UTR) of the DMPK gene. Mutant DMPK transcripts form aberrant structures and anomalously associate with RNA-binding proteins (RBPs). As a first step toward better understanding of the involvement of abnormal DMPK mRNA folding in DM1 manifestation, we used SHAPE, DMS, CMCT, and RNase T1 structure probing in vitro for modeling of the topology of the DMPK 3'-UTR with normal and pathogenic repeat lengths of up to 197 CUG triplets. The resulting structural information was validated by disruption of base-pairing with LNA antisense oligonucleotides (AONs) and used for prediction of therapeutic AON accessibility and verification of DMPK knockdown efficacy in cells. Our model for DMPK RNA structure demonstrates that the hairpin formed by the CUG repeat has length-dependent conformational plasticity, with a structure that is guided by and embedded in an otherwise rigid architecture of flanking regions in the DMPK 3'-UTR. Evidence is provided that long CUG repeats may form not only single asymmetrical hairpins but also exist as branched structures. These newly identified structures have implications for DM1 pathogenic mechanisms, like sequestration of RBPs and repeat-associated non-AUG (RAN) translation.
Affiliation(s)
- Remco T P van Cruchten
  - Department of Cell Biology, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Bé Wieringa
  - Department of Cell Biology, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Derick G Wansink
  - Department of Cell Biology, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
16
Genome-Wide Discovery of DEAD-Box RNA Helicase Targets Reveals RNA Structural Remodeling in Transcription Termination. Genetics 2019; 212:153-174. [PMID: 30902808 DOI: 10.1534/genetics.119.302058] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/26/2019] [Accepted: 03/19/2019] [Indexed: 11/18/2022]
Abstract
RNA helicases are a class of enzymes that unwind RNA duplexes in vitro but whose cellular functions are largely enigmatic. Here, we provide evidence that the DEAD-box protein Dbp2 remodels RNA-protein complex (RNP) structure to facilitate efficient termination of transcription in Saccharomyces cerevisiae via the Nrd1-Nab3-Sen1 (NNS) complex. First, we find that loss of DBP2 results in RNA polymerase II accumulation at the 3' ends of small nucleolar RNAs and a subset of mRNAs. In addition, Dbp2 associates with RNA sequence motifs and regions bound by Nrd1 and can promote its recruitment to NNS-targeted regions. Using Structure-seq, we find altered RNA/RNP structures in dbp2∆ cells that correlate with inefficient termination. We also show a positive correlation between the stability of structures in the 3' ends and a requirement for Dbp2 in termination. Taken together, these studies provide a role for RNA remodeling by Dbp2 and further suggest a mechanism whereby RNA structure is exploited for gene regulation.
17
Choudhary K, Lai YH, Tran EJ, Aviran S. dStruct: identifying differentially reactive regions from RNA structurome profiling data. Genome Biol 2019; 20:40. [PMID: 30791935 PMCID: PMC6385470 DOI: 10.1186/s13059-019-1641-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/22/2018] [Accepted: 01/24/2019] [Indexed: 12/16/2022]
Abstract
RNA biology is being revolutionized by recent developments of diverse high-throughput technologies for transcriptome-wide profiling of molecular RNA structures. RNA structurome profiling data can be used to identify differentially structured regions between groups of samples. Existing methods are limited in scope to specific technologies and/or do not account for biological variation. Here, we present dStruct, which is the first broadly applicable method for differential analysis accounting for biological variation in structurome profiling data. dStruct is compatible with diverse profiling technologies, is validated with experimental data and simulations, and outperforms existing methods.
Affiliation(s)
- Krishna Choudhary
  - Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
- Yu-Hsuan Lai
  - Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
- Elizabeth J. Tran
  - Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
  - Purdue University Center for Cancer Research, Purdue University, Hansen Life Sciences Research Building, Room 141, 201 S. University Street, West Lafayette, 47907-2064 IN USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
18
Eubanks CS, Hargrove AE. RNA Structural Differentiation: Opportunities with Pattern Recognition. Biochemistry 2018; 58:199-213. [PMID: 30513196 DOI: 10.1021/acs.biochem.8b01090] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Abstract
Our awareness and appreciation of the many regulatory roles of RNA have dramatically increased in the past decade. This understanding, in addition to the impact of RNA in many disease states, has renewed interest in developing selective RNA-targeted small molecule probes. However, the fundamental guiding principles in RNA molecular recognition that could accelerate these efforts remain elusive. While high-resolution structural characterization can provide invaluable insight, examples of well-characterized RNA structures, not to mention small molecule:RNA complexes, remain limited. This Perspective provides an overview of the current techniques used to understand RNA molecular recognition when high-resolution structural information is unavailable. We will place particular emphasis on a new method, pattern recognition of RNA with small molecules (PRRSM), that provides rapid insight into critical components of RNA recognition and differentiation by small molecules as well as into RNA structural features.
Affiliation(s)
- Christopher S Eubanks
  - Department of Chemistry, Duke University, Durham, North Carolina 27708-0354, United States
- Amanda E Hargrove
  - Department of Chemistry, Duke University, Durham, North Carolina 27708-0354, United States
19
Schroeder SJ. Challenges and approaches to predicting RNA with multiple functional structures. RNA (NEW YORK, N.Y.) 2018; 24:1615-1624. [PMID: 30143552 PMCID: PMC6239171 DOI: 10.1261/rna.067827.118] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 06/08/2023]
Abstract
The revolution in sequencing technology demands new tools to interpret the genetic code. As in vivo transcriptome-wide chemical probing techniques advance, new challenges emerge in the RNA folding problem. The emphasis on one sequence folding into a single minimum free energy structure is fading as a new focus develops on generating RNA structural ensembles and identifying functional structural features in ensembles. This review describes an efficient combinatorially complete method and three free energy minimization approaches to predicting RNA structures with more than one functional fold, as well as two methods for analysis of a thermodynamics-based Boltzmann ensemble of structures. The review then highlights two examples of viral RNA 3'-UTR regions that fold into more than one conformation and have been characterized by single molecule fluorescence energy resonance transfer or NMR spectroscopy. These examples highlight the different approaches and challenges in predicting structure and function from sequence for RNA with multiple biological roles and folds. More well-defined examples and new metrics for measuring differences in RNA structures will guide future improvements in prediction of RNA structure and function from sequence.
Affiliation(s)
- Susan J Schroeder
  - Department of Chemistry and Biochemistry, Department of Microbiology and Plant Biology, University of Oklahoma, Norman, Oklahoma 73019, USA
20
Extracting information from RNA SHAPE data: Kalman filtering approach. PLoS One 2018; 13:e0207029. [PMID: 30462682 PMCID: PMC6248965 DOI: 10.1371/journal.pone.0207029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/07/2018] [Accepted: 10/23/2018] [Indexed: 01/26/2023]
Abstract
RNA SHAPE experiments have become important and successful sources of information for RNA structure prediction. In such experiments, chemical reagents are used to probe RNA backbone flexibility at the nucleotide level, which in turn provides information on base pairing and therefore secondary structure. Little is known, however, about the statistics of such SHAPE data. In this work, we explore different representations of noise in SHAPE data and propose a statistically sound framework for extracting reliable reactivity information from multiple SHAPE replicates. Our analyses of RNA SHAPE experiments underscore that a normal noise model is not adequate to represent their data. We propose instead a log-normal representation of noise and discuss its relevance. Under this assumption, we observe that processing simulated SHAPE data by directly averaging different replicates leads to bias. Such bias can be reduced by analyzing the data following a log transformation, either by log-averaging or Kalman filtering. Application of Kalman filtering has the additional advantage that a prior on the nucleotide reactivities can be introduced. We show that the performance of Kalman filtering is then directly dependent on the quality of that prior. We conclude the paper with guidelines on signal processing of RNA SHAPE data.
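The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a simple multiplicative log-normal noise model with hypothetical parameters (`true_reactivity`, `sigma`), uses hypothetical function names, and does not attempt the Kalman filtering step. It only compares direct averaging of replicates with log-averaging (averaging in log space and mapping back).

```python
import math
import random


def arithmetic_average(replicates):
    """Plain mean of the raw reactivities (direct averaging)."""
    return sum(replicates) / len(replicates)


def log_average(replicates):
    """Mean taken in log space, then mapped back (the geometric mean)."""
    return math.exp(sum(math.log(r) for r in replicates) / len(replicates))


def simulate_replicates(true_reactivity, sigma, n, rng):
    """Draw n replicates with multiplicative log-normal noise.

    This noise model and its parameters are assumptions for illustration.
    """
    return [true_reactivity * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]


rng = random.Random(0)      # fixed seed so the run is reproducible
true_reactivity = 1.0       # hypothetical noise-free reactivity
sigma = 0.8                 # hypothetical noise level in log space
replicates = simulate_replicates(true_reactivity, sigma, 20000, rng)

naive = arithmetic_average(replicates)
logavg = log_average(replicates)
print(f"arithmetic mean: {naive:.3f}")
print(f"log-average:     {logavg:.3f}")
```

Under this assumed model the arithmetic mean converges to the true value multiplied by exp(sigma^2 / 2), so it overestimates the reactivity, while the log-average converges to the true value itself.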
21
Choudhary K, Ruan L, Deng F, Shih N, Aviran S. SEQualyzer: interactive tool for quality control and exploratory analysis of high-throughput RNA structural profiling data. Bioinformatics 2018; 33:441-443. [PMID: 28172632 DOI: 10.1093/bioinformatics/btw627] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/01/2016] [Revised: 09/25/2016] [Accepted: 09/26/2016] [Indexed: 11/14/2022]
Abstract
Summary: To serve numerous functional roles, RNA must fold into specific structures. Determining these structures is thus of paramount importance. The recent advent of high-throughput sequencing-based structure profiling experiments has provided important insights into RNA structure and widened the scope of RNA studies. However, as a broad range of approaches continues to emerge, a universal framework is needed to quantitatively ensure consistent and high-quality data. We present SEQualyzer, a visual and interactive application that makes it easy and efficient to gauge data quality, screen for transcripts with high-quality information and identify discordant replicates in structure profiling experiments. Our methods rely on features common to a wide range of protocols and can serve as standards for quality control and analyses. Availability and Implementation: SEQualyzer is written in R, is platform-independent, and is freely available at http://bme.ucdavis.edu/aviranlab/SEQualyzer. Contact: saviran@ucdavis.edu. Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Krishna Choudhary
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Luyao Ruan
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Fei Deng
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Nathan Shih
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
22
Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures. Genes (Basel) 2018; 9:genes9060300. [PMID: 29904019 PMCID: PMC6027059 DOI: 10.3390/genes9060300] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/27/2018] [Revised: 06/04/2018] [Accepted: 06/13/2018] [Indexed: 02/03/2023]
Abstract
High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA, an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA. We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.
23
Ledda M, Aviran S. PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures. Genome Biol 2018; 19:28. [PMID: 29495968 PMCID: PMC5833111 DOI: 10.1186/s13059-018-1399-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 09/06/2017] [Accepted: 01/30/2018] [Indexed: 02/08/2023]
Abstract
Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.
Affiliation(s)
- Mirko Ledda
  - Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
  - Integrative Genetics and Genomics Graduate Group, UC Davis, 1 Shields Ave, Davis, 95616 USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
24
Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes. Nat Commun 2018; 9:606. [PMID: 29426922 PMCID: PMC5807309 DOI: 10.1038/s41467-018-02923-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 06/09/2017] [Accepted: 01/09/2018] [Indexed: 11/23/2022]
Abstract
RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies. Different experimental and computational approaches can be used to study RNA structures. Here, the authors present a computational method for data-directed reconstruction of complex RNA structure landscapes, which predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data.
25
Antunes D, Jorge NAN, Caffarena ER, Passetti F. Using RNA Sequence and Structure for the Prediction of Riboswitch Aptamer: A Comprehensive Review of Available Software and Tools. Front Genet 2018; 8:231. [PMID: 29403526 PMCID: PMC5780412 DOI: 10.3389/fgene.2017.00231] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 09/29/2017] [Accepted: 12/21/2017] [Indexed: 12/14/2022]
Abstract
RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications.
Affiliation(s)
- Deborah Antunes
  - Scientific Computing Program (PROCC), Computational Biophysics and Molecular Modeling Group, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Natasha A N Jorge
  - Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil; Laboratory of Gene Expression Regulation, Carlos Chagas Institute, Fundação Oswaldo Cruz, Curitiba, Brazil
- Ernesto R Caffarena
  - Scientific Computing Program (PROCC), Computational Biophysics and Molecular Modeling Group, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Fabio Passetti
  - Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil; Laboratory of Gene Expression Regulation, Carlos Chagas Institute, Fundação Oswaldo Cruz, Curitiba, Brazil
26
Schlick T, Pyle AM. Opportunities and Challenges in RNA Structural Modeling and Design. Biophys J 2017; 113:225-234. [PMID: 28162235 PMCID: PMC5529161 DOI: 10.1016/j.bpj.2016.12.037] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 10/11/2016] [Revised: 12/08/2016] [Accepted: 12/19/2016] [Indexed: 01/27/2023]
Abstract
We describe opportunities and challenges in RNA structural modeling and design, as recently discussed during the second Telluride Science Research Center workshop organized in June 2016. Topics include fundamental processes of RNA, such as structural assemblies (hierarchical folding, multiple conformational states and their clustering), RNA motifs, and chemical reactivity of RNA, as used for structural prediction and functional inference. We also highlight the software and database issues associated with RNA structures, such as the multiple approaches for motif annotation, the need for frequent database updating, and the importance of quality control of RNA structures. We discuss various modeling approaches for structure prediction, mechanistic analysis of RNA reactions, and RNA design, and the complementary roles that both atomistic and coarse-grained approaches play in such simulations. Collectively, as scientists from varied disciplines become familiar and drawn into these unique challenges, new approaches and collaborative efforts will undoubtedly be catalyzed.
Affiliation(s)
- Tamar Schlick
  - Department of Chemistry, New York University, New York, New York; Courant Institute of Mathematical Sciences, New York University, New York, New York.
- Anna Marie Pyle
  - Department of Molecular and Cellular and Developmental Biology and Department of Chemistry, Yale University; Howard Hughes Medical Institute, New Haven, Connecticut.
27
Tan Z, Sharma G, Mathews DH. Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data. Biophys J 2017; 113:330-338. [PMID: 28735622 DOI: 10.1016/j.bpj.2017.06.039] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/01/2017] [Revised: 06/07/2017] [Accepted: 06/19/2017] [Indexed: 10/19/2022]
Abstract
Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated through the incorporation of experimentally obtained structural information, for instance using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data are currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefit not only the prediction for the specific sequence that was mapped but also other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize basepairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided basepairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improve structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package.
Affiliation(s)
- Zhen Tan
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Gaurav Sharma
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Electrical and Computer Engineering, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
- David H Mathews
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
28
Dawn of the in vivo RNA structurome and interactome. Biochem Soc Trans 2017; 44:1395-1410. [PMID: 27911722 DOI: 10.1042/bst20160075] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 03/22/2016] [Revised: 06/19/2016] [Accepted: 07/04/2016] [Indexed: 12/11/2022]
Abstract
RNA is one of the most fascinating biomolecules in living systems given its structural versatility to fold into elaborate architectures for important biological functions such as gene regulation, catalysis, and information storage. Knowledge of RNA structures and interactions can provide deep insights into their functional roles in vivo. For decades, RNA structural studies have been conducted on a transcript-by-transcript basis. The advent of next-generation sequencing (NGS) has enabled the development of transcriptome-wide structural probing methods to profile the global landscape of RNA structures and interactions, also known as the RNA structurome and interactome, which transformed our understanding of the RNA structure-function relationship on a transcriptomic scale. In this review, molecular tools and NGS methods used for RNA structure probing are presented, novel insights uncovered by RNA structurome and interactome studies are highlighted, and perspectives on current challenges and potential future directions are discussed. A more complete understanding of the RNA structures and interactions in vivo will help illuminate the novel roles of RNA in gene regulation, development, and diseases.
29
Shi J, Li X, Dong M, Graham M, Yadav N, Liang C. JNSViewer-A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures. PLoS One 2017; 12:e0179040. [PMID: 28582416 PMCID: PMC5459502 DOI: 10.1371/journal.pone.0179040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/22/2017] [Accepted: 05/23/2017] [Indexed: 11/19/2022]
Abstract
Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html.
Affiliation(s)
- Jieming Shi
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
- Xi Li
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
  - College of Information Science and Engineering, Guangxi University for Nationalities, Nanning, Guangxi, China
- Min Dong
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
  - Department of Automation, Xiamen University, Fujian, China
- Mitchell Graham
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
- Nehul Yadav
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
- Chun Liang
  - Department of Biology, Miami University, Oxford, Ohio, United States of America
30
Abstract
In addition to continuous rapid progress in RNA structure determination, probing, and biophysical studies, the past decade has seen remarkable advances in the development of a new generation of RNA folding theories and models. In this article, we review RNA structure prediction models and models for ion-RNA and ligand-RNA interactions. These new models are becoming increasingly important for a mechanistic understanding of RNA function and quantitative design of RNA nanotechnology. We focus on new methods for physics-based, knowledge-based, and experimental data-directed modeling for RNA structures and explore the new theories for the predictions of metal ion and ligand binding sites and metal ion-dependent RNA stabilities. The integration of these new methods with theories about the cellular environment effects in RNA folding, such as molecular crowding and cotranscriptional kinetic effects, may ultimately lead to an all-encompassing RNA folding model.
Affiliation(s)
- Li-Zhen Sun
  - Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211
- Dong Zhang
  - Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211
31
Li B, Tambe A, Aviran S, Pachter L. PROBer Provides a General Toolkit for Analyzing Sequencing-Based Toeprinting Assays. Cell Syst 2017; 4:568-574.e7. [PMID: 28501650 DOI: 10.1016/j.cels.2017.04.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 07/06/2016] [Revised: 12/19/2016] [Accepted: 04/13/2017] [Indexed: 11/19/2022]
Abstract
A number of sequencing-based transcriptase drop-off assays have recently been developed to probe post-transcriptional dynamics of RNA-protein interaction, RNA structure, and RNA modification. Although these assays survey a diverse set of epitranscriptomic marks, we use the term toeprinting assays since they share methodological similarities. Their interpretation is predicated on addressing a similar computational challenge: how to learn isoform-specific chemical modification profiles in the face of complex read multi-mapping. We introduce PROBer, a statistical model and associated software, that addresses this challenge for the analysis of toeprinting assays. PROBer takes sequencing data as input and outputs estimated transcript abundances and isoform-specific modification profiles. Results on both simulated and biological data demonstrate that PROBer significantly outperforms individual methods tailored for specific toeprinting assays. Since the space of toeprinting assays is ever expanding and these assays are likely to be performed and analyzed together, we believe PROBer's unified data analysis solution will be valuable to the RNA community.
Affiliation(s)
- Bo Li
- Center for RNA Systems Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Akshay Tambe
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Lior Pachter
- Departments of Biology and Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA.
|
32
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. Quantitative Biology 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
- Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
|
33
|
Krokhotin A, Mustoe AM, Weeks KM, Dokholyan NV. Direct identification of base-paired RNA nucleotides by correlated chemical probing. RNA 2017; 23:6-13. [PMID: 27803152 PMCID: PMC5159650 DOI: 10.1261/rna.058586.116] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 08/03/2016] [Accepted: 10/28/2016] [Indexed: 05/04/2023]
Abstract
Many RNA molecules fold into complex secondary and tertiary structures that play critical roles in biological function. Among the best-established methods for examining RNA structure are chemical probing experiments, which can report on local nucleotide structure in a concise and extensible manner. While probing data are highly useful for inferring overall RNA secondary structure, these data do not directly measure through-space base-pairing interactions. We recently introduced an approach for single-molecule correlated chemical probing with dimethyl sulfate (DMS) that measures RNA interaction groups by mutational profiling (RING-MaP). RING-MaP experiments reveal diverse through-space interactions corresponding to both secondary and tertiary structure. Here we develop a framework for using RING-MaP data to directly and robustly identify canonical base pairs in RNA. When applied to three representative RNAs, this framework identified 20%-50% of accepted base pairs with a <10% false discovery rate, allowing detection of 88% of duplexes containing four or more base pairs, including pseudoknotted pairs. We further show that base pairs determined from RING-MaP analysis significantly improve secondary structure modeling. RING-MaP-based correlated chemical probing represents a direct, experimentally concise, and accurate approach for detection of individual base pairs and helices and should greatly facilitate structure modeling for complex RNAs.
Affiliation(s)
- Andrey Krokhotin
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Anthony M Mustoe
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Kevin M Weeks
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Nikolay V Dokholyan
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
|
34
|
Sloma MF, Mathews DH. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures. RNA 2016; 22:1808-1818. [PMID: 27852924 PMCID: PMC5113201 DOI: 10.1261/rna.053694.115] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 07/31/2015] [Accepted: 09/08/2016] [Indexed: 05/10/2023]
Abstract
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures.
Affiliation(s)
- Michael F Sloma
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
- David H Mathews
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
|
35
|
Abstract
Single-stranded RNA molecules fold into extraordinarily complicated secondary and tertiary structures as a result of intramolecular base pairing. In vivo, these RNA structures are not static. Instead, they are remodeled in response to changes in the prevailing physicochemical environment of the cell and as a result of intermolecular base pairing and interactions with RNA-binding proteins. Remarkable technical advances now allow us to probe RNA secondary structure at single-nucleotide resolution and genome-wide, both in vitro and in vivo. These data sets provide new glimpses into the RNA universe. Analyses of RNA structuromes in HIV, yeast, Arabidopsis, and mammalian cells and tissues have revealed regulatory effects of RNA structure on messenger RNA (mRNA) polyadenylation, splicing, translation, and turnover. Application of new methods for genome-wide identification of mRNA modifications, particularly methylation and pseudouridylation, has shown that the RNA "epitranscriptome" both influences and is influenced by RNA structure. In this review, we describe newly developed genome-wide RNA structure-probing methods and synthesize the information emerging from their application.
Affiliation(s)
- Philip C Bevilacqua
- Department of Chemistry; Department of Biochemistry and Molecular Biology; Center for RNA Molecular Biology
- Zhao Su
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802;
- Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802;
|
36
|
Choudhary K, Shih NP, Deng F, Ledda M, Li B, Aviran S. Metrics for rapid quality control in RNA structure probing experiments. Bioinformatics 2016; 32:3575-3583. [PMID: 27497441 DOI: 10.1093/bioinformatics/btw501] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/02/2015] [Revised: 07/02/2016] [Accepted: 07/26/2016] [Indexed: 11/14/2022]
Abstract
MOTIVATION The diverse functionalities of RNA can be attributed to its capacity to form complex and varied structures. The recent proliferation of new structure probing techniques coupled with high-throughput sequencing has helped RNA studies expand in both scope and depth. Despite differences in techniques, most experiments face similar challenges in reproducibility due to the stochastic nature of chemical probing and sequencing. As these protocols expand to transcriptome-wide studies, quality control becomes a more daunting task. General and efficient methodologies are needed to quantify variability and quality in the wide range of current and emerging structure probing experiments. RESULTS We develop metrics to rapidly and quantitatively evaluate data quality from structure probing experiments, demonstrating their efficacy on both small synthetic libraries and transcriptome-wide datasets. We use a signal-to-noise ratio concept to evaluate replicate agreement, which has the capacity to identify high-quality data. We also consider and compare two methods to assess variability inherent in probing experiments, which we then utilize to evaluate the coverage adjustments needed to meet desired quality. The developed metrics and tools will be useful in summarizing large-scale datasets and will help standardize quality control in the field. AVAILABILITY AND IMPLEMENTATION The data and methods used in this article are freely available at: http://bme.ucdavis.edu/aviranlab/SPEQC_software CONTACT saviran@ucdavis.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Krishna Choudhary
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Nathan P Shih
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Fei Deng
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Mirko Ledda
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Bo Li
- Center for RNA Systems Biology, University of California at Berkeley, Berkeley, CA, USA
- Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
|
37
|
Deng F, Ledda M, Vaziri S, Aviran S. Data-directed RNA secondary structure prediction using probabilistic modeling. RNA 2016; 22:1109-1119. [PMID: 27251549 PMCID: PMC4931104 DOI: 10.1261/rna.055756.115] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 12/21/2015] [Accepted: 04/26/2016] [Indexed: 06/05/2023]
Abstract
Structure dictates the function of many RNAs, but secondary RNA structure analysis is either labor intensive and costly or relies on computational predictions that are often inaccurate. These limitations are alleviated by integration of structure probing data into prediction algorithms. However, existing algorithms are optimized for a specific type of probing data. Recently, new chemistries combined with advances in sequencing have facilitated structure probing at unprecedented scale and sensitivity. These novel technologies and anticipated wealth of data highlight a need for algorithms that readily accommodate more complex and diverse input sources. We implemented and investigated a recently outlined probabilistic framework for RNA secondary structure prediction and extended it to accommodate further refinement of structural information. This framework utilizes direct likelihood-based calculations of pseudo-energy terms per considered structural context and can readily accommodate diverse data types and complex data dependencies. We use real data in conjunction with simulations to evaluate performances of several implementations and to show that proper integration of structural contexts can lead to improvements. Our tests also reveal discrepancies between real data and simulations, which we show can be alleviated by refined modeling. We then propose statistical preprocessing approaches to standardize data interpretation and integration into such a generic framework. We further systematically quantify the information content of data subsets, demonstrating that high reactivities are major drivers of SHAPE-directed predictions and that better understanding of less informative reactivities is key to further improvements. Finally, we provide evidence for the adaptive capability of our framework using mock probe simulations.
Affiliation(s)
- Fei Deng
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Mirko Ledda
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Sana Vaziri
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
- Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, California 95616, USA
|
38
|
Abstract
Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.
|
39
|
McFadden EJ, Hargrove AE. Biochemical Methods To Investigate lncRNA and the Influence of lncRNA:Protein Complexes on Chromatin. Biochemistry 2016; 55:1615-30. [PMID: 26859437 DOI: 10.1021/acs.biochem.5b01141] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 02/07/2023]
Abstract
Long noncoding RNAs (lncRNAs), defined as nontranslated transcripts greater than 200 nucleotides in length, are often differentially expressed throughout developmental stages, tissue types, and disease states. The identification, visualization, and suppression/overexpression of these sequences have revealed impacts on a wide range of biological processes, including epigenetic regulation. Biochemical investigations on select systems have revealed striking insight into the biological roles of lncRNAs and lncRNA:protein complexes, which in turn prompt even more unanswered questions. To begin, multiple protein- and RNA-centric technologies have been employed to isolate lncRNA:protein and lncRNA:chromatin complexes. LncRNA interactions with the multi-subunit protein complex PRC2, which acts as a transcriptional silencer, represent some of the few cases where the binding affinity, selectivity, and activity of a lncRNA:protein complex have been investigated. At the same time, recent reports of full-length lncRNA secondary structures suggest the formation of complex structures with multiple independent folding domains and pave the way for more detailed structural investigations and predictions of lncRNA three-dimensional structure. This review will provide an overview of the methods and progress made to date as well as highlight new methods that promise to further inform the molecular recognition, specificity, and function of lncRNAs.
Affiliation(s)
- Emily J McFadden
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina 27710, United States
- Amanda E Hargrove
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina 27710, United States; Department of Chemistry, Duke University, 124 Science Drive, Durham, North Carolina 27708, United States
|
40
|
Flynn RA, Zhang QC, Spitale RC, Lee B, Mumbach MR, Chang HY. Transcriptome-wide interrogation of RNA secondary structure in living cells with icSHAPE. Nat Protoc 2016; 11:273-90. [PMID: 26766114 PMCID: PMC4896316 DOI: 10.1038/nprot.2016.011] [Citation(s) in RCA: 112] [Impact Index Per Article: 12.4] [Indexed: 12/29/2022]
Abstract
icSHAPE (in vivo click selective 2′-hydroxyl acylation and profiling experiment) captures RNA secondary structure at a transcriptome-wide level by measuring nucleotide flexibility at base resolution. Living cells are treated with the icSHAPE chemical NAI-N3 followed by selective chemical enrichment of NAI-N3-modified RNA, which provides an improved signal-to-noise ratio compared with similar methods leveraging deep sequencing. Purified RNA is then reverse-transcribed to produce cDNA, with SHAPE-modified bases leading to truncated cDNA. After deep sequencing of cDNA, computational analysis yields flexibility scores for every base across the starting RNA population. The entire experimental procedure can be completed in ∼5 d, and the sequencing and bioinformatics data analysis take an additional 4-5 d with no extensive computational skills required. Comparing in vivo and in vitro icSHAPE measurements can reveal in vivo RNA-binding protein imprints or facilitate the dissection of RNA post-transcriptional modifications. icSHAPE reactivities can additionally be used to constrain and improve RNA secondary structure prediction models.
Affiliation(s)
- Ryan A Flynn
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
- Qiangfeng Cliff Zhang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
- Robert C Spitale
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
- Byron Lee
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
- Maxwell R Mumbach
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
- Howard Y Chang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California, USA
|
41
|
Abstract
Experimental probing data can be used to improve the accuracy of RNA secondary structure prediction. The software package RNAstructure can take advantage of enzymatic cleavage data, FMN cleavage data, traditional chemical modification reactivity data, and SHAPE reactivity data for secondary structure modeling. This chapter provides protocols for using experimental probing data with RNAstructure to restrain or constrain RNA secondary structure prediction.
Affiliation(s)
- Zhenjiang Zech Xu
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
|
42
|
Somarowthu S. Progress and Current Challenges in Modeling Large RNAs. J Mol Biol 2015; 428:736-747. [PMID: 26585404 DOI: 10.1016/j.jmb.2015.11.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 08/06/2015] [Revised: 11/03/2015] [Accepted: 11/08/2015] [Indexed: 12/21/2022]
Abstract
Recent breakthroughs in next-generation sequencing technologies have led to the discovery of several classes of non-coding RNAs (ncRNAs). It is now apparent that RNA molecules are not only just carriers of genetic information but also key players in many cellular processes. While there has been a rapid increase in the number of ncRNA sequences deposited in various databases over the past decade, the biological functions of these ncRNAs are largely not well understood. Similar to proteins, RNA molecules carry out a function by forming specific three-dimensional structures. Understanding the function of a particular RNA therefore requires a detailed knowledge of its structure. However, determining experimental structures of RNA is extremely challenging. In fact, RNA-only structures represent just 1% of the total structures deposited in the PDB. Thus, computational methods that predict three-dimensional RNA structures are in high demand. Computational models can provide valuable insights into structure-function relationships in ncRNAs and can aid in the development of functional hypotheses and experimental designs. In recent years, a set of diverse RNA structure prediction tools have become available, which differ in computational time, input data and accuracy. This review discusses the recent progress and challenges in RNA structure prediction methods.
Affiliation(s)
- Srinivas Somarowthu
- Department of Molecular, Cellular and Developmental Biology, Yale University, 219 Prospect Street, Kline Biology Tower, New Haven, CT 06511, USA.
|