1
Lee S, Yan S, Dey A, Laederach A, Schlick T. An intricate balancing act: Upstream and downstream frameshift co-regulatory elements. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.27.599960. [PMID: 38979256 PMCID: PMC11230384 DOI: 10.1101/2024.06.27.599960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Targeting ribosomal frameshifting has emerged as a potential therapeutic intervention strategy against Covid-19. During ribosomal translation, a fraction of elongating ribosomes slips by one base in the 5' direction and enters a new reading frame for viral protein synthesis. Any interference with this process profoundly affects viral replication and propagation. In SARS-CoV-2, two RNA sites associated with ribosomal frameshifting are positioned 5' and 3' of the frameshifting residues. Although much attention has focused on the 3' frameshift element (FSE), the 5' stem-loop (attenuator hairpin, AH) can also play a role. The formation of AH has been suggested to occur as refolding of the 3' RNA structure is triggered by ribosomal unwinding. However, the attenuation activity and the relationship between the two regions are unknown. To gain more insight into these two related viral RNAs and to further enrich our understanding of ribosomal frameshifting for SARS-CoV-2, we explore the RNA folding of both 5' and 3' regions associated with frameshifting. Using our graph-theory-based modeling tools to represent RNA secondary structures, "RAG" (RNA-As-Graphs), and conformational landscapes to analyze length-dependent conformational distributions, we show that AH coexists with the 3-stem pseudoknot of the 3' FSE (graph 3_6 in our dual graph notation) and with an alternative pseudoknot (graph 3_3), but is less likely to coexist with other 3' FSE alternative folds (such as the 3-way junction 3_5). This is because an alternative length-dependent Stem 1 (AS1) can disrupt the FSE pseudoknots and trigger other folds. In addition, we design four mutants for long lengths that stabilize or disrupt AH, AS1 or the FSE pseudoknot to illustrate the deduced AH/AS1 roles and favor the 3_5, 3_6 or stem-loop. These mutants further show how a strengthened pseudoknot can result from a weakened AS1, while a dominant stem-loop occurs with a strengthened AS1. 
These structural and mutational insights into both ends of the FSE in SARS-CoV-2 advance our understanding of the SARS-CoV-2 frameshifting mechanism by suggesting a sequence of length-dependent folds, which in turn define potential therapeutic intervention techniques involving both elements. Our work also highlights the complexity of viral landscapes with length-dependent folds, and challenges in analyzing these multiple conformations.
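To make the one-base slip in the 5' direction concrete, the following is a minimal Python sketch, not the authors' method. The sequence is a toy string rather than SARS-CoV-2 data, and the helper name `codons_with_minus1_shift` and its `slip_at` parameter are hypothetical. It reads codons in the original frame up to the slip position, then backs up one base and continues reading in the new frame.

```python
def codons_with_minus1_shift(seq, slip_at):
    """Split seq into codons, slipping back one base (5' direction) at slip_at.

    slip_at is the index (a multiple of 3) where the next 0-frame codon
    would start; after the slip, reading resumes at slip_at - 1.
    Only complete codons are returned.
    """
    if slip_at % 3 != 0 or not 3 <= slip_at <= len(seq):
        raise ValueError("slip_at must be a multiple of 3 inside the sequence")
    # Codons read in the original (0) frame before the slip.
    before = [seq[i:i + 3] for i in range(0, slip_at, 3)]
    # -1 slip: restart one base toward the 5' end.
    start = slip_at - 1
    after = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    return before, after


toy = "AUGUUUAAACGGGCC"  # toy sequence, assumption for illustration only
zero_frame, shifted_frame = codons_with_minus1_shift(toy, 9)
print(zero_frame)     # codons read before the slip
print(shifted_frame)  # codons read in the -1 frame
```

In this toy case the base at index 8 is read twice: once as the last base of the final 0-frame codon and again as the first base of the first -1-frame codon.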
Affiliation(s)
- Samuel Lee
- Department of Chemistry, New York University, New York, 10003, NY, U.S.A
- Shuting Yan
- Department of Chemistry, New York University, New York, 10003, NY, U.S.A
- Abhishek Dey
- Department of Biotechnology, National Institute of Pharmaceutical Education and Research-Raebareli (NIPER-R), Lucknow, 226002, Uttar Pradesh, India
- Alain Laederach
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, 27599, NC, U.S.A
- Tamar Schlick
- Department of Chemistry, New York University, New York, 10003, NY, U.S.A
- Courant Institute of Mathematical Sciences, New York University, New York, 10012, NY, U.S.A
- NYU-ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai, 200062, P.R.China
- NYU Simons Center for Computational Physical Chemistry, New York University, New York, 10003, NY, U.S.A
2
Trinity L, Stege U, Jabbari H. Tying the knot: Unraveling the intricacies of the coronavirus frameshift pseudoknot. PLoS Comput Biol 2024; 20:e1011787. [PMID: 38713726 PMCID: PMC11108256 DOI: 10.1371/journal.pcbi.1011787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 05/21/2024] [Accepted: 04/27/2024] [Indexed: 05/09/2024]
Abstract
Understanding and targeting functional RNA structures towards treatment of coronavirus infection can help us to prepare for novel variants of SARS-CoV-2 (the virus causing COVID-19), and any other coronaviruses that could emerge via human-to-human transmission or potential zoonotic (inter-species) events. Leveraging the fact that all coronaviruses use a mechanism known as -1 programmed ribosomal frameshifting (-1 PRF) to replicate, we apply algorithms to predict the most energetically favourable secondary structures (each nucleotide involved in at most one pairing) that may be involved in regulating the -1 PRF event in coronaviruses, especially SARS-CoV-2. We compute previously unknown most stable structure predictions for the frameshift site of coronaviruses via hierarchical folding, a biologically motivated framework where initial non-crossing structure folds first, followed by subsequent, possibly crossing (pseudoknotted), structures. Using mutual information from 181 coronavirus sequences, in conjunction with the algorithm KnotAli, we compute secondary structure predictions for the frameshift site of different coronaviruses. We then utilize the Shapify algorithm to obtain most stable SARS-CoV-2 secondary structure predictions guided by frameshift sequence-specific and genome-wide experimental data. We build on our previous secondary structure investigation of the singular SARS-CoV-2 68 nt frameshift element sequence, by using Shapify to obtain predictions for 132 extended sequences and including covariation information. Previous investigations have not applied hierarchical folding to extended length SARS-CoV-2 frameshift sequences. By doing so, we simulate the effects of ribosome interaction with the frameshift site, providing insight to biological function. 
We contribute in-depth discussion to contextualize secondary structure dual-graph motifs for SARS-CoV-2, highlighting the energetic stability of the previously identified 3_8 motif alongside the known dominant 3_3 and 3_6 (native-type) -1 PRF structures. Using a combination of thermodynamic methods and sequence covariation, our novel predictions suggest function of the attenuator hairpin via previously unknown pseudoknotted base pairing. While certain initial RNA folding is consistent, other pseudoknotted base pairs form which indicate potential conformational switching between the two structures.
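The two structural notions used in this abstract, a secondary structure in which each nucleotide takes part in at most one pairing and a pseudoknot as a set of crossing pairs, can be sketched in a few lines of Python. This is an illustrative check only, not KnotAli or Shapify; the function names and the toy pair lists are assumptions.

```python
def is_valid_structure(pairs):
    """Return True if every position appears in at most one base pair."""
    seen = set()
    for i, j in pairs:
        if i == j or i in seen or j in seen:
            return False
        seen.update((i, j))
    return True


def crossing_pairs(pairs):
    """Return all pairs of base pairs (i, j), (k, l) with i < k < j < l.

    Any such crossing means the structure is pseudoknotted.
    """
    ordered = sorted((min(p), max(p)) for p in pairs)
    crossings = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


# Toy structures (positions are 0-based indices into a hypothetical sequence).
nested = [(0, 9), (1, 8), (2, 7)]          # simple hairpin stem
knotted = [(0, 6), (1, 5), (3, 10), (4, 9)]  # two stems that interleave

print(is_valid_structure(nested), crossing_pairs(nested))
print(is_valid_structure(knotted), len(crossing_pairs(knotted)))
```

A hierarchical folding scheme as described above would first settle on a non-crossing set like `nested` and only then consider adding pairs that cross it.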
Affiliation(s)
- Luke Trinity
- Department of Computer Science, University of Victoria, Victoria, British Columbia, Canada
- Ulrike Stege
- Department of Computer Science, University of Victoria, Victoria, British Columbia, Canada
- Hosna Jabbari
- Department of Biomedical Engineering, University of Alberta, Edmonton, Alberta, Canada
- Institute on Aging and Lifelong Health, Victoria, British Columbia, Canada
3
Pietrek LM, Stelzl LS, Hummer G. Hierarchical Assembly of Single-Stranded RNA. J Chem Theory Comput 2024; 20:2246-2260. [PMID: 38361440 PMCID: PMC10938505 DOI: 10.1021/acs.jctc.3c01049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 12/09/2023] [Accepted: 01/25/2024] [Indexed: 02/17/2024]
Abstract
Single-stranded RNA (ssRNA) plays a major role in the flow of genetic information (most notably, in the form of messenger RNA (mRNA)) and in the regulation of biological processes. The highly dynamic nature of chains of unpaired nucleobases challenges structural characterizations of ssRNA by experiments or molecular dynamics (MD) simulations alike. Here, we use hierarchical chain growth (HCG) to construct ensembles of ssRNA chains. HCG assembles the structures of protein and nucleic acid chains from fragment libraries created by MD simulations. Applied to homo- and heteropolymeric ssRNAs of different lengths, we find that HCG produces structural ensembles that overall are in good agreement with diverse experiments, including nuclear magnetic resonance (NMR), small-angle X-ray scattering (SAXS), and single-molecule Förster resonance energy transfer (FRET). The agreement can be further improved by ensemble refinement using Bayesian inference of ensembles (BioEn). HCG can also be used to assemble RNA structures that combine base-paired and base-unpaired regions, as illustrated for the 5' untranslated region (UTR) of SARS-CoV-2 RNA.
Affiliation(s)
- Lisa M. Pietrek
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
- Lukas S. Stelzl
- Faculty of Biology, Johannes Gutenberg University Mainz, Gresemundweg 2, 55128 Mainz, Germany
- KOMET 1, Institute of Physics, Johannes Gutenberg University Mainz, 55099 Mainz, Germany
- Institute of Molecular Biology (IMB), 55128 Mainz, Germany
- Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
- Institute for Biophysics, Goethe University, Max-von-Laue-Straße 9, 60438 Frankfurt am Main, Germany
4
Quadrini M, Tesei L, Merelli E. Automatic generation of pseudoknotted RNAs taxonomy. BMC Bioinformatics 2023; 23:575. [PMID: 37322429 DOI: 10.1186/s12859-023-05362-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND The ability to compare RNA secondary structures is important in understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures due to the difficulty of mapping pseudoknots in classical tree representations. Some approaches permit clustering of pseudoknotted RNAs, but there is no general framework for evaluating their performance. RESULTS We introduce an evaluation framework based on a similarity/dissimilarity measure obtained by a comparison method and agglomerative clustering. Their combination automatically partitions a set of molecules into groups. To illustrate the framework we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five different comparison methods from the literature that are able to manage pseudoknots. For each method we cluster the molecules in the benchmark to obtain the taxa at the rank phylum according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and we compare their suitability to reconstruct the taxa.
Affiliation(s)
- Michela Quadrini
- School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Luca Tesei
- School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Emanuela Merelli
- School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
5
Abstract
Developing mathematical representations of biological systems that can allow predictions is a challenging and important research goal. It is demonstrated here how the ribosome, the nano-machine responsible for synthesizing all proteins necessary for cellular life, can be represented as a bipartite network. Ten ribosomal structures from Bacteria and six from Eukarya are explored. Ribosomal networks are found to exhibit unique properties despite variations in the nodes and edges of the different graphs. The ribosome is shown to exhibit very large topological redundancies, demonstrating mathematical resiliency. These results can potentially explain how it can function consistently despite changes in composition and connectivity. Furthermore, this representation can be used to analyze ribosome function within the large machinery of network theory, where the degrees of freedom are the possible interactions, and can be used to provide new insights for translation regulation and therapeutics.
6
RNA-As-Graphs Motif Atlas: Dual Graph Library of RNA Modules and Viral Frameshifting-Element Applications. Int J Mol Sci 2022; 23:ijms23169249. [PMID: 36012512 PMCID: PMC9408923 DOI: 10.3390/ijms23169249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 08/13/2022] [Accepted: 08/14/2022] [Indexed: 11/25/2022]
Abstract
RNA motif classification is important for understanding structure/function connections and building phylogenetic relationships. Using our coarse-grained RNA-As-Graphs (RAG) representations, we identify recurrent dual graph motifs in experimentally solved RNA structures based on an improved search algorithm that finds and ranks independent RNA substructures. Our expanded list of 183 existing dual graph motifs reveals five common motifs found in transfer RNA, riboswitch, and ribosomal 5S RNA components. Moreover, we identify three motifs for available viral frameshifting RNA elements, suggesting a correlation between viral structural complexity and frameshifting efficiency. We further partition the RNA substructures into 1844 distinct submotifs, with pseudoknots and junctions retained intact. Common modules are internal loops and three-way junctions, and three submotifs are associated with riboswitches that bind nucleotides, ions, and signaling molecules. Together, our library of existing RNA motifs and submotifs adds to the growing universe of RNA modules, and provides a resource of structures and substructures for novel RNA design.
7
Zhang T, Jiang W, Liao F, Zhu P, Guo L, Zhao Z, Liu Y, Huang X, Zhou N. Identification of the key exosomal lncRNAs/mRNAs in the serum during distraction osteogenesis. J Orthop Surg Res 2022; 17:291. [PMID: 35643547 PMCID: PMC9148531 DOI: 10.1186/s13018-022-03163-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/13/2022] [Accepted: 05/03/2022] [Indexed: 11/13/2022]
Abstract
Background Distraction osteogenesis (DO), a bone regenerative process, is not only extremely effective, but also achieves an osteogenesis rate far beyond that of ordinary bone fracture (BF) healing. Exosomes (Exo) are thought to play a part in bone regeneration and healing as key players in cell-to-cell contact. The objective of this work was to determine whether exosomes derived from DO and BF serum could stimulate osteogenic differentiation in these two processes, and if so, which genes could be involved. Methods The osteogenesis in DO-gap or BF-gap was evaluated using radiographic analysis and histological analysis. On the 14th postoperative day, DO-Exos and BF-Exos were isolated and cocultured with jaw bone marrow mesenchymal stem cells (JBMMSCs). Proliferation, migration and osteogenic differentiation of JBMMSCs were ascertained, after which exosome RNA-seq was performed to identify the relevant genes. Results Radiographic and histological analyses showed that osteogenesis was remarkably accelerated in DO-gap in comparison with BF-gap. Both types of Exos were taken up by JBMMSCs, and the migration and osteogenic differentiation of JBMMSCs were also seen to improve. However, the proliferation showed no significant difference. Finally, exosome RNA-seq revealed that the lncRNA MSTRG.532277.1 and the mRNA F-box and leucine-rich repeat protein 14 (FBXL14) may play a key role in DO. Conclusions Our findings suggest that exosomes from serum exert a critical effect on the rapid osteogenesis in DO. This promoting effect might be related to the co-expression of MSTRG.532277.1 and FBXL14. On the whole, these findings provide new insights into bone regeneration, thereby outlining possible therapeutic targets for clinical intervention.
8
Leeder WM, Geyer FK, Göringer HU. Fuzzy RNA recognition by the Trypanosoma brucei editosome. Nucleic Acids Res 2022; 50:5818-5833. [PMID: 35580050 PMCID: PMC9178004 DOI: 10.1093/nar/gkac357] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 04/20/2022] [Accepted: 04/26/2022] [Indexed: 11/30/2022]
Abstract
The assembly of high molecular mass ribonucleoprotein complexes typically relies on the binary interaction of defined RNA sequences or precisely folded RNA motifs with dedicated RNA-binding domains on the protein side. Here we describe a new molecular recognition principle of RNA molecules by a high molecular mass protein complex. By chemically probing the solvent accessibility of mitochondrial pre-mRNAs when bound to the Trypanosoma brucei editosome, we identified multiple similar but non-identical RNA motifs as editosome contact sites. However, by treating the different motifs as mathematical graph objects we demonstrate that they fit a consensus 2D-graph consisting of 4 vertices (V) and 3 edges (E) with a Laplacian eigenvalue of 0.5477 (λ2). We establish that synthetic 4V(3E)-RNAs are sufficient to compete for the editosomal pre-mRNA binding site and that they inhibit RNA editing in vitro. Furthermore, we demonstrate that only two topological indices are necessary to predict the binding of any RNA motif to the editosome with a high level of confidence. Our analysis corroborates that the editosome has adapted to the structural multiplicity of the mitochondrial mRNA folding space by recognizing a fuzzy continuum of RNA folds that fit a consensus graph descriptor.
Affiliation(s)
- Felix Klaus Geyer
- Molecular Genetics, Technical University Darmstadt, 64287 Darmstadt, Germany
9
Yao Q, Zhang X, Chen D. Emerging Roles and Mechanisms of lncRNA FOXD3-AS1 in Human Diseases. Front Oncol 2022; 12:848296. [PMID: 35280790 PMCID: PMC8914342 DOI: 10.3389/fonc.2022.848296] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/04/2022] [Accepted: 02/01/2022] [Indexed: 01/02/2023]
Abstract
Numerous long noncoding RNAs (lncRNAs) have been identified as powerful regulators of human diseases. The lncRNA FOXD3-AS1 is a novel lncRNA that was recently shown to exert imperative roles in the initialization and progression of several diseases. Emerging studies have shown aberrant expression of FOXD3-AS1 and close correlation with pathophysiological traits of numerous diseases, particularly cancers. More importantly, FOXD3-AS1 was also found to ubiquitously impact a range of biological functions. This study aims to summarize the expression, associated clinicopathological features, major functions and molecular mechanisms of FOXD3-AS1 in human diseases and to explore its possible clinical applications.
Affiliation(s)
- Qinfan Yao
- Kidney Disease Center, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Key Laboratory of Kidney Disease Prevention and Control Technology, Hangzhou, China
- National Key Clinical Department of Kidney Diseases, Institute of Nephrology, Zhejiang University, Hangzhou, China
- Zhejiang Clinical Research Center of Kidney and Urinary System Disease, Hangzhou, China
- Xiuyuan Zhang
- Kidney Disease Center, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Key Laboratory of Kidney Disease Prevention and Control Technology, Hangzhou, China
- National Key Clinical Department of Kidney Diseases, Institute of Nephrology, Zhejiang University, Hangzhou, China
- Zhejiang Clinical Research Center of Kidney and Urinary System Disease, Hangzhou, China
- Dajin Chen
- Kidney Disease Center, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Key Laboratory of Kidney Disease Prevention and Control Technology, Hangzhou, China
- National Key Clinical Department of Kidney Diseases, Institute of Nephrology, Zhejiang University, Hangzhou, China
- Zhejiang Clinical Research Center of Kidney and Urinary System Disease, Hangzhou, China
- *Correspondence: Dajin Chen,
10
Mak CH, Phan ENH. Diagrammatic approaches to RNA structures with trinucleotide repeats. Biophys J 2021; 120:2343-2354. [PMID: 33887227 PMCID: PMC8390803 DOI: 10.1016/j.bpj.2021.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2020] [Revised: 04/07/2021] [Accepted: 04/09/2021] [Indexed: 11/30/2022]
Abstract
Trinucleotide repeat expansion disorders are associated with the overexpansion of (CNG) repeats on the genome. Messenger RNA transcripts of sequences with greater than 60–100 (CNG) tandem units have been implicated in trinucleotide repeat expansion disorder pathogenesis. In this work, we develop a diagrammatic theory to study the structural diversity of these (CNG)n RNA sequences. Representing structural elements on the chain’s conformation by a set of graphs and employing elementary diagrammatic methods, we have formulated a renormalization procedure to re-sum these graphs and arrive at a closed-form expression for the ensemble partition function. With a simple approximation for the renormalization and applied to extended (CNG)n sequences, this theory can comprehensively capture an infinite set of conformations with any number and any combination of duplexes, hairpins, multiway junctions, and quadruplexes. To quantify the diversity of different (CNG)n ensembles, the analytical equations derived from the diagrammatic theory were solved numerically to derive equilibrium estimates for the secondary structural contents of the chains. The results suggest that the structural ensembles of (CNG)n repeat sequence with n ∼60 are surprisingly diverse, and the distribution is sensitive to the ability of the N nucleotide to make noncanonical pairs and whether the (CNG)n sequence can sustain stable quadruplexes. The results show how perturbations in the form of biases on the stabilities of the various structural motifs, duplexes, junctions, helices, and quadruplexes could affect the secondary structures of the chains and how these structures may switch when they are perturbed.
Affiliation(s)
- Chi H Mak
- Department of Chemistry, Center of Applied Mathematical Sciences and Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California
- Ethan N H Phan
- Department of Chemistry, University of Southern California, Los Angeles, California
11
Abstract
Novel RNA motif design is of great practical importance for technology and medicine. Increasingly, computational design plays an important role in such efforts. Our coarse-grained RAG (RNA-As-Graphs) framework offers strategies for enumerating the universe of RNA 2D folds, selecting "RNA-like" candidates for design, and determining sequences that fold onto these candidates. In RAG, RNA secondary structures are represented as tree or dual graphs. Graphs with known RNA structures are called "existing", and the others are labeled "hypothetical". By using simplified features for RNA graphs, we have clustered the hypothetical graphs into "RNA-like" and "non-RNA-like" groups and proposed RNA-like graphs as candidates for design. Here, we propose a new way of designing graph features by using Fiedler vectors. The new features reflect graph shapes better, and they lead to a more clustered organization of existing graphs. We show significant increases in K-means clustering accuracy by using the new features (e.g., up to 95% and 98% accuracy for tree and dual graphs, respectively). In addition, we propose a scoring model for top graph candidate selection. This scoring model allows users to set a threshold for candidates, and it incorporates weighing of existing graphs based on their corresponding number of known RNAs. We include a list of top scored RNA-like candidates, which we hope will stimulate future novel RNA design.
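For readers unfamiliar with the term, the Fiedler vector of a graph is the eigenvector belonging to the second-smallest eigenvalue of its Laplacian matrix L = D - A. The sketch below approximates it in plain Python for a small connected graph. It is not the authors' feature design: the shifted power iteration is a swapped-in technique chosen to avoid external libraries, the function names are hypothetical, and the example graph is a toy 4-vertex path rather than an entry from the RAG library.

```python
def laplacian(n, edges):
    """Build the graph Laplacian L = D - A as a list of lists."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def fiedler(n, edges, iters=500):
    """Approximate (lambda_2, Fiedler vector) for a connected graph.

    Power iteration on (shift * I - L), restricted to vectors orthogonal
    to the constant vector, converges to the eigenvector of the smallest
    nonzero Laplacian eigenvalue, provided the start vector is not
    orthogonal to it.
    """
    L = laplacian(n, edges)
    shift = 2.0 * max(L[i][i] for i in range(n))  # >= largest eigenvalue
    x = [float(i) for i in range(n)]              # arbitrary start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]               # drop the constant mode
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    lam = sum(x[i] * Lx[i] for i in range(n))     # Rayleigh quotient, |x| = 1
    return lam, x


# Toy example: a path graph 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]
lam2, vec = fiedler(4, path_edges)
print(round(lam2, 4), [round(v, 3) for v in vec])
```

For the path graph the signs of the vector entries split the vertices into the two halves of the path, which is the sense in which Fiedler vectors reflect graph shape. In practice a linear algebra library would be the usual choice for the eigen-decomposition.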
Affiliation(s)
- Qiyao Zhu
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
- Tamar Schlick
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
- Department of Chemistry, New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai 200062, P. R. China
12
Schlick T, Zhu Q, Jain S, Yan S. Structure-altering mutations of the SARS-CoV-2 frameshifting RNA element. Biophys J 2020; 120:1040-1053. [PMID: 33096082 PMCID: PMC7575535 DOI: 10.1016/j.bpj.2020.10.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/28/2020] [Revised: 10/06/2020] [Accepted: 10/13/2020] [Indexed: 12/15/2022]
Abstract
With the rapid rate of COVID-19 infections and deaths, treatments and cures besides hand washing, social distancing, masks, isolation, and quarantines are urgently needed. The treatments and vaccines rely on the basic biophysics of the complex viral apparatus. Although proteins are serving as main drug and vaccine targets, therapeutic approaches targeting the 30,000 nucleotide RNA viral genome form important complementary approaches. Indeed, the high conservation of the viral genome, its close evolutionary relationship to other viruses, and the rise of gene editing and RNA-based vaccines all argue for a focus on the RNA agent itself. One of the key steps in the viral replication cycle inside host cells is the ribosomal frameshifting required for translation of overlapping open reading frames. The RNA frameshifting element (FSE), one of three highly conserved regions of coronaviruses, is believed to include a pseudoknot considered essential for this ribosomal switching. In this work, we apply our graph-theory-based framework for representing RNA secondary structures, "RAG (or RNA-As-Graphs)," to alter key structural features of the FSE of the SARS-CoV-2 virus. Specifically, using RAG machinery of genetic algorithms for inverse folding adapted for RNA structures with pseudoknots, we computationally predict minimal mutations that destroy a structurally important stem and/or the pseudoknot of the FSE, potentially dismantling the virus against translation of the polyproteins. Our microsecond molecular dynamics simulations of mutant structures indicate relatively stable secondary structures. These findings not only advance our computational design of RNAs containing pseudoknots, they pinpoint key residues of the SARS-CoV-2 virus as targets for antiviral drugs and gene editing approaches.
Affiliation(s)
- Tamar Schlick
- Department of Chemistry, New York University, New York, New York; Courant Institute of Mathematical Sciences, New York University, New York, New York; NYU-ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai, P. R. China.
- Qiyao Zhu
- Department of Chemistry, New York University, New York, New York; Courant Institute of Mathematical Sciences, New York University, New York, New York
- Swati Jain
- Department of Chemistry, New York University, New York, New York
- Shuting Yan
- Department of Chemistry, New York University, New York, New York
13
Schlick T, Zhu Q, Jain S, Yan S. Structure-Altering Mutations of the SARS-CoV-2 Frame Shifting RNA Element. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2020:2020.08.28.271965. [PMID: 32869017 PMCID: PMC7457599 DOI: 10.1101/2020.08.28.271965] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/25/2022]
Abstract
With the rapid rate of Covid-19 infections and deaths, treatments and cures besides hand washing, social distancing, masks, isolation, and quarantines are urgently needed. The treatments and vaccines rely on the basic biophysics of the complex viral apparatus. While proteins are serving as main drug and vaccine targets, therapeutic approaches targeting the 30,000 nucleotide RNA viral genome form important complementary approaches. Indeed, the high conservation of the viral genome, its close evolutionary relationship to other viruses, and the rise of gene editing and RNA-based vaccines all argue for a focus on the RNA agent itself. One of the key steps in the viral replication cycle inside host cells is the ribosomal frameshifting required for translation of overlapping open reading frames. The frameshifting element (FSE), one of three highly conserved regions of coronaviruses, includes an RNA pseudoknot considered essential for this ribosomal switching. In this work, we apply our graph-theory-based framework for representing RNA secondary structures, "RAG" (RNA-As-Graphs), to alter key structural features of the FSE of the SARS-CoV-2 virus. Specifically, using RAG machinery of genetic algorithms for inverse folding adapted for RNA structures with pseudoknots, we computationally predict minimal mutations that destroy a structurally important stem and/or the pseudoknot of the FSE, potentially dismantling the virus against translation of the polyproteins. Additionally, our microsecond molecular dynamics simulations of mutant structures indicate relatively stable secondary structures. These findings not only advance our computational design of RNAs containing pseudoknots; they pinpoint key residues of the SARS-CoV-2 virus as targets for anti-viral drugs and gene editing approaches.
14
Xu J, Tojo S, Fujitsuka M, Kawai K. Dynamics of Single-Stranded RNA Looping Probed and Photoregulated by Sulfonated Pyrene. ChemistrySelect 2020. [DOI: 10.1002/slct.202002231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023]
Affiliation(s)
- Jie Xu
- The Institute of Scientific and Industrial Research (SANKEN), Osaka University, Mihogaoka 8–1, Ibaraki, Osaka 567-0047, Japan
- Sachiko Tojo
- The Institute of Scientific and Industrial Research (SANKEN), Osaka University, Mihogaoka 8–1, Ibaraki, Osaka 567-0047, Japan
- Mamoru Fujitsuka
- The Institute of Scientific and Industrial Research (SANKEN), Osaka University, Mihogaoka 8–1, Ibaraki, Osaka 567-0047, Japan
- Kiyohiko Kawai
- The Institute of Scientific and Industrial Research (SANKEN), Osaka University, Mihogaoka 8–1, Ibaraki, Osaka 567-0047, Japan
15
Jain S, Zhu Q, Paz ASP, Schlick T. Identification of novel RNA design candidates by clustering the extended RNA-As-Graphs library. Biochim Biophys Acta Gen Subj 2020; 1864:129534. [PMID: 31954797 DOI: 10.1016/j.bbagen.2020.129534] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/17/2019] [Revised: 01/10/2020] [Accepted: 01/14/2020] [Indexed: 12/31/2022]
Abstract
BACKGROUND We re-evaluate our RNA-As-Graphs clustering approach, using our expanded graph library and new RNA structures, to identify potential RNA-like topologies for design. Our coarse-grained approach represents RNA secondary structures as tree and dual graphs, with vertices and edges corresponding to RNA helices and loops. The graph theoretical framework facilitates graph enumeration, partitioning, and clustering approaches to study RNA structure and its applications. METHODS Clustering graph topologies based on features derived from graph Laplacian matrices and known RNA structures allows us to classify topologies into 'existing' or hypothetical, and the latter into, 'RNA-like' or 'non RNA-like' topologies. Here we update our list of existing tree graph topologies and RAG-3D database of atomic fragments to include newly determined RNA structures. We then use linear and quadratic regression, optionally with dimensionality reduction, to derive graph features and apply several clustering algorithms on our tree-graph library and recently expanded dual-graph library to classify them into the three groups. RESULTS The unsupervised PAM and K-means clustering approaches correctly classify 72-77% of all existing graph topologies and 75-82% of newly added ones as RNA-like. For supervised k-NN clustering, the cross-validation accuracy ranges from 57 to 81%. CONCLUSIONS Using linear regression with unsupervised clustering, or quadratic regression with supervised clustering, provides better accuracies than supervised/linear clustering. All accuracies are better than random, especially for newly added existing topologies, thus lending credibility to our approach. GENERAL SIGNIFICANCE Our updated RAG-3D database and motif classification by clustering present new RNA substructures and RNA-like motifs as novel design candidates.
Affiliation(s)
- Swati Jain
  - Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
- Qiyao Zhu
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- Amiel S P Paz
  - NYU Shanghai, 1555 Century Avenue, Shanghai 200135, China
  - NYU-ECNU Center for Computational Chemistry, NYU Shanghai, 3663 Zhongshan Road North, Shanghai 200062, China
- Tamar Schlick
  - Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
  - NYU-ECNU Center for Computational Chemistry, NYU Shanghai, 3663 Zhongshan Road North, Shanghai 200062, China
|
16
|
Inverse folding with RNA-As-Graphs produces a large pool of candidate sequences with target topologies. J Struct Biol 2019; 209:107438. [PMID: 31874236 DOI: 10.1016/j.jsb.2019.107438] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/04/2019] [Revised: 12/18/2019] [Accepted: 12/19/2019] [Indexed: 02/07/2023]
Abstract
We present an RNA-As-Graphs (RAG) based inverse folding algorithm, RAG-IF, to design novel RNA sequences that fold onto target tree graph topologies. The algorithm can be used to enhance our recently reported computational design pipeline (Jain et al., NAR 2018). The RAG approach represents RNA secondary structures as tree and dual graphs, where RNA loops and helices are coarse-grained as vertices and edges, opening the usage of graph theory methods to study, predict, and design RNA structures. Our recently developed computational pipeline for design utilizes graph partitioning (RAG-3D) and atomic fragment assembly (F-RAG) to design sequences to fold onto RNA-like tree graph topologies; the atomic fragments are taken from existing RNA structures that correspond to tree subgraphs. Because F-RAG may not produce the target folds for all designs, automated mutations by RAG-IF algorithm enhance the candidate pool markedly. The crucial residues for mutation are identified by differences between the predicted and the target topology. A genetic algorithm then mutates the selected residues, and the successful sequences are optimized to retain only the minimal or essential mutations. Here we evaluate RAG-IF for 6 RNA-like topologies and generate a large pool of successful candidate sequences with a variety of minimal mutations. We find that RAG-IF adds robustness and efficiency to our RNA design pipeline, making inverse folding motivated by graph topology rather than secondary structure more productive.
|
17
|
Kimchi O, Cragnolini T, Brenner MP, Colwell LJ. A Polymer Physics Framework for the Entropy of Arbitrary Pseudoknots. Biophys J 2019; 117:520-532. [PMID: 31353036 PMCID: PMC6697467 DOI: 10.1016/j.bpj.2019.06.037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/08/2018] [Revised: 06/21/2019] [Accepted: 06/27/2019] [Indexed: 11/18/2022] Open
Abstract
The accurate prediction of RNA secondary structure from primary sequence has had enormous impact on research from the past 40 years. Although many algorithms are available to make these predictions, the inclusion of non-nested loops, termed pseudoknots, still poses challenges arising from two main factors: 1) no physical model exists to estimate the loop entropies of complex intramolecular pseudoknots, and 2) their NP-complete enumeration has impeded their study. Here, we address both challenges. First, we develop a polymer physics model that can address arbitrarily complex pseudoknots using only two parameters corresponding to concrete physical quantities-over an order of magnitude fewer than the sparsest state-of-the-art phenomenological methods. Second, by coupling this model to exhaustive enumeration of the set of possible structures, we compute the entire free energy landscape of secondary structures resulting from a primary RNA sequence. We demonstrate that for RNA structures of ∼80 nucleotides, with minimal heuristics, the complete enumeration of possible secondary structures can be accomplished quickly despite the NP-complete nature of the problem. We further show that despite our loop entropy model's parametric sparsity, it performs better than or on par with previously published methods in predicting both pseudoknotted and non-pseudoknotted structures on a benchmark data set of RNA structures of ≤80 nucleotides. We suggest ways in which the accuracy of the model can be further improved.
Affiliation(s)
- Ofer Kimchi
  - Harvard Graduate Program in Biophysics, Harvard University, Cambridge, Massachusetts
- Tristan Cragnolini
  - Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Michael P Brenner
  - School of Engineering and Applied Sciences, Cambridge, Massachusetts
  - Kavli Institute for Bionano Science and Technology, Harvard University, Cambridge, Massachusetts
- Lucy J Colwell
  - Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
|
18
|
Meng G, Tariq M, Jain S, Elmetwaly S, Schlick T. RAG-Web: RNA structure prediction/design using RNA-As-Graphs. Bioinformatics 2019; 36:647-648. [PMID: 31373604 PMCID: PMC7999136 DOI: 10.1093/bioinformatics/btz611] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/21/2019] [Revised: 07/11/2019] [Accepted: 08/01/2019] [Indexed: 01/31/2023] Open
Abstract
SUMMARY We launch a webserver for RNA structure prediction and design corresponding to tools developed using our RNA-As-Graphs (RAG) approach. RAG uses coarse-grained tree graphs to represent RNA secondary structure, allowing the application of graph theory to analyze and advance RNA structure discovery. Our webserver consists of three modules: (a) RAG Sampler: samples tree graph topologies from an RNA secondary structure to predict corresponding tertiary topologies, (b) RAG Builder: builds three-dimensional atomic models from candidate graphs generated by RAG Sampler, and (c) RAG Designer: designs sequences that fold onto novel RNA motifs (described by tree graph topologies). Results analyses are performed for further assessment/selection. The Results page provides links to download results and indicates possible errors encountered. RAG-Web offers a user-friendly interface to utilize our RAG software suite to predict and design RNA structures and sequences. AVAILABILITY AND IMPLEMENTATION The webserver is freely available online at: http://www.biomath.nyu.edu/ragtop/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Grace Meng
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Marva Tariq
  - Department of Chemistry, Smith College, Northampton, MA 01063, USA
- Swati Jain
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Shereef Elmetwaly
  - Department of Chemistry, New York University, New York, NY 10003, USA
|
19
|
Jain S, Saju S, Petingi L, Schlick T. An extended dual graph library and partitioning algorithm applicable to pseudoknotted RNA structures. Methods 2019; 162-163:74-84. [PMID: 30928508 DOI: 10.1016/j.ymeth.2019.03.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/14/2019] [Revised: 02/28/2019] [Accepted: 03/22/2019] [Indexed: 12/18/2022] Open
Abstract
Exploring novel RNA topologies is imperative for understanding RNA structure and pursuing its design. Our RNA-As-Graphs (RAG) approach exploits graph theory tools and uses coarse-grained tree and dual graphs to represent RNA helices and loops by vertices and edges. Only dual graphs represent pseudoknotted RNAs fully. Here we develop a dual graph enumeration algorithm to generate an expanded library of dual graph topologies for 2-9 vertices, and extend our dual graph partitioning algorithm to identify all possible RNA subgraphs. Our enumeration algorithm connects smaller-vertex graphs, using all possible edge combinations, to build larger-vertex graphs and retain all non-isomorphic graph topologies, thereby more than doubling the size of our prior library to a total of 110,667 dual graph topologies. We apply our dual graph partitioning algorithm, which keeps pseudoknots and junctions intact, to all existing RNA structures to identify all possible substructures up to 9 vertices. In addition, our expanded dual graph library assigns graph topologies to all RNA graphs and subgraphs, rectifying prior inconsistencies. We update our RAG-3Dual database of RNA atomic fragments with all newly identified substructures and their graph IDs, increasing its size by more than 50 times. The enlarged dual graph library and RAG-3Dual database provide a comprehensive repertoire of graph topologies and atomic fragments to study yet undiscovered RNA molecules and design RNA sequences with novel topologies, including a variety of pseudoknotted RNAs.
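The step of retaining only non-isomorphic topologies can be illustrated with a brute-force sketch. This is not the enumeration algorithm of the paper: the names `canonical_form` and `dedupe` and the edge-list input format are assumptions made for illustration, and trying every vertex relabeling is only practical for small vertex counts.

```python
from itertools import permutations

def canonical_form(n, edges):
    """Smallest adjacency matrix over all relabelings of the n vertices.

    Two multigraphs are isomorphic exactly when their canonical forms match.
    """
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] += 1
        if u != v:  # count a self-loop once on the diagonal
            adj[v][u] += 1
    return min(
        tuple(tuple(adj[p[i]][p[j]] for j in range(n)) for i in range(n))
        for p in permutations(range(n))
    )

def dedupe(n, graphs):
    """Keep the first representative of each isomorphism class."""
    seen = {}
    for edges in graphs:
        seen.setdefault(canonical_form(n, edges), edges)
    return list(seen.values())

# Toy input (hypothetical): two relabeled paths and one graph with a doubled edge.
candidates = [
    [(0, 1), (1, 2)],
    [(0, 2), (2, 1)],
    [(0, 1), (0, 1), (1, 2)],
]
unique = dedupe(3, candidates)
```

Here `unique` holds two graphs, because the first two candidates are the same path with different vertex labels.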
Affiliation(s)
- Swati Jain
  - Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
- Sera Saju
  - Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
- Louis Petingi
  - Computer Science Department, College of Staten Island, City University of New York, Staten Island, New York, NY 10314, USA
- Tamar Schlick
  - Department of Chemistry, New York University, 1021 Silver, 100 Washington Square East, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
  - NYU-East China Normal University Center for Computational Chemistry at New York University Shanghai, Room 340, Geography Building, North Zhongshan Road, 3663 Shanghai, China
|
20
|
Jain S, Laederach A, Ramos SBV, Schlick T. A pipeline for computational design of novel RNA-like topologies. Nucleic Acids Res 2018; 46:7040-7051. [PMID: 30137633 PMCID: PMC6101589 DOI: 10.1093/nar/gky524] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/24/2017] [Revised: 05/22/2018] [Accepted: 05/24/2018] [Indexed: 12/11/2022] Open
Abstract
Designing novel RNA topologies is a challenge, with important therapeutic and industrial applications. We describe a computational pipeline for design of novel RNA topologies based on our coarse-grained RNA-As-Graphs (RAG) framework. RAG represents RNA structures as tree graphs and describes RNA secondary (2D) structure topologies (currently up to 13 vertices, ≈260 nucleotides). We have previously identified novel graph topologies that are RNA-like among these. Here we describe a systematic design pipeline and illustrate design for six broad design problems using recently developed tools for graph-partitioning and fragment assembly (F-RAG). Following partitioning of the target graph, corresponding atomic fragments from our RAG-3D database are combined using F-RAG, and the candidate atomic models are scored using a knowledge-based potential developed for 3D structure prediction. The sequences of the top scoring models are screened further using available tools for 2D structure prediction. The results indicate that our modular approach based on RNA-like topologies rather than specific 2D structures allows for greater flexibility in the design process, and generates a large number of candidate sequences quickly. Experimental structure probing using SHAPE-MaP for two sequences agree with our predictions and suggest that our combined tools yield excellent candidates for further sequence and experimental screening.
Affiliation(s)
- Swati Jain
  - Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
- Alain Laederach
  - Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Silvia B V Ramos
  - Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Tamar Schlick
  - Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
  - NYU-ECNU Center for Computational Chemistry at New York University Shanghai, Room 340, Geography Building, North Zhongshan Road, 3663 Shanghai, China
|
21
|
Jain S, Bayrak CS, Petingi L, Schlick T. Dual Graph Partitioning Highlights a Small Group of Pseudoknot-Containing RNA Submotifs. Genes (Basel) 2018; 9:E371. [PMID: 30044451 PMCID: PMC6115904 DOI: 10.3390/genes9080371] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/07/2018] [Revised: 06/26/2018] [Accepted: 06/26/2018] [Indexed: 12/31/2022] Open
Abstract
RNA molecules are composed of modular architectural units that define their unique structural and functional properties. Characterization of these building blocks can help interpret RNA structure/function relationships. We present an RNA secondary structure motif and submotif library using dual graph representation and partitioning. Dual graphs represent RNA helices as vertices and loops as edges. Unlike tree graphs, dual graphs can represent RNA pseudoknots (intertwined base pairs). For a representative set of RNA structures, we construct dual graphs from their secondary structures, and apply our partitioning algorithm to identify non-separable subgraphs (or blocks) without breaking pseudoknots. We report 56 subgraph blocks up to nine vertices; among them, 22 are frequently occurring, 15 of which contain pseudoknots. We then catalog atomic fragments corresponding to the subgraph blocks to define a library of building blocks that can be used for RNA design, which we call RAG-3Dual, as we have done for tree graphs. As an application, we analyze the distribution of these subgraph blocks within ribosomal RNAs of various prokaryotic and eukaryotic species to identify common subgraphs and possible ancestry relationships. Other applications of dual graph partitioning and motif library can be envisioned for RNA structure analysis and design.
Affiliation(s)
- Swati Jain
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Cigdem S Bayrak
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Louis Petingi
  - Computer Science Department, College of Staten Island, City University of New York, Staten Island, New York, NY 10314, USA
- Tamar Schlick
  - Department of Chemistry, New York University, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA
  - NYU-East China Normal University Center for Computational Chemistry, New York University Shanghai, Shanghai 3663, China
|
22
|
Abstract
The structure of RNA has been a natural subject for mathematical modeling, inviting many innovative computational frameworks. This single-stranded polynucleotide chain can fold upon itself in numerous ways to form hydrogen-bonded segments, imperfect with single-stranded loops. Illustrating these paired and non-paired interaction networks, known as RNA's secondary (2D) structure, using mathematical graph objects has been illuminating for RNA structure analysis. Building upon such seminal work from the 1970s and 1980s, graph models are now used to study not only RNA structure but also describe RNA's recurring modular units, sample the conformational space accessible to RNAs, predict RNA's three-dimensional folds, and apply the combined aspects to novel RNA design. In this article, we outline the development of the RNA-As-Graphs (or RAG) approach and highlight current applications to RNA structure prediction and design.
Affiliation(s)
- Tamar Schlick
  - Department of Chemistry, 100 Washington Square East, Silver Building, New York University, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer St., New York, NY 10012, USA
  - New York University ECNU - Center for Computational Chemistry at NYU Shanghai, 3663 North Zhongshan Road, Shanghai, 200062, China
|
23
|
Arslan AN, Anandan J, Fry E, Monschke K, Ganneboina N, Bowerman J. Efficient RNA structure comparison algorithms. J Bioinform Comput Biol 2017; 15:1740009. [DOI: 10.1142/s0219720017400091] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
Recently proposed relative addressing-based ([Formula: see text]) RNA secondary structure representation has important features by which an RNA structure database can be stored into a suffix array. A fast substructure search algorithm has been proposed based on binary search on this suffix array. Using this substructure search algorithm, we present a fast algorithm that finds the largest common substructure of given multiple RNA structures in [Formula: see text] format. The multiple RNA structure comparison problem is NP-hard in its general formulation. We introduced a new problem for comparing multiple RNA structures. This problem has more strict similarity definition and objective, and we propose an algorithm that solves this problem efficiently. We also develop another comparison algorithm that iteratively calls this algorithm to locate nonoverlapping large common substructures in compared RNAs. With the new resulting tools, we improved the RNASSAC website (linked from http://faculty.tamuc.edu/aarslan ). This website now also includes two drawing tools: one specialized for preparing RNA substructures that can be used as input by the search tool, and another one for automatically drawing the entire RNA structure from a given structure sequence.
Affiliation(s)
- Abdullah N. Arslan
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
- Jithendar Anandan
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
- Eric Fry
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
- Keith Monschke
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
- Nitin Ganneboina
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
- Jason Bowerman
  - Department of Computer Science, Texas A&M University-Commerce, Commerce, TX 75428, USA
|
24
|
Jain S, Schlick T. F-RAG: Generating Atomic Coordinates from RNA Graphs by Fragment Assembly. J Mol Biol 2017; 429:3587-3605. [PMID: 28988954 PMCID: PMC5693719 DOI: 10.1016/j.jmb.2017.09.017] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 06/09/2017] [Revised: 09/12/2017] [Accepted: 09/22/2017] [Indexed: 10/18/2022]
Abstract
Coarse-grained models represent attractive approaches to analyze and simulate ribonucleic acid (RNA) molecules, for example, for structure prediction and design, as they simplify the RNA structure to reduce the conformational search space. Our structure prediction protocol RAGTOP (RNA-As-Graphs Topology Prediction) represents RNA structures as tree graphs and samples graph topologies to produce candidate graphs. However, for a more detailed study and analysis, construction of atomic from coarse-grained models is required. Here we present our graph-based fragment assembly algorithm (F-RAG) to convert candidate three-dimensional (3D) tree graph models, produced by RAGTOP into atomic structures. We use our related RAG-3D utilities to partition graphs into subgraphs and search for structurally similar atomic fragments in a data set of RNA 3D structures. The fragments are edited and superimposed using common residues, full atomic models are scored using RAGTOP's knowledge-based potential, and geometries of top scoring models is optimized. To evaluate our models, we assess all-atom RMSDs and Interaction Network Fidelity (a measure of residue interactions) with respect to experimentally solved structures and compare our results to other fragment assembly programs. For a set of 50 RNA structures, we obtain atomic models with reasonable geometries and interactions, particularly good for RNAs containing junctions. Additional improvements to our protocol and databases are outlined. These results provide a good foundation for further work on RNA structure prediction and design applications.
Affiliation(s)
- Swati Jain
  - Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
- Tamar Schlick
  - Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
  - New York University-East China Normal University Center for Computational Chemistry at New York University Shanghai, Room 340, Geography Building, North Zhongshan Road, 3663 Shanghai, China
|
25
|
Schlick T, Pyle AM. Opportunities and Challenges in RNA Structural Modeling and Design. Biophys J 2017; 113:225-234. [PMID: 28162235 PMCID: PMC5529161 DOI: 10.1016/j.bpj.2016.12.037] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 10/11/2016] [Revised: 12/08/2016] [Accepted: 12/19/2016] [Indexed: 01/27/2023] Open
Abstract
We describe opportunities and challenges in RNA structural modeling and design, as recently discussed during the second Telluride Science Research Center workshop organized in June 2016. Topics include fundamental processes of RNA, such as structural assemblies (hierarchical folding, multiple conformational states and their clustering), RNA motifs, and chemical reactivity of RNA, as used for structural prediction and functional inference. We also highlight the software and database issues associated with RNA structures, such as the multiple approaches for motif annotation, the need for frequent database updating, and the importance of quality control of RNA structures. We discuss various modeling approaches for structure prediction, mechanistic analysis of RNA reactions, and RNA design, and the complementary roles that both atomistic and coarse-grained approaches play in such simulations. Collectively, as scientists from varied disciplines become familiar and drawn into these unique challenges, new approaches and collaborative efforts will undoubtedly be catalyzed.
Affiliation(s)
- Tamar Schlick
  - Department of Chemistry, New York University, New York, New York
  - Courant Institute of Mathematical Sciences, New York University, New York, New York
- Anna Marie Pyle
  - Department of Molecular and Cellular and Developmental Biology and Department of Chemistry, Yale University
  - Howard Hughes Medical Institute, New Haven, Connecticut
|
26
|
Abstract
Inspired by the recent success of scientific-discovery games for predicting protein tertiary and RNA secondary structures, we have developed an open software for coarse-grained RNA folding simulations, guided by human intuition. To determine the extent to which interactive simulations can accurately predict 3D RNA structures of increasing complexity and lengths (four RNAs with 22-47 nucleotides), an interactive experiment was conducted with 141 participants who had very little knowledge of nucleic acids systems and computer simulations, and had received only a brief description of the important forces stabilizing RNA structures. Their structures and full trajectories have been analyzed statistically and compared to standard replica exchange molecular dynamics simulations. Our analyses show that participants gain easily chemical intelligence to fold simple and nontrivial topologies, with little computer time, and this result opens the door for the use of human-guided simulations to RNA folding. Our experiment shows that interactive simulations have better chances of success when the user widely explores the conformational space. Interestingly, providing on-the-fly feedback of the root mean square deviation with respect to the experimental structure did not improve the quality of the proposed models.
|
27
|
Jager S, Schiller B, Babel P, Blumenroth M, Strufe T, Hamacher K. StreAM-$T_g$: algorithms for analyzing coarse grained RNA dynamics based on Markov models of connectivity-graphs. Algorithms Mol Biol 2017; 12:15. [PMID: 28572834 PMCID: PMC5450175 DOI: 10.1186/s13015-017-0105-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/31/2016] [Accepted: 05/16/2017] [Indexed: 12/05/2022] Open
Abstract
Background In this work, we present a new coarse grained representation of RNA dynamics. It is based on adjacency matrices and their interactions patterns obtained from molecular dynamics simulations. RNA molecules are well-suited for this representation due to their composition which is mainly modular and assessable by the secondary structure alone. These interactions can be represented as adjacency matrices of k nucleotides. Based on those, we define transitions between states as changes in the adjacency matrices which form Markovian dynamics. The intense computational demand for deriving the transition probability matrices prompted us to develop StreAM-$T_g$, a stream-based algorithm for generating such Markov models of k-vertex adjacency matrices representing the RNA. Results We benchmark StreAM-$T_g$ (a) for random and RNA unit sphere dynamic graphs (b) for the robustness of our method against different parameters. Moreover, we address a riboswitch design problem by applying StreAM-$T_g$ on six long term molecular dynamics simulation of a synthetic tetracycline dependent riboswitch (500 ns) in combination with five different antibiotics. Conclusions The proposed algorithm performs well on large simulated as well as real world dynamic graphs. Additionally, StreAM-$T_g$ provides insights into nucleotide based RNA dynamics in comparison to conventional metrics like the root-mean square fluctuation. In the light of experimental data our results show important design opportunities for the riboswitch.
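The idea of treating changes in the adjacency matrix as Markov transitions can be sketched as follows. This is a plain batch count over a short trajectory, not the stream-based algorithm of the paper; the names `frame_state` and `transition_matrix`, the contact-pair input format, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def frame_state(contacts):
    """Encode one frame's adjacency matrix as a hashable set of unordered contact pairs."""
    return frozenset(frozenset(pair) for pair in contacts)

def transition_matrix(frames):
    """Row-normalized transition probabilities between consecutive frame states."""
    counts = defaultdict(lambda: defaultdict(int))
    states = [frame_state(f) for f in frames]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }

# Toy trajectory (hypothetical): contacts among three nucleotides over four frames.
a = [(0, 1)]
b = [(0, 1), (1, 2)]
P = transition_matrix([a, a, b, a])
```

In this toy trajectory state `a` is left twice, once to itself and once to `b`, so each of those transitions gets probability 0.5, while `b` always returns to `a`.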
|
28
|
Petingi L, Schlick T. Partitioning and Classification of RNA Secondary Structures into Pseudonotted and Pseudoknot-free Regions Using a Graph-Theoretical Approach. IAENG INTERNATIONAL JOURNAL OF COMPUTER SCIENCE 2017; 44:241-246. [PMID: 30474081 PMCID: PMC6250053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Dual graphs have been applied to model RNA secondary structures with pseudoknots, or intertwined base pairs. In this paper we present a linear-time algorithm to partition dual graphs into maximal topological components called blocks and determine whether each block contains a pseudoknot or not. We show that a block contains a pseudoknot if and only if the block has a vertex of degree 3 or more; this characterization allows us to efficiently isolate smaller RNA fragments and classify them as pseudoknotted or pseudoknot-free regions, while keeping these sub-structures intact. Applications to RNA design can be envisioned since modular building blocks with intact pseudoknots can be combined to form new constructs.
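The block criterion stated in this abstract can be sketched with a standard depth-first search for biconnected components, followed by the degree test inside each block. This is an illustrative sketch and not the authors' implementation: the names `blocks` and `classify_blocks`, the edge-list input, and the choice to set self-loops aside are assumptions.

```python
from collections import defaultdict

def blocks(edges):
    """Biconnected components of a multigraph, each as a list of edge indices."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        if u != v:  # assumption: self-loops are set aside in this sketch
            adj[u].append((v, idx))
            adj[v].append((u, idx))
    disc, low, stack, found = {}, {}, [], []

    def dfs(u, parent_edge):
        disc[u] = low[u] = len(disc)
        for v, idx in adj[u]:
            if idx == parent_edge:  # skip only the edge we arrived on
                continue
            if v not in disc:
                stack.append(idx)
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:  # u closes off a block
                    block = []
                    while True:
                        e = stack.pop()
                        block.append(e)
                        if e == idx:
                            break
                    found.append(block)
            elif disc[v] < disc[u]:  # back edge, possibly a parallel edge
                stack.append(idx)
                low[u] = min(low[u], disc[v])

    for u in list(adj):
        if u not in disc:
            dfs(u, None)
    return found

def classify_blocks(edges):
    """Pair each block with True if some vertex has degree >= 3 inside it."""
    result = []
    for block in blocks(edges):
        degree = defaultdict(int)
        for idx in block:
            u, v = edges[idx]
            degree[u] += 1
            degree[v] += 1
        result.append((sorted(block), max(degree.values()) >= 3))
    return sorted(result)

# Toy inputs (hypothetical): three parallel edges, then two 2-cycles sharing a vertex.
triple = classify_blocks([(0, 1), (0, 1), (0, 1)])
chain = classify_blocks([(0, 1), (0, 1), (1, 2), (1, 2)])
```

In the second toy graph the shared vertex has total degree 4, but only degree 2 within each block, so neither block is flagged; this is why the degree is counted per block rather than over the whole graph.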
Affiliation(s)
- Louis Petingi
  - Department of Computer Science, College of Staten Island, City University of New York, Staten Island, NY, USA
- Tamar Schlick
  - Department of Chemistry, and Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
|
29
|
Shabash B, Wiese KC. RNA Visualization: Relevance and the Current State-of-the-Art Focusing on Pseudoknots. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:696-712. [PMID: 26915129 DOI: 10.1109/tcbb.2016.2522421] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
RNA visualization is crucial in order to understand the relationship that exists between RNA structure and its function, as well as the development of better RNA structure prediction algorithms. However, in the context of RNA visualization, one key structure remains difficult to visualize: Pseudoknots. Pseudoknots occur in RNA folding when two secondary structural components form base-pairs between them. The three-dimensional nature of these components makes them challenging to visualize in two-dimensional media, such as print media or screens. In this review, we focus on the advancements that have been made in the field of RNA visualization in two-dimensional media in the past two decades. The review aims at presenting all relevant aspects of pseudoknot visualization. We start with an overview of several pseudoknotted structures and their relevance in RNA function. Next, we discuss the theoretical basis for RNA structural topology classification and present RNA classification systems for both pseudoknotted and non-pseudoknotted RNAs. Each description of RNA classification system is followed by a discussion of the software tools and algorithms developed to date to visualize RNA, comparing the different tools' strengths and shortcomings.
|
30
|
Li Y, Shi X, Liang Y, Xie J, Zhang Y, Ma Q. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation. BMC Bioinformatics 2017; 18:51. [PMID: 28109252 PMCID: PMC5251234 DOI: 10.1186/s12859-017-1481-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/25/2016] [Accepted: 01/10/2017] [Indexed: 01/10/2023] Open
Abstract
Background: RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms.
Results: An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations, RNA-TVcurve (triple vector curve representation), of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The web server requires RNA primary sequences as input, while corresponding secondary structures are optional. For primary sequences alone, the web server can compute the secondary structures by free energy minimization using the RNAfold tool from the Vienna RNA package.
Conclusion: RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, is available at: http://ml.jlu.edu.cn/tvcurve/.
Affiliation(s)
- Ying Li
  - College of Computer Science and Technology, Jilin University, Changchun, 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Xiaohu Shi
  - College of Computer Science and Technology, Jilin University, Changchun, 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Yanchun Liang
  - College of Computer Science and Technology, Jilin University, Changchun, 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
  - Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, 519041, China
- Juan Xie
  - Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA
  - Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA
  - BioSNTR, Brookings, SD, USA
- Yu Zhang
  - College of Computer Science and Technology, Jilin University, Changchun, 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
- Qin Ma
  - Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA
  - Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA
  - BioSNTR, Brookings, SD, USA
|
31
|
Accurate Classification of RNA Structures Using Topological Fingerprints. PLoS One 2016; 11:e0164726. [PMID: 27755571 PMCID: PMC5068708 DOI: 10.1371/journal.pone.0164726] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/01/2016] [Accepted: 09/29/2016] [Indexed: 12/26/2022] Open
Abstract
While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity, an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Because it includes pseudoknots, the RNA fingerprint approach covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint.
|
32
|
RNAComposer and RNA 3D structure prediction for nanotechnology. Methods 2016; 103:120-7. [PMID: 27016145 DOI: 10.1016/j.ymeth.2016.03.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 01/04/2016] [Revised: 03/04/2016] [Accepted: 03/21/2016] [Indexed: 11/21/2022] Open
Abstract
RNAs adopt specific, stable tertiary architectures to perform their activities. Knowledge of RNA tertiary structure is fundamental to understand RNA functions beginning with transcription and ending with turnover. Contrary to advanced RNA secondary structure prediction algorithms, which allow good accuracy when experimental data are integrated into the prediction, tertiary structure prediction of large RNAs still remains a significant challenge. However, the field of RNA tertiary structure prediction is rapidly developing and new computational methods based on different strategies are emerging. RNAComposer is a user-friendly and freely available server for 3D structure prediction of RNA up to 500 nucleotide residues. RNAComposer employs fully automated fragment assembly based on RNA secondary structure specified by the user. Importantly, this method allows incorporation of distance restraints derived from the experimental data to strengthen the 3D predictions. The potential and limitations of RNAComposer are discussed and an application to RNA design for nanotechnology is presented.
|
33
|
Jager S, Schiller B, Strufe T, Hamacher K. StreAM-$T_g$: Algorithms for Analyzing Coarse Grained RNA Dynamics Based on Markov Models of Connectivity-Graphs. Lecture Notes in Computer Science 2016. [DOI: 10.1007/978-3-319-43681-4_16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023]
|
34
|
Biesiada M, Purzycka KJ, Szachniuk M, Blazewicz J, Adamiak RW. Automated RNA 3D Structure Prediction with RNAComposer. Methods Mol Biol 2016; 1490:199-215. [PMID: 27665601 DOI: 10.1007/978-1-4939-6433-8_13] [Citation(s) in RCA: 92] [Impact Index Per Article: 11.5] [Indexed: 06/06/2023]
Abstract
RNAs adopt specific structures to perform their activities and these are critical to virtually all RNA-mediated processes. Because of difficulties in experimentally assessing structures of large RNAs using NMR, X-ray crystallography, or cryo-microscopy, there is currently great demand for new high-resolution 3D structure prediction methods. Recently we reported on RNAComposer, a knowledge-based method for the fully automated RNA 3D structure prediction from a user-defined secondary structure. RNAComposer method is especially suited for structural biology users. Since our initial report in 2012, both servers, freely available at http://rnacomposer.ibch.poznan.pl and http://rnacomposer.cs.put.poznan.pl have been often visited. Therefore this chapter provides guidance for using RNAComposer and discusses points that should be considered when predicting 3D RNA structure. An application example presents current scope and limitations of RNAComposer.
Affiliation(s)
- Marcin Biesiada
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Katarzyna J Purzycka
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Marta Szachniuk
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  - Department of Bioinformatics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Jacek Blazewicz
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  - Department of Bioinformatics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Ryszard W Adamiak
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
|
35
|
Dingle K, Schaper S, Louis AA. The structure of the genotype-phenotype map strongly constrains the evolution of non-coding RNA. Interface Focus 2015; 5:20150053. [PMID: 26640651 DOI: 10.1098/rsfs.2015.0053] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 12/20/2022] Open
Abstract
The prevalence of neutral mutations implies that biological systems typically have many more genotypes than phenotypes. But, can the way that genotypes are distributed over phenotypes determine evolutionary outcomes? Answering such questions is difficult, in part because the number of genotypes can be hyper-astronomically large. By solving the genotype-phenotype (GP) map for RNA secondary structure (SS) for systems up to length L = 126 nucleotides (where the set of all possible RNA strands would weigh more than the mass of the visible universe), we show that the GP map strongly constrains the evolution of non-coding RNA (ncRNA). Simple random sampling over genotypes predicts the distribution of properties such as the mutational robustness or the number of stems per SS found in naturally occurring ncRNA with surprising accuracy. Because we ignore natural selection, this strikingly close correspondence with the mapping suggests that structures allowing for functionality are easily discovered, despite the enormous size of the genetic spaces. The mapping is extremely biased: the majority of genotypes map to an exponentially small portion of the morphospace of all biophysically possible structures. Such strong constraints provide a non-adaptive explanation for the convergent evolution of structures such as the hammerhead ribozyme. These results present a particularly clear example of bias in the arrival of variation strongly shaping evolutionary outcomes and may be relevant to Mayr's distinction between proximate and ultimate causes in evolutionary biology.
Affiliation(s)
- Kamaludin Dingle
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3NP, UK
  - Systems Biology DTC, University of Oxford, Oxford, UK
  - Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Block 5, West Mishref, Kuwait
- Steffen Schaper
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3NP, UK
- Ard A Louis
  - Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3NP, UK
|
36
|
Abstract
Genomic studies have greatly expanded our knowledge of structural non-coding RNAs (ncRNAs). These RNAs fold into characteristic secondary structures and perform specific structure-dependent biological functions. Hence RNA secondary structure prediction is one of the most well studied problems in computational RNA biology. Comparative sequence analysis is one of the more reliable RNA structure prediction approaches as it exploits information of multiple related sequences to infer the consensus secondary structure. This class of methods essentially learns a global secondary structure from the input sequences. In this paper, we consider the more general problem of unearthing common local secondary structure based patterns from a set of related sequences. The input sequences could, for example, correspond to 3' or 5' untranslated regions of a set of orthologous genes, and the unearthed local patterns could correspond to regulatory motifs found in these regions. These sequences could also correspond to in vitro selected RNA, genomic segments housing ncRNA genes from the same family and so on. Here, we give a detailed review of the various computational techniques proposed in the literature to solve this general motif discovery problem. We also give empirical comparisons of some of the current state-of-the-art methods and point out future directions of research.
Affiliation(s)
- Avinash Achar
  - Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
- Pål Sætrom
  - Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
|
37
|
Zahran M, Sevim Bayrak C, Elmetwaly S, Schlick T. RAG-3D: a search tool for RNA 3D substructures. Nucleic Acids Res 2015; 43:9474-88. [PMID: 26304547 PMCID: PMC4627073 DOI: 10.1093/nar/gkv823] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/30/2014] [Accepted: 08/03/2015] [Indexed: 01/23/2023] Open
Abstract
To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.
Affiliation(s)
- Mai Zahran
  - Biological Sciences Department, New York City College of Technology, City University of New York, Brooklyn, NY 11201, USA
- Shereef Elmetwaly
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Tamar Schlick
  - Department of Chemistry, New York University, New York, NY 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA
|
38
|
Cragnolini T, Derreumaux P, Pasquali S. Ab initio RNA folding. Journal of Physics. Condensed Matter: An Institute of Physics Journal 2015; 27:233102. [PMID: 25993396 DOI: 10.1088/0953-8984/27/23/233102] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 05/24/2023]
Abstract
RNA molecules are essential cellular machines performing a wide variety of functions for which a specific three-dimensional structure is required. Over the last several years, the experimental determination of RNA structures through x-ray crystallography and NMR seems to have reached a plateau in the number of structures resolved each year, but as more and more RNA sequences are being discovered, the need for structure prediction tools to complement experimental data is strong. Theoretical approaches to RNA folding have been developed since the late nineties, when the first algorithms for secondary structure prediction appeared. Over the last 10 years a number of prediction methods for 3D structures have been developed, first based on bioinformatics and data-mining, and more recently based on a coarse-grained physical representation of the systems. In this review we are going to present the challenges of RNA structure prediction and the main ideas behind bioinformatic approaches and physics-based approaches. We will focus on the description of the more recent physics-based phenomenological models and on how they are built to include the specificity of the interactions of RNA bases, whose role is critical in folding. Through examples from different models, we will point out the strengths of physics-based approaches, which are able not only to predict equilibrium structures, but also to investigate dynamical and thermodynamical behavior, and the open challenges to include more key interactions ruling RNA folding.
Affiliation(s)
- Tristan Cragnolini
  - Laboratoire de Biochimie Théorique UPR 9080 CNRS, Université Paris Diderot, Sorbonne, Paris Cité, IBPC 13 rue Pierre et Marie Curie, 75005 Paris, France
|
39
|
Cragnolini T, Laurin Y, Derreumaux P, Pasquali S. Coarse-Grained HiRE-RNA Model for ab Initio RNA Folding beyond Simple Molecules, Including Noncanonical and Multiple Base Pairings. J Chem Theory Comput 2015; 11:3510-22. [PMID: 26575783 DOI: 10.1021/acs.jctc.5b00200] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Indexed: 11/28/2022]
Abstract
HiRE-RNA is a coarse-grained model for RNA structure prediction and the dynamical study of RNA folding. Using a reduced set of particles and detailed interactions accounting for base-pairing and stacking, we show that noncanonical and multiple base interactions are necessary to capture the full physical behavior of complex RNAs. In this paper, we give a full account of the model and present results on the folding, stability, and free energy surfaces of 16 systems with 12 to 76 nucleotides of increasingly complex architectures, ranging from monomers to dimers, using a total of 850 μs of simulation time.
Affiliation(s)
- Tristan Cragnolini
  - Laboratoire de Biochimie Théorique UPR 9080 CNRS, Université Paris Diderot, Sorbonne, Paris Cité, IBPC 13 rue Pierre et Marie Curie, 75005 Paris, France
- Yoann Laurin
  - Laboratoire de Biochimie Théorique UPR 9080 CNRS, Université Paris Diderot, Sorbonne, Paris Cité, IBPC 13 rue Pierre et Marie Curie, 75005 Paris, France
- Philippe Derreumaux
  - Laboratoire de Biochimie Théorique UPR 9080 CNRS, Université Paris Diderot, Sorbonne, Paris Cité, IBPC 13 rue Pierre et Marie Curie, 75005 Paris, France
  - Institut Universitaire de France, Boulevard Saint-Michel, 75005 Paris, France
- Samuela Pasquali
  - Laboratoire de Biochimie Théorique UPR 9080 CNRS, Université Paris Diderot, Sorbonne, Paris Cité, IBPC 13 rue Pierre et Marie Curie, 75005 Paris, France
|
40
|
Purzycka KJ, Popenda M, Szachniuk M, Antczak M, Lukasiak P, Blazewicz J, Adamiak RW. Automated 3D RNA structure prediction using the RNAComposer method for riboswitches. Methods Enzymol 2015; 553:3-34. [PMID: 25726459 DOI: 10.1016/bs.mie.2014.10.050] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 01/11/2023]
Abstract
Understanding the numerous functions of RNAs depends critically on the knowledge of their three-dimensional (3D) structure. In contrast to the protein field, a much smaller number of RNA 3D structures have been assessed using X-ray crystallography, NMR spectroscopy, and cryomicroscopy. This has led to a great demand to obtain RNA 3D structures using prediction methods. The 3D structure prediction, especially of large RNAs, still remains a significant challenge and there is still a great demand for high-resolution structure prediction methods. In this chapter, we describe RNAComposer, a method and server for the automated prediction of RNA 3D structures based on the knowledge of secondary structure. Its applications are supported by other automated servers: RNA FRABASE and RNApdbee, developed to search and analyze secondary and 3D structures. Another method, RNAlyzer, offers a new way to analyze and visualize the quality of RNA 3D models. The scope and limitations of RNAComposer in application to the automated prediction of riboswitches' 3D structure will be presented and discussed. Analysis of the cyclic di-GMP-II riboswitch from Clostridium acetobutylicum (PDB ID 3Q3Z) as an example allows for 3D structure prediction of related riboswitches from Clostridium difficile 4, Bacillus halodurans 1, and Thermus aquaticus Y5.1 of yet unknown structures.
Affiliation(s)
- K J Purzycka
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
- M Popenda
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
- M Szachniuk
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- M Antczak
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- P Lukasiak
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- J Blazewicz
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- R W Adamiak
  - Department of Structural Chemistry and Biology of Nucleic Acids, Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
  - European Center for Bioinformatics and Genomics, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
|
41
|
Gawronski AR, Turcotte M. RiboFSM: frequent subgraph mining for the discovery of RNA structures and interactions. BMC Bioinformatics 2014; 15 Suppl 13:S2. [PMID: 25434643 PMCID: PMC4248650 DOI: 10.1186/1471-2105-15-s13-s2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
Frequent subgraph mining is a useful method for extracting meaningful patterns from a set of graphs or a single large graph. Here, the graph represents all possible RNA structures and interactions. Patterns that are significantly more frequent in this graph over a random graph are extracted. We hypothesize that these patterns are most likely to represent biological mechanisms. The graph representation used is a directed dual graph, extended to handle intermolecular interactions. The graph is sampled for subgraphs, which are labeled using a canonical labeling method and counted. The resulting patterns are compared to those created from a randomized dataset and scored. The algorithm was applied to the mitochondrial genome of the kinetoplastid species Trypanosoma brucei, which has a unique RNA editing mechanism. The most significant patterns contain two stem-loops, indicative of gRNA, and represent interactions of these structures with target mRNA.
|
42
|
Gopal A, Egecioglu DE, Yoffe AM, Ben-Shaul A, Rao ALN, Knobler CM, Gelbart WM. Viral RNAs are unusually compact. PLoS One 2014; 9:e105875. [PMID: 25188030 PMCID: PMC4154850 DOI: 10.1371/journal.pone.0105875] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 05/25/2014] [Accepted: 07/21/2014] [Indexed: 01/28/2023] Open
Abstract
A majority of viruses are composed of long single-stranded genomic RNA molecules encapsulated by protein shells with diameters of just a few tens of nanometers. We examine the extent to which these viral RNAs have evolved to be physically compact molecules to facilitate encapsulation. Measurements of equal-length viral, non-viral, coding and non-coding RNAs show viral RNAs to have among the smallest sizes in solution, i.e., the highest gel-electrophoretic mobilities and the smallest hydrodynamic radii. Using graph-theoretical analyses we demonstrate that their sizes correlate with the compactness of branching patterns in predicted secondary structure ensembles. The density of branching is determined by the number and relative positions of 3-helix junctions, and is highly sensitive to the presence of rare higher-order junctions with 4 or more helices. Compact branching arises from a preponderance of base pairing between nucleotides close to each other in the primary sequence. The density of branching represents a degree of freedom optimized by viral RNA genomes in response to the evolutionary pressure to be packaged reliably. Several families of viruses are analyzed to delineate the effects of capsid geometry, size and charge stabilization on the selective pressure for RNA compactness. Compact branching has important implications for RNA folding and viral assembly.
Affiliation(s)
- Ajaykumar Gopal
  - Department of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
- Defne E. Egecioglu
  - Department of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
- Aron M. Yoffe
  - Department of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
- Avinoam Ben-Shaul
  - Institute of Chemistry & The Fritz Haber Research Center, The Hebrew University of Jerusalem, Givat Ram, Jerusalem, Israel
- Ayala L. N. Rao
  - Department of Plant Pathology, University of California Riverside, Riverside, California, United States of America
- Charles M. Knobler
  - Department of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
- William M. Gelbart
  - Department of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
|
43
|
Mashaghi A, van Wijk R, Tans S. Circuit Topology of Proteins and Nucleic Acids. Structure 2014; 22:1227-1237. [DOI: 10.1016/j.str.2014.06.015] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 10/31/2013] [Revised: 05/10/2014] [Accepted: 06/17/2014] [Indexed: 01/19/2023]
|
44
|
Firdaus-Raih M, Hamdani HY, Nadzirin N, Ramlan EI, Willett P, Artymiuk PJ. COGNAC: a web server for searching and annotating hydrogen-bonded base interactions in RNA three-dimensional structures. Nucleic Acids Res 2014; 42:W382-8. [PMID: 24831543 PMCID: PMC4086061 DOI: 10.1093/nar/gku438] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
Hydrogen bonds are crucial factors that stabilize a complex ribonucleic acid (RNA) molecule's three-dimensional (3D) structure. Minute conformational changes can result in variations in the hydrogen bond interactions in a particular structure. Furthermore, networks of hydrogen bonds, especially those found in tight clusters, may be important elements in structure stabilization or function and can therefore be regarded as potential tertiary motifs. In this paper, we describe a graph theoretical algorithm implemented as a web server that is able to search for unbroken networks of hydrogen-bonded base interactions and thus provide an accounting of such interactions in RNA 3D structures. This server, COGNAC (COnnection tables Graphs for Nucleic ACids), is also able to compare the hydrogen bond networks between two structures and from such annotations enable the mapping of atomic level differences that may have resulted from conformational changes due to mutations or binding events. The COGNAC server can be accessed at http://mfrlab.org/grafss/cognac.
Affiliation(s)
- Mohd Firdaus-Raih
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Malaysia
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Malaysia
- Hazrina Yusof Hamdani
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Malaysia
- Nurul Nadzirin
- School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Malaysia
- Effirul Ikhwan Ramlan
- Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
- Peter Willett
- Information School, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
- Peter J Artymiuk
- Department of Molecular Biology and Biotechnology, Krebs Institute, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
|
45
|
Combinatorial Insights into RNA Secondary Structure. DISCRETE AND TOPOLOGICAL MODELS IN MOLECULAR BIOLOGY 2014. [DOI: 10.1007/978-3-642-40193-0_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
|
46
|
Kim N, Petingi L, Schlick T. Network Theory Tools for RNA Modeling. WSEAS TRANSACTIONS ON MATHEMATICS 2013; 9:941-955. [PMID: 25414570 PMCID: PMC4235620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
An introduction into the usage of graph or network theory tools for the study of RNA molecules is presented. By using vertices and edges to define RNA secondary structures as tree and dual graphs, we can enumerate, predict, and design RNA topologies. Graph connectivity and associated Laplacian eigenvalues relate to biological properties of RNA and help understand RNA motifs as well as build, by computational design, various RNA target structures. Importantly, graph theoretical representations of RNAs reduce drastically the conformational space size and therefore simplify modeling and prediction tasks. Ongoing challenges remain regarding general RNA design, representation of RNA pseudoknots, and tertiary structure prediction. Thus, developments in network theory may help advance RNA biology.
Affiliation(s)
- Namhee Kim
- New York University, Department of Chemistry, Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012, USA
- Louis Petingi
- College of Staten Island, City University of New York, Department of Computer Science, 2800 Victory Boulevard, Staten Island, NY 10314, USA
- Tamar Schlick
- New York University, Department of Chemistry, Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012, USA
|
47
|
Laing C, Jung S, Kim N, Elmetwaly S, Zahran M, Schlick T. Predicting helical topologies in RNA junctions as tree graphs. PLoS One 2013; 8:e71947. [PMID: 23991010 PMCID: PMC3753280 DOI: 10.1371/journal.pone.0071947] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 05/01/2013] [Accepted: 07/05/2013] [Indexed: 01/11/2023]
Abstract
RNA molecules are important cellular components involved in many fundamental biological processes. Understanding the mechanisms behind their functions requires knowledge of their tertiary structures. Though computational RNA folding approaches exist, they often require manual manipulation and expert intuition; predicting global long-range tertiary contacts remains challenging. Here we develop a computational approach and associated program module (RNAJAG) to predict helical arrangements/topologies in RNA junctions. Our method has two components: junction topology prediction and graph modeling. First, junction topologies are determined by a data mining approach from a given secondary structure of the target RNAs; second, the predicted topology is used to construct a tree graph consistent with geometric preferences analyzed from solved RNAs. The predicted graphs, which model the helical arrangements of RNA junctions for a large set of 200 junctions using a cross validation procedure, yield fairly good representations compared to the helical configurations in native RNAs, and can be further used to develop all-atom models as we show for two examples. Because junctions are among the most complex structural elements in RNA, this work advances folding structure prediction methods of large RNAs. The RNAJAG module is available to academic users upon request.
Affiliation(s)
- Christian Laing
- Department of Biology, Wilkes University, Wilkes-Barre, Pennsylvania, United States of America
- Department of Mathematics and Computer Science, Wilkes University, Wilkes-Barre, Pennsylvania, United States of America
- Segun Jung
- Department of Chemistry, New York University, New York, United States of America
- Namhee Kim
- Department of Chemistry, New York University, New York, United States of America
- Shereef Elmetwaly
- Department of Chemistry, New York University, New York, United States of America
- Mai Zahran
- Department of Chemistry, New York University, New York, United States of America
- Tamar Schlick
- Department of Chemistry, New York University, New York, United States of America
- Courant Institute of Mathematical Sciences, New York University, New York, United States of America
|
48
|
Heyne S, Costa F, Rose D, Backofen R. GraphClust: alignment-free structural clustering of local RNA secondary structures. Bioinformatics 2013; 28:i224-32. [PMID: 22689765 PMCID: PMC3371856 DOI: 10.1093/bioinformatics/bts224] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 12/29/2022]
Abstract
Motivation: Clustering according to sequence–structure similarity has now become a generally accepted scheme for ncRNA annotation. Its application to complete genomic sequences as well as whole transcriptomes is therefore desirable but hindered by extremely high computational costs. Results: We present a novel linear-time, alignment-free method for comparing and clustering RNAs according to sequence and structure. The approach scales to datasets of hundreds of thousands of sequences. The quality of the retrieved clusters has been benchmarked against known ncRNA datasets and is comparable to state-of-the-art sequence–structure methods although achieving speedups of several orders of magnitude. A selection of applications aiming at the detection of novel structural ncRNAs are presented. Exemplarily, we predicted local structural elements specific to lincRNAs likely functionally associating involved transcripts to vital processes of the human nervous system. In total, we predicted 349 local structural RNA elements. Availability: The GraphClust pipeline is available on request. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Steffen Heyne
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
|
49
|
Zhao Y, Huang Y, Gong Z, Wang Y, Man J, Xiao Y. Automated and fast building of three-dimensional RNA structures. Sci Rep 2012; 2:734. [PMID: 23071898 PMCID: PMC3471093 DOI: 10.1038/srep00734] [Citation(s) in RCA: 142] [Impact Index Per Article: 11.8] [Received: 06/27/2012] [Accepted: 09/17/2012] [Indexed: 12/22/2022]
Abstract
Building tertiary structures of non-coding RNA is required to understand their functions and design new molecules. Current algorithms of RNA tertiary structure prediction give satisfactory accuracy only for small size and simple topology and many of them need manual manipulation. Here, we present an automated and fast program, 3dRNA, for RNA tertiary structure prediction with reasonable accuracy for RNAs of larger size and complex topology.
Affiliation(s)
- Yunjie Zhao
- Biomolecular Physics and Modeling Group, Department of Physics Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
|
50
|
Widmann J, Stombaugh J, McDonald D, Chocholousova J, Gardner P, Iyer MK, Liu Z, Lozupone CA, Quinn J, Smit S, Wikman S, Zaneveld JR, Knight R. RNASTAR: an RNA STructural Alignment Repository that provides insight into the evolution of natural and artificial RNAs. RNA (NEW YORK, N.Y.) 2012; 18:1319-27. [PMID: 22645380 PMCID: PMC3383963 DOI: 10.1261/rna.032052.111] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 05/14/2023]
Abstract
Automated RNA alignment algorithms often fail to recapture the essential conserved sites that are critical for function. To assist in the refinement of these algorithms, we manually curated a set of 148 alignments with a total of 9600 unique sequences, in which each alignment was backed by at least one crystal or NMR structure. These alignments included both naturally and artificially selected molecules. We used principles of isostericity to improve the alignments from an average of 83% to 94% isosteric base pairs. We expect that this alignment collection will assist in a wide range of benchmarking efforts and provide new insight into evolutionary principles governing change in RNA structural motifs. The improved alignments have been contributed to the Rfam database.
Affiliation(s)
- Jeremy Widmann
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, Colorado 80309, USA
- Jesse Stombaugh
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, Colorado 80309, USA
- Daniel McDonald
- Biofrontiers Institute, University of Colorado at Boulder, Boulder, Colorado 80309, USA
- Jana Chocholousova
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, Prague 6, Czech Republic
- Paul Gardner
- School of Biological Sciences, University of Canterbury, Christchurch 8140, New Zealand
- Matthew K. Iyer
- Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Zongzhi Liu
- Department of Pathology Informatics, School of Medicine, Yale University, New Haven, Connecticut 06510, USA
- Catherine A. Lozupone
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, Colorado 80309, USA
- John Quinn
- Thermo Fisher Scientific, Lafayette, Colorado 80026, USA
- Sandra Smit
- Laboratory of Bioinformatics, Wageningen University, 6700 AN Wageningen, The Netherlands
- Jesse R.R. Zaneveld
- Department of Microbiology, Oregon State University, Corvallis, Oregon 97331, USA
- Rob Knight
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, Colorado 80309, USA
- Howard Hughes Medical Institute, Boulder, Colorado 80309, USA
|