1
Quadrini M, Tesei L, Merelli E. Automatic generation of pseudoknotted RNAs taxonomy. BMC Bioinformatics 2023; 23:575. [PMID: 37322429 DOI: 10.1186/s12859-023-05362-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 05/25/2023] [Indexed: 06/17/2023] Open
Abstract
BACKGROUND The ability to compare RNA secondary structures is important in understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures because pseudoknots are difficult to map onto classical tree representations. Some approaches can cluster pseudoknotted RNAs, but there is no general framework for evaluating their performance. RESULTS We introduce an evaluation framework based on a similarity/dissimilarity measure obtained by a comparison method and agglomerative clustering. Their combination automatically partitions a set of molecules into groups. To illustrate the framework, we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five different comparison methods from the literature that are able to manage pseudoknots. For each method, we cluster the molecules in the benchmark to obtain the taxa at the rank of phylum according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and compare their suitability for reconstructing the taxa.
Affiliation(s)
- Michela Quadrini
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Luca Tesei
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Emanuela Merelli
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
2
Marchei D, Merelli E. RNA secondary structure factorization in prime tangles. BMC Bioinformatics 2022; 23:345. [PMID: 35982399 PMCID: PMC9386957 DOI: 10.1186/s12859-022-04879-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 08/03/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Due to their key role in various biological processes, RNA secondary structures have always been the focus of in-depth analyses, with great efforts from mathematicians and biologists to find a suitable abstract representation for modelling their functional and structural properties. One contribution is due to Kauffman and Magarshak, who modelled RNA secondary structures as mathematical objects constructed in link theory: tangles of the Brauer Monoid. In this paper, we extend the tangle-based model with its minimal prime factorization, which is useful for analyzing patterns that characterize the RNA secondary structure. RESULTS By leveraging the mapping between RNA and tangles, we prove that the prime factorizations of tangle-based models share some patterns with features of RNA folding. We analyze the E. coli tRNA and provide some visual examples of interesting patterns. CONCLUSIONS We formulate an open question on the nature of the class of equivalent factorizations and discuss some research directions in this regard. We also propose some practical applications of the tangle-based method to RNA classification and folding prediction as a useful tool for learning algorithms, even though the full factorization is not known.
Affiliation(s)
- Daniele Marchei
  - University of Camerino, Via Madonna delle Carceri 9, 62032, Camerino, Italy
- Emanuela Merelli
  - University of Camerino, Via Madonna delle Carceri 9, 62032, Camerino, Italy
3
Boi L. A reappraisal of the form: function problem-theory and phenomenology. Theory Biosci 2022; 141:73-103. [PMID: 35471494 DOI: 10.1007/s12064-022-00368-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Accepted: 03/11/2022] [Indexed: 11/26/2022]
Abstract
This paper aims to demonstrate that some geometrical and topological transformations and operations serve not only as promoters of many specific genetic and cellular events in multicellular living organisms, but also as initiators of the organization and regulation of their functions. Thus, changes in the form and structure of macromolecular and cellular systems must be directly associated with their functions. There are specific classes of enzymes that manipulate the geometry and topology of complex DNA-protein structures, and thereby perform many important cellular processes, including segregation of daughter chromosomes, gene regulation, and DNA repair. We argue that form has an organizing power, hence a causal action, in the sense that it can induce functional events during different biological processes, at the supramolecular, cellular, and organismal levels of organization. Clearly, topological forms must be matched with specific kinetic and dynamical parameters to be functionally effective in living systems. This effectiveness is remarkably apparent, for example, in the regulation of genome functions and in cell activity. In more general terms, we try to show that the conformational plasticity of biological systems depends on different kinds of topological manipulations performed by specific families of enzymes. In doing so, these enzymes catalyze all those spatial and dynamical changes of biological structures that are suitable for the functions to be carried out by the organism.
Affiliation(s)
- Luciano Boi
  - École des Hautes Études en Sciences Sociales, Centre de Mathématiques (CAMS), 54, bd Raspail, 75006, Paris, France
4
Huang FW, Barrett CL, Reidys CM. The energy-spectrum of bicompatible sequences. Algorithms Mol Biol 2021; 16:7. [PMID: 34074304 PMCID: PMC8167974 DOI: 10.1186/s13015-021-00187-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2020] [Accepted: 05/24/2021] [Indexed: 12/04/2022] Open
Abstract
Background Genotype-phenotype maps provide a meaningful filtration of sequence space and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a given RNA structure, play an important role in the context of neutral evolution. Sequences that are simultaneously compatible with two given structures (bicompatible sequences), are beacons in phenotypic transitions, induced by erroneously replicating populations of RNA sequences. RNA riboswitches, which are capable of expressing two distinct secondary structures without changing the underlying sequence, are one example of bicompatible sequences in living organisms. Results We present a full loop energy model Boltzmann sampler of bicompatible sequences for pairs of structures. The sequence sampler employs a dynamic programming routine whose time complexity is polynomial when assuming the maximum number of exposed vertices, κ, is a constant. The parameter κ depends on the two structures and can be very large. We introduce a novel topological framework encapsulating the relations between loops that sheds light on the understanding of κ. Based on this framework, we give an algorithm to sample sequences with minimum κ on a particular topologically classified case as well as giving hints to the solution in the other cases. As a result, we utilize our sequence sampler to study some established riboswitches. Conclusion Our analysis of riboswitch sequences shows that a pair of structures needs to satisfy key properties in order to facilitate phenotypic transitions and that pairs of random structures are unlikely to do so. Our analysis observes a distinct signature of riboswitch sequences, suggesting a new criterion for identifying native sequences and sequences subjected to evolutionary pressure. Our free software is available at: https://github.com/FenixHuang667/Bifold.
5
Quadrini M. Structural relation matching: an algorithm to identify structural patterns into RNAs and their interactions. J Integr Bioinform 2021; 18:111-126. [PMID: 34051708 PMCID: PMC9382659 DOI: 10.1515/jib-2020-0039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2020] [Accepted: 04/19/2021] [Indexed: 11/15/2022] Open
Abstract
RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA-RNA interactions, can be abstracted in terms of secondary structures, i.e., lists of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows. Both are determined by suitably collapsing nucleotides and arcs. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure, represented by an arc diagram, is pseudoknot-free if its arc diagram does not present any crossing among arcs; otherwise, it is said to be pseudoknotted. In this study, we address the problem of identifying a given structural pattern in secondary structures, or in the associated cores or shadows, of both RNAs and RNA-RNA interactions characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix, whose elements represent the relations among loops. We therefore address the problem by working with matrices and submatrices. The algorithms, implemented in Python, work in polynomial time. We test our approach on a set of 16S ribosomal RNAs with inhibitors of Thermus thermophilus, and we quantify the structural effect of the inhibitors.
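The crossing-arc definition of a pseudoknot used in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names `arcs_cross` and `is_pseudoknotted` and the two toy structures are hypothetical, and arcs are assumed to be pairs of 1-based nucleotide positions with no shared endpoints.

```python
from itertools import combinations

def arcs_cross(a, b):
    """Return True if arcs a and b cross, i.e. i < k < j < l after ordering."""
    # Normalise each arc to (start, end), then order the two arcs by start.
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    return i < k < j < l

def is_pseudoknotted(arcs):
    """A structure is pseudoknotted if any two of its arcs cross."""
    return any(arcs_cross(a, b) for a, b in combinations(arcs, 2))

# Hypothetical toy structures (assumed data, for illustration only).
hairpin = [(1, 10), (2, 9), (3, 8)]   # nested arcs only
h_type = [(1, 6), (2, 5), (4, 9)]     # (4, 9) crosses (1, 6) and (2, 5)

print(is_pseudoknotted(hairpin))
print(is_pseudoknotted(h_type))
```

The pairwise check is quadratic in the number of arcs, which is enough to show the definition; it says nothing about how the cited work represents loops or matches patterns.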
Affiliation(s)
- Michela Quadrini
  - University of Camerino, School of Science and Technology, via Madonna delle Carceri, Camerino, Italy
6
Mak CH, Phan ENH. Diagrammatic approaches to RNA structures with trinucleotide repeats. Biophys J 2021; 120:2343-2354. [PMID: 33887227 PMCID: PMC8390803 DOI: 10.1016/j.bpj.2021.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2020] [Revised: 04/07/2021] [Accepted: 04/09/2021] [Indexed: 11/30/2022] Open
Abstract
Trinucleotide repeat expansion disorders are associated with the overexpansion of (CNG) repeats on the genome. Messenger RNA transcripts of sequences with greater than 60–100 (CNG) tandem units have been implicated in trinucleotide repeat expansion disorder pathogenesis. In this work, we develop a diagrammatic theory to study the structural diversity of these (CNG)n RNA sequences. Representing structural elements on the chain’s conformation by a set of graphs and employing elementary diagrammatic methods, we have formulated a renormalization procedure to re-sum these graphs and arrive at a closed-form expression for the ensemble partition function. With a simple approximation for the renormalization and applied to extended (CNG)n sequences, this theory can comprehensively capture an infinite set of conformations with any number and any combination of duplexes, hairpins, multiway junctions, and quadruplexes. To quantify the diversity of different (CNG)n ensembles, the analytical equations derived from the diagrammatic theory were solved numerically to derive equilibrium estimates for the secondary structural contents of the chains. The results suggest that the structural ensembles of (CNG)n repeat sequence with n ∼60 are surprisingly diverse, and the distribution is sensitive to the ability of the N nucleotide to make noncanonical pairs and whether the (CNG)n sequence can sustain stable quadruplexes. The results show how perturbations in the form of biases on the stabilities of the various structural motifs, duplexes, junctions, helices, and quadruplexes could affect the secondary structures of the chains and how these structures may switch when they are perturbed.
Affiliation(s)
- Chi H Mak
  - Department of Chemistry, Center of Applied Mathematical Sciences and Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California
- Ethan N H Phan
  - Department of Chemistry, University of Southern California, Los Angeles, California
7
Chowdhury S, Bhuiya S, Haque L, Das S. A Spectroscopic Approach towards the Comparative Binding Studies of the Antioxidizing Flavonol Myricetin with Various Single‐Stranded RNA. ChemistrySelect 2020. [DOI: 10.1002/slct.202003601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Susmita Chowdhury
  - Biophysical Chemistry Laboratory, Physical Chemistry Section, Department of Chemistry, Jadavpur University, 188, Raja S. C. Mallick Road, Kolkata 700032, India
- Sutanwi Bhuiya
  - Biophysical Chemistry Laboratory, Physical Chemistry Section, Department of Chemistry, Jadavpur University, 188, Raja S. C. Mallick Road, Kolkata 700032, India
- Lucy Haque
  - Biophysical Chemistry Laboratory, Physical Chemistry Section, Department of Chemistry, Jadavpur University, 188, Raja S. C. Mallick Road, Kolkata 700032, India
- Suman Das
  - Biophysical Chemistry Laboratory, Physical Chemistry Section, Department of Chemistry, Jadavpur University, 188, Raja S. C. Mallick Road, Kolkata 700032, India
8
Scalvini B, Sheikhhassani V, Woodard J, Aupič J, Dame RT, Jerala R, Mashaghi A. Topology of Folded Molecular Chains: From Single Biomolecules to Engineered Origami. Trends in Chemistry 2020. [DOI: 10.1016/j.trechm.2020.04.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/09/2023]
9
Abstract
There are some NP-hard problems in the prediction of RNA structures. Predicting the folding structure of an RNA nucleotide sequence remains an unsolved challenge. We investigate a computing algorithm for RNA folding structure prediction, including pseudoknots, based on extended structures and the basin hopping graph. This study presents a prediction algorithm based on extended structures and proposes an improved computing algorithm based on the barrier tree and the basin hopping graph, which are attractive approaches in RNA folding structure prediction. Many experiments were carried out on the Rfam14.1 and PseudoBase databases; the experimental results show that our two algorithms are more efficient and accurate than other existing algorithms.
Affiliation(s)
- Zhendong Liu
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, P. R. China
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles 90095, USA
  - Department of Statistics, Harvard University, Cambridge, MA 02138, USA
- Gang Li
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles 90095, USA
- Jun S. Liu
  - Department of Statistics, Harvard University, Cambridge, MA 02138, USA
10
Rubach P, Zajac S, Jastrzebski B, Sulkowska JI, Sułkowski P. Genus for biomolecules. Nucleic Acids Res 2020; 48:D1129-D1135. [PMID: 31584078 PMCID: PMC6943057 DOI: 10.1093/nar/gkz845] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/15/2019] [Revised: 09/17/2019] [Accepted: 10/02/2019] [Indexed: 01/03/2023] Open
Abstract
The 'Genus for biomolecules' database (http://genus.fuw.edu.pl) collects information about the topological structure and complexity of proteins and RNA chains, which is captured by the genus of a given chain and its subchains. For each biomolecule, this information is shown in the form of a genus trace plot, as well as a genus matrix diagram. We assemble such information for all proteins and RNA structures deposited in the Protein Data Bank (PDB). This database also presents various statistics and extensive information about the biological function of the analyzed biomolecules. The database is regularly self-updating, as new structures are deposited in the PDB. Moreover, users can analyze their own structures.
Affiliation(s)
- Paweł Rubach
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
  - Warsaw School of Economics, Al. Niepodległości 162, 02-554 Warsaw, Poland
- Sebastian Zajac
  - Warsaw School of Economics, Al. Niepodległości 162, 02-554 Warsaw, Poland
- Borys Jastrzebski
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Joanna I Sulkowska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Piotr Sułkowski
  - Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland
  - Walter Burke Institute for Theoretical Physics, California Institute of Technology, Pasadena, CA 91125, USA
11
Iannelli F, Mamasakhlisov Y, Netz RR. Cold denaturation of RNA secondary structures with loop entropy and quenched disorder. Phys Rev E 2020; 101:012502. [PMID: 32069687 DOI: 10.1103/physreve.101.012502] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2018] [Indexed: 06/10/2023]
Abstract
The critical behavior of ribonucleic acid (RNA) secondary structures with quenched sequence randomness is studied by means of the constrained annealing method. A thermodynamic phase transition is induced by including the conformational weight of loop structures. In addition to the expected melting at high temperature, a cold-melting transition appears when the disorder strength induces competition between favorable and unfavorable base pairs. Our results suggest that the cold denaturation of RNA found experimentally might be triggered by quenched sequence disorder. We calculate hot- and cold-melting critical temperatures for competing favorable and unfavorable base-pair energies and present a folding phase diagram as a function of the loop exponent and temperature.
Affiliation(s)
- Flavio Iannelli
  - Humboldt-Universität zu Berlin, Institut für Physik, Newtonstraße 15, 12481 Berlin, Germany
- Yevgeni Mamasakhlisov
  - Department of Molecular Physics, Yerevan State University, 1 Alex Manougian Street, Yerevan 0025, Armenia
- Roland R Netz
  - Fachbereich Physik, Freie Universität Berlin, 14195 Berlin, Germany
12
Li TJX, Burris CS, Reidys CM. The block spectrum of RNA pseudoknot structures. J Math Biol 2019; 79:791-822. [PMID: 31172257 DOI: 10.1007/s00285-019-01379-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/31/2018] [Revised: 04/29/2019] [Indexed: 01/08/2023]
Abstract
In this paper we analyze the length-spectrum of blocks in [Formula: see text]-structures. [Formula: see text]-structures are a class of RNA pseudoknot structures that play a key role in the context of polynomial time RNA folding. A [Formula: see text]-structure is constructed by nesting and concatenating specific building components having topological genus at most [Formula: see text]. A block is a substructure enclosed by crossing maximal arcs with respect to the partial order induced by nesting. We show that, in uniformly generated [Formula: see text]-structures, there is a significant gap in this length-spectrum, i.e., there asymptotically almost surely exists a unique longest block of length at least [Formula: see text] and that with high probability any other block has finite length. For fixed [Formula: see text], we prove that the length of the complement of the longest block converges to a discrete limit law, and that the distribution of short blocks of given length tends to a negative binomial distribution in the limit of long sequences. We refine this analysis to the length spectrum of blocks of specific pseudoknot types, such as H-type and kissing hairpins. Our results generalize the rainbow spectrum on secondary structures by the first and third authors and are being put into context with the structural prediction of long non-coding RNAs.
Affiliation(s)
- Thomas J X Li
  - Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA, USA
- Christian M Reidys
  - Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA, USA
  - Department of Mathematics, University of Virginia, Charlottesville, VA, USA
13
Shabash B, Wiese KC. jViz.RNA 4.0-Visualizing pseudoknots and RNA editing employing compressed tree graphs. PLoS One 2019; 14:e0210281. [PMID: 31059508 PMCID: PMC6502502 DOI: 10.1371/journal.pone.0210281] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/12/2018] [Accepted: 12/19/2018] [Indexed: 11/18/2022] Open
Abstract
Previously, we have introduced an improved version of jViz.RNA which enabled faster and more stable RNA visualization by employing compressed tree graphs. However, the new RNA representation and visualization method required a sophisticated mechanism of pseudoknot visualization. In this work, we present our novel pseudoknot classification and implementation of pseudoknot visualization in the context of the new RNA graph model. We then compare our approach with other RNA visualization software, and demonstrate jViz.RNA 4.0's benefits compared to other software. Additionally, we introduce interactive editing functionality into jViz.RNA and demonstrate its benefits in exploring and building RNA structures. The results presented highlight the new high degree of utility jViz.RNA 4.0 now offers. Users are now able to visualize pseudoknotted RNA, manipulate the resulting automatic layouts to suit their individual needs, and change both positioning and connectivity of the RNA molecules examined. Care was taken to limit overlap between structural elements, particularly in the case of pseudoknots to ensure an intuitive and informative layout of the final RNA structure. Availability: The software is freely available at: https://jviz.cs.sfu.ca/.
Affiliation(s)
- Boris Shabash
  - School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Kay C. Wiese
  - School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
14
Genus trace reveals the topological complexity and domain structure of biomolecules. Sci Rep 2018; 8:17537. [PMID: 30510290 PMCID: PMC6277428 DOI: 10.1038/s41598-018-35557-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/01/2018] [Accepted: 11/01/2018] [Indexed: 01/19/2023] Open
Abstract
The structure of bonds in biomolecules, such as base pairs in RNA chains or native interactions in proteins, can be presented in the form of a chord diagram. A given biomolecule is then characterized by the genus of an auxiliary two-dimensional surface associated to such a diagram. In this work we introduce the notion of the genus trace, which describes dependence of genus on the choice of a subchain of a given backbone chain. We find that the genus trace encodes interesting physical and biological information about a given biomolecule and its three dimensional structural complexity; in particular it gives a way to quantify how much more complicated a biomolecule is than its nested secondary structure alone would indicate. We illustrate this statement in many examples, involving both RNA and protein chains. First, we conduct a survey of all published RNA structures with better than 3 Å resolution in the PDB database, and find that the genus of natural structural RNAs has roughly linear dependence on their length. Then, we show that the genus trace captures properties of various types of base pairs in RNA, and enables the identification of the domain structure of a ribosome. Furthermore, we find that not only does the genus trace detect a domain structure, but it also predicts a cooperative folding pattern in multi-domain proteins. The genus trace turns out to be a useful and versatile tool, with many potential applications.
15
Yamanaka M. Random matrix theory for an inter-fragment interaction energy matrix in fragment molecular orbital method. Chem-Bio Informatics Journal 2018. [DOI: 10.1273/cbij.18.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Masanori Yamanaka
  - Department of Physics, College of Science and Technology, Nihon University
16
Liu Z, Zhu D, Dai Q. Predicting Model and Algorithm in RNA Folding Structure Including Pseudoknots. Int J Pattern Recogn 2018. [DOI: 10.1142/s0218001418510059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
The prediction of RNA structure with pseudoknots is a nondeterministic polynomial-time hard (NP-hard) problem. Using minimum free energy models and computational methods, we investigate RNA pseudoknotted structures. Our paper presents an efficient algorithm for predicting RNA structure with pseudoknots; the algorithm takes O([Formula: see text]) time and O([Formula: see text]) space, and experimental tests on Rfam10.1 and PseudoBase indicate that it is more effective and precise. Its prediction accuracy, time complexity and space complexity outperform those of existing algorithms, such as the Maximum Weight Matching (MWM) algorithm, the PKNOTS algorithm and the Inner Limiting Layer (ILM) algorithm, and it can predict arbitrary pseudoknots. There also exists a [Formula: see text] ([Formula: see text]) polynomial time approximation scheme for finding the maximum number of stackings, and we give the proof of the approximation scheme for RNA pseudoknotted structures. We have improved several types of pseudoknots considered in RNA folding structure, and we analyze the possible transitions between types of pseudoknots.
Affiliation(s)
- Zhendong Liu
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, P. R. China
- Daming Zhu
  - School of Computer Science and Technology, Shandong University, Jinan 250101, P. R. China
- Qionghai Dai
  - Department of Automation, Tsinghua University, Beijing 100084, P. R. China
17
Barrett C, Huang FW, Reidys CM. Sequence-structure relations of biopolymers. Bioinformatics 2018; 33:382-389. [PMID: 28171628 DOI: 10.1093/bioinformatics/btw621] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/06/2015] [Revised: 05/16/2016] [Accepted: 09/26/2016] [Indexed: 12/12/2022] Open
Abstract
Motivation DNA data is transcribed into single-stranded RNA, which folds into specific molecular structures. In this paper we pose the question to what extent sequence- and structure-information correlate. We view this correlation as structural semantics of sequence data that allows for a different interpretation than conventional sequence alignment. Structural semantics could enable us to identify more general embedded ‘patterns’ in DNA and RNA sequences. Results We compute the partition function of sequences with respect to a fixed structure and connect this computation to the mutual information of a sequence–structure pair for RNA secondary structures. We present a Boltzmann sampler and obtain the a priori probability of specific sequence patterns. We present a detailed analysis for the three PDB-structures, 2JXV (hairpin), 2N3R (3-branch multi-loop) and 1EHZ (tRNA). We localize specific sequence patterns, contrast the energy spectrum of the Boltzmann sampled sequences versus those sequences that refold into the same structure and derive a criterion to identify native structures. We illustrate that there are multiple sequences in the partition function of a fixed structure, each having nearly the same mutual information, that are nevertheless poorly aligned. This indicates the possibility of the existence of relevant patterns embedded in the sequences that are not discoverable using alignments. Availability and Implementation The source code is freely available at http://staff.vbi.vt.edu/fenixh/Sampler.zip. Contact duckcr@vbi.vt.edu. Supplementary Information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christopher Barrett
  - Biocomplexity Institute of Virginia Tech, Virginia Tech University, Blacksburg, VA, USA
- Fenix W Huang
  - Biocomplexity Institute of Virginia Tech, Virginia Tech University, Blacksburg, VA, USA
- Christian M Reidys
  - Biocomplexity Institute of Virginia Tech, Virginia Tech University, Blacksburg, VA, USA
18
Deguchi T, Uehara E. Statistical and Dynamical Properties of Topological Polymers with Graphs and Ring Polymers with Knots. Polymers (Basel) 2017; 9:E252. [PMID: 30970929 PMCID: PMC6432503 DOI: 10.3390/polym9070252] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 01/26/2017] [Revised: 06/22/2017] [Accepted: 06/23/2017] [Indexed: 11/16/2022] Open
Abstract
We review recent theoretical studies on the statistical and dynamical properties of polymers with nontrivial structures in chemical connectivity and those of polymers with a nontrivial topology, such as knotted ring polymers in solution. We call polymers with nontrivial structures in chemical connectivity expressed by graphs "topological polymers". Graphs with no loop have only trivial topology, while graphs with loops such as multiple rings may have nontrivial topology of spatial graphs as embeddings in three dimensions, e.g., knots or links in some loops. For simplicity, we thus also call such polymers with nontrivial topology "topological polymers". For various polymers with different structures in chemical connectivity, we numerically evaluate the mean-square radius of gyration and the hydrodynamic radius systematically through simulation. We evaluate the ratio of the gyration radius to the hydrodynamic radius, which we expect to be universal from the viewpoint of the renormalization group. Furthermore, we show that the short-distance intrachain correlation is much enhanced for real topological polymers (the Kremer–Grest model) expressed with complex graphs. We then address topological properties of ring polymers in solution. We define the knotting probability of a knot K as the probability that a given random polygon or self-avoiding polygon of N vertices has the knot K. We show a formula for expressing it as a function of the number of segments N, which gives good fitted curves to the data of the knotting probability versus N. We show numerically that the average size of self-avoiding polygons with a fixed knot can be much larger than that of polygons under no topological constraint if the excluded volume is small. We call this effect "topological swelling".
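To make the two size measures in this abstract concrete, the sketch below computes the radius of gyration and a Kirkwood-type hydrodynamic radius for a single conformation given as 3D vertex coordinates. It is only an illustration, not the authors' simulation code: the function names are hypothetical, ensemble averaging is omitted, and the normalization 1/R_H = (1/N^2) * sum over i != j of 1/|r_i - r_j| is an assumed convention (some texts divide by N(N-1) instead).

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the vertices from their centroid."""
    n = len(points)
    center = [sum(p[k] for p in points) / n for k in range(3)]
    mean_sq = sum(math.dist(p, center) ** 2 for p in points) / n
    return math.sqrt(mean_sq)


def hydrodynamic_radius(points):
    """Kirkwood-type estimate: 1/R_H = (1/N^2) * sum_{i != j} 1/|r_i - r_j|.

    The 1/N^2 normalization is an assumption made for this sketch.
    """
    n = len(points)
    inverse_sum = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            # Each unordered pair counts twice in the sum over i != j.
            inverse_sum += 2.0 / math.dist(points[a], points[b])
    return n * n / inverse_sum


def size_ratio(points):
    """Ratio R_g / R_H for one conformation."""
    return radius_of_gyration(points) / hydrodynamic_radius(points)
```

In a simulation one would average the squared gyration radius and the inverse distances over many sampled conformations before taking the ratio; the single-conformation version here only shows the definitions.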
Affiliation(s)
- Tetsuo Deguchi
- Department of Physics, Faculty of Core Research, Ochanomizu University, Ohtsuka 2-1-1, Bunkyo-ku, Tokyo 112-8610, Japan.
| | - Erica Uehara
- Department of Physics, Faculty of Core Research, Ochanomizu University, Ohtsuka 2-1-1, Bunkyo-ku, Tokyo 112-8610, Japan.
|
19
|
Shabash B, Wiese KC. RNA Visualization: Relevance and the Current State-of-the-Art Focusing on Pseudoknots. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:696-712. [PMID: 26915129 DOI: 10.1109/tcbb.2016.2522421] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
RNA visualization is crucial for understanding the relationship between RNA structure and function, as well as for developing better RNA structure prediction algorithms. However, in the context of RNA visualization, one key structure remains difficult to visualize: pseudoknots. Pseudoknots occur in RNA folding when two secondary structural components form base pairs between them. The three-dimensional nature of these components makes them challenging to visualize in two-dimensional media, such as print media or screens. In this review, we focus on the advancements that have been made in the field of RNA visualization in two-dimensional media in the past two decades. The review aims at presenting all relevant aspects of pseudoknot visualization. We start with an overview of several pseudoknotted structures and their relevance in RNA function. Next, we discuss the theoretical basis for RNA structural topology classification and present RNA classification systems for both pseudoknotted and non-pseudoknotted RNAs. Each description of an RNA classification system is followed by a discussion of the software tools and algorithms developed to date to visualize RNA, comparing the different tools' strengths and shortcomings.
|
20
|
Topological Classification of RNA Structures via Intersection Graph. THEORY AND PRACTICE OF NATURAL COMPUTING 2017. [DOI: 10.1007/978-3-319-71069-3_16] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/02/2023]
|
21
|
Huang FW, Reidys CM. Topological language for RNA. Math Biosci 2016; 282:109-120. [DOI: 10.1016/j.mbs.2016.10.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/13/2016] [Revised: 10/17/2016] [Accepted: 10/17/2016] [Indexed: 12/26/2022]
|
22
|
Li TJX, Reidys CM. Statistics of topological RNA structures. J Math Biol 2016; 74:1793-1821. [DOI: 10.1007/s00285-016-1078-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/21/2016] [Revised: 10/30/2016] [Indexed: 11/30/2022]
|
23
|
Uehara E, Deguchi T. Statistical and hydrodynamic properties of topological polymers for various graphs showing enhanced short-range correlation. J Chem Phys 2016; 145:164905. [DOI: 10.1063/1.4965828] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 11/14/2022] Open
|
24
|
Vernizzi G, Orland H, Zee A. Classification and predictions of RNA pseudoknots based on topological invariants. Phys Rev E 2016; 94:042410. [PMID: 27841638 DOI: 10.1103/physreve.94.042410] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/16/2016] [Indexed: 01/21/2023]
Abstract
We propose a new topological characterization of ribonucleic acid (RNA) secondary structures with pseudoknots based on two topological invariants. Starting from the classic arc representation of RNA secondary structures, we consider a model that couples both (i) the topological genus of the graph and (ii) the number of crossing arcs of the corresponding primitive graph. We add a term proportional to these topological invariants to the standard free energy of the RNA molecule, thus obtaining a novel free-energy parametrization that takes into account the abundance of topologies of RNA pseudoknots observed in RNA databases.
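As an illustration of the first invariant mentioned above, the following sketch computes the topological genus of an arc diagram over a single backbone. It uses the standard construction (count the boundary components r of the fattened diagram, then apply Euler's formula g = (1 + n - r) / 2 for n arcs); it is not taken from the paper, the function name is hypothetical, and it assumes every position occurs in at most one base pair.

```python
def genus(base_pairs):
    """Topological genus of an arc diagram over one backbone (a sketch).

    base_pairs: iterable of (i, j) positions; each position is assumed
    to be paired at most once. Unpaired positions do not affect the genus.
    """
    base_pairs = list(base_pairs)
    if not base_pairs:
        return 0
    # Relabel the paired positions as 0 .. 2n-1 along the backbone.
    endpoints = sorted(p for pair in base_pairs for p in pair)
    rank = {p: k for k, p in enumerate(endpoints)}
    m = len(endpoints)
    partner = [0] * m
    for i, j in base_pairs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]
    # Boundary components are the cycles of "jump along the arc,
    # then step to the next endpoint on the (cyclically closed) backbone".
    seen = [False] * m
    boundaries = 0
    for start in range(m):
        if seen[start]:
            continue
        boundaries += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = (partner[k] + 1) % m
    # Euler characteristic with one vertex (collapsed backbone) and n edges.
    return (1 + len(base_pairs) - boundaries) // 2
```

For example, two nested arcs give genus 0, while two crossing arcs (an H-type pseudoknot) give genus 1.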
Affiliation(s)
| | - Henri Orland
- Institut de Physique Théorique, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France; Beijing Computational Science Research Center, Haidian District, Beijing 100084, China; Department of Physics, University of California, Santa Barbara, CA 93106, USA
| | - A Zee
- Institut de Physique Théorique, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France; Department of Physics, University of California, Santa Barbara, CA 93106, USA; Kavli Institute for Theoretical Physics, University of California, Santa Barbara, CA 93106, USA
|
25
|
Baulin E, Yacovlev V, Khachko D, Spirin S, Roytberg M. URS DataBase: universe of RNA structures and their motifs. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw085. [PMID: 27242032 PMCID: PMC4885603 DOI: 10.1093/database/baw085] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/08/2015] [Accepted: 05/02/2016] [Indexed: 12/17/2022]
Abstract
The Universe of RNA Structures DataBase (URSDB) stores information obtained from all RNA-containing PDB entries (2935 entries in October 2015). The content of the database is updated regularly. The database consists of 51 tables containing indexed data on various elements of the RNA structures. The database provides a web interface allowing the user to select a subset of structures with desired features and to obtain various statistical data for a selected subset of structures or for all structures. In particular, one can easily obtain statistics on geometric parameters of base pairs, on structural motifs (stems, loops, etc.) or on different types of pseudoknots. The user can also view and get information on an individual structure or its selected parts, e.g. RNA–protein hydrogen bonds. URSDB employs a new, original definition of loops in RNA structures. That definition fits both pseudoknot-free and pseudoknotted secondary structures and coincides with the classical definition in the case of pseudoknot-free structures. To our knowledge, URSDB is the first database supporting searches based on topological classification of pseudoknots and on extended loop classification. Database URL: http://server3.lpm.org.ru/urs/
Affiliation(s)
- Eugene Baulin
- Laboratory of Applied Mathematics, Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russia; Department of Algorithms and Technology of Programming, Faculty of Innovations and High Technology, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region 141700, Russia
| | - Victor Yacovlev
- Laboratory of Applied Mathematics, Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russia; Department of Big Data and Information Retrieval, Faculty of Computer Science, National Research University Higher School of Economics, Moscow 101000, Russia
| | - Denis Khachko
- Laboratory of Applied Mathematics, Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russia
| | - Sergei Spirin
- Department of Mathematical Methods in Biology, Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
| | - Mikhail Roytberg
- Laboratory of Applied Mathematics, Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russia; Department of Algorithms and Technology of Programming, Faculty of Innovations and High Technology, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region 141700, Russia; Department of Big Data and Information Retrieval, Faculty of Computer Science, National Research University Higher School of Economics, Moscow 101000, Russia
|
26
|
Kucharík M, Hofacker IL, Stadler PF, Qin J. Pseudoknots in RNA folding landscapes. Bioinformatics 2016; 32:187-94. [PMID: 26428288 PMCID: PMC4708108 DOI: 10.1093/bioinformatics/btv572] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 06/03/2015] [Revised: 09/10/2015] [Accepted: 09/27/2015] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION The function of an RNA molecule is not only linked to its native structure, which is usually taken to be the ground state of its folding landscape, but also in many cases crucially depends on the details of the folding pathways such as stable folding intermediates or the timing of the folding process itself. To model and understand these processes, it is necessary to go beyond ground state structures. The study of rugged RNA folding landscapes holds the key to answering these questions. Efficient coarse-graining methods are required to reduce the intractably vast energy landscapes into condensed representations such as barrier trees or basin hopping graphs (BHG) that convey an approximate but comprehensive picture of the folding kinetics. So far, exact and heuristic coarse-graining methods have been mostly restricted to the pseudoknot-free secondary structures. Pseudoknots, which are common motifs and have been repeatedly hypothesized to play an important role in guiding folding trajectories, were usually excluded. RESULTS We generalize the BHG framework to include pseudoknotted RNA structures and systematically study the differences in predicted folding behavior depending on whether pseudoknotted structures are allowed to occur as folding intermediates or not. We observe that RNAs with pseudoknotted ground state structures tend to have more pseudoknotted folding intermediates than RNAs with pseudoknot-free ground state structures. The occurrence and influence of pseudoknotted intermediates on the folding pathway, however, appear to depend very strongly on the individual RNAs so that no general rule can be inferred. AVAILABILITY AND IMPLEMENTATION The algorithms described here are implemented in C++ as standalone programs. The source code and supplemental material can be freely downloaded from http://www.tbi.univie.ac.at/bhg.html. CONTACT qin@bioinf.uni-leipzig.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Ivo L Hofacker
- Institute for Theoretical Chemistry, Research Group BCB, Faculty of Computer Science, University of Vienna, Austria, RTH, University of Copenhagen, Frederiksberg, Denmark
| | - Peter F Stadler
- Institute for Theoretical Chemistry, RTH, University of Copenhagen, Frederiksberg, Denmark, Department of Computer Science & IZBI & iDiv & LIFE, Leipzig University, Max Planck Institute for Mathematics in the Sciences, Fraunhofer Institute IZI, Leipzig, Germany, Santa Fe Institute, Santa Fe, NM 87501, USA
| | - Jing Qin
- Institute for Theoretical Chemistry, RTH, University of Copenhagen, Frederiksberg, Denmark, IMADA, University of Southern Denmark, Campusvej 55, Odense, Denmark
|
27
|
Huang FWD, Reidys CM. Shapes of topological RNA structures. Math Biosci 2015; 270:57-65. [PMID: 26482318 DOI: 10.1016/j.mbs.2015.10.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/24/2014] [Revised: 09/30/2015] [Accepted: 10/01/2015] [Indexed: 11/18/2022]
Abstract
A topological RNA structure is derived by fattening the edges of a contact structure into ribbons. The shape of a topological RNA structure is obtained by collapsing the stacks of the structure into single arcs and by removing any arcs of length one, as well as isolated vertices. A shape contains the key topological information of the molecular conformation and for fixed topological genus there exist only finitely many such shapes. In this paper we compute the generating polynomial of shapes of fixed topological genus g. We furthermore derive an algorithm having O(g log g) time complexity uniformly generating shapes of genus g and discuss some applications in the context of databases of RNA pseudoknot structures.
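The shape projection described in this abstract can be sketched in a few lines. The version below is an illustrative reading of that description, not the authors' implementation: the function name is hypothetical, and the three reduction steps (drop isolated vertices, collapse stacks, remove arcs of length one) are simply repeated until nothing changes, which is an assumption about how the steps interact.

```python
def shape(base_pairs):
    """Reduce an arc diagram to its shape (illustrative sketch).

    base_pairs: iterable of (i, j) positions, each position paired at
    most once. Returns the arcs of the shape, relabelled 0 .. 2k-1.
    """
    arcs = sorted((min(i, j), max(i, j)) for i, j in base_pairs)
    while True:
        # Removing isolated vertices amounts to relabelling endpoints.
        endpoints = sorted(p for arc in arcs for p in arc)
        rank = {p: k for k, p in enumerate(endpoints)}
        arcs = sorted((rank[i], rank[j]) for i, j in arcs)
        present = set(arcs)
        # Keep only the outermost arc of each stack and drop 1-arcs.
        kept = [
            (i, j)
            for i, j in arcs
            if j - i != 1 and (i - 1, j + 1) not in present
        ]
        if kept == arcs:
            return arcs
        arcs = kept
```

Under this reading, every pseudoknot-free structure reduces to the empty shape, while an H-type pseudoknot with stacked helices reduces to two crossing arcs.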
Affiliation(s)
- Fenix W D Huang
- Virginia Bioinformatics Institute, 1015 Life Sciences Circle, Blacksburg, VA, USA.
| | - Christian M Reidys
- Virginia Bioinformatics Institute, 1015 Life Sciences Circle, Blacksburg, VA, USA.
|
28
|
Mamasakhlisov YS, Bellucci S, Hayryan S, Caturyan H, Grigoryan Z, Hu CK. Collapse and hybridization of RNA: view from replica technique approach. THE EUROPEAN PHYSICAL JOURNAL. E, SOFT MATTER 2015; 38:100. [PMID: 26385736 DOI: 10.1140/epje/i2015-15100-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2015] [Revised: 07/19/2015] [Accepted: 07/31/2015] [Indexed: 06/05/2023]
Abstract
The replica technique method is applied to investigate the kinetic behavior of the coarse-grained model for the RNA molecule. A non-equilibrium phase transition of second order between the glassy phase and the ensemble of freely fluctuating structures has been observed. The non-equilibrium steady state is investigated as well and the thermodynamic characteristics of the system have been evaluated. The non-equilibrium behavior of the specific heat is discussed. Based on our analysis, we point out the state in the kinetic pathway in which the RNA molecule is most prone to hybridization.
Affiliation(s)
| | - S Bellucci
- INFN-Laboratori Nazionali di Frascati, Via Enrico Fermi, 40, 00044, Frascati RM, Italy
| | - Shura Hayryan
- Institute of Physics, Academia Sinica, 128 Sec. 2, Academia Rd., 11529, Nankang, Taipei, Taiwan
| | - H Caturyan
- Yerevan State University, 1 A. Manoogian Str., 0025, Yerevan, Armenia
| | - Z Grigoryan
- Goris State University, 4 Avangard Str., 3204, Goris, Armenia
| | - Chin-Kun Hu
- Institute of Physics, Academia Sinica, 128 Sec. 2, Academia Rd., 11529, Nankang, Taipei, Taiwan.
- National Center for Theoretical Sciences, National Tsing Hua University, 30013, Hsinchu, Taiwan.
- Business School, University of Shanghai for Science and Technology, 200093, Shanghai, China.
|
29
|
Grigoryan ZA, Karapetian AT. The Globular State of the Single-Stranded RNA: Effect of the Secondary Structure Rearrangements. J Nucleic Acids 2015; 2015:295264. [PMID: 26345143 PMCID: PMC4546806 DOI: 10.1155/2015/295264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2015] [Revised: 07/06/2015] [Accepted: 07/06/2015] [Indexed: 11/26/2022] Open
Abstract
The mutual influence of the slow rearrangements of secondary structure and the fast collapse of long single-stranded RNA (ssRNA) is studied with analytic calculations in the approximation of a coarse-grained model. It is assumed that the characteristic time of the secondary structure rearrangement is much longer than that for the formation of the tertiary structure. A nonequilibrium phase transition of the second order has been observed.
Affiliation(s)
| | - Armen T. Karapetian
- Yerevan State University of Architecture and Construction, Teryan 105, 0009 Yerevan, Armenia
|
30
|
Barik A, C N, Pilla SP, Bahadur RP. Molecular architecture of protein-RNA recognition sites. J Biomol Struct Dyn 2015; 33:2738-51. [PMID: 25562181 DOI: 10.1080/07391102.2015.1004652] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 10/24/2022]
Abstract
The molecular architecture of protein-RNA interfaces is analyzed using a non-redundant dataset of 152 protein-RNA complexes. We find that an average protein-RNA interface is smaller than an average protein-DNA interface but larger than an average protein-protein interface. Among the different classes of protein-RNA complexes, interfaces with tRNA are the largest, while the interfaces with single-stranded RNA are the smallest. Significantly, RNA contributes more to the interface area than its partner protein. Moreover, unlike protein-protein interfaces where the side chain contributes less to the interface area compared to the main chain, the main chain and side chain contributions are flipped in protein-RNA interfaces. We find that the protein surface in contact with the RNA in protein-RNA complexes is better packed than that in contact with the DNA in protein-DNA complexes, but more loosely packed than that in contact with the protein in protein-protein complexes. Shape complementarity and electrostatic potential are the two major factors that determine the specificity of the protein-RNA interaction. We find that the H-bond density at the protein-RNA interfaces is similar to that of protein-DNA interfaces but higher than that of protein-protein interfaces. Unlike protein-DNA interfaces where the deoxyribose has little role in intermolecular H-bonds, due to the presence of an oxygen atom at the 2' position, the ribose in RNA plays a significant role in protein-RNA H-bonds. We find that besides H-bonds, salt bridges and stacking interactions also play a significant role in stabilizing protein-nucleic acid interfaces; however, their contribution at the protein-protein interfaces is insignificant.
Affiliation(s)
- Amita Barik
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Nithin C
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Smita P Pilla
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
|
31
|
Fu BMM, Han HSW, Reidys CM. On RNA-RNA interaction structures of fixed topological genus. Math Biosci 2015; 262:88-104. [PMID: 25640867 DOI: 10.1016/j.mbs.2014.12.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/15/2014] [Revised: 12/03/2014] [Accepted: 12/17/2014] [Indexed: 11/29/2022]
Abstract
Interacting RNA complexes are studied via bicellular maps using a filtration via their topological genus. Our main result is a new bijection for RNA-RNA interaction structures and a linear time uniform sampling algorithm for RNA complexes of fixed topological genus. The bijection allows one either to reduce the topological genus of a bicellular map directly, or to lose connectivity by decomposing the complex into a pair of single-stranded RNA structures. Our main result is proved bijectively. It provides an explicit algorithm for rewiring the corresponding complexes and an unambiguous decomposition grammar. Using the concept of genus induction, we construct bicellular maps of fixed topological genus g uniformly in linear time. We present various statistics on these topological RNA complexes and compare our findings with biological complexes. Furthermore, we show how to construct loop-energy based complexes using our decomposition grammar.
Affiliation(s)
- Benjamin M M Fu
- Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.
| | - Hillary S W Han
- Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.
| | - Christian M Reidys
- Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.
|
32
|
Abstract
The ongoing effort to detect and characterize physical entanglement in biopolymers has so far established that knots are present in many globular proteins and also abound in viral DNA packaged inside bacteriophages. RNA molecules, however, have not yet been systematically screened for the occurrence of physical knots. We have accordingly undertaken the systematic profiling of the several thousand RNA structures present in the Protein Data Bank (PDB). The search identified no more than three deeply knotted RNA molecules. These entries are rRNAs of about 3,000 nt solved by cryo-EM. Their genuine knotted state is, however, doubtful based on the detailed structural comparison with homologs of higher resolution, which are all unknotted. Compared with the case of proteins and viral DNA, the observed incidence of knots in available RNA structures is, therefore, practically negligible. This fact suggests that either evolutionary selection or thermodynamic and kinetic folding mechanisms act toward minimizing the entanglement of RNA to an extent that is unparalleled by other types of biomolecules. Finally, a possible general strategy for designing synthetic RNA sequences capable of self-tying in a twist-knot fold is proposed.
|
33
|
Chiu JKH, Chen YPP. Efficient conversion of RNA pseudoknots to knot-free structures using a graphical model. IEEE Trans Biomed Eng 2014; 62:1265-71. [PMID: 25474805 DOI: 10.1109/tbme.2014.2375360] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
RNA secondary structures are vital in determining the 3-D structures of noncoding RNA molecules, which in turn affect their functions. Computational RNA secondary structure alignment and analysis are biologically significant, because they help identify numerous functionally important motifs. Unfortunately, many analysis methods suffer from computational intractability in the presence of pseudoknots. The conversion of knotted to knot-free secondary structures is an essential preprocessing step, and is regarded as pseudoknot removal. Although exact methods have been proposed for this task, their computational complexities are undetermined, and so their efficiencies in processing complex pseudoknots are currently unknown. We transformed the pseudoknot removal problem into a circle graph maximum weight independent set (MWIS) problem, in which each MWIS represents a unique optimal deknotted structure. An existing circle graph MWIS algorithm was extended to report either single or all solutions. Its time complexity depends on the number of MWISs, and it is guaranteed to report one solution in polynomial time. Experimental results suggest that our extended algorithm is much more efficient than the state-of-the-art tool. We also devised a novel concept called the structural scoring function, and investigated its effectiveness in more accurate solution candidate selection for certain criteria.
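The reduction to a circle graph independent set can be illustrated with the textbook interval dynamic program for the largest crossing-free subset of arcs. This is a minimal sketch under stated assumptions, not the extended algorithm of the paper: every base pair has weight 1, only one optimal solution is reported, and the function name and the `length` parameter (number of sequence positions) are hypothetical.

```python
def largest_crossing_free_subset(base_pairs, length):
    """Return one maximum set of mutually non-crossing base pairs.

    base_pairs: (i, j) positions in 0 .. length-1, each paired at most once.
    Unit weights are assumed; quadratic time and memory in `length`.
    """
    partner = {}
    for i, j in base_pairs:
        partner[i], partner[j] = j, i
    # best[i][j]: optimum for the half-open interval of positions [i, j).
    best = [[0] * (length + 1) for _ in range(length + 1)]
    for span in range(2, length + 1):
        for i in range(0, length - span + 1):
            j = i + span
            last = j - 1
            value = best[i][last]  # leave position `last` unpaired
            k = partner.get(last)
            if k is not None and i <= k < last:
                # keep arc (k, last): solve left of it and inside it
                value = max(value, best[i][k] + 1 + best[k + 1][last])
            best[i][j] = value
    # Trace back one optimal solution.
    kept, stack = [], [(0, length)]
    while stack:
        i, j = stack.pop()
        if j - i < 2:
            continue
        last = j - 1
        if best[i][j] == best[i][last]:
            stack.append((i, last))
        else:
            k = partner[last]
            kept.append((k, last))
            stack.append((i, k))
            stack.append((k + 1, last))
    return sorted(kept)
```

For an H-type pseudoknot with a three-pair helix crossing a two-pair helix, the sketch keeps the three-pair helix. Replacing the `+ 1` by a per-arc weight would give the weighted variant.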
|
34
|
Abstract
Shapes of interacting RNA complexes are studied using a filtration via their topological genus. A shape of an RNA complex is obtained by (iteratively) collapsing stacks and eliminating hairpin loops. This shape projection preserves the topological core of the RNA complex, and for fixed topological genus there are only finitely many such shapes. Our main result is a new bijection that relates the shapes of RNA complexes with shapes of RNA structures. This allows for computing the shape polynomial of RNA complexes via the shape polynomial of RNA structures. We furthermore present a linear time uniform sampling algorithm for shapes of RNA complexes of fixed topological genus.
Affiliation(s)
- Benjamin M M Fu
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense M, Denmark
|
35
|
Abstract
In this article we study canonical γ-structures, a class of RNA pseudoknot structures that plays a key role in the context of polynomial time folding of RNA pseudoknot structures. A γ-structure is composed of specific building blocks that have topological genus less than or equal to γ, where composition means concatenation and nesting of such blocks. Our main result is the derivation of the generating function of γ-structures via symbolic enumeration using so-called irreducible shadows. We furthermore recursively compute the generating polynomials of irreducible shadows of genus ≤ γ. The γ-structures are constructed via γ-matchings. For 1 ≤ γ ≤ 10, we compute Puiseux expansions at the unique, dominant singularities, allowing us to derive simple asymptotic formulas for the number of γ-structures.
Affiliation(s)
- Hillary S W Han
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|
36
|
Koehl P. Mathematics's role in the grand challenge of deciphering the molecular basis of life. Front Mol Biosci 2014; 1:2. [PMID: 25988143 PMCID: PMC4428350 DOI: 10.3389/fmolb.2014.00002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/04/2014] [Accepted: 03/19/2014] [Indexed: 11/13/2022] Open
Affiliation(s)
- Patrice Koehl
- Department of Computer Science and Genome Center, University of California at Davis, Davis, CA, USA
|
37
|
Generation of RNA pseudoknot structures with topological genus filtration. Math Biosci 2013; 245:216-25. [DOI: 10.1016/j.mbs.2013.07.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/29/2013] [Revised: 06/11/2013] [Accepted: 07/12/2013] [Indexed: 11/22/2022]
|
38
|
Bhadola P, Deo N. Genus distribution and thermodynamics of a random matrix model of RNA with Penner interaction. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2013; 88:032706. [PMID: 24125293 DOI: 10.1103/physreve.88.032706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2013] [Revised: 08/05/2013] [Indexed: 06/02/2023]
Abstract
The nonlinear Penner external interaction is introduced and studied in the random matrix model of homo RNA. A numerical technique is developed to study the partition function, and a general formula is obtained for all lengths. The genus distribution function for the system is obtained, plotted, and compared with the genus distribution for the real RNA structures found in the Protein Data Bank. The genus distribution shows that the nonlinear interaction favors the formation of low genus structures and matches the result for real RNA structures. The distribution of structure with temperature suggests that nonlinear interaction is biased toward the planar structures. The variation of chemical potential with temperature and interaction strength indicates the presence of additional molecules in the system other than the magnesium ions and possibly represents a phase transition. The specific heat has a bump and its derivatives show a double-peak behavior at a particular temperature. On analyzing the specific heat and derivatives for each genus separately, the planar structure (genus zero) is shown to contribute the most to the bump and double peak. This observation in the nonlinear model is similar to that observed in the unfolding experiments on RNA.
Affiliation(s)
- Pradeep Bhadola
- Department of Physics and Astrophysics, University of Delhi, Delhi 110007, India
|
39
|
Abstract
Recently a folding algorithm of topological RNA pseudoknot structures was presented in Reidys et al. (2011). This algorithm folds single-stranded γ-structures, that is, RNA structures composed of distinct motifs of bounded topological genus. In this article, we set the theoretical foundations for the folding of the two-backbone analogues of γ-structures: the RNA γ-interaction structures. These are RNA-RNA interaction structures that are constructed from a finite number of building blocks over two backbones having genus at most γ. Combinatorial properties of γ-interaction structures are of practical interest since they have direct implications for the folding of topological interaction structures. We compute the generating function of γ-interaction structures and show that it is algebraic, which implies that the numbers of interaction structures can be computed recursively. We obtain simple asymptotic formulas for 0- and 1-interaction structures. The simplest class of interaction structures is the 0-interaction structures, which represent the two-backbone analogues of secondary structures.
Affiliation(s)
- Jing Qin
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
|
40
|
Bon M, Micheletti C, Orland H. McGenus: a Monte Carlo algorithm to predict RNA secondary structures with pseudoknots. Nucleic Acids Res 2012; 41:1895-900. [PMID: 23248008 PMCID: PMC3561945 DOI: 10.1093/nar/gks1204] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/14/2023] Open
Abstract
We present McGenus, an algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. McGenus can treat sequences of up to 1000 bases and performs an advanced stochastic search of their minimum free energy structure allowing for non-trivial pseudoknot topologies. Specifically, McGenus uses a Monte Carlo algorithm with replica exchange for minimizing a general scoring function which includes not only free energy contributions for pair stacking, loop penalties, etc. but also a phenomenological penalty for the genus of the pairing graph. The good performance of the stochastic search strategy was successfully validated against TT2NE which uses the same free energy parametrization and performs exhaustive or partially exhaustive structure search, albeit for much shorter sequences (up to 200 bases). Next, the method was applied to other RNA sets, including an extensive tmRNA database, yielding results that are competitive with existing algorithms. Finally, it is shown that McGenus highlights possible limitations in the free energy scoring function. The algorithm is available as a web server at http://ipht.cea.fr/rna/mcgenus.php.
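The scoring and sampling idea described in this abstract can be illustrated with a minimal sketch. It assumes a linear genus penalty (the abstract only says the penalty is phenomenological) and uses the standard replica-exchange swap criterion; the function names, the `genus_penalty` parameter, and the choice of units are hypothetical and are not taken from McGenus itself.

```python
import math


def score(free_energy, genus, genus_penalty):
    """Hypothetical scoring function: free energy plus a linear penalty per unit of genus."""
    return free_energy + genus_penalty * genus


def swap_probability(score_a, temp_a, score_b, temp_b, k_b=1.0):
    """Standard replica-exchange acceptance probability for swapping two replicas.

    Uses min(1, exp((beta_a - beta_b) * (score_a - score_b))) with beta = 1 / (k_b * T).
    """
    beta_a = 1.0 / (k_b * temp_a)
    beta_b = 1.0 / (k_b * temp_b)
    delta = (beta_a - beta_b) * (score_a - score_b)
    return 1.0 if delta >= 0 else math.exp(delta)


# A cold replica holding a worse (higher) score than a hot replica is always swapped.
p_always = swap_probability(score_a=-5.0, temp_a=1.0, score_b=-10.0, temp_b=2.0)
# The reverse swap is accepted only with probability exp(-2.5).
p_sometimes = swap_probability(score_a=-10.0, temp_a=1.0, score_b=-5.0, temp_b=2.0)
```

In a full sampler, each replica would run ordinary Monte Carlo moves at its own temperature and neighbouring replicas would attempt such swaps periodically; that loop is omitted here.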
Affiliation(s)
- Michaël Bon
- Institut de Physique Théorique, CEA Saclay, CNRS URA 2306, 91191 Gif-sur-Yvette, France
|
41
|
Huang FWD, Reidys CM. On the combinatorics of sparsification. Algorithms Mol Biol 2012; 7:28. [PMID: 23088372 PMCID: PMC3549849 DOI: 10.1186/1748-7188-7-28] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2011] [Accepted: 10/11/2012] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND We study the sparsification of dynamic programming based on folding algorithms of RNA structures. Sparsification is a method that significantly improves the computation of minimum free energy (mfe) RNA structures. RESULTS We provide a quantitative analysis of the sparsification of a particular decomposition rule, Λ∗. This rule splits an interval of RNA secondary and pseudoknot structures of fixed topological genus. Key to quantifying sparsification is the size of the so-called candidate sets. Here we assume mfe-structures to be specifically distributed (see Assumption 1) within arbitrary and irreducible RNA secondary and pseudoknot structures of fixed topological genus. We then present a combinatorial framework which, by means of probabilities of irreducible substructures, yields the expectation of the Λ∗-candidate set w.r.t. a uniformly random input sequence. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) in the case of RNA secondary structures as well as RNA pseudoknot structures. Furthermore, for RNA secondary structures we also analyze a simplified loop-based energy model. Our combinatorial analysis is then compared to the expected number of Λ∗-candidates obtained from folding mfe-structures. In the case of mfe-folding of RNA secondary structures with a simplified loop-based energy model, our results imply that sparsification provides a significant, constant improvement of 91% (theory), to be compared to a 96% (experimental, simplified arc-based model) reduction. However, we do not observe a linear factor improvement. Finally, in the case of the "full" loop-energy model we can report a reduction of 98% (experiment). CONCLUSIONS Sparsification was initially credited with a linear factor improvement. This conclusion was based on the so-called polymer-zeta property, which stems from interpreting polymer chains as self-avoiding walks. Subsequent findings, however, reveal that the O(n) improvement is not correct. The combinatorial analysis presented here shows that, assuming a specific distribution (see Assumption 1) of mfe-structures within irreducible and arbitrary structures, the expected number of Λ∗-candidates is $\Theta(n^2)$. However, the constant reduction is quite significant, being in the range of 96%. We furthermore show an analogous result for the sparsification of the Λ∗-decomposition rule for RNA pseudoknotted structures of genus one. Finally, we observe that the effect of sparsification is sensitive to the employed energy model.
|
42
|
Topological classification and enumeration of RNA structures by genus. J Math Biol 2012; 67:1261-78. [PMID: 23053535 DOI: 10.1007/s00285-012-0594-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2011] [Revised: 06/27/2012] [Indexed: 10/27/2022]
Abstract
To an RNA pseudoknot structure is naturally associated a topological surface, which has its associated genus, and structures can thus be classified by the genus. Based on earlier work of Harer-Zagier, we compute the generating function $D_{g,\sigma}(z) = \sum_n d_{g,\sigma}(n) z^n$ for the number $d_{g,\sigma}(n)$ of those structures of fixed genus $g$ and minimum stack size $\sigma$ with $n$ nucleotides so that no two consecutive nucleotides are basepaired and show that $D_{g,\sigma}(z)$ is algebraic. In particular, we prove that $d_{g,2}(n) \sim k_g\, n^{3(g-1/2)} \gamma_2^n$, where $\gamma_2 \approx 1.9685$. Thus, for stack size at least two, the genus only enters through the sub-exponential factor, and the slow growth rate compared to the number of RNA molecules implies the existence of neutral networks of distinct molecules with the same structure of any genus. Certain RNA structures called shapes are shown to be in natural one-to-one correspondence with the cells in the Penner-Strebel decomposition of Riemann's moduli space of a surface of genus g with one boundary component, thus providing a link between RNA enumerative problems and the geometry of Riemann's moduli space.
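As an illustration of how a genus is assigned to a structure, the following minimal sketch computes the genus of a one-backbone arc diagram by counting boundary components and applying Euler's formula, g = (n + 1 - r) / 2, for n arcs and r boundary components. It is a standard textbook-style computation, not code from the paper; the function name and the input format (a list of `(i, j)` base pairs) are assumptions made for the example.

```python
def genus(pairs):
    """Topological genus of a one-backbone arc diagram given as (i, j) base pairs."""
    n = len(pairs)
    if n == 0:
        return 0  # a structure without base pairs is planar
    # Unpaired nucleotides do not affect the genus, so relabel the paired
    # positions 0..2n-1 in backbone order.
    endpoints = sorted(p for pair in pairs for p in pair)
    if len(set(endpoints)) != len(endpoints):
        raise ValueError("each nucleotide may take part in at most one pair")
    index = {pos: k for k, pos in enumerate(endpoints)}
    m = len(endpoints)
    partner = [0] * m
    for i, j in pairs:
        partner[index[i]] = index[j]
        partner[index[j]] = index[i]
    # Trace boundary components: cross the arc to the partner endpoint, then
    # step to the next endpoint along the backbone (closed into a circle).
    seen = [False] * m
    boundaries = 0
    for start in range(m):
        if seen[start]:
            continue
        boundaries += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = (partner[k] + 1) % m
    return (n + 1 - boundaries) // 2


nested = genus([(1, 10), (2, 9)])              # pseudoknot-free stack
h_type = genus([(1, 5), (3, 8)])               # H-type pseudoknot
kissing = genus([(0, 2), (1, 4), (3, 5)])      # kissing hairpin
```

Under this convention, pseudoknot-free secondary structures have genus 0, while the H-type pseudoknot and the kissing hairpin both have genus 1.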
|
43
|
The topological filtration of γ-structures. Math Biosci 2012; 241:24-33. [PMID: 23022027 DOI: 10.1016/j.mbs.2012.09.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2012] [Revised: 09/14/2012] [Accepted: 09/15/2012] [Indexed: 11/23/2022]
Abstract
In this paper we study γ-structures filtered by topological genus. γ-structures are a class of RNA pseudoknot structures that plays a key role in the context of polynomial time folding of RNA pseudoknot structures. A γ-structure is composed of specific building blocks that have topological genus less than or equal to γ, where composition means concatenation and nesting of such blocks. Our main results are the derivation of a new bivariate generating function for γ-structures via symbolic methods, the singularity analysis of the solutions, and a central limit theorem for the distribution of topological genus in γ-structures of given length. In our derivation specific bivariate polynomials play a central role. Their coefficients count particular motifs of fixed topological genus and they are of relevance in the context of genus recursion and novel folding algorithms.
|
44
|
Washietl S, Will S, Hendrix DA, Goff LA, Rinn JL, Berger B, Kellis M. Computational analysis of noncoding RNAs. Wiley Interdiscip Rev RNA 2012; 3:759-78. [PMID: 22991327 DOI: 10.1002/wrna.1134] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.
Affiliation(s)
- Stefan Washietl
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
|
45
|
Chiu JKH, Chen YPP. Conformational features of topologically classified RNA secondary structures. PLoS One 2012; 7:e39907. [PMID: 22792195 PMCID: PMC3390330 DOI: 10.1371/journal.pone.0039907] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2012] [Accepted: 05/29/2012] [Indexed: 11/18/2022] Open
Abstract
Background Current RNA secondary structure prediction approaches predict prevalent pseudoknots such as the H-pseudoknot and kissing hairpin. The number of possible structures increases drastically when more complex pseudoknots are considered, thus leading to computational limitations. On the other hand, the enormous population of possible structures means not all of them appear in real RNA molecules. Therefore, it is of interest to understand how many of them really exist and why they are preferred over the others, as any new findings revealed by this study might enhance the capability of future structure prediction algorithms to predict complex pseudoknots more accurately. Methodology/Principal Findings A novel algorithm was devised to estimate the exact number of structural possibilities for a pseudoknot constructed with a specified number of base pair stems. Then, topological classification was applied to classify RNA pseudoknotted structures from data in the RNA STRAND database. Comparing the vast number of possibilities with the real population makes it clear that most of these plausible complex pseudoknots are not observed. Moreover, from the classified motifs that exist in nature, some features were identified for further investigation. It was found that some features are related to helical stacking. Other features are left open for the discovery of underlying tertiary interactions. Conclusions Results from topological classification suggest that complex pseudoknots are usually either well-known motifs that are themselves complex or the result of interactions between some special motifs. Heuristics can be proposed to predict the essential parts of these complex motifs, even if the required thermodynamic parameters are currently unknown.
Affiliation(s)
- Jimmy Ka Ho Chiu
- Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria, Australia
- Yi-Ping Phoebe Chen
- Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria, Australia
|
46
|
Andersen JE, Huang FW, Penner RC, Reidys CM. Topology of RNA-RNA Interaction Structures. J Comput Biol 2012; 19:928-43. [DOI: 10.1089/cmb.2011.0308] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Affiliation(s)
- Jørgen E. Andersen
- Center for Quantum Geometry of Moduli Spaces, Aarhus University, Århus, Denmark
- Fenix W.D. Huang
- Institut for Matematik og Datalogi, University of Southern Denmark, Odense, Denmark
- Robert C. Penner
- Center for Quantum Geometry of Moduli Spaces, Aarhus University, Århus, Denmark
- Math and Physics Departments, California Institute of Technology, Pasadena, California
- Christian M. Reidys
- Institut for Matematik og Datalogi, University of Southern Denmark, Odense, Denmark
|
47
|
On the page number of RNA secondary structures with pseudoknots. J Math Biol 2011; 65:1337-57. [PMID: 22159642 DOI: 10.1007/s00285-011-0493-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2011] [Revised: 07/28/2011] [Indexed: 01/05/2023]
Abstract
Let S denote the set of (possibly noncanonical) base pairs {i, j} of an RNA tertiary structure; i.e. {i, j} ∈ S if there is a hydrogen bond between the ith and jth nucleotide. The page number of S, denoted π(S), is the minimum number k such that S can be decomposed into a disjoint union of k secondary structures. Here, we show that computing the page number is NP-complete; we describe an exact computation of page number, using constraint programming, and determine the page number of a collection of RNA tertiary structures, for which the topological genus is known. We describe an approximation algorithm from which it follows that ω(S) ≤ π(S) ≤ ω(S) · log n, where the clique number of S, ω(S), denotes the maximum number of base pairs that pairwise cross each other.
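To make the definitions concrete, here is a minimal sketch of a crossing test and a greedy first-fit decomposition of base pairs into pages. It is neither the paper's constraint-programming method nor its approximation algorithm; the number of pages it returns is only an upper bound on π(S). The function names are hypothetical, and treating two pairs that share a nucleotide as conflicting is an assumption made so that every page is a valid secondary structure.

```python
def crosses(a, b):
    """True if base pairs a and b cross, i.e. i < k < j < l after ordering."""
    i, j = sorted(a)
    k, l = sorted(b)
    return i < k < j < l or k < i < l < j


def conflicts(a, b):
    """Assumption: pairs on one page must neither cross nor share a nucleotide."""
    return crosses(a, b) or bool(set(a) & set(b))


def greedy_pages(pairs):
    """First-fit assignment of base pairs to pages; len(result) >= page number."""
    pages = []
    for pair in sorted(tuple(sorted(p)) for p in pairs):
        for page in pages:
            if not any(conflicts(pair, other) for other in page):
                page.append(pair)
                break
        else:
            pages.append([pair])  # no existing page fits, so open a new one
    return pages


pages = greedy_pages([(1, 10), (2, 9), (3, 12)])
# pages == [[(1, 10), (2, 9)], [(3, 12)]]
```

Since the problem is NP-complete, a first-fit pass like this is useful mainly as a quick upper bound or as a starting point for an exact search.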
|
48
|
Reidys CM, Huang FWD, Andersen JE, Penner RC, Stadler PF, Nebel ME. Addendum: topology and prediction of RNA pseudoknots. Bioinformatics 2011; 28:300. [PMID: 22106334 DOI: 10.1093/bioinformatics/btr643] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
49
|
Izzo JA, Kim N, Elmetwaly S, Schlick T. RAG: an update to the RNA-As-Graphs resource. BMC Bioinformatics 2011; 12:219. [PMID: 21627789 PMCID: PMC3123240 DOI: 10.1186/1471-2105-12-219] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2010] [Accepted: 05/31/2011] [Indexed: 02/08/2023] Open
Abstract
Background In 2004, we presented a web resource for stimulating the search for novel RNAs, RNA-As-Graphs (RAG), which classified, catalogued, and predicted RNA secondary structure motifs using clustering and build-up approaches. With the increased availability of secondary structures in recent years, we update the RAG resource and provide various improvements for analyzing RNA structures. Description Our RAG update includes a new supervised clustering algorithm that can suggest RNA motifs that may be "RNA-like". We use this utility to describe RNA motifs as three classes: existing, RNA-like, and non-RNA-like. This produces 126 tree and 16,658 dual graphs as candidate RNA-like topologies using the supervised clustering algorithm with existing RNAs serving as the training data. A comparison of this clustering approach to an earlier method shows considerable improvements. Additional RAG features include greatly expanded search capabilities, an interface to better utilize the benefits of a relational database, and improvements to several of the utilities such as directed/labeled graphs and a subgraph search program. Conclusions The RAG updates presented here augment the database's intended function - stimulating the search for novel RNA functionality - by classifying available motifs, suggesting new motifs for design, and allowing for more specific searches for specific topologies. The updated RAG web resource offers users a graph-based tool for exploring available RNA motifs and suggesting new RNAs for design.
Affiliation(s)
- Joseph A Izzo
- Department of Chemistry, New York University, New York, NY 10003, USA
|
50
|
Bon M, Orland H. TT2NE: a novel algorithm to predict RNA secondary structures with pseudoknots. Nucleic Acids Res 2011; 39:e93. [PMID: 21593129 PMCID: PMC3152363 DOI: 10.1093/nar/gkr240] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
Abstract
We present TT2NE, a new algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. TT2NE is guaranteed to find the minimum free energy structure regardless of pseudoknot topology. This unique proficiency is obtained at the expense of the maximum length of sequences that can be treated, but comparison with state-of-the-art algorithms shows that TT2NE significantly improves the quality of predictions. Analysis of TT2NE's incorrect predictions sheds light on the need to study how steric constraints limit the range of pseudoknotted structures that can be formed from a given sequence. An implementation of TT2NE on a public server can be found at http://ipht.cea.fr/rna/tt2ne.php.
Affiliation(s)
- Michaël Bon
- Institut de Physique Théorique, CEA Saclay, CNRS URA 2306, 91191 Gif-sur-Yvette, France
|