1
Rahaman MM, Khan NS, Zhang S. RNAMotifComp: a comprehensive method to analyze and identify structurally similar RNA motif families. Bioinformatics 2023; 39:i337-i346. [PMID: 37387191 DOI: 10.1093/bioinformatics/btad223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: The 3D structures of RNA play a critical role in understanding their functionalities. There exist several computational methods to study RNA 3D structures by identifying structural motifs and categorizing them into several motif families based on their structures. Although the number of such motif families is not limited, a few of them are well-studied. Out of these structural motif families, there exist several families that are visually similar or very close in structure, even with different base interactions. Alternatively, some motif families share a set of base interactions but maintain variation in their 3D formations. These similarities among different motif families, if known, can provide a better insight into the RNA 3D structural motifs as well as their characteristic functions in cell biology.
RESULTS: In this work, we proposed a method, RNAMotifComp, that analyzes the instances of well-known structural motif families and establishes a relational graph among them. We also have designed a method to visualize the relational graph where the families are shown as nodes and their similarity information is represented as edges. We validated our discovered correlations of the motif families using RNAMotifContrast. Additionally, we used a basic Naïve Bayes classifier to show the importance of RNAMotifComp. The relational analysis explains the functional analogies of divergent motif families and illustrates the situations where the motifs of disparate families are predicted to be of the same family.
AVAILABILITY AND IMPLEMENTATION: Source code publicly available at https://github.com/ucfcbb/RNAMotifFamilySimilarity.
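The relational graph described in this abstract (families as nodes, similarity information as edges) can be pictured with a minimal sketch. This is not the RNAMotifComp implementation: the function name `build_relational_graph`, the threshold rule, the family labels, and every score below are hypothetical placeholders chosen only to illustrate the data structure.

```python
def build_relational_graph(pair_scores, threshold):
    """Build an undirected graph: family -> {similar family: score}."""
    graph = {}
    for (family_a, family_b), score in pair_scores.items():
        # Every family becomes a node, even if it ends up with no edges.
        graph.setdefault(family_a, {})
        graph.setdefault(family_b, {})
        # Assumption: an edge is drawn when the score reaches the threshold.
        if score >= threshold:
            graph[family_a][family_b] = score
            graph[family_b][family_a] = score
    return graph


# Placeholder families and scores, invented purely for illustration.
pair_scores = {
    ("family_A", "family_B"): 0.8,
    ("family_A", "family_C"): 0.2,
    ("family_B", "family_C"): 0.6,
}
graph = build_relational_graph(pair_scores, threshold=0.5)
for family, neighbours in sorted(graph.items()):
    print(family, "->", sorted(neighbours))
```

An adjacency dictionary keeps the sketch dependency-free; a graph library could be substituted if layout or drawing is needed.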
Affiliation(s)
- Md Mahfuzur Rahaman
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Nabila Shahnaz Khan
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
2
Gianfrotta C, Reinharz V, Lespinet O, Barth D, Denise A. On the predictibility of A-minor motifs from their local contexts. RNA Biol 2022; 19:1208-1227. [PMID: 36384383 PMCID: PMC9673937 DOI: 10.1080/15476286.2022.2144611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Abstract
This study investigates the importance of the structural context in the formation of a type I/II A-minor motif. This very frequent structural motif has been shown to be important in the spatial folding of RNA molecules. We developed an automated method to classify A-minor motif occurrences according to their 3D context similarities, and we used a graph approach to represent both the structural A-minor motif occurrences and their classes at different scales. This approach leads us to uncover new subclasses of A-minor motif occurrences according to their local 3D similarities. The majority of classes are composed of homologous occurrences, but some of them are composed of non-homologous occurrences. The different classifications we obtain allow us to better understand the importance of the context in the formation of A-minor motifs. In a second step, we investigate how much knowledge of the context around an A-minor motif can help to infer its presence (and position). More specifically, we want to determine what kind of information, contained in the structural context, can be useful to characterize and predict A-minor motifs. We show that, for some A-minor motifs, the topology combined with a sequence signal is sufficient to predict the presence and the position of an A-minor motif occurrence. In most other cases, these signals are not sufficient for predicting the A-minor motif; however, we show that they are good signals for this purpose. All the classification and prediction pipelines rely on automated processes, for which we describe the underlying algorithms and parameters.
Affiliation(s)
- Coline Gianfrotta
  - Données et Algorithmes pour une Ville Intelligente et Durable (DAVID), Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, Versailles, France
  - Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CNRS, Orsay, France
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Québec, Canada
- Olivier Lespinet
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
- Dominique Barth
  - Données et Algorithmes pour une Ville Intelligente et Durable (DAVID), Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, Versailles, France
- Alain Denise
  - Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CNRS, Orsay, France
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
3
Soulé A, Reinharz V, Sarrazin-Gendron R, Denise A, Waldispühl J. Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs. PLoS Comput Biol 2021; 17:e1008990. [PMID: 34048427 PMCID: PMC8191989 DOI: 10.1371/journal.pcbi.1008990] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/30/2020] [Revised: 06/10/2021] [Accepted: 04/22/2021] [Indexed: 11/25/2022] Open
Abstract
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently made it possible to expand this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNA in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared substructures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.
Ribonucleic Acids (RNAs) are performing a broad range of essential molecular functions in cells, many of which rely on intricate folding properties of the molecule. Watson-Crick and wobble base pairs form early, stack onto each other to create stems connected by loops, which are themselves stabilized by more sophisticated base interaction patterns. These networks are essential to shape RNA 3D structures but unfortunately still poorly understood. Here, we undertake the task of building a catalog of base interaction networks occurring in multiple structures. However, a pairwise comparison of all RNA structures is computationally heavy. Therefore, we devise an algorithm leveraging intrinsic properties of RNA base interaction networks that enables us to quickly mine full databases of 3D structures. Compared to previous methods, our techniques bring the total running time of the analysis from months to hours while performing more general searches. The data collected through this work will benefit molecular evolution studies and serve in structure prediction tools.
Affiliation(s)
- Antoine Soulé
  - School of Computer Science, McGill University, Montréal, Canada
  - LiX, École Polytechnique, Paris, France
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Montréal, Canada
- Alain Denise
  - Laboratoire de recherche en informatique, Université Paris-Saclay - CNRS, Orsay, France
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay - CEA - CNRS, Gif-sur-Yvette, France
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montréal, Canada
4
Richardson KE, Adams MS, Kirkpatrick CC, Gohara DW, Znosko BM. Identification and Characterization of New RNA Tetraloop Sequence Families. Biochemistry 2019; 58:4809-4820. [PMID: 31714066 DOI: 10.1021/acs.biochem.9b00535] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/20/2023]
Abstract
There is an abundance of RNA sequence information available due to the efforts of sequencing projects. However, current techniques implemented to solve the tertiary structures of RNA, such as NMR and X-ray crystallography, are difficult and time-consuming. Therefore, biophysical techniques are not able to keep pace with the abundance of sequence information available. Because of this, there is a need to develop quick and efficient ways to predict RNA tertiary structure from sequence. One promising approach is to identify structural patterns within previously solved 3D structures and apply these patterns to new sequences. RNA tetraloops are one of the most common naturally occurring secondary structure motifs. Here, we use RNA Characterization of Secondary Structure Motifs (CoSSMos), Dissecting the Spatial Structure of RNA (DSSR), and a bioinformatic approach to search for and characterize tertiary structure patterns among tetraloops. Not surprisingly, we identified the well-known GNRA and UNCG tetraloops, as well as the previously identified RNYA tetraloop. However, some previously identified characteristics of these families were not observed in this data set, and some new characteristics were identified. In addition, we also identified and characterized three new tetraloop sequence families: YGAR, UGGU, and RMSA. This new structural information sheds light on the tertiary structure of tetraloops and contributes to the efforts of RNA tertiary structure prediction from sequence.
Affiliation(s)
- Katherine E Richardson
  - Department of Chemistry, Saint Louis University, Saint Louis, Missouri 63103, United States
- Miranda S Adams
  - Department of Chemistry, Saint Louis University, Saint Louis, Missouri 63103, United States
- Charles C Kirkpatrick
  - Department of Chemistry, Saint Louis University, Saint Louis, Missouri 63103, United States
- David W Gohara
  - Department of Biochemistry and Molecular Biology, Saint Louis University, Saint Louis, Missouri 63103, United States
- Brent M Znosko
  - Department of Chemistry, Saint Louis University, Saint Louis, Missouri 63103, United States
5
Reinharz V, Soulé A, Westhof E, Waldispühl J, Denise A. Mining for recurrent long-range interactions in RNA structures reveals embedded hierarchies in network families. Nucleic Acids Res 2018; 46:3841-3851. [PMID: 29608773 PMCID: PMC5934684 DOI: 10.1093/nar/gky197] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/30/2017] [Accepted: 03/22/2018] [Indexed: 11/14/2022] Open
Abstract
The wealth of the combinatorics of nucleotide base pairs enables RNA molecules to assemble into sophisticated interaction networks, which are used to create complex 3D substructures. These interaction networks are essential to shape the 3D architecture of the molecule, and also to provide the key elements to carry molecular functions such as protein or ligand binding. They are made of organised sets of long-range tertiary interactions which connect distinct secondary structure elements in 3D structures. Here, we present a de novo data-driven approach to extract automatically from large data sets of full RNA 3D structures the recurrent interaction networks (RINs). Our methodology enables us for the first time to detect the interaction networks connecting distinct components of the RNA structure, highlighting their diversity and conservation through non-related functional RNAs. We use a graphical model to perform pairwise comparisons of all RNA structures available and to extract RINs and modules. Our analysis yields a complete catalog of RNA 3D structures available in the Protein Data Bank and reveals the intricate hierarchical organization of the RNA interaction networks and modules. We assembled our results in an online database (http://carnaval.lri.fr) which will be regularly updated. Within the site, a tool allows users with a novel RNA structure to detect automatically whether the novel structure contains previously observed RINs.
Affiliation(s)
- Vladimir Reinharz
  - Department of Computer Science, Ben-Gurion University of the Negev, P.O.B. 653 Beer-Sheva, 84105, Israel
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
- Antoine Soulé
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
  - LIX, École Polytechnique, CNRS, Inria, Palaiseau 91120, France
- Eric Westhof
  - ARN, Université de Strasbourg, IBMC-CNRS, 15 rue René Descartes, Strasbourg Cedex 67084, France
- Jérôme Waldispühl
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
- Alain Denise
  - LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, Bâtiment 650, Orsay cedex 91405, France
  - I2BC, Université Paris-Sud, CNRS, CEA, Université Paris-Saclay, Bâtiment 400, Orsay cedex 91405, France
6
Lemkul JA, MacKerell AD. Polarizable force field for RNA based on the classical drude oscillator. J Comput Chem 2018; 39:2624-2646. [PMID: 30515902 PMCID: PMC6284239 DOI: 10.1002/jcc.25709] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 02/19/2018] [Revised: 08/01/2018] [Accepted: 09/23/2018] [Indexed: 12/15/2022]
Abstract
RNA molecules are highly dynamic and capable of adopting a wide range of complex, folded structures. The factors driving the folding and dynamics of these structures are dependent on a balance of base pairing, hydration, base stacking, ion interactions, and the conformational sampling of the 2'-hydroxyl group in the ribose sugar. The representation of these features is a challenge for empirical force fields used in molecular dynamics simulations. Toward meeting this challenge, the inclusion of explicit electronic polarization is important in accurately modeling RNA structure. In this work, we present a polarizable force field for RNA based on the classical Drude oscillator model, which represents electronic degrees of freedom via negatively charged particles attached to their parent atoms by harmonic springs. Beginning with parametrization against quantum mechanical base stacking interaction energy and conformational energy data, we have extended the Drude-2017 nucleic acid force field to include RNA. The conformational sampling of a range of RNA sequences was used to validate the force field, including canonical A-form RNA duplexes, stem-loops, and complex tertiary folds that bind multiple Mg²⁺ ions. Overall, the Drude-2017 RNA force field reproduces important properties of these structures, including the conformational sampling of the 2'-hydroxyl and key interactions with Mg²⁺ ions. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Alexander D. MacKerell
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD 21201
7
Geary C, Chworos A, Verzemnieks E, Voss NR, Jaeger L. Composing RNA Nanostructures from a Syntax of RNA Structural Modules. Nano Lett 2017; 17:7095-7101. [PMID: 29039189 PMCID: PMC6363482 DOI: 10.1021/acs.nanolett.7b03842] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 05/26/2023]
Abstract
Natural stable RNAs fold and assemble into complex three-dimensional architectures by relying on the hierarchical formation of intricate, recurrent networks of noncovalent tertiary interactions. These sequence-dependent networks specify RNA structural modules enabling orientational and topological control of helical struts to form larger self-folding domains. Borrowing concepts from linguistics, we defined an extended structural syntax of RNA modules for programming RNA strands to assemble into complex, responsive nanostructures under both thermodynamic and kinetic control. Based on this syntax, various RNA building blocks promote the multimolecular assembly of objects with well-defined three-dimensional shapes as well as the isothermal folding of long RNAs into complex single-stranded nanostructures during transcription. This work offers a glimpse of the limitless potential of RNA as an informational medium for designing programmable and functional nanomaterials useful for synthetic biology, nanomedicine, and nanotechnology.
Affiliation(s)
- Cody Geary
  - Department of Chemistry and Biochemistry, Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106-9510, United States
- Arkadiusz Chworos
  - Department of Chemistry and Biochemistry, Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106-9510, United States
- Erik Verzemnieks
  - Department of Chemistry and Biochemistry, Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106-9510, United States
- Neil R. Voss
  - Biological, Chemical, and Physical Sciences Department, Roosevelt University, 1400 North Roosevelt Blvd., Schaumburg, Illinois 60173, United States
- Luc Jaeger
  - Department of Chemistry and Biochemistry, Biomolecular Science and Engineering Program, University of California, Santa Barbara, California 93106-9510, United States
8
Zhong C, Zhang S. RNAMotifScanX: a graph alignment approach for RNA structural motif identification. RNA 2015; 21:333-346. [PMID: 25595715 PMCID: PMC4338331 DOI: 10.1261/rna.044891.114] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/16/2014] [Accepted: 11/28/2014] [Indexed: 06/04/2023]
Abstract
RNA structural motifs are recurrent three-dimensional (3D) components found in the RNA architecture. These RNA structural motifs play important structural or functional roles and usually exhibit highly conserved 3D geometries and base-interaction patterns. Analysis of the RNA 3D structures and elucidation of their molecular functions heavily rely on efficient and accurate identification of these motifs. However, efficient RNA structural motif search tools are lacking due to the high complexity of these motifs. In this work, we present RNAMotifScanX, a motif search tool based on a base-interaction graph alignment algorithm. This novel algorithm enables automatic identification of both partially and fully matched motif instances. RNAMotifScanX considers noncanonical base-pairing interactions, base-stacking interactions, and sequence conservation of the motifs, which leads to significantly improved sensitivity and specificity as compared with other state-of-the-art search tools. RNAMotifScanX also adopts a carefully designed branch-and-bound technique, which enables ultra-fast search of large kink-turn motifs against a 23S rRNA. The software package RNAMotifScanX is implemented using GNU C++, and is freely available from http://genome.ucf.edu/RNAMotifScanX.
Affiliation(s)
- Cuncong Zhong
  - Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, Florida 32816, USA
- Shaojie Zhang
  - Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, Florida 32816, USA
9
Bottaro S, Di Palma F, Bussi G. The role of nucleobase interactions in RNA structure and dynamics. Nucleic Acids Res 2014; 42:13306-14. [PMID: 25355509 PMCID: PMC4245972 DOI: 10.1093/nar/gku972] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Indexed: 01/08/2023] Open
Abstract
The intricate network of interactions observed in RNA three-dimensional structures is often described in terms of a multitude of geometrical properties, including helical parameters, base pairing/stacking, hydrogen bonding and backbone conformation. We show that a simple molecular representation consisting in one oriented bead per nucleotide can account for the fundamental structural properties of RNA. In this framework, canonical Watson-Crick, non-Watson-Crick base-pairing and base-stacking interactions can be unambiguously identified within a well-defined interaction shell. We validate this representation by performing two independent, complementary tests. First, we use it to construct a sequence-independent, knowledge-based scoring function for RNA structural prediction, which compares favorably to fully atomistic, state-of-the-art techniques. Second, we define a metric to measure deviation between RNA structures that directly reports on the differences in the base–base interaction network. The effectiveness of this metric is tested with respect to the ability to discriminate between structurally and kinetically distant RNA conformations, performing better compared to standard techniques. Taken together, our results suggest that this minimalist, nucleobase-centric representation captures the main interactions that are relevant for describing RNA structure and dynamics.
Affiliation(s)
- Sandro Bottaro
  - Scuola Internazionale Superiore di Studi Avanzati, International School for Advanced Studies, 265, Via Bonomea I-34136 Trieste, Italy
- Francesco Di Palma
  - Scuola Internazionale Superiore di Studi Avanzati, International School for Advanced Studies, 265, Via Bonomea I-34136 Trieste, Italy
- Giovanni Bussi
  - Scuola Internazionale Superiore di Studi Avanzati, International School for Advanced Studies, 265, Via Bonomea I-34136 Trieste, Italy
10
Francis AR. An algebraic view of bacterial genome evolution. J Math Biol 2013; 69:1693-718. [DOI: 10.1007/s00285-013-0747-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 08/15/2013] [Revised: 11/23/2013] [Indexed: 10/25/2022]
11
Borozan SZ, Dimitrijević BP, Stojanović SĐ. Cation−π interactions in high resolution protein−RNA complex crystal structures. Comput Biol Chem 2013; 47:105-12. [DOI: 10.1016/j.compbiolchem.2013.08.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/01/2013] [Revised: 08/01/2013] [Accepted: 08/19/2013] [Indexed: 12/27/2022]
12
Sheth P, Cervantes-Cervantes M, Nagula A, Laing C, Wang JTL. Novel features for identifying A-minors in three-dimensional RNA molecules. Comput Biol Chem 2013; 47:240-5. [PMID: 24211672 DOI: 10.1016/j.compbiolchem.2013.10.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/18/2013] [Revised: 10/15/2013] [Accepted: 10/16/2013] [Indexed: 01/08/2023]
Abstract
RNA tertiary interactions or tertiary motifs are conserved structural patterns formed by pairwise interactions between nucleotides. They include base-pairing, base-stacking, and base-phosphate interactions. A-minor motifs are the most common tertiary interactions in the large ribosomal subunit. The A-minor motif is a nucleotide triple in which minor groove edges of an adenine base are inserted into the minor groove of neighboring helices, leading to interaction with a stabilizing base pair. We propose here novel features for identifying and predicting A-minor motifs in a given three-dimensional RNA molecule. By utilizing the features together with machine learning algorithms including random forests and support vector machines, we show experimentally that our approach is capable of predicting A-minor motifs in the given RNA molecule effectively, demonstrating the usefulness of the proposed approach. The techniques developed from this work will be useful for molecular biologists and biochemists to analyze RNA tertiary motifs, specifically A-minor interactions.
Affiliation(s)
- Palak Sheth
  - Bioinformatics Program, New Jersey Institute of Technology, Newark, NJ 07102, USA
13
Shen Y, Wong HS, Zhang S, Zhang L. RNA structural motif recognition based on least-squares distance. RNA 2013; 19:1183-1191. [PMID: 23887146 PMCID: PMC3753925 DOI: 10.1261/rna.037648.112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2012] [Accepted: 06/13/2013] [Indexed: 06/02/2023]
Abstract
RNA structural motifs are recurrent structural elements occurring in RNA molecules. RNA structural motif recognition aims to find RNA substructures that are similar to a query motif, and it is important for RNA structure analysis and RNA function prediction. In view of this, we propose a new method known as RNA Structural Motif Recognition based on Least-Squares distance (LS-RSMR) to effectively recognize RNA structural motifs. A test set consisting of five types of RNA structural motifs occurring in Escherichia coli ribosomal RNA is compiled by us. Experiments are conducted for recognizing these five types of motifs. The experimental results fully reveal the superiority of the proposed LS-RSMR compared with four other state-of-the-art methods.
Affiliation(s)
- Ying Shen
  - School of Software Engineering, Tongji University, Shanghai 200092, China
- Hau-San Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Shaohong Zhang
  - Department of Computer Science, Guangzhou University, Guangzhou 510006, China
- Lin Zhang
  - School of Software Engineering, Tongji University, Shanghai 200092, China
14
Abstract
The recent discoveries of regulatory non-coding RNAs changed our view of RNA as a simple information transfer molecule. Understanding the architecture and function of active RNA molecules requires methods for comparing and analyzing their 3D structures. While structural alignment of short RNAs is achievable in a reasonable amount of time, large structures represent a much bigger challenge. Here, we present the SETTER web server for pairwise RNA structure comparison utilizing the SETTER (SEcondary sTructure-based TERtiary Structure Similarity Algorithm) method. The SETTER method divides an RNA structure into a set of non-overlapping structural elements called generalized secondary structure units (GSSUs). The SETTER algorithm scales as O(n^2) with the size of a GSSU and as O(n) with the number of GSSUs in the structure. This scaling gives SETTER its high speed as the average size of the GSSU remains constant irrespective of the size of the structure. However, the favorable speed of the algorithm does not compromise its accuracy. The SETTER web server together with the stand-alone implementation of the SETTER algorithm are freely accessible at http://siret.cz/setter.
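The scaling argument in this abstract can be made concrete with a toy cost model. The sketch below is not the SETTER algorithm; `estimated_cost` is a hypothetical helper, and the GSSU counts and sizes are invented to show why a constant average GSSU size makes the overall cost grow linearly with the number of GSSUs.

```python
def estimated_cost(num_gssus, gssu_size):
    """Toy cost model: linear in the GSSU count, quadratic in the GSSU size."""
    return num_gssus * gssu_size ** 2


# Hypothetical structures: ten times more GSSUs, same average GSSU size.
small = estimated_cost(num_gssus=10, gssu_size=20)
large = estimated_cost(num_gssus=100, gssu_size=20)
print(small, large, large / small)
```

Under these assumptions, a tenfold larger structure costs ten times more rather than a hundred times more, which is the point the abstract makes about speed.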
Affiliation(s)
- Petr Cech
  - Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Prague, Czech Republic
15
Shen Y, Wong HS, Zhang S, Yu Z. Feature-based 3D motif filtering for ribosomal RNA. Bioinformatics 2011; 27:2828-35. [PMID: 21873638 DOI: 10.1093/bioinformatics/btr495] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
MOTIVATION: RNA 3D motifs are recurrent substructures in an RNA subunit and are building blocks of the RNA architecture. They play an important role in binding proteins and consolidating RNA tertiary structures. RNA 3D motif searching consists of two steps: candidate generation and candidate filtering. We proposed a novel method, known as Feature-based RNA Motif Filtering (FRMF), for identifying motifs based on a set of moment invariants and the Earth Mover's Distance in the second step.
RESULTS: A positive set of RNA motifs belonging to six characteristic types, with eight subtypes occurring in HM 50S, is compiled by us. The proposed method is validated on this representative set. FRMF successfully finds most of the positive fragments. Besides the proposed new method and the compiled positive set, we also recognize some new motifs, in particular a π-turn and some non-standard A-minor motifs are found. These newly discovered motifs provide more information about RNA structure conformation.
AVAILABILITY: Matlab code can be downloaded from www.cs.cityu.edu.hk/~yingshen/FRMF.html
CONTACT: cshswong@cityu.edu.hk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
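For readers unfamiliar with the Earth Mover's Distance named in this abstract, here is a minimal one-dimensional sketch. It is not the FRMF code (which is distributed as Matlab) and does not reproduce its moment-invariant features; it assumes two histograms over the same bins, equal total mass, and unit spacing between adjacent bins. The function name `emd_1d` is hypothetical.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms with unit bin spacing."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    carried = 0.0  # mass that still has to be moved past the current bin
    total = 0.0    # accumulated work (mass times distance)
    for p_i, q_i in zip(p, q):
        carried += p_i - q_i
        total += abs(carried)
    return total


# Moving one unit of mass across two bins costs 2.
print(emd_1d([1, 0, 0], [0, 0, 1]))
```

In one dimension the distance reduces to the summed absolute difference of the cumulative histograms, which is why a single pass suffices here; higher-dimensional features need a general transport solver.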
Affiliation(s)
- Ying Shen
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
16
Sequence-based identification of 3D structural modules in RNA with RMDetect. Nat Methods 2011; 8:513-21. [PMID: 21552257 DOI: 10.1038/nmeth.1603] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 11/08/2010] [Accepted: 04/11/2011] [Indexed: 01/24/2023]
Abstract
Structural RNA modules, sets of ordered non-Watson-Crick base pairs embedded between Watson-Crick pairs, have central roles as architectural organizers and sites of ligand binding in RNA molecules, and are recurrently observed in RNA families throughout the phylogeny. Here we describe a computational tool, RNA three-dimensional (3D) modules detection, or RMDetect, for identifying known 3D structural modules in single and multiple RNA sequences in the absence of any other information. Currently, four modules can be searched for: G-bulge loop, kink-turn, C-loop and tandem-GA loop. In control test sequences we found all of the known modules with a false discovery rate of 0.23. Scanning through 1,444 publicly available alignments, we identified 21 yet unreported modules and 141 known modules. RMDetect can be used to refine RNA 2D structure, assemble RNA 3D models, and search and annotate structured RNAs in genomic data.
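The false discovery rate quoted in this abstract is, by its standard definition, the fraction of reported hits that are false. The short sketch below illustrates that definition only; the counts are hypothetical and are not taken from the RMDetect study.

```python
def false_discovery_rate(true_positives, false_positives):
    """FDR = FP / (FP + TP): the share of reported hits that are wrong."""
    reported = true_positives + false_positives
    if reported == 0:
        raise ValueError("no hits reported, so the FDR is undefined")
    return false_positives / reported


# Hypothetical counts chosen so that the rate comes out at 0.23.
fdr = false_discovery_rate(true_positives=77, false_positives=23)
print(round(fdr, 2))
```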
|
17
|
Zhong C, Tang H, Zhang S. RNAMotifScan: automatic identification of RNA structural motifs using secondary structural alignment. Nucleic Acids Res 2010; 38:e176. [PMID: 20696653 PMCID: PMC2952876 DOI: 10.1093/nar/gkq672] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Abstract
Recent studies have shown that RNA structural motifs play essential roles in RNA folding and interaction with other molecules. Computational identification and analysis of RNA structural motifs remain a challenging task. Existing motif identification methods based on 3D structure may not properly compare motifs with high structural variations. Other structural motif identification methods consider only nested canonical base-pairing structures and cannot be used to identify complex RNA structural motifs that often consist of various non-canonical base pairs due to uncommon hydrogen bond interactions. In this article, we present a novel RNA structural alignment method for RNA structural motif identification, RNAMotifScan, which takes into consideration the isosteric (both canonical and non-canonical) base pairs and multi-pairings in RNA structural motifs. The utility and accuracy of RNAMotifScan are demonstrated by searching for kink-turn, C-loop, sarcin-ricin, reverse kink-turn and E-loop motifs against a 23S rRNA (PDBid: 1S72), which is well characterized for the occurrences of these motifs. Finally, we search for these motifs in the RNA structures of the entire Protein Data Bank and estimate their abundances. RNAMotifScan is freely available at our supplementary website (http://genome.ucf.edu/RNAMotifScan).
Affiliation(s)
- Cuncong Zhong
- School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
|
18
|
Kirillova S, Tosatto SCE, Carugo O. FRASS: the web-server for RNA structural comparison. BMC Bioinformatics 2010; 11:327. [PMID: 20553602 PMCID: PMC2902451 DOI: 10.1186/1471-2105-11-327] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/22/2010] [Accepted: 06/16/2010] [Indexed: 11/24/2022] Open
Abstract
Background The impressive increase in novel RNA structures during the past few years demands automated methods for structure comparison. While many algorithms handle only small motifs, a few techniques developed in recent years (ARTS, DIAL, SARA, SARSA, and LaJolla) are available for the structural comparison of large and intact RNA molecules. Results The FRASS web-server represents an RNA chain with its Gauss integrals and allows one to compare structures of RNA chains and to find similar entries in a database derived from the Protein Data Bank. We observed that FRASS scores correlate well with the ARTS and LaJolla similarity scores. Moreover, the web-server can also satisfactorily reproduce the DARTS classification of RNA 3D structures and the classification of the SCOR functions that was obtained by the SARA method. Conclusions The FRASS web-server can be easily used to detect relationships among RNA molecules and to efficiently scan the rapidly growing structural databases.
Affiliation(s)
- Svetlana Kirillova
- Department of Structural and Computational Biology, Max F Perutz Laboratories, Vienna University, Campus Vienna Biocenter 5, A-1030 Vienna, Austria.
|
19
|
Abstract
Structural 3D motifs in RNA play an important role in RNA stability and function. Previous studies have focused on the characterization and discovery of 3D motifs in RNA secondary and tertiary structures. However, statistical analyses of the distribution of 3D motifs along the RNA appear to be lacking. Herein, we present a novel strategy for evaluating the distribution of 3D motifs along the RNA chain and identifying those motifs whose distributions are significantly non-random. By applying it to the X-ray structure of the large ribosomal subunit from Haloarcula marismortui, we found that helical motifs cluster together along the chain and in the 3D structure, whereas the known tetraloops tend to be sequentially and spatially dispersed. That the distribution of key structural motifs such as tetraloops differs significantly from a random one suggests that our method could also be used to detect novel 3D motifs of any size in sufficiently long/large RNA structures. The motif distribution type can help in the prediction and design of 3D structures of large RNA molecules.
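One generic way to ask whether motif positions along a chain are more clustered than chance would predict is a Monte Carlo randomisation test. The sketch below is an illustration of that general idea only; the abstract does not describe the authors' actual statistic, so the mean nearest-neighbour distance used here, the function names, and the example positions are all assumptions.

```python
import random


def mean_nn_distance(positions):
    """Mean distance from each position to its nearest neighbour along the chain."""
    pts = sorted(positions)
    dists = []
    for i, p in enumerate(pts):
        left = p - pts[i - 1] if i > 0 else float("inf")
        right = pts[i + 1] - p if i < len(pts) - 1 else float("inf")
        dists.append(min(left, right))
    return sum(dists) / len(dists)


def clustering_p_value(positions, chain_length, n_perm=999, seed=0):
    """Estimate how often random placements are at least as clustered as observed.

    Small values suggest the positions are closer together than expected if
    they were placed uniformly at random along the chain.
    """
    if len(positions) < 2:
        raise ValueError("need at least two positions")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean_nn_distance(positions)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(chain_length), len(positions))
        if mean_nn_distance(sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical motif start positions on a chain of 3000 nucleotides.
clustered = [100, 101, 102, 103, 104]
dispersed = [0, 600, 1200, 1800, 2400]
print(clustering_p_value(clustered, 3000))
print(clustering_p_value(dispersed, 3000))
```

Testing for dispersion rather than clustering would count random placements whose statistic is at least as large as the observed one instead.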
Affiliation(s)
- Karen Sargsyan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
|
20
|
Ciriello G, Gallina C, Guerra C. Analysis of interactions between ribosomal proteins and RNA structural motifs. BMC Bioinformatics 2010; 11 Suppl 1:S41. [PMID: 20122215 PMCID: PMC3009514 DOI: 10.1186/1471-2105-11-s1-s41] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023] Open
Abstract
Background One important goal of structural bioinformatics is to recognize and predict the interactions between protein binding sites and RNA. Recently, a comprehensive analysis of ribosomal proteins and their interactions with rRNA has been carried out. Interesting results emerged from the comparison of r-proteins within the small subunit in T. thermophilus and E. coli, supporting the idea of a core made by both RNA and proteins, conserved by evolution. Recent work also showed that ribosomal RNA is modularly composed. Motifs are generally single-stranded sequences of consecutive nucleotides (ssRNA) with characteristic folding. The role of these motifs in protein-RNA interactions has so far been only sparsely investigated. Results This work explores the role of RNA structural motifs in the interaction of proteins with ribosomal RNA (rRNA). We analyze composition, local geometries and conformation of interface regions involving motifs such as tetraloops, kink turns and single extruded nucleotides. We construct an interaction map of protein binding sites that allows us to identify the common types of shared 3-D physicochemical binding patterns for tetraloops. Furthermore, we investigate the protein binding pockets that accommodate single extruded nucleotides either involved in kink-turns or in arbitrary RNA strands. This analysis reveals a new structural motif, called tripod. It corresponds to small pockets consisting of three amino acids arranged at the vertices of an almost equilateral triangle. We developed a search procedure for the recognition of tripods, based on an empirical tripod fingerprint. Conclusion A comparative analysis with the overall RNA surface and interfaces shows that contact surfaces involving RNA motifs have distinctive features that may be useful for the recognition and prediction of interactions.
Affiliation(s)
- Giovanni Ciriello
- Dept. of Information Engineering, University of Padova, Via Gradenigo 6a, 35131 Padova, Italy.
|
21
|
Robeva R, Davies R, Hodge T, Enyedi A. Mathematical biology modules based on modern molecular biology and modern discrete mathematics. CBE Life Sci Educ 2010; 9:227-40. [PMID: 20810955 PMCID: PMC2931670 DOI: 10.1187/cbe.10-03-0019] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/15/2010] [Revised: 05/24/2010] [Accepted: 06/02/2010] [Indexed: 05/23/2023]
Abstract
We describe an ongoing collaborative curriculum materials development project between Sweet Briar College and Western Michigan University, with support from the National Science Foundation. We present a collection of modules under development that can be used in existing mathematics and biology courses, and we address a critical national need to introduce students to mathematical methods beyond the interface of biology with calculus. Based on ongoing research, and designed to use the project-based-learning approach, the modules highlight applications of modern discrete mathematics and algebraic statistics to pressing problems in molecular biology. For the majority of projects, calculus is not a required prerequisite and, due to the modest amount of mathematical background needed for some of the modules, the materials can be used for an early introduction to mathematical modeling. At the same time, most modules are connected with topics in linear and abstract algebra, algebraic geometry, and probability, and they can be used as meaningful applied introductions into the relevant advanced-level mathematics courses. Open-source software is used to facilitate the relevant computations. As a detailed example, we outline a module that focuses on Boolean models of the lac operon network.
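To give a flavour of what a Boolean model of the lac operon can look like, here is a minimal sketch with three state variables and two external inputs, updated synchronously. The update rules are a deliberately simplified, hypothetical rule set written for this illustration; they should not be read as the model used in the authors' module, and every name in the code is an assumption.

```python
def step(state, lactose_ext, glucose_ext):
    """Apply one synchronous update to (mRNA, enzymes, internal lactose)."""
    mrna, enzymes, lactose_int = state
    # Assumed rule: transcription needs glucose to be absent and lactose
    # to be available either inside or outside the cell.
    mrna_next = (not glucose_ext) and (lactose_int or lactose_ext)
    # Assumed rule: enzymes are present one step after mRNA is present.
    enzymes_next = mrna
    # Assumed rule: internal lactose needs the enzymes (permease),
    # external lactose, and no glucose.
    lactose_next = (not glucose_ext) and enzymes and lactose_ext
    return (bool(mrna_next), bool(enzymes_next), bool(lactose_next))


def steady_state(state, lactose_ext, glucose_ext, max_steps=20):
    """Iterate the update rule until the state stops changing."""
    for _ in range(max_steps):
        nxt = step(state, lactose_ext, glucose_ext)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached (the trajectory may cycle)")


# Lactose present, glucose absent: the operon switches on.
print(steady_state((False, False, False), lactose_ext=True, glucose_ext=False))
# Glucose present: the operon switches off even when lactose is available.
print(steady_state((True, True, True), lactose_ext=True, glucose_ext=True))
```

With only three Boolean variables there are eight states per input setting, so the whole state-transition diagram can be enumerated by hand or in a short loop, which is part of what makes models of this kind accessible without calculus.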
Affiliation(s)
- Raina Robeva
- Department of Mathematical Sciences, Western Michigan University, Kalamazoo MI 49008, USA.
|
22
|
Affiliation(s)
- Raina Robeva
- Department of Mathematical Sciences, Sweet Briar College, Sweet Briar, VA 24595, USA.
|
23
|
Hsiao C, Williams LD. A recurrent magnesium-binding motif provides a framework for the ribosomal peptidyl transferase center. Nucleic Acids Res 2009; 37:3134-42. [PMID: 19279186 PMCID: PMC2691814 DOI: 10.1093/nar/gkp119] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 01/06/2023] Open
Abstract
The ribosome is an ancient macromolecular machine responsible for the synthesis of all proteins in all living organisms. Here we demonstrate that the ribosomal peptidyl transferase center (PTC) is supported by a framework of magnesium microclusters (Mg2+-μc's). Common features of Mg2+-μc's include two paired Mg2+ ions that are chelated by a common bridging phosphate group in the form Mg(a)2+–(O1P-P-O2P)–Mg(b)2+. This bridging phosphate is part of a 10-membered chelation ring in the form Mg(a)2+–(OP-P-O5′-C5′-C4′-C3′-O3′-P-OP)–Mg(a)2+. The two phosphate groups of this 10-membered ring are contributed by adjacent residues along the RNA backbone. Both Mg2+ ions are octahedrally coordinated, but are substantially dehydrated by interactions with additional RNA phosphate groups. The Mg2+-μc's in the LSU (large subunit) appear to be highly conserved over evolution, since they are unchanged in bacteria (Thermus thermophilus, PDB entry 2J01) and archaea (Haloarcula marismortui, PDB entry 1JJ2). The 2D elements of the 23S rRNA that are linked by Mg2+-μc's are conserved between the rRNAs of bacteria, archaea and eukarya, in mitochondrial rRNA, and in a proposed minimal 23S-rRNA. We observe Mg2+-μc's in other RNAs, including the bacterial 16S rRNA and the P4–P6 domain of the Tetrahymena Group I intron ribozyme. It appears that Mg2+-μc's are a primeval motif, with pivotal roles in RNA folding, function and evolution.
Affiliation(s)
- Chiaolong Hsiao
- School of Chemistry and Biochemistry, Parker H. Petit Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
|