1
Gianfrotta C, Reinharz V, Lespinet O, Barth D, Denise A. On the predictibility of A-minor motifs from their local contexts. RNA Biol 2022; 19:1208-1227. [PMID: 36384383 PMCID: PMC9673937 DOI: 10.1080/15476286.2022.2144611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
This study investigates the importance of the structural context in the formation of a type I/II A-minor motif. This very frequent structural motif has been shown to be important in the spatial folding of RNA molecules. We developed an automated method to classify A-minor motif occurrences according to their 3D context similarities, and we used a graph approach to represent both the structural A-minor motif occurrences and their classes at different scales. This approach leads us to uncover new subclasses of A-minor motif occurrences according to their local 3D similarities. The majority of classes are composed of homologous occurrences, but some of them are composed of non-homologous occurrences. The different classifications we obtain allow us to better understand the importance of the context in the formation of A-minor motifs. In a second step, we investigate how much knowledge of the context around an A-minor motif can help to infer its presence (and position). More specifically, we want to determine what kind of information, contained in the structural context, can be useful to characterize and predict A-minor motifs. We show that, for some A-minor motifs, the topology combined with a sequence signal is sufficient to predict the presence and the position of an A-minor motif occurrence. In most other cases, these signals are not sufficient to predict the A-minor motif; however, we show that they are good signals for this purpose. All the classification and prediction pipelines rely on automated processes, for which we describe the underlying algorithms and parameters.
Affiliation(s)
- Coline Gianfrotta
  - Données et Algorithmes pour une Ville Intelligente et Durable (DAVID), Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, Versailles, France
  - Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CNRS, Orsay, France
  - Contact: Coline Gianfrotta, Données et Algorithmes pour une Ville Intelligente et Durable (DAVID), Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, France
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Québec, Canada
- Olivier Lespinet
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
- Dominique Barth
  - Données et Algorithmes pour une Ville Intelligente et Durable (DAVID), Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, Versailles, France
- Alain Denise
  - Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CNRS, Orsay, France
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
2
Soulé A, Reinharz V, Sarrazin-Gendron R, Denise A, Waldispühl J. Finding recurrent RNA structural networks with fast maximal common subgraphs of edge-colored graphs. PLoS Comput Biol 2021; 17:e1008990. [PMID: 34048427 PMCID: PMC8191989 DOI: 10.1371/journal.pcbi.1008990] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2020] [Revised: 06/10/2021] [Accepted: 04/22/2021] [Indexed: 11/25/2022]
Abstract
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure, composed of stems (stacked canonical base pairs) enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, i.e. loops. Representing the RNA structure as a graph has recently made it possible to extend this work to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge-colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers form large shared substructures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.
Ribonucleic acids (RNAs) perform a broad range of essential molecular functions in cells, many of which rely on intricate folding properties of the molecule. Watson-Crick and wobble base pairs form early and stack onto each other to create stems connected by loops, which are themselves stabilized by more sophisticated base interaction patterns. These networks are essential to shape RNA 3D structures but are unfortunately still poorly understood. Here, we undertake the task of building a catalog of base interaction networks occurring in multiple structures. However, a pairwise comparison of all RNA structures is computationally heavy. Therefore, we devise an algorithm leveraging intrinsic properties of RNA base interaction networks that enables us to quickly mine full databases of 3D structures. Compared to previous methods, our techniques bring the total running time of the analysis from months to hours while performing more general searches. The data collected through this work will benefit molecular evolution studies and serve in structure prediction tools.
Affiliation(s)
- Antoine Soulé
  - School of Computer Science, McGill University, Montréal, Canada
  - LiX, École Polytechnique, Paris, France
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Montréal, Canada
- Alain Denise
  - Laboratoire de recherche en informatique, Université Paris-Saclay - CNRS, Orsay, France
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay - CEA - CNRS, Gif-sur-Yvette, France
- Jérôme Waldispühl (corresponding author)
  - School of Computer Science, McGill University, Montréal, Canada
3
Reinharz V, Soulé A, Westhof E, Waldispühl J, Denise A. Mining for recurrent long-range interactions in RNA structures reveals embedded hierarchies in network families. Nucleic Acids Res 2018; 46:3841-3851. [PMID: 29608773 PMCID: PMC5934684 DOI: 10.1093/nar/gky197] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/30/2017] [Accepted: 03/22/2018] [Indexed: 11/14/2022]
Abstract
The wealth of the combinatorics of nucleotide base pairs enables RNA molecules to assemble into sophisticated interaction networks, which are used to create complex 3D substructures. These interaction networks are essential to shape the 3D architecture of the molecule, and also to provide the key elements to carry molecular functions such as protein or ligand binding. They are made of organised sets of long-range tertiary interactions which connect distinct secondary structure elements in 3D structures. Here, we present a de novo data-driven approach to extract automatically from large data sets of full RNA 3D structures the recurrent interaction networks (RINs). Our methodology enables us for the first time to detect the interaction networks connecting distinct components of the RNA structure, highlighting their diversity and conservation through non-related functional RNAs. We use a graphical model to perform pairwise comparisons of all RNA structures available and to extract RINs and modules. Our analysis yields a complete catalog of RNA 3D structures available in the Protein Data Bank and reveals the intricate hierarchical organization of the RNA interaction networks and modules. We assembled our results in an online database (http://carnaval.lri.fr) which will be regularly updated. Within the site, a tool allows users with a novel RNA structure to detect automatically whether the novel structure contains previously observed RINs.
Affiliation(s)
- Vladimir Reinharz
  - Department of Computer Science, Ben-Gurion University of the Negev, P.O.B. 653 Beer-Sheva, 84105, Israel
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
- Antoine Soulé
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
  - LIX, École Polytechnique, CNRS, Inria, Palaiseau 91120, France
- Eric Westhof
  - ARN, Université de Strasbourg, IBMC-CNRS, 15 rue René Descartes, Strasbourg Cedex 67084, France
- Jérôme Waldispühl
  - School of Computer Science, McGill University, 3480 University, Montreal, Quebec H3A 0E9, Canada
- Alain Denise
  - LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, Bâtiment 650, Orsay cedex 91405, France
  - I2BC, Université Paris-Sud, CNRS, CEA, Université Paris-Saclay, Bâtiment 400, Orsay cedex 91405, France
4
Zaharia A, Labedan B, Froidevaux C, Denise A. CoMetGeNe: mining conserved neighborhood patterns in metabolic and genomic contexts. BMC Bioinformatics 2019; 20:19. [PMID: 30630411 PMCID: PMC6327494 DOI: 10.1186/s12859-018-2542-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/15/2018] [Accepted: 11/22/2018] [Indexed: 02/07/2023]
Abstract
Background: In systems biology, there is an acute need for integrative approaches to heterogeneous network mining in order to exploit the continuous flux of genomic data. Simultaneous analysis of the metabolic pathways and genomic context of a given species leads to the identification of patterns consisting of reaction chains catalyzed by products of neighboring genes. Similar patterns across several species can reveal their mode of conservation throughout the tree of life. Results: We present CoMetGeNe (COnserved METabolic and GEnomic NEighborhoods), a novel method that identifies metabolic and genomic patterns consisting of maximal trails of reactions catalyzed by products of neighboring genes. Patterns determined by CoMetGeNe in one species are subsequently used to assess their degree of conservation across multiple prokaryotic species. These interspecies comparisons help to improve genome annotation and can reveal putative alternative metabolic routes as well as unexpected gene ordering occurrences. Conclusions: CoMetGeNe is an exploratory tool at both the genomic and the metabolic levels, leading to insights into the conservation of functionally related clusters of neighboring enzyme-coding genes. The open-source CoMetGeNe pipeline is freely available at https://cometgene.lri.fr. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2542-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alexandra Zaharia
  - Laboratoire de Recherche en Informatique (LRI), CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, 91405, France
- Bernard Labedan
  - Laboratoire de Recherche en Informatique (LRI), CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, 91405, France
- Christine Froidevaux
  - Laboratoire de Recherche en Informatique (LRI), CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, 91405, France
- Alain Denise
  - Laboratoire de Recherche en Informatique (LRI), CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, 91405, France
  - Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, 91405, France
5
Boudard M, Barth D, Bernauer J, Denise A, Cohen J. GARN2: coarse-grained prediction of 3D structure of large RNA molecules by regret minimization. Bioinformatics 2017; 33:2479-2486. [PMID: 28398456 DOI: 10.1093/bioinformatics/btx175] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/03/2016] [Accepted: 04/06/2017] [Indexed: 11/12/2022]
Abstract
Motivation: Predicting the 3D structure of RNA molecules is a key step towards predicting their functions. Methods which work at the atomic or nucleotide level are not suitable for large molecules. In these cases, coarse-grained prediction methods aim to predict a shape which could be refined later by using more precise methods on smaller parts of the molecule. Results: We developed a complete method for sampling 3D RNA structures with a coarse-grained model, taking a secondary structure as input. One of the novelties of our method is that a second step extracts two best possible structures close to the native, from a set of possible structures. Although our method benefits from the first version of GARN, some of the main features of GARN2 are very different. GARN2 is much faster than the previous version and than well-known state-of-the-art methods. Our experiments show that GARN2 can also provide better structures than the other state-of-the-art methods. Availability and implementation: GARN2 is written in Java. It is freely distributed and available at http://garn.lri.fr/. Contact: melanie.boudard@lri.fr or johanne.cohen@lri.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mélanie Boudard
  - DAVID, Université de Versailles-St-Quentin-en-Yvelines, Université Paris-Saclay, France
  - LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, France
- Dominique Barth
  - DAVID, Université de Versailles-St-Quentin-en-Yvelines, Université Paris-Saclay, France
- Julie Bernauer
  - LIX, Ecole Polytechnique, CNRS, Université Paris-Saclay, France
  - Inria Saclay, Université Paris-Saclay, France
- Alain Denise
  - LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, France
  - I2BC, Université Paris-Sud, CNRS, Université Paris-Saclay, France
- Johanne Cohen
  - LRI, Université Paris-Sud, CNRS, Université Paris-Saclay, France
6
Boudard M, Bernauer J, Barth D, Cohen J, Denise A. GARN: Sampling RNA 3D Structure Space with Game Theory and Knowledge-Based Scoring Strategies. PLoS One 2015; 10:e0136444. [PMID: 26313379 PMCID: PMC4551674 DOI: 10.1371/journal.pone.0136444] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/10/2015] [Accepted: 08/03/2015] [Indexed: 11/19/2022]
Abstract
Cellular processes involve large numbers of RNA molecules. The functions of these RNA molecules and their binding to molecular machines are highly dependent on their 3D structures. One of the key challenges in RNA structure prediction and modeling is predicting the spatial arrangement of the various structural elements of RNA. As RNA folding is generally hierarchical, methods involving coarse-grained models hold great promise for this purpose. We present here a novel coarse-grained method for sampling, based on game theory and knowledge-based potentials. This strategy, GARN (Game Algorithm for RNa sampling), is often much faster than previously described techniques and generates large sets of solutions closely resembling the native structure. GARN is thus a suitable starting point for the molecular modeling of large RNAs, particularly those with experimental constraints. GARN is available from: http://garn.lri.fr/.
Affiliation(s)
- Mélanie Boudard (corresponding author)
  - PRiSM, CNRS UMR 8144, Université de Versailles-St-Quentin-en-Yvelines, 78000 Versailles, France
  - LRI, CNRS UMR 8623, Université Paris-Sud, 91405 Orsay, France
- Julie Bernauer
  - AMIB, Inria Saclay-Ile de France, 91120 Palaiseau, France
  - LIX, CNRS UMR 7161, Ecole Polytechnique, 91120 Palaiseau, France
- Dominique Barth
  - PRiSM, CNRS UMR 8144, Université de Versailles-St-Quentin-en-Yvelines, 78000 Versailles, France
- Johanne Cohen (corresponding author)
  - LRI, CNRS UMR 8623, Université Paris-Sud, 91405 Orsay, France
- Alain Denise
  - LRI, CNRS UMR 8623, Université Paris-Sud, 91405 Orsay, France
  - AMIB, Inria Saclay-Ile de France, 91120 Palaiseau, France
  - I2BC, CNRS, Université Paris-Sud, 91405 Orsay, France
7
Shao C, Yang B, Wu T, Huang J, Tang P, Zhou Y, Zhou J, Qiu J, Jiang L, Li H, Chen G, Sun H, Zhang Y, Denise A, Zhang DE, Fu XD. Mechanisms for U2AF to define 3' splice sites and regulate alternative splicing in the human genome. Nat Struct Mol Biol 2014; 21:997-1005. [PMID: 25326705 DOI: 10.1038/nsmb.2906] [Citation(s) in RCA: 119] [Impact Index Per Article: 11.9] [Received: 08/14/2014] [Accepted: 09/25/2014] [Indexed: 12/24/2022]
Abstract
The U2AF heterodimer has been well studied for its role in defining functional 3' splice sites in pre-mRNA splicing, but many fundamental questions still remain unaddressed regarding the function of U2AF in mammalian genomes. Through genome-wide analysis of U2AF-RNA interactions, we report that U2AF has the capacity to directly define ~88% of functional 3' splice sites in the human genome, but numerous U2AF binding events also occur in intronic locations. Mechanistic dissection reveals that upstream intronic binding events interfere with the immediate downstream 3' splice site associated either with the alternative exon, to cause exon skipping, or with the competing constitutive exon, to induce exon inclusion. We further demonstrate partial functional impairment with leukemia-associated mutations in U2AF35, but not U2AF65, in regulated splicing. These findings reveal the genomic function and regulatory mechanism of U2AF in both normal and disease states.
Affiliation(s)
- Changwei Shao
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Bo Yang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
  - Laboratoire de Recherche en Informatique, Institut de Génétique et Microbiologie I, Université Paris-Sud and Centre National de la Recherche Scientifique, Orsay, France
- Tongbin Wu
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Jie Huang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Peng Tang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Yu Zhou
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California, USA
- Jie Zhou
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Jinsong Qiu
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California, USA
- Li Jiang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Hairi Li
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California, USA
- Geng Chen
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Hui Sun
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Yi Zhang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
- Alain Denise
  - Laboratoire de Recherche en Informatique, Institut de Génétique et Microbiologie I, Université Paris-Sud and Centre National de la Recherche Scientifique, Orsay, France
- Dong-Er Zhang
  - UC San Diego Moores Cancer Center, University of California, San Diego, La Jolla, California, USA
- Xiang-Dong Fu
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, China
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California, USA
9
Abstract
BACKGROUND In comparative genomics, orthologs are used to transfer annotation from genes already characterized to newly sequenced genomes. Many methods have been developed for finding orthologs in sets of genomes. However, the application of different methods on the same proteome set can lead to distinct orthology predictions. METHODS We developed a method based on a meta-approach that is able to combine the results of several methods for orthologous group prediction. The purpose of this method is to produce better quality results by using the overlapping results obtained from several individual orthologous gene prediction procedures. Our method proceeds in two steps. The first aims to construct seeds for groups of orthologous genes; these seeds correspond to the exact overlaps between the results of all or several methods. In the second step, these seed groups are expanded by using HMM profiles. RESULTS We evaluated our method on two standard reference benchmarks, OrthoBench and Orthology Benchmark Service. Our method presents a higher level of accurately predicted groups than the individual input methods of orthologous group prediction. Moreover, our method increases the number of annotated orthologous pairs without decreasing the annotation quality compared to twelve state-of-the-art methods. CONCLUSIONS The meta-approach based method appears to be a reliable procedure for predicting orthologous groups. Since a large number of methods for predicting groups of orthologous genes exist, it is quite conceivable to apply this meta-approach to several combinations of different methods.
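The first step described above (building seeds from exact overlaps between methods) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `seed_groups`, the `min_support` parameter, the method names, and the gene identifiers are all hypothetical, and "exact overlap" is taken here to mean that the same group of genes is predicted identically by the required number of methods. The HMM-based expansion of the second step is not shown.

```python
from collections import Counter


def seed_groups(predictions, min_support=None):
    """Return the groups predicted identically by at least `min_support` methods.

    `predictions` maps a method name to a list of predicted orthologous
    groups (each group is a set of gene identifiers). By default a group
    must be reported by every method to become a seed.
    """
    if min_support is None:
        min_support = len(predictions)
    counts = Counter()
    for groups in predictions.values():
        # Count each distinct group at most once per method.
        for group in {frozenset(g) for g in groups}:
            counts[group] += 1
    return {group for group, n in counts.items() if n >= min_support}


# Hypothetical predictions from three orthology methods.
predictions = {
    "method_a": [{"g1", "g2", "g3"}, {"g4", "g5"}],
    "method_b": [{"g1", "g2", "g3"}, {"g4", "g5", "g6"}],
    "method_c": [{"g1", "g2", "g3"}, {"g4", "g5"}],
}
all_seeds = seed_groups(predictions)                      # agreed on by all methods
majority_seeds = seed_groups(predictions, min_support=2)  # agreed on by several methods
```

Lowering `min_support` corresponds to the "all or several methods" choice mentioned in the abstract: stricter support gives fewer but more reliable seeds.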
10
Lamiable A, Quessette F, Vial S, Barth D, Denise A. An algorithmic game-theory approach for coarse-grain prediction of RNA 3D structure. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:193-199. [PMID: 23702555 DOI: 10.1109/tcbb.2012.148] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/02/2023]
Abstract
We present a new approach for the prediction of the coarse-grain 3D structure of RNA molecules. We model a molecule as being made of helices and junctions. Those junctions are classified into topological families that determine their preferred 3D shapes. All the parts of the molecule are then allowed to establish long-distance contacts that induce a 3D folding of the molecule. An algorithm relying on game theory is proposed to discover such long-distance contacts that allow the molecule to reach a Nash equilibrium. As reported by our experiments, this approach allows one to predict the global shape of large molecules of several hundreds of nucleotides that are out of reach of the state-of-the-art methods.
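To make the notion of reaching a Nash equilibrium concrete, here is a minimal sketch of best-response dynamics on a toy game. It is not the algorithm of the paper: the abstract does not specify the players, strategies, or payoffs, so the function `best_response_dynamics`, the coordination payoff, and the three-player example are all assumptions chosen only to illustrate the general idea that players keep switching strategies until nobody can improve alone.

```python
def best_response_dynamics(strategies, payoff, profile, max_rounds=1000):
    """Iterate best responses until no player can improve unilaterally.

    strategies: list where strategies[i] is the list of options of player i
    payoff:     function (player index, profile) -> number
    profile:    initial list with one chosen strategy per player
    Returns a profile that is a pure Nash equilibrium of the game.
    """
    profile = list(profile)
    for _ in range(max_rounds):
        changed = False
        for i, options in enumerate(strategies):
            def value(s):
                # Payoff of player i when switching to s, others unchanged.
                return payoff(i, profile[:i] + [s] + profile[i + 1:])
            best = max(options, key=value)
            if value(best) > value(profile[i]):
                profile[i] = best
                changed = True
        if not changed:
            return profile
    raise RuntimeError("no equilibrium reached within max_rounds")


def coordination_payoff(i, profile):
    # Toy payoff (hypothetical): number of other players making the same choice.
    return sum(1 for j, s in enumerate(profile) if j != i and s == profile[i])


toy_strategies = [[0, 1], [0, 1], [0, 1]]
equilibrium = best_response_dynamics(toy_strategies, coordination_payoff, [0, 1, 1])
```

Best-response dynamics is guaranteed to stop only for some classes of games (potential games, for example), which is why the sketch caps the number of rounds.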
Affiliation(s)
- Alexis Lamiable
  - PRiSM, Université de Versailles-St-Quentin-en-Yvelines/CNRS, France
11
Lamiable A, Barth D, Denise A, Quessette F, Vial S, Westhof É. Automated prediction of three-way junction topological families in RNA secondary structures. Comput Biol Chem 2012; 37:1-5. [DOI: 10.1016/j.compbiolchem.2011.11.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 08/12/2011] [Revised: 11/14/2011] [Accepted: 11/16/2011] [Indexed: 11/24/2022]
12
Rinaudo P, Ponty Y, Barth D, Denise A. Tree Decomposition and Parameterized Algorithms for RNA Structure-Sequence Alignment Including Tertiary Interactions and Pseudoknots. 2012. [DOI: 10.1007/978-3-642-33122-0_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/14/2023]
13
Abstract
In 2004, Condon and coauthors gave a hierarchical classification of exact RNA structure prediction algorithms according to the generality of structure classes that they handle. We complete this classification by adding two recent prediction algorithms. More importantly, we precisely quantify the hierarchy by giving closed or asymptotic formulas for the theoretical number of structures of given size n in all the classes but one. This allows us to assess the tradeoff between the expressiveness and the computational complexity of RNA structure prediction algorithms.
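As an illustration of what counting "structures of given size n" means, the sketch below counts only the simplest class, pseudoknot-free secondary structures, using the classical recurrence in which the last base is either unpaired or paired with an earlier base. This is not one of the formulas of the paper, which covers richer classes; the function name and the `min_loop` parameter (minimum number of unpaired bases enclosed by a base pair) are assumptions made for this example.

```python
def count_secondary_structures(n, min_loop=1):
    """Count pseudoknot-free secondary structures on a sequence of n bases.

    Any two bases may pair, provided the pair encloses at least `min_loop`
    bases. counts[m] is the number of structures on the first m bases.
    """
    counts = [1] * (n + 1)
    for m in range(1, n + 1):
        total = counts[m - 1]  # base m is unpaired
        # Base m pairs with base j: j - 1 bases to the left of the pair,
        # m - j - 1 >= min_loop bases enclosed by it.
        for j in range(1, m - min_loop):
            total += counts[j - 1] * counts[m - j - 1]
        counts[m] = total
    return counts[n]


first_counts = [count_secondary_structures(n) for n in range(7)]
# first_counts == [1, 1, 1, 2, 4, 8, 17]
```

The rapid growth of these counts with n gives a feel for the tradeoff the abstract mentions: more general structure classes contain many more candidates, and exact prediction over them costs more.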
Affiliation(s)
- Cédric Saule
  - LRI, Université Paris-Sud and CNRS, Orsay Cedex, France
14
Cohen-Boulakia S, Denise A, Hamel S. Using Medians to Generate Consensus Rankings for Biological Data. Lecture Notes in Computer Science 2011. [DOI: 10.1007/978-3-642-22351-8_5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
15
Liu W, Zhou Y, Hu Z, Sun T, Denise A, Fu XD, Zhang Y. Regulation of splicing enhancer activities by RNA secondary structures. FEBS Lett 2010; 584:4401-7. [PMID: 20888818 DOI: 10.1016/j.febslet.2010.09.039] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 08/28/2010] [Accepted: 08/28/2010] [Indexed: 12/13/2022]
Abstract
In this report, we studied the effect of RNA structures on the activity of exonic splicing enhancers (ESEs) in the SMN1 minigene model by engineering known ESEs into different positions of stable hairpins. We found that a stem as short as 7 bp is sufficient to abolish the enhancer activity. When placed in the loop region, AG-rich ESEs are fully active, but a UCG-rich ESE is not because of additional structural constraints. ESEs placed adjacent to the 3' end of the hairpin structure display high enhancer activity, regardless of their sequence identities. These rules explain the suppression of multiple ESEs by point mutations that result in a stable RNA structure, and provide an additional mechanism for the C6T mutation in SMN2.
Collapse
Affiliation(s)
- Wei Liu
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, Hubei, China
|
16
|
Abstract
We describe a unifying theoretical framework for expressing the comparison of RNA structures, which we call the alignment hierarchy. This framework relies on the definition of common supersequences for arc-annotated sequences and encompasses the main existing models for RNA structure comparison based on trees and arc-annotated sequences with a variety of edit operations. It also gives rise to edit models that have not been studied yet. We provide a thorough analysis of the alignment hierarchy, including a new polynomial-time algorithm and an NP-completeness proof. The polynomial-time algorithm involves biologically relevant edit operations such as pairing or unpairing nucleotides. It has been implemented in a software tool called gardenia, which is available at the Web server http://bioinfo.lifl.fr/RNA/gardenia.
Affiliation(s)
- Guillaume Blin
- Institut Gaspard Monge, UMR 8049 CNRS, Université Paris-Est, 5 Bd Descartes-Champs sur Marne, 77454 Marne La Vallée Cedex 2, France.
|
17
|
Darty K, Denise A, Ponty Y. VARNA: Interactive drawing and editing of the RNA secondary structure. Bioinformatics 2009; 25:1974-5. [PMID: 19398448 PMCID: PMC2712331 DOI: 10.1093/bioinformatics/btp250] [Citation(s) in RCA: 828] [Impact Index Per Article: 55.2] [Received: 12/15/2008] [Revised: 04/07/2009] [Accepted: 04/07/2009] [Indexed: 12/04/2022] Open
Abstract
DESCRIPTION: VARNA is a tool for the automated drawing, visualization and annotation of the secondary structure of RNA, designed as companion software for web servers and databases. FEATURES: VARNA implements four drawing algorithms, supports input/output using the classic formats dbn, ct, bpseq and RNAML, and exports the drawing in five picture formats, either pixel-based (JPEG, PNG) or vector-based (SVG, EPS and XFIG). It also allows manual modification and structural annotation of the resulting drawing, either through an interactive point-and-click approach, within a web server, or through command-line arguments. AVAILABILITY: VARNA is free software, released under the terms of the GPLv3.0 license and available at http://varna.lri.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kévin Darty
- LRI, UMR CNRS 8623, UMR CNRS 8621, Université Paris-Sud 11, F91405 Orsay cedex, France.
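To make the "dbn" input format named in the VARNA abstract more concrete, here is a minimal sketch of reading such a structure string into base pairs, assuming dbn denotes the usual dot-bracket notation with `(`, `)` and `.` only. The function name is hypothetical, and this is not VARNA's own parser, which is not described in the source.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return the base pairs (0-based indices) encoded by a dot-bracket string."""
    open_positions: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            # The most recent unmatched '(' is the partner of this ')'.
            pairs.append((open_positions.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    print(parse_dot_bracket("((..))"))
```

A stack suffices here because plain dot-bracket strings describe nested (pseudoknot-free) pairings; extended notations with additional bracket types would need one stack per bracket type.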
|
18
|
Djelloul M, Denise A. Automated motif extraction and classification in RNA tertiary structures. RNA 2008; 14:2489-2497. [PMID: 18957493 PMCID: PMC2590963 DOI: 10.1261/rna.1061108] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 03/05/2008] [Accepted: 08/15/2008] [Indexed: 05/27/2023]
Abstract
We used a novel graph-based approach to extract RNA tertiary motifs. We cataloged them all and clustered them using an innovative graph similarity measure. We applied our method to three widely studied structures: Haloarcula marismortui 50S (H.m 50S), Escherichia coli 50S (E. coli 50S), and Thermus thermophilus 16S (T.th 16S) RNAs. We identified 10 known motifs without any prior knowledge of their shapes or positions. We additionally identified four putative new motifs.
Affiliation(s)
- Mahassine Djelloul
- Laboratoire de Recherche en Informatique, Université Paris-Sud 11 and CNRS, 91405 Orsay Cedex, France
|
19
|
|
20
|
Namy O, Zhou Y, Gundllapalli S, Polycarpo CR, Denise A, Rousset JP, Söll D, Ambrogelly A. Adding pyrrolysine to the Escherichia coli genetic code. FEBS Lett 2007; 581:5282-8. [PMID: 17967457 DOI: 10.1016/j.febslet.2007.10.022] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 09/23/2007] [Accepted: 10/10/2007] [Indexed: 11/24/2022]
Abstract
Pyrrolysyl-tRNA synthetase and its cognate suppressor tRNA(Pyl) mediate pyrrolysine (Pyl) insertion at in-frame UAG codons. The presence of an RNA hairpin structure named Pyl insertion structure (PYLIS) downstream of the suppression site has been shown to stimulate the insertion of Pyl in archaea. Here we study the impact of the presence of PYLIS on the level of incorporation of Pyl and of the Pyl analog N-epsilon-cyclopentyloxycarbonyl-l-lysine (Cyc), using a quantitative lacZ-luc tandem reporter system in an Escherichia coli context. We show that PYLIS has no effect on the level of either Pyl or Cyc incorporation. Exogenously supplying our reporter system with d-ornithine significantly increases suppression efficiency, indicating that d-ornithine is a direct precursor to Pyl.
Affiliation(s)
- Olivier Namy
- Institut de Genetique et Microbiologie, Université Paris-Sud, CNRS UMR8621, Orsay F-91405, France
|
21
|
Abstract
SUMMARY: GenRGenS is a software tool dedicated to randomly generating genomic sequences and structures. It handles several classes of models useful for sequence analysis, such as Markov chains, hidden Markov models, weighted context-free grammars, regular expressions and PROSITE expressions. GenRGenS is the only program that can handle weighted context-free grammars, thus allowing the user to model and to generate structured objects (such as RNA secondary structures) of any given desired size. GenRGenS also allows the user to combine several of these different models at the same time.
Affiliation(s)
- Yann Ponty
- LRI, UMR CNRS 8623, Université Paris-Sud 11 F91405 Orsay cedex, France
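The simplest model class listed in the GenRGenS abstract, a Markov chain, can be illustrated with a short sketch of random sequence generation. This is a generic first-order chain written for illustration; the function name, the dictionary-based model description and the `seed` parameter are assumptions of this example and do not reflect GenRGenS's actual input format or interface.

```python
import random
from typing import Dict, Optional


def generate_markov_sequence(
    length: int,
    initial: Dict[str, float],
    transitions: Dict[str, Dict[str, float]],
    seed: Optional[int] = None,
) -> str:
    """Sample a sequence of the given length from a first-order Markov chain.

    `initial` maps each symbol to its starting weight; `transitions[s]`
    maps each possible successor of symbol `s` to its weight.
    """
    if length <= 0:
        return ""
    rng = random.Random(seed)
    symbols = list(initial)
    # Draw the first symbol from the initial distribution.
    sequence = [rng.choices(symbols, weights=[initial[s] for s in symbols])[0]]
    while len(sequence) < length:
        row = transitions[sequence[-1]]
        successors = list(row)
        # Draw the next symbol conditioned on the previous one.
        sequence.append(
            rng.choices(successors, weights=[row[s] for s in successors])[0]
        )
    return "".join(sequence)


if __name__ == "__main__":
    uniform = {base: 0.25 for base in "ACGU"}
    model = {base: dict(uniform) for base in "ACGU"}
    print(generate_markov_sequence(20, uniform, model, seed=0))
```

Weighted context-free grammars, which the abstract highlights, require a different sampling scheme (counting derivations per length before drawing) and are not covered by this sketch.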
|
22
|
Nédélec E, Moncion T, Gassiat E, Bossard B, Duchateau-Nguyen G, Denise A, Termier M. A pairwise alignment algorithm which favors clusters of blocks. J Comput Biol 2005; 12:33-47. [PMID: 15725732 DOI: 10.1089/cmb.2005.12.33] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Abstract
Pairwise sequence alignments aim to decide whether two sequences are related and, if so, to exhibit their related domains. Recent works have pointed out that a significant number of truly homologous sequences are missed when using classical comparison algorithms. This is the case when two homologous sequences share several small blocks of homology, each too small to lead to a significant score. On the other hand, classical alignment algorithms, when detecting homologies, may fail to recognize all the significant biological signals. The aim of the paper is to give a solution to these two problems. We propose a new scoring method which tends to increase the score of an alignment when "blocks" are detected. This so-called Block-Scoring algorithm, which makes use of dynamic programming, is worth using as a complementary tool to classical exact alignment methods. We validate our approach by applying it to a large set of biological data. Finally, we give a limit theorem for the score statistics of the algorithm.
Affiliation(s)
- Elodie Nédélec
- Laboratoire de Mathématiques, Equipe de Probabilités, Statistique et Modélisation, UMR CNRS 8628, Université Paris-Sud, 91405 Orsay Cedex, France
|
23
|
Bekaert M, Bidou L, Denise A, Duchateau-Nguyen G, Forest JP, Froidevaux C, Hatin I, Rousset JP, Termier M. Towards a computational model for -1 eukaryotic frameshifting sites. Bioinformatics 2003; 19:327-35. [PMID: 12584117 PMCID: PMC7109833 DOI: 10.1093/bioinformatics/btf868] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Indexed: 12/02/2022] Open
Abstract
MOTIVATION: Unconventional decoding events are now well acknowledged, but not yet well formalized. In this study, we present a bioinformatics analysis of eukaryotic -1 frameshifting, in order to model this event. RESULTS: A consensus model has already been established for -1 frameshifting sites. Our purpose here is to provide new constraints which make the model more precise. We show how a machine learning approach can be used to refine the current model. We identify new properties that may be involved in frameshifting. Each of the properties found was experimentally validated. Initially, we identify features of the overall model that are to be simultaneously satisfied. We then focus on the following two components: the spacer and the slippery sequence. As a main result, we point out that the identity of the primary structure of the so-called spacer is of great importance. AVAILABILITY: Sequences of the oligonucleotides in the functional tests are available at http://www.igmors.u-psud.fr/rousset/bioinformatics/.
Affiliation(s)
- Michaël Bekaert
- Génétique Moléculaire de la Traduction, UMR CNRS 8623, Université Paris-Sud, 91405 Orsay Cedex, France.
|
24
|
Bassères F, Collard M, Denise A. [Ocular motoricity and multiple sclerosis]. Rev Laryngol Otol Rhinol (Bord) 1974; 95 SUPPL:379-98. [PMID: 4445641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023]
|
25
|
Guerrier Y, Dejean Y, Denise A. [Osseous neoplasms of the upper jaw]. J Fr Otorhinolaryngol Audiophonol Chir Maxillofac (1967) 1971; 20:17-21. [PMID: 4252081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|
26
|
Guerrier Y, Dejean Y, Basseres F, Denise A. [Electronystagmographic exploration in children]. J Fr Otorhinolaryngol Audiophonol Chir Maxillofac (1967) 1969; 18:671-82. [PMID: 4249637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|
27
|
Guerrier Y, Bassens F, Auriacombe Y, Denise A, Ibos F, Saglier J. [Audiovestibular exploration with electronystagmography of topflight automobile racing drivers]. J Fr Otorhinolaryngol Audiophonol Chir Maxillofac (1967) 1969; 18:139-40. [PMID: 4245850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|
28
|
Basseres F, Guerrier Y, Dejean Y, Denise A. [Contribution to the study of sudden deafness]. J Fr Otorhinolaryngol Audiophonol Chir Maxillofac (1967) 1968; 17:421-8. [PMID: 4234618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|
29
|
Gros C, Guerrier Y, Dejean Y, Denise A. [Nontraumatic cerebrospinal rhinorrhea]. J Fr Otorhinolaryngol Audiophonol Chir Maxillofac (1967) 1968; 17:93-8 passim. [PMID: 4249519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|