1
Voß B. Classified Dynamic Programming in RNA Structure Analysis. Methods Mol Biol 2024; 2726:125-141. [PMID: 38780730 DOI: 10.1007/978-1-0716-3519-3_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Analysis of the folding space of RNA generally suffers from its exponential size. With classified Dynamic Programming algorithms, it is possible to alleviate this burden and to analyse the folding space of RNA in great depth. Key to classified DP is that the search space is partitioned into classes based on an on-the-fly computed feature. A class-wise evaluation is then used to compute class-wide properties, such as the lowest free energy structure for each class, or aggregate properties, such as the class' probability. In this paper we describe the well-known shape and hishape abstraction of RNA structures, their power to help better understand RNA function and related methods that are based on these abstractions.
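The idea of classified DP described in this abstract can be illustrated with a toy sketch. It is not the algorithm from the chapter: the energy model is replaced by a simple base-pair count, and the class feature is the number of hairpin loops (a hypothetical stand-in for a shape). Each table cell maps a class to a pair of values, the number of structures in the class and the best score found for it, so both an aggregate and an optimum are computed class-wise in a single pass.

```python
# Toy classified DP (illustrative assumption, not the method from the chapter).
# Class feature: number of hairpin loops. Score: number of base pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _add(acc, cls, count, pairs):
    """Merge one candidate into the class-wise accumulator."""
    old_count, old_best = acc.get(cls, (0, -1))
    acc[cls] = (old_count + count, max(old_best, pairs))


def classified_fold(seq, min_loop=3):
    """Return {hairpin_count: (number_of_structures, max_base_pairs)}."""
    n = len(seq)
    # table[i][j] describes all structures on seq[i:j] (half-open interval).
    table = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][i] = {0: (1, 0)}  # empty interval: one structure, no pairs
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            acc = {}
            # Case 1: position i stays unpaired.
            for cls, (count, pairs) in table[i + 1][j].items():
                _add(acc, cls, count, pairs)
            # Case 2: position i pairs with k; at least min_loop bases in between.
            for k in range(i + min_loop + 1, j):
                if (seq[i], seq[k]) not in PAIRS:
                    continue
                for c1, (n1, p1) in table[i + 1][k].items():
                    for c2, (n2, p2) in table[k + 1][j].items():
                        # A pair enclosing no further pairs closes a hairpin.
                        _add(acc, max(1, c1) + c2, n1 * n2, p1 + p2 + 1)
            table[i][j] = acc
    return table[0][n]


print(classified_fold("GGAAACC"))
```

The feature is computed on the fly inside the recurrence, so no structure is ever enumerated explicitly; only one entry per class and table cell is stored.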
Affiliation(s)
- Björn Voß
- RNA Biology and Bioinformatics, Institute of Biomedical Genetics, University of Stuttgart, Stuttgart, Germany
2
Huang J, Voß B. Simulation of Folding Kinetics for Aligned RNAs. Genes (Basel) 2021; 12:genes12030347. [PMID: 33652983 PMCID: PMC7996734 DOI: 10.3390/genes12030347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2021] [Revised: 02/18/2021] [Accepted: 02/22/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
Studying the folding kinetics of an RNA can provide insight into its function and is thus a valuable method for RNA analyses. Computational approaches to the simulation of folding kinetics suffer from the exponentially large folding space that needs to be evaluated. Here, we present a new approach that combines structure abstraction with evolutionary conservation to restrict the analysis to common parts of folding spaces of related RNAs. The resulting algorithm can recapitulate the folding kinetics known for single RNAs and is able to analyse even long RNAs in reasonable time. Our program RNAliHiKinetics is the first algorithm for the simulation of consensus folding kinetics and addresses a long-standing problem in a new and unique way.
Affiliation(s)
- Jiabin Huang
- Institute of Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20246 Hamburg, Germany
- Björn Voß
- Computational Biology Group, Institute of Biochemical Engineering, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
3
Grillone K, Riillo C, Scionti F, Rocca R, Tradigo G, Guzzi PH, Alcaro S, Di Martino MT, Tagliaferri P, Tassone P. Non-coding RNAs in cancer: platforms and strategies for investigating the genomic "dark matter". J Exp Clin Cancer Res 2020; 39:117. [PMID: 32563270 PMCID: PMC7305591 DOI: 10.1186/s13046-020-01622-x] [Citation(s) in RCA: 119] [Impact Index Per Article: 29.8] [Received: 05/06/2020] [Accepted: 06/11/2020] [Indexed: 12/18/2022] [Open Access]
Abstract
The discovery of the role of non-coding RNAs (ncRNAs) in the onset and progression of malignancies is a promising frontier of cancer genetics. It is clear that ncRNAs are candidates for therapeutic intervention, since they may act as biomarkers or key regulators of cancer gene network. Recently, profiling and sequencing of ncRNAs disclosed deep deregulation in human cancers mostly due to aberrant mechanisms of ncRNAs biogenesis, such as amplification, deletion, abnormal epigenetic or transcriptional regulation. Although dysregulated ncRNAs may promote hallmarks of cancer as oncogenes or antagonize them as tumor suppressors, the mechanisms behind these events remain to be clarified. The development of new bioinformatic tools as well as novel molecular technologies is a challenging opportunity to disclose the role of the "dark matter" of the genome. In this review, we focus on currently available platforms, computational analyses and experimental strategies to investigate ncRNAs in cancer. We highlight the differences among experimental approaches aimed to dissect miRNAs and lncRNAs, which are the most studied ncRNAs. These two classes indeed need different investigation taking into account their intrinsic characteristics, such as length, structures and also the interacting molecules. Finally, we discuss the relevance of ncRNAs in clinical practice by considering promises and challenges behind the bench to bedside translation.
Affiliation(s)
- Katia Grillone
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Caterina Riillo
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Medical and Translational Oncology Units, AOU Mater Domini, 88100 Catanzaro, Italy
- Francesca Scionti
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Roberta Rocca
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Net4science srl, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Giuseppe Tradigo
- Laboratory of Bioinformatics, Department of Medical and Surgical Sciences, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Pietro Hiram Guzzi
- Laboratory of Bioinformatics, Department of Medical and Surgical Sciences, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Stefano Alcaro
- Net4science srl, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Department of Health Sciences, Magna Græcia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Maria Teresa Di Martino
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Medical and Translational Oncology Units, AOU Mater Domini, 88100 Catanzaro, Italy
- Pierosandro Tagliaferri
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Medical and Translational Oncology Units, AOU Mater Domini, 88100 Catanzaro, Italy
- Pierfrancesco Tassone
- Laboratory of Translational Medical Oncology, Department of Experimental and Clinical Medicine, Magna Graecia University, Salvatore Venuta University Campus, 88100 Catanzaro, Italy
- Medical and Translational Oncology Units, AOU Mater Domini, 88100 Catanzaro, Italy
4
Tourasse NJ, Darfeuille F. Structural Alignment and Covariation Analysis of RNA Sequences. Bio Protoc 2020; 10:e3511. [PMID: 33654736 PMCID: PMC7842705 DOI: 10.21769/bioprotoc.3511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2019] [Revised: 12/29/2019] [Accepted: 12/29/2019] [Indexed: 11/02/2022] [Open Access]
Abstract
RNA molecules adopt defined structural conformations that are essential to exert their function. During the course of evolution, the structure of a given RNA can be maintained via compensatory base-pair changes that occur among covarying nucleotides in paired regions. Therefore, for comparative, structural, and evolutionary studies of RNA molecules, numerous computational tools have been developed to incorporate structural information into sequence alignments and a number of tools have been developed to study covariation. The bioinformatic protocol presented here explains how to use some of these tools to generate a secondary-structure-aware multiple alignment of RNA sequences and to annotate the alignment to examine the conservation and covariation of structural elements among the sequences.
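As a rough illustration of what covariation between two alignment columns means, the sketch below computes the mutual information of two columns. This is a common textbook measure and only an assumption here; it is not necessarily what the tools used in the protocol compute, and gap handling is deliberately left out.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two equally long alignment columns."""
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Two columns that switch together from G-C to A-U (a compensatory change).
print(mutual_information("GGAA", "CCUU"))
# Two fully conserved columns carry no covariation signal.
print(mutual_information("GGGG", "CCCC"))
```

Fully conserved pairs score zero under this measure, which is why covariation is usually inspected alongside conservation rather than instead of it.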
Affiliation(s)
- Nicolas J. Tourasse
- ARNA Laboratory, INSERM U1212, CNRS UMR5320, University of Bordeaux, Bordeaux, France
- Fabien Darfeuille
- ARNA Laboratory, INSERM U1212, CNRS UMR5320, University of Bordeaux, Bordeaux, France
5
Emamjomeh A, Zahiri J, Asadian M, Behmanesh M, Fakheri BA, Mahdevar G. Identification, Prediction and Data Analysis of Noncoding RNAs: A Review. Med Chem 2019; 15:216-230. [PMID: 30484409 DOI: 10.2174/1573406414666181015151610] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/12/2017] [Revised: 06/03/2018] [Accepted: 09/30/2018] [Indexed: 12/13/2022]
Abstract
BACKGROUND Noncoding RNAs (ncRNAs), which play an important role in various cellular processes, are important in medicine as well as in drug design strategies. Different studies have shown that ncRNAs are dysregulated in cancer cells and play an important role in human tumorigenesis. Therefore, it is important to identify and predict such molecules by experimental and computational methods, respectively. To avoid expensive experimental methods, computational algorithms have been developed for accurate and fast prediction of ncRNAs. OBJECTIVE The aim of this review was to introduce the experimental and computational methods used to identify ncRNAs and predict their structure. We also briefly explain the roles of ncRNAs in cellular processes and drug design. METHOD In this survey, we introduce ncRNAs and their roles in biological and medicinal processes. Then, some important laboratory techniques for identifying ncRNAs are studied. Finally, the state-of-the-art models and algorithms are introduced along with important tools and databases. RESULTS The results showed that the integration of experimental and computational approaches improves the identification of ncRNAs. Moreover, highly accurate databases, algorithms and tools for predicting ncRNAs were compared. CONCLUSION ncRNA prediction is an exciting research field, but it faces several difficulties. It requires accurate and reliable algorithms and tools. The computational costs of such algorithms, including running time and memory usage, are also very important. Finally, some suggestions are presented to improve computational methods for ncRNA gene and structure prediction.
Affiliation(s)
- Abbasali Emamjomeh
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, Iran
- Javad Zahiri
- Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Mehrdad Asadian
- Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Mehrdad Behmanesh
- Department of Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Barat A Fakheri
- Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Ghasem Mahdevar
- Department of Mathematics, Faculty of Sciences, University of Isfahan, Isfahan, Iran
6
Abstract
MOTIVATION Abstract shape analysis, first proposed in 2004, allows one to extract several relevant structures from the folding space of an RNA sequence, preferable to focusing on a single structure of minimal free energy. We report recent extensions to this approach. RESULTS We have rebuilt the original RNAshapes as a repository of components that allows us to integrate several established tools for RNA structure analysis: RNAshapes, RNAalishapes and pknotsRG, including its recent extension pKiss. As a spin-off, we obtain heretofore unavailable functionality: e.g. with pKiss, we can now perform abstract shape analysis for structures holding pseudoknots up to the complexity of kissing hairpin motifs. The new tool pAliKiss can predict kissing hairpin motifs from aligned sequences. Along with the integration, the functionality of the tools was also extended in manifold ways. AVAILABILITY AND IMPLEMENTATION As before, the tool is available on the Bielefeld Bioinformatics server at http://bibiserv.cebitec.uni-bielefeld.de/rnashapesstudio. CONTACT bibi-help@cebitec.uni-bielefeld.de.
Affiliation(s)
- Stefan Janssen
- Practical Computer Science, Faculty of Technology, Bielefeld University, D-33615 Bielefeld, Germany
- Robert Giegerich
- Practical Computer Science, Faculty of Technology, Bielefeld University, D-33615 Bielefeld, Germany
7
Abstract
Abstract shape analysis is a method to learn more about the complete Boltzmann ensemble of the secondary structures of a single RNA molecule. Abstract shapes classify competing secondary structures into classes that are defined by their arrangement of helices. It allows us to compute, in addition to the structure of minimal free energy, a set of structures that represents relevant and interesting structural alternatives. Furthermore, it allows us to compute probabilities of all structures within a shape class. This lets us ensure that our representative subset covers the complete Boltzmann ensemble, except for a portion of negligible probability. This chapter explains the main functions of abstract shape analysis, as implemented in the tool RNAshapes. It reports on some other types of analysis that are based on the abstract shapes idea and shows how you can solve novel problems by creating your own shape abstractions.
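A minimal sketch of the idea, under stated assumptions: `shape5` maps a dot-bracket string to its most abstract shape (helix nesting only, with helices merged across bulges and interior loops), and `shape_summary` groups an explicitly given list of structures by shape. The function names, the "_" symbol for the unstructured shape and the sample energies are illustrative choices, and probabilities are normalised over the supplied sample only, not over the full Boltzmann ensemble as RNAshapes does.

```python
import math
from collections import defaultdict

RT = 0.0019872 * 310.15  # gas constant times temperature, kcal/mol at 37 °C


def shape5(dot_bracket):
    """Abstract a dot-bracket structure to its helix nesting pattern."""
    root = []
    stack = [root]
    for ch in dot_bracket:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            stack.pop()
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure")

    def render(node):
        # Pairs with exactly one enclosed pair (stack, bulge, interior loop)
        # are merged into the same helix.
        while len(node) == 1:
            node = node[0]
        return "[" + "".join(render(child) for child in node) + "]"

    return "".join(render(child) for child in root) or "_"


def shape_summary(structures):
    """Map shape -> (probability within the sample, lowest-energy member)."""
    weights = defaultdict(float)
    best = {}
    for dot_bracket, energy in structures:
        shape = shape5(dot_bracket)
        weights[shape] += math.exp(-energy / RT)
        if shape not in best or energy < best[shape][1]:
            best[shape] = (dot_bracket, energy)
    total = sum(weights.values())
    return {shape: (weights[shape] / total, best[shape]) for shape in weights}


# Hypothetical structures and energies (kcal/mol), for illustration only.
sample = [("((((...))))", -3.0), ("(((.....)))", -2.0), ("((...))((...))", -1.0)]
for shape, (prob, rep) in shape_summary(sample).items():
    print(shape, round(prob, 3), rep)
```

The lowest-energy member of each class plays the role of a shape representative, and the summed Boltzmann weights give the class probability.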
8
Wiebe NJP, Meyer IM. TRANSAT-- method for detecting the conserved helices of functional RNA structures, including transient, pseudo-knotted and alternative structures. PLoS Comput Biol 2010; 6:e1000823. [PMID: 20589081 PMCID: PMC2891591 DOI: 10.1371/journal.pcbi.1000823] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 11/28/2009] [Accepted: 05/19/2010] [Indexed: 12/20/2022] [Open Access]
Abstract
The prediction of functional RNA structures has attracted increased interest, as it allows us to study the potential functional roles of many genes. RNA structure prediction methods, however, assume that there is a unique functional RNA structure and also do not predict functional features required for in vivo folding. In order to understand how functional RNA structures form in vivo, we require sophisticated experiments or reliable prediction methods. So far, there exist only a few, experimentally validated transient RNA structures. On the computational side, there exist several computer programs which aim to predict the co-transcriptional folding pathway in vivo, but these make a range of simplifying assumptions and do not capture all features known to influence RNA folding in vivo. We want to investigate if evolutionarily related RNA genes fold in a similar way in vivo. To this end, we have developed a new computational method, Transat, which detects conserved helices of high statistical significance. We introduce the method, present a comprehensive performance evaluation and show that Transat is able to predict the structural features of known reference structures including pseudo-knotted ones as well as those of known alternative structural configurations. Transat can also identify unstructured sub-sequences bound by other molecules and provides evidence for new helices which may define folding pathways, supporting the notion that homologous RNA sequence not only assume a similar reference RNA structure, but also fold similarly. Finally, we show that the structural features predicted by Transat differ from those assuming thermodynamic equilibrium. Unlike the existing methods for predicting folding pathways, our method works in a comparative way. 
This has the disadvantage of not being able to predict features as a function of time, but has the considerable advantage of highlighting conserved features and of not requiring a detailed knowledge of the cellular environment.
Many non-coding genes exert their function via an RNA structure which starts emerging while the RNA sequence is being transcribed from the genome. The resulting folding pathway is known to depend on a variety of features such as the transcription speed, the concentration of various ions and the binding of proteins and other molecules. Not all of these influences can be adequately captured by the existing computational methods which try to replicate what happens in vivo. So far, it has been challenging to experimentally investigate co-transcriptional folding pathways in vivo and only little data from in vitro experiments exists. In order to investigate if functionally similar RNA sequences from different organisms fold in a similar way, we have developed a new computational method, called Transat, which does not require the detailed computational modeling of the cellular environment. We show in a comprehensive analysis that our method is capable of detecting known structural features and provide evidence that structural features of the in vivo folding pathways have been conserved for several biologically interesting classes of RNA sequences.
Affiliation(s)
- Nicholas J. P. Wiebe
- Centre for High-Throughput Biology & Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Irmtraud M. Meyer
- Centre for High-Throughput Biology & Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
9
Voss B, Georg J, Schön V, Ude S, Hess WR. Biocomputational prediction of non-coding RNAs in model cyanobacteria. BMC Genomics 2009; 10:123. [PMID: 19309518 PMCID: PMC2662882 DOI: 10.1186/1471-2164-10-123] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Received: 08/29/2008] [Accepted: 03/23/2009] [Indexed: 01/15/2023] [Open Access]
Abstract
Background In bacteria, non-coding RNAs (ncRNA) are crucial regulators of gene expression, controlling various stress responses, virulence, and motility. Previous work revealed a relatively high number of ncRNAs in some marine cyanobacteria. However, for efficient genetic and biochemical analysis it would be desirable to identify a set of ncRNA candidate genes in model cyanobacteria that are easy to manipulate and for which extended mutant, transcriptomic and proteomic data sets are available. Results Here we have used comparative genome analysis for the biocomputational prediction of ncRNA genes and other sequence/structure-conserved elements in intergenic regions of the three unicellular model cyanobacteria Synechocystis PCC6803, Synechococcus elongatus PCC6301 and Thermosynechococcus elongatus BP1 plus the toxic Microcystis aeruginosa NIES843. The unfiltered numbers of predicted elements in these strains is 383, 168, 168, and 809, respectively, combined into 443 sequence clusters, whereas the numbers of individual elements with high support are 94, 56, 64, and 406, respectively. Removing also transposon-associated repeats, finally 78, 53, 42 and 168 sequences, respectively, are left belonging to 109 different clusters in the data set. Experimental analysis of selected ncRNA candidates in Synechocystis PCC6803 validated new ncRNAs originating from the fabF-hoxH and apcC-prmA intergenic spacers and three highly expressed ncRNAs belonging to the Yfr2 family of ncRNAs. Yfr2a promoter-luxAB fusions confirmed a very strong activity of this promoter and indicated a stimulation of expression if the cultures were exposed to elevated light intensities. Conclusion Comparison to entries in Rfam and experimental testing of selected ncRNA candidates in Synechocystis PCC6803 indicate a high reliability of the current prediction, despite some contamination by the high number of repetitive sequences in some of these species. 
In particular, we identified in the four species altogether 8 new ncRNA homologs belonging to the Yfr2 family of ncRNAs. Modelling of RNA secondary structures indicated two conserved single-stranded sequence motifs that might be involved in RNA-protein interactions or in the recognition of target RNAs. Since our analysis has been restricted to finding ncRNA candidates with a reasonably high degree of conservation among these four cyanobacteria, there might be many more, requiring direct experimental approaches for their identification.
Affiliation(s)
- Björn Voss
- University of Freiburg, Faculty of Biology, Genetics and Experimental Bioinformatics, Freiburg, Germany.
10
11
Steglich C, Futschik ME, Lindell D, Voss B, Chisholm SW, Hess WR. The challenge of regulation in a minimal photoautotroph: non-coding RNAs in Prochlorococcus. PLoS Genet 2008; 4:e1000173. [PMID: 18769676 PMCID: PMC2518516 DOI: 10.1371/journal.pgen.1000173] [Citation(s) in RCA: 118] [Impact Index Per Article: 7.4] [Received: 04/24/2008] [Accepted: 07/17/2008] [Indexed: 12/18/2022] [Open Access]
Abstract
Prochlorococcus, an extremely small cyanobacterium that is very abundant in the world's oceans, has a very streamlined genome. On average, these cells have about 2,000 genes and very few regulatory proteins. The limited capability of regulation is thought to be a result of selection imposed by a relatively stable environment in combination with a very small genome. Furthermore, only ten non-coding RNAs (ncRNAs), which play crucial regulatory roles in all forms of life, have been described in Prochlorococcus. Most strains also lack the RNA chaperone Hfq, raising the question of how important this mode of regulation is for these cells. To explore this question, we examined the transcription of intergenic regions of Prochlorococcus MED4 cells subjected to a number of different stress conditions: changes in light qualities and quantities, phage infection, or phosphorus starvation. Analysis of Affymetrix microarray expression data from intergenic regions revealed 276 novel transcriptional units. Among these were 12 new ncRNAs, 24 antisense RNAs (asRNAs), as well as 113 short mRNAs. Two additional ncRNAs were identified by homology, and all 14 new ncRNAs were independently verified by Northern hybridization and 5'RACE. Unlike its reduced suite of regulatory proteins, the number of ncRNAs relative to genome size in Prochlorococcus is comparable to that found in other bacteria, suggesting that RNA regulators likely play a major role in regulation in this group. Moreover, the ncRNAs are concentrated in previously identified genomic islands, which carry genes of significance to the ecology of this organism, many of which are not of cyanobacterial origin. Expression profiles of some of these ncRNAs suggest involvement in light stress adaptation and/or the response to phage infection consistent with their location in the hypervariable genomic islands.
MESH Headings
- DNA, Intergenic/chemistry
- DNA, Intergenic/genetics
- DNA, Intergenic/metabolism
- Gene Expression Regulation, Bacterial
- Genome, Bacterial
- Nucleic Acid Conformation
- Open Reading Frames
- Phototrophic Processes
- Prochlorococcus/chemistry
- Prochlorococcus/genetics
- Prochlorococcus/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- RNA, Untranslated/chemistry
- RNA, Untranslated/genetics
- RNA, Untranslated/metabolism
- Transcription, Genetic
12
Janssen S, Reeder J, Giegerich R. Shape based indexing for faster search of RNA family databases. BMC Bioinformatics 2008; 9:131. [PMID: 18312625 PMCID: PMC2277397 DOI: 10.1186/1471-2105-9-131] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 10/30/2007] [Accepted: 02/29/2008] [Indexed: 11/29/2022] [Open Access]
Abstract
Background Most non-coding RNA families exert their function by means of a conserved, common secondary structure. The Rfam database contains more than five hundred structurally annotated RNA families. Unfortunately, searching for new family members using covariance models (CMs) is very time consuming. Filtering approaches that use sequence conservation to reduce the number of CM searches are fast, but it is unknown at what cost. Results We present a new filtering approach, which exploits the family-specific secondary structure and significantly reduces the number of CM searches. The filter eliminates approximately 85% of the queries and discards only 2.6% of true positives when evaluating Rfam against itself. First results also capture previously undetected non-coding RNAs in a recent human RNAz screen. Conclusion The RNA shape index filter (RNAsifter) is based on the following rationale: An RNA family is characterised by structure, much more succinctly than by sequence content. Structures of individual family members, which naturally have different length and sequence composition, may exhibit structural variation in detail, but overall, they have a common shape in a more abstract sense. Given a fixed release of the Rfam database, we can compute these abstract shapes for all families. This is called a shape index. If a query sequence belongs to a certain family, it must be able to fold into the family shape with reasonable free energy. Therefore, rather than matching the query against all families in the database, we can first (and quickly) compute its feasible shape(s), and use the shape index to access only those families where a good match is possible due to a common shape with the query.
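The filtering rationale can be sketched as an inverted index from shapes to families. This is only an illustration of the lookup step, assuming the feasible shapes of the query have already been computed elsewhere; the family names and shape strings are hypothetical, and the real RNAsifter may store and rank shapes differently.

```python
from collections import defaultdict


def build_shape_index(family_shapes):
    """Invert {family: set of shapes} into {shape: set of families}."""
    index = defaultdict(set)
    for family, shapes in family_shapes.items():
        for shape in shapes:
            index[shape].add(family)
    return index


def candidate_families(index, query_shapes):
    """Families sharing at least one shape with the query; only these need a CM search."""
    hits = set()
    for shape in query_shapes:
        hits |= index.get(shape, set())
    return hits


# Hypothetical families and shapes, for illustration only.
family_shapes = {
    "family_A": {"[]"},
    "family_B": {"[][]", "[[][]]"},
    "family_C": {"[]", "[][]"},
}
index = build_shape_index(family_shapes)
print(sorted(candidate_families(index, {"[][]"})))
```

A query whose feasible shapes match no indexed family is rejected without running any covariance model, which is where the reduction in CM searches comes from.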
Affiliation(s)
- Stefan Janssen
- Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany.
13
Voss B, Gierga G, Axmann IM, Hess WR. A motif-based search in bacterial genomes identifies the ortholog of the small RNA Yfr1 in all lineages of cyanobacteria. BMC Genomics 2007; 8:375. [PMID: 17941988 PMCID: PMC2190773 DOI: 10.1186/1471-2164-8-375] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 03/29/2007] [Accepted: 10/17/2007] [Indexed: 11/21/2022] [Open Access]
Abstract
Background Non-coding RNAs (ncRNA) are regulators of gene expression in all domains of life. They control growth and differentiation, virulence, motility and various stress responses. The identification of ncRNAs can be a tedious process due to the heterogeneous nature of this molecule class and the missing sequence similarity of orthologs, even among closely related species. The small ncRNA Yfr1 has previously been found in the Prochlorococcus/Synechococcus group of marine cyanobacteria. Results Here we show that screening available genome sequences based on an RNA motif and followed by experimental analysis works successfully in detecting this RNA in all lineages of cyanobacteria. Yfr1 is an abundant ncRNA between 54 and 69 nt in size that is ubiquitous for cyanobacteria except for two low light-adapted strains of Prochlorococcus, MIT 9211 and SS120, in which it must have been lost secondarily. Yfr1 consists of two predicted stem-loop elements separated by an unpaired sequence of 16–20 nucleotides containing the ultraconserved undecanucleotide 5'-ACUCCUCACAC-3'. Conclusion Starting with an ncRNA previously found in a narrow group of cyanobacteria only, we show here the highly specific and sensitive identification of its homologs within all lineages of cyanobacteria, whereas it was not detected within the genome sequences of E. coli and of 7 other eubacteria belonging to the alpha-proteobacteria, chlorobiaceae and spirochaete. The integration of RNA motif prediction into computational pipelines for the detection of ncRNAs in bacteria appears as a promising step to improve the quality of such predictions.
Affiliation(s)
- Björn Voss
- University of Freiburg, Faculty of Biology, Experimental Bioinformatics, Schänzlestr. 1, D-79104 Freiburg, Germany.
14
Horesh Y, Doniger T, Michaeli S, Unger R. RNAspa: a shortest path approach for comparative prediction of the secondary structure of ncRNA molecules. BMC Bioinformatics 2007; 8:366. [PMID: 17908318 PMCID: PMC2147038 DOI: 10.1186/1471-2105-8-366] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 06/07/2007] [Accepted: 10/01/2007] [Indexed: 12/27/2022] [Open Access]
Abstract
Background In recent years, RNA molecules that are not translated into proteins (ncRNAs) have drawn a great deal of attention, as they were shown to be involved in many cellular functions. One of the most important computational problems regarding ncRNA is to predict the secondary structure of a molecule from its sequence. In particular, we attempted to predict the secondary structure for a set of unaligned ncRNA molecules that are taken from the same family, and thus presumably have a similar structure. Results We developed the RNAspa program, which comparatively predicts the secondary structure for a set of ncRNA molecules in linear time in the number of molecules. We observed that in a list of several hundred suboptimal minimal free energy (MFE) predictions, as provided by the RNAsubopt program of the Vienna package, it is likely that at least one suggested structure would be similar to the true, correct one. The suboptimal solutions of each molecule are represented as a layer of vertices in a graph. The shortest path in this graph is the basis for structural predictions for the molecule. We also show that RNA secondary structures can be compared very rapidly by a simple string Edit-Distance algorithm with a minimal loss of accuracy. We show that this approach allows us to more deeply explore the suboptimal structure space. Conclusion The algorithm was tested on three datasets which include several ncRNA families taken from the Rfam database. These datasets allowed for comparison of the algorithm with other methods. In these tests, RNAspa performed better than four other programs.
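The layered-graph idea can be sketched as follows, with several simplifying assumptions: each molecule contributes one layer of candidate dot-bracket strings, edges connect consecutive layers only, and edge weights are plain Levenshtein distances between the strings. The actual graph construction and weighting in RNAspa may differ, and the candidate structures below are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def shortest_layer_path(layers):
    """Pick one structure per layer so that the summed distance between
    consecutive picks is minimal. Returns (total_cost, chosen_structures)."""
    cost = [0] * len(layers[0])
    back = []
    for prev_layer, layer in zip(layers, layers[1:]):
        new_cost, pointers = [], []
        for s in layer:
            options = [cost[p] + edit_distance(t, s) for p, t in enumerate(prev_layer)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    end = min(range(len(cost)), key=cost.__getitem__)
    total = cost[end]
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return total, [layer[i] for layer, i in zip(layers, path)]


# Hypothetical suboptimal structures for three related molecules.
layers = [
    ["((..))", "(....)"],
    ["......", "((..))"],
    [".(..).", "((..))"],
]
print(shortest_layer_path(layers))
```

Because each layer is only compared with its predecessor, the work grows linearly with the number of molecules for a fixed number of candidates per layer, in line with the linear-time claim in the abstract.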
Affiliation(s)
- Yair Horesh
- Department of Computer Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Tirza Doniger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Shulamit Michaeli
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Ron Unger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
15
Axmann IM, Holtzendorff J, Voss B, Kensche P, Hess WR. Two distinct types of 6S RNA in Prochlorococcus. Gene 2007; 406:69-78. [PMID: 17640832 DOI: 10.1016/j.gene.2007.06.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 11/17/2006] [Revised: 02/12/2007] [Accepted: 06/04/2007] [Indexed: 11/21/2022]
Abstract
Different forms of the 6S non-coding RNA (ncRNA) exist in enterobacteria and in B. subtilis but there is only limited information about this RNA from other groups of bacteria. Prochlorococcus is an oceanic, ecologically important, cyanobacterium. It possesses the most streamlined genome within the cyanobacterial phylum, lacking many regulatory proteins and mechanisms well-known from other bacteria. Here we show the accumulation of two distinct types of 6S RNA in Prochlorococcus MED4. One of these RNAs is transcribed from a specific promoter located 23 nucleotides downstream the terminal codon of the purK gene, whereas the longer transcript is produced by processing from a purK-6S RNA precursor. The expression of both 6S transcripts is under diel control, reaching maxima during the day and minima coinciding with the S- and G2-like phases which are typical for synchronized cultures of this prokaryote. Based on data from four closely related Prochlorococcus strains and 11 environmental sequences from the Sargasso Sea, a previously unknown structural element is predicted within the 6S RNA 5' domain by comparative computational analysis. The divergent expression in synchronized cultures and unusual structural domains that were detected based on metagenomic data sets indicate that 6S RNA is an extremely important global regulator in these marine cyanobacteria.
Affiliation(s)
- Ilka M Axmann
- Humboldt University Berlin, Institute for Theoretical Biology, Invalidenstrasse 43, D-10115 Berlin, Germany
16
Jossinet F, Ludwig TE, Westhof E. RNA structure: bioinformatic analysis. Curr Opin Microbiol 2007; 10:279-85. [PMID: 17548241 DOI: 10.1016/j.mib.2007.05.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 01/24/2007] [Accepted: 05/23/2007] [Indexed: 01/30/2023]
Abstract
The range of functions ascribed to RNA molecules has grown considerably during recent years. Consequently, the analysis and comparison of RNA sequences have become recurrent tasks in molecular biology. Because the biological function of an RNA is expressed more by its folded architecture than by its sequence, original computational tools adapted to the multifaceted RNA functions have to be developed. Such tools, recently published, enable a user to solve classical problems related to RNA research: constructing 'structural' multiple alignments, inferring complete structures and structural motifs from RNA alignments, or searching structural homology in genomic databases.
Affiliation(s)
- Fabrice Jossinet
- Architecture et Réactivité de l'ARN, Université Louis Pasteur, Institut de Biologie Moléculaire et Cellulaire, CNRS, F-67084 Strasbourg, France