1
Shi S, Zhang XL, Zhao XL, Yang L, Du W, Wang YJ. Prediction of the RNA Secondary Structure Using a Multi-Population Assisted Quantum Genetic Algorithm. Hum Hered 2019; 84:1-8. [PMID: 31461710 DOI: 10.1159/000501480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/06/2019] [Accepted: 06/13/2019] [Indexed: 12/15/2022] Open
Abstract
Quantum-inspired genetic algorithms (QGAs) were recently introduced for the prediction of RNA secondary structures, and they showed some superiority over the existing popular strategies. In this paper, we introduce a new QGA for RNA secondary structure prediction, named the multi-population assisted quantum genetic algorithm (MAQGA). In contrast to existing QGAs, our strategy involves multiple populations that evolve together cooperatively in each iteration, and genetic exchange between populations is performed by an operator transfer operation. The numerical results show that the performance of existing genetic algorithms (evolutionary algorithms [EAs]), including traditional EAs and QGAs, can be significantly improved by using our approach. Moreover, for RNA sequences of short to medium length, the MAQGA improves on even state-of-the-art software in terms of both prediction accuracy and sensitivity.
Affiliation(s)
- Sha Shi
- Engineering Research Center of Molecular and Neuroimaging, Ministry of Education of China, and School of Life Science and Technology, Xidian University, Xi'an, China
- Xian-Li Zhao
- Northwestern Women and Children's Hospital, Xi'an, China
- Le Yang
- The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an Jiaotong University, Xi'an, China
- Wei Du
- The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yun-Jiang Wang
- The State Key Laboratory of Integrated Services Network (ISN), Xidian University, Xi'an, China
2
Rife Magalis B, Kosakovsky Pond SL, Summers MF, Salemi M. Evaluation of global HIV/SIV envelope gp120 RNA structure and evolution within and among infected hosts. Virus Evol 2018; 4:vey018. [PMID: 29951250 PMCID: PMC6014367 DOI: 10.1093/ve/vey018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] Open
Abstract
Lentiviral RNA genomes contain structural elements that play critical roles in viral replication. Although structural features of 5'-untranslated regions have been well characterized, attempts to identify important structures in other genomic regions by Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) have led to conflicting structural and mechanistic conclusions. Previous approaches accounted neither for sequence heterogeneity that is ubiquitous in viral populations, nor for selective constraints operating at the protein level. We developed an approach that augments SHAPE with phylogenetic analyses and applied it to investigate structure in coding regions (cRNA) within the HIV and SIV envelope genes. Analysis of single-genome SHAPE data with phylogenetic information from diverse lentiviral sequences argues against the conservation of a putative global gp120 RNA structure but points to the existence of core RNA sub-structures. Our findings establish a framework for considering sequence heterogeneity and protein function in de novo RNA structure inference approaches.
Affiliation(s)
- Brittany Rife Magalis
- Emerging Pathogens Institute and Department of Pathology, Immunology and Laboratory Medicine, University of Florida, Gainesville, FL, USA
- Institute for Genomics and Evolutionary Medicine and Department of Biology, Temple University, Philadelphia, PA, USA
- Sergei L Kosakovsky Pond
- Institute for Genomics and Evolutionary Medicine and Department of Biology, Temple University, Philadelphia, PA, USA
- Michael F Summers
- Howard Hughes Medical Institute and Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Marco Salemi
- Emerging Pathogens Institute and Department of Pathology, Immunology and Laboratory Medicine, University of Florida, Gainesville, FL, USA
3
Abstract
RNA structure is conserved by evolution to a greater extent than sequence. Predicting the conserved structure for multiple homologous sequences can be much more accurate than predicting the structure for a single sequence. RNAstructure is a software package that includes the programs Dynalign, Multilign, TurboFold, and PARTS for predicting conserved RNA secondary structure. This chapter provides protocols for using these programs.
4
Badr G, Al-Turaiki I, Turcotte M, Mathkour H. IncMD: incremental trie-based structural motif discovery algorithm. J Bioinform Comput Biol 2014; 12:1450027. [PMID: 25362841 DOI: 10.1142/s0219720014500279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
The discovery of common RNA secondary structure motifs is an important problem in bioinformatics. The presence of such motifs is usually associated with key biological functions. However, the identification of structural motifs is far from easy. Unlike motifs in sequences, which have conserved bases, structural motifs have common structure arrangements even if the underlying sequences are different. Over the past few years, hundreds of algorithms have been published for the discovery of sequential motifs, while less work has been done for the structural motifs case. Current structural motif discovery algorithms are limited in terms of accuracy and scalability. In this paper, we present an incremental and scalable algorithm for discovering RNA secondary structure motifs, namely IncMD. We consider structural motif discovery as a frequent pattern mining problem and tackle it using a modified Apriori algorithm. IncMD uses data structures, trie-based linked lists of prefixes (LLP), to accelerate the search and retrieval of patterns, support counting, and candidate generation. We modify the candidate generation step in order to adapt it to the RNA secondary structure representation. IncMD constructs the frequent patterns incrementally from RNA secondary structure basic elements, using nesting and joining operations. The notion of a motif group is introduced in order to simulate an alignment of motifs that only differ in the number of unpaired bases. In addition, we use a cluster beam approach to select motifs that will survive to the next iterations of the search. Results indicate that IncMD can perform better than some of the available structural motif discovery algorithms in terms of sensitivity (Sn), positive predictive value (PPV), and specificity (Sp). The empirical results also show that the algorithm is scalable and runs faster than all of the compared algorithms.
Affiliation(s)
- Ghada Badr
- College of Computer and Information Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia
- IRI - The City of Scientific Research and Technological Applications, University and Research District, P. O. 21934, New Borg Alarab, Alexandria, Egypt
5
Combinatorial Insights into RNA Secondary Structure. DISCRETE AND TOPOLOGICAL MODELS IN MOLECULAR BIOLOGY 2014. [DOI: 10.1007/978-3-642-40193-0_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
6
Badr G, Al-Turaiki I, Mathkour H. Classification and assessment tools for structural motif discovery algorithms. BMC Bioinformatics 2013; 14 Suppl 9:S4. [PMID: 23902564 PMCID: PMC3698030 DOI: 10.1186/1471-2105-14-s9-s4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND: Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be structural (e.g. when discovering RNA motifs). Finding common structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA motifs, which are sequentially conserved, RNA motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done for the structural case. METHODS: In this paper, we survey, classify, and compare different algorithms that solve the structural motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for structural motif discovery. RESULTS: Results show that the accuracy of discovered motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. CONCLUSIONS: We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.
Affiliation(s)
- Ghada Badr
- King Saud University, College of Computer and Information Sciences, Riyadh, Kingdom of Saudi Arabia
- IRI - The City of Scientific Research and Technological Applications, University and Research District, P.O. 21934 New Borg Alarab, Alexandria, Egypt
- Isra Al-Turaiki
- King Saud University, College of Computer and Information Sciences, Riyadh, Kingdom of Saudi Arabia
- Hassan Mathkour
- King Saud University, College of Computer and Information Sciences, Riyadh, Kingdom of Saudi Arabia
7
Pollom E, Dang KK, Potter EL, Gorelick RJ, Burch CL, Weeks KM, Swanstrom R. Comparison of SIV and HIV-1 genomic RNA structures reveals impact of sequence evolution on conserved and non-conserved structural motifs. PLoS Pathog 2013; 9:e1003294. [PMID: 23593004 PMCID: PMC3616985 DOI: 10.1371/journal.ppat.1003294] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 10/24/2012] [Accepted: 02/22/2013] [Indexed: 11/25/2022] Open
Abstract
RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1(NL4-3). One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1(NL4-3). Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average, the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1(NL4-3) also occur at the 5' polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve.
Affiliation(s)
- Elizabeth Pollom
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Kristen K. Dang
- Department of Biomedical Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- E. Lake Potter
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Robert J. Gorelick
- AIDS and Cancer Virus Program, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Christina L. Burch
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Kevin M. Weeks
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Ronald Swanstrom
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
8
Heuristic approaches to the optimization of acceptor systems in bulk heterojunction cells: a computational study. Theor Chem Acc 2012. [DOI: 10.1007/s00214-012-1191-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
9
Bernhart SH, Hofacker IL. From consensus structure prediction to RNA gene finding. BRIEFINGS IN FUNCTIONAL GENOMICS AND PROTEOMICS 2009; 8:461-71. [PMID: 19833701 DOI: 10.1093/bfgp/elp043] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
Reliable structure prediction is a prerequisite for most types of bioinformatical analysis of RNA. Since the accuracy of structure prediction from single sequences is limited, one often resorts to computing the consensus structure for a set of related RNA sequences. Since functionally important RNA structures are expected to evolve much more slowly than the underlying sequences, the pattern of sequence (co-)variation can be exploited to dramatically improve structure prediction. Since a conserved common structure is only expected when the RNA structure is under selective pressure, consensus structure prediction also provides an ideal starting point for the de novo detection of structured non-coding RNAs. Here, we review different strategies for the prediction of consensus secondary structures, and show how these approaches can be used to predict non-coding RNA genes.
Affiliation(s)
- Stephan H Bernhart
- Department of Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria.
10
Zou Q, Zhao T, Liu Y, Guo M. Predicting RNA secondary structure based on the class information and Hopfield network. Comput Biol Med 2009; 39:206-14. [DOI: 10.1016/j.compbiomed.2008.12.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 01/19/2008] [Revised: 10/28/2008] [Accepted: 12/16/2008] [Indexed: 11/24/2022]
11
Abraham M, Dror O, Nussinov R, Wolfson HJ. Analysis and classification of RNA tertiary structures. RNA (NEW YORK, N.Y.) 2008; 14:2274-89. [PMID: 18824509 PMCID: PMC2578864 DOI: 10.1261/rna.853208] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 10/10/2007] [Accepted: 07/05/2008] [Indexed: 05/19/2023]
Abstract
There is a fast-growing interest in noncoding RNA transcripts. These transcripts are not translated into proteins, but play essential roles in many cellular and pathological processes. Recent efforts toward comprehension of their function have led to a substantial increase in both the number and the size of solved RNA structures. With the aim of addressing questions relating to RNA structural diversity, we examined RNA conservation at three structural levels: primary, secondary, and tertiary structure. Additionally, we developed an automated method for classifying RNA structures based on spatial (three-dimensional [3D]) similarity. Applying the method to all solved RNA structures resulted in a classified database of RNA tertiary structures (DARTS). DARTS embodies 1333 solved RNA structures classified into 94 clusters. The classification is hierarchical, reflecting the structural relationship between and within clusters. We also developed an application for searching DARTS with a new structure. The search is fast and its performance was successfully tested on all solved RNA structures since the creation of DARTS. A user-friendly interface for both the database and the search application is available online. We show intracluster and intercluster similarities in DARTS and demonstrate the usefulness of the search application. The analysis reveals the current structural repertoire of RNA and exposes common global folds and local tertiary motifs. Further study of these conserved substructures may suggest possible RNA domains and building blocks. This should be beneficial for structure prediction and for gaining insights into structure-function relationships.
Affiliation(s)
- Mira Abraham
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
12
Mathews D. Predicting the secondary structure common to two RNA sequences with Dynalign. Curr Protoc Bioinformatics 2008; Chapter 12:Unit 12.4. [PMID: 18428718 DOI: 10.1002/0471250953.bi1204s08] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/11/2022]
Abstract
Dynalign is a dynamic programming algorithm for the simultaneous prediction of the lowest-free-energy secondary structure common to two RNA sequences and the alignment of the two sequences. It has been shown that the average accuracy of secondary structure prediction is improved using Dynalign, as compared to free-energy minimization of a single sequence. This unit provides protocols for using Dynalign on a Microsoft Windows platform as part of the RNAstructure package, and for compiling and using Dynalign on Unix/Linux computers.
Affiliation(s)
- David Mathews
- Center for Human Genetics and Molecular Pediatric Disease, Aab Institute of Biomedical Sciences, University of Rochester Medical Center, Rochester, New York, USA
13
Kierzek E, Kierzek R, Moss WN, Christensen SM, Eickbush TH, Turner DH. Isoenergetic penta- and hexanucleotide microarray probing and chemical mapping provide a secondary structure model for an RNA element orchestrating R2 retrotransposon protein function. Nucleic Acids Res 2008; 36:1770-82. [PMID: 18252773 PMCID: PMC2346776 DOI: 10.1093/nar/gkm1085] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022] Open
Abstract
LNA (locked nucleic acids, i.e. oligonucleotides with a methylene bridge between the 2′ oxygen and 4′ carbon of ribose) and 2,6-diaminopurine were incorporated into 2′-O-methyl RNA pentamer and hexamer probes to make a microarray that binds unpaired RNA approximately isoenergetically. That is, binding is roughly independent of target sequence if the target is unfolded. The isoenergetic binding and short probe length simplify interpretation of binding to a structured RNA to provide insight into target RNA secondary structure. Microarray binding and chemical mapping were used to probe the secondary structure of a 323 nt segment of the 5′ coding region of the R2 retrotransposon from Bombyx mori (R2Bm 5′ RNA). This R2Bm 5′ RNA orchestrates functioning of the R2 protein responsible for cleaving the second strand of DNA during insertion of the R2 sequence into the genome. The experimental results were used as constraints in a free energy minimization algorithm to provide an initial model for the secondary structure of the R2Bm 5′ RNA.
Affiliation(s)
- Elzbieta Kierzek
- Department of Chemistry, University of Rochester, RC Box 270216, Rochester, NY 14627-0216, USA
14
Wiese K, Deschenes A, Hendriks A. RnaPredict--an evolutionary algorithm for RNA secondary structure prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2008; 5:25-41. [PMID: 18245873 DOI: 10.1109/tcbb.2007.1054] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
This paper presents two in-depth studies on RnaPredict, an evolutionary algorithm for RNA secondary structure prediction. The first study is an analysis of the performance of two thermodynamic models, Individual Nearest Neighbor (INN) and Individual Nearest Neighbor Hydrogen Bond (INN-HB). The correlation between the free energy of predicted structures and the sensitivity is analyzed for 19 RNA sequences. Although some variance is shown, there is a clear trend between a lower free energy and an increase in true positive base pairs. With increasing sequence length, this correlation generally decreases. In the second experiment, the accuracy of the predicted structures for these 19 sequences is compared against the accuracy of the structures generated by the mfold dynamic programming algorithm (DPA) and also against known structures. RnaPredict is shown to outperform the minimum free energy structures produced by mfold and has comparable performance when compared to sub-optimal structures produced by mfold.
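The sensitivity measure mentioned in this abstract is based on true positive base pairs. The following is a minimal sketch of how such a score could be computed from dot-bracket strings; the function names and the dot-bracket input format are assumptions made for illustration and are not taken from RnaPredict, and the positive predictive value is included only as a commonly reported companion metric.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def sensitivity_and_ppv(predicted, known):
    """Compare a predicted structure with a known one (hypothetical helper).

    sensitivity = true positive pairs / pairs in the known structure
    ppv         = true positive pairs / pairs in the predicted structure
    """
    pred, ref = base_pairs(predicted), base_pairs(known)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0
    return sensitivity, ppv


# The prediction recovers one of the two known base pairs.
print(sensitivity_and_ppv("(....)", "((..))"))
```

This sketch handles only nested structures written with round brackets; pseudoknot notation would need additional bracket types.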
15
Chai D. RNA structure and modeling: progress and techniques. PROGRESS IN NUCLEIC ACID RESEARCH AND MOLECULAR BIOLOGY 2008; 82:71-100. [PMID: 18929139 DOI: 10.1016/s0079-6603(08)00003-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/15/2023]
Affiliation(s)
- Dinggeng Chai
- Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada
16
Michal S, Ivry T, Sipper M, Barash D, Schalit-Cohen O. Finding a common motif of RNA sequences using genetic programming: the GeRNAMo system. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2007; 4:596-610. [PMID: 17975271 DOI: 10.1109/tcbb.2007.1045] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 05/25/2023]
Abstract
We focus on finding a consensus motif of a set of homologous or functionally related RNA molecules. Recent approaches to this problem have been limited to simple motifs, require sequence alignment, and make prior assumptions concerning the data set. We use genetic programming to predict RNA consensus motifs based solely on the data set. Our system -- dubbed GeRNAMo (Genetic programming of RNA Motifs) -- predicts the most common motifs without sequence alignment and is capable of dealing with any motif size. Our program only requires the maximum number of stems in the motif, and if prior knowledge is available the user can specify other attributes of the motif (e.g., the range of the motif's minimum and maximum sizes), thereby increasing both sensitivity and speed. We describe several experiments using either ferritin iron response element (IRE); signal recognition particle (SRP); or microRNA sequences, showing that the most common motif is found repeatedly, and that our system offers substantial advantages over previous methods.
17
Horesh Y, Doniger T, Michaeli S, Unger R. RNAspa: a shortest path approach for comparative prediction of the secondary structure of ncRNA molecules. BMC Bioinformatics 2007; 8:366. [PMID: 17908318 PMCID: PMC2147038 DOI: 10.1186/1471-2105-8-366] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 06/07/2007] [Accepted: 10/01/2007] [Indexed: 12/27/2022] Open
Abstract
Background: In recent years, RNA molecules that are not translated into proteins (ncRNAs) have drawn a great deal of attention, as they were shown to be involved in many cellular functions. One of the most important computational problems regarding ncRNA is to predict the secondary structure of a molecule from its sequence. In particular, we attempted to predict the secondary structure for a set of unaligned ncRNA molecules that are taken from the same family, and thus presumably have a similar structure. Results: We developed the RNAspa program, which comparatively predicts the secondary structure for a set of ncRNA molecules in linear time in the number of molecules. We observed that in a list of several hundred suboptimal minimal free energy (MFE) predictions, as provided by the RNAsubopt program of the Vienna package, it is likely that at least one suggested structure would be similar to the true, correct one. The suboptimal solutions of each molecule are represented as a layer of vertices in a graph. The shortest path in this graph is the basis for structural predictions for the molecule. We also show that RNA secondary structures can be compared very rapidly by a simple string Edit-Distance algorithm with a minimal loss of accuracy. We show that this approach allows us to more deeply explore the suboptimal structure space. Conclusion: The algorithm was tested on three datasets which include several ncRNA families taken from the Rfam database. These datasets allowed for comparison of the algorithm with other methods. In these tests, RNAspa performed better than four other programs.
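The layered-graph idea in this abstract can be illustrated with a small sketch: each molecule contributes one layer of candidate structures, edges between consecutive layers are weighted by string edit distance, and a dynamic program finds the cheapest path that picks one candidate per layer. This is an illustration under stated assumptions, not the RNAspa implementation; the function names, the chain ordering of layers, and the use of unweighted Levenshtein distance on dot-bracket strings are all assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def shortest_path_selection(layers):
    """Pick one candidate per layer so that summed distances between
    consecutive picks are minimal. `layers` is a list of lists of
    dot-bracket strings, one list per molecule (hypothetical input format)."""
    cost = [0] * len(layers[0])   # best cost of a path ending at each candidate
    back = []                     # back-pointers, one list per layer transition
    for prev_layer, layer in zip(layers, layers[1:]):
        new_cost, pointers = [], []
        for s in layer:
            options = [cost[u] + edit_distance(p, s)
                       for u, p in enumerate(prev_layer)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    v = min(range(len(cost)), key=cost.__getitem__)
    total, path = cost[v], [v]
    for pointers in reversed(back):   # walk back-pointers to recover the path
        v = pointers[v]
        path.append(v)
    path.reverse()
    return total, [layer[i] for layer, i in zip(layers, path)]


layers = [["((..))", "(....)"],
          ["((..))", "......"],
          ["(....)", "((..))"]]
print(shortest_path_selection(layers))
```

Because each layer is processed once, the number of layer transitions grows linearly with the number of molecules, which matches the linear-time claim in the abstract; the cost per transition depends on how many suboptimal candidates each layer holds.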
Affiliation(s)
- Yair Horesh
- Department of Computer Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Tirza Doniger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Shulamit Michaeli
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Ron Unger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
18
Zhao J, Malmberg RL, Cai L. Rapid ab initio prediction of RNA pseudoknots via graph tree decomposition. J Math Biol 2007; 56:145-59. [PMID: 17906862 DOI: 10.1007/s00285-007-0124-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 01/01/2007] [Indexed: 11/29/2022]
Abstract
The prediction of RNA secondary structure including pseudoknots remains a challenge due to the intractable computation of the sequence conformation from nucleotide interactions under free energy models. Optimal algorithms often assume a restricted class for the predicted RNA structures and yet still require a high-degree polynomial time complexity, which is too expensive to use. Heuristic methods may yield time-efficient algorithms but they do not guarantee optimality of the predicted structure. This paper introduces a new and efficient algorithm for the prediction of RNA structure with pseudoknots for which the structure is not restricted. Novel prediction techniques are developed based on graph tree decomposition. In particular, based on a simplified energy model, stem overlapping relationships are defined with a graph, in which a specialized maximum independent set corresponds to the desired optimal structure. Such a graph is tree decomposable; dynamic programming over a tree decomposition of the graph leads to an efficient optimal algorithm. The final structure predictions are then based on re-ranking a list of suboptimal structures under a more comprehensive free energy model. The new algorithm is evaluated on a large number of RNA sequence sets taken from diverse resources. It demonstrates overall sensitivity and specificity that outperforms or is comparable with those of previous optimal and heuristic algorithms yet it requires significantly less time than the compared optimal algorithms.
Affiliation(s)
- Jizhen Zhao
- Department of Computer Science, University of Georgia, Athens, GA 30602, USA.
19
Machado-Lima A, del Portillo HA, Durham AM. Computational methods in noncoding RNA research. J Math Biol 2007; 56:15-49. [PMID: 17786447 DOI: 10.1007/s00285-007-0122-6] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Received: 01/01/2007] [Indexed: 11/26/2022]
Abstract
Non-protein-coding RNAs (ncRNAs) are a research hotspot in bioinformatics. Recent discoveries have revealed new ncRNA families performing a variety of roles, from gene expression regulation to catalytic activities. It is also believed that other families are still to be unveiled. Computational methods developed for protein-coding genes often fail when searching for ncRNAs. The functionality of noncoding RNAs is often heavily dependent on their secondary structure, which makes ncRNA gene discovery very different from the discovery of protein-coding genes. This motivated the development of specific methods for ncRNA research. This article reviews the main approaches used to identify ncRNAs and predict secondary structure.
Affiliation(s)
- Ariane Machado-Lima
- Institute of Mathematics and Statistics, University of Sao Paulo, Sao Paulo, SP, Brazil.
20
Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign. BMC Bioinformatics 2007; 8:130. [PMID: 17445273 PMCID: PMC1868766 DOI: 10.1186/1471-2105-8-130] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.9] [Received: 09/19/2006] [Accepted: 04/19/2007] [Indexed: 12/02/2022] Open
Abstract
Background: Joint alignment and secondary structure prediction of two RNA sequences can significantly improve the accuracy of the structural predictions. Methods addressing this problem, however, are forced to employ constraints that reduce computation by restricting the alignments and/or structures (i.e. folds) that are permissible. In this paper, a new methodology is presented for establishing alignment constraints based on nucleotide alignment and insertion posterior probabilities. Using a hidden Markov model, posterior probabilities of alignment and insertion are computed for all possible pairings of nucleotide positions from the two sequences. These alignment and insertion posterior probabilities are additively combined to obtain probabilities of co-incidence for nucleotide position pairs. A suitable alignment constraint is obtained by thresholding the co-incidence probabilities. The constraint is integrated with Dynalign, a free energy minimization algorithm for joint alignment and secondary structure prediction. The resulting method is benchmarked against the previous version of Dynalign and against other programs for pairwise RNA structure prediction.
Results: The proposed technique eliminates manual parameter selection in Dynalign and provides significant computational time savings in comparison to prior constraints in Dynalign while simultaneously providing a small improvement in the structural prediction accuracy. Savings are also realized in memory. In experiments over a 5S RNA dataset with average sequence length of approximately 120 nucleotides, the method reduces computation by a factor of 2. The method performs favorably in comparison to other programs for pairwise RNA structure prediction, yielding better accuracy, on average, and requiring significantly fewer computational resources.
Conclusion: Probabilistic analysis can be used to automate the determination of alignment constraints for pairwise RNA structure prediction methods in a principled fashion. These constraints can reduce the computational and memory requirements of these methods while maintaining or improving their accuracy of structural prediction. This extends the practical reach of these methods to longer sequences. The revised Dynalign code is freely available for download.
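The constraint construction described in this abstract (add the alignment and insertion posteriors, then threshold) can be sketched as follows. This is a minimal illustration and not the Dynalign code: the function name, the split into one alignment and two insertion posteriors, the toy values, and the threshold are all assumptions, and the posteriors are typed in by hand instead of being computed by a hidden Markov model.

```python
def coincidence_constraint(p_align, p_ins1, p_ins2, threshold):
    """Return the set of position pairs (i, k) permitted by the constraint.

    p_align[i][k]: assumed posterior that position i (sequence 1) aligns to k (sequence 2)
    p_ins1[i][k], p_ins2[i][k]: assumed insertion posteriors for the same pair
    """
    allowed = set()
    for i, row in enumerate(p_align):
        for k, p in enumerate(row):
            # Co-incidence probability: additive combination of the posteriors.
            p_co = p + p_ins1[i][k] + p_ins2[i][k]
            if p_co >= threshold:
                allowed.add((i, k))
    return allowed


# Hypothetical 2 x 2 toy posteriors (not taken from the paper).
p_align = [[0.90, 0.02], [0.03, 0.85]]
p_ins1 = [[0.04, 0.01], [0.01, 0.05]]
p_ins2 = [[0.03, 0.00], [0.02, 0.06]]
allowed = coincidence_constraint(p_align, p_ins1, p_ins2, threshold=0.05)
print(sorted(allowed))
```

Only the pairs in `allowed` would then need to be considered by the joint alignment and folding recursion, which is where the time and memory savings would come from.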
21
Abstract
Knowledge about classes of non-coding RNAs (ncRNAs) is growing very fast, and structure is the main characteristic shared by members of the same class. For correct characterization of such classes it is therefore of great importance to analyse the structural features in great detail. In this manuscript I present RNAlishapes, which combines various secondary structure analysis methods, such as suboptimal folding and shape abstraction, with a comparative approach known as RNA alignment folding. RNAlishapes makes use of an extended thermodynamic model and covariance scoring, which rewards covariation of paired bases. Applied with shape abstraction to a set of bacterial trp-operon leaders, the algorithm was able to identify the two alternating conformations of this attenuator. Besides providing in-depth analysis methods for aligned RNAs, the tool also shows fairly good prediction accuracy. Therefore, RNAlishapes provides the community with a powerful tool for structural analysis of classes of RNAs and is also a reasonable method for consensus structure prediction based on sequence alignments. RNAlishapes is available for online use and download at .
Affiliation(s)
- Björn Voss
- Experimental Bioinformatics, Institute of Biology II, Freiburg University, Schänzlestrasse 1, 79104 Freiburg, Germany.
22
Abstract
The cell has many ways to regulate the production of proteins. One mechanism is through changes to the machinery of translation initiation. These alterations favor the translation of one subset of mRNAs over another. It was first shown that internal ribosome entry sites (IRESes) within viral RNA genomes allowed the production of viral proteins more efficiently than most of the host proteins. The RNA secondary structure of viral IRESes has sometimes been conserved between viral species even though the primary sequences differ. These structures are important for IRES function, but no similar structure conservation has yet been shown in cellular IRESes. With the advances in mathematical modeling and computational approaches to complex biological problems, is there a way to predict an IRES in a data set of unknown sequences? This review examines what is known about cellular IRES structures, as well as the data sets and tools available to examine this question. We find that the lengths, number of upstream AUGs, and %GC content of 5'-UTRs of the human transcriptome have a similar distribution to those of published IRES-containing UTRs. Although the UTRs containing IRESes are on average longer, almost half of all 5'-UTRs are long enough to contain an IRES. Examination of the available RNA structure prediction software and RNA motif searching programs indicates that, while these programs are useful tools to fine-tune the empirically determined RNA secondary structure, accurate de novo secondary structure prediction of large RNA molecules, and the subsequent identification of new IRES elements by computational approaches, is still not possible.
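The three 5'-UTR features compared in this review (length, number of upstream AUGs, %GC) are simple to compute for a single sequence. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and it counts every AUG triplet in any reading frame, which may differ from how the review counted upstream AUGs.

```python
def utr_features(utr):
    """Length, AUG count, and %GC of a 5'-UTR given as an RNA or DNA string."""
    seq = utr.upper().replace("T", "U")  # accept DNA or RNA alphabets
    n = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n if n else 0.0
    # Count AUG triplets starting at every position (all reading frames).
    upstream_augs = sum(1 for i in range(n - 2) if seq[i:i + 3] == "AUG")
    return {"length": n, "upstream_augs": upstream_augs, "gc_percent": gc_percent}


features = utr_features("GGCAUGCCAUGA")  # hypothetical toy sequence
print(features)
```

Applying such a function to every annotated 5'-UTR and to a set of IRES-containing UTRs would give the distributions the review compares.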
Affiliation(s)
- Stephen D Baird
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ontario K1H 8M5, Canada
23
Mathews DH, Turner DH. Prediction of RNA secondary structure by free energy minimization. Curr Opin Struct Biol 2006; 16:270-8. [PMID: 16713706 DOI: 10.1016/j.sbi.2006.05.010] [Citation(s) in RCA: 247] [Impact Index Per Article: 13.0] [Received: 02/03/2006] [Revised: 05/02/2006] [Accepted: 05/10/2006] [Indexed: 10/24/2022]
Abstract
RNA secondary structure is often predicted from sequence by free energy minimization. Over the past two years, advances have been made in the estimation of folding free energy change, the mapping of secondary structure and the implementation of computer programs for structure prediction. The trends in computer program development are: efficient use of experimental mapping of structures to constrain structure prediction; use of statistical mechanics to improve the fidelity of structure prediction; inclusion of pseudoknots in secondary structure prediction; and use of two or more homologous sequences to find a common structure.
Affiliation(s)
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
24
25
Abstract
Scales in RNA, based on geometrical considerations, can be exploited for the analysis and prediction of RNA structures. By using spectral decomposition, geometric information that relates to a given RNA fold can be reduced to a single positive scalar number, the second eigenvalue of the Laplacian matrix corresponding to the tree-graph representation of the RNA secondary structure. Along with the free energy of the structure, being the most important scalar number in the prediction of RNA folding by energy minimization methods, the second eigenvalue of the Laplacian matrix can be used as an effective signature for locating a target folded structure given a set of RNA folds. Furthermore, the second eigenvector of the Laplacian matrix can be used to partition large RNA structures into smaller fragments. An illustrative example is given for the use of the second eigenvalue to predict mutations that may cause structural rearrangements, thereby disrupting stable motifs.
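The scalar described in this abstract, the second eigenvalue of the Laplacian of a tree graph, can be computed as in the sketch below. This is a hedged illustration, not the author's code: the two four-vertex trees are hypothetical stand-ins for tree-graph representations of RNA folds, and the eigenvalues are obtained with a small hand-written Jacobi rotation routine so that the example needs only the standard library.

```python
import math


def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on n vertices."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def symmetric_eigenvalues(A, tol=1e-18, max_sweeps=50):
    """Sorted eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                # Rotation angle chosen so that the (p, q) entry becomes zero.
                tau = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
                t = (1.0 if tau >= 0 else -1.0) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def second_eigenvalue(n, edges):
    """Second-smallest Laplacian eigenvalue of the graph."""
    return symmetric_eigenvalues(laplacian(n, edges))[1]


chain = [(0, 1), (1, 2), (2, 3)]  # hypothetical linear tree
star = [(0, 1), (0, 2), (0, 3)]   # hypothetical branched tree
print(second_eigenvalue(4, chain), second_eigenvalue(4, star))
```

For these toy trees the linear chain gives a smaller second eigenvalue than the star, which is the kind of difference that lets the value act as a compactness signature when comparing candidate folds.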
Affiliation(s)
- Danny Barash
- Genome Diversity Center, Institute of Evolution, University of Haifa, Mount Carmel, Haifa, Israel.
26
Liu J, Wang JTL, Hu J, Tian B. A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics 2005; 6:89. [PMID: 15817128 PMCID: PMC1090556 DOI: 10.1186/1471-2105-6-89] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 12/04/2004] [Accepted: 04/07/2005] [Indexed: 11/17/2022]
Abstract
Background: Alignment of RNA secondary structures is important in studying functional RNA motifs. In recent years, much progress has been made in RNA motif finding and structure alignment. However, existing tools either require a large number of prealigned structures or suffer from high time complexities. This makes it difficult for the tools to process RNAs whose prealigned structures are unavailable or to process very large RNA structure databases.
Results: We present here an efficient tool called RSmatch for aligning RNA secondary structures and for motif detection. Motivated by widely used algorithms for RNA folding, we decompose an RNA secondary structure into a set of atomic structure components that are further organized by a tree model to capture the structural particularities. RSmatch can find the optimal global or local alignment between two RNA secondary structures using two scoring matrices, one for single-stranded regions and the other for double-stranded regions. The time complexity of RSmatch is O(mn), where m is the size of the query structure and n that of the subject structure. When applied to searching a structure database, RSmatch can find similar RNA substructures, and is capable of conducting multiple structure alignment and iterative database search. Therefore it can be used to identify functional RNA motifs. The accuracy of RSmatch is tested by experiments using a number of known RNA structures, including simple stem-loops and complex structures containing junctions.
Conclusion: With respect to computing efficiency and accuracy, RSmatch compares favorably with other tools for RNA structure alignment and motif detection. This tool should be useful to researchers interested in comparing RNA structures obtained from wet lab experiments or RNA folding programs, particularly when the size of the structure dataset is large.
Affiliation(s)
- Jianghui Liu
- Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07101, USA
- Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA
- Jason TL Wang
- Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA
- Jun Hu
- Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07101, USA
- Bin Tian
- Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07101, USA
27
Taneda A. Cofolga: a genetic algorithm for finding the common folding of two RNAs. Comput Biol Chem 2005; 29:111-9. [PMID: 15833439 DOI: 10.1016/j.compbiolchem.2005.02.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 12/21/2004] [Revised: 02/22/2005] [Accepted: 02/22/2005] [Indexed: 10/25/2022]
Abstract
Accurate secondary structure prediction is useful for predicting non-coding RNA genes and functions on the basis of genome sequences. Although single-sequence folding programs such as mfold have been successful, it is of great importance to develop a novel approach for further improvement of the prediction performance. In the present paper, a secondary structure prediction method based on a genetic algorithm, Cofolga, is proposed. The program performs folding and alignment of two homologous RNAs simultaneously. Cofolga was tested with a dataset composed of 13 tRNAs, seven 5S rRNAs, five RNase P RNAs, and five SRP RNAs. The average prediction accuracies for the tRNAs, 5S rRNAs, RNase P RNAs, and SRP RNAs obtained by Cofolga with an optimal weight factor and default parameters were 83.6, 81.8, 73.5, and 67.7%, respectively. These results were superior to those obtained by single-sequence folding based on free-energy minimization, for which the corresponding average prediction accuracies were 52.4, 47.4, 57.7, and 52.3%, respectively. Cofolga has a post-processing step in which single-sequence folding is performed after fixation of a predicted common structure; this post-processing enables Cofolga to predict a structure that is present in only one of the two RNAs. The executable files of Cofolga (for Windows/Unix/Mac) can be obtained by an e-mail request.
Affiliation(s)
- Akito Taneda
- Department of Electronic and Information System Engineering, Faculty of Science and Technology, Hirosaki University, Hirosaki 036-8561, Japan.
28
Mathews DH. Predicting a set of minimal free energy RNA secondary structures common to two sequences. Bioinformatics 2005; 21:2246-53. [PMID: 15731207 DOI: 10.1093/bioinformatics/bti349] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.3] [Indexed: 11/12/2022]
Abstract
Motivation: Function derives from structure; therefore, there is a need for methods to predict functional RNA structures.
Results: The Dynalign algorithm, which predicts the lowest free energy secondary structure common to two unaligned RNA sequences, is extended to the prediction of a set of low-energy structures. Dot plots can be drawn to show all base pairs in structures within an energy increment. Dynalign predicts more well-defined structures than structure prediction using a single sequence; in 5S rRNA sequences, the average number of base pairs in structures with energy within 20% of the lowest energy structure is 317 using Dynalign, but 569 using a single sequence. Structure prediction with Dynalign can also be constrained according to experiment or comparative analysis. The accuracy of Dynalign, measured as sensitivity and positive predictive value, is greater than that of predictions with a single sequence.
Availability: Dynalign can be downloaded at http://rna.urmc.rochester.edu
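The two accuracy measures named in this abstract can be computed from base-pair sets as in the sketch below. It is a simplified illustration: the function names and the dot-bracket example structures are hypothetical, and a predicted pair counts as correct only if it matches a known pair exactly, whereas some published benchmarks also accept pairs shifted by one nucleotide.

```python
def pairs_from_dot_bracket(structure):
    """Set of (i, j) base pairs from a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def sensitivity_ppv(predicted, known):
    """Sensitivity = TP / known pairs; PPV = TP / predicted pairs."""
    pred = pairs_from_dot_bracket(predicted)
    ref = pairs_from_dot_bracket(known)
    true_pos = len(pred & ref)
    sensitivity = true_pos / len(ref) if ref else 0.0
    ppv = true_pos / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical hairpin: the prediction misses the innermost known pair.
sens, ppv = sensitivity_ppv(predicted="(((......)))", known="((((....))))")
print(sens, ppv)
```

Reporting both values matters because a method can raise sensitivity simply by predicting more pairs, which the positive predictive value then penalizes.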
Affiliation(s)
- David H Mathews
- Center for Human Genetics and Molecular Pediatric Disease, University of Rochester Medical Center, 601 Elmwood Avenue, Box 703, Rochester, NY 14642, USA.
29
Abstract
We present a tool for the prediction of conserved secondary structure elements of a family of homologous non-coding RNAs. Our method does not require any prior multiple sequence alignment. Thus, it successfully applies to datasets with low primary structure similarity. The functionality is demonstrated using three example datasets: sequences of RNase P RNAs, ciliate telomerases and enterovirus messenger RNAs. CARNAC has a web server that can be accessed at the URL http://bioinfo.lifl.fr/carnac.
Affiliation(s)
- Hélène Touzet
- Laboratoire d'Informatique Fondamentale de Lille, UMR CNRS 8022, Université des Sciences et Technologies de Lille, France
30
Ruan J, Stormo GD, Zhang W. ILM: a web server for predicting RNA secondary structures with pseudoknots. Nucleic Acids Res 2004; 32:W146-9. [PMID: 15215368 PMCID: PMC441582 DOI: 10.1093/nar/gkh444] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Indexed: 02/03/2023]
Abstract
The ILM web server provides a web interface to two algorithms, iterated loop matching and maximum weighted matching, for efficiently predicting RNA secondary structures with pseudoknots. The algorithms can utilize either thermodynamic or comparative information or both, and thus can work on both aligned and individual sequences. Predicted secondary structures are presented in several formats compatible with a variety of existing visualization tools. The service can be accessed at http://cic.cs.wustl.edu/RNA/.
Affiliation(s)
- Jianhua Ruan
- Department of Computer Science and Engineering, Washington University in St Louis, St Louis, MO 63130, USA
31
Doshi KJ, Cannone JJ, Cobaugh CW, Gutell RR. Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction. BMC Bioinformatics 2004; 5:105. [PMID: 15296519 PMCID: PMC514602 DOI: 10.1186/1471-2105-5-105] [Citation(s) in RCA: 168] [Impact Index Per Article: 8.0] [Received: 02/01/2004] [Accepted: 08/05/2004] [Indexed: 11/16/2022]
Abstract
Background: A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1.
Results: The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases.
Conclusion: Our analysis indicates that the current set of nearest-neighbor energy parameters, in conjunction with the Mfold folding algorithm, is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.
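The contact-distance effect reported in this abstract can be examined with a calculation like the sketch below. It rests on assumptions that the abstract does not spell out: accuracy is taken to be the fraction of reference (comparative) base pairs that the prediction recovers, contact distance is taken as the number of nucleotides between the two paired positions, and the function name and toy pair sets are hypothetical.

```python
def accuracy_by_contact_distance(reference, predicted, cutoff=100):
    """Fraction of reference pairs recovered, split at a contact-distance cutoff."""
    counts = {"short": [0, 0], "long": [0, 0]}  # [recovered, total] per class
    for i, j in reference:
        distance = j - i - 1  # nucleotides between the 5' and 3' partners
        key = "short" if distance < cutoff else "long"
        counts[key][1] += 1
        if (i, j) in predicted:
            counts[key][0] += 1
    # None marks a class that contains no reference pairs.
    return {k: (hit / total if total else None) for k, (hit, total) in counts.items()}


# Hypothetical toy data: two short-range and two long-range reference pairs.
reference = {(1, 20), (2, 19), (5, 300), (6, 299)}
predicted = {(1, 20), (2, 19), (5, 300), (40, 60)}
result = accuracy_by_contact_distance(reference, predicted)
print(result)
```

Using finer distance bins in place of a single cutoff would give the accuracy-versus-distance curve that the abstract describes as decaying exponentially.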
MESH Headings
- Base Sequence
- Computational Biology/methods
- Computational Biology/standards
- Entropy
- Models, Genetic
- Nucleic Acid Conformation
- Phylogeny
- Predictive Value of Tests
- RNA/chemistry
- RNA, Archaeal/chemistry
- RNA, Bacterial/chemistry
- RNA, Chloroplast/chemistry
- RNA, Mitochondrial
- RNA, Ribosomal, 16S/chemistry
- RNA, Ribosomal, 23S/chemistry
- RNA, Ribosomal, 5S/chemistry
- Thermodynamics
Affiliation(s)
- Kishore J Doshi
- The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
- Jamie J Cannone
- The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
- Christian W Cobaugh
- The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
- Robin R Gutell
- The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
32
Wiese KC, Glen E. A permutation-based genetic algorithm for the RNA folding problem: a critical look at selection strategies, crossover operators, and representation issues. Biosystems 2004; 72:29-41. [PMID: 14642657 DOI: 10.1016/s0303-2647(03)00133-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
Abstract
This paper presents a Genetic Algorithm (GA) to predict the secondary structure of RNA molecules, where the secondary structure is encoded as a permutation. More specifically, the proposed algorithm predicts which specific canonical base pairs will form hydrogen bonds and build helices, also known as stems. Since RNA is involved in both transcription and translation and also has catalytic and structural roles in the cell, determining the structure of RNA is of fundamental importance in helping to determine RNA function. We introduce a GA where a permutation is used to encode the secondary structure of RNA molecules. We discuss results on RNA sequences of lengths 76, 210, 681, and 785 nucleotides and present several improvements to our algorithm. We show that the Keep-Best Reproduction operator has similar benefits as in the traveling salesman problem domain. In addition, a comparison of several crossover operators is provided. We also compare the results of the permutation-based GA with a binary GA, demonstrating the benefits of the newly proposed representation.
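One way a permutation can encode a secondary structure is sketched below. The abstract does not describe the decoding step, so this is an assumption made for illustration: each chromosome is a permutation of candidate helix indices, and a greedy pass keeps every helix that does not reuse a nucleotide already paired by an earlier helix. The helix list and function name are hypothetical, and the sketch does not check for pseudoknots.

```python
def decode(permutation, helices):
    """Greedily select compatible helices in the order given by the permutation."""
    used, kept = set(), []
    for idx in permutation:
        positions = {p for pair in helices[idx] for p in pair}
        if used.isdisjoint(positions):  # no nucleotide is paired twice
            used |= positions
            kept.append(idx)
    return kept


# Hypothetical candidate helices, each a list of (i, j) base pairs.
helices = [
    [(0, 20), (1, 19), (2, 18)],  # helix 0
    [(2, 12), (3, 11), (4, 10)],  # helix 1, shares nucleotide 2 with helix 0
    [(13, 17)],                   # helix 2, compatible with both
]
print(decode([0, 1, 2], helices), decode([1, 0, 2], helices))
```

Under this encoding, crossover and mutation operators designed for permutations (as in the traveling salesman problem) can be reused unchanged, since every permutation decodes to a valid set of helices.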
Affiliation(s)
- Kay C Wiese
- Information Technology, Simon Fraser University, 2400 Central City, 10153 King George Highway, Surrey, BC, Canada V3T 2W1.
33
Ruschak AM, Mathews DH, Bibillo A, Spinelli SL, Childs JL, Eickbush TH, Turner DH. Secondary structure models of the 3' untranslated regions of diverse R2 RNAs. RNA (New York, N.Y.) 2004; 10:978-87. [PMID: 15146081 PMCID: PMC1370589 DOI: 10.1261/rna.5216204] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Received: 10/22/2003] [Accepted: 03/10/2004] [Indexed: 05/19/2023]
Abstract
The RNA structure of the 3' untranslated region (UTR) of the R2 retrotransposable element is recognized by the R2-encoded reverse transcriptase in a reaction called target primed reverse transcription (TPRT). To provide insight into structure-function relationships important for TPRT, we have created alignments that reveal the secondary structure for 22 Drosophila and five silkmoth 3' UTR R2 sequences. In addition, free energy minimization has been used to predict the secondary structure for the 3' UTR R2 RNA of Forficula auricularia. The predicted structures for Bombyx mori and F. auricularia are consistent with chemical modification data obtained with beta-ethoxy-alpha-ketobutyraldehyde (kethoxal), dimethyl sulfate, and 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluene sulfonate. The structures appear to have common helices that are likely important for function.
Affiliation(s)
- Amy M Ruschak
- Department of Chemistry, University of Rochester, Rochester, New York 14627-0216, USA
34
Luedtke NW, Liu Q, Tor Y. Synthesis, photophysical properties, and nucleic acid binding of phenanthridinium derivatives based on ethidium. Bioorg Med Chem 2003; 11:5235-47. [PMID: 14604688 DOI: 10.1016/j.bmc.2003.08.006] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 11/27/2022]
Abstract
A series of substituted phenanthridine derivatives has been synthesized by converting the amines at the 3- and 8-positions of ethidium bromide into guanidine, pyrrole, urea, and various substituted ureas. The resulting derivatives exhibit unique spectral properties that change upon binding nucleic acids. The compounds were analyzed for their ability to inhibit the HIV-1 Rev-Rev Response Element (RRE) interaction, as well as for their affinity to calf thymus DNA. One derivative (3,8-bis-urea-ethylenediamine-5-ethyl-6-phenylphenanthridinium trifluoroacetate) has an enhanced affinity and specificity for HIV-1 RRE as compared to ethidium bromide. These results indicate that the nucleic acid affinity and specificity of an intercalating agent can be tuned by synthetic modification of its exocyclic amines.
Affiliation(s)
- Nathan W Luedtke
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0358, USA.
35
Luedtke NW, Tor Y. Fluorescence-based methods for evaluating the RNA affinity and specificity of HIV-1 Rev-RRE inhibitors. Biopolymers 2003; 70:103-19. [PMID: 12925996 DOI: 10.1002/bip.10428] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.1] [Indexed: 01/28/2023]
Abstract
RNA plays a pivotal role in the replication of all organisms, including viral and bacterial pathogens. The development of small molecules that selectively interfere with undesired RNA activity is a promising new direction for drug design. Currently, there are no anti-HIV treatments that target nucleic acids. This article presents the HIV-1 Rev response element (RRE) as an important focus for the development of antiviral agents that target RNA. The Rev binding site on the RRE is highly conserved, even between different groups of HIV-1 isolates. Compounds that inhibit HIV replication by binding to the RRE and displacing Rev are therefore expected to retain activity across groups of genetically diverse HIV infections. Systematic evaluations of both the RRE affinity and specificity of numerous small molecule inhibitors are essential for deciphering the parameters that govern effective RRE recognition. This article discusses fluorescence-based techniques that are useful for probing a small molecule's RRE affinity and its ability to inhibit Rev-RRE binding. Rev displacement experiments can be conducted by observing the fluorescence anisotropy of a fluorescein-labeled Rev peptide, or by quantifying its displacement from a solid-phase immobilized RRE. Experiments conducted in the presence of competing nucleic acids are useful for evaluating the RRE specificity of Rev-RRE inhibitors. The discovery and characterization of new RRE ligands are described. Eilatin is a polycyclic aromatic heterocycle that has at least one binding site on the RRE (apparent Kd is approximately 0.13 µM), but it does not displace Rev upon binding the RRE (IC50 > 3 µM). In contrast, ethidium bromide and two eilatin-containing metal complexes show better consistency between their RRE affinity and their ability to displace a fluorescent Rev peptide from the RRE.
These results highlight the importance of conducting orthogonal binding assays that establish both the RNA affinity of a small molecule and its ability to inhibit the function of the RNA target. Some Rev-RRE inhibitors, including ethidium bromide, Λ-[Ru(bpy)2eilatin]2+, and Δ-[Ru(bpy)2eilatin]2+, also inhibit HIV-1 gene expression in cell cultures (IC50 = 0.2-3 µM). These (and similar) results should facilitate the future discovery and implementation of anti-HIV drugs that are targeted to viral RNA sites. In addition, a deeper general understanding of RNA-small molecule recognition will assist in the effective targeting of other therapeutically important RNA sites.
Affiliation(s)
- Nathan W Luedtke
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0358, USA
36
Mahara A, Iwase R, Sakamoto T, Yamaoka T, Yamana K, Murakami A. Detection of acceptor sites for antisense oligonucleotides on native folded RNA by fluorescence spectroscopy. Bioorg Med Chem 2003; 11:2783-90. [PMID: 12788352 DOI: 10.1016/s0968-0896(03)00227-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 11/17/2022]
Abstract
The antisense strategy has high potential for curing diseases and studying gene functions by suppressing the translation step. For this strategy, it is essential to detect acceptor sites of antisense molecules on mRNA under physiological conditions. We propose a new analytical method for the detection of acceptor sites of antisense molecules with high sensitivity. A 2'-O-methyloligoribonucleotide containing 2'-O-(1-pyrenylmethyl)uridine (OMUpy) was chosen as the fluorescence probe. Fluorescence due to the pyrene in single-stranded OMUpy was scarcely observed. When OMUpy was hybridized with the complementary oligoRNA, the fluorescence intensity at 375 nm increased remarkably. Measurements of time-resolved fluorescence spectroscopy, CD and UV absorption spectra showed that the increase was derived from the localization of the pyrene. These results suggest that the change in the fluorescence intensity of OMUpy can be a useful index to monitor hybridization. In this study, we chose Escherichia coli 16S-rRNA as the model RNA and chose seven regions for probing by OMUpy based on the reported secondary structure of 16S-rRNA. The fluorescence intensity of an equimolar mixture of OMUpy with 16S-rRNA varied depending on the sequence. In particular, the increment in the system of OMUpy-8, which can hybridize with region 887-896 nt of 16S-rRNA, was the most significant among the systems. These results indicate that the site targeted by OMUpy-8 was exposed to regulatory molecules, and suggest that the method presented here is useful for designing antisense molecules.
Affiliation(s)
- Atsushi Mahara
- Department of Polymer Science and Engineering, Kyoto Institute of Technology, Matsugasaki, Kyoto 606-8585, Japan
37
Knudsen B, Hein J. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Res 2003; 31:3423-8. [PMID: 12824339 PMCID: PMC169020 DOI: 10.1093/nar/gkg614] [Citation(s) in RCA: 285] [Impact Index Per Article: 13.0] [Indexed: 11/14/2022]
Abstract
RNA secondary structures are important in many biological processes and efficient structure prediction can give vital directions for experimental investigations. Many available programs for RNA secondary structure prediction only use a single sequence at a time. This may be sufficient in some applications, but often it is possible to obtain related RNA sequences with conserved secondary structure. These should be included in structural analyses to give improved results. This work presents a practical way of predicting RNA secondary structure that is especially useful when related sequences can be obtained. The method improves a previous algorithm based on an explicit evolutionary model and a probabilistic model of structures. Predictions can be done on a web server at http://www.daimi.au.dk/~compbio/pfold.
Affiliation(s)
- Bjarne Knudsen
- BiRC (Bioinformatics Research Center), University of Aarhus, Høegh Guldbergsgade 10, Building 090, 8000 Arhus C, Denmark.
38
Hu YJ. GPRM: A genetic programming approach to finding common RNA secondary structure elements. Nucleic Acids Res 2003; 31:3446-9. [PMID: 12824343 PMCID: PMC168928 DOI: 10.1093/nar/gkg521] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 01/23/2023]
Abstract
RNA molecules play an important role in many biological activities. Knowing the secondary structure of an RNA molecule can help us better understand its ability to function. The methods for RNA structure determination have traditionally been implemented through biochemical, biophysical and phylogenetic analyses. With the advance of computer technology, an increasing number of computational approaches have recently been developed. They have different goals and apply various algorithms. For example, some focus on secondary structure prediction for a single sequence; some aim at finding a global alignment of multiple sequences. Some predict the structure based on free energy minimization; some make comparative sequence analyses to determine the structure. In this paper, we describe how to correctly use GPRM, a genetic programming approach to finding common secondary structure elements in a set of unaligned coregulated or homologous RNA sequences. GPRM can be accessed at http://bioinfo.cis.nctu.edu.tw/service/gprm/.
Affiliation(s)
- Yuh-Jyh Hu
- Computer and Information Science Department, National Chiao Tung University, 1001 Ta Hsueh Rd, Hsinchu, Taiwan.
39
Fogel GB, Porto VW, Weekes DG, Fogel DB, Griffey RH, McNeil JA, Lesnik E, Ecker DJ, Sampath R. Discovery of RNA structural elements using evolutionary computation. Nucleic Acids Res 2002; 30:5310-7. [PMID: 12466557 PMCID: PMC137967 DOI: 10.1093/nar/gkf653] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]
Abstract
RNA molecules fold into characteristic secondary and tertiary structures that account for their diverse functional activities. Many of these RNA structures, or certain structural motifs within them, are thought to recur in multiple genes within a single organism or across the same gene in several organisms and provide a common regulatory mechanism. Search algorithms, such as RNAMotif, can be used to mine nucleotide sequence databases for these repeating motifs. RNAMotif allows users to capture essential features of known structures in detailed descriptors and can be used to identify, with high specificity, other similar motifs within the nucleotide database. However, when the descriptor constraints are relaxed to provide more flexibility, or when there is very little a priori information about hypothesized RNA structures, the number of motif 'hits' may become very large. Exhaustive methods to search for similar RNA structures over these large search spaces are likely to be computationally intractable. Here we describe a powerful new algorithm based on evolutionary computation to solve this problem. A series of experiments using ferritin IRE and SRP RNA stem-loop motifs were used to verify the method. We demonstrate that even when searching extremely large search spaces, of the order of 10^23 potential solutions, we could find the correct solution in a fraction of the time it would have taken for exhaustive comparisons.
Affiliation(s)
- Gary B Fogel
- Natural Selection Inc., 3333 North Torrey Pines Court, Suite 200, La Jolla, CA 92037, USA
40
|
Lesnik EA, Sampath R, Ecker DJ. Rev response elements (RRE) in lentiviruses: an RNAMotif algorithm-based strategy for RRE prediction. Med Res Rev 2002; 22:617-36. [PMID: 12369091 DOI: 10.1002/med.10027] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
Lentiviruses (a sub-family of the retroviridae family) include primate and non-primate viruses associated with chronic diseases of the immune system and the central nervous system. All lentiviruses encode a regulatory protein Rev that is essential for post-transcriptional transport of the unspliced and incompletely spliced viral mRNAs from nuclei to cytoplasm. The Rev protein acts via binding to an RNA structural element known as the Rev responsive element (RRE). The RRE location and structure and the mechanism of the Rev-RRE interaction in primate and non-primate lentiviruses have been analyzed and compared. Based on structural data available for RRE of HIV-1, a two-step computational strategy for prediction of putative RRE regions in lentivirus genomes has been developed. First, the RNAMotif algorithm was used to search genomic sequence for highly structured regions (HSR). Then the program RNAstructure, version 3.6, was used to calculate the structure and thermodynamic stability of the region of approximately 350 nucleotides encompassing the HSR. Our strategy correctly predicted the locations of all previously reported lentivirus RREs. We were also able to predict the locations and structures of potential RREs in four additional lentiviruses.
Affiliation(s)
- Elena A Lesnik
- IBIS Therapeutics, 2292 Faraday Ave, Carlsbad, California 92008, USA
41
|
Hu YJ. Prediction of consensus structural motifs in a family of coregulated RNA sequences. Nucleic Acids Res 2002; 30:3886-93. [PMID: 12202774 PMCID: PMC137409 DOI: 10.1093/nar/gkf485] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 01/31/2023] Open
Abstract
Given a set of homologous or functionally related RNA sequences, the consensus motifs may represent the binding sites of RNA regulatory proteins. Unlike DNA motifs, RNA motifs are more conserved in structures than in sequences. Knowing the structural motifs can help us gain a deeper insight into the regulation activities. There have been various studies of RNA secondary structure prediction, but most of them are not focused on finding motifs from sets of functionally related sequences. Although recent research shows some new approaches to RNA motif finding, they are limited to finding relatively simple structures, e.g. stem-loops. In this paper, we propose a novel genetic programming approach to RNA secondary structure prediction. It is capable of finding more complex structures than stem-loops. To demonstrate the performance of our new approach as well as to keep the consistency of our comparative study, we first tested it on the same data sets previously used to verify the current prediction systems. To show the flexibility of our new approach, we also tested it on a data set that contains pseudoknot motifs which most current systems cannot identify. A web-based user interface of the prediction system is set up at http://bioinfo.cis.nctu.edu.tw/service/gprm/.
Affiliation(s)
- Yuh-Jyh Hu
- Computer and Information Science Department, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan.
42
|
Johansson S, Niklasson B, Maizel J, Gorbalenya AE, Lindberg AM. Molecular analysis of three Ljungan virus isolates reveals a new, close-to-root lineage of the Picornaviridae with a cluster of two unrelated 2A proteins. J Virol 2002; 76:8920-30. [PMID: 12163611 PMCID: PMC137002 DOI: 10.1128/jvi.76.17.8920-8930.2002] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.4] [Indexed: 11/20/2022] Open
Abstract
Ljungan virus (LV) is a suspected human pathogen recently isolated from bank voles (Clethrionomys glareolus). In the present study, it is revealed through comparative sequence analysis that three newly determined Swedish LV genomes are closely related and possess a deviant picornavirus-like organization: 5' untranslated region-VP0-VP3-VP1-2A1-2A2-2B-2C-3A-3B-3C-3D-3' untranslated region. The LV genomes and the polyproteins encoded by them exhibit several exceptional features, such as the absence of a predicted maturation cleavage of VP0, a conserved sequence determinant in VP0 that is typically found in VP1 of other picornaviruses, and a cluster of two unrelated 2A proteins. The 2A1 protein is related to the 2A protein of cardio-, erbo-, tescho-, and aphthoviruses, and the 2A2 protein is related to the 2A protein of parechoviruses, kobuviruses, and avian encephalomyelitis virus. The unprecedented association of two structurally different 2A proteins is a feature never previously observed among picornaviruses and implies that their functions are not mutually exclusive. Secondary polyprotein processing of the LV polyprotein is mediated by proteinase 3C (3C(pro)) possessing canonical affinity to Glu and Gln at the P1 position and small amino acid residues at the P1' position. In addition, LV 3C(pro) appears to have unique substrate specificity to Asn, Gln, and Asp and to bulky hydrophobic residues at the P2 and P4 positions, respectively. Phylogenetic analysis suggests that LVs form a separate division, which, together with the Parechovirus genus, has branched off the picornavirus tree most closely to its root. The presence of two 2A proteins indicates that some contemporary picornaviruses with a single 2A may have evolved from the ancestral multi-2A picornavirus.
Affiliation(s)
- Susanne Johansson
- Department of Chemistry and Biomedical Sciences, University of Kalmar, S-391 82 Kalmar, Sweden
43
|
Wu P, Nakano SI, Sugimoto N. Temperature dependence of thermodynamic properties for DNA/DNA and RNA/DNA duplex formation. Eur J Biochem 2002; 269:2821-30. [PMID: 12071944 DOI: 10.1046/j.1432-1033.2002.02970.x] [Citation(s) in RCA: 111] [Impact Index Per Article: 4.8] [Indexed: 11/20/2022]
Abstract
A clear difference in the enthalpy changes derived from spectroscopic and calorimetric measurements has recently been shown. The exact interpretation of this deviation varied from study to study, but it was generally attributed to the non-two-state transition and heat capacity change. Although the temperature-dependent thermodynamics of the duplex formation was often implied, systemic and extensive studies have been lacking in universally assigning the appropriate thermodynamic parameter sets. In the present study, the 24 DNA/DNA and 41 RNA/DNA oligonucleotide duplexes, designed to avoid the formation of hairpin or slipped duplex structures and to limit the base-pair length to less than 12 bp, were selected to evaluate the heat capacity changes and temperature-dependent thermodynamic properties of duplex formation. Direct comparison reveals that the temperature-independent thermodynamic parameters could provide a reasonable approximation only when the temperature of interest has a small deviation from the mean melting temperature over the experimental range. The heat capacity changes depend on the base composition and sequences and are generally limited in the range of -160 to approximately -40 cal·mol⁻¹·K⁻¹ per base pair. In contrast to the enthalpy and entropy changes, the free energy change and melting temperature are relatively insensitive to the heat capacity change. Finally, the 16 NN-model free energy parameters and one helix initiation at physiological temperature were extracted from the temperature-dependent thermodynamic data of the 41 RNA/DNA hybrids.
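As a reading aid, the qualitative claims in this abstract can be related to the standard textbook relations for a temperature-independent heat capacity change. These relations are not quoted from the paper; the reference temperature T_0 (for example, a mean melting temperature) and the assumption that ΔC_p is constant over the range are assumptions made here for illustration.

```latex
\begin{align}
\Delta H^\circ(T) &= \Delta H^\circ(T_0) + \Delta C_p\,(T - T_0) \\
\Delta S^\circ(T) &= \Delta S^\circ(T_0) + \Delta C_p \ln\frac{T}{T_0} \\
\Delta G^\circ(T) &= \Delta H^\circ(T) - T\,\Delta S^\circ(T) \\
                  &= \Delta H^\circ(T_0) - T\,\Delta S^\circ(T_0)
                     + \Delta C_p\left[(T - T_0) - T\ln\frac{T}{T_0}\right] \\
                  &\approx \Delta H^\circ(T_0) - T\,\Delta S^\circ(T_0)
                     - \frac{\Delta C_p\,(T - T_0)^2}{2\,T_0}
                     \quad \text{for } |T - T_0| \ll T_0
\end{align}
```

Under these assumptions the ΔC_p correction is first order in (T - T_0) for the enthalpy and entropy but only second order for the free energy, which is consistent with the abstract's remark that the free energy change is relatively insensitive to the heat capacity change.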
Affiliation(s)
- Peng Wu
- High Technology Research Center, Faculty of Science and Engineering, Konan University, Okamoto, Higashinada-ku, Japan
44
|
Mathews DH, Turner DH. Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. J Mol Biol 2002; 317:191-203. [PMID: 11902836 DOI: 10.1006/jmbi.2001.5351] [Citation(s) in RCA: 255] [Impact Index Per Article: 11.1] [Indexed: 11/22/2022]
Abstract
With the rapid increase in the size of the genome sequence database, computational analysis of RNA will become increasingly important in revealing structure-function relationships and potential drug targets. RNA secondary structure prediction for a single sequence is 73% accurate on average for a large database of known secondary structures. This level of accuracy provides a good starting point for determining a secondary structure either by comparative sequence analysis or by the interpretation of experimental studies. Dynalign is a new computer algorithm that improves the accuracy of structure prediction by combining free energy minimization and comparative sequence analysis to find a low free energy structure common to two sequences without requiring any sequence identity. It uses a dynamic programming construct suggested by Sankoff. Dynalign, however, restricts the maximum distance, M, allowed between aligned nucleotides in the two sequences. This makes the calculation tractable because the complexity is simplified to O(M^3 N^3), where N is the length of the shorter sequence. The accuracy of Dynalign was tested with sets of 13 tRNAs, seven 5S rRNAs, and two R2 3' UTR sequences. On average, Dynalign predicted 86.1% of known base-pairs in the tRNAs, as compared to 59.7% for free energy minimization alone. For the 5S rRNAs, the average accuracy improves from 47.8% to 86.4%. The secondary structure of the R2 3' UTR from Drosophila takahashii is poorly predicted by standard free energy minimization. With Dynalign, however, the structure predicted in tandem with the sequence from Drosophila melanogaster nearly matches the structure determined by comparative sequence analysis.
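The effect of the distance restriction M can be illustrated with a small sketch. It is not Dynalign's implementation: it assumes, for illustration only, that the "distance between aligned nucleotides" is the index difference |i - k| between position i in the first sequence and position k in the second, and the function names `banded_pairs` and `full_pairs` are hypothetical. The sketch only counts how many alignable index pairs survive the restriction, which is the quantity that shrinks from roughly N^2 to roughly M*N.

```python
def banded_pairs(n1: int, n2: int, m: int) -> int:
    """Count index pairs (i, k) with |i - k| <= m.

    i ranges over a sequence of length n1, k over a sequence of length n2.
    Assumption for illustration: the alignment distance is the plain
    index difference, which may differ from Dynalign's exact definition.
    """
    total = 0
    for i in range(n1):
        lo = max(0, i - m)        # smallest k still within the band
        hi = min(n2 - 1, i + m)   # largest k still within the band
        if hi >= lo:
            total += hi - lo + 1
    return total


def full_pairs(n1: int, n2: int) -> int:
    """Count all index pairs when no distance restriction is applied."""
    return n1 * n2


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the trend.
    n, m = 120, 6
    print("unrestricted pairs:", full_pairs(n, n))
    print("pairs within band: ", banded_pairs(n, n, m))
```

With the band in place the number of candidate aligned positions grows linearly in N for fixed M, which is the intuition behind the reduced complexity quoted in the abstract.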
Affiliation(s)
- David H Mathews
- Department of Chemistry, University of Rochester, NY 14627-0216, USA
45
|
Dawson W, Suzuki K, Yamamoto K. A physical origin for functional domain structure in nucleic acids as evidenced by cross-linking entropy: II. J Theor Biol 2001; 213:387-412. [PMID: 11735287 DOI: 10.1006/jtbi.2001.2437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
Abstract
In Part I, cross-linking entropy (CLE) was proposed as a mechanism that limits the size of functional domains of RNA. To test this hypothesis, the theory is developed into an RNA secondary structure prediction filter which is applied to nearest-neighbor secondary structure (NNSS) algorithms that utilize a free energy (FE) minimization strategy. (The NNSS strategies are also referred to as the dynamic programming algorithm in the literature.) The cross-linking entropy for RNA is derived from a generalized Gaussian polymer chain model where the entropic contributions caused by the formation of base pairs (stacking) in RNA are analysed globally. Local entropic contributions are associated with the freezing out of degrees of freedom in the links. Both global and local entropic effects are strongly influenced by the persistence length. The cross-linking entropy provides a physical origin for the size of functional domains in long nucleic acid sequences and may go further to explain why the majority of the domain regions in typical sequences tend to be less than 600 nucleotides in length. In addition, improvements were observed in the "best guess" predictive capacity over NNSS prediction strategies. The thermodynamic distribution is more representative of the expected structures and is strongly governed by such physical parameters as the persistence length and the excluded volume. The CLE appears to generalize the tabulated penalties used in NNSS algorithms. The principal parameter influencing this entropy is the persistence length. The model is shown to accommodate a variable persistence length and is capable of describing the folding dynamics of RNA. A two-state kinetic model based on the CLE principle is used to help elucidate the folding kinetics of functional domains in the group I introns.
Affiliation(s)
- W Dawson
- Department of Bioactive Molecules, National Institute of Infectious Diseases, 1-23-1 Toyama, Shinjuku-ku, Tokyo, 162-8640, Japan.
46
|
|
47
|
Abstract
During the past year, major improvements have been made in methods used to solve RNA structures from crystals, find RNA patterns in sequence data and determine RNA secondary structure. Computational methods for assisting an interactive computer graphics human modeler, searching the conformational space of RNA tertiary structure, studying the dynamics of complexes involving RNA and simulating RNA catalytic activities have also been advanced.
Affiliation(s)
- F Major
- Département d'Informatique et de Recherche Opérationnelle, Université de Montréal, CP 6128, Succ Centre-Ville, Montréal, Québec, H3C 3J7, Canada.
48
|
Gorodkin J, Stricklin SL, Stormo GD. Discovering common stem-loop motifs in unaligned RNA sequences. Nucleic Acids Res 2001; 29:2135-44. [PMID: 11353083 PMCID: PMC55461 DOI: 10.1093/nar/29.10.2135] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.0] [Received: 01/12/2001] [Revised: 03/14/2001] [Accepted: 03/27/2001] [Indexed: 11/13/2022] Open
Abstract
Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem-loop RNA motifs in unaligned sequences, such as UTRs from post-transcriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem-Loop Align SearcH (SLASH), which will perform stem-loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/.
Affiliation(s)
- J Gorodkin
- Department of Genetics and Ecology, The Institute of Biological Sciences, University of Aarhus, Building 540, Ny Munkegade, DK-8000 Aarhus C, Denmark
49
|
Abstract
New results for calculating nucleic acid secondary structure by free energy minimization and phylogenetic comparisons have recently been reported. A complete set of DNA energy parameters is now available and the RNA parameters have been improved. Although databases of RNA secondary structures are still derived and expanded using computer-assisted, ad hoc comparative analysis, a number of new computer algorithms combine covariation analysis with energy methods.
Affiliation(s)
- M Zuker
- Department of Biochemistry and Molecular Biophysics, Washington University, St Louis, 63110, USA.