1
Fei Y, Zhang H, Wang Y, Liu Z, Liu Y. LTPConstraint: a transfer learning based end-to-end method for RNA secondary structure prediction. BMC Bioinformatics 2022; 23:354. [PMID: 35999499 PMCID: PMC9396797 DOI: 10.1186/s12859-022-04847-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 07/18/2022] [Indexed: 11/26/2022]
Abstract
BACKGROUND: RNA secondary structure is very important for deciphering cell activity and disease occurrence. The first method researchers used to predict this structure was biological experiment, but this approach is too expensive to be widely adopted. Computational methods then emerged, offering good efficiency and low cost; however, their accuracy is not satisfactory. Many machine learning methods have also been applied to this area, but accuracy has not improved significantly. Deep learning has matured and achieved great success in areas such as computer vision and natural language processing. It relies on neural networks, which are versatile, but their performance depends heavily on the quantity and quality of the data. At present, no model for predicting RNA secondary structure combines high accuracy, low data dependence, and ease of use. RESULTS: This paper designs a neural network called LTPConstraint to predict RNA secondary structure. The network builds on several network structures, including bidirectional LSTM, Transformer, and generator. It also uses transfer learning to train the model so that data dependence can be reduced. CONCLUSIONS: LTPConstraint achieves high accuracy in RNA secondary structure prediction. Compared with previous methods, accuracy improves markedly for structures both with and without pseudoknots. LTPConstraint is also easy to operate and produces results very quickly.
Affiliation(s)
- Yinchao Fei
- College of Computer Science and Technology, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
- Hao Zhang
- College of Computer Science and Technology, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
- Yili Wang
- College of Computer Science and Technology, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
- Zhen Liu
- Graduate School of Engineering, Nagasaki Institute of Applied Science, Nagasaki, Japan
- Yuanning Liu
- College of Computer Science and Technology, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
2
Rogers E, Heitsch C. New insights from cluster analysis methods for RNA secondary structure prediction. Wiley Interdisciplinary Reviews: RNA 2016; 7:278-94. [PMID: 26971529 DOI: 10.1002/wrna.1334] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/08/2015] [Revised: 12/03/2015] [Accepted: 12/17/2015] [Indexed: 01/12/2023]
Abstract
A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing.
Affiliation(s)
- Emily Rogers
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0765, USA
- Christine Heitsch
- School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USA
3
Rogers E, Heitsch CE. Profiling small RNA reveals multimodal substructural signals in a Boltzmann ensemble. Nucleic Acids Res 2014; 42:e171. [PMID: 25392423 PMCID: PMC4267672 DOI: 10.1093/nar/gku959] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 06/13/2014] [Revised: 09/26/2014] [Accepted: 10/01/2014] [Indexed: 11/13/2022]
Abstract
As the biomedical impact of small RNAs grows, so does the need to understand competing structural alternatives for regions of functional interest. Suboptimal structure analysis provides significantly more RNA base pairing information than a single minimum free energy prediction. Yet computational enhancements like Boltzmann sampling have not been fully adopted by experimentalists since identifying meaningful patterns in this data can be challenging. Profiling is a novel approach to mining RNA suboptimal structure data which makes the power of ensemble-based analysis accessible in a stable and reliable way. Balancing abstraction and specificity, profiling identifies significant combinations of base pairs which dominate low-energy RNA secondary structures. By design, critical similarities and differences are highlighted, yielding crucial information for molecular biologists. The code is freely available via http://gtfold.sourceforge.net/profiling.html.
Affiliation(s)
- Emily Rogers
- School of Computational Science and Engineering, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
- Christine E Heitsch
- School of Mathematics, Georgia Institute of Technology, 686 Cherry St., Atlanta, GA 30332-0160, USA
4
Cao S, Chen SJ. Statistical mechanical modeling of RNA folding: from free energy landscape to tertiary structural prediction. Nucleic Acids and Molecular Biology 2012; 27:185-212. [PMID: 27293312 DOI: 10.1007/978-3-642-25740-7_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]
Abstract
In spite of the success of computational methods for predicting RNA secondary structure, the problem of predicting RNA tertiary structure folding remains. Low-resolution structural models show promise as they allow for rigorous statistical mechanical computation for the conformational entropies, free energies, and the coarse-grained structures of tertiary folds. Molecular dynamics refinement of coarse-grained structures leads to all-atom 3D structures. Modeling based on statistical mechanics principles also has the unique advantage of predicting the full free energy landscape, including local minima and the global free energy minimum. The energy landscapes combined with the 3D structures form the basis for quantitative predictions of RNA functions. In this chapter, we present an overview of statistical mechanical models for RNA folding and then focus on a recently developed RNA statistical mechanical model -- the Vfold model. The main emphasis is placed on the physics underpinning the models, the computational strategies, and the connections to RNA biology.
Affiliation(s)
- Song Cao
- Department of Physics and Department of Biochemistry, University of Missouri, Columbia, MO 65211
- Shi-Jie Chen
- Department of Physics and Department of Biochemistry, University of Missouri, Columbia, MO 65211
5
Cao S, Chen SJ. Predicting structures and stabilities for H-type pseudoknots with interhelix loops. RNA 2009; 15:696-706. [PMID: 19237463 PMCID: PMC2661829 DOI: 10.1261/rna.1429009] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.1] [Received: 10/21/2008] [Accepted: 01/10/2009] [Indexed: 05/20/2023]
Abstract
RNA pseudoknots play a critical role in RNA-related biology, from the assembly of the ribosome to the regulation of viral gene expression. A predictive model for pseudoknot structure and stability is essential for understanding and designing RNA structure and function. A previous statistical mechanical theory allows us to treat canonical H-type RNA pseudoknots that contain no intervening loop between the helices (see S. Cao and S.J. Chen [2006] in Nucleic Acids Research, Vol. 34; pp. 2634-2652). Biologically significant RNA pseudoknots often contain interhelix loops. Predicting the structure and stability for such more-general pseudoknots remains an unsolved problem. In the present study, we develop a predictive model for pseudoknots with interhelix loops. The model gives conformational entropy, stability, and the free-energy landscape from RNA sequences. The main features of this new model are the computation of the conformational entropy and folding free energy based on the complete conformational ensemble and a rigorous treatment of the excluded volume effects. Extensive tests for the structural predictions show overall good accuracy with average sensitivity and specificity equal to 0.91 and 0.91, respectively. The theory developed here may be a solid starting point for first-principles modeling of more complex, larger RNAs.
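Note on the reported metrics: the abstract quotes sensitivity and specificity without defining them. The sketch below assumes the convention common in the RNA structure prediction literature, where sensitivity is the fraction of reference base pairs that are predicted and "specificity" is the fraction of predicted base pairs that are in the reference (positive predictive value). That convention, the function name, and the toy pair sets are assumptions made for illustration and are not taken from the paper. Structures are given as sets of (i, j) index pairs so that crossing (pseudoknotted) pairs can be represented.

```python
def sensitivity_ppv(predicted, reference):
    """Base-pair level accuracy under an assumed convention.

    predicted, reference: sets of (i, j) tuples with i < j.
    Returns (sensitivity, ppv), where
      sensitivity = TP / (TP + FN)  (share of reference pairs recovered)
      ppv         = TP / (TP + FP)  (share of predicted pairs that are correct)
    """
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical H-type pseudoknot: helix (0,14),(1,13),(2,12) crosses helix (7,19),(8,18).
reference_pairs = {(0, 14), (1, 13), (2, 12), (7, 19), (8, 18)}
predicted_pairs = {(0, 14), (1, 13), (7, 19), (9, 17)}
```

With these toy sets, three of the five reference pairs are recovered and three of the four predicted pairs are correct, so `sensitivity_ppv(predicted_pairs, reference_pairs)` gives a sensitivity of 0.6 and a PPV of 0.75.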
Affiliation(s)
- Song Cao
- Department of Physics, University of Missouri, Columbia, 65211, USA
6
Mathews DH. Revolutions in RNA secondary structure prediction. J Mol Biol 2006; 359:526-32. [PMID: 16500677 DOI: 10.1016/j.jmb.2006.01.067] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.7] [Received: 12/21/2005] [Revised: 01/13/2006] [Accepted: 01/18/2006] [Indexed: 01/09/2023]
Abstract
RNA structure formation is hierarchical and, therefore, secondary structure, the sum of canonical base-pairs, can generally be predicted without knowledge of the three-dimensional structure. Secondary structure prediction algorithms evolved from predicting a single, lowest free energy structure to their current state where statistics can be determined from the thermodynamic ensemble. This article reviews the free energy minimization technique and the salient revolutions in the dynamic programming algorithm methods for secondary structure prediction. Emphasis is placed on highlighting the recently developed method, which statistically samples structures from the complete Boltzmann ensemble.
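Illustration of the dynamic programming idea: the sketch below is not the free energy minimization algorithm reviewed in the article. It swaps in the much simpler Nussinov-style recursion, which maximizes the number of canonical and G-U base pairs instead of minimizing a nearest-neighbor free energy, because it shows the same fill-then-traceback structure in a few lines. The function names and the `min_loop` parameter (minimum number of unpaired bases in a hairpin) are assumptions made for this example.

```python
# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Fill dp[i][j] = maximum number of base pairs in seq[i..j]."""
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp


def traceback(seq, dp, min_loop=3):
    """Recover one optimal set of pairs from the filled table."""
    pairs = []
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # interval too short to hold a pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    pairs.append((k, j))
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return sorted(pairs)


def dot_bracket(length, pairs):
    """Render a nested pair list in dot-bracket notation."""
    chars = ["."] * length
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)
```

A typical call is `dp = nussinov(seq)` followed by `dot_bracket(len(seq), traceback(seq, dp))`. The table fill takes cubic time in the sequence length, and the recursion cannot represent pseudoknots.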
Affiliation(s)
- David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, NY 14642, USA.
7
Abstract
This article presents a general statistical mechanical approach to describe self-folding together with the hybridization between a pair of finite length DNA or RNA molecules. The model takes into account the entire ensemble of single- and double-stranded species in solution and their mole fractions at different temperatures. The folding and hybridization models deal with matched pairs, mismatches, symmetric and asymmetric interior loops, bulges, and single-base stacking that might exist at duplex ends or at the ends of helices. All possible conformations of the single- and double-stranded species are explored. Only intermolecular basepairs are considered in duplexes at this stage. In particular we focus on the role of stacking between neighboring nucleotide residues of single unfolded strands as an important source of enthalpy change on helix formation which has not been modeled computationally thus far. Changes in the states of the single strands with temperature are shown to lead to a larger heat effect at higher temperature. An important consequence of this is that predictions of enthalpies, which are based on databases of nearest-neighbor energy parameters determined for molecules or duplexes with lower melting temperatures compared with the melting temperatures of the oligos for which they are used as a predictive tool, will be underestimated.
Affiliation(s)
- Roumen A Dimitrov
- Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
8
Ding Y, Lawrence CE. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res 2004; 31:7280-301. [PMID: 14654704 PMCID: PMC297010 DOI: 10.1093/nar/gkg938] [Citation(s) in RCA: 362] [Impact Index Per Article: 18.1] [Indexed: 11/14/2022]
Abstract
An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We present a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures. The forward step of the algorithm computes the equilibrium partition functions of RNA secondary structures with recent thermodynamic parameters. Using conditional probabilities computed with the partition functions in a recursive sampling process, the backward step of the algorithm quickly generates a statistically representative sample of structures. With cubic run time for the forward step, quadratic run time in the worst case for the sampling step, and quadratic storage, the algorithm is efficient for broad applicability. We demonstrate that, by classifying sampled structures, the algorithm enables a statistical delineation and representation of the Boltzmann ensemble. Applications of the algorithm show that alternative biological structures are revealed through sampling. Statistical sampling provides a means to estimate the probability of any structural motif, with or without constraints. For example, the algorithm enables probability profiling of single-stranded regions in RNA secondary structure. Probability profiling for specific loop types is also illustrated. By overlaying probability profiles, a mutual accessibility plot can be displayed for predicting RNA:RNA interactions. Boltzmann probability-weighted density of states and free energy distributions of sampled structures can be readily computed. We show that a sample of moderate size from the ensemble of an enormous number of possible structures is sufficient to guarantee statistical reproducibility in the estimates of typical sampling statistics. Our applications suggest that the sampling algorithm may be well suited to prediction of mRNA structure and target accessibility. The algorithm is applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies.
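Illustration of the forward and backward steps: the sketch below mirrors the two-step structure described in the abstract (a partition function fill, then recursive sampling from conditional probabilities) but is not the authors' algorithm. It assumes a toy energy model in which every allowed base pair contributes the same energy, in place of the thermodynamic parameters the paper uses. The function names, the `pair_energy` value of -1.0 kcal/mol, and the `rt` value of 0.6 kcal/mol are all illustrative assumptions.

```python
import math
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _q(Q, i, j):
    """Partition function of seq[i..j]; an empty interval has weight 1."""
    return 1.0 if j < i else Q[i][j]


def partition(seq, pair_energy=-1.0, rt=0.6, min_loop=3):
    """Forward step: Q[i][j] sums Boltzmann weights over all structures on seq[i..j]."""
    n = len(seq)
    w = math.exp(-pair_energy / rt)  # Boltzmann weight of one base pair
    Q = [[1.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            total = Q[i][j - 1]  # base j unpaired
            for k in range(i, j - min_loop):  # base j paired with k
                if (seq[k], seq[j]) in PAIRS:
                    total += _q(Q, i, k - 1) * w * _q(Q, k + 1, j - 1)
            Q[i][j] = total
    return Q


def sample_structure(seq, Q, rng, pair_energy=-1.0, rt=0.6, min_loop=3):
    """Backward step: draw one structure with probability proportional to its weight."""
    w = math.exp(-pair_energy / rt)
    pairs = []
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        r = rng.random() * Q[i][j] - Q[i][j - 1]
        if r < 0:  # chose the "j unpaired" case
            stack.append((i, j - 1))
            continue
        chosen = False
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                r -= _q(Q, i, k - 1) * w * _q(Q, k + 1, j - 1)
                if r < 0:
                    pairs.append((k, j))
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    chosen = True
                    break
        if not chosen:  # guard against floating-point round-off
            stack.append((i, j - 1))
    return sorted(pairs)
```

Usage: compute `Q = partition(seq)` once, then call `sample_structure(seq, Q, random.Random(0))` repeatedly with the same energy settings. Counting how often a given pair or unpaired region appears across the samples gives a simple estimate of its ensemble probability, in the spirit of the probability profiling the abstract describes.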
Affiliation(s)
- Ye Ding
- Bioinformatics Center, Wadsworth Center, New York State Department of Health, 150 New Scotland Avenue, Albany, NY 12208, USA.
9
Dawson W, Suzuki K, Yamamoto K. A physical origin for functional domain structure in nucleic acids as evidenced by cross-linking entropy: I. J Theor Biol 2001; 213:359-86. [PMID: 11735286 DOI: 10.1006/jtbi.2001.2436] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
A global strategy for estimating the entropy of long sequences of RNA is proposed to help improve the predictive capacity of RNA secondary structure dynamic programming algorithm (DPA) free energy (FE) minimization methods. These DPA strategies only consider the effects that occur in the immediate (nearest neighbor) vicinity of a given base pair (bp) in a secondary structure plot. They are therefore defined as nearest-neighbor secondary structure (NNSS) strategies. The new approach utilizes the statistical properties of the Gaussian polymer chain model to introduce both local and global contributions to the entropy of a given secondary structure. These entropic contributions are primarily a function of the persistence length of the RNA. Limits on the domain size are strongly suggested by this model and these limits are a function of both the length and the percentage of bp enclosed within a given domain. The model generalizes the penalties found in the NNSS algorithms. The approach considers the importance of flexibility in the folding and stability of RNA by considering the role of the persistence length in a biopolymer structure. The theory also suggests that molecular machinery may also take advantage of this global entropic effect to bring about catalytic effects. The applications can also be extended to protein structure calculations with some additional considerations.
Affiliation(s)
- W Dawson
- Department of Bioactive Molecules, National Institute of Infectious Diseases, 1-23-1 Toyama, Shinjuku-ku, Tokyo, 162-8640, Japan.
10
Dawson W, Suzuki K, Yamamoto K. A physical origin for functional domain structure in nucleic acids as evidenced by cross-linking entropy: II. J Theor Biol 2001; 213:387-412. [PMID: 11735287 DOI: 10.1006/jtbi.2001.2437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
Abstract
In Part I, cross-linking entropy (CLE) was proposed as a mechanism that limits the size of functional domains of RNA. To test this hypothesis, the theory is developed into an RNA secondary structure prediction filter which is applied to nearest-neighbor secondary structure (NNSS) algorithms that utilize a free energy (FE) minimization strategy. (The NNSS strategies are also referred to as the dynamic programming algorithm in the literature.) The cross-linking entropy for RNA is derived from a generalized Gaussian polymer chain model where the entropic contributions caused by the formation of base pairs (stacking) in RNA are analysed globally. Local entropic contributions are associated with the freezing out of degrees of freedom in the links. Both global and local entropic effects are strongly influenced by the persistence length. The cross-linking entropy provides a physical origin for the size of functional domains in long nucleic acid sequences and may go further to explain why the majority of the domain regions in typical sequences tend to be less than 600 nucleotides in length. In addition, improvements were observed in the "best guess" predictive capacity over NNSS prediction strategies. The thermodynamic distribution is more representative of the expected structures and is strongly governed by such physical parameters as the persistence length and the excluded volume. The CLE appears to generalize the tabulated penalties used in NNSS algorithms. The principal parameter influencing this entropy is the persistence length. The model is shown to accommodate a variable persistence length and is capable of describing the folding dynamics of RNA. A two-state kinetic model based on the CLE principle is used to help elucidate the folding kinetics of functional domains in the group I introns.
Affiliation(s)
- W Dawson
- Department of Bioactive Molecules, National Institute of Infectious Diseases, 1-23-1 Toyama, Shinjuku-ku, Tokyo, 162-8640, Japan.
11
Abstract
New results for calculating nucleic acid secondary structure by free energy minimization and phylogenetic comparisons have recently been reported. A complete set of DNA energy parameters is now available and the RNA parameters have been improved. Although databases of RNA secondary structures are still derived and expanded using computer-assisted, ad hoc comparative analysis, a number of new computer algorithms combine covariation analysis with energy methods.
Affiliation(s)
- M Zuker
- Department of Biochemistry and Molecular Biophysics, Washington University, St Louis, 63110, USA.
12
Dawson WK, Yamamoto K. Mean free energy topology for nucleotide sequences of varying composition based on secondary structure calculations. J Theor Biol 1999; 201:113-40. [PMID: 10556021 DOI: 10.1006/jtbi.1999.1018] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
The mean free energy generated from the secondary structure of RNA sequences of varying length and composition has been studied by way of probability theory. The expected boundaries or maximal and minimal values of a given distribution are explored and a method for estimating error as a function of the number of shuffled sequences is also examined. For typical nucleotide sequences found in biologically active organisms, the mean free energy, free energy distributions and errors appear to be scalable in terms of a fixed set of algorithm-dependent parameters and the nucleotide composition of the particular sequence under evaluation. In addition, a general semi-analytical formula for predicting the mean free energy is proposed which, at least to first-order approximation, can be used to rapidly predict the mean free energy of any sequence length and composition of RNA. The general methodology appears to be algorithm independent. The results are expected to provide a reference point for certain types of analysis related to structure of RNA or DNA sequences and to assist in measuring the somewhat related matter of complexity in algorithm development. Some related applications are discussed.
Affiliation(s)
- W K Dawson
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108, Japan
13
Ding Y, Lawrence CE. A Bayesian statistical algorithm for RNA secondary structure prediction. Computers & Chemistry 1999; 23:387-400. [PMID: 10404626 DOI: 10.1016/s0097-8485(99)00010-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
Abstract
A Bayesian approach for predicting RNA secondary structure that addresses the following three open issues is described: (1) the need for a representation of the full ensemble of probable structures; (2) the need to specify a fixed set of energy parameters; (3) the desire to make statistical inferences on all variables in the problem. It has recently been shown that Bayesian inference can be employed to relax or eliminate the need to specify the parameters of bioinformatics recursive algorithms and to give a statistical representation of the full ensemble of probable solutions with the incorporation of uncertainty in parameter values. In this paper, we make an initial exploration of these potential advantages of the Bayesian approach. We present a Bayesian algorithm that is based on stacking energy rules but relaxes the need to specify the parameters. The algorithm returns the exact posterior distribution of the number of destabilizing loops, stacking energy matrices, and secondary structures. The algorithm generates statistically representative structures from the full ensemble of probable secondary structures in exact proportion to the posterior probabilities. Once the forward recursions for the algorithm are completed, the backward recursive sampling executes in O(n) time, providing a very efficient approach for generating representative structures. We demonstrate the utility of the Bayesian approach with several tRNA sequences. The potential of the approach for predicting RNA secondary structures and presenting alternative structures is illustrated with applications to the Escherichia coli tRNA(Ala) sequence and the Xenopus laevis oocyte 5S rRNA sequence.
Affiliation(s)
- Y Ding
- Division of Molecular Medicine, Wadsworth Center, New York State Department of Health, Albany 12201-0509, USA.
14
Zuker M, Jacobson AB. Using reliability information to annotate RNA secondary structures. RNA 1998; 4:669-79. [PMID: 9622126 PMCID: PMC1369649 DOI: 10.1017/s1355838298980116] [Citation(s) in RCA: 122] [Impact Index Per Article: 4.7] [Indexed: 05/20/2023]
Abstract
A number of heuristic descriptors have been developed previously in conjunction with the mfold package that describe the propensity of individual bases to participate in base pairs and whether or not a predicted helix is "well-determined." They were developed for the "energy dot plot" output of mfold. Two descriptors, P-num and H-num, are used to measure the level of promiscuity in the association of any given nucleotide or helix with alternative complementary pairs. The third descriptor, S-num, measures the propensity of bases to be single-stranded. In the current work, we describe a series of programs that were developed in order to annotate individual structures with "well-definedness" information. We use color annotation to present the information. The programs can annotate PostScript files that are created by the mfold package or the PostScript secondary structure plots produced by the Weiser and Noller program XRNA (Weiser B, Noller HF, 1995, XRNA: Auto-interactive program for modeling RNA, The Center for Molecular Biology of RNA, Santa Cruz, California: University of California; Internet: ftp://fangio.ucsc.edu/pub/XRNA). In addition, these programs can annotate ss files that serve as input to XRNA. The annotation package can also handle structure comparison with a reference structure. This feature can be used to compare predicted structure with a phylogenetically deduced model, to compare two different predicted foldings, and to identify conformational changes that are predicted between wild-type and mutant RNAs. We provide several examples of application. Predicted structures of two RNase P RNAs were colored with P-num information and further annotated with comparative information. The comparative model of a 16S rRNA was annotated with P-num information from mfold and with base pair probabilities obtained from the Vienna RNA folding package. 
Further annotation adds comparisons with the optimal foldings obtained from mfold and the Vienna package, respectively. The results of all of these analyses are discussed in the context of the reliability of structure prediction.
Affiliation(s)
- M Zuker
- Institute for Biomedical Computing, Washington University, St. Louis, Missouri 63110, USA.
15
Kuo KW, Leung MF, Leung WC. Intrinsic secondary structure of human TNFR-I mRNA influences the determination of gene expression by RT-PCR. Mol Cell Biochem 1997; 177:1-6. [PMID: 9450638 DOI: 10.1023/a:1006862304381] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
The secondary structure of human tumor necrosis factor receptor I (TNFR-I) mRNA based on its lowest folding energy was predicted. Three combinations of primers selected from open-regions and four combinations of primers from closed-regions of TNFR-I mRNA structure were employed for single-tube reverse transcription-polymerase chain reaction (RT-PCR) for the determination of TNFR-I gene expression in U937 cell. All the primers were designed with the same criteria. However, the different primers generated distinct quantities of RT-PCR products from the same concentration of TNFR-I mRNA, implying that the determination of gene expression by RT-PCR was affected by the mRNA secondary structure. In addition, the sensitivity of the open-region RT-PCR was approximately one hundred-fold higher than that in the closed-regions of TNFR-I mRNA. The low efficiency of the closed-region RT-PCR was not correlated with the G/C content of the TNFR-I mRNA structure. These results suggest that consideration of the influence of intrinsic mRNA structure of a gene is essential prior to the determination of gene expression by quantitative RT-PCR, and this open-region strategy of primer design may yield an efficient primer for in vitro amplification of cDNA by RT-PCR.
MESH Headings
- Antigens, CD/genetics
- Gene Expression Regulation
- Humans
- Lymphoma, Large B-Cell, Diffuse
- Models, Molecular
- Nucleic Acid Conformation
- Polymerase Chain Reaction
- RNA/isolation & purification
- RNA, Messenger/chemistry
- RNA, Messenger/physiology
- Receptors, Tumor Necrosis Factor/genetics
- Receptors, Tumor Necrosis Factor, Type I
- Temperature
- Tumor Cells, Cultured
Affiliation(s)
- K W Kuo
- Department of Biochemistry, Kaohsiung Medical College, Taiwan
16
Klaff P, Riesner D, Steger G. RNA structure and the regulation of gene expression. Plant Molecular Biology 1996; 32:89-106. [PMID: 8980476 DOI: 10.1007/bf00039379] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.3] [Indexed: 05/22/2023]
Abstract
RNA secondary and tertiary structure is involved in post-transcriptional regulation of gene expression either by exposing specific sequences or through the formation of specific structural motifs. An overview of RNA secondary and tertiary structures known from biophysical studies is followed by a review of examples of the elements of RNA processing, mRNA stability and translation of the messenger. These structural elements comprise sense-antisense double-stranded RNA, hairpin and stem-loop structures, and more complex structures such as bifurcations, pseudoknots and triple-helical elements. Metastable structures formed during RNA folding pathway are also discussed. The examples presented are mostly chosen from plant systems, plant viruses, and viroids. Examples from bacteria or fungi are discussed only when unique regulatory properties of RNA structures have been elucidated in these systems.
Affiliation(s)
- P Klaff
- Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Germany
17
Kolchanov NA, Titov II, Vlassova IE, Vlassov VV. Chemical and computer probing of RNA structure. Progress in Nucleic Acid Research and Molecular Biology 1996; 53:131-96. [PMID: 8650302 PMCID: PMC7133174 DOI: 10.1016/s0079-6603(08)60144-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.9] [Indexed: 02/01/2023]
Abstract
Ribonucleic acids (RNAs) are one of the most important types of biopolymers. RNAs play key roles in the storage and multiplication of genetic information. They are important in catalysis and RNA splicing and in the most important steps of translation. This chapter describes experimental methods for probing RNA structure and theoretical methods allowing the prediction of thermodynamically favorable RNA folding. These methods are complementary and together they provide a powerful approach to determine the structure of RNAs. The three-dimensional (tertiary) structure of RNA is formed by hydrogen-bonding among functional groups of nucleosides in different regions of the molecule, by coordination of polyvalent cations, and by stacking between the double-stranded regions present in the RNA. The tertiary structures of only some small RNAs have been determined by high-resolution X-ray crystallographic analysis and nuclear magnetic resonance analysis. The most widely used approach for the investigation of RNA structure is chemical and enzymatic probing, in combination with theoretical methods and phylogenetic studies allowing the prediction of variants of RNA folding. Investigations of RNA structures with different enzymatic and chemical probes can provide detailed data allowing the identification of double-stranded regions of the molecules and nucleotides involved in tertiary interactions.
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics, Siberian Division of Russian Academy of Sciences, Novosibirsk, Russia
|
18
|
Suvernev AA, Frantsuzov PA. Statistical description of nucleic acid secondary structure folding. J Biomol Struct Dyn 1995; 13:135-44. [PMID: 8527025 DOI: 10.1080/07391102.1995.10508826] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/31/2023]
Abstract
A simple statistical model describing the folding of nucleic acids is proposed. For long sequences the real configuration of the secondary structure is a quasi equilibrium state that cannot be characterised by minimal free energy. This is because the time required to achieve complete thermal equilibrium considerably exceeds the life-time of the molecule. The formation of the secondary structure is represented as a random walk process in the space of all possible molecular configurations. The quasi equilibrium structure is obtained by successive linking and disruptions of helix segments with probabilities determined by the rate constants of corresponding unimolecular reactions. The probabilities of configurations consisting of all possible compatible helices are calculated. Structures of some t-RNAs and ribosomal RNAs are analysed.
Affiliation(s)
- A A Suvernev
- Institute of Cytology and Genetics, Novosibirsk, Russia
|
19
|
Zuker M, Jacobson AB. "Well-determined" regions in RNA secondary structure prediction: analysis of small subunit ribosomal RNA. Nucleic Acids Res 1995; 23:2791-8. [PMID: 7544463 PMCID: PMC307106 DOI: 10.1093/nar/23.14.2791] [Citation(s) in RCA: 83] [Impact Index Per Article: 2.9] [Indexed: 01/25/2023] Open
Abstract
Recent structural analyses of genomic RNAs from RNA coliphages suggest that both well-determined base paired helices and well-determined structural domains that are identified by "energy dot plot" analysis using the RNA folding package mfold are likely to be predicted correctly. To test these observations with another group of large RNAs, we have analyzed 15 ribosomal RNAs. Published secondary structure models that were derived by comparative sequence analysis were used to evaluate the predicted structures. Both the optimal predicted fold and the predicted "energy dot plot" of each sequence were examined. Each prediction was obtained from a single computer run on an entire ribosomal RNA sequence. All predicted base pairs in optimal foldings were examined for agreement with proven base pairs in the comparative models. Our analyses show that the overall correspondence between the predicted and comparative models varied for different RNAs and ranges from a low of 27% to a high of 70%, with a mean value of 49%. The correspondence improves to a mean value of 81% when the analysis is limited to well-determined helices. In addition to well-determined helices, large well-determined structural domains can be observed in "energy dot plots" of some 16S ribosomal RNAs. The predicted domains correspond closely with structural domains that are found by the comparative method in the same RNAs. Our analyses also show that measuring the agreement between predicted and comparative secondary structure models underestimates the reliability of structural prediction by mfold.
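The kind of agreement percentage quoted in this abstract can be illustrated with a small calculation. The sketch below is not mfold's or the authors' scoring procedure. It assumes that structures are written in dot-bracket notation (the abstract does not name a format) and that agreement means the share of comparative-model base pairs that also appear in the prediction. The helper names `pairs_from_dot_bracket` and `percent_agreement` are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def percent_agreement(predicted, reference):
    """Percentage of reference base pairs that the prediction reproduces.

    Assumption: the reference plays the role of the comparative model.
    """
    ref_pairs = pairs_from_dot_bracket(reference)
    if not ref_pairs:
        return 0.0
    shared = ref_pairs & pairs_from_dot_bracket(predicted)
    return 100.0 * len(shared) / len(ref_pairs)
```

Restricting `reference` to a subset of helices before calling `percent_agreement` would mimic, in spirit, the abstract's separate figure for well-determined helices.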
Affiliation(s)
- M Zuker
- Institute for Biomedical Computing, Washington University, St Louis, MO 63110, USA
|
20
|
Christoffersen RE, McSwiggen J, Konings D. Application of computational technologies to ribozyme biotechnology products. J Mol Struct 1994. [DOI: 10.1016/s0022-2860(10)80037-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022]
|
21
|
|
22
|
Chernov BK, Merits A, Blinov VM. Computer-assisted predictions of the secondary structure in the plant virus single-stranded DNA genome. J Biomol Struct Dyn 1994; 11:837-47. [PMID: 8204218 DOI: 10.1080/07391102.1994.10508036] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
Abstract
Coconut foliar decay virus (CFDV) contains single-stranded circular DNA molecules of 1291 nucleotides, which were found to replicate autonomously in the cells of diseased palms. The special features of the CFDV DNA sequence, including putative secondary structure and the distribution of inverted repeat motifs, are investigated with computer-assisted prediction methods. It is evident that CFDV virion DNA follows the structural principle of a branched series of long and short double helices interspersed with short non-helical regions. The total degree of base pairing is near 62%. We have also predicted the presence of several sequence elements formed by inverted repeat motifs which are potentially capable of binding eukaryotic transcriptional regulatory factors.
|
23
|
Abstract
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance dt, between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations.
Affiliation(s)
- W Fontana
- Theoretical Division, Los Alamos National Laboratory, New Mexico 87545
|
24
|
Le SY, Chen JH, Maizel JV. Prediction of alternative RNA secondary structures based on fluctuating thermodynamic parameters. Nucleic Acids Res 1993; 21:2173-8. [PMID: 7684834 PMCID: PMC309481 DOI: 10.1093/nar/21.9.2173] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.6] [Indexed: 01/26/2023] Open
Abstract
In this paper we present a new method for predicting a set of RNA secondary structures that are thermodynamically favored in RNA folding simulations. This method uses a large number of 'simulated energy rules' (SER) generated by perturbing the free energy parameters derived experimentally within the range of the experimental errors. The structure with the lowest free energy is computed for each SER. Structural comparisons are used to avoid multiple generation of similar structures. Computed structures are evaluated using the energy distribution of the lowest free energy structures derived in the simulation. Predicted structures can be graphically displayed, with their frequencies of occurrence in the simulation, by dot-plot representations. On average, about 90% of phylogenetic helices in the known models of tRNA, Group I self-splicing intron, and Escherichia coli 16S rRNA were predicted using the method.
Affiliation(s)
- S Y Le
- Laboratory of Mathematical Biology, National Cancer Institute, NIH, Frederick, MD 21702
|
25
|
Kister A, Magarshak Y, Malinsky J. The theoretical analysis of the process of RNA molecule self-assembly. Biosystems 1993; 30:31-48. [PMID: 7690610 DOI: 10.1016/0303-2647(93)90060-p] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
Abstract
A kinetic approach to the problem of RNA structure prediction, based on analysis of the molecule's self-formation, is proposed. Restructuring that occurs during processing is described in terms of Markov processes. A new formalism designating nucleotides by complex numbers is proposed, leading to the complex unitary space of nucleic vectors. Properties of structure and transition matrices are discussed in relation to the analysis of RNA structural formation processes. The non-linear dynamic behavior of secondary structure transition is analyzed. Soliton-like oscillations of RNA and DNA tertiary structures are predicted. Monte Carlo simulation of RNA structure self-formation is used to calculate the ensemble of secondary structures of the tRNA(Ala) precursor from Bombyx mori formed during processing.
Affiliation(s)
- A Kister
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115
|
26
|
Möckel B, Eggeling L, Sahm H. Functional and structural analyses of threonine dehydratase from Corynebacterium glutamicum. J Bacteriol 1992; 174:8065-72. [PMID: 1459955 PMCID: PMC207545 DOI: 10.1128/jb.174.24.8065-8072.1992] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022] Open
Abstract
Threonine dehydratase activity is an important element in the flux control of isoleucine biosynthesis. The enzyme of Corynebacterium glutamicum demonstrates a marked sigmoidal dependence of initial velocity on the threonine concentration, a dependence that is consistent with substrate-promoted conversion of the enzyme from a low-activity to a high-activity conformation. In the presence of the negative allosteric effector isoleucine, the K0.5 increased from 21 to 78 mM and the cooperativity, as expressed by the Hill coefficient increased from 2.4 to 3.7. Valine promoted opposite effects: the K0.5 was reduced to 12 mM, and the enzyme exhibited almost no cooperativity. Sequence determination of the C. glutamicum gene for this enzyme revealed an open reading frame coding for a polypeptide of 436 amino acids. From this information and the molecular weight determination of the native enzyme, it follows that the dehydratase is a tetramer with a total mass of 186,396 daltons. Comparison of the deduced polypeptide sequence with the sequences of known threonine dehydratases revealed surprising differences from the C. glutamicum enzyme in the carboxy-terminal portion. This portion is greatly reduced in size, and a large gap of 95 amino acids must be introduced to achieve homology. Therefore, the C. glutamicum enzyme must be considered a small variant of threonine dehydratase that is typically controlled by isoleucine and valine but has an altered structure reflecting a topological difference in the portion of the protein most likely to be important for allosteric regulation.
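The sigmoidal kinetics described in this abstract are commonly summarized with the Hill equation, v = Vmax * S^h / (K0.5^h + S^h). The sketch below uses that textbook form, which is an assumption here because the abstract does not state which rate law the authors fitted. The name `hill_velocity` and the normalized `v_max` are illustrative. The K0.5 and Hill coefficient values are the ones quoted in the abstract.

```python
def hill_velocity(substrate_mM, k_half_mM, hill_coefficient, v_max=1.0):
    """Initial velocity from the textbook Hill equation (assumed rate law)."""
    s_h = substrate_mM ** hill_coefficient
    return v_max * s_h / (k_half_mM ** hill_coefficient + s_h)


# Values quoted in the abstract: (K0.5 in mM, Hill coefficient).
CONDITIONS = {
    "no effector": (21.0, 2.4),
    "with isoleucine": (78.0, 3.7),
}

if __name__ == "__main__":
    # Print the fractional velocity v/Vmax at a few threonine concentrations.
    for label, (k_half, h) in CONDITIONS.items():
        for threonine_mM in (10.0, 21.0, 78.0):
            v = hill_velocity(threonine_mM, k_half, h)
            print(f"{label:16s} [Thr]={threonine_mM:5.1f} mM  v/Vmax={v:.3f}")
```

By construction the velocity is half of `v_max` when the substrate concentration equals K0.5, so raising K0.5 from 21 to 78 mM shifts the whole curve toward higher threonine concentrations.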
Affiliation(s)
- B Möckel
- Institut für Biotechnologie, Forschungszentrum, Jülich, Germany
|
27
|
Abstract
High-order RNA structures are involved in regulating many biological processes; various algorithms have been designed to predict them. Experimental methods to probe such structures and to decipher the results are tedious. Artificial intelligence and the neural network approach can support the process of discovering RNA structures. Secondary structures of RNA molecules are probed by autoradiographing gels, separating end-labeled fragments generated by base-specific RNases. This process is performed in both conditions, denaturing (for sequencing purposes) and native. The resultant autoradiograms are scanned using line-detection techniques to identify the fragments by comparing the lines with those obtained by 'alkaline ladders'. The identified paired bases are treated by either one of two methods to find the foldings which are consistent with the RNases' 'cutting' rules. One exploits the maximum independent set algorithm; the other, the planarization algorithm. They require, respectively, n and n^2 processing elements, where n is the number of base pairs. The state of the system usually converges to the near-optimum solution within about 500 iteration steps, where each processing element implements the McCulloch-Pitts binary neuron. Our simulator, based on the proposed algorithm, discovered a new structure in a sequence of 38 bases, which is more stable than that formerly proposed.
Affiliation(s)
- Y Takefuji
- Department of Electrical Engineering and Applied Physics, Case Western Reserve University, Cleveland, Ohio 44106
|
28
|
Mandl CW, Kunz C, Heinz FX. Presence of poly(A) in a flavivirus: significant differences between the 3' noncoding regions of the genomic RNAs of tick-borne encephalitis virus strains. J Virol 1991; 65:4070-7. [PMID: 1712858 PMCID: PMC248839 DOI: 10.1128/jvi.65.8.4070-4077.1991] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022] Open
Abstract
A poly(A) tail was identified on the 3' end of the prototype tick-borne encephalitis (TBE) virus strain Neudoerfl. This is in contrast to the general lack of poly(A) in the genomic RNAs of mosquito-borne flaviviruses analyzed so far. Analysis of several closely related strains of TBE virus, however, revealed the existence of two different types of 3' noncoding (NC) regions. One type (represented by strain Neudoerfl) is only 114 nucleotides long and carries a 3'-terminal poly(A) structure. This was also found in several TBE virus strains isolated from different geographic regions over a period of almost 30 years. The other type (represented by strain Hypr) is 461 nucleotides long and not polyadenylated. The sequence homology between the two types of TBE virus 3' NC regions terminates at a specific position 81 nucleotides after the stop codon. The second type of 3' NC region more closely resembles the common flavivirus pattern, including the potential for the formation of a 3'-terminal hairpin structure. However, it lacks primary sequence elements that are conserved among other flavivirus genomes.
Affiliation(s)
- C W Mandl
- Institute of Virology, University of Vienna, Austria
|
29
|
van de Guchte M, van der Lende T, Kok J, Venema G. A possible contribution of mRNA secondary structure to translation initiation efficiency in Lactococcus lactis. FEMS Microbiol Lett 1991; 65:201-8. [PMID: 1715834 DOI: 10.1111/j.1574-6968.1991.tb04746.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 12/28/2022] Open
Abstract
Gene expression signals derived from Lactococcus lactis were linked to lacZ-fused genes with different 5'-nucleotide sequences. Computer predictions of mRNA secondary structure were combined with lacZ expression studies to direct base-substitutions that could possibly influence gene expression. Mutations were made such that the DNA sequence upstream of the ATG start codon was not changed. Moreover, care was taken that the substitutions, which were all within the first six codons, neither affected the amino acid sequence of the gene product nor introduced codons rarely used in L. lactis. The results suggest that mRNA secondary structure contributes to the efficiency of translation initiation in L. lactis.
Affiliation(s)
- M van de Guchte
- Department of Genetics, Centre of Biological Sciences, University of Groningen, Haren, The Netherlands
|
30
|
Zuker M, Jaeger JA, Turner DH. A comparison of optimal and suboptimal RNA secondary structures predicted by free energy minimization with structures determined by phylogenetic comparison. Nucleic Acids Res 1991; 19:2707-14. [PMID: 1710343 PMCID: PMC328190 DOI: 10.1093/nar/19.10.2707] [Citation(s) in RCA: 139] [Impact Index Per Article: 4.2] [Indexed: 12/28/2022] Open
Abstract
This article describes the latest version of an RNA folding algorithm that predicts both optimal and suboptimal solutions based on free energy minimization. A number of RNAs with known structures deduced from comparative sequence analysis are folded to test program performance. The group of solutions obtained for each molecule is analysed to determine how many of the known helices occur in the optimal solution and in the best suboptimal solution. In most cases, a structure about 80% correct is found with a free energy within 2% of the predicted lowest free energy structure.
Affiliation(s)
- M Zuker
- Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario
|
31
|
Abstract
A new approach is proposed for determining common RNA secondary structures within a set of homologous RNAs. The approach is a combination of phylogenetic and thermodynamic methods which is based on the prediction of optimal and suboptimal secondary structures, topological similarity searches and phylogenetic comparative analysis. The optimal and suboptimal RNA secondary structures are predicted by energy minimization. Structural comparison of the predicted RNA secondary structures is used to find conserved structures that are topologically similar in all these homologous RNAs. The validity of the conserved structural elements found is then checked by phylogenetic comparison of the sequences. This procedure is used to predict common structures of ribonuclease P (RNAase P) RNAs.
Affiliation(s)
- S Y Le
- Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario
|
32
|
Transcription attenuation-mediated control of leu operon expression: influence of the number of Leu control codons. J Bacteriol 1991; 173:1634-41. [PMID: 1999384 PMCID: PMC207312 DOI: 10.1128/jb.173.5.1634-1641.1991] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 12/29/2022] Open
Abstract
Four adjacent Leu codons within the leu leader RNA are critically important in transcription attenuation-mediated control of leu operon expression in Salmonella typhimurium and Escherichia coli (P. W. Carter, D. L. Weiss, H. L. Weith, and J. M. Calvo, J. Bacteriol. 162:943-949, 1985). The leader region from S. typhimurium was altered by site-directed mutagenesis to produce constructs having between one and seven adjacent Leu codons, all CUA. leu operon expression was measured in strains containing six of these constructs, each integrated into the chromosome in a single copy. Operon expression was sufficiently high that all strains grew in minimal medium unsupplemented by leucine. Expression of the operon was measured in strains cultured in such a way that their growth was limited by the intracellular concentration of either leucine or leucyl-tRNA. In general, the leu operon for each construct responded similarly to the parent construct in terms of the degree of expression as a function of the degree of limitation. However, a strain containing (CUA)1 and, to a certain extent, a strain having (CUA)2 responded somewhat more sluggishly, and strains containing (CUA)6 and (CUA)7 responded more sensitively to limitations than did the parent construct. In addition, DNA fragments containing the leu promoter and leader region were used as templates in in vitro transcription reactions employing purified RNA polymerase. With nucleoside triphosphate concentrations of 200 microM, RNA polymerase paused during transcription of the leu leader region at a site about 95 bp downstream from the site of transcription initiation. The half-times of the pause were 1 min at 37 degrees C and 3 min at 22 degrees C. The pause was lengthened substantially when the GTP concentration was lowered to 20 microM. Our results are interpreted most easily in terms of an all-or-none model. Given two Leu control codons, the operon responds with nearly maximum output over a wide range of leucine limitation, and that outcome does not change much with increasing numbers of control codons.
|
33
|
Abstract
This chapter describes the RNA structural characteristics that have emerged so far. Folded RNA molecules are stabilized by a variety of interactions, the most prevalent of which are stacking and hydrogen bonding between bases. Many interactions among backbone atoms also occur in the structure of tRNA, although they are often ignored when considering RNA structure because they are not as well-characterized as interactions among bases. Backbone interactions include hydrogen bonding and the stacking of sugar or phosphate groups with bases or with other sugar and phosphate groups. The interactions found in a three-dimensional RNA structure can be divided into two categories: secondary interactions and tertiary interactions. This division is useful for several reasons. Secondary structures are routinely determined by a combination of techniques discussed in chapter, whereas tertiary interactions are more difficult to determine. Computer algorithms that generate RNA structures can search completely through possible secondary structures, but the inclusion of tertiary interactions makes a complete search of possible structures impractical for RNA molecules even as small as tRNA. The division of RNA structure into building blocks consisting of secondary or tertiary interactions makes it easier to describe RNA structures. In those cases in which RNA studies are incomplete, the studies of DNA are described with the rationalization that RNA structures may be analogous to DNA structures, or that the techniques used to study DNA could be applied to the analogous RNA structures. The chapter focuses on the aspects of RNA structure that affect the three-dimensional shape of RNA and that affect its ability to interact with other molecules.
Affiliation(s)
- M Chastain
- University of California, Berkeley 94720
|
34
|
Kwakman JH, Konings DA, Hogeweg P, Pel HJ, Grivell LA. Structural analysis of a group II intron by chemical modifications and minimal energy calculations. J Biomol Struct Dyn 1990; 8:413-30. [PMID: 1702639 DOI: 10.1080/07391102.1990.10507813] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
Folding of the yeast mitochondrial group II intron aI5c has been analysed by chemical modification of the in vitro synthesised RNA with dimethylsulfate and diethylpyrocarbonate. Computer calculations of the intron secondary structure through minimization of free energy were also performed in order to study thermodynamic properties of the intron and to relate these to data obtained from chemical modification. Comparison of the two sets of data with the current phylogenetic model structure of the intron aI5 reveals close agreement, thus lending strong support for the existence of a typical group II intron core structure comprising six neighbouring stem-loop domains. Local discrepancies between the experimental data and the model structures have been analyzed by reference to thermodynamic properties of the structure. This shows that use of the latest refined set of free energy values improves the structure calculation significantly.
Affiliation(s)
- J H Kwakman
- Department of Molecular Cell Biology, University of Amsterdam, The Netherlands
|
35
|
Benedetti G, De Santis P, Morosetti S. Secondary structures of Tetrahymena thermophila rRNA IVS sequence involved in its self-splicing reactions: a new computer analysis. J Biomol Struct Dyn 1990; 7:1269-77. [PMID: 2194496 DOI: 10.1080/07391102.1990.10508564] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/30/2022]
Abstract
The secondary structures of the Tetrahymena thermophila rRNA IVS sequence involved in the self-splicing reactions are investigated theoretically with a previously proposed, refined computer method that can select a set of the lowest free energy RNA secondary structures under constraints from model hypotheses and experimental evidence. The secondary structures obtained are characterized by the close proximity of the self-reaction sites and account for double-mutation experiments and differential digestion data.
Affiliation(s)
- G Benedetti
- Department of Chemistry, University of Rome I, Italy
|
36
|
Abstract
The simplest dynamic algorithm for planar RNA folding searches for the maximum number of base pairs. The algorithm uses O(n^3) steps. The more general case, where different weights (energies) are assigned to stacked base pairs and to the various types of single-stranded region topologies, requires a considerably longer computation time because of the partial backtracking involved. Limiting the loop size reduces the running time back to O(n^3). Reduction in the number of steps in the calculations of the various RNA topologies has recently been suggested, thereby improving the time behavior. Here we show how a "jumping" procedure can be used to speed up the computation, not only for the maximal number of base pairs algorithm, but for the minimal energy algorithm as well.
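The base-pair maximization mentioned at the start of this abstract can be sketched as follows. This is the standard textbook cubic-time dynamic program, not the "jumping" speed-up the paper proposes. The allowed pair set (Watson-Crick plus G-U) and the minimum hairpin loop length `min_loop` are assumptions chosen for illustration, and `max_base_pairs` is a hypothetical name.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (planar) base pairs in seq.

    best[i][j] holds the optimum for the subsequence seq[i..j]; entries with
    j - i <= min_loop stay 0 because no pair can close such a short hairpin.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):        # subsequence length minus one
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]             # case 1: base j is unpaired
            for k in range(i, j - min_loop):   # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The three nested loops over `span`, `i`, and `k` are what give the cubic step count. Storing the chosen `k` for each cell would allow the structure itself to be recovered by backtracking.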
Affiliation(s)
- R Nussinov
- Sackler Institute for Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv, Israel
|
37
|
Haas ES, Brown JW, Daniels CJ, Reeve JN. Genes encoding the 7S RNA and tRNA(Ser) are linked to one of the two rRNA operons in the genome of the extremely thermophilic archaebacterium Methanothermus fervidus. Gene 1990; 90:51-9. [PMID: 2116370 DOI: 10.1016/0378-1119(90)90438-w] [Citation(s) in RCA: 30] [Impact Index Per Article: 0.9] [Indexed: 12/30/2022]
Abstract
Analysis of gene structure in the extremely thermophilic archaebacterium, Methanothermus fervidus, has revealed the presence of a cluster of stable RNA-encoding genes arranged 5'-7S RNA-tRNA(Ser)-16S rRNA-tRNA(Ala)-23S rRNA-5S rRNA. The genome of M. fervidus contains two rRNA operons but only one operon has the closely linked 7S RNA-encoding gene. The sequences upstream from the two rRNA operons are identical for 206 bp but diverge at the 3' base of the tRNA(Ser) gene. The secondary structures predicted for the M. fervidus 7S, 16S rRNA, tRNA(Ala) and tRNA(Ser) have been compared with those of functionally homologous molecules from moderately thermophilic and mesophilic archaebacteria. A consensus secondary structure for archaebacterial 7S RNAs has been developed which incorporates bases and structural features also conserved in eukaryotic signal-recognition-particle RNAs and eubacterial 4.5S RNAs.
MESH Headings
- Archaea/genetics
- Bacteria/genetics
- Base Sequence
- Cloning, Molecular
- DNA, Bacterial/genetics
- Genes, Bacterial
- Genetic Linkage
- Molecular Sequence Data
- Nucleic Acid Conformation
- Operon
- RNA Processing, Post-Transcriptional
- RNA, Ribosomal/genetics
- RNA, Ribosomal/ultrastructure
- RNA, Ribosomal, 16S/genetics
- RNA, Ribosomal, 16S/ultrastructure
- RNA, Small Nuclear/genetics
- RNA, Small Nuclear/ultrastructure
- RNA, Transfer, Amino Acid-Specific/genetics
- RNA, Transfer, Ser/genetics
Affiliation(s)
- E S Haas
- Department of Microbiology, Ohio State University, Columbus 43210
|
38
|
Abrahams JP, van den Berg M, van Batenburg E, Pleij C. Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucleic Acids Res 1990; 18:3035-44. [PMID: 1693421 PMCID: PMC330835 DOI: 10.1093/nar/18.10.3035] [Citation(s) in RCA: 163] [Impact Index Per Article: 4.8] [Indexed: 12/28/2022] Open
Abstract
A computer program is presented which determines the secondary structure of linear RNA molecules by simulating a hypothetical process of folding. This process implies the concept of 'nucleation centres', regions in RNA which locally trigger the folding. During the simulation, the RNA is allowed to fold into pseudoknotted structures, unlike all other programs predicting RNA secondary structure. The simulation uses published, experimentally determined free energy values for nearest neighbour base pair stackings and loop regions, except for new extrapolated values for loops larger than seven nucleotides. The free energy value for a loop arising from pseudoknot formation is set to a single, estimated value of 4.2 kcal/mole. Especially in the case of long RNA sequences, our program appears superior to other secondary structure predicting programs described so far, as tests on tRNAs, the LSU intron of Tetrahymena thermophila and a number of plant viral RNAs show. In addition, pseudoknotted structures are often predicted successfully. The program is written in mainframe APL and is adapted to run on IBM compatible PCs, Atari ST and Macintosh personal computers. On an 8 MHz 8088 standard PC without coprocessor, using STSC APL, it folds a sequence of 700 nucleotides in one and a half hours.
Affiliation(s)
- J P Abrahams
- Department of Biochemistry, Gorlaeus Laboratories, University of Leiden, The Netherlands
|
39
|
|
40
|
The most abundant small cytoplasmic RNA of Saccharomyces cerevisiae has an important function required for normal cell growth. Mol Cell Biol 1989. [PMID: 2477683 DOI: 10.1128/mcb.9.8.3260] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.1] [Indexed: 11/20/2022] Open
Abstract
The most abundant RNA visible between 5.8S and 18S rRNA on an ethidium bromide-stained gel of total Saccharomyces cerevisiae RNA has an apparent size of about 600 nucleotides. By purifying the band and using it as a probe to screen a genomic library, we isolated and sequenced the unique gene for this RNA. The transcribed sequence, determined to be 519 nucleotides long, contains elements typical of RNA polymerase III transcription. The RNA is predominantly cytoplasmic, so we called it small cytoplasmic RNA 1 (scR1). ScR1 is neither 3'-polyadenylated nor 5'-trimethylguanosine capped. We constructed a null mutation of the gene by deleting 252 base pairs from the transcribed region. Haploid strains carrying the scr1-delta lesion grew very slowly, segregated cytoplasmic petites (rho-) at high frequency, and showed signs of aberrant cell division. A secondary structure model for scR1 shows some of the conserved features of the signal recognition particle 7SL RNAs.
|
41
|
Rychlik W, Rhoads RE. A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA. Nucleic Acids Res 1989; 17:8543-51. [PMID: 2587212 PMCID: PMC335026 DOI: 10.1093/nar/17.21.8543] [Citation(s) in RCA: 508] [Impact Index Per Article: 14.5] [Indexed: 01/01/2023] Open
Abstract
A method is presented for choosing optimal oligodeoxyribonucleotides as probes for filter hybridization, primers for sequencing, or primers for DNA amplification. Three main factors that determine the quality of a probe are considered: stability of the duplex formed between the probe and target nucleic acid, specificity of the probe for the intended target sequence, and self-complementarity. DNA duplex stability calculations are based on the nearest-neighbor thermodynamic values determined by Breslauer et al. [Proc. Natl. Acad. Sci. U.S.A. (1986), 83: 3746]. Temperatures of duplex dissociation predicted by the method described here were within 0.4 degrees C of the values obtained experimentally for ten oligonucleotides. Calculations for specificity of the probe and its self-complementarity are based on a simple dynamic algorithm.
Affiliation(s)
- W Rychlik
- Department of Biochemistry, University of Kentucky, Lexington 40536
|
42
|
Pace NR, Smith DK, Olsen GJ, James BD. Phylogenetic comparative analysis and the secondary structure of ribonuclease P RNA--a review. Gene 1989; 82:65-75. [PMID: 2479592 DOI: 10.1016/0378-1119(89)90031-0] [Citation(s) in RCA: 88] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023]
Abstract
The most incisive a priori approach to inferring the higher order structure of large RNAs has proven to be the use of phylogenetic comparisons. This article provides guidelines to the method, using as an illustration the elucidation of the secondary structure of the catalytic RNA subunit of ribonuclease P (RNase P). The resultant structure is compared to the possibilities that are predicted thermodynamically for the RNase P RNA sequences of nine eubacteria.
Affiliation(s)
- N R Pace
- Department of Biology, Indiana University, Bloomington 47405
|
43
|
Jaeger JA, Turner DH, Zuker M. Improved predictions of secondary structures for RNA. Proc Natl Acad Sci U S A 1989; 86:7706-10. [PMID: 2479010 PMCID: PMC298139 DOI: 10.1073/pnas.86.20.7706] [Citation(s) in RCA: 625] [Impact Index Per Article: 17.9] [Indexed: 01/01/2023] Open
Abstract
The accuracy of computer predictions of RNA secondary structure from sequence data and free energy parameters has been increased to roughly 70%. Performance is judged by comparison with structures known from phylogenetic analysis. The algorithm also generates suboptimal structures. On average, the best structure within 10% of the lowest free energy contains roughly 90% of phylogenetically known helixes. The algorithm does not include tertiary interactions or pseudoknots and employs a crude model for single-stranded regions. The only favorable interactions are base pairing and stacking of terminal unpaired nucleotides at the ends of helixes. The excellent performance is consistent with these interactions being the primary interactions determining RNA secondary structure.
Affiliation(s)
- J A Jaeger
- Department of Chemistry, University of Rochester, NY 14627
|
44
|
Le SY, Nussinov R, Maizel JV. Tree graphs of RNA secondary structures and their comparisons. Computers and Biomedical Research, an International Journal 1989; 22:461-73. [PMID: 2776449 DOI: 10.1016/0010-4809(89)90039-6] [Citation(s) in RCA: 109] [Impact Index Per Article: 3.1] [Indexed: 01/02/2023]
Abstract
To facilitate comparison of RNA secondary structures each structure is represented as an ordered labeled tree. Several alternate secondary structures yielding a set of trees can be computed for any given RNA molecule (sequence). Frequently recurring subtrees are searched in this set of trees. The consensus structure motifs are then selected and used to construct a secondary structure model of the RNA. Given the difficulties involved in RNA secondary structure calculations, this procedure may significantly improve our predictive capabilities. In addition, the change of secondary structures between two different RNA sequences is described as a transformation of ordered trees. The transferable ratio of tree A from tree B is defined as a proportion of the largest common subtrees in trees A and B occurring in tree A. The method is applied to the study of the mechanism of human alpha 1 globin pre-mRNA splicing. In the study, two tentative splicing mechanisms, A and B, with different orders of intron excision from alpha 1 globin pre-mRNA have been simulated. A possible relationship between the structural features of the secondary structures and the order of intron excision in the pathway of precursor splicing of human alpha 1 globin is discussed.
Affiliation(s)
- S Y Le
- Division of Cancer Biology and Diagnosis, National Cancer Institute, Frederick, Maryland 21701
|
45
|
Abstract
The three-dimensional structures adopted by RNA molecules are crucial to their biological functions. The nucleotides of an RNA molecule interact to form characteristic secondary-structure motifs. Tertiary interactions orient these secondary-structure elements with respect to each other to form the functional RNA. Here we describe the basic structural elements with special emphasis on a novel tertiary motif, the pseudoknot.
|
46
|
Felici F, Cesareni G, Hughes JM. The most abundant small cytoplasmic RNA of Saccharomyces cerevisiae has an important function required for normal cell growth. Mol Cell Biol 1989; 9:3260-8. [PMID: 2477683 PMCID: PMC362370 DOI: 10.1128/mcb.9.8.3260-3268.1989] [Citation(s) in RCA: 62] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023] Open
Abstract
The most abundant RNA visible between 5.8S and 18S rRNA on an ethidium bromide-stained gel of total Saccharomyces cerevisiae RNA has an apparent size of about 600 nucleotides. By purifying the band and using it as a probe to screen a genomic library, we isolated and sequenced the unique gene for this RNA. The transcribed sequence, determined to be 519 nucleotides long, contains elements typical of RNA polymerase III transcription. The RNA is predominantly cytoplasmic, so we called it small cytoplasmic RNA 1 (scR1). ScR1 is neither 3'-polyadenylated nor 5'-trimethylguanosine capped. We constructed a null mutation of the gene by deleting 252 base pairs from the transcribed region. Haploid strains carrying the scr1-delta lesion grew very slowly, segregated cytoplasmic petites (rho-) at high frequency, and showed signs of aberrant cell division. A secondary structure model for scR1 shows some of the conserved features of the signal recognition particle 7SL RNAs.
Affiliation(s)
- F Felici
- European Molecular Biology Laboratory, Heidelberg, Federal Republic of Germany
|
47
|
Benedetti G, De Santis P, Morosetti S. A new method to find a set of energetically optimal RNA secondary structures. Nucleic Acids Res 1989; 17:5149-61. [PMID: 2474795 PMCID: PMC318102 DOI: 10.1093/nar/17.13.5149] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.2] [Indexed: 01/01/2023] Open
Abstract
We present a computer method to determine nucleic acid secondary structures. It is based on three steps: 1) the search for all possible helical regions, relying on a mathematical approach derived from the convolution theorem; it uses a tetradimensional complex vector representation of the bases along the sequence; 2) a 'tree' search for a set of minimum free energy structures, by the aid of an approximate energy evaluation to reduce the computer time requirements; 3) the exact calculation and refinement of the energies. A method to introduce the experimental data and reach an arrangement between them and the free energy minimization criterion is shown. In order to demonstrate the confidence of the program a test on four RNA sequences is performed. The method has a computer time requirement proportional to N², where N is the length of the sequence, and retrieves a set of optimal free energy structures.
Affiliation(s)
- G Benedetti
- Department of Chemistry, University of Rome, Italy
|
48
|
Kwakman JH, Konings D, Pel HJ, Grivell LA. Structure-function relationships in a self-splicing group II intron: a large part of domain II of the mitochondrial intron aI5 is not essential for self-splicing. Nucleic Acids Res 1989; 17:4205-16. [PMID: 2472604 PMCID: PMC317929 DOI: 10.1093/nar/17.11.4205] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023] Open
Abstract
An oligonucleotide-directed deletion of 156 nucleotides has been introduced into the yeast mitochondrial group II intron aI5 (887 nt). The deletion comprises almost all of domain II, which is one of the six phylogenetically conserved structural elements of group II introns. This mutant displays reduced self-splicing activity, but results of chemical probing with dimethylsulphate suggest that sequences at the site of the deletion interfere with the normal folding of the intron. This is supported by computer analyses, which predict a number of alternative structures involving conserved intron sequences. Splicing activity could be restored by insertion of a 10-nucleotide palindromic sequence into the unique SmaI site of the deletion mutant, resulting in the formation of a small stable stem-loop element at the position of domain II. These results provide a direct correlation between folding of the RNA and its activity. We conclude that at least a large part of domain II of the group II intron aI5 is not required for self-splicing activity. This deletion mutant with a length of 731 nucleotides represents the smallest self-splicing group II intron so far known.
Affiliation(s)
- J H Kwakman
- Department of Molecular Cell Biology, University of Amsterdam, The Netherlands
|
49
|
Konings DA, Hogeweg P. Pattern analysis of RNA secondary structure similarity and consensus of minimal-energy folding. J Mol Biol 1989; 207:597-614. [PMID: 2474658 DOI: 10.1016/0022-2836(89)90468-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.1] [Indexed: 01/01/2023]
Abstract
We describe an automated procedure to search for consensus structures or substructures in a set of homologous or related RNA molecules. The procedure is based on the calculation of optimal and sub-optimal secondary structures using thermodynamic rules for base-pairing by energy-minimization. A linear representation of the secondary structures of the related RNAs is used so that they can be compared and classified using standard alignment and clustering programs. We illustrate the method by means of two sets of homologous small RNAs, U2 and U3, and a set of alpha-globin mRNAs and show that biologically interesting consensus structures are obtained.
Affiliation(s)
- D A Konings
- European Molecular Biology Laboratory, Heidelberg, F.R.G
|
50
|
Zhang Y, Dolph PJ, Schneider RJ. Secondary Structure Analysis of Adenovirus Tripartite Leader. J Biol Chem 1989. [DOI: 10.1016/s0021-9258(18)81676-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022] Open
|