1
|
Voß B. Classified Dynamic Programming in RNA Structure Analysis. Methods Mol Biol 2024; 2726:125-141. [PMID: 38780730 DOI: 10.1007/978-1-0716-3519-3_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Analysis of the folding space of RNA generally suffers from its exponential size. With classified Dynamic Programming algorithms, it is possible to alleviate this burden and to analyse the folding space of RNA in great depth. Key to classified DP is that the search space is partitioned into classes based on an on-the-fly computed feature. A class-wise evaluation is then used to compute class-wide properties, such as the lowest free energy structure for each class, or aggregate properties, such as the class' probability. In this paper we describe the well-known shape and hishape abstraction of RNA structures, their power to help better understand RNA function and related methods that are based on these abstractions.
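The class-wise evaluation described in this abstract can be illustrated on an explicit list of structures. The sketch below is not a classified DP algorithm: a real implementation computes these quantities inside the recurrences without enumerating structures. It only shows what is reported per class. The function name `evaluate_classes`, the input layout, and the toy energies are assumptions made for illustration.

```python
import math
from collections import defaultdict

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees Celsius).
RT = 0.0019872 * 310.15


def evaluate_classes(structures):
    """Group structures by class and report per-class results.

    `structures` is an iterable of (class_key, structure, energy) tuples,
    with energies in kcal/mol. The class key (for example a shape string)
    is assumed to be supplied by the caller.
    """
    best = {}                    # class -> (structure, energy) of lowest energy
    weight = defaultdict(float)  # class -> summed Boltzmann weight
    for key, structure, energy in structures:
        weight[key] += math.exp(-energy / RT)
        if key not in best or energy < best[key][1]:
            best[key] = (structure, energy)
    z = sum(weight.values())     # partition function over the listed structures
    return {
        key: {
            "representative": best[key][0],
            "energy": best[key][1],
            "probability": weight[key] / z,
        }
        for key in best
    }


if __name__ == "__main__":
    toy = [
        ("[]", "((...))...", -3.0),
        ("[]", "(.....)...", -1.0),
        ("[][]", "(...)(...)", -2.0),
    ]
    for key, result in evaluate_classes(toy).items():
        print(key, result)
```

The class probabilities are normalised only over the structures passed in, so they approximate ensemble probabilities only if the list covers the relevant part of the folding space.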
Affiliation(s)
- Björn Voß
- RNA Biology and Bioinformatics, Institute of Biomedical Genetics, University of Stuttgart, Stuttgart, Germany
|
2
|
Martin NS, Ahnert SE. Insertions and deletions in the RNA sequence-structure map. J R Soc Interface 2021; 18:20210380. [PMID: 34610259 PMCID: PMC8492174 DOI: 10.1098/rsif.2021.0380] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/08/2021] [Accepted: 09/13/2021] [Indexed: 12/21/2022]
Abstract
Genotype-phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence-structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
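As a rough illustration of genotypic robustness to the three mutation types, the sketch below enumerates the distinct one-step neighbours of a sequence and reports the fraction that keeps the same phenotype. The `phenotype` argument is a placeholder: in the setting of this abstract it would map a sequence to the abstract shape of its predicted fold, but no folding routine is included here. Removing duplicate neighbours with sets is an assumption of this sketch, not necessarily the convention of the paper.

```python
from typing import Callable, Hashable

ALPHABET = "ACGU"


def neighbours(seq: str) -> dict[str, set[str]]:
    """Return the distinct one-step mutants of `seq`, grouped by mutation type."""
    substitutions = {
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in ALPHABET
        if base != seq[i]
    }
    deletions = {seq[:i] + seq[i + 1:] for i in range(len(seq))}
    insertions = {
        seq[:i] + base + seq[i:]
        for i in range(len(seq) + 1)
        for base in ALPHABET
    }
    return {
        "substitution": substitutions,
        "deletion": deletions,
        "insertion": insertions,
    }


def robustness(seq: str, phenotype: Callable[[str], Hashable]) -> dict[str, float]:
    """Fraction of neighbours of each type that share the phenotype of `seq`."""
    reference = phenotype(seq)
    return {
        kind: sum(phenotype(mutant) == reference for mutant in mutants) / len(mutants)
        for kind, mutants in neighbours(seq).items()
        if mutants
    }


if __name__ == "__main__":
    # Toy phenotype used only to exercise the code: does the sequence contain a G?
    print(robustness("ACGU", lambda s: "G" in s))
```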
Affiliation(s)
- Nora S. Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
- Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
- Sebastian E. Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
- The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
|
3
|
Manzourolajdad A, Spouge JL. Structural prediction of RNA switches using conditional base-pair probabilities. PLoS One 2019; 14:e0217625. [PMID: 31188853 PMCID: PMC6561571 DOI: 10.1371/journal.pone.0217625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/10/2018] [Accepted: 05/15/2019] [Indexed: 11/23/2022]
Abstract
An RNA switch triggers biological functions by toggling between two conformations. RNA switches include bacterial riboswitches, where ligand binding can stabilize a bound structure. For RNAs with only one stable structure, structural prediction usually just requires a straightforward free energy minimization, but for an RNA switch, the prediction of a less stable alternative structure is often computationally costly and even problematic. The current sampling-clustering method predicts stable and alternative structures by partitioning structures sampled from the energy landscape into two clusters, but it is very time-consuming. Instead, we predict the alternative structure of an RNA switch from conditional probability calculations within the energy landscape. First, our method excludes base pairs related to the most stable structure in the energy landscape. Then, it detects stable stems (“seeds”) in the remaining landscape. Finally, it folds an alternative structure prediction around a seed. While having comparable riboswitch classification performance, the conditional-probability computations had fewer adjustable parameters, offered greater predictive flexibility, and were more than one thousand times faster than the sampling step alone in sampling-clustering predictions, the competing standard. Overall, the described approach helps traverse thermodynamically improbable energy landscapes to find biologically significant substructures and structures rapidly and effectively.
Affiliation(s)
- Amirhossein Manzourolajdad
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- John L. Spouge
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
|
4
|
Lin L, McKerrow WH, Richards B, Phonsom C, Lawrence CE. Characterization and visualization of RNA secondary structure Boltzmann ensemble via information theory. BMC Bioinformatics 2018; 19:82. [PMID: 29506466 PMCID: PMC5836418 DOI: 10.1186/s12859-018-2078-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/15/2017] [Accepted: 02/20/2018] [Indexed: 12/26/2022]
Abstract
Background The nearest neighbor model and associated dynamic programming algorithms allow for the efficient estimation of the RNA secondary structure Boltzmann ensemble. However because a given RNA secondary structure only contains a fraction of the possible helices that could form from a given sequence, the Boltzmann ensemble is multimodal. Several methods exist for clustering structures and finding those modes. However less focus is given to exploring the underlying reasons for this multimodality: the presence of conflicting basepairs. Information theory, or more specifically mutual information, provides a method to identify those basepairs that are key to the secondary structure. Results To this end we find most informative basepairs and visualize the effect of these basepairs on the secondary structure. Knowing whether a most informative basepair is present tells us not only the status of the particular pair but also provides a large amount of information about which other pairs are present or not present. We find that a few basepairs account for a large amount of the structural uncertainty. The identification of these pairs indicates small changes to sequence or stability that will have a large effect on structure. Conclusion We provide a novel algorithm that uses mutual information to identify the key basepairs that lead to a multimodal Boltzmann distribution. We then visualize the effect of these pairs on the overall Boltzmann ensemble.
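To make the role of mutual information concrete, the sketch below computes the mutual information (in bits) between the presence of two base pairs across a sample of structures. It is a simplified pairwise illustration rather than the algorithm of the paper: the sample is assumed to be an equally weighted list of dot-bracket strings, and the helper names `base_pairs` and `mutual_information` are hypothetical.

```python
import math
from collections import Counter


def base_pairs(dot_bracket: str) -> frozenset:
    """Return the set of (i, j) base pairs of a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return frozenset(pairs)


def mutual_information(sample, pair_a, pair_b) -> float:
    """Mutual information in bits between the presence of two base pairs.

    `sample` is a non-empty list of base-pair sets, each counted with equal weight.
    """
    n = len(sample)
    joint = Counter((pair_a in s, pair_b in s) for s in sample)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = sum(c for (x, _), c in joint.items() if x == a) / n
        p_b = sum(c for (_, y), c in joint.items() if y == b) / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Two conflicting pairs: knowing one is present rules out the other.
    sample = [base_pairs("(...)...."), base_pairs("....(...)")]
    print(mutual_information(sample, (0, 4), (4, 8)))
```

In this toy sample the two pairs share position 4 and never occur together, so observing one fully determines the other. That is the kind of conflict the abstract identifies as the source of multimodality.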
Affiliation(s)
- Luan Lin
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, 20993, MD, USA
- Wilson H McKerrow
- Division of Applied Mathematics, Brown University, Providence, 02912, RI, USA
- Chukiat Phonsom
- Department of Mathematics, University of Southern California, Los Angeles, 90089, CA, USA
- Charles E Lawrence
- Division of Applied Mathematics, Brown University, Providence, 02912, RI, USA
|
5
|
Abstract
RNA family models describe classes of functionally related, non-coding RNAs based on sequence and structure conservation. The most important method for modeling RNA families is the use of covariance models, which are stochastic models that serve in the discovery of yet unknown, homologous RNAs. However, the performance of covariance models in finding remote homologs is poor for RNA families with high sequence conservation, while for families with high structure but low sequence conservation, these models are difficult to build in the first place. A complementary approach to RNA family modeling involves the use of thermodynamic matchers. Thermodynamic matchers are RNA folding programs, based on the established thermodynamic model, but tailored to a specific structural motif. As thermodynamic matchers focus on structure and folding energy, they unfold their potential in discovering homologs when high structure conservation is paired with low sequence conservation. In contrast to covariance models, construction of thermodynamic matchers does not require an input alignment, but requires human design decisions and experimentation, and hence, model construction is more laborious. Here we report a case study on an RNA family that was constructed by means of thermodynamic matchers. It starts from a set of known but structurally different members of the same RNA family. The consensus secondary structure of this family consists of 2 to 4 adjacent hairpins. Each hairpin loop carries the same motif, CCUCCUCCC, while the stems show high variability in their nucleotide content. The present study (1) describes a novel approach for the integration of the structurally varying family into a single RNA family model by means of the thermodynamic matcher methodology, and (2) provides the results of homology searches that were conducted with this model in a wide spectrum of bacterial species.
Key Words
- CIN, conserved intergenic neighborhood
- CM, covariance model
- HMM, hidden Markov model
- MFE, minimum free energy
- OG, orthologous group of genes
- RBS, ribosome binding site
- RFM, RNA family model
- TDM, thermodynamic matcher
- aSD, anti Shine-Dalgarno
- alphaproteobacteria
- cuckoo RNA
- dRNA-seq, differential RNA sequencing
- family model
- homology search
- sRNA, small non-coding RNA
- small RNA
- structural RNA
- thermodynamic matcher
Affiliation(s)
- Jan Reinkensmeier
- a Universität Bielefeld ; Technische Fakultät and Center of Biotechnology ; Bielefeld , Germany
|
6
|
Hinton A, Hunter SE, Afrikanova I, Jones GA, Lopez AD, Fogel GB, Hayek A, King CC. sRNA-seq analysis of human embryonic stem cells and definitive endoderm reveals differentially expressed microRNAs and novel IsomiRs with distinct targets. Stem Cells 2015; 32:2360-72. [PMID: 24805944 DOI: 10.1002/stem.1739] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 06/13/2012] [Revised: 03/30/2014] [Accepted: 04/09/2014] [Indexed: 11/06/2022]
Abstract
MicroRNAs (miRNAs) are noncoding, regulatory RNAs expressed dynamically during differentiation of human embryonic stem cells (hESCs) into defined lineages. Mapping developmental expression of miRNAs during transition from pluripotency to definitive endoderm (DE) should help to elucidate the mechanisms underlying lineage specification and ultimately enhance differentiation protocols. In this report, next generation sequencing was used to build upon our previous analysis of miRNA expression in human hESCs and DE. From millions of sequencing reads, 747 and 734 annotated miRNAs were identified in pluripotent and DE cells, respectively, including 77 differentially expressed miRNAs. Among these, four of the top five upregulated miRNAs were previously undetected in DE. Furthermore, the stem-loop for miR-302a, an important miRNA for both hESCs self-renewal and endoderm specification, produced several highly expressed miRNA species (isomiRs). Overall, isomiRs represented >10% of sequencing reads in >40% of all detected stem-loop arms, suggesting that the impact of these abundant miRNA species may have been overlooked in previous studies. Because of their relative abundance, the role of differential isomiR targeting was studied using the miR-302 cluster as a model system. A miRNA mimetic for miR-302a-5p, but not miR-302a-5p(+3), decreased expression of orthodenticle homeobox 2 (OTX2). Conversely, isomiR 302a-5p(+3) selectively decreased expression of tuberous sclerosis protein 1, but not OTX2, indicating nonoverlapping specificity of miRNA processing variants. Taken together, our characterization of miRNA expression, which includes novel miRNAs and isomiRs, helps establish a foundation for understanding the role of miRNAs in DE formation and selective targeting by isomiRs.
Affiliation(s)
- Andrew Hinton
- Pediatric Diabetes Research Center, University of California, San Diego, La Jolla, California, USA
|
7
|
Li C, Zhang Y, Li J, Kong L, Hu H, Pan H, Xu L, Deng Y, Li Q, Jin L, Yu H, Chen Y, Liu B, Yang L, Liu S, Zhang Y, Lang Y, Xia J, He W, Shi Q, Subramanian S, Millar CD, Meader S, Rands CM, Fujita MK, Greenwold MJ, Castoe TA, Pollock DD, Gu W, Nam K, Ellegren H, Ho SYW, Burt DW, Ponting CP, Jarvis ED, Gilbert MTP, Yang H, Wang J, Lambert DM, Wang J, Zhang G. Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment. Gigascience 2014; 3:27. [PMID: 25671092 PMCID: PMC4322438 DOI: 10.1186/2047-217x-3-27] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 09/01/2014] [Accepted: 11/06/2014] [Indexed: 02/01/2023]
Abstract
BACKGROUND Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. RESULTS Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. CONCLUSIONS Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.
Affiliation(s)
- Cai Li
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
- Yong Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Jianwen Li
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Lesheng Kong
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX UK
- Haofu Hu
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Hailin Pan
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Luohao Xu
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Yuan Deng
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Qiye Li
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
- Lijun Jin
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Hao Yu
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Yan Chen
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Binghang Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Linfeng Yang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Shiping Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Yan Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Yongshan Lang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Jinquan Xia
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Weiming He
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Qiong Shi
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Sankar Subramanian
- Environmental Futures Centre, Griffith University, Nathan, QLD 4111 Australia
- Craig D Millar
- Allan Wilson Centre for Molecular Ecology and Evolution, School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
- Stephen Meader
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX UK
- Chris M Rands
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX UK
- Matthew K Fujita
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX UK
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019 USA
- Matthew J Greenwold
- Department of Biological Sciences, University of South Carolina, Columbia, SC USA
- Todd A Castoe
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, CO 80045 USA
- Biology Department, University of Texas Arlington, Arlington, TX 76016 USA
- David D Pollock
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, CO 80045 USA
- Wanjun Gu
- Research Centre of Learning Sciences, Southeast University, Nanjing, 210096 China
- Kiwoong Nam
- Department of Evolutionary Biology, Uppsala University, Norbyvagen 18D, SE-752 36 Uppsala, Sweden
- Bioinformatics Research Centre (BiRC), Aarhus University, C.F.Møllers Allé 8, 8000 Aarhus C, Denmark
- Hans Ellegren
- Department of Evolutionary Biology, Uppsala University, Norbyvagen 18D, SE-752 36 Uppsala, Sweden
- Simon YW Ho
- School of Biological Sciences, University of Sydney, Sydney, NSW 2006 Australia
- David W Burt
- Department of Genomics and Genetics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus Midlothian, Edinburgh, EH25 9RG UK
- Chris P Ponting
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX UK
- Erich D Jarvis
- Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham, NC27710 USA
- M Thomas P Gilbert
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
- Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, WA 6102 Australia
- Huanming Yang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Jian Wang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- David M Lambert
- Environmental Futures Centre, Griffith University, Nathan, QLD 4111 Australia
- Jun Wang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
- Macau University of Science and Technology, Avenida Wai long, Taipa, Macau, 999078 China
- Department of Medicine, University of Hong Kong, Hong Kong, Hong Kong
- Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Centre for Social Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, Copenhagen, DK-2100 Denmark
|
8
|
Abstract
Abstract shape analysis is a method to learn more about the complete Boltzmann ensemble of the secondary structures of a single RNA molecule. Abstract shapes classify competing secondary structures into classes that are defined by their arrangement of helices. It allows us to compute, in addition to the structure of minimal free energy, a set of structures that represents relevant and interesting structural alternatives. Furthermore, it allows us to compute probabilities of all structures within a shape class. This lets us ensure that our representative subset covers the complete Boltzmann ensemble, except for a portion of negligible probability. This chapter explains the main functions of abstract shape analysis, as implemented in the tool RNAshapes. It reports on some other types of analysis that are based on the abstract shapes idea and shows how you can solve novel problems by creating your own shape abstractions.
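The idea of classifying structures by their arrangement of helices can be sketched for the most abstract case, in which unpaired bases are ignored and helices interrupted by bulges or internal loops count as one helix. The function below is a minimal illustration assuming pseudoknot-free dot-bracket input. The name `abstract_shape` is hypothetical, and the code is not taken from the tool described in the abstract.

```python
def abstract_shape(dot_bracket: str) -> str:
    """Map a dot-bracket structure to a helix-nesting shape string.

    Unpaired bases are dropped, and chains of singly nested base pairs
    (stacks, bulges, internal loops) collapse into one pair of brackets.
    An unpaired structure is written as "_".
    """
    # Build a tree in which every base pair is a list of the pairs nested in it.
    root = []
    stack = [root]
    for ch in dot_bracket:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            stack.pop()
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure")

    def render(children):
        parts = []
        for node in children:
            # Skip through pairs that enclose exactly one further pair.
            while len(node) == 1:
                node = node[0]
            parts.append("[" + render(node) + "]")
        return "".join(parts)

    return render(root) or "_"


if __name__ == "__main__":
    for structure in ["((.((...)).))", "((..((...))..((...))..))", "(((...)))...((..))"]:
        print(structure, abstract_shape(structure))
```

Structures that differ only in helix lengths or loop sizes map to the same string, which is what makes the shape usable as a class key.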
|
9
|
Abstract
MicroRNAs (miRNAs) have attracted ever-increasing interest in recent years. Since experimental approaches for determining miRNAs are nontrivial in their application, computational methods for the prediction of miRNAs have gained popularity. Such methods can be grouped into two broad categories (1) performing ab initio predictions of miRNAs from primary sequence alone and (2) additionally employing phylogenetic conservation. Most methods acknowledge the importance of hairpin or stem-loop structures and employ various methods for the prediction of RNA secondary structure. Machine learning has been employed in both categories with classification being the predominant method. In most cases, positive and negative examples are necessary for performing classification. Since it is currently elusive to experimentally determine all possible miRNAs for an organism, true negative examples are hard to come by, and therefore the accuracy assessment of algorithms is hampered. In this chapter, first RNA secondary structure prediction is introduced since it provides a basis for miRNA prediction. This is followed by an assessment of homology and then ab initio miRNA prediction methods.
Affiliation(s)
- Jens Allmer
- Molecular Biology and Genetics, Izmir Institute of Technology, Izmir, Turkey
|
10
|
Achawanantakun R, Sun Y. Shape and secondary structure prediction for ncRNAs including pseudoknots based on linear SVM. BMC Bioinformatics 2013; 14 Suppl 2:S1. [PMID: 23369147 PMCID: PMC3549817 DOI: 10.1186/1471-2105-14-s2-s1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Background Accurate secondary structure prediction provides important information for understanding the tertiary structures and thus the functions of ncRNAs. However, the accuracy of the native structure derivation of ncRNAs is still not satisfactory, especially on sequences containing pseudoknots. It was recently shown that using abstract shapes, which retain adjacency and nesting of structural features but disregard the length details of helix and loop regions, can improve the performance of structure prediction. In this work, we use SVM-based feature selection to derive the consensus abstract shape of homologous ncRNAs and apply the predicted shape to structure prediction including pseudoknots. Results Our approach was applied to predict shapes and secondary structures on hundreds of ncRNA data sets with and without pseudoknots. The experimental results show that we can achieve 18% higher accuracy in shape prediction than the state-of-the-art consensus shape prediction tools. Using predicted shapes in structure prediction allows us to achieve approximately 29% higher sensitivity and 10% higher positive predictive value than other pseudoknot prediction tools. Conclusions Extensive analysis of RNA properties based on SVM allows us to identify important properties of sequences and structures related to their shapes. The combination of mass data analysis and SVM-based feature selection makes our approach a promising method for shape and structure prediction. The implemented tools, KnotShape and KnotStructure, are open source software and can be downloaded at: http://www.cse.msu.edu/~achawana/KnotShape.
Affiliation(s)
- Rujira Achawanantakun
- Department of Computer Science and Engineering, Michigan State University, Michigan, USA
|
11
|
BRASERO: A Resource for Benchmarking RNA Secondary Structure Comparison Algorithms. Adv Bioinformatics 2012; 2012:893048. [PMID: 22675348 PMCID: PMC3366197 DOI: 10.1155/2012/893048] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 09/28/2011] [Accepted: 02/22/2012] [Indexed: 11/23/2022]
Abstract
The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets.
|
12
|
Janssen S, Schudoma C, Steger G, Giegerich R. Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction. BMC Bioinformatics 2011; 12:429. [PMID: 22051375 PMCID: PMC3293930 DOI: 10.1186/1471-2105-12-429] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 06/10/2011] [Accepted: 11/03/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, they enumerate near-optimal structures, they compute base pair probabilities and dot plots, representative structures of different abstract shapes, or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis. RESULTS We extract four different models of the thermodynamic folding space which underlie the programs RNAFOLD, RNASHAPES, and RNASUBOPT. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences. CONCLUSIONS We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similar enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. On the side, we observe that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. 
We provide implementations of all four models, written in a declarative style that makes them easy to modify. Future work on thermodynamic RNA folding can choose a model based on our empirical data and take our implementations as a starting point for further program development.
Affiliation(s)
- Stefan Janssen
- Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany
|
13
|
Wiebe NJP, Meyer IM. TRANSAT-- method for detecting the conserved helices of functional RNA structures, including transient, pseudo-knotted and alternative structures. PLoS Comput Biol 2010; 6:e1000823. [PMID: 20589081 PMCID: PMC2891591 DOI: 10.1371/journal.pcbi.1000823] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 11/28/2009] [Accepted: 05/19/2010] [Indexed: 12/20/2022]
Abstract
The prediction of functional RNA structures has attracted increased interest, as it allows us to study the potential functional roles of many genes. RNA structure prediction methods, however, assume that there is a unique functional RNA structure and also do not predict functional features required for in vivo folding. In order to understand how functional RNA structures form in vivo, we require sophisticated experiments or reliable prediction methods. So far, there exist only a few experimentally validated transient RNA structures. On the computational side, there exist several computer programs which aim to predict the co-transcriptional folding pathway in vivo, but these make a range of simplifying assumptions and do not capture all features known to influence RNA folding in vivo. We want to investigate if evolutionarily related RNA genes fold in a similar way in vivo. To this end, we have developed a new computational method, Transat, which detects conserved helices of high statistical significance. We introduce the method, present a comprehensive performance evaluation and show that Transat is able to predict the structural features of known reference structures including pseudo-knotted ones as well as those of known alternative structural configurations. Transat can also identify unstructured sub-sequences bound by other molecules and provides evidence for new helices which may define folding pathways, supporting the notion that homologous RNA sequences not only assume a similar reference RNA structure, but also fold similarly. Finally, we show that the structural features predicted by Transat differ from those assuming thermodynamic equilibrium. Unlike the existing methods for predicting folding pathways, our method works in a comparative way. This has the disadvantage of not being able to predict features as a function of time, but has the considerable advantage of highlighting conserved features and of not requiring a detailed knowledge of the cellular environment.
Many non-coding genes exert their function via an RNA structure which starts emerging while the RNA sequence is being transcribed from the genome. The resulting folding pathway is known to depend on a variety of features such as the transcription speed, the concentration of various ions and the binding of proteins and other molecules. Not all of these influences can be adequately captured by the existing computational methods which try to replicate what happens in vivo. So far, it has been challenging to experimentally investigate co-transcriptional folding pathways in vivo, and little data from in vitro experiments exists. In order to investigate if functionally similar RNA sequences from different organisms fold in a similar way, we have developed a new computational method, called Transat, which does not require the detailed computational modeling of the cellular environment. We show in a comprehensive analysis that our method is capable of detecting known structural features and provide evidence that structural features of the in vivo folding pathways have been conserved for several biologically interesting classes of RNA sequences.
Affiliation(s)
- Nicholas J. P. Wiebe
- Centre for High-Throughput Biology & Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Irmtraud M. Meyer
- Centre for High-Throughput Biology & Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
14
|
Cho HH, Cahill CM, Vanderburg CR, Scherzer CR, Wang B, Huang X, Rogers JT. Selective translational control of the Alzheimer amyloid precursor protein transcript by iron regulatory protein-1. J Biol Chem 2010; 285:31217-32. [PMID: 20558735 DOI: 10.1074/jbc.m110.149161] [Citation(s) in RCA: 130] [Impact Index Per Article: 9.3] [Indexed: 01/13/2023] Open
Abstract
Iron influx increases the translation of the Alzheimer amyloid precursor protein (APP) via an iron-responsive element (IRE) RNA stem loop in its 5'-untranslated region. Equal modulated interaction of the iron regulatory proteins (IRP1 and IRP2) with canonical IREs controls iron-dependent translation of the ferritin subunits. However, our immunoprecipitation RT-PCR and RNA binding experiments demonstrated that IRP1, but not IRP2, selectively bound the APP IRE in human neural cells. This selective IRP1 interaction pattern was evident in human brain and blood tissue from normal and Alzheimer disease patients. We computer-predicted an optimal novel RNA stem loop structure for the human, rhesus monkey, and mouse APP IREs with reference to the canonical ferritin IREs but also the IREs encoded by erythroid heme biosynthetic aminolevulinate synthase and Hif-2α mRNAs, which preferentially bind IRP1. Selective 2'-hydroxyl acylation analyzed by primer extension analysis was consistent with a 13-base single-stranded terminal loop and a conserved GC-rich stem. Biotinylated RNA probes deleted of the conserved CAGA motif in the terminal loop did not bind to IRP1 relative to wild type probes and could no longer base pair to form a predicted AGA triloop. An AGU pseudo-triloop is key for IRP1 binding to the canonical ferritin IREs. RNA probes encoding the APP IRE stem loop exhibited the same high affinity binding to rhIRP1 as occurs for the H-ferritin IRE (35 pM). Intracellular iron chelation increased binding of IRP1 to the APP IRE, decreasing intracellular APP expression in SH-SY5Y cells. Functionally, shRNA knockdown of IRP1 caused increased expression of neural APP consistent with IRP1-APP IRE-driven translation.
Affiliation(s)
- Hyun-Hee Cho
- Neurochemistry Laboratory, Department of Psychiatry-Neuroscience, Massachusetts General Hospital, Harvard Medical School, Charlestown, Massachusetts 02129, USA