1
Bose N, Moore SD. Variable Region Sequences Influence 16S rRNA Performance. Microbiol Spectr 2023; 11:e0125223. [PMID: 37212673 PMCID: PMC10269663 DOI: 10.1128/spectrum.01252-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Accepted: 05/03/2023] [Indexed: 05/23/2023]
Abstract
16S rRNA gene sequences are commonly analyzed for taxonomic and phylogenetic studies because they contain variable regions that can help distinguish different genera. However, intra-genus distinction using variable region homology is often impossible due to the high overall sequence identities among closely related species, even though some residues may be conserved within respective species. Using a computational method that included the allelic diversity within individual genomes, we discovered that certain Escherichia and Shigella species can be distinguished by a multi-allelic 16S rRNA variable region single nucleotide polymorphism (SNP). To evaluate the performance of 16S rRNAs with altered variable regions, we developed an in vivo system that measures the acceptance and distribution of variant 16S rRNAs into a large pool of natural versions supporting normal translation and growth. We found that 16S rRNAs containing evolutionarily disparate variable regions were underpopulated both in ribosomes and in active translation pools, even for an SNP. Overall, this study revealed that variable region sequences can substantially influence the performance of 16S rRNAs and that this biological constraint can be leveraged to justify refining taxonomic assignments of variable region sequence data.
IMPORTANCE This study reevaluates the notion that 16S rRNA gene variable region sequences are uninformative for intra-genus classification and that single nucleotide variations within them have no consequence to strains that bear them. We demonstrated that the performance of 16S rRNAs in Escherichia coli can be negatively impacted by sequence changes in variable regions, even for single nucleotide changes that are native to closely related Escherichia and Shigella species; thus, biological performance is likely constraining the evolution of variable regions in bacteria. Further, the native nucleotide variations we tested occur in all strains of their respective species and across their multiple 16S rRNA gene copies, suggesting that these species evolved beyond what would be discerned from a consensus sequence comparison. Therefore, this work also reveals that the multiple 16S rRNA gene alleles found in most bacteria can provide more informative phylogenetic and taxonomic detail than a single reference allele.
Affiliation(s)
- Nikhil Bose
- Burnett School of Biomedical Sciences, College of Medicine, University of Central Florida, Orlando, Florida, USA
- Sean D. Moore
- Burnett School of Biomedical Sciences, College of Medicine, University of Central Florida, Orlando, Florida, USA
2
Lasher B, Hendrix DA. bpRNA-align: improved RNA secondary structure global alignment for comparing and clustering RNA structures. RNA 2023; 29:584-595. [PMID: 36759128 PMCID: PMC10159002 DOI: 10.1261/rna.079211.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 01/14/2023] [Indexed: 05/06/2023]
Abstract
Ribonucleic acid (RNA) is a polymeric molecule that is fundamental to biological processes, with structure being more highly conserved than primary sequence and often key to its function. Advances in RNA structure characterization have resulted in an increase in the number of accurate secondary structures. The task of uncovering common RNA structural motifs with a collective function through structural comparison, providing a level of similarity, remains challenging and could be used to improve RNA secondary structure databases and discover new RNA families. In this work, we present a novel secondary structure alignment method, bpRNA-align. bpRNA-align is a customized global structural alignment method, utilizing an inverted (gap extend costs more than gap open) and context-specific affine gap penalty along with a structural, feature-specific substitution matrix to provide similarity scores. We evaluate our similarity scores in comparison to other methods, using affinity propagation clustering, applied to a benchmarking data set of known structure types. bpRNA-align shows improvement in clustering performance over a broad range of structure types.
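The affine gap scoring described in this abstract can be illustrated with a minimal sketch. The Python below is not bpRNA-align itself: it is a generic three-state (Gotoh-style) global alignment over two strings of structure-feature labels, and the function name `affine_global_score`, the match and mismatch scores, and the example label strings are all hypothetical. It leaves out the context-specific penalties and the feature-specific substitution matrix mentioned in the abstract. The one idea it shows is that an "inverted" penalty is simply the case where extending a gap costs more than opening one.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=1.0, mismatch=-1.0,
                        gap_open=-1.0, gap_extend=-2.0):
    """Global alignment score with an affine gap penalty.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    The defaults are "inverted" (extension is more costly than opening);
    these values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap;
    # Y: b[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap from M or Y, or extend an existing gap in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            # Open a new gap from M or X, or extend an existing gap in Y.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # Inverted penalty: two separate single gaps beat one gap of length two.
    print(affine_global_score("AAA", "A", gap_open=-1.0, gap_extend=-2.0))
    # Conventional penalty: one contiguous gap is preferred.
    print(affine_global_score("AAA", "A", gap_open=-2.0, gap_extend=-1.0))
```

With the inverted setting, aligning "AAA" to "A" places the single match in the middle and pays for two length-one gaps, whereas the conventional setting prefers one contiguous gap. This is the behavioral difference that an inverted penalty introduces.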
Affiliation(s)
- Brittany Lasher
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- David A Hendrix
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, USA
3
Forstmeier PC, Meyer MO, Bevilacqua PC. The Functional RNA Identification (FRID) Pipeline: Identification of Potential Pseudoknot-Containing RNA Elements as Therapeutic Targets for SARS-CoV-2. bioRxiv 2023:2023.04.03.535424. [PMID: 37066195 PMCID: PMC10103974 DOI: 10.1101/2023.04.03.535424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2023]
Abstract
The COVID-19 pandemic persists despite the development of effective vaccines. As such, it remains crucial to identify new targets for antiviral therapies. The causative virus of COVID-19, SARS-CoV-2, is a positive-sense RNA virus with RNA structures that could serve as therapeutic targets. One such RNA with established function is the frameshift stimulatory element (FSE), which promotes programmed ribosomal frameshifting. To accelerate identification of additional functional RNA elements, we introduce a novel computational approach termed the Functional RNA Identification (FRID) pipeline. The guiding principle of our pipeline, which uses established component programs as well as customized component programs, is that functional RNA elements have conserved secondary and pseudoknot structures that facilitate function. To assess the presence and conservation of putative functional RNA elements in SARS-CoV-2, we compared over 6,000 SARS-CoV-2 genomic isolates. We identified 22 functional RNA elements from the SARS-CoV-2 genome, 14 of which have conserved pseudoknots and serve as potential targets for small molecule or antisense oligonucleotide therapeutics. The FRID pipeline is general and can be applied to identify pseudoknotted RNAs for targeted therapeutics in genomes or transcriptomes from any virus or organism.
4
Robinson EK, Covarrubias S, Carpenter S. The how and why of lncRNA function: An innate immune perspective. Biochim Biophys Acta Gene Regul Mech 2020; 1863:194419. [PMID: 31487549 PMCID: PMC7185634 DOI: 10.1016/j.bbagrm.2019.194419] [Citation(s) in RCA: 167] [Impact Index Per Article: 41.8] [Received: 05/31/2019] [Accepted: 08/21/2019] [Indexed: 02/06/2023]
Abstract
Next-generation sequencing has provided a more complete picture of the composition of the human transcriptome indicating that much of the "blueprint" is a vastness of poorly understood non-protein-coding transcripts. This includes a newly identified class of genes called long noncoding RNAs (lncRNAs). The lack of sequence conservation for lncRNAs across species meant that their biological importance was initially met with some skepticism. LncRNAs mediate their functions through interactions with proteins, RNA, DNA, or a combination of these. Their functions can often be dictated by their localization, sequence, and/or secondary structure. Here we provide a review of the approaches typically adopted to study the complexity of these genes with an emphasis on recent discoveries within the innate immune field. Finally, we discuss the challenges, as well as the emergence of new technologies that will continue to move this field forward and provide greater insight into the biological importance of this class of genes. This article is part of a Special Issue entitled: ncRNA in control of gene expression edited by Kotb Abdelmohsen.
Affiliation(s)
- Elektra K Robinson
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, United States of America
- Sergio Covarrubias
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, United States of America
- Susan Carpenter
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, United States of America
5
Bernier CR, Petrov AS, Kovacs NA, Penev PI, Williams LD. Translation: The Universal Structural Core of Life. Mol Biol Evol 2019; 35:2065-2076. [PMID: 29788252 PMCID: PMC6063299 DOI: 10.1093/molbev/msy101] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 01/09/2023]
Abstract
The Universal Gene Set of Life (UGSL) is common to genomes of all extant organisms. The UGSL is small, consisting of <100 genes, and is dominated by genes encoding the translation system. Here we extend the search for biological universality to three dimensions. We characterize and quantitate the universality of structure of macromolecules that are common to all of life. We determine that around 90% of prokaryotic ribosomal RNA (rRNA) forms a common core, which is the structural and functional foundation of rRNAs of all cytoplasmic ribosomes. We have established a database, which we call the Sparse and Efficient Representation of the Extant Biology (the SEREB database). This database contains complete and cross-validated rRNA sequences of species chosen, as far as possible, to sparsely and efficiently sample all known phyla. Atomic-resolution structures of ribosomes provide data for structural comparison and validation of sequence-based models. We developed a similarity statistic called pairing adjusted sequence entropy, which characterizes paired nucleotides by their adherence to covariation and unpaired nucleotides by conventional conservation of identity. For canonically paired nucleotides the unit of structure is the nucleotide pair. For unpaired nucleotides, the unit of structure is the nucleotide. By quantitatively defining the common core of rRNA, we systematize the conservation and divergence of the translational system across the tree of life, and can begin to understand the unique evolutionary pressures that cause its universality. We explore the relationship between ribosomal size and diversity, geological time, and organismal complexity.
Affiliation(s)
- Chad R Bernier
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332
- Anton S Petrov
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332
- Nicholas A Kovacs
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332
- Petar I Penev
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332
- Loren Dean Williams
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332
6
Low A, Rodrigue N, Wong A. COMPASS: the COMPletely Arbitrary Sequence Simulator. Bioinformatics 2018; 33:3101-3103. [PMID: 28582485 DOI: 10.1093/bioinformatics/btx347] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/17/2016] [Accepted: 05/31/2017] [Indexed: 11/13/2022]
Abstract
Summary: Simulated sequence alignments are frequently used to test bioinformatics tools, but current sequence simulators are limited to defined state spaces. Here, we present the COMPletely Arbitrary Sequence Simulator (COMPASS), which is able to simulate the evolution of absolutely any discrete state space along a tree, for any form of time-reversible model.
Availability and implementation: COMPASS is implemented in Python 2.7, and is freely available for all platforms with the Supplementary Information, as well as at http://labs.carleton.ca/eme/software-and-data.
Contact: alex_wong@carleton.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrew Low
- Department of Biology, Carleton University, Ottawa, ON, Canada K1S 5B6
- Nicolas Rodrigue
- Department of Biology, Carleton University, Ottawa, ON, Canada K1S 5B6
- Alex Wong
- Department of Biology, Carleton University, Ottawa, ON, Canada K1S 5B6
7
Warner KD, Hajdin CE, Weeks KM. Principles for targeting RNA with drug-like small molecules. Nat Rev Drug Discov 2018; 17:547-558. [PMID: 29977051 PMCID: PMC6420209 DOI: 10.1038/nrd.2018.93] [Citation(s) in RCA: 407] [Impact Index Per Article: 67.8] [Indexed: 12/11/2022]
Abstract
RNA molecules are essential for cellular information transfer and gene regulation, and RNAs have been implicated in many human diseases. Messenger and non-coding RNAs contain highly structured elements, and evidence suggests that many of these structures are important for function. Targeting these RNAs with small molecules offers opportunities to therapeutically modulate numerous cellular processes, including those linked to 'undruggable' protein targets. Despite this promise, there is currently only a single class of human-designed small molecules that target RNA used clinically: the linezolid antibiotics. However, a growing number of small-molecule RNA ligands are being identified, leading to burgeoning interest in the field. Here, we discuss principles for discovering small-molecule drugs that target RNA and argue that the overarching challenge is to identify appropriate target structures, namely, in disease-causing RNAs that have high information content and, consequently, appropriate ligand-binding pockets. If focus is placed on such druggable binding sites in RNA, extensive knowledge of the typical physicochemical properties of drug-like small molecules could then enable small-molecule drug discovery for RNA targets to become (only) roughly as difficult as for protein targets.
Affiliation(s)
- Kevin M Weeks
- Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA
8
Diaz-Toledano R, Lozano G, Martinez-Salas E. In-cell SHAPE uncovers dynamic interactions between the untranslated regions of the foot-and-mouth disease virus RNA. Nucleic Acids Res 2017; 45:1416-1432. [PMID: 28180318 PMCID: PMC5388415 DOI: 10.1093/nar/gkw795] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/23/2016] [Revised: 08/26/2016] [Accepted: 08/29/2016] [Indexed: 12/14/2022]
Abstract
The genome of RNA viruses folds into 3D structures that include long-range RNA–RNA interactions relevant to the control of critical steps of the viral cycle. In particular, initiation of translation driven by the IRES element of foot-and-mouth disease virus is stimulated by the 3′UTR. Here we sought to investigate the RNA local flexibility of the IRES element and the 3′UTR in living cells. The SHAPE reactivity observed in vivo showed statistically significant differences compared to the free RNA, revealing protected or exposed positions within the IRES and the 3′UTR. Importantly, the IRES local flexibility was modified in the presence of the 3′UTR, showing significant protections at residues upstream from the functional start codon. Conversely, the presence of the IRES element in cis altered the 3′UTR local flexibility, leading to an overall enhanced reactivity. Unlike the reactivity changes observed in the IRES element, the SHAPE differences of the 3′UTR were large but not statistically significant, suggesting multiple dynamic RNA interactions. These results were supported by covariation analysis, which predicted IRES-3′UTR conserved helices in agreement with the protections observed by SHAPE probing. Mutational analysis suggested that disruption of one of these interactions could be compensated by alternative base pairings, providing direct evidence for dynamic long-range interactions between these distant elements of the viral genome.
Affiliation(s)
- Rosa Diaz-Toledano
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Nicolas Cabrera 1, Madrid, Spain
- Gloria Lozano
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Nicolas Cabrera 1, Madrid, Spain
- Encarnacion Martinez-Salas
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas - Universidad Autónoma de Madrid, Nicolas Cabrera 1, Madrid, Spain
9
Subtype-specific structural constraints in the evolution of influenza A virus hemagglutinin genes. Sci Rep 2016; 6:38892. [PMID: 27966593 PMCID: PMC5155281 DOI: 10.1038/srep38892] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/15/2016] [Accepted: 11/14/2016] [Indexed: 11/08/2022]
Abstract
The influenza A virus genome consists of eight RNA segments. RNA structures within these segments and complementary (cRNA) and protein-coding mRNAs may play a role in virus replication. Here, conserved putative secondary structures that impose significant evolutionary constraints on the gene segment encoding the surface glycoprotein hemagglutinin (HA) were investigated using available sequence data on tens of thousands of virus strains. Structural constraints were identified by analysis of covariations of nucleotides suggested to be paired by structure prediction algorithms. The significance of covariations was estimated by mutual information calculations and tracing multiple covariation events during virus evolution. Covariation patterns demonstrated that structured domains in HA RNAs were mostly subtype-specific, whereas some structures were conserved in several subtypes. The influence of RNA folding on virus replication was studied by plaque assays of mutant viruses with disrupted structures. The results suggest that over the whole length of the HA segment there are local structured domains which contribute to the virus fitness but individually are not essential for the virus. Existence of subtype-specific structured regions in the segments of the influenza A virus genome is apparently an important factor in virus evolution and reassortment of its genes.
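The mutual information calculation mentioned in this abstract can be sketched for a single pair of alignment columns. The Python below is an illustrative assumption, not the authors' pipeline: the function name `mutual_information` and the toy columns are hypothetical, and it computes only the plain plug-in estimate in bits, without the significance testing or the tracing of covariation events along the phylogeny that the abstract describes.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Plug-in mutual information (in bits) between two alignment columns.

    col_i and col_j hold the nucleotide observed at two positions, one
    character per sequence, in the same sequence order.
    """
    if len(col_i) != len(col_j) or len(col_i) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (x, y), count in freq_ij.items():
        p_xy = count / n
        p_x = freq_i[x] / n
        p_y = freq_j[y] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


if __name__ == "__main__":
    # Compensatory changes (G-C swapped to C-G) give 1 bit.
    print(mutual_information("GGCC", "CCGG"))
    # Fully conserved columns carry no covariation signal: 0 bits.
    print(mutual_information("GGGG", "CCCC"))
```

Note that perfectly conserved base pairs score zero, so mutual information detects covariation rather than pairing as such. This is one reason covariation evidence is usually combined with structure prediction, as the abstract describes.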
10
Weinreb C, Riesselman AJ, Ingraham JB, Gross T, Sander C, Marks DS. 3D RNA and Functional Interactions from Evolutionary Couplings. Cell 2016; 165:963-75. [PMID: 27087444 DOI: 10.1016/j.cell.2016.03.030] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Received: 10/21/2015] [Revised: 01/15/2016] [Accepted: 03/18/2016] [Indexed: 11/18/2022]
Abstract
Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces the research on the structure and functional interactions of these RNA gene sequences. We mine the evolutionary sequence record to derive precise information about the function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by increasing sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA.
Affiliation(s)
- Caleb Weinreb
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Adam J Riesselman
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA; Program in Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- John B Ingraham
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Torsten Gross
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA; Institute of Pathology, Charité - Universitätsmedizin Berlin, 10117 Berlin, Germany
- Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
11
Hua L, Song Y, Kim N, Laing C, Wang JTL, Schlick T. CHSalign: A Web Server That Builds upon Junction-Explorer and RNAJAG for Pairwise Alignment of RNA Secondary Structures with Coaxial Helical Stacking. PLoS One 2016; 11:e0147097. [PMID: 26789998 PMCID: PMC4720362 DOI: 10.1371/journal.pone.0147097] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 07/06/2015] [Accepted: 12/29/2015] [Indexed: 01/01/2023]
Abstract
RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to handle RNA secondary structures with CHS motifs. Specifically, we build upon our Junction-Explorer software for predicting coaxial stacking and RNAJAG for modelling junction topologies as tree graphs to incorporate constrained tree matching and dynamic programming algorithms into a new method, called CHSalign, for aligning the secondary structures of RNA molecules containing CHS motifs. Thus, CHSalign is intended to be an efficient alignment tool for RNAs containing similar junctions. Experimental results based on thousands of alignments demonstrate that CHSalign can align two RNA secondary structures containing CHS motifs more accurately than other RNA secondary structure alignment tools. CHSalign yields a high score when aligning two RNA secondary structures with similar CHS motifs or helical arrangement patterns, and a low score otherwise. This new method has been implemented in a web server, and the program is also made freely available, at http://bioinformatics.njit.edu/CHSalign/.
Affiliation(s)
- Lei Hua
- Bioinformatics Laboratory, Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, United States of America
- Yang Song
- Bioinformatics Laboratory, Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, United States of America
- Namhee Kim
- Department of Chemistry, New York University, New York, New York, United States of America
- Christian Laing
- Bioinformatics Laboratory, Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, United States of America
- Jason T. L. Wang
- Bioinformatics Laboratory, Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, United States of America
- Tamar Schlick
- Department of Chemistry, New York University, New York, New York, United States of America
- Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America
12
Doris SM, Smith DR, Beamesderfer JN, Raphael BJ, Nathanson JA, Gerbi SA. Universal and domain-specific sequences in 23S-28S ribosomal RNA identified by computational phylogenetics. RNA 2015; 21:1719-1730. [PMID: 26283689 PMCID: PMC4574749 DOI: 10.1261/rna.051144.115] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 01/30/2015] [Accepted: 07/07/2015] [Indexed: 06/01/2023]
Abstract
Comparative analysis of ribosomal RNA (rRNA) sequences has elucidated phylogenetic relationships. However, this powerful approach has not been fully exploited to address ribosome function. Here we identify stretches of evolutionarily conserved sequences, which correspond with regions of high functional importance. For this, we developed a structurally aligned database, FLORA (full-length organismal rRNA alignment) to identify highly conserved nucleotide elements (CNEs) in 23S-28S rRNA from each phylogenetic domain (Eukarya, Bacteria, and Archaea). Universal CNEs (uCNEs) are conserved in sequence and structural position in all three domains. Those in regions known to be essential for translation validate our approach. Importantly, some uCNEs reside in areas of unknown function, thus identifying novel sequences of likely great importance. In contrast to uCNEs, domain-specific CNEs (dsCNEs) are conserved in just one phylogenetic domain. This is the first report of conserved sequence elements in rRNA that are domain-specific; they are largely a eukaryotic phenomenon. The locations of the eukaryotic dsCNEs within the structure of the ribosome suggest they may function in nascent polypeptide transit through the ribosome tunnel and in tRNA exit from the ribosome. Our findings provide insights and a resource for ribosome function studies.
Affiliation(s)
- Stephen M Doris
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
- Deborah R Smith
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
- Julia N Beamesderfer
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
- Benjamin J Raphael
- Department of Computer Science and Center for Computational Molecular Biology, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
- Judith A Nathanson
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
- Susan A Gerbi
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University Division of Biology and Medicine, Providence, Rhode Island 02912, USA
13
De Leonardis E, Lutz B, Ratz S, Cocco S, Monasson R, Schug A, Weigt M. Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction. Nucleic Acids Res 2015; 43:10444-55. [PMID: 26420827 PMCID: PMC4666395 DOI: 10.1093/nar/gkv932] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 06/02/2015] [Accepted: 09/07/2015] [Indexed: 12/16/2022]
Abstract
Despite the biological importance of non-coding RNA, their structural characterization remains challenging. Making use of the rapidly growing sequence databases, we analyze nucleotide coevolution across homologous sequences via Direct-Coupling Analysis to detect nucleotide-nucleotide contacts. For a representative set of riboswitches, we show that the results of Direct-Coupling Analysis in combination with a generalized Nussinov algorithm systematically improve the results of RNA secondary structure prediction beyond traditional covariance approaches based on mutual information. Even more importantly, we show that the results of Direct-Coupling Analysis are enriched in tertiary structure contacts. By integrating these predictions into molecular modeling tools, systematically improved tertiary structure predictions can be obtained, as compared to using secondary structure information alone.
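For readers unfamiliar with the Nussinov algorithm named in this abstract, a minimal sketch of the classic version follows. It is not the generalized variant used in the paper, which takes Direct-Coupling Analysis scores as input. This Python simply maximizes the number of canonical and G-U wobble pairs in a nested structure. The function name `nussinov_max_pairs` and the `min_loop` default of 3 unpaired nucleotides per hairpin are assumptions made for illustration.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (classic Nussinov recursion).

    min_loop is the minimum number of unpaired nucleotides enclosed by any
    pair; 3 is a common choice and is an assumption here.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with some k, leaving at least min_loop between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    # A small hairpin: G-C, G-C and G-U pairs around an AAA loop.
    print(nussinov_max_pairs("GGGAAAUCC"))
```

A scored generalization would replace the constant `1` added per pair with a pair-specific weight, which is presumably where coevolution-derived scores could enter. The exact scoring used in the paper is not given in the abstract and is not reproduced here.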
Affiliation(s)
- Eleonora De Leonardis
- Computational and Quantitative Biology, Sorbonne Universités, Université Pierre et Marie Curie, UMR 7238, 75006 Paris, France; Computational and Quantitative Biology, CNRS, UMR 7238, 75006 Paris, France; Laboratoire de Physique Statistique de l'Ecole Normale Supérieure, associé au CNRS et à l'Université Pierre et Marie Curie, 75005 Paris, France
- Benjamin Lutz
- Steinbuch Centre for Computing, Karlsruher Institut für Technologie, 76133 Karlsruhe, Germany; Fakultät für Physik, Karlsruher Institut für Technologie, 76133 Karlsruhe, Germany
- Sebastian Ratz
- Steinbuch Centre for Computing, Karlsruher Institut für Technologie, 76133 Karlsruhe, Germany; Fakultät für Physik, Karlsruher Institut für Technologie, 76133 Karlsruhe, Germany
- Simona Cocco
- Laboratoire de Physique Statistique de l'Ecole Normale Supérieure, associé au CNRS et à l'Université Pierre et Marie Curie, 75005 Paris, France
- Rémi Monasson
- Laboratoire de Physique Théorique de l'Ecole Normale Supérieure, associé au CNRS et à l'Université Pierre et Marie Curie, 75005 Paris, France
- Alexander Schug
- Steinbuch Centre for Computing, Karlsruher Institut für Technologie, 76133 Karlsruhe, Germany
- Martin Weigt
- Computational and Quantitative Biology, Sorbonne Universités, Université Pierre et Marie Curie, UMR 7238, 75006 Paris, France; Computational and Quantitative Biology, CNRS, UMR 7238, 75006 Paris, France
14
Model-Free RNA Sequence and Structure Alignment Informed by SHAPE Probing Reveals a Conserved Alternate Secondary Structure for 16S rRNA. PLoS Comput Biol 2015; 11:e1004126. [PMID: 25992778 PMCID: PMC4438973 DOI: 10.1371/journal.pcbi.1004126] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 06/30/2014] [Accepted: 01/12/2015] [Indexed: 12/13/2022]
Abstract
Discovery and characterization of functional RNA structures remains challenging due to deficiencies in de novo secondary structure modeling. Here we describe a dynamic programming approach for model-free sequence comparison that incorporates high-throughput chemical probing data. Based on SHAPE probing data alone, ribosomal RNAs (rRNAs) from three diverse organisms--the eubacteria E. coli and C. difficile and the archaeon H. volcanii--could be aligned with accuracies comparable to alignments based on actual sequence identity. When both base sequence identity and chemical probing reactivities were considered together, accuracies improved further. Derived sequence alignments and chemical probing data from protein-free RNAs were then used as pseudo-free energy constraints to model consensus secondary structures for the 16S and 23S rRNAs. There are critical differences between these experimentally-informed models and currently accepted models, including in the functionally important neck and decoding regions of the 16S rRNA. We infer that the 16S rRNA has evolved to undergo large-scale changes in base pairing as part of ribosome function. As high-quality RNA probing data become widely available, structurally-informed sequence alignment will become broadly useful for de novo motif and function discovery.
|
15
|
In-cell SHAPE reveals that free 30S ribosome subunits are in the inactive state. Proc Natl Acad Sci U S A 2015; 112:2425-30. [PMID: 25675474 DOI: 10.1073/pnas.1411514112] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
It was shown decades ago that purified 30S ribosome subunits readily interconvert between "active" and "inactive" conformations in a switch that involves changes in the functionally important neck and decoding regions. However, the physiological significance of this conformational change had remained unknown. In exponentially growing Escherichia coli cells, RNA SHAPE probing revealed that 16S rRNA largely adopts the inactive conformation in stably assembled, mature 30S subunits and the active conformation in translating (70S) ribosomes. Inactive 30S subunits bind mRNA as efficiently as active subunits but initiate translation more slowly. Mutations that inhibited interconversion between states compromised translation in vivo. Binding by the small antibiotic paromomycin induced the inactive-to-active conversion, consistent with a low-energy barrier between the two states. Despite the small energetic barrier between states, but consistent with slow translation initiation and a functional role in vivo, interconversion involved large-scale changes in structure in the neck region that likely propagate across the 30S body via helix 44. These findings suggest the inactive state is a biologically relevant alternate conformation that regulates ribosome function as a conformational switch.
|
16
|
Gultyaev AP, Tsyganov-Bodounov A, Spronken MIJ, van der Kooij S, Fouchier RAM, Olsthoorn RCL. RNA structural constraints in the evolution of the influenza A virus genome NP segment. RNA Biol 2014; 11:942-52. [PMID: 25180940 DOI: 10.4161/rna.29730] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
Abstract
Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length, including protein-coding regions. Calculations of mutual information values at the paired nucleotide positions demonstrate that these structures impose considerable constraints on the virus genome evolution. Functional importance of a pseudoknot structure, predicted in the NP packaging signal region, was confirmed by plaque assays of the mutant viruses with disrupted structure and those with restored folding using compensatory substitutions. Possible functions of the conserved RNA folding patterns in the influenza A virus genome are discussed.
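The mutual information measure mentioned in this abstract can be sketched as follows. This is a generic textbook calculation over two alignment columns, not the authors' pipeline; the function name and the four-sequence toy alignment are hypothetical and chosen only so that one column pair covaries perfectly while another is invariant.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (base_i, base_j)
    left = Counter(a for a, _ in pairs)    # marginal counts, column i
    right = Counter(b for _, b in pairs)   # marginal counts, column j
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 3 change together while keeping
# Watson-Crick complementarity; columns 1 and 2 never vary.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]

covarying = column_mutual_information(toy_alignment, 0, 3)
invariant = column_mutual_information(toy_alignment, 1, 2)
print(covarying, invariant)
```

High mutual information at positions predicted to pair is the signature of compensatory substitutions. Invariant columns score zero even if they are paired, which is one reason covariation evidence is usually combined with structure prediction rather than used alone.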
Affiliation(s)
- Alexander P Gultyaev
- Department of Viroscience, Erasmus Medical Center, The Netherlands; Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Niels Bohrweg 1, The Netherlands
- Anton Tsyganov-Bodounov
- Leiden Institute of Chemistry, Leiden University, P.O.Box 9502, 2300 RA Leiden, The Netherlands; Current address: Illumina UK Ltd., Chesterford Research Park, Little Chesterford, Essex, UK
- Sander van der Kooij
- Department of Viroscience, Erasmus Medical Center, The Netherlands; Current address: BaseClear B.V., Einsteinweg, The Netherlands
- Ron A M Fouchier
- Department of Viroscience, Erasmus Medical Center, The Netherlands
- René C L Olsthoorn
- Leiden Institute of Chemistry, Leiden University, P.O.Box 9502, 2300 RA Leiden, The Netherlands
|
17
|
Abstract
A few years before I started my graduate studies, Carl Woese was establishing a collaboration with his friend, colleague, and my PhD advisor, Harry Noller. Carl was introducing comparative methods to Harry's lab to determine the secondary structure for the 16S and 23S rRNAs. In addition to an experimental project that had minimal to no success, I was attempting to predict an RNA secondary structure from a single sequence. I determined after a few months that the complexity of RNA folding was much greater than ever anticipated. Ten lessons were learned about the dynamics of RNA folding, the comparative methods used to accurately predict the RNA's secondary structure and the beginnings of its tertiary structure, the use of comparative methods to reveal much more than ever anticipated about RNA structure, other applications beyond RNA structure, and the lessons about the process of scientific discovery.
Affiliation(s)
- Robin R Gutell
- Institute for Cellular and Molecular Biology and Department of Integrative Biology; University of Texas; Austin, TX USA
|
18
|
Abstract
De novo discovery of "motifs" capturing the commonalities among related noncoding structured RNAs (ncRNAs) is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis on an approach based on the CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNAs, including structured RNA elements in untranslated portions of protein-coding genes, are presented.
Affiliation(s)
- Walter L Ruzzo
- Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
|
19
|
Sheth P, Cervantes-Cervantes M, Nagula A, Laing C, Wang JTL. Novel features for identifying A-minors in three-dimensional RNA molecules. Comput Biol Chem 2013; 47:240-5. [PMID: 24211672 DOI: 10.1016/j.compbiolchem.2013.10.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2013] [Revised: 10/15/2013] [Accepted: 10/16/2013] [Indexed: 01/08/2023]
Abstract
RNA tertiary interactions or tertiary motifs are conserved structural patterns formed by pairwise interactions between nucleotides. They include base-pairing, base-stacking, and base-phosphate interactions. A-minor motifs are the most common tertiary interactions in the large ribosomal subunit. The A-minor motif is a nucleotide triple in which minor groove edges of an adenine base are inserted into the minor groove of neighboring helices, leading to interaction with a stabilizing base pair. We propose here novel features for identifying and predicting A-minor motifs in a given three-dimensional RNA molecule. By utilizing the features together with machine learning algorithms including random forests and support vector machines, we show experimentally that our approach is capable of predicting A-minor motifs in the given RNA molecule effectively, demonstrating the usefulness of the proposed approach. The techniques developed from this work will be useful for molecular biologists and biochemists to analyze RNA tertiary motifs, specifically A-minor interactions.
Affiliation(s)
- Palak Sheth
- Bioinformatics Program, New Jersey Institute of Technology, Newark, NJ 07102, USA
|
20
|
Taylor WR, Hamilton RS, Sadowski MI. Prediction of contacts from correlated sequence substitutions. Curr Opin Struct Biol 2013; 23:473-9. [PMID: 23680395 DOI: 10.1016/j.sbi.2013.04.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2013] [Revised: 03/12/2013] [Accepted: 04/02/2013] [Indexed: 11/26/2022]
Abstract
Recent work has led to a substantial improvement in the accuracy of predictions of contacts between amino acids using evolutionary information derived from multiple sequence alignments. Where large numbers of diverse sequence relatives are available and can be aligned to the sequence of a protein of unknown structure it is now possible to generate high-resolution models without recourse to the structure of a template. In this review we describe these exciting new techniques and critically assess the state-of-the-art in contact prediction in the light of these. While concentrating on methods, we also discuss applications to protein and RNA structure prediction as well as potential future developments.
Affiliation(s)
- William R Taylor
- Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK.
|
21
|
Specificity between lactobacilli and hymenopteran hosts is the exception rather than the rule. Appl Environ Microbiol 2013; 79:1803-12. [PMID: 23291551 DOI: 10.1128/aem.03681-12] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Lactobacilli (Lactobacillales: Lactobacillaceae) are well known for their roles in food fermentation, as probiotics, and in human health, but they can also be dominant members of the microbiota of some species of Hymenoptera (ants, bees, and wasps). Honey bees and bumble bees associate with host-specific lactobacilli, and some evidence suggests that these lactobacilli are important for bee health. Social transmission helps maintain associations between these bees and their respective microbiota. To determine whether lactobacilli associated with social hymenopteran hosts are generally host specific, we gathered publicly available Lactobacillus 16S rRNA gene sequences, along with Lactobacillus sequences from 454 pyrosequencing surveys of six other hymenopteran species (three sweat bees and three ants). We determined the comparative secondary structural models of 16S rRNA, which allowed us to accurately align the entire 16S rRNA gene, including fast-evolving regions. BLAST searches and maximum-likelihood phylogenetic reconstructions confirmed that honey and bumble bees have host-specific Lactobacillus associates. Regardless of colony size or within-colony oral sharing of food (trophallaxis), sweat bees and ants associate with lactobacilli that are closely related to those found in vertebrate hosts or in diverse environments. Why honey and bumble bees associate with host-specific lactobacilli while other social Hymenoptera do not remains an open question. Lactobacilli are known to inhibit the growth of other microbes and can be beneficial whether they are coevolved with their host or are recruited by the host from environmental sources through mechanisms of partner choice.
|