1
Gadekar V, Munk AW, Miladi M, Junge A, Backofen R, Seemann S, Gorodkin J. Clusters of mammalian conserved RNA structures in UTRs associate with RBP binding sites. NAR Genom Bioinform 2024; 6:lqae089. [PMID: 39131818 PMCID: PMC11310781 DOI: 10.1093/nargab/lqae089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 06/26/2024] [Accepted: 07/16/2024] [Indexed: 08/13/2024]
Abstract
RNA secondary structures play essential roles in the formation of the tertiary structure and function of a transcript. Recent genome-wide studies highlight significant potential for RNA structures in the mammalian genome. However, a major challenge is assigning functional roles to these structured RNAs. In this study, we conduct a guilt-by-association analysis of clusters of computationally predicted conserved RNA structures (CRSs) in human untranslated regions (UTRs) to associate them with gene functions. We filtered a broad pool of ∼500 000 human CRSs for UTR overlap, resulting in 4734 and 24 754 CRSs from the 5' and 3' UTRs of protein-coding genes, respectively. We clustered the CRSs of each set separately using RNAscClust, obtaining 793 and 2403 clusters, with an average of five CRSs per cluster. We identified overrepresented binding sites for 60 and 43 RNA-binding proteins co-localizing with the clustered CRSs. Furthermore, 104 and 441 clusters from the 5' and 3' UTRs, respectively, showed enrichment for various Gene Ontology terms, including biological processes such as 'signal transduction' and 'nervous system development', molecular functions such as 'transferase activity' and cellular components such as 'synapse'. Our study shows that significant functional insights can be gained by clustering RNA structures based on their structural characteristics.
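The UTR-overlap filtering step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how overlap was computed, so everything here is an assumption: the `Interval` type, the half-open coordinate convention, the same-strand requirement, and the toy coordinates are hypothetical and only show the general idea of keeping CRSs that intersect an annotated UTR.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; coordinates are 0-based and half-open (assumed)."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start < b.end
        and b.start < a.end
    )


def filter_by_utr(crss: List[Interval], utrs: List[Interval]) -> List[Interval]:
    """Keep only the CRSs that overlap at least one UTR (naive O(n*m) scan)."""
    return [c for c in crss if any(overlaps(c, u) for u in utrs)]


# Toy data, not taken from the study.
utrs = [Interval("chr1", 100, 200, "+")]
crss = [
    Interval("chr1", 150, 260, "+"),  # overlaps the UTR
    Interval("chr1", 200, 300, "+"),  # only touches the UTR end: no overlap
    Interval("chr2", 150, 180, "+"),  # different chromosome
]
kept = filter_by_utr(crss, utrs)
```

For a pool of roughly 500 000 CRSs, a sorted sweep or an interval tree would be a more practical choice than this quadratic scan.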
Affiliation(s)
- Veerendra P Gadekar
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, 1870 Frederiksberg, Denmark
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, 1870 Frederiksberg, Denmark
- Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
- Alexander Welford Munk
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, 1870 Frederiksberg, Denmark
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, 1870 Frederiksberg, Denmark
- Milad Miladi
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg im Breisgau, Germany
- Alexander Junge
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, 1870 Frederiksberg, Denmark
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, 1870 Frederiksberg, Denmark
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg im Breisgau, Germany
- Stefan E Seemann
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, 1870 Frederiksberg, Denmark
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, 1870 Frederiksberg, Denmark
- Jan Gorodkin
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, 1870 Frederiksberg, Denmark
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, 1870 Frederiksberg, Denmark
2
Escamilla-Gutiérrez A, Córdova-Espinoza MG, Sánchez-Monciváis A, Tecuatzi-Cadena B, Regalado-García AG, Medina-Quero K. In silico selection of aptamers for bacterial toxins detection. J Biomol Struct Dyn 2023; 41:10909-10918. [PMID: 36546716 DOI: 10.1080/07391102.2022.2159529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 12/10/2022] [Indexed: 12/24/2022]
Abstract
The most commonly used toxins in biological warfare are staphylococcal enterotoxin B (3SEB), cholera toxin (1XTC), and botulinum toxin (3BTA). Uncovering novel strategies for identifying these toxins is paramount; therefore, aptamers are used for this purpose. Aptamers are single-stranded DNA or RNA oligonucleotides selected via Systematic Evolution of Ligands by Exponential Enrichment (SELEX) with high binding affinity and specificity against target molecules. However, SELEX in vitro is tedious; hence, adopting alternative in silico molecular docking approaches is necessary. We aimed to conduct molecular docking with accessible tools and obtain RNA aptamers. First, 4,820,095 sequences obtained from an initial library of 9.5 × 10^9 Python script-generated sequences were used. The GraphClust program was used to create representative groups or clusters, and the DoGSiteScorer (https://proteins.plus/) was used to conduct binding site detection of the proteins: 5DO4 (thrombin), 3SEB, 1XTC, and 3BTA. rDock, HDock, and PatchDock were adopted, combining different docking program results (consensus scoring), to improve receptor-ligand prediction. An analysis of the poses and root mean square deviation (RMSD) was performed, and 468 structurally different aptamers were obtained. The DoGSiteScorer program predicted the binding site of each protein to direct the interaction with the aptamer. Candidate aptamers for 3SEB, 1XTC, and 3BTA were selected according to the pose value considering the closeness of the interaction with a lower mean of 45.923 Å, 45.854 Å, and 72.490 Å, respectively. Communicated by Ramaswamy H. Sarma.
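The abstract does not state how the results of the three docking programs were combined, so the following is only one common way to do consensus scoring: rank averaging. The scores are invented, and they are assumed to have been converted beforehand so that lower means better for every program; the real programs use different scales and sign conventions.

```python
from typing import Dict


def rank(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank candidates within one program; rank 1 is the lowest (best) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus(score_tables: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each candidate's rank across all programs."""
    ranks = [rank(table) for table in score_tables.values()]
    candidates = ranks[0].keys()
    return {c: sum(r[c] for r in ranks) / len(ranks) for c in candidates}


# Hypothetical, pre-normalised scores (lower is better); not data from the study.
score_tables = {
    "rDock": {"apt1": -30.0, "apt2": -25.0, "apt3": -10.0},
    "HDock": {"apt1": -200.0, "apt2": -250.0, "apt3": -100.0},
    "PatchDock": {"apt1": -9.0, "apt2": -7.5, "apt3": -8.0},
}
consensus_ranks = consensus(score_tables)
best = min(consensus_ranks, key=consensus_ranks.get)
```

Averaging ranks rather than raw scores avoids having to put the programs on a common numeric scale, at the cost of discarding the size of the score differences.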
Affiliation(s)
- Alejandro Escamilla-Gutiérrez
- Laboratorio de Bacteriología Médica, Departamento de Microbiología, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Ciudad de México, México
- Hospital General, Instituto Mexicano del Seguro Social IMSS, Ciudad de México, México
- María Guadalupe Córdova-Espinoza
- Laboratorio de Bacteriología Médica, Departamento de Microbiología, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Ciudad de México, México
- Laboratorio de Inmunología, Escuela Militar de Graduados de Sanidad, Secretaría de la Defensa Nacional, Ciudad de México, México
- Anahí Sánchez-Monciváis
- Laboratorio de Inmunología, Escuela Militar de Graduados de Sanidad, Secretaría de la Defensa Nacional, Ciudad de México, México
- Brenda Tecuatzi-Cadena
- Laboratorio de Inmunología, Escuela Militar de Graduados de Sanidad, Secretaría de la Defensa Nacional, Ciudad de México, México
- Ana Gabriela Regalado-García
- Laboratorio de Inmunología, Escuela Militar de Graduados de Sanidad, Secretaría de la Defensa Nacional, Ciudad de México, México
- Karen Medina-Quero
- Laboratorio de Inmunología, Escuela Militar de Graduados de Sanidad, Secretaría de la Defensa Nacional, Ciudad de México, México
3
Lasher B, Hendrix DA. bpRNA-align: improved RNA secondary structure global alignment for comparing and clustering RNA structures. RNA (New York, N.Y.) 2023; 29:584-595. [PMID: 36759128 PMCID: PMC10159002 DOI: 10.1261/rna.079211.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 01/14/2023] [Indexed: 05/06/2023]
Abstract
Ribonucleic acid (RNA) is a polymeric molecule that is fundamental to biological processes, with structure being more highly conserved than primary sequence and often key to its function. Advances in RNA structure characterization have resulted in an increase in the number of accurate secondary structures. The task of uncovering common RNA structural motifs with a collective function through structural comparison, providing a level of similarity, remains challenging and could be used to improve RNA secondary structure databases and discover new RNA families. In this work, we present a novel secondary structure alignment method, bpRNA-align. bpRNA-align is a customized global structural alignment method, utilizing an inverted (gap extend costs more than gap open) and context-specific affine gap penalty along with a structural, feature-specific substitution matrix to provide similarity scores. We evaluate our similarity scores in comparison to other methods, using affinity propagation clustering, applied to a benchmarking data set of known structure types. bpRNA-align shows improvement in clustering performance over a broad range of structure types.
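To make the "inverted" affine gap penalty concrete, here is a minimal sketch of a three-state (Gotoh-style) global alignment score in which the gap costs are free parameters. It is not the bpRNA-align implementation: the match/mismatch scores, the structure-label strings, and the numeric penalties are all hypothetical, and the context-specific penalties and feature-specific substitution matrix described in the abstract are not modelled.

```python
NEG_INF = float("-inf")


def global_affine_score(a, b, gap_open, gap_extend, match=1.0, mismatch=-1.0):
    """Global alignment score with affine gaps.

    The first position of a gap costs `gap_open`; every further position of
    the same gap costs `gap_extend`. Setting gap_extend > gap_open gives the
    "inverted" scheme, which favours several short gaps over one long gap.
    """
    n, m = len(a), len(b)
    # M: last column is a (mis)match; X: gap in b; Y: gap in a.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] - gap_open,
                          Y[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          X[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


# Toy structure-label strings (S = stem, H = hairpin loop); hypothetical input.
short, long_ = "SSHHSS", "SSHHHHSS"
standard = global_affine_score(short, long_, gap_open=3, gap_extend=1)
inverted = global_affine_score(short, long_, gap_open=1, gap_extend=3)
```

With these toy inputs the standard penalties place the two extra hairpin positions in one contiguous gap, whereas the inverted penalties make two separate single-position gaps cheaper, so the inverted score comes out higher.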
Affiliation(s)
- Brittany Lasher
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- David A Hendrix
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, USA
4
Mitrofanov A, Ziemann M, Alkhnbashi OS, Hess WR, Backofen R. CRISPRtracrRNA: robust approach for CRISPR tracrRNA detection. Bioinformatics 2022; 38:ii42-ii48. [PMID: 36124799 PMCID: PMC9486595 DOI: 10.1093/bioinformatics/btac466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/25/2022]
Abstract
MOTIVATION: The CRISPR-Cas9 system is a Type II CRISPR system that has rapidly become the most versatile and widespread tool for genome engineering. It consists of two components, the Cas9 effector protein and a single guide RNA that combines the spacer (for identifying the target) with the tracrRNA, a trans-activating small RNA required for both crRNA maturation and interference. While there are well-established methods for screening Cas effector proteins and CRISPR arrays, the detection of tracrRNA remains the bottleneck in detecting Class 2 CRISPR systems. RESULTS: We introduce a new pipeline, CRISPRtracrRNA, for screening and evaluation of tracrRNA candidates in genomes. This pipeline combines evidence from different components of the Cas9-sgRNA complex. The core is a newly developed structural model via covariance models from a sequence-structure alignment of experimentally validated tracrRNAs. As additional evidence, we determine the terminator signal (required for the tracrRNA transcription) and the RNA-RNA interaction between the CRISPR array repeat and the 5'-part of the tracrRNA. Repeats are detected via an ML-based approach (CRISPRidentify). Providing further evidence, we detect the cassette containing the Cas9 (Type II CRISPR systems) and Cas12 (Type V CRISPR systems) effector protein. Our tool is the first for detecting tracrRNA for Type V systems. AVAILABILITY AND IMPLEMENTATION: The implementation of CRISPRtracrRNA is available on GitHub upon request for access permission (https://github.com/BackofenLab/CRISPRtracrRNA). Data generated in this study can be obtained upon request from the corresponding author, Rolf Backofen (backofen@informatik.uni-freiburg.de). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wolfgang R Hess
- Faculty of Biology, Genetics and Experimental Bioinformatics, University of Freiburg, Freiburg, Germany
5
Tants JN, Becker L, McNicoll F, Müller-McNicoll M, Schlundt A. NMR-derived secondary structure of the full-length Ox40 mRNA 3'UTR and its multivalent binding to the immunoregulatory RBP Roquin. Nucleic Acids Res 2022; 50:4083-4099. [PMID: 35357505 PMCID: PMC9023295 DOI: 10.1093/nar/gkac212] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Revised: 02/24/2022] [Accepted: 03/17/2022] [Indexed: 12/31/2022]
Abstract
Control of posttranscriptional mRNA decay is a crucial determinant of cell homeostasis and differentiation. mRNA lifetime is governed by cis-regulatory elements in the 3' untranslated region (UTR). Despite ongoing progress in the identification of cis elements, we have little knowledge about the functional and structural integration of multiple elements in 3'UTR regulatory hubs and their recognition by mRNA-binding proteins (RBPs). Structural analyses are complicated by inconsistent mapping and prediction of RNA fold, by dynamics, and by size. We here, for the first time, provide the secondary structure of a complete mRNA 3'UTR. We use NMR spectroscopy in a divide-and-conquer strategy complemented with SAXS, in-line probing and SHAPE-seq applied to the 3'UTR of Ox40 mRNA, which encodes a T-cell co-receptor repressed by the protein Roquin. We provide contributions of RNA elements to Roquin-binding. The protein uses its extended bi-modal ROQ domain to sequentially engage in a 2:1 stoichiometry with a 3'UTR core motif. We observe differential binding of Roquin to decay elements depending on their structural embedment. Our data underpin the importance of studying RNA regulation in a full sequence and structural context. This study serves as a paradigm for an approach in analysing structured RNA-regulatory hubs and their binding by RBPs.
Affiliation(s)
- Jan-Niklas Tants
- Goethe University Frankfurt, Institute for Molecular Biosciences and Biomagnetic Resonance Centre (BMRZ), Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
- Lea Marie Becker
- Goethe University Frankfurt, Institute for Molecular Biosciences and Biomagnetic Resonance Centre (BMRZ), Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
- François McNicoll
- Goethe University Frankfurt, Institute for Molecular Biosciences, Max-von-Laue-Str. 13, 60438 Frankfurt, Germany
- Michaela Müller-McNicoll
- Goethe University Frankfurt, Institute for Molecular Biosciences, Max-von-Laue-Str. 13, 60438 Frankfurt, Germany
- Andreas Schlundt
- Goethe University Frankfurt, Institute for Molecular Biosciences and Biomagnetic Resonance Centre (BMRZ), Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
6
Wang JP, Li C, Ding WC, Peng G, Xiao GL, Chen R, Cheng Q. Research Progress on the Inflammatory Effects of Long Non-coding RNA in Traumatic Brain Injury. Front Mol Neurosci 2022; 15:835012. [PMID: 35359568 PMCID: PMC8961287 DOI: 10.3389/fnmol.2022.835012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 02/08/2022] [Indexed: 11/29/2022]
Abstract
Globally, traumatic brain injury (TBI) is an acute clinical event and an important cause of death and long-term disability. However, the underlying pathophysiological mechanism has not been fully elucidated, and the lack of effective treatment places a huge burden on individuals, families, and society. Several studies have shown that long non-coding RNAs (lncRNAs) might play a crucial role in TBI; they are abundant in the central nervous system (CNS) and participate in a variety of pathophysiological processes, including oxidative stress, inflammation, apoptosis, blood-brain barrier protection, angiogenesis, and neurogenesis. Some lncRNAs modulate multiple therapeutic targets after TBI, including inflammation; thus, these lncRNAs have tremendous therapeutic potential for TBI and are promising biomarkers for TBI diagnosis, treatment, and prognosis prediction. This review discusses the differential expression of lncRNAs in brain tissue during TBI, which is likely related to the physiological and pathological processes involved in TBI. These findings may provide new targets for further scientific research on the molecular mechanisms of TBI and potential therapeutic interventions.
Affiliation(s)
- Jian-peng Wang
- Department of Neurosurgery, The Affiliated Nanhua Hospital, Hengyang Medical School, University of South China, Hengyang, China
- Chong Li
- Department of Neurosurgery, The Affiliated Nanhua Hospital, Hengyang Medical School, University of South China, Hengyang, China
- Wen-cong Ding
- Department of Neurosurgery, The Affiliated Nanhua Hospital, Hengyang Medical School, University of South China, Hengyang, China
- Gang Peng
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Ge-lei Xiao
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Rui Chen
- Department of Neurosurgery, The Affiliated Nanhua Hospital, Hengyang Medical School, University of South China, Hengyang, China
- *Correspondence: Rui Chen,
- Quan Cheng
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Quan Cheng,
7
Raden M, Wallach T, Miladi M, Zhai Y, Krüger C, Mossmann ZJ, Dembny P, Backofen R, Lehnardt S. Structure-aware machine learning identifies microRNAs operating as Toll-like receptor 7/8 ligands. RNA Biol 2021; 18:268-277. [PMID: 34241565 PMCID: PMC8677043 DOI: 10.1080/15476286.2021.1940697] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
MicroRNAs (miRNAs) can serve as activation signals for membrane receptors, a recently discovered function that is independent of the miRNAs’ conventional role in post-transcriptional gene regulation. Here, we introduce a machine learning approach, BrainDead, to identify oligonucleotides that act as ligands for single-stranded RNA-detecting Toll-like receptors (TLR)7/8, thereby triggering an immune response. BrainDead was trained on activation data obtained from in vitro experiments on murine microglia, incorporating sequence and intra-molecular structure, as well as inter-molecular homo-dimerization potential of candidate RNAs. The method was applied to analyse all known human miRNAs regarding their potential to induce TLR7/8 signalling and microglia activation. We validated the predicted functional activity of subsets of high- and low-scoring miRNAs experimentally, of which a selection has been linked to Alzheimer’s disease. High agreement between predictions and experiments confirms the robustness and power of BrainDead. The results provide new insight into the mechanisms of how miRNAs act as TLR ligands. Eventually, BrainDead implements a generic machine learning methodology for learning and predicting the functions of short RNAs in any context.
Affiliation(s)
- Martin Raden
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Thomas Wallach
- Institute of Cell Biology and Neurobiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Milad Miladi
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Yuanyuan Zhai
- Institute of Cell Biology and Neurobiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Christina Krüger
- Institute of Cell Biology and Neurobiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Zoé J Mossmann
- Institute of Cell Biology and Neurobiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Paul Dembny
- Institute of Cell Biology and Neurobiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany; Signalling Research Centre CIBSS, University of Freiburg, Freiburg, Germany
- Seija Lehnardt
- Department of Neurology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, and Berlin Institute of Health, Berlin, Germany
8
Fremin BJ, Bhatt AS. Comparative genomics identifies thousands of candidate structured RNAs in human microbiomes. Genome Biol 2021; 22:100. [PMID: 33845850 PMCID: PMC8040213 DOI: 10.1186/s13059-021-02319-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/27/2020] [Accepted: 03/19/2021] [Indexed: 02/02/2023]
Abstract
BACKGROUND Structured RNAs play varied bioregulatory roles within microbes. To date, hundreds of candidate structured RNAs have been predicted using informatic approaches that search for motif structures in genomic sequence data. The human microbiome contains thousands of species and strains of microbes. Yet, much of the metagenomic data from the human microbiome remains unmined for structured RNA motifs primarily due to computational limitations. RESULTS We sought to apply a large-scale, comparative genomics approach to these organisms to identify candidate structured RNAs. With a carefully constructed, though computationally intensive automated analysis, we identify 3161 conserved candidate structured RNAs in intergenic regions, as well as 2022 additional candidate structured RNAs that may overlap coding regions. We validate the RNA expression of 177 of these candidate structures by analyzing small fragment RNA-seq data from four human fecal samples. CONCLUSIONS This approach identifies a wide variety of candidate structured RNAs, including tmRNAs, antitoxins, and likely ribosome protein leaders, from a wide variety of taxa. Overall, our pipeline enables conservative predictions of thousands of novel candidate structured RNAs from human microbiomes.
Affiliation(s)
- Brayon J Fremin
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Ami S Bhatt
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA.
- Department of Medicine (Hematology), Stanford University, Stanford, CA, 94305, USA.
9
Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, Griffiths-Jones S, Toffano-Nioche C, Gautheret D, Weinberg Z, Rivas E, Eddy SR, Finn RD, Bateman A, Petrov AI. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res 2021; 49:D192-D200. [PMID: 33211869 PMCID: PMC7779021 DOI: 10.1093/nar/gkaa1047] [Citation(s) in RCA: 519] [Impact Index Per Article: 129.8] [Received: 09/15/2020] [Revised: 10/14/2020] [Accepted: 10/21/2020] [Indexed: 12/15/2022]
Abstract
Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.
Affiliation(s)
- Ioanna Kalvari
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eric P Nawrocki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Nancy Ontiveros-Palacios
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joanna Argasinska
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Sam Griffiths-Jones
- Faculty of Biology, Medicine and Health, University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Claire Toffano-Nioche
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Daniel Gautheret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Zasha Weinberg
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Centre for Bioinformatics, Leipzig University, 04107 Leipzig, Germany
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Sean R Eddy
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA; John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA 02138, USA
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anton I Petrov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
10
Kui L, Tang M. Overview of Computational Methods and Resources for Circular RNAs. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11638-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
11
Müller T, Miladi M, Hutter F, Hofacker I, Will S, Backofen R. The locality dilemma of Sankoff-like RNA alignments. Bioinformatics 2020; 36:i242-i250. [PMID: 32657398 PMCID: PMC7355259 DOI: 10.1093/bioinformatics/btaa431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Motivation Elucidating the functions of non-coding RNAs by homology has been strongly limited due to fundamental computational and modeling issues. While existing simultaneous alignment and folding (SA&F) algorithms successfully align homologous RNAs with precisely known boundaries (global SA&F), the more pressing problem of identifying new classes of homologous RNAs in the genome (local SA&F) is intrinsically more difficult and much less understood. Typically, the length of local alignments is strongly overestimated and alignment boundaries are dramatically mispredicted. We hypothesize that local SA&F approaches are compromised this way due to a score bias, which is caused by the contribution of RNA structure similarity to their overall alignment score. Results In the light of this hypothesis, we study pairwise local SA&F for the first time systematically—based on a novel local RNA alignment benchmark set and quality measure. First, we vary the relative influence of structure similarity compared to sequence similarity. Putting more emphasis on the structure component leads to overestimating the length of local alignments. This clearly shows the bias of current scores and strongly hints at the structure component as its origin. Second, we study the interplay of several important scoring parameters by learning parameters for local and global SA&F. The divergence of these optimized parameter sets underlines the fundamental obstacles for local SA&F. Third, by introducing a position-wise correction term in local SA&F, we constructively solve its principal issues. Availability and implementation The benchmark data, detailed results and scripts are available at https://github.com/BackofenLab/local_alignment. The RNA alignment tool LocARNA, including the modifications proposed in this work, is available at https://github.com/s-will/LocARNA/releases/tag/v2.0.0RC6. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Teresa Müller
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany
- Milad Miladi
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany
- Frank Hutter
- Machine Learning Lab, Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
- Ivo Hofacker
- Theoretical Biochemistry Group (TBI), Institute for Theoretical Chemistry, University of Vienna, Vienna, Wien 1090, Austria
- Sebastian Will
- Theoretical Biochemistry Group (TBI), Institute for Theoretical Chemistry, University of Vienna, Vienna, Wien 1090, Austria; Bioinformatics Group AMIBio, LIX-Laboratoire d'Informatique d'École Polytechnique, IPP, Palaiseau 91120, France
- Rolf Backofen
- Bioinformatics Group, University of Freiburg, Freiburg 79110, Germany; Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg 79104, Germany
12
Miladi M, Sokhoyan E, Houwaart T, Heyne S, Costa F, Grüning B, Backofen R. GraphClust2: Annotation and discovery of structured RNAs with scalable and accessible integrative clustering. Gigascience 2019; 8:giz150. [PMID: 31808801 PMCID: PMC6897289 DOI: 10.1093/gigascience/giz150] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/15/2019] [Revised: 08/23/2019] [Accepted: 11/20/2019] [Indexed: 11/18/2022]
Abstract
BACKGROUND: RNA plays essential roles in all known forms of life. Clustering RNA sequences with common sequence and structure is an essential step towards studying RNA function. With the advent of high-throughput sequencing techniques, experimental and genomic data are expanding to complement the predictive methods. However, the existing methods do not effectively utilize and cope with the immense amount of data becoming available.
RESULTS: Hundreds of thousands of non-coding RNAs have been detected; however, their annotation is lagging behind. Here we present GraphClust2, a comprehensive approach for scalable clustering of RNAs based on sequence and structural similarities. GraphClust2 bridges the gap between high-throughput sequencing and structural RNA analysis and provides an integrative solution by incorporating diverse experimental and genomic data in an accessible manner via the Galaxy framework. GraphClust2 can efficiently cluster and annotate large datasets of RNAs and supports structure-probing data. We demonstrate that the annotation performance of clustering functional RNAs can be considerably improved. Furthermore, an off-the-shelf procedure is introduced for identifying locally conserved structure candidates in long RNAs. We suggest the presence and the sparseness of phylogenetically conserved local structures for a collection of long non-coding RNAs.
CONCLUSIONS: By clustering data from 2 cross-linking immunoprecipitation experiments, we demonstrate the benefits of GraphClust2 for motif discovery under the presence of biological and methodological biases. Finally, we uncover prominent targets of double-stranded RNA binding protein Roquin-1, such as BCOR's 3' untranslated region that contains multiple binding stem-loops that are evolutionary conserved.
Affiliation(s)
- Milad Miladi
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Eteri Sokhoyan
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Torsten Houwaart
  - Institute of Medical Microbiology and Hospital Hygiene, University of Dusseldorf, Universitaetsstr. 1, 40225 Dusseldorf, Germany
- Steffen Heyne
  - Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Stuebeweg 51, 79108 Freiburg, Germany
- Fabrizio Costa
  - Department of Computer Science, University of Exeter, North Park Road, EX4 4QF Exeter, UK
- Björn Grüning
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
  - ZBSA Centre for Biological Systems Analysis, University of Freiburg, Hauptstr. 1, 79104 Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
  - ZBSA Centre for Biological Systems Analysis, University of Freiburg, Hauptstr. 1, 79104 Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
|
13
|
Crum M, Ram-Mohan N, Meyer MM. Regulatory context drives conservation of glycine riboswitch aptamers. PLoS Comput Biol 2019; 15:e1007564. [PMID: 31860665 PMCID: PMC6944388 DOI: 10.1371/journal.pcbi.1007564] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/04/2019] [Revised: 01/06/2020] [Accepted: 11/25/2019] [Indexed: 12/13/2022] Open
Abstract
In comparison to protein coding sequences, the impact of mutation and natural selection on the sequence and function of non-coding (ncRNA) genes is not well understood. Many ncRNA genes are narrowly distributed to only a few organisms, and appear to be rapidly evolving. Compared to protein coding sequences, there are many challenges associated with assessment of ncRNAs that are not well addressed by conventional phylogenetic approaches, including: short sequence length, lack of primary sequence conservation, and the importance of secondary structure for biological function. Riboswitches are structured ncRNAs that directly interact with small molecules to regulate gene expression in bacteria. They typically consist of a ligand-binding domain (aptamer) whose folding changes drive changes in gene expression. The glycine riboswitch is among the most well-studied due to the widespread occurrence of a tandem aptamer arrangement (tandem), wherein two homologous aptamers interact with glycine and each other to regulate gene expression. However, a significant proportion of glycine riboswitches are comprised of single aptamers (singleton). Here we use graph clustering to circumvent the limitations of traditional phylogenetic analysis when studying the relationship between the tandem and singleton glycine aptamers. Graph clustering enables a broader range of pairwise comparison measures to be used to assess aptamer similarity. Using this approach, we show that one aptamer of the tandem glycine riboswitch pair is typically much more highly conserved, and that which aptamer is conserved depends on the regulated gene. Furthermore, our analysis also reveals that singleton aptamers are more similar to either the first or second tandem aptamer, again based on the regulated gene. Taken together, our findings suggest that tandem glycine riboswitches degrade into functional singletons, with the regulated gene(s) dictating which glycine-binding aptamer is conserved.
Affiliation(s)
- Matt Crum
  - Department of Biology, Boston College, Chestnut Hill, Massachusetts, United States of America
- Nikhil Ram-Mohan
  - Department of Biology, Boston College, Chestnut Hill, Massachusetts, United States of America
- Michelle M. Meyer
  - Department of Biology, Boston College, Chestnut Hill, Massachusetts, United States of America
|