1
von Löhneysen S, Spicher T, Varenyk Y, Yao HT, Lorenz R, Hofacker I, Stadler PF. Phylogenetic and Chemical Probing Information as Soft Constraints in RNA Secondary Structure Prediction. J Comput Biol 2024; 31:549-563. [PMID: 38935442 DOI: 10.1089/cmb.2024.0519] [Indexed: 06/29/2024]
Abstract
Extrinsic, experimental information can be incorporated into thermodynamics-based RNA folding algorithms in the form of pseudo-energies. Evolutionary conservation of RNA secondary structure elements is detectable in alignments of phylogenetically related sequences and provides evidence for the presence of certain base pairs that can also be converted into pseudo-energy contributions. We show that the centroid base pairs computed from a consensus folding model such as RNAalifold result in a substantial improvement of the prediction accuracy for single sequences. Evidence for specific base pairs turns out to be more informative than a position-wise profile for the conservation of the pairing status. A comparison with chemical probing data, furthermore, strongly suggests that phylogenetic base pairing data are more informative than position-specific data on (un)pairedness as obtained from chemical probing experiments. In this context we demonstrate, in addition, that the conversion of signal from probing data into pseudo-energies is possible using thermodynamic structure predictions as a reference instead of known RNA structures.
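As a rough illustration of the pseudo-energy idea described in this abstract, the following Python sketch maps per-nucleotide probing reactivities to pseudo-energies with a log-linear function and assigns a flat bonus to base pairs taken from a consensus centroid structure. The function names, the slope `m`, the intercept `b`, and the `bonus` value are assumptions chosen for illustration; they are not taken from the paper, which may use a different conversion.

```python
import math


def reactivity_to_pseudo_energy(reactivity, m=1.8, b=-0.6):
    """Map one probing reactivity to a pseudo-energy in kcal/mol.

    Uses a log-linear form m * ln(reactivity + 1) + b. The values of m and b
    are illustrative assumptions, not parameters reported in the abstract.
    Negative reactivities are treated as "no data" and contribute nothing.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def base_pair_bonuses(centroid_pairs, bonus=-1.0):
    """Assign the same (hypothetical) bonus energy to every centroid base pair.

    centroid_pairs is an iterable of (i, j) positions; each pair is stored
    with the smaller index first so that (i, j) and (j, i) coincide.
    """
    return {(min(i, j), max(i, j)): bonus for i, j in centroid_pairs}


if __name__ == "__main__":
    reactivities = [0.0, 0.4, 2.1, -999.0]
    print([round(reactivity_to_pseudo_energy(r), 3) for r in reactivities])
    print(base_pair_bonuses([(1, 20), (21, 2)]))
```

In a folding program, such values would be added to the energy model as soft constraints, so that structures agreeing with the extrinsic evidence are favored without being enforced.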
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - UniVie Doctoral School Computer Science (DoCS), University of Vienna, Vienna, Austria
- Yuliia Varenyk
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Hua-Ting Yao
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ivo Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, New Mexico, USA
6
Rose D, Stadler PF. Molecular evolution of the non-coding eosinophil granule ontogeny transcript. Front Genet 2011; 2:69. [PMID: 22303364 PMCID: PMC3268622 DOI: 10.3389/fgene.2011.00069] [Received: 06/01/2011] [Accepted: 09/16/2011] [Indexed: 01/22/2023]
Abstract
Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny, analyze patterns of sequence conservation, and examine the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region in the intron of EGO-B that is highly conserved at the sequence level. The region contains a novel ITPR1 exon and conserved RNA secondary structures, together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element.
Affiliation(s)
- Dominic Rose
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Fraunhofer Institut für Zelltherapie und Immunologie, Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Wien, Austria
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, USA
9
Hertel J, de Jong D, Marz M, Rose D, Tafer H, Tanzer A, Schierwater B, Stadler PF. Non-coding RNA annotation of the genome of Trichoplax adhaerens. Nucleic Acids Res 2009; 37:1602-15. [PMID: 19151082 PMCID: PMC2655684 DOI: 10.1093/nar/gkn1084] [Received: 11/13/2008] [Revised: 12/22/2008] [Accepted: 12/23/2008] [Indexed: 02/06/2023]
Abstract
A detailed annotation of non-protein-coding RNAs is typically missing in initial releases of newly sequenced genomes. Here we report on a comprehensive ncRNA annotation of the genome of Trichoplax adhaerens, presumably the most basal metazoan whose genome has been published to date. Since BLAST identified only a small fraction of the best-conserved ncRNAs (in particular rRNAs, tRNAs, and some snRNAs), we developed a semi-global dynamic programming tool, GotohScan, to increase the sensitivity of the homology search. It successfully identified the full complement of major and minor spliceosomal snRNAs, the genes for RNase P and MRP RNAs, the SRP RNA, as well as several small nucleolar RNAs. We did not find any microRNA candidates homologous to known eumetazoan sequences. Interestingly, most ncRNAs, including the pol-III transcripts, appear as single-copy genes or with very small copy numbers in the Trichoplax genome.
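To make the term "semi-global dynamic programming" concrete, here is a minimal Python sketch of semi-global alignment: the whole query must be aligned, while target positions before and after the hit cost nothing. This is not GotohScan itself. The abstract does not describe its scoring, so the sketch uses simple linear gap costs, and the function name and the `match`, `mismatch`, and `gap` scores are hypothetical choices for illustration.

```python
def semi_global_align(query, target, match=1, mismatch=-1, gap=-2):
    """Best score for aligning all of `query` somewhere inside `target`.

    Returns (score, end), where `end` is the number of target characters
    consumed up to the end of the best hit. Linear gap costs are an
    assumption of this sketch.
    """
    n = len(target)
    # Row 0 is all zeros: skipping a target prefix is free.
    prev = [0] * (n + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: the first i query characters are aligned to gaps.
        curr = [i * gap] + [0] * n
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    # Skipping a target suffix is free: take the best cell of the last row
    # (the leftmost one if several cells tie).
    best_end = max(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end


if __name__ == "__main__":
    print(semi_global_align("ACGU", "GGGACGUCCC"))
```

Compared with a purely local search, forcing the full query into the alignment keeps weakly conserved ends of a short ncRNA in the hit, which is one reason a semi-global scheme can be more sensitive for this kind of homology search.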
Affiliation(s)
- Jana Hertel
- Danielle de Jong
- Manja Marz
- Dominic Rose
- Hakim Tafer
- Andrea Tanzer
- Bernd Schierwater
- Peter F. Stadler

Joint affiliations of all authors:
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Division of Ecology and Evolution, Institut für Tierökologie und Zellbiologie, Tierärztliche Hochschule Hannover, Bünteweg 17d, D-30559 Hannover, Germany
- Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA
- RNomics Group, Fraunhofer Institut für Zelltherapie und Immunologie, Deutscher Platz 5e, D-04103 Leipzig, Germany
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA