1
Sridhara S. Multiple structural flavors of RNase P in precursor tRNA processing. Wiley Interdisciplinary Reviews. RNA 2024; 15:e1835. [PMID: 38479802 DOI: 10.1002/wrna.1835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 01/26/2024] [Accepted: 01/29/2024] [Indexed: 06/06/2024]
Abstract
Precursor transfer RNAs (pre-tRNAs) require extensive processing to generate mature tRNAs with the proper fold, structural stability, and functionality required to sustain cellular viability. tRNA maturation follows an ordered process: 5'-processing, 3'-processing, modifications at specific sites, if any, and 3'-CCA addition before aminoacylation and recruitment to the cellular protein synthesis machinery. Ribonuclease P (RNase P) is an endonuclease conserved in all domains of life that removes the 5' leader of pre-tRNAs by hydrolyzing the phosphodiester linkage between the nucleotides at positions -1 and +1. Except in the archaeon Nanoarchaeum equitans, where tRNAs are transcribed without a leader starting at position +1, RNase P is indispensable for life. It displays fundamental variations in enzyme subunit composition, mechanism of substrate recognition, and active site architecture, while in all cases utilizing a conserved catalytic reaction mediated by two metal ions. While the canonical RNA-based ribonucleoprotein RNase P has long been known to occur in bacteria, archaea, and eukaryotes, the RNA-free protein-only RNase P in eukaryotes and the RNA-free homologs of Aquifex RNase P in prokaryotes were discovered more recently. This review aims to provide a comprehensive overview of the structural diversity displayed by various RNA-based and RNA-free RNase P holoenzymes in harnessing critical RNA-protein and protein-protein interactions to achieve conserved pre-tRNA processing functionality. Furthermore, alternate roles and functional interchangeability of RNase P are discussed in the context of its employability in several clinical and biotechnological applications. This article is categorized under: RNA Processing > tRNA Processing; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution; RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes.
Affiliation(s)
- Sagar Sridhara
- Department of Medical Biochemistry and Cell Biology, University of Gothenburg, Gothenburg, Sweden
2
Li J, Wu S, Zhang K, Sun X, Lin W, Wang C, Lin S. Clustered Regularly Interspaced Short Palindromic Repeat/CRISPR-Associated Protein and Its Utility All at Sea: Status, Challenges, and Prospects. Microorganisms 2024; 12:118. [PMID: 38257946 PMCID: PMC10820777 DOI: 10.3390/microorganisms12010118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/02/2024] [Accepted: 01/04/2024] [Indexed: 01/24/2024] Open
Abstract
Initially discovered over 35 years ago in the bacterium Escherichia coli as a defense system against invasion of viral (or other exogenous) DNA into the genome, CRISPR/Cas has ushered in a new era of functional genetics and served as a versatile genetic tool in all branches of life science. CRISPR/Cas has revolutionized the methodology of gene knockout with simplicity and rapidity, but it is also powerful for gene knock-in and gene modification. In the field of marine biology and ecology, this tool has been instrumental in the functional characterization of 'dark' genes and the documentation of the functional differentiation of gene paralogs. Powerful as it is, challenges exist that have hindered the advances in functional genetics in some important lineages. This review examines the status of applications of CRISPR/Cas in marine research and assesses the prospect of quickly expanding the deployment of this powerful tool to address the myriad fundamental marine biology and biological oceanography questions.
Affiliation(s)
- Jiashun Li
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Shuaishuai Wu
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Kaidian Zhang
- State Key Laboratory of Marine Resource Utilization in the South China Sea, School of Marine Biology and Fisheries, Hainan University, Haikou 570203, China
- Xueqiong Sun
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Wenwen Lin
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Cong Wang
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Senjie Lin
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361101, China
- Department of Marine Sciences, University of Connecticut, Groton, CT 06340, USA
3
Gößringer M, Wäber NB, Wiegard JC, Hartmann RK. Characterization of RNA-based and protein-only RNases P from bacteria encoding both enzyme types. RNA (New York, N.Y.) 2023; 29:376-391. [PMID: 36604113 PMCID: PMC9945441 DOI: 10.1261/rna.079459.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/21/2022] [Accepted: 12/17/2022] [Indexed: 06/17/2023]
Abstract
A small group of bacteria encode two types of RNase P, the classical ribonucleoprotein (RNP) RNase P as well as the protein-only RNase P HARP (homolog of Aquifex RNase P). We characterized the dual RNase P activities of five bacteria that belong to three different phyla. All five bacterial species encode functional RNA (gene rnpB) and protein (gene rnpA) subunits of RNP RNase P, but only the HARP of the thermophile Thermodesulfatator indicus (phylum Thermodesulfobacteria) was found to have robust tRNA 5'-end maturation activity in vitro and in vivo in an Escherichia coli RNase P depletion strain. These findings suggest that both types of RNase P are able to contribute to the essential tRNA 5'-end maturation activity in T. indicus, thus resembling the predicted evolutionary transition state in the progenitor of the Aquificaceae before the loss of rnpA and rnpB genes in this family of bacteria. Remarkably, T. indicus RNase P RNA is transcribed with a P12 expansion segment that is posttranscriptionally excised in vivo, such that the major fraction of the RNA is fragmented and thereby truncated by ∼70 nt in the native T. indicus host as well as in the E. coli complementation strain. Replacing the native P12 element of T. indicus RNase P RNA with the short P12 helix of Thermotoga maritima RNase P RNA abolished fragmentation, but simultaneously impaired complementation efficiency in E. coli cells, suggesting that intracellular fragmentation and truncation of T. indicus RNase P RNA may be beneficial to RNA folding and/or enzymatic activity.
Affiliation(s)
- Markus Gößringer
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, D-35037 Marburg, Germany
- Nadine B Wäber
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, D-35037 Marburg, Germany
- Jana C Wiegard
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, D-35037 Marburg, Germany
- Roland K Hartmann
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, D-35037 Marburg, Germany
4
Cataldo PG, Klemm P, Thüring M, Saavedra L, Hebert EM, Hartmann RK, Lechner M. Insights into 6S RNA in lactic acid bacteria (LAB). BMC Genom Data 2021; 22:29. [PMID: 34479493 PMCID: PMC8414754 DOI: 10.1186/s12863-021-00983-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2021] [Accepted: 08/12/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND 6S RNA is a regulator of cellular transcription that tunes the metabolism of cells. This small non-coding RNA is found in nearly all bacteria and is among the most abundant transcripts. Lactic acid bacteria (LAB) constitute a group of microorganisms with strong biotechnological relevance, often exploited as starter cultures for industrial products through fermentation. Some strains are used as probiotics while others represent potential pathogens. Occasional reports of 6S RNA within this group already indicate striking metabolic implications. A conceivable idea is that LAB with 6S RNA defects may metabolize nutrients faster, as inferred from studies of Escherichia coli. This may accelerate fermentation processes with the potential to reduce production costs. Similarly, elevated levels of secondary metabolites might be produced. Evidence for this possibility comes from preliminary findings regarding the production of surfactin in Bacillus subtilis, which has functions similar to those of bacteriocins. The prerequisite for its potential biotechnological utility is a general characterization of 6S RNA in LAB.
RESULTS We provide a genomic annotation of 6S RNA throughout the Lactobacillales order. It laid the foundation for a bioinformatic characterization of common 6S RNA features. This covers secondary structures, synteny, phylogeny, and product RNA start sites. The canonical 6S RNA structure is formed by a central bulge flanked by helical arms and a template site for product RNA synthesis. 6S RNA exhibits strong syntenic conservation. It is usually flanked by the replication-associated recombination protein A and the universal stress protein A. A catabolite responsive element was identified in over a third of all 6S RNA genes. It is known to modulate gene expression based on the available carbon sources. The presence of antisense transcripts could not be verified as a general trait of LAB 6S RNAs.
CONCLUSIONS Despite a large number of species and the heterogeneity of LAB, the stress regulator 6S RNA is well-conserved both from a structural as well as a syntenic perspective. This is the first approach to describe 6S RNAs and short 6S RNA-derived transcripts beyond a single species, spanning a large taxonomic group covering multiple families. It yields universal insights into this regulator and complements the findings derived from other bacterial model organisms.
Affiliation(s)
- Pablo Gabriel Cataldo
- Centro de Referencia para Lactobacilos (CERELA-CONICET), Chacabuco 145, San Miguel de Tucumán, 4000, Argentina
- Paul Klemm
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, Marbacher Weg 6, Marburg, 35032, Germany
- Marietta Thüring
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, Marbacher Weg 6, Marburg, 35032, Germany
- Lucila Saavedra
- Centro de Referencia para Lactobacilos (CERELA-CONICET), Chacabuco 145, San Miguel de Tucumán, 4000, Argentina
- Elvira Maria Hebert
- Centro de Referencia para Lactobacilos (CERELA-CONICET), Chacabuco 145, San Miguel de Tucumán, 4000, Argentina
- Roland K Hartmann
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, Marbacher Weg 6, Marburg, 35032, Germany
- Marcus Lechner
- Philipps-Universität Marburg, Institut für Pharmazeutische Chemie, Marbacher Weg 6, Marburg, 35032, Germany
- Philipps-Universität Marburg, Center for Synthetic Microbiology (Synmikro), Hans-Meerwein-Straße 6, Marburg, 35043, Germany
5
Schaller D, Geiß M, Hellmuth M, Stadler PF. Heuristic algorithms for best match graph editing. Algorithms Mol Biol 2021; 16:19. [PMID: 34404422 PMCID: PMC8369769 DOI: 10.1186/s13015-021-00196-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Accepted: 06/26/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics as a representation of the pairwise most closely related genes among multiple species. An arc connects a gene x with a gene y from another species (vertex color) Y whenever it is one of the phylogenetically closest relatives of x. BMGs can be approximated with the help of similarity measures between gene sequences, albeit not without errors. Empirical estimates thus will usually violate the theoretical properties of BMGs. The corresponding graph editing problem can be used to guide error correction for best match data. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are needed if BMGs are to be used for the practical analysis of biological sequence data.
RESULTS Since BMGs have a characterization in terms of consistency of a certain set of rooted triples (binary trees on three vertices) defined on the set of genes, we consider heuristics that operate on triple sets. As an alternative, we show that there is a close connection to a set partitioning problem that leads to a class of top-down recursive algorithms that are similar to Aho's supertree algorithm and give rise to BMG editing algorithms that are consistent in the sense that they leave BMGs invariant. Extensive benchmarking shows that community detection algorithms for the partitioning steps perform best for BMG editing.
CONCLUSION Noisy BMG data can be corrected with sufficient accuracy and efficiency to make BMGs an attractive alternative to classical phylogenetic methods.
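The approximation step mentioned in the abstract above, estimating best matches from sequence similarity, can be illustrated with a minimal sketch. It is not the authors' method: the function name `best_match_arcs`, the toy genes, and the dissimilarity values are all hypothetical, and a pairwise dissimilarity is assumed to stand in for phylogenetic relatedness. For each gene x and each other species, an arc is drawn to every gene of that species at minimum dissimilarity from x.

```python
from collections import defaultdict

def best_match_arcs(genes, dist):
    """Estimate best-match arcs from pairwise dissimilarities.

    genes: dict mapping gene name -> species (the vertex color).
    dist:  dict mapping frozenset({x, y}) -> dissimilarity for genes
           of different species (hypothetical toy input).
    Returns a set of arcs (x, y): y is a closest gene to x within y's species.
    """
    arcs = set()
    for x, species_x in genes.items():
        # Group all genes from other species by their species.
        candidates = defaultdict(list)
        for y, species_y in genes.items():
            if species_y != species_x:
                candidates[species_y].append(y)
        # Within each other species, keep every gene at minimum dissimilarity.
        for members in candidates.values():
            best = min(dist[frozenset((x, y))] for y in members)
            for y in members:
                if dist[frozenset((x, y))] == best:
                    arcs.add((x, y))
    return arcs

# Toy data (assumed for illustration): one gene in A, two in B, one in C.
GENES = {"a1": "A", "b1": "B", "b2": "B", "c1": "C"}
DIST = {
    frozenset(("a1", "b1")): 1,
    frozenset(("a1", "b2")): 3,
    frozenset(("a1", "c1")): 4,
    frozenset(("b1", "c1")): 4,
    frozenset(("b2", "c1")): 5,
}

if __name__ == "__main__":
    for arc in sorted(best_match_arcs(GENES, DIST)):
        print(arc)
```

In this toy input the relation is not symmetric: b2 points to a1, its only candidate in species A, but a1 points to b1 rather than b2. This asymmetry is why BMGs are digraphs, and noisy dissimilarities of this kind are what the editing heuristics are meant to correct.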
6
Minimal protein-only RNase P structure reveals insights into tRNA precursor recognition and catalysis. J Biol Chem 2021; 297:101028. [PMID: 34339732 PMCID: PMC8405995 DOI: 10.1016/j.jbc.2021.101028] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/18/2021] [Revised: 07/20/2021] [Accepted: 07/29/2021] [Indexed: 11/22/2022] Open
Abstract
Ribonuclease P (RNase P) is an endoribonuclease that catalyzes the processing of the 5' leader sequence of precursor tRNA (pre-tRNA). Ribonucleoprotein RNase P and protein-only RNase P (PRORP) in eukaryotes have been extensively studied, but the mechanism by which a prokaryotic nuclease recognizes and cleaves pre-tRNA is unclear. To gain insights into this mechanism, we studied homologs of Aquifex RNase P (HARPs), thought to be enzymes of approximately 23 kDa comprising only this nuclease domain. We determined the cryo-EM structure of Aq880, the first identified HARP enzyme. The structure unexpectedly revealed that Aq880 consists of both the nuclease and protruding helical (PrH) domains. Aq880 monomers assemble into a dimer via the PrH domain. Six dimers form a dodecamer with a left-handed one-turn superhelical structure. The structure also revealed that the active site of Aq880 is analogous to that of eukaryotic PRORPs. The pre-tRNA docking model demonstrated that 5' processing of pre-tRNAs is achieved by two adjacent dimers within the dodecamer. One dimer is responsible for catalysis, and the PrH domains of the other dimer are responsible for pre-tRNA elbow recognition. Our study suggests that HARPs measure an invariant distance from the pre-tRNA elbow to cleave the 5' leader sequence, which is analogous to the mechanism of eukaryotic PRORPs and the ribonucleoprotein RNase P. Collectively, these findings shed light on how different types of RNase P enzymes utilize the same pre-tRNA processing.
7
Guiral M, Giudici-Orticoni MT. Microbe Profile: Aquifex aeolicus: an extreme heat-loving bacterium that feeds on gases and inorganic chemicals. Microbiology (Reading) 2021; 167. [DOI: 10.1099/mic.0.001010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] Open
Abstract
The bacterium ‘Aquifex aeolicus’ is the model organism for the deeply rooted phylum Aquificae. This ‘water-maker’ is an H2-oxidizing microaerophile that flourishes in extremely hot marine habitats, and it also thrives on the sulphur compounds commonly found in volcanic environments. ‘A. aeolicus’ has hyper-stable proteins and a fully sequenced genome, with some of its essential metabolic pathways deciphered (including energy conservation). Many of its proteins have also been characterized (especially structurally), including many of the enzymes involved in replication, transcription, RNA processing and cell envelope biosynthesis. Enzymes that are of promise for biotechnological applications have been widely investigated in this species. ‘A. aeolicus’ has also added to our understanding of the origins of life and evolution.
Affiliation(s)
- Marianne Guiral
- BIP, UMR 7281, CNRS, Aix Marseille Université, Marseille, France
8
Wassarman KM. 6S RNA, a Global Regulator of Transcription. Microbiol Spectr 2018; 6:10.1128/microbiolspec.RWR-0019-2018. [PMID: 29916345 PMCID: PMC6013841 DOI: 10.1128/microbiolspec.rwr-0019-2018] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 01/12/2018] [Indexed: 01/06/2023] Open
Abstract
6S RNA is a small RNA regulator of RNA polymerase (RNAP) that is present broadly throughout the bacterial kingdom. Initial functional studies in Escherichia coli revealed that 6S RNA forms a complex with RNAP resulting in regulation of transcription, and cells lacking 6S RNA have altered survival phenotypes. The last decade has focused on deepening the understanding of several aspects of 6S RNA activity, including (i) addressing questions of how broadly conserved 6S RNAs are in diverse organisms through continued identification and initial characterization of divergent 6S RNAs; (ii) the nature of the 6S RNA-RNAP interaction through examination of variant proteins and mutant RNAs, cross-linking approaches, and ultimately a cryo-electron microscopic structure; (iii) the physiological consequences of 6S RNA function through identification of the 6S RNA regulon and promoter features that determine 6S RNA sensitivity; and (iv) the mechanism and cellular impact of 6S RNA-directed synthesis of product RNAs (i.e., pRNA synthesis). Much has been learned about this unusual RNA, its mechanism of action, and how it is regulated; yet much still remains to be investigated, especially regarding potential differences in behavior of 6S RNAs in diverse bacteria.
Affiliation(s)
- Karen M Wassarman
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53562
9
Lott SC, Wolfien M, Riege K, Bagnacani A, Wolkenhauer O, Hoffmann S, Hess WR. Customized workflow development and data modularization concepts for RNA-Sequencing and metatranscriptome experiments. J Biotechnol 2017; 261:85-96. [DOI: 10.1016/j.jbiotec.2017.06.1203] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/21/2017] [Revised: 06/22/2017] [Accepted: 06/26/2017] [Indexed: 12/14/2022]
10
Abstract
RNase P is an essential tRNA-processing enzyme in all domains of life. We identified an unknown type of protein-only RNase P in the hyperthermophilic bacterium Aquifex aeolicus: Without an RNA subunit and the smallest of its kind, the 23-kDa polypeptide comprises a metallonuclease domain only. The protein has RNase P activity in vitro and rescued the growth of Escherichia coli and Saccharomyces cerevisiae strains with inactivations of their more complex and larger endogenous ribonucleoprotein RNase P. Homologs of Aquifex RNase P (HARP) were identified in many Archaea and some Bacteria, of which all Archaea and most Bacteria also encode an RNA-based RNase P; activity of both RNase P forms from the same bacterium or archaeon could be verified in two selected cases. Bioinformatic analyses suggest that A. aeolicus and related Aquificaceae likely acquired HARP by horizontal gene transfer from an archaeon.
11
Abstract
6S RNA is a highly abundant small non-coding RNA widely spread among diverse bacterial groups. By competing with DNA promoters for binding to RNA polymerase (RNAP), the RNA regulates transcription on a global scale. RNAP produces small product RNAs derived from 6S RNA as template, which rearranges the 6S RNA structure leading to dissociation of 6S RNA:RNAP complexes. Although 6S RNA has been experimentally analysed in detail for some species, such as Escherichia coli and Bacillus subtilis, and was computationally predicted in many diverse bacteria, a complete and up-to-date overview of the distribution among all bacteria is missing. In this study we searched with new methods for 6S RNA genes in all currently available bacterial genomes. We ended up with a set of 1,750 6S RNA genes, of which 1,367 are novel and bona fide, distributed among 1,610 bacteria, and had a few tentative candidates among the remaining 510 assembled bacterial genomes accessible. We were able to confirm two tentative candidates by Northern blot analysis. We extended 6S RNA genes of the Flavobacteriia significantly in length compared to the present Rfam entry. We describe multiple homologs of 6S RNAs (including split 6S RNA genes) and performed a detailed synteny analysis.
Affiliation(s)
- Stefanie Wehner
- Department for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich-Schiller-University of Jena, Jena, Germany
12
Möbius P, Hölzer M, Felder M, Nordsiek G, Groth M, Köhler H, Reichwald K, Platzer M, Marz M. Comprehensive insights in the Mycobacterium avium subsp. paratuberculosis genome using new WGS data of sheep strain JIII-386 from Germany. Genome Biol Evol 2015; 7:2585-2601. [PMID: 26384038 PMCID: PMC4607514 DOI: 10.1093/gbe/evv154] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 02/04/2023] Open
Abstract
Mycobacterium avium (M. a.) subsp. paratuberculosis (MAP), the etiologic agent of Johne’s disease, affects cattle, sheep, and other ruminants worldwide. To decipher phenotypic differences between sheep and cattle strains (belonging to MAP-S [Type-I/III] and MAP-C [Type-II], respectively), comparative genome analysis needs data from diverse isolates originating from different geographic regions of the world. This study presents the best-assembled genome of a MAP-S strain so far: sheep isolate JIII-386 from Germany. One newly sequenced cattle isolate (JII-1961, Germany), four published MAP strains of MAP-C and MAP-S from the United States and Australia, and M. a. subsp. hominissuis (MAH) strain 104 were used for assembly improvement and comparisons. All genomes were annotated by BacProt and the results compared with the NCBI (National Center for Biotechnology Information) annotation. Corresponding protein-coding sequences (CDSs) were detected, but also CDSs that were exclusively determined by either NCBI or BacProt. A new Shine–Dalgarno sequence motif (5′-AGCTGG-3′) was extracted. Novel CDSs including PE-PGRS family protein genes and about 80 noncoding RNAs exhibiting high sequence conservation are presented. Previously found genetic differences between MAP-types are partially revised. Four of ten assumed MAP-S-specific large sequence polymorphism regions (LSPSs) are still present in MAP-C strains; new LSPSs were identified. Independently of the regional origin of the strains, the number of individual CDSs and single nucleotide variants confirms the strong similarity of MAP-C strains and shows higher diversity among MAP-S strains. This study gives ambiguous results regarding the hypothesis that MAP-S is the evolutionary intermediate between MAH and MAP-C, but it clearly shows a higher similarity of MAP to MAH than to Mycobacterium intracellulare.
Affiliation(s)
- Petra Möbius
- NRL for Paratuberculosis, Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut (Federal Research Institute for Animal Health), Naumburger Straße 96a, 07743 Jena, Germany
- Martin Hölzer
- RNA Bioinformatics and High Throughput Analysis, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Marius Felder
- Leibniz Institute for Age Research - Fritz-Lipmann-Institute (FLI), Beutenbergstraße 11, 07745 Jena, Germany
- Gabriele Nordsiek
- Department of Genome Analysis, Helmholtz Centre for Infection Research, Inhoffenstr. 7, 38124 Braunschweig, Germany
- Marco Groth
- Leibniz Institute for Age Research - Fritz-Lipmann-Institute (FLI), Beutenbergstraße 11, 07745 Jena, Germany
- Heike Köhler
- NRL for Paratuberculosis, Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut (Federal Research Institute for Animal Health), Naumburger Straße 96a, 07743 Jena, Germany
- Kathrin Reichwald
- Leibniz Institute for Age Research - Fritz-Lipmann-Institute (FLI), Beutenbergstraße 11, 07745 Jena, Germany
- Matthias Platzer
- Leibniz Institute for Age Research - Fritz-Lipmann-Institute (FLI), Beutenbergstraße 11, 07745 Jena, Germany
- Manja Marz
- RNA Bioinformatics and High Throughput Analysis, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
13
Abstract
Phylogenomics heavily relies on well-curated sequence data sets that comprise, for each gene, exclusively 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. Although the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer.
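The property that cograph editing restores can be made concrete with a small sketch. A cograph is a graph with no induced path on four vertices (P4), so an estimated orthology graph can be checked by brute force for such paths. This is an illustration under stated assumptions, not the method of the abstract above: the function names `induced_p4s` and `is_cograph` and the toy graph are hypothetical, and the enumeration is only practical for very small graphs.

```python
from itertools import combinations, permutations

def induced_p4s(vertices, edges):
    """List induced paths a-b-c-d in an undirected graph (brute force).

    vertices: iterable of orderable vertex names.
    edges:    iterable of 2-tuples (undirected).
    """
    edge_set = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return frozenset((u, v)) in edge_set

    found = []
    for quad in combinations(sorted(vertices), 4):
        for a, b, c, d in permutations(quad):
            if a > d:
                continue  # a-b-c-d and d-c-b-a are the same path
            path = adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
            chords = adjacent(a, c) or adjacent(a, d) or adjacent(b, d)
            if path and not chords:
                found.append((a, b, c, d))
    return found

def is_cograph(vertices, edges):
    """A graph is a cograph exactly when it has no induced P4."""
    return not induced_p4s(vertices, edges)

# Toy orthology estimate (assumed for illustration): a bare path w-x-y-z.
VERTICES = ["w", "x", "y", "z"]
NOISY = [("w", "x"), ("x", "y"), ("y", "z")]
EDITED = [("w", "x"), ("x", "y")]  # one edge removed

if __name__ == "__main__":
    print(is_cograph(VERTICES, NOISY))   # the path itself is an induced P4
    print(is_cograph(VERTICES, EDITED))  # no P4 remains after the edit
```

Editing here means inserting or deleting edges until no induced P4 remains; choosing a smallest set of such changes is the hard part, which this check does not attempt.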