Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Lechner M, Hernandez-Rosales M, Doerr D, Wieseke N, Thévenin A, Stoye J, Hartmann RK, Prohaska SJ, Stadler PF. Orthology detection combining clustering and synteny for very large datasets. PLoS One 2014;9:e105015. [PMID: 25137074 PMCID: PMC4138177 DOI: 10.1371/journal.pone.0105015] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2014] [Accepted: 07/14/2014] [Indexed: 11/18/2022] Open

Cited by Other Article(s)

Rubert DP, Braga MDV. Efficient gene orthology inference via large-scale rearrangements. Algorithms Mol Biol 2023;18:14. [PMID: 37770945 PMCID: PMC10540461 DOI: 10.1186/s13015-023-00238-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2022] [Accepted: 08/17/2023] [Indexed: 09/30/2023] Open

Abstract

BACKGROUND

Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes, our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the similarities into account, the weighted rearrangement distance between the analyzed genomes (a problem that is NP-hard). The gene matchings are then integrated into gene families in the final step. The mentioned ILP includes an optimal capping that connects each end of a linear segment of one genome to an end of a linear segment in the other genome, producing an exponential increase of the search space.

RESULTS

In this work, we design and implement a heuristic capping algorithm that replaces the optimal capping by clustering (based on their gene content intersections) the linear segments into [Formula: see text] subsets, whose ends are capped independently. Furthermore, in each subset, instead of allowing all possible connections, we let only the ends of content-related segments be connected. Although there is no guarantee that m is much bigger than one, and with the possible side effect of resulting in sub-optimal instead of optimal gene matchings, the heuristic works very well in practice, in terms of both speed and the quality of computed solutions. Our experiments on primate and fruit fly genomes show two positive results. First, for complete assemblies of five primates the version with heuristic capping reports orthologies that are very similar to the orthologies computed by the version of our tool with optimal capping. Second, we were able to efficiently analyze fruit fly genomes with incomplete assemblies distributed in hundreds or even thousands of contigs, obtaining gene families that are very similar to [Formula: see text] families. Indeed, our tool inferred a higher number of complete cliques, with a higher intersection with [Formula: see text], when compared to gene families computed by other inference tools. We added a post-processing step that refines, with the aid of the [Formula: see text] algorithm, our ambiguous families (those with more than one gene per genome), further improving the accuracy of our results. Our approach is implemented in a pipeline incorporating the pre-computation of gene similarities and the post-processing refinement of ambiguous families with [Formula: see text]. Both the original version with optimal capping and the new modified version with heuristic capping can be downloaded, together with their detailed documentation, at https://gitlab.ub.uni-bielefeld.de/gi/FFGC or as a Conda package at https://anaconda.org/bioconda/ffgc.
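The idea of clustering linear segments by gene content intersection can be illustrated with a small sketch. This is not the authors' implementation: the function name `cluster_segments`, the input layout (a mapping from segment name to a set of gene labels), and the rule "two segments belong together if they share at least one label, directly or through a chain of other segments" are all assumptions made for illustration. The number of groups returned plays the role of the subset count m mentioned in the abstract.

```python
from collections import defaultdict


def cluster_segments(segments):
    """Group segments whose gene contents are transitively linked.

    `segments` maps a (hypothetical) segment name to the set of gene labels
    it carries. Segments sharing at least one label end up in the same
    group, so the groups are the connected components of the
    content-intersection graph.
    """
    parent = {name: name for name in segments}

    def find(x):
        # Follow parent pointers to the representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link every segment to the first segment seen carrying the same label.
    first_owner = {}
    for name, genes in segments.items():
        for gene in genes:
            if gene in first_owner:
                union(first_owner[gene], name)
            else:
                first_owner[gene] = name

    groups = defaultdict(set)
    for name in segments:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Toy input: segments A1/A2 from one genome, B1/B2/B3 from another.
example = {
    "A1": {"g1", "g2"},
    "B1": {"g2", "g3"},
    "A2": {"g4"},
    "B2": {"g4", "g5"},
    "B3": {"g9"},
}
subsets = cluster_segments(example)
```

In this toy input the segments fall into three groups, each of which could then be capped independently; a real tool would work from gene similarities rather than exact shared labels.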

Wiberg RAW, Brand JN, Viktorin G, Mitchell JO, Beisel C, Schärer L. Genome assemblies of the simultaneously hermaphroditic flatworms Macrostomum cliftonense and Macrostomum hystrix. G3 (BETHESDA, MD.) 2023;13:jkad149. [PMID: 37398989 PMCID: PMC10468722 DOI: 10.1093/g3journal/jkad149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Revised: 06/05/2023] [Accepted: 06/21/2023] [Indexed: 07/04/2023]

Perez M, Aroh O, Sun Y, Lan Y, Juniper SK, Young CR, Angers B, Qian PY. Third-Generation Sequencing Reveals the Adaptive Role of the Epigenome in Three Deep-Sea Polychaetes. Mol Biol Evol 2023;40:msad172. [PMID: 37494294 PMCID: PMC10414810 DOI: 10.1093/molbev/msad172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2023] [Revised: 06/16/2023] [Accepted: 07/17/2023] [Indexed: 07/28/2023] Open

Darnet E, Teixeira B, Schaller H, Rogez H, Darnet S. Elucidating the Mesocarp Drupe Transcriptome of Açai (Euterpe oleracea Mart.): An Amazonian Tree Palm Producer of Bioactive Compounds. Int J Mol Sci 2023;24:ijms24119315. [PMID: 37298279 DOI: 10.3390/ijms24119315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2023] [Revised: 05/13/2023] [Accepted: 05/16/2023] [Indexed: 06/12/2023] Open

Abstract

Euterpe oleracea palm, endemic to the Amazon region, is well known for açai, a violet fruit beverage with nutritional and medicinal properties. During E. oleracea fruit ripening, anthocyanin accumulation is not related to sugar production, contrary to grape and blueberry. Ripened fruits have a high content of anthocyanins, isoprenoids, fibers, and proteins, and are poor in sugars. E. oleracea is proposed as a new genetic model for metabolism partitioning in the fruit. Approximately 255 million single-end-oriented reads were generated on an Ion Proton NGS platform combining fruit cDNA libraries at four ripening stages. The de novo transcriptome assembly was tested using six assemblers and 46 different combinations of parameters, a pre-processing and a post-processing step. The multiple k-mer approach with TransABySS as the assembler and Evidential Gene as the post-processor showed the best results, with an N50 of 959 bp, a read coverage mean of 70x, a BUSCO complete sequence recovery of 36% and an RBMT of 61%. The fruit transcriptome dataset included 22,486 transcripts representing 18 Mbp, of which a proportion of 87% had significant homology with other plant sequences. Approximately 904 new EST-SSRs were described, and were common and transferable to Phoenix dactylifera and Elaeis guineensis, two other palm trees. The global GO classification of transcripts showed categories similar to those in P. dactylifera and E. guineensis fruit transcriptomes. For an accurate annotation and functional description of metabolism genes, a bioinformatic pipeline was developed to precisely identify orthologs, such as one-to-one orthologs between species, and to infer multigenic family evolution. The phylogenetic inference confirmed the occurrence of duplication events in the Arecaceae lineage and the presence of orphan genes in E. oleracea. Anthocyanin and tocopherol pathways were annotated entirely. Interestingly, the anthocyanin pathway showed a high number of paralogs, similar to grape, whereas the tocopherol pathway exhibited a low and conserved gene number and the prediction of several splicing forms. The release of this exhaustively annotated molecular dataset of E. oleracea constitutes a valuable tool for further studies in metabolism partitioning and opens great new perspectives for studying fruit physiology with açai as a model.
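For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of the usual definition: the largest length L such that sequences of length at least L together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; this is not the code used in the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: total length 54, and 10 + 9 + 8 = 27 reaches half of it.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is presumably why the abstract reports it alongside coverage and BUSCO recovery.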

Gout JF, Hao Y, Johri P, Arnaiz O, Doak TG, Bhullar S, Couloux A, Guérin F, Malinsky S, Potekhin A, Sawka N, Sperling L, Labadie K, Meyer E, Duharcourt S, Lynch M. Dynamics of Gene Loss following Ancient Whole-Genome Duplication in the Cryptic Paramecium Complex. Mol Biol Evol 2023;40:msad107. [PMID: 37154524 PMCID: PMC10195154 DOI: 10.1093/molbev/msad107] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Revised: 03/30/2023] [Accepted: 05/05/2023] [Indexed: 05/10/2023] Open

Affiliation(s)

Jean-Francois Gout Department of Biology, Indiana University, Bloomington, IN Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ Department of Biological Sciences, Mississippi State University, Starkville, MS
Yue Hao Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ Cancer and Cell Biology Division, Translational Genomics Research Institute, Phoenix, AZ
Parul Johri Department of Biology, Indiana University, Bloomington, IN Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ School of Life Sciences, Arizona State University, Tempe, AZ
Olivier Arnaiz Institute for Integrative Biology of the Cell (I2BC), Commissariat à l'Energie Atomique (CEA), CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
Thomas G Doak Department of Biology, Indiana University, Bloomington, IN National Center for Genome Analysis Support, Indiana University, Bloomington, IN
Simran Bhullar Institut de biologie de l’ENS, Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, Université PSL, Paris, France
Arnaud Couloux Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Evry, France
Fréderic Guérin Université Paris Cité, CNRS, Institut Jacques Monod, Paris, France
Sophie Malinsky Institut de biologie de l’ENS, Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, Université PSL, Paris, France
Alexey Potekhin Department of Microbiology, Faculty of Biology, Saint Petersburg State University, Saint Petersburg, Russia Laboratory of Cellular and Molecular Protistology, Zoological Institute RAS, Saint Petersburg, Russia
Natalia Sawka Institute of Systematics and Evolution of Animals, Polish Academy of Sciences, Krakow, Poland
Linda Sperling Institute for Integrative Biology of the Cell (I2BC), Commissariat à l'Energie Atomique (CEA), CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
Karine Labadie Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
Eric Meyer Institut de biologie de l’ENS, Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, Université PSL, Paris, France
Sandra Duharcourt Université Paris Cité, CNRS, Institut Jacques Monod, Paris, France
Michael Lynch Department of Biology, Indiana University, Bloomington, IN Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ

Jia GS, Zhang WC, Liang Y, Liu XH, Rhind N, Pidoux A, Brysch-Herzberg M, Du LL. A high-quality reference genome for the fission yeast Schizosaccharomyces osmophilus. G3 (BETHESDA, MD.) 2023;13:jkad028. [PMID: 36748990 PMCID: PMC10085805 DOI: 10.1093/g3journal/jkad028] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 01/23/2023] [Accepted: 01/23/2023] [Indexed: 02/08/2023]

Zhilina TN, Sorokin DY, Toshchakov SV, Kublanov IV, Zavarzina DG. Natronogracilivirga saccharolytica gen. nov., sp. nov. and Cyclonatronum proteinivorum gen. nov., sp. nov., haloalkaliphilic organotrophic bacteroidetes from hypersaline soda lakes forming a new family Cyclonatronaceae fam. nov. in the order Balneolales. Syst Appl Microbiol 2023;46:126403. [PMID: 36736145 DOI: 10.1016/j.syapm.2023.126403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Revised: 01/16/2023] [Accepted: 01/22/2023] [Indexed: 01/26/2023]

Abstract

Two heterotrophic bacteroidetes strains were isolated as satellites from autotrophic enrichments inoculated with samples from hypersaline soda lakes in southwestern Siberia. Strain Z-1702^T is an obligate anaerobic fermentative saccharolytic bacterium from an iron-reducing enrichment culture, while Ca. Cyclonatronum proteinivorum Omega^T is an obligate aerobic proteolytic microorganism from a cyanobacterial enrichment. Cells of isolated bacteria are characterized by highly variable morphology. Both strains are chloride-independent moderate salt-tolerant obligate alkaliphiles and mesophiles. Strain Z-1702^T ferments glucose, maltose, fructose, mannose, sorbose, galactose, cellobiose, N-acetyl-glucosamine and alpha-glucans, including starch, glycogen, dextrin, and pullulan. Strain Omega^T is strictly proteolytic, utilizing a range of proteins and peptones. The main polar lipid fatty acid in both strains is iso-C15:0, while other major components are various C16 and C17 isomers. According to pairwise sequence alignments using BLAST, Gracilimonas was the nearest cultured relative to both strains (<90% of 16S rRNA gene sequence identity). Phylogenetic analysis placed strain Z-1702^T and strain Omega^T as two different genera in a deep-branching clade of the new family level within the order Balneolales with genus. Based on physiological characteristics and phylogenetic position of strain Z-1702^T it was proposed to represent a novel genus and species Natronogracilivirga saccharolytica gen. nov., sp. nov. (= DSMZ 109061^T = JCM 32930^T = VKM B 3262^T). Furthermore, phylogenetic and phenotypic parameters of N. saccharolytica and C. proteinivorum gen. nov., sp. nov., strain Omega^T (= JCM 31662^T = UNIQEM U979^T), make it possible to include them into a new family with a proposed designation Cyclonatronaceae fam. nov.

Tanabe TS, Dahl C. HMS-S-S: a tool for the identification of sulfur metabolism-related genes and analysis of operon structures in genome and metagenome assemblies. Mol Ecol Resour 2022;22:2758-2774. [PMID: 35579058 DOI: 10.1111/1755-0998.13642] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2022] [Revised: 04/25/2022] [Accepted: 05/11/2022] [Indexed: 11/26/2022]

Wu Y, Ren WT, Zhong YW, Guo LL, Zhou P, Xu XW. Thiosulfatihalobacter marinus gen. nov. sp. nov., a novel member of the family Roseobacteraceae, isolated from the West Pacific Ocean. Int J Syst Evol Microbiol 2022;72. [DOI: 10.1099/ijsem.0.005286] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Abstract

Two strains (GL-11-2T and ZH2-Y79) were isolated from the seawater collected from the West Pacific Ocean and the East China Sea, respectively. Cells were Gram-stain-negative, strictly aerobic, non-motile and rod-shaped. Cells grew in the medium containing 0.5–7.5 % NaCl (w/v, optimum, 1.0–3.0 %), at pH 6.0–8.0 (optimum, pH 6.5–7.0) and at 4–40 °C (optimum, 30 °C). H2S production occurred in marine broth supplemented with sodium thiosulphate. The almost-complete 16S rRNA gene sequences of the two isolates were identical, and exhibited the highest similarity to Pseudoruegeria aquimaris JCM 13603T (97.5 %), followed by Ruegeria conchae TW15T (97.2 %), Shimia aestuarii DSM 15283T (97.1 %) and Ruegeria lacuscaerulensis ITI-1157T (97.0 %). Phylogenetic analysis revealed that the isolates were affiliated with the family Roseobacteraceae and represented an independent lineage. The sole isoprenoid quinone was ubiquinone 10. The principal fatty acids were summed feature 8 (C18 : 1 ω7c and/or C18 : 1 ω6c) and cyclo-C19 : 0 ω8c. The major polar lipids were phosphatidylglycerol, phosphatidylethanolamine, phosphatidylcholine and diphosphatidylglycerol. The DNA G+C content was 62.3 mol%. The orthologous average nucleotide identity, in silico DNA–DNA hybridization and average amino acid identity values among the genomes of strain GL-11-2T and the reference strains were 73.2–79.0, 20.3–22.5 and 66.0–80.8 %, respectively. Strains GL-11-2T and ZH2-Y79 possessed complete metabolic pathways for thiosulphate oxidation, dissimilatory nitrate reduction and denitrification. Phylogenetic distinctiveness, chemotaxonomic differences and phenotypic properties revealed that the isolates represent a novel genus and species of the family Roseobacteraceae, belonging to the class Alphaproteobacteria, for which the name Thiosulfatihalobacter marinus gen. nov., sp. nov. (type strain, GL-11-2T=KCTC 82723T=MCCC M20691T) is proposed.

Ahmed M, Roberts NG, Adediran F, Smythe AB, Kocot KM, Holovachov O. Phylogenomic Analysis of the Phylum Nematoda: Conflicts and Congruences With Morphology, 18S rRNA, and Mitogenomes. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2021.769565] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open

Rubert DP, Doerr D, Braga MDV. The potential of family-free rearrangements towards gene orthology inference. J Bioinform Comput Biol 2021;19:2140014. [PMID: 34775922 DOI: 10.1142/s021972002140014x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

Ren WT, Meng FX, Guo LL, Sun L, Xu XW, Zhou P, Wu YH. Luteirhabdus pelagi gen. nov., sp. nov., a novel member of the family Flavobacteriaceae, isolated from the West Pacific Ocean. Arch Microbiol 2021;203:6021-6031. [PMID: 34698880 PMCID: PMC8590676 DOI: 10.1007/s00203-021-02557-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Revised: 08/23/2021] [Accepted: 08/26/2021] [Indexed: 12/02/2022]

Cavassim MIA, Andersen SU, Bataillon T, Schierup MH. Recombination facilitates adaptive evolution in rhizobial soil bacteria. Mol Biol Evol 2021;38:5480-5490. [PMID: 34410427 PMCID: PMC8662638 DOI: 10.1093/molbev/msab247] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open

Schaller D, Geiß M, Hellmuth M, Stadler PF. Heuristic algorithms for best match graph editing. Algorithms Mol Biol 2021;16:19. [PMID: 34404422 PMCID: PMC8369769 DOI: 10.1186/s13015-021-00196-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2021] [Accepted: 06/26/2021] [Indexed: 11/10/2022] Open

Aviña-Padilla K, Ramírez-Rafael JA, Herrera-Oropeza GE, Muley VY, Valdivia DI, Díaz-Valenzuela E, García-García A, Varela-Echavarría A, Hernández-Rosales M. Evolutionary Perspective and Expression Analysis of Intronless Genes Highlight the Conservation of Their Regulatory Role. Front Genet 2021;12:654256. [PMID: 34306008 PMCID: PMC8302217 DOI: 10.3389/fgene.2021.654256] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2021] [Accepted: 06/01/2021] [Indexed: 11/13/2022] Open

Abstract

The structure of eukaryotic genes is generally a combination of exons interrupted by intragenic non-coding DNA regions (introns) removed by RNA splicing to generate the mature mRNA. A fraction of genes, however, comprise a single coding exon with introns in their untranslated regions or are intronless genes (IGs), lacking introns entirely. The latter code for essential proteins involved in development, growth, and cell proliferation and their expression has been proposed to be highly specialized for neuro-specific functions and linked to cancer, neuropathies, and developmental disorders. The abundant presence of introns in eukaryotic genomes is pivotal for the precise control of gene expression. Nevertheless, IGs, being exempt from splicing events, entail a higher transcriptional fidelity, making them even more valuable for regulatory roles. This work aimed to infer the functional role and evolutionary history of IGs centered on the mouse genome. IGs form a subgroup of single-exon genes, including coding genes, non-coding genes, and pseudogenes, which make up approximately 6% of a total of 21,527 genes. To understand their prevalence, biological relevance, and evolution, we identified and studied 1,116 functional IG proteins, validating their differential expression in transcriptomic data of embryonic mouse telencephalon. Our results showed that overall expression levels of IGs are lower than those of multi-exon genes (MEGs). However, strongly up-regulated IGs include transcription factors (TFs) such as the class 3 of POU (HMG Box), Neurog1, Olig1, and BHLHe22, BHLHe23, among other essential genes including the β-cluster of protocadherins. Most striking was the finding that IG-encoded BHLH TFs fit the criteria to be classified as microproteins. Finally, predicted protein orthologs in six other genomes confirmed high conservation of IGs associated with regulating neural processes and with chromatin organization and epigenetic regulation in Vertebrata. Moreover, this study highlights that IGs are essential modulators of regulatory processes, such as the Wnt signaling pathway, and of biological processes as pivotal as sensory organ development, at the transcriptional and post-translational levels. Overall, our results suggest that IG proteins have specialized, prevalent, and unique biological roles and that functional divergence between IGs and MEGs is likely to be the result of specific evolutionary constraints.

Berkemer SJ, McGlynn SE. A New Analysis of Archaea-Bacteria Domain Separation: Variable Phylogenetic Distance and the Tempo of Early Evolution. Mol Biol Evol 2021;37:2332-2340. [PMID: 32316034 PMCID: PMC7403611 DOI: 10.1093/molbev/msaa089] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022] Open

Abstract

Comparative genomics and molecular phylogenetics are foundational for understanding biological evolution. Although many studies have been made with the aim of understanding the genomic contents of early life, uncertainty remains. A study by Weiss et al. (Weiss MC, Sousa FL, Mrnjavac N, Neukirchen S, Roettger M, Nelson-Sathi S, Martin WF. 2016. The physiology and habitat of the last universal common ancestor. Nat Microbiol. 1(9):16116.) identified a number of protein families in the last universal common ancestor of archaea and bacteria (LUCA) which were not found in previous works. Here, we report new research that suggests the clustering approaches used in this previous study undersampled protein families, resulting in incomplete phylogenetic trees which do not reflect protein family evolution. Phylogenetic analysis of protein families which include more sequence homologs rejects a simple LUCA hypothesis based on phylogenetic separation of the bacterial and archaeal domains for a majority of the previously identified LUCA proteins (∼82%). To supplement limitations of phylogenetic inference derived from incompletely populated orthologous groups and to test the hypothesis of a period of rapid evolution preceding the separation of the domains, we compared phylogenetic distances both within and between domains, for thousands of orthologous groups. We find a substantial diversity of interdomain versus intradomain branch lengths, even among protein families which exhibit a single domain separating branch and are thought to be associated with the LUCA. Additionally, phylogenetic trees with long interdomain branches relative to intradomain branches are enriched in information categories of protein families in comparison to those associated with metabolic functions. These results provide a new view of protein family evolution and temper claims about the phenotype and habitat of the LUCA.

Feurtey A, Lorrain C, Croll D, Eschenbrenner C, Freitag M, Habig M, Haueisen J, Möller M, Schotanus K, Stukenbrock EH. Genome compartmentalization predates species divergence in the plant pathogen genus Zymoseptoria. BMC Genomics 2020;21:588. [PMID: 32842972 PMCID: PMC7448473 DOI: 10.1186/s12864-020-06871-w] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Accepted: 06/26/2020] [Indexed: 11/25/2022] Open

Abstract

Background

Antagonistic co-evolution can drive rapid adaptation in pathogens and shape genome architecture. Comparative genome analyses of several fungal pathogens revealed highly variable genomes, for many species characterized by specific repeat-rich genome compartments with exceptionally high sequence variability. Dynamic genome structure may enable fast adaptation to host genetics. The wheat pathogen Zymoseptoria tritici, with its highly variable genome, has emerged as a model organism to study genome evolution of plant pathogens. Here, we compared genomes of Z. tritici isolates and of sister species infecting wild grasses to address the evolution of genome composition and structure.

Results

Using long-read technology, we sequenced and assembled genomes of Z. ardabiliae, Z. brevis, Z. pseudotritici and Z. passerinii, together with two isolates of Z. tritici. We report a high extent of genome collinearity among Zymoseptoria species and high conservation of genomic, transcriptomic and epigenomic signatures of compartmentalization. We identify high gene content variability both within and between species. In addition, such variability is mainly limited to the accessory chromosomes and accessory compartments. Despite strong host specificity and non-overlapping host range between species, predicted effectors are mainly shared among Zymoseptoria species, yet exhibit a high level of presence-absence polymorphism within Z. tritici. Using in planta transcriptomic data from Z. tritici, we suggest different roles for the shared orthologs and for the accessory genes during infection of their hosts.

Conclusion

Despite previous reports of high genomic plasticity in Z. tritici, we describe here a high level of conservation in genomic, epigenomic and transcriptomic composition and structure across the genus Zymoseptoria. The compartmentalized genome allows the maintenance of a functional core genome co-occurring with a highly variable accessory genome.

Affiliation(s)

Alice Feurtey Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany
Cécile Lorrain Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany; INRA Centre Grand Est - Nancy, UMR 1136 INRA/Universite de Lorraine Interactions Arbres/Microorganismes, 54280, Champenoux, France
Daniel Croll Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, 2000, Neuchâtel, Switzerland
Christoph Eschenbrenner Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany
Michael Freitag Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, USA
Michael Habig Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany
Janine Haueisen Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany
Mareike Möller Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany; Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, USA
Klaas Schotanus Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany; Department of Molecular Genetics and Microbiology, Duke University, Duke University Medical Center, Durham, NC, 27710, USA
Eva H Stukenbrock Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany; Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Kiel, Germany

Lafond M, Hellmuth M. Reconstruction of time-consistent species trees. Algorithms Mol Biol 2020;15:16. [PMID: 32843891 PMCID: PMC7439642 DOI: 10.1186/s13015-020-00175-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2020] [Accepted: 07/25/2020] [Indexed: 02/04/2023] Open

Dual RNA-seq of Orientia tsutsugamushi informs on host-pathogen interactions for this neglected intracellular human pathogen. Nat Commun 2020;11:3363. [PMID: 32620750 PMCID: PMC7335160 DOI: 10.1038/s41467-020-17094-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2019] [Accepted: 06/11/2020] [Indexed: 12/12/2022] Open

The first transcriptomic resource for the flatworm Triaenophorus nodulosus (Cestoda: Bothriocephalidea), a common parasite of holarctic freshwater fish. Mar Genomics 2020. [DOI: 10.1016/j.margen.2019.100702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Galperin MY, Kristensen DM, Makarova KS, Wolf YI, Koonin EV. Microbial genome analysis: the COG approach. Brief Bioinform 2020;20:1063-1070. [PMID: 28968633 DOI: 10.1093/bib/bbx117] [Citation(s) in RCA: 152] [Impact Index Per Article: 38.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2017] [Revised: 08/01/2017] [Indexed: 11/15/2022] Open

Stadler PF, Geiß M, Schaller D, López Sánchez A, González Laffitte M, Valdivia DI, Hellmuth M, Hernández Rosales M. From pairs of most similar sequences to phylogenetic best matches. Algorithms Mol Biol 2020;15:5. [PMID: 32308731 PMCID: PMC7147060 DOI: 10.1186/s13015-020-00165-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2019] [Accepted: 03/26/2020] [Indexed: 11/10/2022] Open

Affiliation(s)

Peter F. Stadler Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany; Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv), and Leipzig Research Center for Civilization Diseases, Universität Leipzig, Augustusplatz 12, 04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany; Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, 111321 Bogotá, D.C. Colombia; Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501 USA
Manuela Geiß Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany; Software Competence Center Hagenberg GmbH, Softwarepark 21, 4232 Hagenberg, Austria
David Schaller Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
Alitzel López Sánchez CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
Marcos González Laffitte CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
Dulce I. Valdivia Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (CINVESTAV), Km. 9.6 Libramiento Norte Carretera Irapuato-León, 36821 Irapuato, GTO México
Marc Hellmuth School of Computing, University of Leeds, E C Stoner Building, Leeds, LS2 9JT UK
Maribel Hernández Rosales CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México

Cavassim MIA, Moeskjær S, Moslemi C, Fields B, Bachmann A, Vilhjálmsson BJ, Schierup MH, Young JPW, Andersen SU. Symbiosis genes show a unique pattern of introgression and selection within a Rhizobium leguminosarum species complex. Microb Genom 2020;6:e000351. [PMID: 32176601 PMCID: PMC7276703 DOI: 10.1099/mgen.0.000351] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2019] [Accepted: 02/17/2020] [Indexed: 12/22/2022] Open

Geiß M, Laffitte MEG, Sánchez AL, Valdivia DI, Hellmuth M, Rosales MH, Stadler PF. Best match graphs and reconciliation of gene trees with species trees. J Math Biol 2020;80:1459-1495. [PMID: 32002659 PMCID: PMC7052050 DOI: 10.1007/s00285-020-01469-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2019] [Revised: 01/08/2020] [Indexed: 11/19/2022]

Affiliation(s)

Manuela Geiß Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
Marcos E. González Laffitte CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO Mexico
Alitzel López Sánchez CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO Mexico
Dulce I. Valdivia Centro de Ciencias Básicas, Universidad Autónoma de Aguascalientes, Av. Universidad 940, 20131 Aguascalientes, AGS México; Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO Mexico
Marc Hellmuth Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Straße 47, 17487 Greifswald, Germany; Center for Bioinformatics, Saarland University, Building E 2.1, P.O. Box 151150, 66041 Saarbrücken, Germany
Maribel Hernández Rosales CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO Mexico
Peter F. Stadler Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany; Competence Center for Scalable Data Services and Solutions, Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany; Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany; Inst. f. Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia; Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501 USA

Geiß M, Stadler PF, Hellmuth M. Reciprocal best match graphs. J Math Biol 2019;80:865-953. [PMID: 31691135 DOI: 10.1007/s00285-019-01444-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2019] [Revised: 06/10/2019] [Indexed: 11/24/2022]

Hellmuth M, Huber KT, Moulton V. Reconciling event-labeled gene trees with MUL-trees and species networks. J Math Biol 2019;79:1885-1925. [PMID: 31410552 DOI: 10.1007/s00285-019-01414-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2018] [Revised: 05/08/2019] [Indexed: 11/30/2022]

Slabaugh E, Desai JS, Sartor RC, Lawas LMF, Jagadish SVK, Doherty CJ. Analysis of differential gene expression and alternative splicing is significantly influenced by choice of reference genome. RNA 2019;25:669-684. [PMID: 30872414 PMCID: PMC6521602 DOI: 10.1261/rna.070227.118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/05/2019] [Accepted: 03/06/2019] [Indexed: 05/19/2023]

Hellmuth M, Seemann CR. Alternative characterizations of Fitch's xenology relation. J Math Biol 2019;79:969-986. [PMID: 31111195 DOI: 10.1007/s00285-019-01384-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2018] [Revised: 05/08/2019] [Indexed: 11/25/2022]

Drukewitz SH, von Reumont BM. The Significance of Comparative Genomics in Modern Evolutionary Venomics. Front Ecol Evol 2019. [DOI: 10.3389/fevo.2019.00163] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022] Open

Prost S, Armstrong EE, Nylander J, Thomas GWC, Suh A, Petersen B, Dalen L, Benz BW, Blom MPK, Palkopoulou E, Ericson PGP, Irestedt M. Comparative analyses identify genomic features potentially involved in the evolution of birds-of-paradise. Gigascience 2019;8:giz003. [PMID: 30689847 PMCID: PMC6497032 DOI: 10.1093/gigascience/giz003] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2018] [Revised: 10/30/2018] [Accepted: 01/10/2019] [Indexed: 12/14/2022] Open

Affiliation(s)

Stefan Prost Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden; Department of Integrative Biology, University of California, 3040 Valley Life Science Building, Berkeley, CA 94720-3140, USA
Ellie E Armstrong Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA 94305–5020, USA
Johan Nylander Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden
Gregg W C Thomas Department of Biology and School of Informatics, Computing, and Engineering, Indiana University, 1001 E. Third Street, Bloomington, IN 47405, USA
Alexander Suh Department of Evolutionary Biology (EBC), Uppsala University, Norbyvaegen 14-18, 75236 Uppsala, Sweden
Bent Petersen Natural History Museum of Denmark, University of Copenhagen, Oster Voldgade 5-7, 1353 Copenhagen, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery, Faculty of Applied Sciences, Asian Institute of Medicine, Science and Technology, Jalan Bedong-Semeling, 08100 Bedong, Kedah, Malaysia
Love Dalen Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden
Brett W Benz Department of Ornithology, American Museum of Natural History, Central Park West, New York, NY 10024, USA
Mozes P K Blom Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden
Eleftheria Palkopoulou Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden
Per G P Ericson Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden
Martin Irestedt Department of Biodiversity and Genetics, Swedish Museum of Natural History, Frescativaegen 40, 114 18 Stockholm, Sweden

Best match graphs. J Math Biol 2019;78:2015-2057. [PMID: 30968198 PMCID: PMC6534531 DOI: 10.1007/s00285-019-01332-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2018] [Revised: 12/15/2018] [Indexed: 10/27/2022]

Armstrong EE, Taylor RW, Prost S, Blinston P, van der Meer E, Madzikanda H, Mufute O, Mandisodza-Chikerema R, Stuelpnagel J, Sillero-Zubiri C, Petrov D. Cost-effective assembly of the African wild dog (Lycaon pictus) genome using linked reads. Gigascience 2019;8:5140148. [PMID: 30346553 PMCID: PMC6350039 DOI: 10.1093/gigascience/giy124] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2017] [Accepted: 10/07/2018] [Indexed: 01/07/2023] Open

Georgescu CH, Manson AL, Griggs AD, Desjardins CA, Pironti A, Wapinski I, Abeel T, Haas BJ, Earl AM. SynerClust: a highly scalable, synteny-aware orthologue clustering tool. Microb Genom 2018;4. [PMID: 30418868 PMCID: PMC6321874 DOI: 10.1099/mgen.0.000231] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Nallu S, Hill JA, Don K, Sahagun C, Zhang W, Meslin C, Snell-Rood E, Clark NL, Morehouse NI, Bergelson J, Wheat CW, Kronforst MR. The molecular genetic basis of herbivory between butterflies and their host plants. Nat Ecol Evol 2018;2:1418-1427. [PMID: 30076351 PMCID: PMC6149523 DOI: 10.1038/s41559-018-0629-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2017] [Accepted: 07/02/2018] [Indexed: 12/30/2022]

Gene Phylogenies and Orthologous Groups. Methods Mol Biol 2018. [PMID: 29277861 DOI: 10.1007/978-1-4939-7463-4_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2024]

Waldl M, Thiel BC, Ochsenreiter R, Holzenleiter A, de Araujo Oliveira JV, Walter MEMT, Wolfinger MT, Stadler PF. TERribly Difficult: Searching for Telomerase RNAs in Saccharomycetes. Genes (Basel) 2018;9:genes9080372. [PMID: 30049970 PMCID: PMC6115765 DOI: 10.3390/genes9080372] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2018] [Revised: 07/17/2018] [Accepted: 07/18/2018] [Indexed: 11/20/2022] Open

Fertin G, Hüffner F, Komusiewicz C, Sorge M. Matching algorithms for assigning orthologs after genome duplication events. Comput Biol Chem 2018;74:379-390. [PMID: 29650458 DOI: 10.1016/j.compbiolchem.2018.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2018] [Accepted: 03/13/2018] [Indexed: 11/25/2022]

Nøjgaard N, Geiß M, Merkle D, Stadler PF, Wieseke N, Hellmuth M. Time-consistent reconciliation maps and forbidden time travel. Algorithms Mol Biol 2018;13:2. [PMID: 29441122 PMCID: PMC5800358 DOI: 10.1186/s13015-018-0121-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2017] [Accepted: 01/20/2018] [Indexed: 12/04/2022] Open

Abstract

Background

In the absence of horizontal gene transfer it is possible to reconstruct the history of gene families from empirically determined orthology relations, which are equivalent to event-labeled gene trees. Knowledge of the event labels considerably simplifies the problem of reconciling a gene tree T with a species tree S, relative to the reconciliation problem without prior knowledge of the event types. It is well-known that optimal reconciliations in the unlabeled case may violate time-consistency and thus are not biologically feasible. Here we investigate the mathematical structure of the event-labeled reconciliation problem with horizontal transfer.

Results

We investigate the issue of time-consistency for the event-labeled version of the reconciliation problem, provide a convenient axiomatic framework, and derive a complete characterization of time-consistent reconciliations. This characterization depends on certain weak conditions on the event-labeled gene trees that reflect conditions under which evolutionary events are observable at least in principle. We give an $\mathcal{O}(|V(T)| \log(|V(S)|))$-time algorithm to decide whether a time-consistent reconciliation map exists. It does not require the construction of explicit timing maps, but relies entirely on the comparably easy task of checking whether a small auxiliary graph is acyclic. The algorithms are implemented in C++ using the boost graph library and are freely available at https://github.com/Nojgaard/tc-recon.
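
The acyclicity test mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation, and it does not reproduce how the auxiliary graph is built from T and S; it only shows a generic check (Kahn's algorithm) that a directed graph has no directed cycle. The function name `is_acyclic` and the reading of an edge (u, v) as "u must come before v" are assumptions made for illustration.

```python
from collections import deque


def is_acyclic(num_nodes, edges):
    """Return True if the directed graph on nodes 0..num_nodes-1 has no cycle.

    Hypothetical helper: edges is an iterable of (u, v) pairs, read here as
    "u must come before v". Uses Kahn's algorithm (repeated removal of
    nodes with in-degree zero).
    """
    successors = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Start with every node that has no incoming edge.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # If some node was never removed, it lies on or behind a directed cycle.
    return removed == num_nodes
```

If such a check succeeds, the order in which nodes are removed is a topological order, which is one way a consistent ordering of the events could be read off in this simplified setting.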

Significance

The combinatorial characterization of time consistency and thus biologically feasible reconciliation is an important step towards the inference of gene family histories with horizontal transfer from orthology data, i.e., without presupposed gene and species trees. The fast algorithm to decide time consistency is useful in a broader context because it constitutes an attractive component for all tools that address tree reconciliation problems.

Palmer JM, Drees KP, Foster JT, Lindner DL. Extreme sensitivity to ultraviolet light in the fungal pathogen causing white-nose syndrome of bats. Nat Commun 2018;9:35. [PMID: 29295979 PMCID: PMC5750222 DOI: 10.1038/s41467-017-02441-z] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2017] [Accepted: 11/30/2017] [Indexed: 02/08/2023] Open

Zheng A, Jiang B, Li Y, Zhang X, Ding C. Elastic K-means using posterior probability. PLoS One 2017;12:e0188252. [PMID: 29240756 PMCID: PMC5730165 DOI: 10.1371/journal.pone.0188252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2017] [Accepted: 11/05/2017] [Indexed: 11/30/2022] Open

Jahangiri-Tazehkand S, Wong L, Eslahchi C. OrthoGNC: A Software for Accurate Identification of Orthologs Based on Gene Neighborhood Conservation. Genomics Proteomics Bioinformatics 2017;15:361-370. [PMID: 29133277 PMCID: PMC5828658 DOI: 10.1016/j.gpb.2017.07.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/22/2017] [Revised: 07/17/2017] [Accepted: 07/28/2017] [Indexed: 11/17/2022]

Genome-Guided Phylo-Transcriptomic Methods and the Nuclear Phylogentic Tree of the Paniceae Grasses. Sci Rep 2017;7:13528. [PMID: 29051622 PMCID: PMC5648822 DOI: 10.1038/s41598-017-13236-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2017] [Accepted: 09/20/2017] [Indexed: 11/23/2022] Open

Sharma A, Wai CM, Ming R, Yu Q. Diurnal Cycling Transcription Factors of Pineapple Revealed by Genome-Wide Annotation and Global Transcriptomic Analysis. Genome Biol Evol 2017;9:2170-2190. [PMID: 28922793 PMCID: PMC5737478 DOI: 10.1093/gbe/evx161] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/22/2017] [Indexed: 12/22/2022] Open

Hellmuth M. Biologically feasible gene trees, reconciliation maps and informative triples. Algorithms Mol Biol 2017;12:23. [PMID: 28861118 PMCID: PMC5576477 DOI: 10.1186/s13015-017-0114-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2017] [Accepted: 08/16/2017] [Indexed: 11/17/2022] Open

Abstract

BACKGROUND

The history of gene families (which are equivalent to event-labeled gene trees) can be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are biologically feasible, that is, if there is a possible true history that would explain a given gene tree. In practice, this problem boils down to finding a reconciliation map (also known as a DTL-scenario) between the event-labeled gene trees and a (possibly unknown) species tree.

RESULTS

In this contribution, we first characterize whether there is a valid reconciliation map for binary event-labeled gene trees T that contain speciation, duplication and horizontal gene transfer events and some unknown species tree S in terms of "informative" triples that are displayed in T and provide information on the topology of S. These informative triples are used to infer the unknown species tree S for T. We obtain a similar result for non-binary gene trees. To this end, however, the reconciliation map needs to be further restricted. We provide a polynomial-time algorithm to decide whether there is a species tree for a given event-labeled gene tree, and in the positive case, to construct the species tree and the respective (restricted) reconciliation map. However, informative triples as well as DTL-scenarios have their limitations when they are used to explain the biological feasibility of gene trees. While reconciliation maps imply biological feasibility, we show that the converse is not true in general. Moreover, we show that informative triples provide enough information to characterize neither "relaxed" DTL-scenarios nor non-restricted reconciliation maps for non-binary biologically feasible gene trees.
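
To make the notion of a triple "displayed" in a tree concrete, here is a minimal Python sketch. It is a simplification and not the method of the cited article: it enumerates all rooted triples ab|c displayed by a tree and ignores the event labels, the species assignment and the handling of transfer edges that decide which triples count as "informative". The nested-tuple tree encoding and the function names `leaves` and `displayed_triples` are assumptions made for illustration; leaf labels are assumed to be distinct strings.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result


def displayed_triples(tree):
    """Return all rooted triples ab|c displayed by the tree.

    A triple is encoded as (frozenset({a, b}), c): leaves a and b share a
    common ancestor that lies strictly below the common ancestor of a, b, c.
    """
    triples = set()
    if isinstance(tree, str):
        return triples
    child_leaves = [leaves(child) for child in tree]
    for i, inside in enumerate(child_leaves):
        # Leaves hanging from the other children of this inner node.
        outside = [x for j, other in enumerate(child_leaves) if j != i
                   for x in other]
        for a, b in combinations(inside, 2):
            for c in outside:
                triples.add((frozenset((a, b)), c))
    # Triples whose three leaves all sit below a single child.
    for child in tree:
        triples |= displayed_triples(child)
    return triples
```

For the tree `((("a", "b"), "c"), "d")` this yields ab|c, ab|d, ac|d and bc|d, while an unresolved star tree `("a", "b", "c")` displays no triple at all.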

Positive diversifying selection is a pervasive adaptive force throughout the Drosophila radiation. Mol Phylogenet Evol 2017;112:230-243. [DOI: 10.1016/j.ympev.2017.04.023] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2016] [Revised: 04/26/2017] [Accepted: 04/26/2017] [Indexed: 01/02/2023]

Battenberg K, Lee EK, Chiu JC, Berry AM, Potter D. OrthoReD: a rapid and accurate orthology prediction tool with low computational requirement. BMC Bioinformatics 2017. [PMID: 28633662 PMCID: PMC5479036 DOI: 10.1186/s12859-017-1726-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

Background

Identifying orthologous genes is an initial step required for phylogenetics, and it is also a common strategy employed in functional genetics to find candidates for functionally equivalent genes across multiple species. At the same time, in silico orthology prediction tools often require large computational resources only available on computing clusters. Here we present OrthoReD, an open-source orthology prediction tool with accuracy comparable to published tools that requires only a desktop computer. The low computational resource requirement of OrthoReD is achieved by repeating orthology searches on one gene of interest at a time, thereby generating a reduced dataset to limit the scope of orthology search for each gene of interest.

Results

The output of OrthoReD was highly similar to the outputs of two other published orthology prediction tools, OrthologID and/or OrthoDB, for the three datasets tested, which represented three phyla with different ranges of species diversity and different numbers of genomes included. Median CPU time for ortholog prediction per gene by OrthoReD executed on a desktop computer was <15 min even for the largest dataset tested, which included all coding sequences of 100 bacterial species.

Conclusions

With high-throughput sequencing, unprecedented numbers of genes from non-model organisms are available with increasing need for clear information about their orthologies and/or functional equivalents in model organisms. OrthoReD is not only fast and accurate as an orthology prediction tool, but also gives researchers flexibility in the number of genes analyzed at a time, without requiring a high-performance computing cluster.

Doerr D, Kowada LAB, Araujo E, Deshpande S, Dantas S, Moret BME, Stoye J. New Genome Similarity Measures based on Conserved Gene Adjacencies. J Comput Biol 2017;24:616-634. [PMID: 28590847 DOI: 10.1089/cmb.2017.0065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Abstract

Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful (but also most complex) models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases the computational costs are the same as in the general family-free case, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that differ in expressive power. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.

Doerr D, Balaban M, Feijão P, Chauve C. The gene family-free median of three. Algorithms Mol Biol 2017;12:14. [PMID: 28559921 PMCID: PMC5446766 DOI: 10.1186/s13015-017-0106-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2017] [Accepted: 05/18/2017] [Indexed: 11/20/2022] Open

Abstract

Background

The gene family-free framework for comparative genomics aims at providing methods for gene order analysis that do not require prior gene family assignment, but work directly on a sequence similarity graph. We study two problems related to the breakpoint median of three genomes, which asks for the construction of a fourth genome that minimizes the sum of breakpoint distances to the input genomes.
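
The breakpoint median objective described above can be sketched in Python for the simplest, family-based setting. This is an illustration under stated assumptions, not the family-free model of the cited article: each genome is taken to be one linear chromosome of unsigned genes, every gene occurs exactly once in every genome, and chromosome ends (telomeres) are ignored. The function names `adjacencies`, `breakpoint_distance` and `median_score` are hypothetical.

```python
def adjacencies(genome):
    """Return the set of unordered pairs of neighbouring genes."""
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def breakpoint_distance(genome_a, genome_b):
    """Count adjacencies of genome_a that are not conserved in genome_b."""
    return len(adjacencies(genome_a) - adjacencies(genome_b))


def median_score(candidate, genomes):
    """Sum of breakpoint distances from a candidate median to the inputs.

    A breakpoint median is a candidate genome that minimizes this sum.
    """
    return sum(breakpoint_distance(candidate, g) for g in genomes)


# Three toy input genomes over the same four genes.
g1 = [1, 2, 3, 4]
g2 = [1, 2, 4, 3]
g3 = [2, 1, 3, 4]
score_of_g1 = median_score(g1, [g1, g2, g3])
```

In the family-free setting of the abstract, genes are not pre-assigned to families, so adjacencies cannot be compared by simple equality as done here; the score of an adjacency instead takes sequence similarity into account.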

Methods

We present a model for constructing a median of three genomes in this family-free setting, based on maximizing an objective function that generalizes the classical breakpoint distance by integrating sequence similarity in the score of a gene adjacency. We study its computational complexity and we describe an integer linear program (ILP) for its exact solution. We further discuss a related problem called family-free adjacencies for k genomes for the special case of $k \le 3$ and present an ILP for its solution. However, for this problem, the computation of exact solutions remains intractable for sufficiently large instances. We then proceed to describe a heuristic method, FFAdj-AM, which performs well in practice.

Results

The developed methods compute accurate positional orthologs for genomes comparable in size to bacterial genomes on simulated data and genomic data acquired from the OMA orthology database. In particular, FFAdj-AM performs as well as or better than the well-established gene family prediction tool MultiMSOAR.

Conclusions

We study the computational complexity of a new family-free model and present algorithms for its solution. With FFAdj-AM, we propose an appealing alternative to established tools for identifying higher confidence positional orthologs.

Leimbach A, Poehlein A, Vollmers J, Görlich D, Daniel R, Dobrindt U. No evidence for a bovine mastitis Escherichia coli pathotype. BMC Genomics 2017;18:359. [PMID: 28482799 PMCID: PMC5422975 DOI: 10.1186/s12864-017-3739-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2016] [Accepted: 04/27/2017] [Indexed: 11/30/2022] Open

Abstract

Background

Escherichia coli bovine mastitis is a disease of significant economic importance in the dairy industry. Molecular characterization of mastitis-associated E. coli (MAEC) did not result in the identification of common traits. Nevertheless, a mammary pathogenic E. coli (MPEC) pathotype has been proposed suggesting virulence traits that differentiate MAEC from commensal E. coli. The present study was designed to investigate the MPEC pathotype hypothesis by comparing the genomes of MAEC and commensal bovine E. coli.

Results

We sequenced the genomes of eight E. coli isolated from bovine mastitis cases and six fecal commensal isolates from udder-healthy cows. We analyzed the phylogenetic history of bovine E. coli genomes by supplementing this strain panel with eleven bovine-associated E. coli from public databases. The majority of the isolates originate from phylogroups A and B1, but neither MAEC nor commensal strains could be unambiguously distinguished by phylogenetic lineage. The gene content of both MAEC and commensal strains is highly diverse and dominated by their phylogenetic background. Although individual strains carry some typical E. coli virulence-associated genes, no traits important for pathogenicity could be specifically attributed to MAEC. Instead, both commensal strains and MAEC have very few gene families enriched in either pathotype. Only the aerobactin siderophore gene cluster was enriched in commensal E. coli within our strain panel.

Conclusions

This is the first characterization of a phylogenetically diverse strain panel including several MAEC and commensal isolates. With our comparative genomics approach we could not confirm previous studies that argue for a positive selection of specific traits enabling MAEC to elicit bovine mastitis. Instead, MAEC are facultative and opportunistic pathogens recruited from the highly diverse bovine gastrointestinal microbiota. Virulence-associated genes implicated in mastitis are a by-product of commensalism with the primary function to enhance fitness in the bovine gastrointestinal tract. Therefore, we put the definition of the MPEC pathotype into question and suggest designating corresponding isolates as MAEC.

Yue JX, Li J, Aigrain L, Hallin J, Persson K, Oliver K, Bergström A, Coupland P, Warringer J, Lagomarsino MC, Fischer G, Durbin R, Liti G. Contrasting evolutionary genome dynamics between domesticated and wild yeasts. Nat Genet 2017;49:913-924. [PMID: 28416820 PMCID: PMC5446901 DOI: 10.1038/ng.3847] [Citation(s) in RCA: 208] [Impact Index Per Article: 29.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2016] [Accepted: 03/22/2017] [Indexed: 12/13/2022]