Reference Citation Analysis

For: Salichos L, Rokas A. Evaluating ortholog prediction algorithms in a yeast model clade. PLoS One 2011;6:e18755. [PMID: 21533202 PMCID: PMC3076445 DOI: 10.1371/journal.pone.0018755] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Received: 11/10/2010] [Accepted: 03/15/2011] [Indexed: 11/18/2022]

Cited by Other Article(s)

Mulhair PO, McCarthy CGP, Siu-Ting K, Creevey CJ, O'Connell MJ. Filtering artifactual signal increases support for Xenacoelomorpha and Ambulacraria sister relationship in the animal tree of life. Curr Biol 2022;32:5180-5188.e3. [PMID: 36356574 DOI: 10.1016/j.cub.2022.10.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2021] [Revised: 08/09/2022] [Accepted: 10/18/2022] [Indexed: 11/10/2022]

Abstract

Conflicting studies place a group of bilaterian invertebrates containing xenoturbellids and acoelomorphs, the Xenacoelomorpha, as either the primary emerging bilaterian phylum¹,²,³,⁴,⁵,⁶ or within Deuterostomia, sister to Ambulacraria.⁷,⁸,⁹,¹⁰,¹¹ Although their placement as sister to the rest of Bilateria supports relatively simple morphology in the ancestral bilaterian, their alternative placement within Deuterostomia suggests a morphologically complex ancestral bilaterian along with extensive loss of major phenotypic traits in the Xenacoelomorpha. Recent studies have questioned whether Deuterostomia should be considered monophyletic at all.¹⁰,¹²,¹³ Hidden paralogy and poor phylogenetic signal present a major challenge for reconstructing species phylogenies.¹⁴,¹⁵,¹⁶,¹⁷,¹⁸ Here, we assess whether these issues have contributed to the conflict over the placement of Xenacoelomorpha. We reanalyzed published datasets, enriching for orthogroups whose gene trees support well-resolved clans elsewhere in the animal tree.¹⁶ We find that most genes in previously published datasets violate incontestable clans, suggesting that hidden paralogy and low phylogenetic signal affect the ability to reconstruct branching patterns at deep nodes in the animal tree. We demonstrate that removing orthogroups that cannot recapitulate incontestable relationships alters the final topology that is inferred, while simultaneously improving the fit of the model to the data. We discover increased, but ultimately not conclusive, support for the existence of Xenambulacraria in our set of filtered orthogroups. At a time when we are progressing toward sequencing all life on the planet, we argue that long-standing contentious issues in the tree of life will be resolved using smaller amounts of better quality data that can be modeled adequately.¹⁹
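The clan-based filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the nested-tuple tree representation, the function names, and the toy orthogroups `OG1` and `OG2` are all assumptions made for illustration. A set of taxa is treated as a clan when it forms one side of a split in the (unrooted) gene tree.

```python
def leaf_sets(tree):
    """Return (all leaves, leaf set of every subtree) for a nested-tuple tree."""
    if isinstance(tree, str):  # a leaf is a taxon name
        s = frozenset([tree])
        return s, [s]
    leaves = frozenset()
    subsets = []
    for child in tree:
        child_leaves, child_subsets = leaf_sets(child)
        leaves |= child_leaves
        subsets.extend(child_subsets)
    subsets.append(leaves)
    return leaves, subsets


def respects_clan(tree, clan):
    """True if the clan's taxa present in the tree form one side of a split."""
    all_leaves, subsets = leaf_sets(tree)
    present = frozenset(clan) & all_leaves
    if len(present) < 2 or present == all_leaves:
        return True  # uninformative for this gene tree (assumed convention)
    return any(s == present or all_leaves - s == present for s in subsets)


def filter_orthogroups(gene_trees, clans):
    """Keep orthogroups whose gene tree respects every supplied clan."""
    return [
        name
        for name, tree in gene_trees.items()
        if all(respects_clan(tree, clan) for clan in clans)
    ]


# Hypothetical toy data: taxa A and B are assumed to form an incontestable clan.
gene_trees = {
    "OG1": ((("A", "B"), "C"), ("D", "E")),
    "OG2": ((("A", "D"), "C"), ("B", "E")),
}
kept = filter_orthogroups(gene_trees, [{"A", "B"}])
```

In this toy input, `OG2` splits A from B and would be discarded, while `OG1` would be retained.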

Xiong H, Wang D, Shao C, Yang X, Yang J, Ma T, Davis CC, Liu L, Xi Z. Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication. Syst Biol 2022;71:1348-1361. [PMID: 35689633 PMCID: PMC9558847 DOI: 10.1093/sysbio/syac040] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/08/2021] [Revised: 06/03/2022] [Accepted: 06/07/2022] [Indexed: 12/02/2022]

Abstract

Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair commonly undergoes pseudogenization and is eventually lost. When speciation occurs shortly after WGD, however, differential loss of paralogs may lead to spurious phylogenetic inference resulting from the inclusion of pseudoorthologs: paralogous genes mistakenly identified as orthologs because they are present in single copies within each sampled species. The influence and impact of including pseudoorthologs versus true orthologs as a result of gene extinction (or incomplete laboratory sampling) are only recently gaining empirical attention in the phylogenomics community. Moreover, few studies have investigated this phenomenon in an explicit coalescent framework. Here, using mathematical models, numerous simulated data sets, and two newly assembled empirical data sets, we assess the effect of pseudoorthologs on species tree estimation under varying degrees of incomplete lineage sorting (ILS) and differential gene loss scenarios following WGD. When gene loss occurs along the terminal branches of the species tree, alignment-based (BPP) and gene-tree-based (ASTRAL, MP-EST, and STAR) coalescent methods are adversely affected as the degree of ILS increases. Their accuracy can be greatly improved by sampling a sufficiently large number of genes. Under the same circumstances, however, concatenation methods consistently estimate incorrect species trees as the number of genes increases. Additionally, pseudoorthologs can greatly mislead species tree inference when gene loss occurs along the internal branches of the species tree. Here, both coalescent and concatenation methods yield inconsistent results. These results underscore the importance of understanding the influence of pseudoorthologs in the phylogenomics era. [Coalescent method; concatenation method; incomplete lineage sorting; pseudoorthologs; single-copy gene; whole-genome duplication.]

Zhang C, Zhao Y, Braun EL, Mirarab S. TAPER: Pinpointing errors in multiple sequence alignments despite varying rates of evolution. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13696] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]

Shen XX, Steenwyk JL, Rokas A. Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data. Syst Biol 2021;70:997-1014. [PMID: 33616672 DOI: 10.1093/sysbio/syab011] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/08/2020] [Revised: 02/10/2021] [Accepted: 02/17/2021] [Indexed: 12/12/2022]

Abstract

Topological conflict or incongruence is widespread in phylogenomic data. Concatenation- and coalescent-based approaches often result in incongruent topologies, but the causes of this conflict can be difficult to characterize. We examined incongruence stemming from conflict between likelihood-based signal (quantified by the difference in gene-wise log likelihood score, or ΔGLS) and quartet-based topological signal (quantified by the difference in gene-wise quartet score, or ΔGQS) for every gene in three phylogenomic studies in animals, fungi, and plants, which were chosen because their concatenation-based IQ-TREE (T1) and quartet-based ASTRAL (T2) phylogenies are known to produce eight conflicting internal branches (bipartitions). By comparing the types of phylogenetic signal for all genes in these three data matrices, we found that 30% to 36% of genes in each data matrix are inconsistent; that is, each of these genes has a higher log likelihood score for T1 than for T2 (i.e., ΔGLS > 0) whereas its T1 topology has a lower quartet score than its T2 topology (i.e., ΔGQS < 0), or vice versa. Comparison of inconsistent and consistent genes using a variety of metrics (e.g., evolutionary rate, gene tree topology, distribution of branch lengths, hidden paralogy, and gene tree discordance) showed that inconsistent genes are more likely to recover neither T1 nor T2 and have higher levels of gene tree discordance than consistent genes. Simulation analyses demonstrate that removal of inconsistent genes from datasets with low levels of incomplete lineage sorting (ILS) and low and medium levels of gene tree estimation error (GTEE) reduced incongruence and increased accuracy. In contrast, removal of inconsistent genes from datasets with medium and high ILS levels and high GTEE levels eliminated or extensively reduced incongruence, but the resulting congruent species phylogenies were not always topologically identical to the true species trees.
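The definition of an inconsistent gene given in this abstract (ΔGLS and ΔGQS of opposite sign) can be sketched in a few lines of Python. The record type, the gene names, and the numeric values below are hypothetical, and treating a zero difference as consistent is an assumption of this sketch, not something the abstract specifies.

```python
from typing import NamedTuple


class GeneSignal(NamedTuple):
    name: str
    delta_gls: float  # log likelihood under T1 minus log likelihood under T2
    delta_gqs: float  # quartet score for T1 minus quartet score for T2


def is_inconsistent(gene: GeneSignal) -> bool:
    """A gene is inconsistent when its two signals favor different topologies."""
    # Opposite signs give a negative product; zeros are treated as consistent.
    return gene.delta_gls * gene.delta_gqs < 0


def split_genes(genes):
    """Partition genes into (consistent, inconsistent) lists."""
    consistent = [g for g in genes if not is_inconsistent(g)]
    inconsistent = [g for g in genes if is_inconsistent(g)]
    return consistent, inconsistent


# Hypothetical values for illustration only.
genes = [
    GeneSignal("gene_a", delta_gls=2.5, delta_gqs=0.10),    # both favor T1
    GeneSignal("gene_b", delta_gls=1.2, delta_gqs=-0.05),   # signals disagree
    GeneSignal("gene_c", delta_gls=-3.0, delta_gqs=-0.20),  # both favor T2
]
consistent, inconsistent = split_genes(genes)
```

Removing the `inconsistent` list before re-estimating the species tree corresponds to the gene-removal step evaluated in the simulations.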

Vazquez JM, Lynch VJ. Pervasive duplication of tumor suppressors in Afrotherians during the evolution of large bodies and reduced cancer risk. eLife 2021;10:e65041. [PMID: 33513090 PMCID: PMC7952090 DOI: 10.7554/elife.65041] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 11/19/2020] [Accepted: 01/28/2021] [Indexed: 12/11/2022]

Correia K, Mahadevan R. Pan‐Genome‐Scale Network Reconstruction: Harnessing Phylogenomics Increases the Quantity and Quality of Metabolic Models. Biotechnol J 2020;15:e1900519. [DOI: 10.1002/biot.201900519] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/19/2019] [Revised: 07/22/2020] [Indexed: 12/31/2022]

Agüero-Chapin G, Galpert D, Molina-Ruiz R, Ancede-Gallardo E, Pérez-Machado G, De la Riva GA, Antunes A. Graph Theory-Based Sequence Descriptors as Remote Homology Predictors. Biomolecules 2019;10:E26. [PMID: 31878100 PMCID: PMC7022958 DOI: 10.3390/biom10010026] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/19/2019] [Revised: 12/16/2019] [Accepted: 12/18/2019] [Indexed: 12/23/2022]

Correia K, Yu SM, Mahadevan R. AYbRAH: a curated ortholog database for yeasts and fungi spanning 600 million years of evolution. Database (Oxford) 2019;2019:5403499. [PMID: 30893420 PMCID: PMC6425859 DOI: 10.1093/database/baz022] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/30/2018] [Revised: 01/17/2019] [Accepted: 01/28/2019] [Indexed: 12/14/2022]

Abstract

Budding yeasts inhabit a range of environments by exploiting various metabolic traits. The genetic bases for these traits are mostly unknown, preventing their addition or removal in a chassis organism for metabolic engineering. Insight into the evolution of orthologs, paralogs and xenologs in the yeast pan-genome can help bridge these genotypes; however, existing phylogenomic databases do not span diverse yeasts, and sometimes cannot distinguish between these homologs. To help understand the molecular evolution of these traits in yeasts, we created Analyzing Yeasts by Reconstructing Ancestry of Homologs (AYbRAH), an open-source database of predicted and manually curated ortholog groups for 33 diverse fungi and yeasts in Dikarya, spanning 600 million years of evolution. OrthoMCL and OrthoDB were used to cluster protein sequence into ortholog and homolog groups, respectively; MAFFT and PhyML reconstructed the phylogeny of all homolog groups. Ortholog assignments for enzymes and small metabolite transporters were compared to their phylogenetic reconstruction, and curated to resolve any discrepancies. Information on homolog and ortholog groups can be viewed in the AYbRAH web portal (https://lmse.github.io/aybrah/), including functional annotations, predictions for mitochondrial localization and transmembrane domains, literature references and phylogenetic reconstructions. Ortholog assignments in AYbRAH were compared to HOGENOM, KEGG Orthology, OMA, eggNOG and PANTHER. PANTHER and OMA had the most congruent ortholog groups with AYbRAH, while the other phylogenomic databases had greater amounts of under-clustering, over-clustering or no ortholog annotations for proteins. Future plans are discussed for AYbRAH, and recommendations are made for other research communities seeking to create curated ortholog databases.

Mead ME, Knowles SL, Raja HA, Beattie SR, Kowalski CH, Steenwyk JL, Silva LP, Chiaratto J, Ries LNA, Goldman GH, Cramer RA, Oberlies NH, Rokas A. Characterizing the Pathogenic, Genomic, and Chemical Traits of Aspergillus fischeri, a Close Relative of the Major Human Fungal Pathogen Aspergillus fumigatus. mSphere 2019;4:e00018-19. [PMID: 30787113 PMCID: PMC6382966 DOI: 10.1128/msphere.00018-19] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 02/01/2019] [Accepted: 02/04/2019] [Indexed: 12/15/2022]

Abstract

Aspergillus fischeri is closely related to Aspergillus fumigatus, the major cause of invasive mold infections. Even though A. fischeri is commonly found in diverse environments, including hospitals, it rarely causes invasive disease. Why A. fischeri causes less human disease than A. fumigatus is unclear. A comparison of A. fischeri and A. fumigatus for pathogenic, genomic, and secondary metabolic traits revealed multiple differences in pathogenesis-related phenotypes. We observed that A. fischeri NRRL 181 is less virulent than A. fumigatus strain CEA10 in multiple animal models of disease, grows slower in low-oxygen environments, and is more sensitive to oxidative stress. Strikingly, the observed differences for some traits are of the same order of magnitude as those previously reported between A. fumigatus strains. In contrast, similar to what has previously been reported, the two species exhibit high genomic similarity; ∼90% of the A. fumigatus proteome is conserved in A. fischeri, including 48/49 genes known to be involved in A. fumigatus virulence. However, only 10/33 A. fumigatus biosynthetic gene clusters (BGCs) likely involved in secondary metabolite production are conserved in A. fischeri and only 13/48 A. fischeri BGCs are conserved in A. fumigatus. Detailed chemical characterization of A. fischeri cultures grown on multiple substrates identified multiple secondary metabolites, including two new compounds and one never before isolated as a natural product. Additionally, an A. fischeri deletion mutant of laeA, a master regulator of secondary metabolism, produced fewer secondary metabolites and in lower quantities, suggesting that regulation of secondary metabolism is at least partially conserved. These results suggest that the nonpathogenic A. fischeri possesses many of the genes important for A. fumigatus pathogenicity but is divergent with respect to its ability to thrive under host-relevant conditions and its secondary metabolism.

IMPORTANCE

Aspergillus fumigatus is the primary cause of aspergillosis, a devastating ensemble of diseases associated with severe morbidity and mortality worldwide. A. fischeri is a close relative of A. fumigatus but is not generally observed to cause human disease. To gain insights into the underlying causes of this remarkable difference in pathogenicity, we compared two representative strains (one from each species) for a range of pathogenesis-relevant biological and chemical characteristics. We found that disease progression in multiple A. fischeri mouse models was slower and caused less mortality than A. fumigatus. Remarkably, the observed differences between A. fischeri and A. fumigatus strains examined here closely resembled those previously described for two commonly studied A. fumigatus strains, AF293 and CEA10. A. fischeri and A. fumigatus exhibited different growth profiles when placed in a range of stress-inducing conditions encountered during infection, such as low levels of oxygen and the presence of chemicals that induce the production of reactive oxygen species. We also found that the vast majority of A. fumigatus genes known to be involved in virulence are conserved in A. fischeri, whereas the two species differ significantly in their secondary metabolic pathways. These similarities and differences that we report here are the first step toward understanding the evolutionary origin of a major fungal pathogen.

Knowles SL, Raja HA, Wright AJ, Lee AML, Caesar LK, Cech NB, Mead ME, Steenwyk JL, Ries LNA, Goldman GH, Rokas A, Oberlies NH. Mapping the Fungal Battlefield: Using in situ Chemistry and Deletion Mutants to Monitor Interspecific Chemical Interactions Between Fungi. Front Microbiol 2019;10:285. [PMID: 30837981 PMCID: PMC6389630 DOI: 10.3389/fmicb.2019.00285] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 12/05/2018] [Accepted: 02/04/2019] [Indexed: 11/13/2022]

Abstract

Fungi grow in competitive environments, and to cope, they have evolved strategies, such as the ability to produce a wide range of secondary metabolites. This raises two related questions. First, how do secondary metabolites influence fungal ecology and interspecific interactions? Second, can these interspecific interactions provide a way to “see” how fungi respond, chemically, within a competitive environment? To evaluate these, and to gain insight into the secondary metabolic arsenal fungi possess, we co-cultured Aspergillus fischeri, a genetically tractable fungus that produces a suite of mycotoxins, with Xylaria cubensis, a fungus that produces the fungistatic compound and FDA-approved drug, griseofulvin. To monitor and characterize fungal chemistry in situ, we used the droplet-liquid microjunction-surface sampling probe (droplet probe). The droplet probe makes a microextraction at defined locations on the surface of the co-culture, followed by analysis of the secondary metabolite profile via liquid chromatography-mass spectrometry. Using this, we mapped and compared the spatial profiles of secondary metabolites from both fungi in monoculture versus co-culture. X. cubensis predominantly biosynthesized griseofulvin and dechlorogriseofulvin in monoculture. In contrast, under co-culture conditions a deadlock was formed between the two fungi, and X. cubensis biosynthesized the same two secondary metabolites, along with dechloro-5′-hydroxygriseofulvin and 5′-hydroxygriseofulvin, all of which have fungistatic properties, as well as mycotoxins like cytochalasin D and cytochalasin C. In co-culture, A. fischeri increased the production of the mycotoxins fumitremorgin B and verruculogen, but otherwise remained unchanged relative to its monoculture. To evaluate whether secondary metabolites play an important role in defense and territory establishment, we co-cultured A. fischeri lacking the master regulator of secondary metabolism laeA with X. cubensis. We found that the reduced secondary metabolite biosynthesis of the ΔlaeA strain of A. fischeri eliminated the organism’s ability to compete in co-culture and led to its displacement by X. cubensis. These results demonstrate the potential of in situ chemical analysis and deletion mutant approaches for shedding light on the ecological roles of secondary metabolites and how they influence fungal ecological strategies; co-culturing may also stimulate the biosynthesis of secondary metabolites that are not produced in monoculture in the laboratory.

Bogaert KA, Blommaert L, Ljung K, Beeckman T, De Clerck O. Auxin Function in the Brown Alga Dictyota dichotoma. Plant Physiol 2019;179:280-299. [PMID: 30420566 PMCID: PMC6324224 DOI: 10.1104/pp.18.01041] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/24/2018] [Accepted: 10/30/2018] [Indexed: 05/14/2023]

Georgescu CH, Manson AL, Griggs AD, Desjardins CA, Pironti A, Wapinski I, Abeel T, Haas BJ, Earl AM. SynerClust: a highly scalable, synteny-aware orthologue clustering tool. Microb Genom 2018;4. [PMID: 30418868 PMCID: PMC6321874 DOI: 10.1099/mgen.0.000231] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/18/2022]

Shen XX, Opulente DA, Kominek J, Zhou X, Steenwyk JL, Buh KV, Haase MAB, Wisecaver JH, Wang M, Doering DT, Boudouris JT, Schneider RM, Langdon QK, Ohkuma M, Endoh R, Takashima M, Manabe RI, Čadež N, Libkind D, Rosa CA, DeVirgilio J, Hulfachor AB, Groenewald M, Kurtzman CP, Hittinger CT, Rokas A. Tempo and Mode of Genome Evolution in the Budding Yeast Subphylum. Cell 2018;175:1533-1545.e20. [PMID: 30415838 DOI: 10.1016/j.cell.2018.10.023] [Citation(s) in RCA: 363] [Impact Index Per Article: 51.9] [Received: 04/19/2018] [Revised: 08/12/2018] [Accepted: 10/04/2018] [Indexed: 11/17/2022]

Affiliation(s)

Xing-Xing Shen Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
Dana A Opulente Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
Jacek Kominek Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
Xiaofan Zhou Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, 510642 Guangzhou, China
Jacob L Steenwyk Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
Kelly V Buh Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
Max A B Haase Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA; Sackler Institute of Graduate Biomedical Sciences, NYU School of Medicine, New York, NY 10016, USA
Jennifer H Wisecaver Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Department of Biochemistry, Center for Plant Biology, Purdue University, West Lafayette, IN 47907, USA
Mingshuang Wang Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
Drew T Doering Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
James T Boudouris Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
Rachel M Schneider Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
Quinn K Langdon Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
Moriya Ohkuma Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
Rikiya Endoh Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
Masako Takashima Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
Ri-Ichiroh Manabe Division of Genomic Technologies, RIKEN Center For Life Science Technologies, Laboratory for Comprehensive Genomic Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
Neža Čadež Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia
Diego Libkind Laboratorio de Microbiología Aplicada y Biotecnología, Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Consejo Nacional de Investigaciones, Científicas y Técnicas (CONICET)-Universidad Nacional del Comahue, 8400 Bariloche, Argentina
Carlos A Rosa Departamento de Microbiologia, ICB, CP 486, Universidade Federal de Minas Gerais, Belo Horizonte, MG, 31270-901, Brazil
Jeremy DeVirgilio Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
Amanda Beth Hulfachor Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
Marizeth Groenewald Westerdijk Fungal Biodiversity Institute, 3584 CT, Utrecht, the Netherlands
Cletus P Kurtzman Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
Chris Todd Hittinger Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA.
Antonis Rokas Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA.

Goodswen SJ, Kennedy PJ, Ellis JT. A Gene-Based Positive Selection Detection Approach to Identify Vaccine Candidates Using Toxoplasma gondii as a Test Case Protozoan Pathogen. Front Genet 2018;9:332. [PMID: 30177953 PMCID: PMC6109633 DOI: 10.3389/fgene.2018.00332] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/23/2018] [Accepted: 08/02/2018] [Indexed: 11/22/2022]

Abstract

Over the last two decades, various in silico approaches have been developed and refined that attempt to identify protein and/or peptide vaccine candidates from informative signals encoded in protein sequences of a target pathogen. To date, no signal has been identified that clearly indicates a protein will effectively contribute to a protective immune response in a host. The premise for this study is that proteins under positive selection from the immune system are more likely to be suitable vaccine candidates than proteins exposed to other selection pressures. Furthermore, our expectation is that protein sequence regions encoding major histocompatibility complex (MHC)-binding peptides will contain consecutive positive selection sites. Using freely available data and bioinformatic tools, we present a high-throughput approach through a pipeline that predicts positive selection sites, protein subcellular locations, and sequence locations of medium to high T-Cell MHC class I binding peptides. Positive selection sites are estimated from a sequence alignment by comparing rates of synonymous (dS) and non-synonymous (dN) substitutions among protein coding sequences of orthologous genes in a phylogeny. The main pipeline output is a list of protein vaccine candidates predicted to be naturally exposed to the immune system and containing sites under positive selection. Candidates are ranked with respect to the number of consecutive sites located on protein sequence regions encoding MHCI-binding peptides. Results are constrained by the reliability of prediction programs and quality of input data. Protein sequences from Toxoplasma gondii ME49 strain (TGME49) were used as a case study. Surface antigen (SAG), dense granules (GRA), microneme (MIC), and rhoptry (ROP) proteins are considered worthy T. gondii candidates. Given 8263 TGME49 protein sequences processed anonymously, the top 10 predicted candidates were all worthy candidates. In particular, the top ten included ROP5 and ROP18, which are T. gondii virulence determinants. The chance of randomly selecting a ROP protein was 0.2% given 8263 sequences. We conclude that the approach described is a valuable addition to other in silico approaches to identify vaccine candidates worthy of laboratory validation and could be adapted for other apicomplexan parasite species (with appropriate data).
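The ranking step mentioned in this abstract (ordering candidates by consecutive positive selection sites within MHCI-binding regions) might be sketched as follows. This is one illustrative reading, not the published pipeline: the scoring by longest consecutive run, the 1-based inclusive coordinates, and the candidate names `P1` and `P2` are all assumptions.

```python
def longest_run_in_regions(selected_sites, regions):
    """Longest run of consecutive selected positions inside any binding region.

    selected_sites: iterable of 1-based positions predicted under selection.
    regions: list of (start, end) inclusive intervals for binding peptides.
    """
    sites = set(selected_sites)
    best = 0
    for start, end in regions:
        run = 0
        for pos in range(start, end + 1):
            if pos in sites:
                run += 1
                best = max(best, run)
            else:
                run = 0  # a gap breaks the consecutive run
    return best


def rank_candidates(candidates):
    """Sort candidate names by their longest in-region run, highest first."""
    return sorted(
        candidates,
        key=lambda name: longest_run_in_regions(*candidates[name]),
        reverse=True,
    )


# Hypothetical proteins: name -> (selected sites, binding regions).
candidates = {
    "P2": ([5, 7, 9], [(4, 12)]),      # no two adjacent sites
    "P1": ([5, 6, 7, 20], [(4, 12)]),  # three adjacent sites in the region
}
ranking = rank_candidates(candidates)
```

Site 20 of `P1` lies outside the binding region and does not contribute to its score.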

Mei S, Flemington EK, Zhang K. Transferring knowledge of bacterial protein interaction networks to predict pathogen targeted human genes and immune signaling pathways: a case study on M. tuberculosis. BMC Genomics 2018;19:505. [PMID: 29954330 PMCID: PMC6027805 DOI: 10.1186/s12864-018-4873-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/21/2017] [Accepted: 06/18/2018] [Indexed: 12/11/2022]

Abstract

Background

Bacterial invasive infection and host immune response are fundamental to the understanding of pathogenesis and the discovery of effective therapeutic drugs. However, to date there are very few experimental studies of the signaling cross-talk between bacteria and their human host.

Methods

In this work, taking M. tuberculosis H37Rv (MTB), which co-evolves with its human host, as an example, we propose a general computational framework that exploits the known bacterial pathogen protein interaction networks in the STRING database to predict pathogen-host protein interactions and their signaling cross-talk. In this framework, significant interlogs are derived from the known pathogen protein interaction networks to train a predictive l₂-regularized logistic regression model.

Results

The computational results show that the proposed method achieves excellent cross-validation performance as well as low predicted positive rates on the less significant interlogs and non-interlogs, indicating a low risk of false discovery. We further conduct gene ontology (GO) and pathway enrichment analyses of the predicted pathogen-host protein interaction networks, which potentially provide insights into the mechanisms by which M. tuberculosis H37Rv targets human genes and signaling pathways. In addition, we analyse the pathogen-host protein interactions related to drug resistance, inhibition of which potentially provides an alternative solution to M. tuberculosis H37Rv drug resistance.

Conclusions

The proposed machine learning framework has been shown to be effective for predicting bacteria-host protein interactions via known bacterial protein interaction networks. For the vast majority of bacterial pathogens that lack experimental studies of bacteria-host protein interactions, this framework is expected to be generally applicable. The predicted protein interaction networks between M. tuberculosis H37Rv and Homo sapiens, provided in the Additional files, promise to find application in two areas: (1) providing an alternative solution to drug resistance; (2) revealing the patterns by which M. tuberculosis H37Rv genes target human immune signaling pathways.

Electronic supplementary material

The online version of this article (10.1186/s12864-018-4873-9) contains supplementary material, which is available to authorized users.

Galpert D, Fernández A, Herrera F, Antunes A, Molina-Ruiz R, Agüero-Chapin G. Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers. BMC Bioinformatics 2018;19:166. [PMID: 29724166 PMCID: PMC5934817 DOI: 10.1186/s12859-018-2148-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/13/2017] [Accepted: 04/04/2018] [Indexed: 12/24/2022] Open

Abstract

Background

The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they were all, to some extent, alignment-based, and the proposed algorithms were evaluated on a single test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes.

Results

The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features, showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved for a yeast proteome pair that underwent a whole genome duplication. The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition.

Conclusions

The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes relative to the classification quality achieved with alignment-based similarity measures alone. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.

Kallal RJ, Fernández R, Giribet G, Hormiga G. A phylotranscriptomic backbone of the orb-weaving spider family Araneidae (Arachnida, Araneae) supported by multiple methodological approaches. Mol Phylogenet Evol 2018;126:129-140. [PMID: 29635025 DOI: 10.1016/j.ympev.2018.04.007] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/08/2018] [Revised: 03/05/2018] [Accepted: 04/06/2018] [Indexed: 01/01/2023]

Abstract

The orb-weaving spider family Araneidae is extremely diverse (>3100 spp.) and its members can be charismatic terrestrial arthropods, many of them recognizable by their iconic orbicular snare web, such as the common garden spiders. Despite considerable effort to better understand their backbone relationships based on multiple sources of data (morphological, behavioral and molecular), pervasive low support remains in recent studies. In addition, no overarching phylogeny of araneids is available to date, hampering further comparative work. In this study, we analyze the transcriptomes of 33 taxa, including 19 araneids - 12 of them new to this study - representing most of the core family lineages, to examine the relationships within the family using genomic-scale datasets resulting from various methodological treatments, namely ortholog selection and gene occupancy as a measure of matrix completion. Six matrices were constructed to assess these effects by varying orthology inference method and gene occupancy threshold. Orthology methods used are the benchmarking tool BUSCO and the tree-based method UPhO; three gene occupancy thresholds (45%, 65%, 85%) were used to assess the effect of missing data. Gene tree and species tree-based methods (including multi-species coalescent and concatenation approaches, as well as maximum likelihood and Bayesian inference) were used totalling 17 analytical treatments. The monophyly of Araneidae and the placement of core araneid lineages were supported, together with some previously unsound backbone divergences; these include high support for Zygiellinae as the earliest diverging subfamily (followed by Nephilinae), the placement of Gasteracanthinae as sister group to Cyclosa and close relatives, and close relationships between the Araneus + Neoscona clade and Cyrtophorinae + Argiopinae clade. Incongruences were relegated to short branches in the clade comprising Cyclosa and its close relatives. 
We found congruence between most of the completed analyses, with minimal topological effects from occupancy/missing data and orthology assessment. The resulting number of genes by certain combinations of orthology and occupancy thresholds being analyzed had the greatest effect on the resulting trees, with anomalous outcomes recovered from analysis of lower numbers of genes.

McGirr JA, Martin CH. Parallel evolution of gene expression between trophic specialists despite divergent genotypes and morphologies. Evol Lett 2018;2:62-75. [PMID: 30283665 PMCID: PMC6089502 DOI: 10.1002/evl3.41] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/08/2017] [Revised: 12/22/2017] [Accepted: 01/03/2018] [Indexed: 12/20/2022] Open

Schreiber HL, Conover MS, Chou WC, Hibbing ME, Manson AL, Dodson KW, Hannan TJ, Roberts PL, Stapleton AE, Hooton TM, Livny J, Earl AM, Hultgren SJ. Bacterial virulence phenotypes of Escherichia coli and host susceptibility determine risk for urinary tract infections. Sci Transl Med 2017;9:9/382/eaaf1283. [PMID: 28330863 DOI: 10.1126/scitranslmed.aaf1283] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Received: 12/23/2015] [Revised: 05/12/2016] [Accepted: 12/12/2016] [Indexed: 01/01/2023]

Eberlein C, Nielly-Thibault L, Maaroufi H, Dubé AK, Leducq JB, Charron G, Landry CR. The Rapid Evolution of an Ohnolog Contributes to the Ecological Specialization of Incipient Yeast Species. Mol Biol Evol 2017;34:2173-2186. [PMID: 28482005 DOI: 10.1093/molbev/msx153] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022] Open

Affiliation(s)

Chris Eberlein Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,PROTEO, The Quebec Network for Research on Protein Function, Engineering and Applications, Québec, QC, Canada
Lou Nielly-Thibault Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,PROTEO, The Quebec Network for Research on Protein Function, Engineering and Applications, Québec, QC, Canada.,Big Data Research Center (CRDM), Université Laval, Québec, QC, Canada
Halim Maaroufi Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
Alexandre K Dubé Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,PROTEO, The Quebec Network for Research on Protein Function, Engineering and Applications, Québec, QC, Canada
Jean-Baptiste Leducq Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
Guillaume Charron Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,PROTEO, The Quebec Network for Research on Protein Function, Engineering and Applications, Québec, QC, Canada
Christian R Landry Département de Biologie, Université Laval, Québec, QC, Canada.,Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada.,PROTEO, The Quebec Network for Research on Protein Function, Engineering and Applications, Québec, QC, Canada.,Big Data Research Center (CRDM), Université Laval, Québec, QC, Canada

SMORE: Synteny Modulator of Repetitive Elements. Life (Basel) 2017;7:life7040042. [PMID: 29088079 PMCID: PMC5745555 DOI: 10.3390/life7040042] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/28/2017] [Revised: 10/27/2017] [Accepted: 10/28/2017] [Indexed: 12/19/2022] Open

Mahajan G, Mande SC. Using structural knowledge in the protein data bank to inform the search for potential host-microbe protein interactions in sequence space: application to Mycobacterium tuberculosis. BMC Bioinformatics 2017;18:201. [PMID: 28376709 PMCID: PMC5379762 DOI: 10.1186/s12859-017-1550-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/17/2016] [Accepted: 02/16/2017] [Indexed: 12/31/2022] Open

Abstract

Background

A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved.

Results

We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlights interaction specificity on the domain level. Not every instance of a pair of domains, for which interaction evidence has been found in a few instances (i.e. structures), is likely to functionally interact. Our sorting approach scores candidates according to how “distant” they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions.

Conclusions

Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. Our approach is somewhat limited in its sensitivity by the restricted size and diversity of the template dataset, but, given the rapid accumulation of solved protein complex structures, its scope and utility are expected to keep steadily improving.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-017-1550-y) contains supplementary material, which is available to authorized users.

Dupont PY, Cox MP. Genomic Data Quality Impacts Automated Detection of Lateral Gene Transfer in Fungi. G3 (Bethesda) 2017;7:1301-1314. [PMID: 28235827 PMCID: PMC5386878 DOI: 10.1534/g3.116.038448] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/13/2016] [Accepted: 02/17/2017] [Indexed: 12/26/2022]

Steenwyk JL, Soghigian JS, Perfect JR, Gibbons JG. Copy number variation contributes to cryptic genetic variation in outbreak lineages of Cryptococcus gattii from the North American Pacific Northwest. BMC Genomics 2016;17:700. [PMID: 27590805 PMCID: PMC5009542 DOI: 10.1186/s12864-016-3044-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 06/08/2016] [Accepted: 08/24/2016] [Indexed: 12/13/2022] Open

Abstract

Background

Copy number variants (CNVs) are a class of structural variants (SVs) and are defined as fragments of DNA that are present at variable copy number in comparison with a reference genome. Recent advances in bioinformatics methodologies and sequencing technologies have enabled the high-resolution quantification of genome-wide CNVs. In pathogenic fungi SVs have been shown to alter gene expression, influence host specificity, and drive fungicide resistance, but little attention has focused specifically on CNVs. Using publicly available sequencing data, we identified 90 isolates across 212 Cryptococcus gattii genomes that belong to the VGII subgroups responsible for the recent deadly outbreaks in the North American Pacific Northwest. We generated CNV profiles for each sample to investigate the prevalence and function of CNV in C. gattii.

Results

We identified eight genetic clusters among publicly available Illumina whole genome sequence data from 212 C. gattii isolates through population structure analysis. Three clusters represent the VGIIa, VGIIb, and VGIIc subgroups from the North American Pacific Northwest. CNV was bioinformatically predicted and affected ~300–400 Kilobases (Kb) of the C. gattii VGII subgroup genomes. Sixty-seven loci, encompassing 58 genes, showed highly divergent patterns of copy number variation between VGII subgroups. Analysis of PFam domains within divergent CN variable genes revealed enrichment of protein domains associated with transport, cell wall organization and external encapsulating structure.

Conclusions

CNVs may contribute to pathological and phenotypic differences observed between the C. gattii VGIIa, VGIIb, and VGIIc subpopulations. Genes overlapping with population differentiated CNVs were enriched for several virulence related functional terms. These results uncover novel candidate genes to examine the genetic and functional underpinnings of C. gattii pathogenicity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-3044-0) contains supplementary material, which is available to authorized users.

Fernández R, Edgecombe GD, Giribet G. Exploring Phylogenetic Relationships within Myriapoda and the Effects of Matrix Composition and Occupancy on Phylogenomic Reconstruction. Syst Biol 2016;65:871-89. [PMID: 27162151 PMCID: PMC4997009 DOI: 10.1093/sysbio/syw041] [Citation(s) in RCA: 74] [Impact Index Per Article: 8.2] [Received: 12/06/2015] [Accepted: 04/28/2016] [Indexed: 11/14/2022] Open

Abstract

Myriapods, including the diverse and familiar centipedes and millipedes, are one of the dominant terrestrial arthropod groups. Although molecular evidence has shown that Myriapoda is monophyletic, its internal phylogeny remains contentious and understudied, especially when compared to those of Chelicerata and Hexapoda. Until now, efforts have focused on taxon sampling (e.g., by including a handful of genes from many species) or on maximizing matrix size (e.g., by including hundreds or thousands of genes in just a few species), but a phylogeny maximizing sampling at both levels remains elusive. In this study, we analyzed 40 Illumina transcriptomes representing 3 of the 4 myriapod classes (Diplopoda, Chilopoda, and Symphyla); 25 transcriptomes were newly sequenced to maximize representation at the ordinal level in Diplopoda and at the family level in Chilopoda. Ten supermatrices were constructed to explore the effect of several potential phylogenetic biases (e.g., rate of evolution, heterotachy) at 3 levels of gene occupancy per taxon (50%, 75%, and 90%). Analyses based on maximum likelihood and Bayesian mixture models retrieved monophyly of each myriapod class, and resulted in 2 alternative phylogenetic positions for Symphyla, as sister group to Diplopoda + Chilopoda, or closer to Diplopoda, the latter hypothesis having been traditionally supported by morphology. Within centipedes, all orders were well supported, but 2 deep nodes remained in conflict in the different analyses despite dense taxon sampling at the family level. Relationships among centipede orders in all analyses conducted with the most complete matrix (90% occupancy) are at odds not only with the sparser but more gene-rich supermatrices (75% and 50% supermatrices) and with the matrices optimizing phylogenetic informativeness or most conserved genes, but also with previous hypotheses based on morphology, development, or other molecular data sets. 
Our results indicate that a high percentage of ribosomal proteins in the most complete matrices, in conjunction with distance from the root, can act in concert to compromise the estimated relationships within the ingroup. We discuss the implications of these findings in the context of the ever more prevalent quest for completeness in phylogenomic studies.

Havird JC, Mitchell RT, Henry RP, Santos SR. Salinity-induced changes in gene expression from anterior and posterior gills of Callinectes sapidus (Crustacea: Portunidae) with implications for crustacean ecological genomics. Comp Biochem Physiol Part D Genomics Proteomics 2016;19:34-44. [PMID: 27337176 DOI: 10.1016/j.cbd.2016.06.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 04/05/2016] [Revised: 05/31/2016] [Accepted: 06/08/2016] [Indexed: 01/05/2023]

Abstract

Decapods represent one of the most ecologically diverse taxonomic groups within crustaceans, making them ideal to study physiological processes like osmoregulation. However, prior studies have failed to consider the entire transcriptomic response of the gill - the primary organ responsible for ion transport - to changing salinity. Moreover, the molecular genetic differences between non-osmoregulatory and osmoregulatory gill types, as well as the hormonal basis of osmoregulation, remain underexplored. Here, we identified and characterized differentially expressed genes (DEGs) via RNA-Seq in anterior (non-osmoregulatory) and posterior (osmoregulatory) gills during high to low salinity transfer in the blue crab Callinectes sapidus, a well-studied model for crustacean osmoregulation. Overall, we confirmed previous expression patterns for individual ion transport genes and identified novel ones with salinity-mediated expression. Notable novel DEGs among salinities and gill types for C. sapidus included anterior gills having higher expression of structural genes such as actin and cuticle proteins while posterior gills exhibit elevated expression of ion transport and energy-related genes, with the latter likely linked to ion transport. Potential targets among recovered DEGs for hormonal regulation of ion transport between salinities and gill types included neuropeptide Y and a KCTD16-like protein. Using publicly available sequence data, constituents for a "core" gill transcriptome among decapods are presented, comprising genes involved in ion transport and energy conversion and consistent with salinity transfer experiments. Lastly, rarefication analyses lead us to recommend that a modest number of sequence reads (~10-15M), but with increased biological replication, be utilized in future DEG analyses of crustaceans.

Ballesteros JA, Hormiga G. A New Orthology Assessment Method for Phylogenomic Data: Unrooted Phylogenetic Orthology. Mol Biol Evol 2016;33:2117-34. [PMID: 27189539 DOI: 10.1093/molbev/msw069] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Indexed: 12/12/2022] Open

Standardized benchmarking in the quest for orthologs. Nat Methods 2016;13:425-30. [PMID: 27043882 PMCID: PMC4827703 DOI: 10.1038/nmeth.3830] [Citation(s) in RCA: 132] [Impact Index Per Article: 14.7] [Received: 01/28/2016] [Accepted: 03/09/2016] [Indexed: 11/23/2022]

Guimarães LC, Florczak-Wyspianska J, de Jesus LB, Viana MVC, Silva A, Ramos RTJ, Soares SDC, Soares SDC. Inside the Pan-genome - Methods and Software Overview. Curr Genomics 2016;16:245-52. [PMID: 27006628 PMCID: PMC4765519 DOI: 10.2174/1389202916666150423002311] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 01/26/2015] [Revised: 04/20/2015] [Accepted: 04/21/2015] [Indexed: 12/11/2022] Open

Roy Chowdhury P, DeMaere M, Chapman T, Worden P, Charles IG, Darling AE, Djordjevic SP. Comparative genomic analysis of toxin-negative strains of Clostridium difficile from humans and animals with symptoms of gastrointestinal disease. BMC Microbiol 2016;16:41. [PMID: 26971047 PMCID: PMC4789261 DOI: 10.1186/s12866-016-0653-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/29/2015] [Accepted: 03/02/2016] [Indexed: 12/13/2022] Open

Abstract

Background

Clostridium difficile infections (CDI) are a significant health problem to humans and food animals. Clostridial toxins ToxA and ToxB encoded by genes tcdA and tcdB are located on a pathogenicity locus known as the PaLoc and are the major virulence factors of C. difficile. While toxin-negative strains of C. difficile are often isolated from faeces of animals and patients suffering from CDI, they are not considered to play a role in disease. Toxin-negative strains of C. difficile have been used successfully to treat recurring CDI but their propensity to acquire the PaLoc via lateral gene transfer and express clinically relevant levels of toxins has reinforced the need to characterise them genetically. In addition, further studies that examine the pathogenic potential of toxin-negative strains of C. difficile and the frequency by which toxin-negative strains may acquire the PaLoc are needed.

Results

We undertook a comparative genomic analysis of five Australian toxin-negative isolates of C. difficile that lack tcdA, tcdB and both binary toxin genes cdtA and cdtB that were recovered from humans and farm animals with symptoms of gastrointestinal disease. Our analyses show that the five C. difficile isolates cluster closely with virulent toxigenic strains of C. difficile belonging to the same sequence type (ST) and have virulence gene profiles akin to those in toxigenic strains. Furthermore, phage acquisition appears to have played a key role in the evolution of C. difficile.

Conclusions

Our results are consistent with the C. difficile global population structure comprising six clades, each containing both toxin-positive and toxin-negative strains. Our data also suggest that toxin-negative strains of C. difficile encode a repertoire of putative virulence factors that are similar to those found in toxigenic strains of C. difficile, raising the possibility that acquisition of PaLoc by toxin-negative strains poses a threat to human health. Studies in appropriate animal models are needed to examine the pathogenic potential of toxin-negative strains of C. difficile and to determine the frequency with which toxin-negative strains may acquire the PaLoc.

Electronic supplementary material

The online version of this article (doi:10.1186/s12866-016-0653-3) contains supplementary material, which is available to authorized users.

Tekaia F. Inferring Orthologs: Open Questions and Perspectives. Genomics Insights 2016;9:17-28. [PMID: 26966373 PMCID: PMC4778853 DOI: 10.4137/gei.s37925] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 11/18/2015] [Revised: 12/30/2015] [Accepted: 01/02/2016] [Indexed: 01/25/2023]

Barbosa R, Almeida P, Safar SVB, Santos RO, Morais PB, Nielly-Thibault L, Leducq JB, Landry CR, Gonçalves P, Rosa CA, Sampaio JP. Evidence of Natural Hybridization in Brazilian Wild Lineages of Saccharomyces cerevisiae. Genome Biol Evol 2016;8:317-29. [PMID: 26782936 PMCID: PMC4779607 DOI: 10.1093/gbe/evv263] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 01/12/2023] Open

Hooper CM, Castleden IR, Aryamanesh N, Jacoby RP, Millar AH. Finding the Subcellular Location of Barley, Wheat, Rice and Maize Proteins: The Compendium of Crop Proteins with Annotated Locations (cropPAL). Plant Cell Physiol 2016;57:e9. [PMID: 26556651 DOI: 10.1093/pcp/pcv170] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 09/01/2015] [Accepted: 10/27/2015] [Indexed: 05/10/2023]

Impact of gene family evolutionary histories on phylogenetic species tree inference by gene tree parsimony. Mol Phylogenet Evol 2015;96:9-16. [PMID: 26702957 DOI: 10.1016/j.ympev.2015.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/01/2015] [Revised: 10/11/2015] [Accepted: 12/03/2015] [Indexed: 11/21/2022]

Abstract

Complicated histories of gene duplication and loss pose a challenge to molecular phylogenetic inference, especially in deep phylogenies. However, phylogenomic approaches such as gene tree parsimony (GTP) have an advantage over some other approaches in their ability to use gene families with duplications. GTP searches for the 'optimal' species tree by minimizing the total cost of biological events such as duplications, but the accuracy of GTP and the phylogenetic signal in the context of different gene families with distinct histories of duplication and loss are unclear. To evaluate how the evolutionary properties of different gene families affect species tree inference, 3900 gene families from seven angiosperms, encompassing a wide range of gene content, lineage-specific expansions and contractions, were analyzed. It was found that the gene content and total duplication number in a gene family strongly influence species tree inference accuracy, with the highest accuracy achieved at either very low or very high gene content (or duplication number) and the lowest accuracy centered on intermediate gene content (or duplication number), a relationship that can be fit by a binomial regression. In addition, among gene families with a similar level of average gene content, those with relatively higher lineage-specific expansion or duplication rates tend to show lower accuracy. Additional correlation tests support the view that high accuracy for gene families with large gene content may rely on abundant ancestral copies that provide many subtrees to resolve conflicts, whereas high accuracy for single- or low-copy gene families depends only on sequence substitution per se. The very low accuracy reached by gene families of intermediate gene content or duplication number may be due to insufficient subtrees to resolve the conflicts arising from loss of alternative copies. 
As these evolutionary properties can significantly influence species tree accuracy, I discuss the potential weighting of the duplication cost by evolutionary properties of gene families in future GTP analyses.

Galpert D, del Río S, Herrera F, Ancede-Gallardo E, Antunes A, Agüero-Chapin G. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species. Biomed Res Int 2015;2015:748681. [PMID: 26605337 PMCID: PMC4641943 DOI: 10.1155/2015/748681] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 04/07/2015] [Revised: 07/26/2015] [Accepted: 08/20/2015] [Indexed: 11/17/2022]

Schulz F, Martijn J, Wascher F, Lagkouvardos I, Kostanjšek R, Ettema TJG, Horn M. A Rickettsiales symbiont of amoebae with ancient features. Environ Microbiol 2015;18:2326-42. [PMID: 25908022 DOI: 10.1111/1462-2920.12881] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 10/20/2014] [Revised: 03/03/2015] [Accepted: 03/16/2015] [Indexed: 11/28/2022]

Rock, paper, scissors: harnessing complementarity in ortholog detection methods improves comparative genomic inference. G3 (Bethesda) 2015;5:629-38. [PMID: 25711833 PMCID: PMC4390578 DOI: 10.1534/g3.115.017095] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]

Jeffares DC, Tomiczek B, Sojo V, dos Reis M. A beginners guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome. Methods Mol Biol 2015;1201:65-90. [PMID: 25388108 DOI: 10.1007/978-1-4939-1438-8_4] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Indexed: 06/04/2023]

Li J, Wong CF, Wong MT, Huang H, Leung FC. Modularized evolution in archaeal methanogens phylogenetic forest. Genome Biol Evol 2014;6:3344-59. [PMID: 25502908 PMCID: PMC4986457 DOI: 10.1093/gbe/evu259] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 11/17/2014] [Indexed: 11/13/2022] Open

Trachana K, Forslund K, Larsson T, Powell S, Doerks T, von Mering C, Bork P. A phylogeny-based benchmarking test for orthology inference reveals the limitations of function-based validation. PLoS One 2014;9:e111122. [PMID: 25369365 PMCID: PMC4219706 DOI: 10.1371/journal.pone.0111122] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 04/15/2014] [Accepted: 09/23/2014] [Indexed: 11/19/2022] Open

Pereira C, Denise A, Lespinet O. A meta-approach for improving the prediction and the functional annotation of ortholog groups. BMC Genomics 2014;15 Suppl 6:S16. [PMID: 25573073 PMCID: PMC4240552 DOI: 10.1186/1471-2164-15-s6-s16] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022] Open

Ogilvie HA, Imin N, Djordjevic MA. Diversification of the C-TERMINALLY ENCODED PEPTIDE (CEP) gene family in angiosperms, and evolution of plant-family specific CEP genes. BMC Genomics 2014;15:870. [PMID: 25287121 PMCID: PMC4197245 DOI: 10.1186/1471-2164-15-870] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 06/27/2014] [Accepted: 09/24/2014] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Small, secreted signaling peptides work in parallel with phytohormones to control important aspects of plant growth and development. Genes from the C-TERMINALLY ENCODED PEPTIDE (CEP) family produce such peptides which negatively regulate plant growth, especially under stress, and affect other important developmental processes. To illuminate how the CEP gene family has evolved within the plant kingdom, including its emergence, diversification and variation between lineages, a comprehensive survey was undertaken to identify and characterize CEP genes in 106 plant genomes.

RESULTS

Using a motif-based system developed for this study to identify canonical CEP peptide domains, a total of 916 CEP genes and 1,223 CEP domains were found in angiosperms and for the first time in gymnosperms. This defines a narrow band for the emergence of CEP genes in plants, from the divergence of lycophytes to the angiosperm/gymnosperm split. Both CEP genes and domains were found to have diversified in angiosperms, particularly in the Poaceae and Solanaceae plant families. Multispecies orthologous relationships were determined for 22% of identified CEP genes, and further analysis of those groups found selective constraints upon residues within the CEP peptide and within the previously little-characterized variable region. An examination of public Oryza sativa RNA-Seq datasets revealed an expression pattern that links OsCEP5 and OsCEP6 to panicle development and flowering, and CEP gene trees reveal these emerged from a duplication event associated with the Poaceae plant family.

CONCLUSIONS

The characterization of the plant-family specific CEP genes OsCEP5 and OsCEP6, the association of CEP genes with angiosperm-specific development processes like panicle development, and the diversification of CEP genes in angiosperms provide further support for the hypothesis that CEP genes have been integral to the evolution of novel traits within the angiosperm lineage. Beyond these findings, the comprehensive set of CEP genes and their properties reported here will be a resource for future research on CEP genes and peptides.

Andrade SCS, Montenegro H, Strand M, Schwartz ML, Kajihara H, Norenburg JL, Turbeville JM, Sundberg P, Giribet G. A Transcriptomic Approach to Ribbon Worm Systematics (Nemertea): Resolving the Pilidiophora Problem. Mol Biol Evol 2014;31:3206-15. [DOI: 10.1093/molbev/msu253] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022] Open

Literature-based gene curation and proposed genetic nomenclature for Cryptococcus. Eukaryot Cell 2014;13:878-83. [PMID: 24813190 DOI: 10.1128/ec.00083-14] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022]

Dalquen DA, Dessimoz C. Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals. Genome Biol Evol 2014;5:1800-6. [PMID: 24013106 PMCID: PMC3814191 DOI: 10.1093/gbe/evt132] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022] Open

Fernández R, Laumer CE, Vahtera V, Libro S, Kaluziak S, Sharma PP, Pérez-Porro AR, Edgecombe GD, Giribet G. Evaluating topological conflict in centipede phylogeny using transcriptomic data sets. Mol Biol Evol 2014;31:1500-13. [PMID: 24674821 DOI: 10.1093/molbev/msu108] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 12/11/2022] Open

Powell S, Forslund K, Szklarczyk D, Trachana K, Roth A, Huerta-Cepas J, Gabaldón T, Rattei T, Creevey C, Kuhn M, Jensen LJ, von Mering C, Bork P. eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res 2013;42:D231-9. [PMID: 24297252 PMCID: PMC3964997 DOI: 10.1093/nar/gkt1253] [Citation(s) in RCA: 436] [Impact Index Per Article: 36.3] [Indexed: 12/20/2022] Open

Yang Y, Luo D. The origin of parasitism gene in nematodes: evolutionary analysis through the construction of domain trees. Evol Bioinform Online 2013;9:453-66. [PMID: 24277980 PMCID: PMC3836563 DOI: 10.4137/ebo.s13032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022] Open

Francis O, Han F, Adams JC. Molecular phylogeny of a RING E3 ubiquitin ligase, conserved in eukaryotic cells and dominated by homologous components, the muskelin/RanBPM/CTLH complex. PLoS One 2013;8:e75217. [PMID: 24143168 PMCID: PMC3797097 DOI: 10.1371/journal.pone.0075217] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 11/02/2012] [Accepted: 08/13/2013] [Indexed: 01/11/2023] Open

Abstract

Ubiquitination is an essential post-translational modification that regulates signalling and protein turnover in eukaryotic cells. Specificity of ubiquitination is driven by ubiquitin E3 ligases, many of which remain poorly understood. One such is the mammalian muskelin/RanBP9/CTLH complex that includes eight proteins, five of which (RanBP9/RanBPM, TWA1, MAEA, Rmnd5 and muskelin), share striking similarities of domain architecture and have been implicated in regulation of cell organisation. In budding yeast, the homologous GID complex acts to down-regulate gluconeogenesis. In both complexes, Rmnd5/GID2 corresponds to a RING ubiquitin ligase. To better understand this E3 ligase system, we conducted molecular phylogenetic and sequence analyses of the related components. TWA1, Rmnd5, MAEA and WDR26 are conserved throughout all eukaryotic supergroups, albeit WDR26 was not identified in Rhizaria. RanBPM is absent from Excavates and from some sub-lineages. Armc8 and c17orf39 were represented across unikonts but in bikonts were identified only in Viridiplantae and in O. trifallax within alveolates. Muskelin is present only in Opisthokonts. Phylogenetic and sequence analyses of the shared LisH and CTLH domains of RanBPM, TWA1, MAEA and Rmnd5 revealed closer relationships and profiles of conserved residues between, respectively, Rmnd5 and MAEA, and RanBPM and TWA1. Rmnd5 and MAEA are also related by the presence of conserved, variant RING domains. Examination of how N- or C-terminal domain deletions alter the sub-cellular localisation of each protein in mammalian cells identified distinct contributions of the LisH domains to protein localisation or folding/stability. In conclusion, all components except muskelin are inferred to have been present in the last eukaryotic common ancestor. Diversification of this ligase complex in different eukaryotic lineages may result from the apparently fast evolution of RanBPM, differing requirements for WDR26, Armc8 or c17orf39, and the origin of muskelin in opisthokonts as a RanBPM-binding protein.

Capra JA, Stolzer M, Durand D, Pollard KS. How old is my gene? Trends Genet 2013;29:659-68. [PMID: 23915718 DOI: 10.1016/j.tig.2013.07.001] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 04/18/2013] [Revised: 06/13/2013] [Accepted: 07/03/2013] [Indexed: 11/26/2022]