1
Thureborn O, Wikström N, Razafimandimbison SG, Rydin C. Plastid phylogenomics and cytonuclear discordance in Rubioideae, Rubiaceae. PLoS One 2024; 19:e0302365. [PMID: 38768140 PMCID: PMC11104678 DOI: 10.1371/journal.pone.0302365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 04/03/2024] [Indexed: 05/22/2024] Open
Abstract
In this study of evolutionary relationships in the subfamily Rubioideae (Rubiaceae), we take advantage of the off-target proportion of reads generated via previous target capture sequencing projects based on nuclear genomic data to build a plastome phylogeny and investigate cytonuclear discordance. The assembly of off-target reads resulted in a comprehensive plastome dataset and robust inference of phylogenetic relationships, where most intratribal and intertribal relationships are resolved with strong support. While the phylogenetic results were mostly in agreement with previous studies based on plastome data, novel relationships in the plastid perspective were also detected. For example, our analyses of plastome data provide strong support for the SCOUT clade and its sister relationship to the remaining members of the subfamily, which differs from previous results based on plastid data but agrees with recent results based on nuclear genomic data. However, several instances of highly supported cytonuclear discordance were identified across the Rubioideae phylogeny. Coalescent simulation analysis indicates that while ILS could, by itself, explain the majority of the discordant relationships, plastome introgression may be the better explanation in some cases. Our study further indicates that plastomes across the Rubioideae are, with few exceptions, highly conserved and mainly conform to the structure, gene content, and gene order present in the majority of the flowering plants.
Affiliation(s)
- Olle Thureborn
- Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- Niklas Wikström
- Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- The Bergius Foundation, The Royal Academy of Sciences, Stockholm, Sweden
- Catarina Rydin
- Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- The Bergius Foundation, The Royal Academy of Sciences, Stockholm, Sweden
2
Steenwyk JL, King N. The promise and pitfalls of synteny in phylogenomics. PLoS Biol 2024; 22:e3002632. [PMID: 38768403 PMCID: PMC11105162 DOI: 10.1371/journal.pbio.3002632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
Reconstructing the tree of life remains a central goal in biology. Early methods, which relied on small numbers of morphological or genetic characters, often yielded conflicting evolutionary histories, undermining confidence in the results. Investigations based on phylogenomics, which use hundreds to thousands of loci for phylogenetic inquiry, have provided a clearer picture of life's history, but certain branches remain problematic. To resolve difficult nodes on the tree of life, 2 recent studies tested the utility of synteny, the conserved collinearity of orthologous genetic loci in 2 or more organisms, for phylogenetics. Synteny exhibits compelling phylogenomic potential while also raising new challenges. This Essay identifies and discusses specific opportunities and challenges that bear on the value of synteny data and other rare genomic changes for phylogenomic studies. Synteny-based analyses of highly contiguous genome assemblies mark a new chapter in the phylogenomic era and the quest to reconstruct the tree of life.
Affiliation(s)
- Jacob L. Steenwyk
- Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Nicole King
- Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
3
Ma B, Gong H, Xu Q, Gao Y, Guan A, Wang H, Hua K, Luo R, Jin H. Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) enables precise and efficient phylogenetic estimation in viruses. Virus Evol 2024; 10:veae005. [PMID: 38361823 PMCID: PMC10868571 DOI: 10.1093/ve/veae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/06/2024] [Accepted: 01/22/2024] [Indexed: 02/17/2024] Open
Abstract
Understanding phylogenetic relationships among species is essential for many biological studies, which call for an accurate phylogenetic tree to understand major evolutionary transitions. Phylogenetic analysis presents major challenges in estimation accuracy and computational efficiency, especially amid the recent wave of severe emerging infectious disease outbreaks. Here, we introduce a novel, efficient framework called Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) for placing new viral samples. A new recoding method called Frequency Vector Recoding was implemented to approximate the phylogenetic distance, and the Phylogenetic Simulated Annealing Search algorithm was developed to match the recoded distance matrix with the phylogenetic tree. In addition, indels (insertions/deletions) were heuristically introduced into foreign sequence recognition for the first time. We compared Bd-RPC with recent placement software (PAGAN2, EPA-ng, TreeBeST) and evaluated it on Alphacoronavirus, Alphaherpesvirinae, and Betacoronavirus using Split and Robinson-Foulds distances. The comparisons showed that Bd-RPC maintained the highest precision with great efficiency, demonstrating good performance in new sample placement on all three virus groups. Finally, a user-friendly website (http://www.bd-rpc.xyz) allows users to classify new samples instantly and facilitates phylogenetic research in viruses, and Bd-RPC is available on GitHub (http://github.com/Bin-Ma/bd-rpc).
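The abstract evaluates placements with Robinson-Foulds distances. As a rough illustration of that metric only (not of Bd-RPC itself), the sketch below computes a rooted, clade-based variant for two trees over the same leaf set. The nested-tuple tree encoding and the function names `clades` and `rf_distance` are assumptions made for this example; published comparisons often use the unrooted, bipartition-based form instead.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple whose leaves are strings, e.g. (("A", "B"), "C").
    This encoding is hypothetical and chosen only to keep the sketch short.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


reference = ((("A", "B"), "C"), "D")
candidate = ((("A", "C"), "B"), "D")
print(rf_distance(reference, candidate))
```

A distance of zero means the two trees contain exactly the same clades; larger values indicate more conflicting groupings.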
Affiliation(s)
- Bin Ma
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Huimin Gong
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Qianshuai Xu
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Yuan Gao
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Aohan Guan
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Haoyu Wang
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Kexin Hua
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Rui Luo
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Hui Jin
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
4
Roberts WR, Ruck EC, Downey KM, Pinseel E, Alverson AJ. Resolving Marine-Freshwater Transitions by Diatoms Through a Fog of Gene Tree Discordance. Syst Biol 2023; 72:984-997. [PMID: 37335140 DOI: 10.1093/sysbio/syad038] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/12/2022] [Revised: 06/02/2023] [Accepted: 06/16/2023] [Indexed: 06/21/2023] Open
Abstract
Despite the obstacles facing marine colonists, most lineages of aquatic organisms have colonized and diversified in freshwaters repeatedly. These transitions can trigger rapid morphological or physiological change and, on longer timescales, lead to increased rates of speciation and extinction. Diatoms are a lineage of ancestrally marine microalgae that have diversified throughout freshwater habitats worldwide. We generated a phylogenomic data set of genomes and transcriptomes for 59 diatom taxa to resolve freshwater transitions in one lineage, the Thalassiosirales. Although most parts of the species tree were consistently resolved with strong support, we had difficulties resolving a Paleocene radiation, which affected the placement of one freshwater lineage. This and other parts of the tree were characterized by high levels of gene tree discordance caused by incomplete lineage sorting and low phylogenetic signal. Despite differences in species trees inferred from concatenation versus summary methods and codons versus amino acids, traditional methods of ancestral state reconstruction supported six transitions into freshwaters, two of which led to subsequent species diversification. Evidence from gene trees, protein alignments, and diatom life history together suggest that habitat transitions were largely the product of homoplasy rather than hemiplasy, a condition where transitions occur on branches in gene trees not shared with the species tree. Nevertheless, we identified a set of putatively hemiplasious genes, many of which have been associated with shifts to low salinity, indicating that hemiplasy played a small but potentially important role in freshwater adaptation. Accounting for differences in evolutionary outcomes, in which some taxa became locked into freshwaters while others were able to return to the ocean or become salinity generalists, might help further distinguish different sources of adaptive mutation in freshwater diatoms.
Affiliation(s)
- Wade R Roberts
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
- Elizabeth C Ruck
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
- Kala M Downey
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
- Eveline Pinseel
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
- Andrew J Alverson
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
5
Redmond AK, Casey D, Gundappa MK, Macqueen DJ, McLysaght A. Independent rediploidization masks shared whole genome duplication in the sturgeon-paddlefish ancestor. Nat Commun 2023; 14:2879. [PMID: 37208359 DOI: 10.1038/s41467-023-38714-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/11/2022] [Accepted: 05/12/2023] [Indexed: 05/21/2023] Open
Abstract
Whole genome duplication (WGD) is a dramatic evolutionary event generating many new genes and which may play a role in survival through mass extinctions. Paddlefish and sturgeon are sister lineages that both show genomic evidence for ancient WGD. Until now this has been interpreted as two independent WGD events due to a preponderance of duplicate genes with independent histories. Here we show that although there is indeed a plurality of apparently independent gene duplications, these derive from a shared genome duplication event occurring well over 200 million years ago, likely close to the Permian-Triassic mass extinction period. This was followed by a prolonged process of reversion to stable diploid inheritance (rediploidization), that may have promoted survival during the Triassic-Jurassic mass extinction. We show that the sharing of this WGD is masked by the fact that paddlefish and sturgeon lineage divergence occurred before rediploidization had proceeded even half-way. Thus, for most genes the resolution to diploidy was lineage-specific. Because genes are only truly duplicated once diploid inheritance is established, the paddlefish and sturgeon genomes are thus a mosaic of shared and non-shared gene duplications resulting from a shared genome duplication event.
Affiliation(s)
- Anthony K Redmond
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Dearbhaile Casey
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Manu Kumar Gundappa
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- Daniel J Macqueen
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
- Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland.
6
Fleming JF, Struck TH. nRCFV: a new, dataset-size-independent metric to quantify compositional heterogeneity in nucleotide and amino acid datasets. BMC Bioinformatics 2023; 24:145. [PMID: 37046225 PMCID: PMC10099917 DOI: 10.1186/s12859-023-05270-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/07/2022] [Accepted: 04/04/2023] [Indexed: 04/14/2023] Open
Abstract
MOTIVATION: Compositional heterogeneity (when the proportions of nucleotides and amino acids are not broadly similar across the dataset) is a cause of a great number of phylogenetic artefacts. Whilst a variety of methods can identify it post hoc, few metrics exist to quantify compositional heterogeneity prior to the computationally intensive task of phylogenetic tree reconstruction. Here we assess the efficacy of one such existing, widely used metric, Relative Composition Frequency Variability (RCFV), using both real and simulated data. RESULTS: Our results show that RCFV can be biased by sequence length, the number of taxa, and the number of possible character states within the dataset. However, we also find that missing data does not appear to have an appreciable effect on RCFV. We discuss the theory behind this and its consequences for future use of the RCFV value, and propose a new metric, nRCFV, which accounts for these biases. Alongside this, we present new software that calculates both RCFV and nRCFV, called nRCFV_Reader. AVAILABILITY AND IMPLEMENTATION: nRCFV has been implemented in RCFV_Reader, available at: https://github.com/JFFleming/RCFV_Reader. Both our simulation and real data are available at Datadryad: https://doi.org/10.5061/dryad.wpzgmsbpn.
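To make the RCFV metric discussed above concrete, the following sketch computes it from a small alignment. It assumes the commonly cited definition: for each character state, sum the absolute deviations of each taxon's frequency from the mean frequency across taxa, then divide by the number of taxa. The function name `rcfv` and the dictionary input format are assumptions for this example; it is not the RCFV_Reader implementation and does not attempt the nRCFV normalisation, whose exact form is not given in the abstract.

```python
from collections import Counter


def rcfv(alignment, alphabet="ACGT"):
    """Relative Composition Frequency Variability of an alignment.

    `alignment` maps taxon names to sequences. Characters outside `alphabet`
    (gaps, ambiguity codes) are ignored when computing frequencies.
    """
    n_taxa = len(alignment)
    per_taxon = []
    for name, seq in alignment.items():
        counts = Counter(ch for ch in seq.upper() if ch in alphabet)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"taxon {name!r} has no characters from the alphabet")
        per_taxon.append({ch: counts[ch] / total for ch in alphabet})

    value = 0.0
    for ch in alphabet:
        # Mean frequency of this character state across taxa (assumed definition).
        mean_freq = sum(freqs[ch] for freqs in per_taxon) / n_taxa
        value += sum(abs(freqs[ch] - mean_freq) for freqs in per_taxon) / n_taxa
    return value


homogeneous = {"t1": "ACGTACGT", "t2": "TGCATGCA"}
skewed = {"t1": "AAAA", "t2": "CCCC"}
print(rcfv(homogeneous), rcfv(skewed))
```

Taxa with identical base composition give a value of zero, and the value grows as compositions diverge.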
Affiliation(s)
- James F Fleming
- University of Oslo Natural History Museum, Sars' Gata 1, Oslo, Norway.
- Torsten H Struck
- University of Oslo Natural History Museum, Sars' Gata 1, Oslo, Norway
7
Liu K, Xie N, Wang Y, Liu X. Extensive mitogenomic heteroplasmy and its implications in the phylogeny of the fish genus Megalobrama. 3 Biotech 2023; 13:115. [PMID: 36915286 PMCID: PMC10006376 DOI: 10.1007/s13205-023-03523-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Accepted: 02/13/2023] [Indexed: 03/12/2023] Open
Abstract
Megalobrama is one of China's most economically valuable fish genera. Four species make up this genus: M. amblycephala (MA), M. terminalis (MT), M. pellegrini (MP), and M. hoffmanni (MH). Many researchers have investigated the genetic relationships within Megalobrama based on mitochondrial DNA (mtDNA) and found that the branches of the phylogenetic tree for MT and MP are intertwined. We hypothesized that this occurs because mitogenomic heteroplasmy is overlooked when working with mtDNA, which causes MP and MT positions to intersect in phylogenetic trees. To eliminate the influence of nuclear mitochondrial DNA fragments (NUMTs) before analyzing mitogenomic heteroplasmy, we used PLastZ to identify NUMTs, which were then removed from the samples for the subsequent heteroplasmy analysis. Using the heteroplasmy caller icHET, we discovered 126, 339, 135, and 203 heteroplasmic variants in six MA, MT, MP, and MH samples. We reconstructed the phylogenetic tree of the genus Megalobrama using the RY coding method and excluding third codon positions, which improved the performance of the phylogenetic tree by increasing the ratio of treeness to relative composition variability from 100.02 ± 1.76 to 688.59 ± 190.56. Despite this, the RY coding method did not alter the intersection of MP and MT positions in phylogenetic trees. We hypothesize that gene flow between MT and MP leads to intertwining mtDNA-based phylogenetic trees. In conclusion, our findings on the mitogenomic heteroplasmy of Megalobrama provide new insights into mtDNA-based phylogenetic studies. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-023-03523-0.
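The two data treatments named in the abstract, RY coding and removal of third codon positions, can be sketched in a few lines. RY coding replaces purines (A, G) with R and pyrimidines (C, T) with Y. The function names and the dictionary alignment format are assumptions for illustration, and the sketch assumes in-frame coding sequences; it is not the authors' pipeline.

```python
# Purines (A, G) are recoded as R; pyrimidines (C, T, and U in RNA) as Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_recode(seq):
    """Recode a nucleotide sequence into R/Y; gaps and ambiguity codes pass through."""
    return seq.upper().translate(RY_TABLE)


def drop_third_positions(seq):
    """Remove every third codon position from an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def recode_alignment(alignment):
    """Apply both treatments to each sequence in a {name: sequence} alignment."""
    return {name: ry_recode(drop_third_positions(seq))
            for name, seq in alignment.items()}


example = {"MT": "ATGCCA", "MP": "ATGCCG"}
print(recode_alignment(example))
```

Both steps discard signal that is prone to saturation and compositional bias, at the cost of fewer informative characters.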
Affiliation(s)
- Kai Liu
- Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Nan Xie
- Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Yuxi Wang
- Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Xinyi Liu
- Hangzhou Academy of Agricultural Sciences, Hangzhou, China
8
Xiang C, Gao F, Jakovlić I, Lei H, Hu Y, Zhang H, Zou H, Wang G, Zhang D. Using PhyloSuite for molecular phylogeny and tree-based analyses. IMETA 2023; 2:e87. [PMID: 38868339 PMCID: PMC10989932 DOI: 10.1002/imt2.87] [Citation(s) in RCA: 49] [Impact Index Per Article: 49.0] [Received: 11/18/2022] [Revised: 01/04/2023] [Accepted: 01/15/2023] [Indexed: 06/14/2024]
Abstract
Phylogenetic analysis has entered the genomics (multilocus) era. For less experienced researchers, mastering the large number of software programs required for a multilocus-based phylogenetic reconstruction can be daunting and time-consuming. PhyloSuite, a software package with a user-friendly GUI, was designed to make this process more accessible by integrating the programs needed for multilocus and single-gene phylogenies and streamlining the whole process. In this protocol, we explain how to conduct each step of the phylogenetic pipeline and tree-based analyses in PhyloSuite. We also present a new version of PhyloSuite (v1.2.3), in which we fixed some bugs, made some optimizations, and introduced new functions, including a number of tree-based analyses such as signal-to-noise calculation, saturation analysis, and spurious species identification. The step-by-step protocol includes background information (i.e., what the step does), reasons (i.e., why to do the step), and operations (i.e., how to do it). This protocol will help researchers quick-start their way through multilocus phylogenetic analysis, especially those interested in conducting organelle-based analyses.
Affiliation(s)
- Chuan-Yu Xiang
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
- Fangluan Gao
- Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Ivan Jakovlić
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
- Hong-Peng Lei
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
- Ye Hu
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
- Hong Zhang
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
- Hong Zou
- Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Gui-Tang Wang
- Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Dong Zhang
- State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University, Lanzhou, China
9
Conflicts in Mitochondrial Phylogenomics of Branchiopoda, with the First Complete Mitogenome of Laevicaudata (Crustacea: Branchiopoda). Curr Issues Mol Biol 2023; 45:820-837. [PMID: 36825999 PMCID: PMC9955068 DOI: 10.3390/cimb45020054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 01/13/2023] [Accepted: 01/16/2023] [Indexed: 01/19/2023] Open
Abstract
Conflicting phylogenetic signals are pervasive across genomes. The potential impact of such systematic biases may be reduced by phylogenetic approaches accommodating heterogeneity or by the exclusive use of homoplastic sites in the datasets. Here, we present the complete mitogenome of Lynceus grossipedia as the first representative of the suborder Laevicaudata. We employed a phylogenomic approach on the mitogenomic datasets representing all major branchiopod groups to identify the presence of conflicts and concordance across the phylogeny. We found pervasive phylogenetic conflicts at the base of Diplostraca. Tests of the homogeneity of the substitution pattern and posterior predictive tests revealed a high degree of compositional heterogeneity among branchiopod mitogenomes at both the nucleotide and amino acid levels, which biased the phylogenetic inference. Our results suggest that the placement of Laevicaudata as the basal clade of Phyllopoda was most likely an artifact caused by compositional heterogeneity and conflicting phylogenetic signal. We demonstrated that the exclusive use of homoplastic site methods, combined with the application of site-heterogeneous models, produced correct phylogenetic estimates of the higher-level relationships among branchiopods.
10
Reynolds NK, Stajich JE, Benny GL, Barry K, Mondo S, LaButti K, Lipzen A, Daum C, Grigoriev IV, Ho HM, Crous PW, Spatafora JW, Smith ME. Mycoparasites, Gut Dwellers, and Saprotrophs: Phylogenomic Reconstructions and Comparative Analyses of Kickxellomycotina Fungi. Genome Biol Evol 2023; 15:6974727. [PMID: 36617272 PMCID: PMC9866270 DOI: 10.1093/gbe/evac185] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/06/2022] [Revised: 12/15/2022] [Accepted: 12/20/2022] [Indexed: 01/09/2023] Open
Abstract
Improved sequencing technologies have profoundly altered global views of fungal diversity and evolution. High-throughput sequencing methods are critical for studying fungi due to the cryptic, symbiotic nature of many species, particularly those that are difficult to culture. However, the low coverage genome sequencing (LCGS) approach to phylogenomic inference has not been widely applied to fungi. Here we analyzed 171 Kickxellomycotina fungi using LCGS methods to obtain hundreds of marker genes for robust phylogenomic reconstruction. Additionally, we mined our LCGS data for a set of nine rDNA and protein coding genes to enable analyses across species for which no LCGS data were obtained. The main goals of this study were to: 1) evaluate the quality and utility of LCGS data for both phylogenetic reconstruction and functional annotation, 2) test relationships among clades of Kickxellomycotina, and 3) perform comparative functional analyses between clades to gain insight into putative trophic modes. In contrast to previous studies, our nine-gene analyses support two clades of arthropod gut dwelling species and suggest a possible single evolutionary event leading to this symbiotic lifestyle. Furthermore, we resolve the mycoparasitic Dimargaritales as the earliest diverging clade in the subphylum and find four major clades of Coemansia species. Finally, functional analyses illustrate clear variation in predicted carbohydrate active enzymes and secondary metabolites (SM) based on ecology, that is, biotroph versus saprotroph. Saprotrophic Kickxellales broadly lack many known pectinase families compared with saprotrophic Mucoromycota and are depauperate for SM but have similar numbers of predicted chitinases as mycoparasitic.
Affiliation(s)
- Jason E Stajich
- Department of Microbiology & Plant Pathology and Institute for Integrative Genome Biology, University of California–Riverside
- Kerrie Barry
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Stephen Mondo
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Kurt LaButti
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Anna Lipzen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Chris Daum
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Igor V Grigoriev
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory; Department of Plant and Microbial Biology, University of California Berkeley
- Hsiao-Man Ho
- Department of Science Education, National Taipei University of Education, 134, Section 2, Heping E. Road, Taipei 106, Taiwan
- Pedro W Crous
- Department of Evolutionary Phytopathology, Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584 CT, Utrecht, The Netherlands
11
Mulhair PO, McCarthy CGP, Siu-Ting K, Creevey CJ, O'Connell MJ. Filtering artifactual signal increases support for Xenacoelomorpha and Ambulacraria sister relationship in the animal tree of life. Curr Biol 2022; 32:5180-5188.e3. [PMID: 36356574 DOI: 10.1016/j.cub.2022.10.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2021] [Revised: 08/09/2022] [Accepted: 10/18/2022] [Indexed: 11/10/2022]
Abstract
Conflicting studies place a group of bilaterian invertebrates containing xenoturbellids and acoelomorphs, the Xenacoelomorpha, as either the primary emerging bilaterian phylum [1-6] or within Deuterostomia, sister to Ambulacraria [7-11]. Although their placement as sister to the rest of Bilateria supports relatively simple morphology in the ancestral bilaterian, their alternative placement within Deuterostomia suggests a morphologically complex ancestral bilaterian along with extensive loss of major phenotypic traits in the Xenacoelomorpha. Recent studies have questioned whether Deuterostomia should be considered monophyletic at all [10, 12, 13]. Hidden paralogy and poor phylogenetic signal present a major challenge for reconstructing species phylogenies [14-18]. Here, we assess whether these issues have contributed to the conflict over the placement of Xenacoelomorpha. We reanalyzed published datasets, enriching for orthogroups whose gene trees support well-resolved clans elsewhere in the animal tree [16]. We find that most genes in previously published datasets violate incontestable clans, suggesting that hidden paralogy and low phylogenetic signal affect the ability to reconstruct branching patterns at deep nodes in the animal tree. We demonstrate that removing orthogroups that cannot recapitulate incontestable relationships alters the final topology that is inferred, while simultaneously improving the fit of the model to the data. We discover increased, but ultimately not conclusive, support for the existence of Xenambulacraria in our set of filtered orthogroups. At a time when we are progressing toward sequencing all life on the planet, we argue that long-standing contentious issues in the tree of life will be resolved using smaller amounts of better quality data that can be modeled adequately [19].
Affiliation(s)
- Peter O Mulhair
- Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK; Computational and Molecular Evolutionary Biology Research Group, School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
- Charley G P McCarthy
- Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK
- Karen Siu-Ting
- Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast BT9 5DL, UK
- Christopher J Creevey
- Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast BT9 5DL, UK
- Mary J O'Connell
- Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK; Computational and Molecular Evolutionary Biology Research Group, School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK.
12
Steenwyk JL, Goltz DC, Buida TJ, Li Y, Shen XX, Rokas A. OrthoSNAP: A tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees. PLoS Biol 2022; 20:e3001827. [PMID: 36228036 PMCID: PMC9595520 DOI: 10.1371/journal.pbio.3001827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 10/25/2022] [Accepted: 09/13/2022] [Indexed: 11/19/2022] Open
Abstract
Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in 1 or more species (a phenomenon observed among several important families of genes such as transporters and transcription factors) are often ignored because identifying and retrieving SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, software that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We term SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across 7 eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; using OrthoSNAP on the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole-genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
Affiliation(s)
- Jacob L. Steenwyk
  - Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
  - Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
  - * E-mail: (JLS); (AR)
- Dayna C. Goltz
  - Independent Researcher, Nashville, Tennessee, United States of America
- Thomas J. Buida
  - Independent Researcher, Nashville, Tennessee, United States of America
- Yuanning Li
  - Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
  - Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
  - Institute of Marine Science and Technology, Shandong University, Qingdao, China
- Xing-Xing Shen
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Antonis Rokas
  - Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
  - Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
  - Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - * E-mail: (JLS); (AR)
|
13
|
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
|
14
|
Chen K, Moravec JC, Gavryushkin A, Welch D, Drummond AJ. Accounting for errors in data improves divergence time estimates in single-cell cancer evolution. Mol Biol Evol 2022; 39:6613463. [PMID: 35733333 PMCID: PMC9356729 DOI: 10.1093/molbev/msac143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
Single-cell sequencing provides a new way to explore the evolutionary history of cells. Compared to traditional bulk sequencing, where a population of heterogeneous cells is pooled to form a single observation, single-cell sequencing isolates and amplifies genetic material from individual cells, thereby preserving the information about the origin of the sequences. However, single-cell data is more error-prone than bulk sequencing data due to the limited genomic material available per cell. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. Our simulations show that modeling errors increase the accuracy of relative divergence times and substitution parameters. We reconstruct the phylogenetic history of a colorectal cancer patient and a healthy patient from single-cell DNA sequencing data. We find that the estimated times of terminal splitting events are shifted forward in time compared to models which ignore errors. We observed that not accounting for errors can overestimate the phylogenetic diversity in single-cell DNA sequencing data. We estimate that 30-50% of the apparent diversity can be attributed to error. Our work enables a full Bayesian approach capable of accounting for errors in the data within the integrative Bayesian software framework BEAST2.
Affiliation(s)
- Kylie Chen
  - School of Computer Science, University of Auckland, Auckland, New Zealand
- Jiří C Moravec
  - Department of Computer Science, University of Otago, Dunedin, New Zealand; School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
- Alex Gavryushkin
  - School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
- David Welch
  - School of Computer Science, University of Auckland, Auckland, New Zealand
- Alexei J Drummond
  - School of Computer Science, University of Auckland, Auckland, New Zealand; School of Biological Sciences, University of Auckland, Auckland, New Zealand
|
15
|
Foster PG, Schrempf D, Szöllősi GJ, Williams TA, Cox CJ, Embley TM. Recoding amino acids to a reduced alphabet may increase or decrease phylogenetic accuracy. Syst Biol 2022:6609786. [PMID: 35713492 DOI: 10.1093/sysbio/syac042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/09/2021] [Revised: 05/16/2022] [Accepted: 06/07/2022] [Indexed: 11/12/2022] Open
Abstract
Common molecular phylogenetic characteristics such as long branches and compositional heterogeneity can be problematic for phylogenetic reconstruction when using amino acid data. Recoding alignments to reduced alphabets before phylogenetic analysis has often been used both to explore and potentially decrease the effect of such problems. We tested the effectiveness of this strategy on topological accuracy using simulated data on four-taxon trees. We simulated alignments in phylogenetically challenging ways to test the phylogenetic accuracy of analyses using various recoding strategies together with commonly-used homogeneous models. We tested three recoding methods based on amino acid exchangeability, and another recoding method based on lowering the compositional heterogeneity among alignment sequences as measured by the Chi-squared statistic. Our simulation results show that on trees with long branches where sequences approach saturation, accuracy was not greatly affected by exchangeability-based recodings, but Chi-squared-based recoding decreased accuracy. We then simulated sequences with different kinds of compositional heterogeneity over the tree. Recoding often increased accuracy on such alignments. Exchangeability-based recoding was rarely worse than not recoding, and often considerably better. Recoding based on lowering the Chi-squared value improved accuracy in some cases but not in others, suggesting that low compositional heterogeneity by itself is not sufficient to increase accuracy in the analysis of these alignments. We also simulated alignments using site-specific amino acid profiles, making sequences that had compositional heterogeneity over alignment sites. Exchangeability-based recoding coupled with site-homogeneous models had poor accuracy for these datasets but Chi-squared-based recoding on these alignments increased accuracy. We then simulated datasets that were compositionally both site- and tree-heterogeneous, like many real datasets. 
The effect on accuracy of recoding such doubly problematic datasets varied widely, depending on the type of compositional tree-heterogeneity and on the recoding scheme. Interestingly, analysis of unrecoded compositionally heterogeneous alignments with the NDCH or CAT models was generally more accurate than homogeneous analysis, whether recoded or not. Overall, our results suggest that making trees for recoded amino acid datasets can be useful, but they need to be interpreted cautiously as part of a more comprehensive analysis. The use of better fitting models like NDCH and CAT, which directly account for the patterns in the data, may offer a more promising long-term solution for analysing empirical data.
Affiliation(s)
- Peter G Foster
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
- Dominik Schrempf
  - Department of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Gergely J Szöllősi
  - Department of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary; MTA-ELTE "Lendület" Evolutionary Genomics Research Group, 1117 Budapest, Hungary; Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, 8237 Tihany, Hungary
- Tom A Williams
  - School of Biological Sciences, University of Bristol, BS8 1TQ, Bristol, UK
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Gambelas, 8005-319 Faro, Portugal
- T Martin Embley
  - Biosciences Institute, Centre for Bacterial Cell Biology, Baddiley-Clark Building (room 2.04), Newcastle University, Richardson Road, Newcastle upon Tyne, UK
|
16
|
Protein Structure, Models of Sequence Evolution, and Data Type Effects in Phylogenetic Analyses of Mitochondrial Data: A Case Study in Birds. Diversity 2021. [DOI: 10.3390/d13110555] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Phylogenomic analyses have revolutionized the study of biodiversity, but they have revealed that estimated tree topologies can depend, at least in part, on the subset of the genome that is analyzed. For example, estimates of trees for avian orders differ if protein-coding or non-coding data are analyzed. The bird tree is a good study system because the historical signal for relationships among orders is very weak, which should permit subtle non-historical signals to be identified, while monophyly of orders is strongly corroborated, allowing identification of strong non-historical signals. Hydrophobic amino acids in mitochondrially-encoded proteins, which are expected to be found in transmembrane helices, have been hypothesized to be associated with non-historical signals. We tested this hypothesis by comparing the evolution of transmembrane helices and extramembrane segments of mitochondrial proteins from 420 bird species, sampled from most avian orders. We estimated amino acid exchangeabilities for both structural environments and assessed the performance of phylogenetic analysis using each data type. We compared those relative exchangeabilities with values calculated using a substitution matrix for transmembrane helices estimated using a variety of nuclear- and mitochondrially-encoded proteins, allowing us to compare the bird-specific mitochondrial models with a general model of transmembrane protein evolution. To complement our amino acid analyses, we examined the impact of protein structure on patterns of nucleotide evolution. Models of transmembrane and extramembrane sequence evolution for amino acids and nucleotides exhibited striking differences, but there was no evidence for strong topological data type effects. However, incorporating protein structure into analyses of mitochondrially-encoded proteins improved model fit. Thus, we believe that considering protein structure will improve analyses of mitogenomic data, both in birds and in other taxa.
|
17
|
Atta CJ, Yuan H, Li C, Arcila D, Betancur-R R, Hughes LC, Ortí G, Tornabene L. Exon-capture data and locus screening provide new insights into the phylogeny of flatfishes (Pleuronectoidei). Mol Phylogenet Evol 2021; 166:107315. [PMID: 34537325 DOI: 10.1016/j.ympev.2021.107315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2020] [Revised: 05/12/2021] [Accepted: 09/14/2021] [Indexed: 10/20/2022]
Abstract
There is an extensive collection of literature on the taxonomy and phylogenetics of flatfishes (Pleuronectiformes) that extends over two centuries, but consensus on many of their evolutionary relationships remains elusive. Phylogenetic uncertainty stems from highly divergent results derived from morphological and genetic characters, and between various molecular datasets. Deciphering relationships is complicated by rapid diversification early in the Pleuronectiformes tree and an abundance of studies that incompletely and inconsistently sample taxa and genetic markers. We present phylogenies based on a genome-wide dataset (4,434 nuclear markers via exon-capture) and wide taxon sampling (86 species spanning 12 of 16 families) of the largest flatfish suborder (Pleuronectoidei). Nine different subsets of the data and two tree construction approaches (eighteen phylogenies in total) are remarkably consistent with other recent molecular phylogenies, and show strong support for the monophyly of all families included except Pleuronectidae. Analyses resolved a novel phylogenetic hypothesis for the family Rhombosoleidae as being within the Pleuronectoidea rather than the Soleoidea, and failed to support the subfamily Hippoglossinae as a monophyletic group. Our results were corroborated with evidence from previous phylogenetic studies to outline regions of persistent phylogenetic uncertainty and identify groups in need of further phylogenetic inference.
Affiliation(s)
- Calder J Atta
  - School of Aquatic and Fishery Sciences, University of Washington, Seattle, USA; Burke Museum of Natural History and Culture, Seattle, USA
- Hao Yuan
  - Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, Shanghai, China
- Chenhong Li
  - Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, Shanghai, China
- Dahiana Arcila
  - Sam Noble Oklahoma Museum of Natural History, The University of Oklahoma, Norman, OK 73072, USA; Department of Biology, The University of Oklahoma, Norman, OK 73072, USA
- Ricardo Betancur-R
  - Sam Noble Oklahoma Museum of Natural History, The University of Oklahoma, Norman, OK 73072, USA; Department of Biology, The University of Oklahoma, Norman, OK 73072, USA
- Lily C Hughes
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, IL 60637, USA; National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA; Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA
- Guillermo Ortí
  - National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA; Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA
- Luke Tornabene
  - School of Aquatic and Fishery Sciences, University of Washington, Seattle, USA; Burke Museum of Natural History and Culture, Seattle, USA
|
18
|
Vera-Ruiz VA, Robinson J, Jermiin LS. A Likelihood-Ratio Test for Lumpability of Phylogenetic Data: Is the Markovian Property of an Evolutionary Process retained in Recoded DNA? Syst Biol 2021; 71:660-675. [PMID: 34498090 DOI: 10.1093/sysbio/syab074] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/05/2021] [Revised: 08/19/2021] [Accepted: 08/27/2021] [Indexed: 11/12/2022] Open
Abstract
In molecular phylogenetics, it is typically assumed that the evolutionary process for DNA can be approximated by independent and identically distributed Markovian processes at the variable sites and that these processes diverge over the edges of a rooted bifurcating tree. Sometimes the nucleotides are transformed from a 4-state alphabet to a 3- or 2-state alphabet by a procedure that is called recoding, lumping, or grouping of states. Here, we introduce a likelihood-ratio test for lumpability for DNA that has diverged under different Markovian conditions, which assesses the assumption that the Markovian property of the evolutionary process over each edge is retained after recoding of the nucleotides. The test is derived and validated numerically on simulated data. To demonstrate the insights that can be gained by using the test, we assessed two published data sets, one of mitochondrial DNA from a phylogenetic study of the ratites (Syst. Biol. 59:90-107 [2010]) and the other of nuclear DNA from a phylogenetic study of yeast (Mol. Biol. Evol. 21:1455-1458 [2004]). Our analysis of these data sets revealed that recoding of the DNA eliminated some of the compositional heterogeneity detected over the sequences. However, the Markovian property of the original evolutionary process was not retained by the recoding, leading to some significant distortions of edge lengths in reconstructed trees.
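As a rough illustration of the recoding step described in this abstract, the sketch below lumps the 4-state nucleotide alphabet into a 2-state alphabet. The purine/pyrimidine (RY) grouping and the names `RY_MAP` and `recode_ry` are assumptions chosen for illustration; the abstract does not say which groupings the authors examined, and this sketch does not implement their likelihood-ratio test.

```python
# Hypothetical example of "lumping" nucleotide states (RY-coding).
# Purines (A, G) become R and pyrimidines (C, T) become Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode_ry(seq: str) -> str:
    """Recode a DNA sequence from 4 states to 2 states.

    Characters outside A/C/G/T (gaps, ambiguity codes) are passed
    through unchanged, which is an assumption of this sketch.
    """
    return "".join(RY_MAP.get(base, base) for base in seq.upper())


if __name__ == "__main__":
    print(recode_ry("ACGT-N"))
```

Whether the recoded sequences can still be treated as the product of a Markovian process is the question the test in the abstract addresses; the recoding itself is only the first step.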
Affiliation(s)
- Victor A Vera-Ruiz
  - School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia; Department of Mathematics and Statistics, University of Nevada, Reno, NV 89557, USA
- John Robinson
  - School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia
- Lars S Jermiin
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia; School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland; Earth Institute, University College Dublin, Belfield, Dublin 4, Ireland
|
19
|
Mongiardino Koch N. Phylogenomic Subsampling and the Search for Phylogenetically Reliable Loci. Mol Biol Evol 2021; 38:4025-4038. [PMID: 33983409 DOI: 10.1101/2021.02.13.431075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/21/2023] Open
Abstract
Phylogenomic subsampling is a procedure by which small sets of loci are selected from large genome-scale data sets and used for phylogenetic inference. This step is often motivated by either computational limitations associated with the use of complex inference methods or as a means of testing the robustness of phylogenetic results by discarding loci that are deemed potentially misleading. Although many alternative methods of phylogenomic subsampling have been proposed, little effort has gone into comparing their behavior across different data sets. Here, I calculate multiple gene properties for a range of phylogenomic data sets spanning animal, fungal, and plant clades, uncovering a remarkable predictability in their patterns of covariance. I also show how these patterns provide a means for ordering loci by both their rate of evolution and their relative phylogenetic usefulness. This method of retrieving phylogenetically useful loci is found to be among the top performing when compared with alternative subsampling protocols. Relatively common approaches such as minimizing potential sources of systematic bias or increasing the clock-likeness of the data are found to fare worse than selecting loci at random. Likewise, the general utility of rate-based subsampling is found to be limited: loci evolving at both low and high rates are among the least effective, and even those evolving at optimal rates can still widely differ in usefulness. This study shows that many common subsampling approaches introduce unintended effects in off-target gene properties and proposes an alternative multivariate method that simultaneously optimizes phylogenetic signal while controlling for known sources of bias.
|
21
|
Abstract
The Antarctic environment is famously inhospitable to most terrestrial biodiversity, traditionally viewed as a driver of species extinction. Combining population- and species-level molecular data, we show that beetles on islands along the Antarctic Polar Front diversified in response to major climatic events over the last 50 Ma in surprising synchrony with the region’s marine organisms. Unique algae- and moss-feeding habits enabled beetles to capitalize on cooling conditions, which resulted in a decline in flowering plants—the typical hosts for beetles elsewhere. Antarctica’s cooling paleoclimate thus fostered the diversification of both terrestrial and marine life. Climatically driven evolutionary processes since the Miocene may underpin much of the region’s diversity, are still ongoing, and should be further investigated among Antarctic biota. Global cooling and glacial–interglacial cycles since Antarctica’s isolation have been responsible for the diversification of the region’s marine fauna. By contrast, these same Earth system processes are thought to have played little role terrestrially, other than driving widespread extinctions. Here, we show that on islands along the Antarctic Polar Front, paleoclimatic processes have been key to diversification of one of the world’s most geographically isolated and unique groups of herbivorous beetles—Ectemnorhinini weevils. Combining phylogenomic, phylogenetic, and phylogeographic approaches, we demonstrate that these weevils colonized the sub-Antarctic islands from Africa at least 50 Ma ago and repeatedly dispersed among them. As the climate cooled from the mid-Miocene, diversification of the beetles accelerated, resulting in two species-rich clades. One of these clades specialized to feed on cryptogams, typical of the polar habitats that came to prevail under Miocene conditions yet remarkable as a food source for any beetle. 
This clade’s most unusual representative is a marine weevil currently undergoing further speciation. The other clade retained the more common weevil habit of feeding on angiosperms, which likely survived glaciation in isolated refugia. Diversification of Ectemnorhinini weevils occurred in synchrony with many other Antarctic radiations, including penguins and notothenioid fishes, and coincided with major environmental changes. Our results thus indicate that geo-climatically driven diversification has progressed similarly for Antarctic marine and terrestrial organisms since the Miocene, potentially constituting a general biodiversity paradigm that should be sought broadly for the region’s taxa.
|
22
|
Chandhini S, Yamanoue Y, Varghese S, Ali PHA, Arjunan VM, Kumar VJR. Whole mitogenome analysis and phylogeny of freshwater fish red-finned catopra (Pristolepis rubripinnis) endemic to Kerala, India. J Genet 2021. [DOI: 10.1007/s12041-021-01292-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
|
23
|
Williams TA, Schrempf D, Szöllősi GJ, Cox CJ, Foster PG, Embley TM. Inferring the deep past from molecular data. Genome Biol Evol 2021; 13:6192802. [PMID: 33772552 PMCID: PMC8175050 DOI: 10.1093/gbe/evab067] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 03/22/2021] [Indexed: 12/17/2022] Open
Abstract
There is an expectation that analyses of molecular sequences might be able to distinguish between alternative hypotheses for ancient relationships, but the phylogenetic methods used and types of data analyzed are of critical importance in any attempt to recover historical signal. Here, we discuss some common issues that can influence the topology of trees obtained when using overly simple models to analyze molecular data that often display complicated patterns of sequence heterogeneity. To illustrate our discussion, we have used three examples of inferred relationships which have changed radically as models and methods of analysis have improved. In two of these examples, the sister-group relationship between thermophilic Thermus and mesophilic Deinococcus, and the position of long-branch Microsporidia among eukaryotes, we show that recovering what is now generally considered to be the correct tree is critically dependent on the fit between model and data. In the third example, the position of eukaryotes in the tree of life, the hypothesis that is currently supported by the best available methods is fundamentally different from the classical view of relationships between major cellular domains. Since heterogeneity appears to be pervasive and varied among all molecular sequence data, and even the best available models can still struggle to deal with some problems, the issues we discuss are generally relevant to phylogenetic analyses. It remains essential to maintain a critical attitude to all trees as hypotheses of relationship that may change with more data and better methods.
Affiliation(s)
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
- Dominik Schrempf
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Gergely J Szöllősi
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary; MTA-ELTE "Lendület" Evolutionary Genomics Research Group, 1117 Budapest, Hungary; Institute of Evolution, Centre for Ecological Research, 1121 Budapest, Hungary
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Gambelas, 8005-319 Faro, Portugal
- Peter G Foster
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom
- T Martin Embley
  - Biosciences Institute, Centre for Bacterial Cell Biology, Newcastle University, Newcastle upon Tyne NE2 4AX, United Kingdom
|
24
|
Shen XX, Steenwyk JL, Rokas A. Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data. Syst Biol 2021; 70:997-1014. [PMID: 33616672 DOI: 10.1093/sysbio/syab011] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/08/2020] [Revised: 02/10/2021] [Accepted: 02/17/2021] [Indexed: 12/12/2022] Open
Abstract
Topological conflict or incongruence is widespread in phylogenomic data. Concatenation- and coalescent-based approaches often result in incongruent topologies, but the causes of this conflict can be difficult to characterize. We examined incongruence stemming from conflict between likelihood-based signal (quantified by the difference in gene-wise log likelihood score or ΔGLS) and quartet-based topological signal (quantified by the difference in gene-wise quartet score or ΔGQS) for every gene in three phylogenomic studies in animals, fungi, and plants, which were chosen because their concatenation-based IQ-TREE (T1) and quartet-based ASTRAL (T2) phylogenies are known to produce eight conflicting internal branches (bipartitions). By comparing the types of phylogenetic signal for all genes in these three data matrices, we found that 30%-36% of genes in each data matrix are inconsistent, that is, each of these genes has higher log likelihood score for T1 versus T2 (i.e., ΔGLS > 0) whereas its T1 topology has lower quartet score than its T2 topology (i.e., ΔGQS < 0) or vice versa. Comparison of inconsistent and consistent genes using a variety of metrics (e.g., evolutionary rate, gene tree topology, distribution of branch lengths, hidden paralogy, and gene tree discordance) showed that inconsistent genes are more likely to recover neither T1 nor T2 and have higher levels of gene tree discordance than consistent genes. Simulation analyses demonstrate that removal of inconsistent genes from datasets with low levels of incomplete lineage sorting (ILS) and low and medium levels of gene tree estimation error (GTEE) reduced incongruence and increased accuracy. In contrast, removal of inconsistent genes from datasets with medium and high ILS levels and high GTEE levels eliminated or extensively reduced incongruence, but the resulting congruent species phylogenies were not always topologically identical to the true species trees.
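The definition of an inconsistent gene given in this abstract (ΔGLS and ΔGQS of opposite sign) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `GeneSignal` container, the function names, and the example values are hypothetical, and the per-gene ΔGLS and ΔGQS values are assumed to have been computed beforehand.

```python
from typing import Iterable, NamedTuple


class GeneSignal(NamedTuple):
    """Per-gene support for topology T1 over T2 (hypothetical container)."""
    name: str
    delta_gls: float  # log-likelihood(T1) - log-likelihood(T2)
    delta_gqs: float  # quartet score(T1) - quartet score(T2)


def is_inconsistent(gene: GeneSignal) -> bool:
    """True when the two signals favor different topologies.

    Assumption: a gene with either difference equal to zero is not
    counted as inconsistent, since the abstract only describes
    differences of opposite sign.
    """
    return gene.delta_gls * gene.delta_gqs < 0


def inconsistent_fraction(genes: Iterable[GeneSignal]) -> float:
    """Fraction of genes whose likelihood and quartet signals disagree."""
    genes = list(genes)
    if not genes:
        raise ValueError("no genes supplied")
    return sum(is_inconsistent(g) for g in genes) / len(genes)


# Made-up values for illustration only, not taken from the study.
example = [
    GeneSignal("geneA", 2.5, -3.0),
    GeneSignal("geneB", 1.2, 4.0),
    GeneSignal("geneC", -0.8, 6.0),
    GeneSignal("geneD", -1.5, -2.0),
]

if __name__ == "__main__":
    print(inconsistent_fraction(example))
```

Filtering a data matrix then amounts to keeping only the genes for which `is_inconsistent` returns False before re-running both inference approaches.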
Affiliation(s)
- Xing-Xing Shen
  - State Key Laboratory of Rice Biology and Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, Hangzhou, China; Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Jacob L Steenwyk
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
|
25
|
Steenwyk JL, Buida TJ, Labella AL, Li Y, Shen XX, Rokas A. PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data. Bioinformatics 2021; 37:2325-2331. [PMID: 33560364 PMCID: PMC8388027 DOI: 10.1093/bioinformatics/btab096] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 12/06/2020] [Revised: 01/13/2021] [Accepted: 02/05/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Diverse disciplines in biology process and analyze multiple sequence alignments (MSAs) and phylogenetic trees to evaluate their information content, infer evolutionary events and processes, and predict gene function. However, automated processing of MSAs and trees remains a challenge due to the lack of a unified toolkit. To fill this gap, we introduce PhyKIT, a toolkit for the UNIX shell environment with 30 functions that process MSAs and trees, including but not limited to estimation of mutation rate, evaluation of sequence composition biases, calculation of the degree of violation of a molecular clock, and collapsing bipartitions (internal branches) with low support. RESULTS: To demonstrate the utility of PhyKIT, we detail three use cases: (1) summarizing information content in MSAs and phylogenetic trees for diagnosing potential biases in sequence or tree data; (2) evaluating gene-gene covariation of evolutionary rates to identify functional relationships, including novel ones, among genes; and (3) identifying lack of resolution events or polytomies in phylogenetic trees, which are suggestive of rapid radiation events or lack of data. We anticipate PhyKIT will be useful for processing, examining, and deriving biological meaning from increasingly large phylogenomic datasets. AVAILABILITY: PhyKIT is freely available on GitHub (https://github.com/JLSteenwyk/PhyKIT), PyPi (https://pypi.org/project/phykit/), and the Anaconda Cloud (https://anaconda.org/JLSteenwyk/phykit) under the MIT license with extensive documentation and user tutorials (https://jlsteenwyk.com/PhyKIT). SUPPLEMENTARY INFORMATION: Supplementary data are available on figshare (doi: 10.6084/m9.figshare.13118600) and are available at Bioinformatics online.
Affiliation(s)
- Jacob L Steenwyk
  - Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN, 37235, United States of America
- Thomas J Buida
  - 9 City Place #312, Nashville, TN, 37209, United States of America
- Abigail L Labella
  - Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN, 37235, United States of America
- Yuanning Li
  - Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN, 37235, United States of America
- Xing-Xing Shen
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, 310058, China
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN, 37235, United States of America
|
26
|
Phillips MJ, Shazwani Zakaria S. Enhancing mitogenomic phylogeny and resolving the relationships of extinct megafaunal placental mammals. Mol Phylogenet Evol 2021; 158:107082. [PMID: 33482383 DOI: 10.1016/j.ympev.2021.107082] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/10/2020] [Revised: 12/21/2020] [Accepted: 01/11/2021] [Indexed: 10/22/2022]
Abstract
Mitochondrial genomes provided the first widely used sequences that were sufficiently informative to resolve relationships among animals across a wide taxonomic domain, from within species to between phyla. However, mitogenome studies supported several anomalous relationships and fell partly out of favour as sequencing multiple, independent nuclear loci proved to be highly effective. A tendency to blame mitochondrial DNA (mtDNA) has overshadowed efforts to understand and ameliorate underlying model misspecification. Here we find that influential assessments of the infidelity of mitogenome phylogenies have often been overstated, but nevertheless, substitution saturation and compositional non-stationarity substantially mislead reconstruction. We show that RY coding the mtDNA, excluding protein-coding 3rd codon sites, partitioning models based on amino acid hydrophobicity and enhanced taxon sampling improve the accuracy of mitogenomic phylogeny reconstruction for placental mammals, almost to the level of multi-gene nuclear datasets. Indeed, combined analysis of mtDNA with 3-fold longer nuclear sequence data either maintained or improved upon the nuclear support for all generally accepted clades, even those that mtDNA alone did not favour, thus indicating "hidden support". Confident mtDNA phylogeny reconstruction is especially important for understanding the evolutionary dynamics of mitochondria themselves, and for merging extinct taxa into the tree of life, with ancient DNA often only accessible as mtDNA. Our ancient mtDNA analyses lend confidence to the relationships of three extinct megafaunal taxa: glyptodonts are nested within armadillos, the South American ungulate, Macrauchenia is sister to horses and rhinoceroses, and sabre-toothed and scimitar cats are the monophyletic sister-group of modern cats.
Affiliation(s)
- Matthew J Phillips
  - School of Biology and Environmental Science, Queensland University of Technology, 2 George Street, Brisbane 4000, QLD, Australia
- Sarah Shazwani Zakaria
  - School of Biology and Environmental Science, Queensland University of Technology, 2 George Street, Brisbane 4000, QLD, Australia; School of Biology, Faculty of Applied Sciences, Universiti Teknologi MARA (UiTM) Caw. Negeri Sembilan, Kuala Pilah 72000, Malaysia
|
27
|
Abstract
The phylogeny of Neoaves, the largest clade of extant birds, has remained unclear despite intense study. The difficulty associated with resolving the early branches in Neoaves is likely driven by the rapid radiation of this group. However, conflicts among studies may be exacerbated by the data type analyzed. For example, analyses of coding exons typically yield trees that place Strisores (nightjars and allies) sister to the remaining Neoaves, while analyses of non-coding data typically yield trees where Mirandornites (flamingos and grebes) is the sister of the remaining Neoaves. Our understanding of data type effects is hampered by the fact that previous analyses have used different taxa, loci, and types of non-coding data. Herein, we provide strong corroboration of the data type effects hypothesis for Neoaves by comparing trees based on coding and non-coding data derived from the same taxa and gene regions. A simple analytical method known to minimize biases due to base composition (coding nucleotides as purines and pyrimidines) resulted in coding exon data with increased congruence to the non-coding topology using concatenated analyses. These results improve our understanding of the resolution of neoavian phylogeny and point to a challenge—data type effects—that is likely to be an important factor in phylogenetic analyses of birds (and many other taxonomic groups). Using our results, we provide a summary phylogeny that identifies well-corroborated relationships and highlights specific nodes where future efforts should focus.
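The purine/pyrimidine recoding mentioned in this abstract (often called RY coding) can be sketched in a few lines. This is an illustrative sketch only: the function name `ry_code` and the decision to leave gaps and ambiguity codes unchanged are assumptions, not details taken from the study.

```python
# Translation table: purines (A, G) become R; pyrimidines (C, T, U) become Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(sequence: str) -> str:
    """Recode a nucleotide sequence as purines (R) and pyrimidines (Y).

    Characters outside A/C/G/T/U, such as gaps ('-') or ambiguity codes
    like 'N', are passed through unchanged (an assumption of this sketch).
    """
    return sequence.upper().translate(RY_TABLE)


# Made-up aligned fragment for illustration.
recoded = ry_code("acgt-N")
```

Because transitions (A/G or C/T changes) become invisible after recoding, only transversions contribute signal, which is why the approach is used to reduce base-composition bias.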
|
28
|
Owen CL, Stern DB, Hilton SK, Crandall KA. Hemiptera phylogenomic resources: Tree‐based orthology prediction and conserved exon identification. Mol Ecol Resour 2020; 20:1346-1360. [DOI: 10.1111/1755-0998.13180] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/09/2018] [Revised: 04/02/2020] [Accepted: 04/27/2020] [Indexed: 12/21/2022]
Affiliation(s)
- Christopher L. Owen
  - Computational Biology Institute, George Washington University, Washington, DC, USA
  - Systematic Entomology Laboratory, USDA‐ARS, Beltsville, MD, USA
- David B. Stern
  - Computational Biology Institute, George Washington University, Washington, DC, USA
  - Department of Integrative Biology, University of Wisconsin‐Madison, Madison, WI, USA
- Sarah K. Hilton
  - Computational Biology Institute, George Washington University, Washington, DC, USA
  - Department of Genome Sciences, University of Washington, Washington, DC, USA
- Keith A. Crandall
  - Computational Biology Institute, George Washington University, Washington, DC, USA
|
29
|
Celik MA, Phillips MJ. Conflict Resolution for Mesozoic Mammals: Reconciling Phylogenetic Incongruence Among Anatomical Regions. Front Genet 2020; 11:0651. [PMID: 32774343 PMCID: PMC7381353 DOI: 10.3389/fgene.2020.00651] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/03/2019] [Accepted: 05/28/2020] [Indexed: 11/13/2022]
Abstract
The evolutionary history of Mesozoic mammaliaformes is well studied. Although the backbone of their phylogeny is well resolved, the placement of ecologically specialized groups has remained uncertain. Functional and developmental covariation has long been identified as an important source of phylogenetic error, yet combining incongruent morphological characters altogether is currently a common practice when reconstructing phylogenetic relationships. Ignoring incongruence may inflate the confidence in reconstructing relationships, particularly for the placement of highly derived and ecologically specialized taxa, such as among australosphenidans (particularly, crown monotremes), haramiyidans, and multituberculates. The alternative placement of these highly derived clades can alter the taxonomic constituency and temporal origin of the mammalian crown group. Based on prior hypotheses and correlated homoplasy analyses, we identified cheek teeth and shoulder girdle character complexes as having a high potential to introduce phylogenetic error. We showed that incongruence among mandibulodental, cranial, and postcranial anatomical partitions for the placement of the australosphenidans, haramiyids, and multituberculates could largely be explained by apparently non-phylogenetic covariance from cheek teeth and shoulder girdle characters. Excluding these character complexes brought agreement between anatomical regions and improved the confidence in tree topology. These results emphasize the importance of considering and ameliorating major sources of bias in morphological data, and we anticipate that these will be valuable for confidently integrating morphological and molecular data in phylogenetic and dating analyses.
Affiliation(s)
- Mélina A Celik
  - School of Biology and Environmental Science, Queensland University of Technology, Brisbane, QLD, Australia
- Matthew J Phillips
  - School of Biology and Environmental Science, Queensland University of Technology, Brisbane, QLD, Australia
|
30
|
Abstract
Knowing phylogenetic relationships among species is fundamental for many studies in biology. An accurate phylogenetic tree underpins our understanding of the major transitions in evolution, such as the emergence of new body plans or metabolism, and is key to inferring the origin of new genes, detecting molecular adaptation, understanding morphological character evolution and reconstructing demographic changes in recently diverged species. Although data are ever more plentiful and powerful analysis methods are available, there remain many challenges to reliable tree building. Here, we discuss the major steps of phylogenetic analysis, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies. Understanding the different sources of errors and the strategies to mitigate them is essential for assembling an accurate tree of life.
|
31
|
Abstract
Phylogenetic trees are essential to evolutionary biology, and numerous methods exist that attempt to extract phylogenetic information applicable to a wide range of disciplines, such as epidemiology and metagenomics. Currently, the three main Python packages for trees are Bio.Phylo, DendroPy, and the ETE Toolkit, but as dataset sizes grow, parsing and manipulating ultra-large trees becomes impractical for these tools. To address this issue, we present TreeSwift, a user-friendly and massively scalable Python package for traversing and manipulating trees that is ideal for algorithms performed on ultra-large trees.
Affiliation(s)
- N Moshiri
  - Department of Computer Science and Engineering, UC San Diego, 92093, USA
|
32
|
Kealy S, Donnellan SC, Mitchell KJ, Herrera M, Aplin K, O'Connor S, Louys J. Phylogenetic relationships of the cuscuses (Diprotodontia : Phalangeridae) of island Southeast Asia and Melanesia based on the mitochondrial ND2 gene. Australian Mammalogy 2020. [DOI: 10.1071/am18050] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/11/2023]
Abstract
The species-level systematics of the marsupial family Phalangeridae, particularly Phalanger, are poorly understood, due partly to the family’s wide distribution across Australia, New Guinea, eastern Indonesia, and surrounding islands. In order to refine the species-level systematics of Phalangeridae, and improve our understanding of their evolution, we generated 36 mitochondrial ND2 DNA sequences from multiple species and sample localities. We combined our new data with available sequences and produced the most comprehensive molecular phylogeny for Phalangeridae to date. Our analyses (1) strongly support the monophyly of the three phalangerid subfamilies (Trichosurinae, Ailuropinae, Phalangerinae); (2) reveal the need to re-examine all specimens currently identified as ‘Phalanger orientalis’; and (3) suggest the elevation of the Solomon Island P. orientalis subspecies to species level (P. breviceps Thomas, 1888). In addition, samples of P. orientalis from Timor formed a clade, consistent with an introduction by humans from a single source population. However, further research on east Indonesian P. orientalis populations will be required to test this hypothesis, resolve inconsistencies in divergence time estimates, and locate the source population and taxonomic status of the Timor P. orientalis.
|
33
|
Sathyajith C, Yamanoue Y, Yokobori SI, Thampy S, Vattiringal Jayadradhan RK. Mitogenome analysis of dwarf pufferfish (Carinotetraodon travancoricus) endemic to southwest India and its implications in the phylogeny of Tetraodontidae. J Genet 2019. [DOI: 10.1007/s12041-019-1151-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
34
|
A Robust Phylogenomic Time Tree for Biotechnologically and Medically Important Fungi in the Genera Aspergillus and Penicillium. mBio 2019; 10:mBio.00925-19. [PMID: 31289177 PMCID: PMC6747717 DOI: 10.1128/mbio.00925-19] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Indexed: 02/05/2023]
Abstract
Understanding the evolution of traits across technologically and medically significant fungi requires a robust phylogeny. Even though species in the Aspergillus and Penicillium genera (family Aspergillaceae, class Eurotiomycetes) are some of the most significant technologically and medically relevant fungi, we still lack a genome-scale phylogeny of the lineage or knowledge of the parts of the phylogeny that exhibit conflict among analyses. Here, we used a phylogenomic approach to infer evolutionary relationships among 81 genomes that span the diversity of Aspergillus and Penicillium species, to identify conflicts in the phylogeny, and to determine the likely underlying factors of the observed conflicts. Using a data matrix comprised of 1,668 genes, we found that while most branches of the phylogeny of the Aspergillaceae are robustly supported and recovered irrespective of method of analysis, a few exhibit various degrees of conflict among our analyses. Further examination of the observed conflict revealed that it largely stems from incomplete lineage sorting and hybridization or introgression. Our analyses provide a robust and comprehensive evolutionary genomic roadmap for this important lineage, which will facilitate the examination of the diverse technologically and medically relevant traits of these fungi in an evolutionary context. The filamentous fungal family Aspergillaceae contains >1,000 known species, mostly in the genera Aspergillus and Penicillium. Several species are used in the food, biotechnology, and drug industries (e.g., Aspergillus oryzae and Penicillium camemberti), while others are dangerous human and plant pathogens (e.g., Aspergillus fumigatus and Penicillium digitatum). To infer a robust phylogeny and pinpoint poorly resolved branches and their likely underlying contributors, we used 81 genomes spanning the diversity of Aspergillus and Penicillium to construct a 1,668-gene data matrix. 
Phylogenies of the nucleotide and amino acid versions of this full data matrix as well as of several additional data matrices were generated using three different maximum likelihood schemes (i.e., gene-partitioned, unpartitioned, and coalescence) and using both site-homogenous and site-heterogeneous models (total of 64 species-level phylogenies). Examination of the topological agreement among these phylogenies and measures of internode certainty identified 11/78 (14.1%) bipartitions that were incongruent and pinpointed the likely underlying contributing factors, which included incomplete lineage sorting, hidden paralogy, hybridization or introgression, and reconstruction artifacts associated with poor taxon sampling. Relaxed molecular clock analyses suggest that Aspergillaceae likely originated in the lower Cretaceous and that the Aspergillus and Penicillium genera originated in the upper Cretaceous. Our results shed light on the ongoing debate on Aspergillus systematics and taxonomy and provide a robust evolutionary and temporal framework for comparative genomic analyses in Aspergillaceae. More broadly, our approach provides a general template for phylogenomic identification of resolved and contentious branches in densely genome-sequenced lineages across the tree of life.
|
35
|
Cascini M, Mitchell KJ, Cooper A, Phillips MJ. Reconstructing the Evolution of Giant Extinct Kangaroos: Comparing the Utility of DNA, Morphology, and Total Evidence. Syst Biol 2018; 68:520-537. [DOI: 10.1093/sysbio/syy080] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/05/2018] [Revised: 11/20/2018] [Accepted: 11/20/2018] [Indexed: 11/12/2022]
Affiliation(s)
- Manuela Cascini
  - School of Earth, Environmental and Biological Sciences, Queensland University of Technology, 2, George Street, Brisbane, QLD 4000, Australia
- Kieren J Mitchell
  - Australian Centre for Ancient DNA, School of Biological Sciences, University of Adelaide, North Terrace Campus, South Australia 5005, Australia
- Alan Cooper
  - Australian Centre for Ancient DNA, School of Biological Sciences, University of Adelaide, North Terrace Campus, South Australia 5005, Australia
- Matthew J Phillips
  - School of Earth, Environmental and Biological Sciences, Queensland University of Technology, 2, George Street, Brisbane, QLD 4000, Australia
|
36
|
Shen XX, Opulente DA, Kominek J, Zhou X, Steenwyk JL, Buh KV, Haase MAB, Wisecaver JH, Wang M, Doering DT, Boudouris JT, Schneider RM, Langdon QK, Ohkuma M, Endoh R, Takashima M, Manabe RI, Čadež N, Libkind D, Rosa CA, DeVirgilio J, Hulfachor AB, Groenewald M, Kurtzman CP, Hittinger CT, Rokas A. Tempo and Mode of Genome Evolution in the Budding Yeast Subphylum. Cell 2018; 175:1533-1545.e20. [PMID: 30415838 DOI: 10.1016/j.cell.2018.10.023] [Citation(s) in RCA: 318] [Impact Index Per Article: 53.0] [Received: 04/19/2018] [Revised: 08/12/2018] [Accepted: 10/04/2018] [Indexed: 11/17/2022]
Abstract
Budding yeasts (subphylum Saccharomycotina) are found in every biome and are as genetically diverse as plants or animals. To understand budding yeast evolution, we analyzed the genomes of 332 yeast species, including 220 newly sequenced ones, which represent nearly one-third of all known budding yeast diversity. Here, we establish a robust genus-level phylogeny comprising 12 major clades, infer the timescale of diversification from the Devonian period to the present, quantify horizontal gene transfer (HGT), and reconstruct the evolution of 45 metabolic traits and the metabolic toolkit of the budding yeast common ancestor (BYCA). We infer that BYCA was metabolically complex and chronicle the tempo and mode of genomic and phenotypic evolution across the subphylum, which is characterized by very low HGT levels and widespread losses of traits and the genes that control them. More generally, our results argue that reductive evolution is a major mode of evolutionary diversification.
Affiliation(s)
- Xing-Xing Shen
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Dana A Opulente
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jacek Kominek
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- Xiaofan Zhou
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, 510642 Guangzhou, China
- Jacob L Steenwyk
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Kelly V Buh
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Max A B Haase
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA; Sackler Institute of Graduate Biomedical Sciences, NYU School of Medicine, New York, NY 10016, USA
- Jennifer H Wisecaver
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Department of Biochemistry, Center for Plant Biology, Purdue University, West Lafayette, IN 47907, USA
- Mingshuang Wang
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Drew T Doering
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- James T Boudouris
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Rachel M Schneider
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- Quinn K Langdon
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Moriya Ohkuma
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
- Rikiya Endoh
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
- Masako Takashima
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Ibaraki 305-0074, Japan
- Ri-Ichiroh Manabe
  - Division of Genomic Technologies, RIKEN Center For Life Science Technologies, Laboratory for Comprehensive Genomic Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
- Neža Čadež
  - Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia
- Diego Libkind
  - Laboratorio de Microbiología Aplicada y Biotecnología, Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Consejo Nacional de Investigaciones, Científicas y Técnicas (CONICET)-Universidad Nacional del Comahue, 8400 Bariloche, Argentina
- Carlos A Rosa
  - Departamento de Microbiologia, ICB, CP 486, Universidade Federal de Minas Gerais, Belo Horizonte, MG, 31270-901, Brazil
- Jeremy DeVirgilio
  - Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- Amanda Beth Hulfachor
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Cletus P Kurtzman
  - Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- Chris Todd Hittinger
  - Laboratory of Genetics, Genome Center of Wisconsin, Wisconsin Energy Institute, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA; DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
|
37
|
Kuang T, Tornabene L, Li J, Jiang J, Chakrabarty P, Sparks JS, Naylor GJP, Li C. Phylogenomic analysis on the exceptionally diverse fish clade Gobioidei (Actinopterygii: Gobiiformes) and data-filtering based on molecular clocklikeness. Mol Phylogenet Evol 2018; 128:192-202. [PMID: 30036699 DOI: 10.1016/j.ympev.2018.07.018] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/23/2017] [Revised: 07/11/2018] [Accepted: 07/17/2018] [Indexed: 11/30/2022]
Abstract
The use of genome-scale data to infer phylogenetic relationships has gained in popularity in recent years due to the progress made in target-gene capture and sequencing techniques. Data filtering, the approach of excluding data inconsistent with the model from analyses, presumably could alleviate problems caused by systematic errors in phylogenetic inference. Different data filtering criteria, such as those based on evolutionary rate and molecular clocklikeness as well as others, have been proposed for selecting useful phylogenetic markers, yet few studies have tested these criteria using phylogenomic data. We developed a novel set of single-copy nuclear coding markers to capture thousands of target genes in gobioid fishes, a species-rich lineage of vertebrates, and tested the effects of data-filtering methods based on substitution rate and molecular clocklikeness while attempting to control for the compounding effects of missing data and variation in locus length. We found that molecular clocklikeness was a better predictor than overall substitution rate for phylogenetic usefulness of molecular markers in our study. In addition, when the 100 best ranked loci for our predictors were concatenated and analyzed using maximum likelihood, or combined in a coalescent-based species-tree analysis, the resulting trees showed a well-resolved topology of Gobioidei that mostly agrees with previous studies. However, trees generated from the 100 least clocklike loci frequently recovered conflicting, and in some cases clearly erroneous, topologies with strong support, thus indicating strong systematic biases in those datasets. Collectively these results suggest that data filtering has the potential to improve the performance of phylogenetic inference when using both a concatenation approach as well as methods that rely on input from individual gene trees (i.e. coalescent species-tree approaches), which may be preferred in scenarios where incomplete lineage sorting is likely to be an issue.
Affiliation(s)
- Ting Kuang
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), China
- Luke Tornabene
  - School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98105, USA
- Jingyan Li
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), China
- Jiamei Jiang
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), China
- Prosanta Chakrabarty
  - Louisiana State University, Museum of Natural Science, Department of Biological Sciences, Baton Rouge, LA 70803, USA
- John S Sparks
  - American Museum of Natural History, Central Park West at 79th Street, NY, NY 10024, USA
- Chenhong Li
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), China
|
38
|
Abstract
The Cretaceous-Palaeogene (K-Pg) mass extinction is linked to the rapid emergence of ecologically divergent higher taxa (for example, families and orders) across terrestrial vertebrates, but its impact on the diversification of marine vertebrates is less clear. Spiny-rayed fishes (Acanthomorpha) provide an ideal system for exploring the effects of the K-Pg on fish diversification, yet despite decades of morphological and molecular phylogenetic efforts, resolution of both early diverging lineages and enormously diverse subclades remains problematic. Recent multilocus studies have provided the first resolved phylogenetic backbone for acanthomorphs and suggested novel relationships among major lineages. However, these new relationships and associated timescales have not been interrogated using phylogenomic approaches. Here, we use targeted enrichment of >1,000 ultraconserved elements in conjunction with a divergence time analysis to resolve relationships among 120 major acanthomorph lineages and provide a new timescale for acanthomorph radiation. Our results include a well-supported topology that strongly resolves relationships along the acanthomorph backbone and the recovery of several new relationships within six major percomorph subclades. Divergence time analyses also reveal that crown ages for five of these subclades, and for the bulk of the species diversity in the sixth, coincide with the K-Pg boundary, with divergences between anatomically and ecologically distinctive suprafamilial clades concentrated in the first 10 million years of the Cenozoic.
39
Barker FK. Molecular Phylogenetics of the Wrens and Allies (Passeriformes: Certhioidea), with Comments on the Relationships of Ferminia. American Museum Novitates 2017. [DOI: 10.1206/3887.1]
Affiliation(s)
- F. Keith Barker
- Department of Ecology, Evolution and Behavior and Bell Museum of Natural History, University of Minnesota
40
Janšta P, Cruaud A, Delvare G, Genson G, Heraty J, Křížková B, Rasplus J. Torymidae (Hymenoptera, Chalcidoidea) revised: molecular phylogeny, circumscription and reclassification of the family with discussion of its biogeography and evolution of life‐history traits. Cladistics 2017; 34:627-651. [DOI: 10.1111/cla.12228] [Accepted: 09/21/2017]
Affiliation(s)
- Petr Janšta
- Faculty of Science Department of Zoology Charles University Viničná 7 128 44 Prague 2 Czech Republic
- Astrid Cruaud
- CBGP, INRA, CIRAD, IRD Montpellier SupAgro Université de Montpellier Montpellier France
- Gérard Delvare
- CBGP, CIRAD Montpellier SupAgro INRA, IRD Université de Montpellier Montpellier France
- Guénaëlle Genson
- CBGP, INRA, CIRAD, IRD Montpellier SupAgro Université de Montpellier Montpellier France
- John Heraty
- Department of Entomology University of California Riverside CA 92521 USA
- Barbora Křížková
- Faculty of Science Department of Zoology Charles University Viničná 7 128 44 Prague 2 Czech Republic
- Jean‐Yves Rasplus
- CBGP, INRA, CIRAD, IRD Montpellier SupAgro Université de Montpellier Montpellier France
41
Urantowka AD, Kroczak A, Mackiewicz P. The influence of molecular markers and methods on inferring the phylogenetic relationships between the representatives of the Arini (parrots, Psittaciformes), determined on the basis of their complete mitochondrial genomes. BMC Evol Biol 2017; 17:166. [PMID: 28705202 PMCID: PMC5513162 DOI: 10.1186/s12862-017-1012-1] [Received: 10/03/2016] [Accepted: 07/04/2017]
Abstract
BACKGROUND Conures are a morphologically diverse group of Neotropical parrots classified as members of the tribe Arini, which has recently been subjected to a taxonomic revision. The previously broadly defined Aratinga genus of this tribe has been split into the 'true' Aratinga and three additional genera, Eupsittula, Psittacara and Thectocercus. Popular markers used in the reconstruction of the parrots' phylogenies derive from mitochondrial DNA. However, current phylogenetic analyses seem to indicate conflicting relationships between Aratinga and other conures, and also among other Arini members. Therefore, it is not clear if the mtDNA phylogenies can reliably define the species tree. The inconsistencies may result from the variable evolution rate of the markers used or their weak phylogenetic signal. To resolve these controversies and to assess to what extent the phylogenetic relationships in the tribe Arini can be inferred from mitochondrial genomes, we compared representative Arini mitogenomes as well as examined the usefulness of the individual mitochondrial markers and the efficiency of various phylogenetic methods. RESULTS Single molecular markers produced inconsistent tree topologies, while different methods offered various topologies even for the same marker. A significant disagreement in these tree topologies occurred for cytb, nd2 and nd6 genes, which are commonly used in parrot phylogenies. The strongest phylogenetic signal was found in the control region and RNA genes. However, these markers cannot be used alone in inferring Arini phylogenies because they do not provide fully resolved trees. The most reliable phylogeny of the parrots under study is obtained only on the concatenated set of all mitochondrial markers. The analyses established significantly resolved relationships within the former Aratinga representatives and the main genera of the tribe Arini. 
Such mtDNA phylogeny can be in agreement with the species tree, owing to its match with synapomorphic features in plumage colouration. CONCLUSIONS Phylogenetic relationships inferred from single mitochondrial markers can be incorrect and contradictory. Therefore, such phylogenies should be considered with caution. Reliable results can be produced by concatenated sets of all or at least the majority of mitochondrial genes and the control region. The results advance a new view on the relationships among the main genera of Arini and resolve the inconsistencies between the taxa that were previously classified as the broadly defined genus Aratinga. Although gene and species trees do not always have to be consistent, the mtDNA phylogenies for Arini can reflect the species tree.
Affiliation(s)
- Adam Dawid Urantowka
- Department of Genetics, Wroclaw University of Environmental and Life Sciences, ul. Kożuchowska 7, 51-631, Wroclaw, Poland
- Aleksandra Kroczak
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Fryderyka Joliot-Curie 14a, 50-383 Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Fryderyka Joliot-Curie 14a, 50-383 Wrocław, Poland
42
Dornburg A, Townsend JP, Brooks W, Spriggs E, Eytan RI, Moore JA, Wainwright PC, Lemmon A, Lemmon EM, Near TJ. New insights on the sister lineage of percomorph fishes with an anchored hybrid enrichment dataset. Mol Phylogenet Evol 2017; 110:27-38. [PMID: 28254474 DOI: 10.1016/j.ympev.2017.02.017] [Received: 04/12/2016] [Revised: 02/22/2017] [Accepted: 02/25/2017]
Abstract
Percomorph fishes represent over 17,100 species, including several model organisms and species of economic importance. Despite continuous advances in the resolution of the percomorph Tree of Life, resolution of the sister lineage to Percomorpha remains inconsistent but restricted to a small number of candidate lineages. Here we use an anchored hybrid enrichment (AHE) dataset of 132 loci with over 99,000 base pairs to identify the sister lineage of percomorph fishes. Initial analyses of this dataset failed to recover a strongly supported sister clade to Percomorpha, however, scrutiny of the AHE dataset revealed a bias towards high GC content at fast-evolving codon partitions (GC bias). By combining several existing approaches aimed at mitigating the impacts of convergence in GC bias, including RY coding and analyses of amino acids, we consistently recovered a strongly supported clade comprised of Holocentridae (squirrelfishes), Berycidae (Alfonsinos), Melamphaidae (bigscale fishes), Cetomimidae (flabby whalefishes), and Rondeletiidae (redmouth whalefishes) as the sister lineage to Percomorpha. Additionally, implementing phylogenetic informativeness (PI) based metrics as a filtration method yielded this same topology, suggesting PI based approaches will preferentially filter these fast-evolving regions and act in a manner consistent with other phylogenetic approaches aimed at mitigating GC bias. Our results provide a new perspective on a key issue for studies investigating the evolutionary history of more than one quarter of all living species of vertebrates.
Affiliation(s)
- Alex Dornburg
- North Carolina Museum of Natural Sciences, Raleigh, NC, USA.
- Jeffrey P Townsend
- Department of Ecology & Evolutionary Biology and Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Biostatistics, Yale University, New Haven, CT 06510, USA
- Willa Brooks
- North Carolina Museum of Natural Sciences, Raleigh, NC, USA
- Elizabeth Spriggs
- Department of Ecology & Evolutionary Biology and Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA
- Ron I Eytan
- Marine Biology Department, Texas A&M University at Galveston, Galveston, TX 77554, USA
- Jon A Moore
- Florida Atlantic University, Wilkes Honors College, Jupiter, FL 33458, USA; Florida Atlantic University, Harbor Branch Oceanographic Institution, Fort Pierce, FL 34946, USA
- Peter C Wainwright
- Department of Evolution & Ecology, University of California, Davis, CA 95616, USA
- Alan Lemmon
- Department of Scientific Computing, Florida State University, 400 Dirac Science Library, Tallahassee, FL 32306, USA
- Emily Moriarty Lemmon
- Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, FL 32306, USA
- Thomas J Near
- Department of Ecology & Evolutionary Biology and Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA; Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA
43
Shen XX, Salichos L, Rokas A. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference. Genome Biol Evol 2016; 8:2565-80. [PMID: 27492233 PMCID: PMC5010910 DOI: 10.1093/gbe/evw179] [Accepted: 07/25/2016]
Abstract
Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal and could be useful in guiding the choice of phylogenetic markers.
Affiliation(s)
- Xing-Xing Shen
- Department of Biological Sciences, Vanderbilt University
- Leonidas Salichos
- Department of Biological Sciences, Vanderbilt University; Department of Molecular Biophysics and Biochemistry, Yale University
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University
44
Fernández R, Edgecombe GD, Giribet G. Exploring Phylogenetic Relationships within Myriapoda and the Effects of Matrix Composition and Occupancy on Phylogenomic Reconstruction. Syst Biol 2016; 65:871-89. [PMID: 27162151 PMCID: PMC4997009 DOI: 10.1093/sysbio/syw041] [Received: 12/06/2015] [Accepted: 04/28/2016]
Abstract
Myriapods, including the diverse and familiar centipedes and millipedes, are one of the dominant terrestrial arthropod groups. Although molecular evidence has shown that Myriapoda is monophyletic, its internal phylogeny remains contentious and understudied, especially when compared to those of Chelicerata and Hexapoda. Until now, efforts have focused on taxon sampling (e.g., by including a handful of genes from many species) or on maximizing matrix size (e.g., by including hundreds or thousands of genes in just a few species), but a phylogeny maximizing sampling at both levels remains elusive. In this study, we analyzed 40 Illumina transcriptomes representing 3 of the 4 myriapod classes (Diplopoda, Chilopoda, and Symphyla); 25 transcriptomes were newly sequenced to maximize representation at the ordinal level in Diplopoda and at the family level in Chilopoda. Ten supermatrices were constructed to explore the effect of several potential phylogenetic biases (e.g., rate of evolution, heterotachy) at 3 levels of gene occupancy per taxon (50%, 75%, and 90%). Analyses based on maximum likelihood and Bayesian mixture models retrieved monophyly of each myriapod class, and resulted in 2 alternative phylogenetic positions for Symphyla, as sister group to Diplopoda + Chilopoda, or closer to Diplopoda, the latter hypothesis having been traditionally supported by morphology. Within centipedes, all orders were well supported, but 2 deep nodes remained in conflict in the different analyses despite dense taxon sampling at the family level. Relationships among centipede orders in all analyses conducted with the most complete matrix (90% occupancy) are at odds not only with the sparser but more gene-rich supermatrices (75% and 50% supermatrices) and with the matrices optimizing phylogenetic informativeness or most conserved genes, but also with previous hypotheses based on morphology, development, or other molecular data sets. 
Our results indicate that a high percentage of ribosomal proteins in the most complete matrices, in conjunction with distance from the root, can act in concert to compromise the estimated relationships within the ingroup. We discuss the implications of these findings in the context of the ever more prevalent quest for completeness in phylogenomic studies.
Affiliation(s)
- Rosa Fernández
- Museum of Comparative Zoology & Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Gregory D Edgecombe
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Gonzalo Giribet
- Museum of Comparative Zoology & Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
45
Izumitani HF, Kusaka Y, Koshikawa S, Toda MJ, Katoh T. Phylogeography of the Subgenus Drosophila (Diptera: Drosophilidae): Evolutionary History of Faunal Divergence between the Old and the New Worlds. PLoS One 2016; 11:e0160051. [PMID: 27462734 PMCID: PMC4962979 DOI: 10.1371/journal.pone.0160051] [Received: 03/18/2016] [Accepted: 07/13/2016]
Abstract
The current subgenus Drosophila (the traditional immigrans-tripunctata radiation) includes major elements of temperate drosophilid faunas in the northern hemisphere. Despite previous molecular phylogenetic analyses, the phylogeny of the subgenus Drosophila has not fully been resolved: the resulting trees have more or less varied in topology. One possible factor for such ambiguous results is taxon-sampling that has been biased towards New World species in previous studies. In this study, taxon sampling was balanced between Old and New World species, and phylogenetic relationships among 45 ingroup species selected from ten core species groups of the subgenus Drosophila were analyzed using nucleotide sequences of three nuclear and two mitochondrial genes. Based on the resulting phylogenetic tree, ancestral distributions and divergence times were estimated for each clade to test Throckmorton’s hypothesis that there was a primary, early-Oligocene disjunction of tropical faunas and a subsequent mid-Miocene disjunction of temperate faunas between the Old and the New Worlds that occurred in parallel in separate lineages of the Drosophilidae. Our results substantially support Throckmorton’s hypothesis of ancestral migrations via the Bering Land Bridge mainly from the Old to the New World, and subsequent vicariant divergence of descendants between the two Worlds occurred in parallel among different lineages of the subgenus Drosophila. However, our results also indicate that these events took place multiple times over a wider time range than Throckmorton proposed, from the late Oligocene to the Pliocene.
Affiliation(s)
- Hiroyuki F. Izumitani
- Department of Natural History Science, Graduate school of Science, Hokkaido University, Sapporo, Hokkaido, Japan
- Yohei Kusaka
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
- Shigeyuki Koshikawa
- The Hakubi Center for Advanced Research and Graduate School of Science, Kyoto University, Kyoto, Kyoto, Japan
- Masanori J. Toda
- Hokkaido University Museum, Hokkaido University, Sapporo, Hokkaido, Japan
- Toru Katoh
- Department of Biological Sciences, Faculty of Science, Hokkaido University, Sapporo, Hokkaido, Japan
46
Irisarri I, Meyer A. The Identification of the Closest Living Relative(s) of Tetrapods: Phylogenomic Lessons for Resolving Short Ancient Internodes. Syst Biol 2016; 65:1057-1075. [PMID: 27425642 DOI: 10.1093/sysbio/syw057] [Received: 11/23/2015] [Accepted: 06/08/2016]
Abstract
Identifying the closest living relative(s) of tetrapods is an important, yet still contested question in vertebrate phylogenetics. Three hypotheses are possible and ruling out alternatives has proven difficult even with large molecular data sets due to weak phylogenetic signal coupled with nonphylogenetic noise resulting from relatively rapid speciation events that occurred a long time ago (~400 Ma). Here, we revisit the identity of the closest living relative of land vertebrates from a phylogenomic perspective and include new genomic data for all extant lungfish genera. RNA-seq proves to be a great alternative to genomic sequencing, which currently is technically not feasible in lungfishes due to their huge (50-130 Gb) and repetitive genomes. We examined the most important sources of systematic error, namely long-branch attraction (LBA), compositional heterogeneity and distribution of missing data and applied different correction techniques. A multispecies coalescent approach is used to account for deep coalescence that might come from the short and deep internodes separating early sarcopterygian splits. Concatenation methods favored lungfishes as the closest living relatives of tetrapods with strong statistical support. Amino acid profile mixture models can unambiguously resolve this difficult internode thanks to their ability to avoid systematic error. We assessed the performance of different site-heterogeneous models and data partitioning and compared the ability of different strategies designed to overcome LBA, including taxon manipulation, reduction of among-lineage rate heterogeneity and removal of fast-evolving or compositionally heterogeneous positions. The identification of lungfish as sister group of tetrapods is robust regarding the effects of nonstationary composition and distribution of missing data. 
The multispecies coalescent method reconstructed strongly supported topologies that were congruent with concatenation, despite pervasive gene tree heterogeneity. We reject alternative topologies for early sarcopterygian relationships by increasing the signal-to-noise ratio in our alignments. The analytical pipeline outlined here combines probabilistic phylogenomic inference with methods for evaluating data quality, model adequacy, and assessing systematic error, and thus is likely to help resolve similarly difficult internodes in the tree of life. [Coalescence; coelacanth; compositional heterogeneity; gene tree; long-branch attraction; lungfish; missing data; model misspecification; phylogenomic; species tree; systematic error.].
Affiliation(s)
- Iker Irisarri
- Laboratory for Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78464 Konstanz, Germany
- Axel Meyer
- Laboratory for Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78464 Konstanz, Germany
47
Testing heterogeneous base composition as potential cause for conflicting phylogenetic signal between mitochondrial and nuclear DNA in the land snail genus Theba Risso 1826 (Gastropoda: Stylommatophora: Helicoidea). Org Divers Evol 2016. [DOI: 10.1007/s13127-016-0288-0]
48
Knudsen SW, Clements KD. World-wide species distributions in the family Kyphosidae (Teleostei: Perciformes). Mol Phylogenet Evol 2016; 101:252-266. [PMID: 27143240 DOI: 10.1016/j.ympev.2016.04.037] [Received: 01/26/2016] [Revised: 04/10/2016] [Accepted: 04/29/2016]
Abstract
Sea chubs of the family Kyphosidae are major consumers of macroalgae on both temperate and tropical reefs, where they can comprise a significant proportion of fish biomass. However, the relationships and taxonomic status of sea chubs (including the junior synonyms Hermosilla, Kyphosus, Neoscorpis and Sectator) worldwide have long been problematical due to perceived lack of character differentiation, complicating ecological assessment. More recently, the situation has been further complicated by publication of conflicting taxonomic treatments. Here, we resolve the relationships, taxonomy and distribution of all known species of sea chubs through a combined analysis of partial fragments from mitochondrial markers (12s, 16s, cytb, tRNA -Pro, -Phe, -Thr and -Val) and three nuclear markers (rag1, rag2, tmo4c4). These new results provide independent evidence for the presence of several junior synonyms among Atlantic and Indo-Pacific taxa, demonstrating that several sea chub species are more widespread than previously thought. In particular, our results can reject the hypothesis of endemic species in the Atlantic Ocean. At a higher taxonomic level, our results shed light on the relationships between Girellidae, Kuhliidae, Kyphosidae, Microcanthidae, Oplegnathidae and Scorpididae, with Scorpididae resolved as the sister group to Kyphosidae.
Affiliation(s)
- Kendall D Clements
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
49
Ciorpac M, Druică RC, Ghiorghiță G, Cojocaru D, Gorgan DL. CHD genes: a reliable marker for bird populations and phylogenetic analysis? Case study of the superfamily Sylvioidea (Aves: Passeriformes). Turk J Zool 2016. [DOI: 10.3906/zoo-1510-22]
50
Kawahara M, Koyama S, Iimura S, Yamazaki W, Tanaka A, Kohri N, Sasaki K, Takahashi M. Preimplantation death of xenomitochondrial mouse embryo harbouring bovine mitochondria. Sci Rep 2015; 5:14512. [PMID: 26416548 PMCID: PMC4586891 DOI: 10.1038/srep14512] [Received: 02/16/2015] [Accepted: 09/02/2015]
Abstract
Mitochondria, cellular organelles playing essential roles in eukaryotic cell metabolism, are thought to have evolved from bacteria. The organization of mtDNA is remarkably uniform across species, reflecting its vital and conserved role in oxidative phosphorylation (OXPHOS). Our objectives were to evaluate the compatibility of xenogeneic mitochondria in the development of preimplantation embryos in mammals. Mouse embryos harbouring bovine mitochondria (mtB-M embryos) were prepared by the cell-fusion technique employing the haemagglutinating virus of Japan (HVJ). The mtB-M embryos showed developmental delay at embryonic days (E) 3.5 after insemination. Furthermore, none of the mtB-M embryos could implant into the maternal uterus after embryo transfer, whereas control mouse embryos into which mitochondria from another mouse had been transferred developed as well as did non-manipulated embryos. When we performed quantitative PCR (qPCR) of mouse and bovine ND5, we found that the mtB-M embryos contained 8.3% of bovine mitochondria at the blastocyst stage. Thus, contamination with mitochondria from another species induces embryonic lethality prior to implantation into the maternal uterus. The heteroplasmic state of these xenogeneic mitochondria could have detrimental effects on preimplantation development, leading to preservation of species-specific mitochondrial integrity in mammals.
Affiliation(s)
- Manabu Kawahara
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Shiori Koyama
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Satomi Iimura
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Wataru Yamazaki
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Aiko Tanaka
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Nanami Kohri
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Keisuke Sasaki
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan
- Masashi Takahashi
- Laboratory of Animal Breeding and Reproduction, Research Faculty of Agriculture, Hokkaido University, Kita-ku Kita 9 Nishi 9, Sapporo 060-8589, Japan