1
Tims AR, Unmack PJ, Hammer MP, Brown C, Adams M, McGee MD. Museum Genomics Reveals the Hybrid Origin of an Extinct Crater Lake Endemic. Syst Biol 2024; 73:506-520. [PMID: 38597146 PMCID: PMC11377190 DOI: 10.1093/sysbio/syae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 03/13/2024] [Accepted: 04/09/2024] [Indexed: 04/11/2024] Open
Abstract
Crater lake fishes are common evolutionary model systems, with recent studies suggesting a key role for gene flow in promoting rapid adaptation and speciation. However, the study of these young lakes can be complicated by human-mediated extinctions. Museum genomics approaches integrating genetic data from recently extinct species are, therefore, critical to understanding the complex evolutionary histories of these fragile systems. Here, we examine the evolutionary history of an extinct Southern Hemisphere crater lake endemic, the rainbowfish Melanotaenia eachamensis. We undertook a comprehensive sampling of extant rainbowfish populations of the Atherton Tablelands of Australia alongside historical museum material to understand the evolutionary origins of the extinct crater lake population and the dynamics of gene flow across the ecoregion. The extinct crater lake species is genetically distinct from all other nearby populations due to historic introgression between 2 proximate riverine lineages, similar to other prominent crater lake speciation systems, but this historic gene flow has not been sufficient to induce a species flock. Our results suggest that museum genomics approaches can be successfully combined with extant sampling to unravel complex speciation dynamics involving recently extinct species.
Affiliation(s)
- Amy R Tims
  - School of Biological Sciences, Monash University, Melbourne, Victoria 3800, Australia
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Peter J Unmack
  - School of Biological Sciences, Monash University, Melbourne, Victoria 3800, Australia
  - Centre for Applied Water Science, Institute for Applied Ecology, University of Canberra, Australian Capital Territory 2601, Australia
- Michael P Hammer
  - Museum and Art Gallery of the Northern Territory, Darwin, Northern Territory 0801, Australia
- Culum Brown
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Mark Adams
  - Evolutionary Biology Unit, South Australian Museum, North Terrace, Adelaide, South Australia 5000, Australia
  - School of Biological Sciences, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Matthew D McGee
  - School of Biological Sciences, Monash University, Melbourne, Victoria 3800, Australia
2
Rodríguez-Machado S, Elías DJ, McMahan CD, Gruszkiewicz-Tolli A, Piller KR, Chakrabarty P. Disentangling historical relationships within Poeciliidae (Teleostei: Cyprinodontiformes) using ultraconserved elements. Mol Phylogenet Evol 2024; 190:107965. [PMID: 37977500 DOI: 10.1016/j.ympev.2023.107965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/18/2023] [Accepted: 11/12/2023] [Indexed: 11/19/2023]
Abstract
Poeciliids (Cyprinodontiformes: Poeciliidae), commonly known as livebearers, are popular fishes in the aquarium trade (e.g., guppies, mollies, swordtails) that are widely distributed in the Americas, with 274 valid species in 27 genera. This group has undergone various taxonomic changes recently, spurred by investigations using traditional genetic markers. Here we used over 1,000 ultraconserved loci to infer the relationships within Poeciliidae in the first attempt at understanding their diversification based on genome-scale data. We explore gene tree discordance and investigate potential incongruence between concatenation and coalescent inference methods. Our aim is to examine the influence of incomplete lineage sorting and reticulate evolution on the poeciliids' evolutionary history and how these factors contribute to the observed gene tree discordance. Our concatenated and coalescent phylogenomic inferences recovered four major clades within Poeciliidae. Most supra-generic level relationships we inferred were congruent with previous molecular studies, but we found some disagreements; the Middle American taxa Phallichthys and Poecilia (Mollienesia) were recovered as non-monophyletic, and unlike other recent molecular studies, we recovered Brachyrhaphis as monophyletic. Our study is the first to provide signatures of reticulate evolution in Poeciliidae at the family level; however, continued finer-scale investigations are needed to understand the complex evolutionary history of the family along with a much-needed taxonomic re-evaluation.
Affiliation(s)
- Sheila Rodríguez-Machado
  - Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, United States
- Diego J Elías
  - Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, United States
  - Field Museum of Natural History, Chicago, IL 60605, United States
- Caleb D McMahan
  - Field Museum of Natural History, Chicago, IL 60605, United States
- Anna Gruszkiewicz-Tolli
  - Department of Biological Sciences, Southeastern Louisiana University, Hammond, LA 70402, United States
- Kyle R Piller
  - Department of Biological Sciences, Southeastern Louisiana University, Hammond, LA 70402, United States
- Prosanta Chakrabarty
  - Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, United States
3
Hunt EP, Willis SC, Conway KW, Portnoy DS. Interrelationships and biogeography of the New World pufferfish genus Sphoeroides (Tetraodontiformes: Tetraodontidae) inferred using ultra-conserved DNA elements. Mol Phylogenet Evol 2023; 189:107935. [PMID: 37778529 DOI: 10.1016/j.ympev.2023.107935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 09/27/2023] [Accepted: 09/28/2023] [Indexed: 10/03/2023]
Abstract
Colonization of the New World by marine taxa has been hypothesized to have occurred through the Tethys Sea or by crossing the East Pacific Barrier. To better understand patterns and timing of diversification, geological events can be coupled with time-calibrated phylogenetic hypotheses to infer major drivers of diversification. Phylogenetic relationships among members of Sphoeroides, a genus of four-toothed pufferfishes (Tetraodontiformes: Tetraodontidae) found nearly exclusively in the New World (eastern Pacific and western Atlantic), were reconstructed using sequences from ultra-conserved DNA elements, nuclear markers with clear homology among many vertebrate taxa. Hypotheses derived from concatenated maximum-likelihood and species tree summary methods support a paraphyletic Sphoeroides, with Colomesus deeply nested within the genus. Analyses also revealed S. pachygaster, a pelagic species with a cosmopolitan distribution, as the sister taxon to the remainder of Sphoeroides and recovered distinct lineages within S. pachygaster, indicating that this cosmopolitan species may represent a species complex. Ancestral range reconstruction suggests that the genus may have colonized the New World through the eastern Pacific before diversifying in the western Atlantic, though date estimates for these events are uncertain due to the lack of a reliable fossil record for the genus.
Affiliation(s)
- Elizabeth P Hunt
  - Department of Life Sciences, Texas A&M University - Corpus Christi, 6300 Ocean Dr., Corpus Christi, TX 78412, USA
- Stuart C Willis
  - Department of Life Sciences, Texas A&M University - Corpus Christi, 6300 Ocean Dr., Corpus Christi, TX 78412, USA
  - Columbia River Inter-Tribal Fish Commission - Hagerman Genetics Lab, 3059-F National Fish Hatchery Road, Hagerman, ID 83332, USA
- Kevin W Conway
  - Department of Ecology and Conservation Biology and Biodiversity Research and Teaching Collections, Texas A&M University, 534 John Kimbrough Blvd., College Station, TX 77843, USA
- David S Portnoy
  - Department of Life Sciences, Texas A&M University - Corpus Christi, 6300 Ocean Dr., Corpus Christi, TX 78412, USA
4
Han Y, Molloy EK. Quartets enable statistically consistent estimation of cell lineage trees under an unbiased error and missingness model. Algorithms Mol Biol 2023; 18:19. [PMID: 38041123 PMCID: PMC10691101 DOI: 10.1186/s13015-023-00248-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 11/19/2023] [Indexed: 12/03/2023] Open
Abstract
Cancer progression and treatment can be informed by reconstructing its evolutionary history from tumor cells. Although many methods exist to estimate evolutionary trees (called phylogenies) from molecular sequences, traditional approaches assume the input data are error-free and the output tree is fully resolved. These assumptions are challenged in tumor phylogenetics because single-cell sequencing produces sparse, error-ridden data and because tumors evolve clonally. Here, we study the theoretical utility of methods based on quartets (four-leaf, unrooted phylogenetic trees) in light of these barriers. We consider a popular tumor phylogenetics model, in which mutations arise on a (highly unresolved) tree and then (unbiased) errors and missing values are introduced. Quartets are then implied by mutations present in two cells and absent from two cells. Our main result is that the most probable quartet identifies the unrooted model tree on four cells. This motivates seeking a tree such that the number of quartets shared between it and the input mutations is maximized. We prove an optimal solution to this problem is a consistent estimator of the unrooted cell lineage tree; this guarantee includes the case where the model tree is highly unresolved, with error defined as the number of false negative branches. Lastly, we outline how quartet-based methods might be employed when there are copy number aberrations and other challenges specific to tumor phylogenetics.
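The statement that quartets are implied by mutations present in two cells and absent from two cells can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input layout (a dictionary mapping each cell to a list of 1 = present, 0 = absent, None = missing) and the function names `quartets_from_mutations` and `most_frequent_quartet` are assumptions made for this illustration only.

```python
from collections import Counter
from itertools import combinations


def quartets_from_mutations(matrix):
    """Count the quartets ab|cd implied by each mutation.

    `matrix` (hypothetical layout) maps a cell name to a list with one
    entry per mutation: 1 = present, 0 = absent, None = missing.
    A mutation implies ab|cd when it is present in a and b and absent
    from c and d; missing entries imply nothing.
    """
    cells = sorted(matrix)
    n_mutations = len(next(iter(matrix.values())))
    counts = Counter()
    for j in range(n_mutations):
        present = [c for c in cells if matrix[c][j] == 1]
        absent = [c for c in cells if matrix[c][j] == 0]
        for pair_present in combinations(present, 2):
            for pair_absent in combinations(absent, 2):
                # Store the quartet as an unordered pair of unordered pairs.
                quartet = frozenset([frozenset(pair_present), frozenset(pair_absent)])
                counts[quartet] += 1
    return counts


def most_frequent_quartet(counts, four_cells):
    """Return the most frequently implied quartet on `four_cells`, or None."""
    four = frozenset(four_cells)
    candidates = {q: n for q, n in counts.items() if frozenset().union(*q) == four}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy data: four cells, four mutations (made up for illustration).
matrix = {
    "A": [1, 1, 0, None],
    "B": [1, 1, 0, 1],
    "C": [0, 0, 1, 0],
    "D": [0, 1, 1, 0],
}
counts = quartets_from_mutations(matrix)
best = most_frequent_quartet(counts, ["A", "B", "C", "D"])
```

In this toy input only the first and third mutations have two cells with the mutation and two without, and both imply the same quartet, AB|CD. The sketch only tallies quartets; it does not search for the tree that maximizes the number of shared quartets.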
Affiliation(s)
- Yunheng Han
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Erin K Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
  - University of Maryland Institute for Advanced Computer Studies, College Park, MD, USA
5
Pereira DS, Hilário S, Gonçalves MFM, Phillips AJL. Diaporthe Species on Palms: Molecular Re-Assessment and Species Boundaries Delimitation in the D. arecae Species Complex. Microorganisms 2023; 11:2717. [PMID: 38004729 PMCID: PMC10673533 DOI: 10.3390/microorganisms11112717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 10/25/2023] [Accepted: 11/03/2023] [Indexed: 11/26/2023] Open
Abstract
Due to cryptic diversification, phenotypic plasticity and host associations, multilocus phylogenetic analyses have become the most important tool in accurately identifying and circumscribing species in the Diaporthe genus. However, the application of the genealogical concordance criterion has often been overlooked, ultimately leading to an exponential increase in novel Diaporthe spp. Due to the large number of species, many lineages remain poorly understood under the so-called species complexes. For this reason, a robust delimitation of the species boundaries in Diaporthe is still an ongoing challenge. Therefore, the present study aimed to resolve the species boundaries of the Diaporthe arecae species complex (DASC) by implementing an integrative taxonomic approach. The Genealogical Concordance Phylogenetic Species Recognition (GCPSR) principle revealed incongruences between the individual gene genealogies. Moreover, the Poisson Tree Processes (PTP) coalescent-based species delimitation models identified three well-delimited subclades represented by the species D. arecae, D. chiangmaiensis and D. smilacicola. These results provide evidence that all species previously described in the D. arecae subclade are conspecific, which is consistent with the morphological indistinctiveness observed and the absence of reproductive isolation and barriers to gene flow. Thus, 52 Diaporthe spp. are reduced to synonymy under D. arecae. Recent population expansion and the possibility of incomplete lineage sorting suggested that the D. arecae subclade may be considered as ongoing evolving lineages under active divergence and speciation. Hence, the genetic diversity and intraspecific variability of D. arecae in the context of current global climate change and the role of D. arecae as a pathogen on palm trees and other hosts are also discussed. This study illustrates that species in Diaporthe are highly overestimated, and highlights the relevance of applying an integrative taxonomic approach to accurately circumscribe the species boundaries in the genus Diaporthe.
Affiliation(s)
- Diana S. Pereira
  - Faculdade de Ciências, Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
- Sandra Hilário
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR), Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Faculty of Sciences, Biology Department, University of Porto, Rua do Campo Alegre, Edifício FC4, 4169-007 Porto, Portugal
- Micael F. M. Gonçalves
  - Faculty of Sciences, Biology Department, University of Porto, Rua do Campo Alegre, Edifício FC4, 4169-007 Porto, Portugal
  - Centre for Environmental and Marine Studies, Department of Biology, Campus Universitário de Santiago, University of Aveiro, 3810-193 Aveiro, Portugal
- Alan J. L. Phillips
  - Faculdade de Ciências, Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
6
Ma X, Shi X, Wang Q, Zhao M, Zhang Z, Zhong B. A Reinvestigation of Multiple Independent Evolution and Triassic-Jurassic Origins of Multicellular Volvocine Algae. Genome Biol Evol 2023; 15:evad142. [PMID: 37498572 PMCID: PMC10410301 DOI: 10.1093/gbe/evad142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 07/09/2023] [Accepted: 07/22/2023] [Indexed: 07/28/2023] Open
Abstract
The evolution of multicellular organisms is considered to be a major evolutionary transition, profoundly affecting the ecology and evolution of nearly all life on earth. The volvocine algae, a unique clade of chlorophytes with diverse cell morphology, provide an appealing model for investigating the evolution of multicellularity and development. However, the phylogenetic relationships and timescale of the volvocine algae are not fully resolved. Here, we use extensive taxon and gene sampling to reconstruct the phylogeny of the volvocine algae. Our results support that the colonial volvocine algae are not a monophyletic group and that multicellularity evolved independently at least twice in the volvocine algae, once in Tetrabaenaceae and once in the Goniaceae + Volvocaceae. The simulation analyses suggest that incomplete lineage sorting is a major factor in the tree topology discrepancy, which implies that the multispecies coalescent model better fits the data used in this study. The coalescent-based species tree supports that the Goniaceae is monophyletic and that Crucicarteria is the earliest diverging lineage, followed by Hafniomonas and Radicarteria within the Volvocales. By considering the multiple uncertainties in divergence time estimation, the dating analyses indicate that the volvocine algae originated during the Cryogenian to Ediacaran (696.6-551.1 Ma) and that multicellularity in the volvocine algae originated in the Triassic to Jurassic. Our phylogeny and timeline provide an evolutionary framework for studying the evolution of key traits and the origin of multicellularity in the volvocine algae.
Affiliation(s)
- Xiaoya Ma
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xuan Shi
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
- Qiuping Wang
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
- Mengru Zhao
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
- Zhenhua Zhang
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
- Bojian Zhong
  - College of Life Sciences, Nanjing Normal University, Nanjing, China
7
Scheunert A, Lautenschlager U, Ott T, Oberprieler C. Nano-Strainer: A workflow for the identification of single-copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads. Ecol Evol 2023; 13:e10190. [PMID: 37475726 PMCID: PMC10354226 DOI: 10.1002/ece3.10190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 05/18/2023] [Accepted: 06/01/2023] [Indexed: 07/22/2023] Open
Abstract
In modern plant systematics, target enrichment enables simultaneous analysis of hundreds of genes. However, when dealing with reticulate or polyploidization histories, few markers may suffice, but often are required to be single-copy, a condition that is not necessarily met with commercial capture kits. Also, large genome sizes can render target capture ineffective, so that amplicon sequencing would be preferable; however, knowledge about suitable loci is often missing. Here, we present a comprehensive workflow for the identification of putative single-copy nuclear markers in a genus of interest, by mining a small dataset from target capture using a few representative taxa. The proposed pipeline assesses sequence variability contained in the data from targeted loci and assigns reads to their respective genes, via a combined BLAST/clustering procedure. Cluster consensus sequences are then examined based on four pre-defined criteria presumably indicative of the absence of paralogy. This is done by calculating four specialized indices; loci are ranked according to their performance in these indices, and top-scoring loci are considered putatively single- or low-copy. The approach can be applied to any probe set. As it relies on long reads, the present contribution also provides template workflows for processing Nanopore-based target capture data. Obtained markers are further tested and then entered into amplicon sequencing. For the detection of possibly remaining paralogy in these data, which might occur in groups with rampant paralogy, we also employ the long-read assembly tool canu. In diploid representatives of the young Compositae genus Leucanthemum, characterized by high levels of polyploidy, our approach resulted in successful amplification of 13 loci. Modifications to remove traces of paralogy were made in seven of these. A species tree from the markers correctly reproduced main relationships in the genus, albeit at low resolution. The presented workflow has the potential to valuably support phylogenetic research, for example in polyploid plant groups.
Affiliation(s)
- Agnes Scheunert
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Ulrich Lautenschlager
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Tankred Ott
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Christoph Oberprieler
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
8
Lopes GP, Rohe F, Bertuol F, Polo E, Lima IJ, Valsecchi J, Santos TCM, Nash SD, da Silva MNF, Boubli JP, Farias IP, Hrbek T. Taxonomic review of Saguinus mystax (Spix, 1823) (Primates, Callitrichidae), and description of a new species. PeerJ 2023; 11:e14526. [PMID: 36647446 PMCID: PMC9840391 DOI: 10.7717/peerj.14526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Accepted: 11/15/2022] [Indexed: 01/13/2023] Open
Abstract
Although the Amazon has the greatest diversity of primates, there are still taxonomic uncertainties for many taxa, such as the species of the Saguinus mystax group. The most geographically broadly distributed and phenotypically diverse species in this group is S. mystax, and its phenotypic diversity has been recognized as three subspecies (S. mystax mystax, S. mystax pileatus and S. mystax pluto) with non-overlapping geographic distributions. In this context, we carried out an extensive field survey in their distribution areas and used a framework of taxonomic hypothesis testing of genomic data combined with an integrative taxonomic decision-making framework to carry out a taxonomic revision of S. mystax. Our tests supported the existence of three lineages/species. The first species corresponds to Saguinus mystax mystax from the left bank of the Juruá River, which was raised to the species level, and we also discovered and described animals from the Juruá-Tefé interfluve previously attributed to S. mystax mystax as a new species. The subspecies S. m. pileatus and S. m. pluto are recognized as a single species, under a new nomenclatural combination. However, given their phenotypic distinction and allopatric distribution, they potentially are a manifestation of an early stage of speciation, and therefore we maintain their subspecific designations.
Affiliation(s)
- Gerson Paulino Lopes
  - Programa em Pós-Graduação em Zoologia, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Grupo de Pesquisa em Ecologia e Conservação de Primatas, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Grupo de Pesquisa em Ecologia de Vertebrados Terrestres, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
- Fábio Rohe
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Programa de Pós-Graduação em Genética, Conservação e Biologia Evolutiva, Instituto Nacional de Pesquisas da Amazônia, Manaus, Amazonas, Brazil
- Fabrício Bertuol
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
- Erico Polo
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
- Ivan Junqueira Lima
  - Grupo de Pesquisa em Ecologia de Vertebrados Terrestres, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
  - Programa de Pós-Graduação em Ecologia Aplicada, Universidade Federal de Lavras, Lavras, Minas Gerais, Brazil
- João Valsecchi
  - Grupo de Pesquisa em Ecologia e Conservação de Primatas, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
  - Grupo de Pesquisa em Ecologia de Vertebrados Terrestres, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
  - Rede de Pesquisa em Diversidade, Conservação e Uso da Fauna da Amazônia, Manaus, Amazonas, Brazil
  - Comunidad de Manejo de Fauna Silvestre en América Latina, Iquitos, Peru
- Tamily Carvalho Melo Santos
  - Grupo de Pesquisa em Ecologia de Vertebrados Terrestres, Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
- Stephen D. Nash
  - Department of Anatomical Sciences/Health Sciences Center, Stony Brook University, New York, United States of America
- Jean P. Boubli
  - Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil
  - School of Science, Engineering and the Environment, University of Salford, Salford, United Kingdom
- Izeni Pires Farias
  - Programa em Pós-Graduação em Zoologia, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
- Tomas Hrbek
  - Programa em Pós-Graduação em Zoologia, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Laboratório de Evolução e Genética Animal/Departamento de Genética/Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Amazonas, Brazil
  - Department of Biology, Trinity University, San Antonio, Texas, United States
9
Morel B, Williams TA, Stamatakis A. Asteroid: a new algorithm to infer species trees from gene trees under high proportions of missing data. Bioinformatics 2022; 39:6964379. [PMID: 36576010 PMCID: PMC9838317 DOI: 10.1093/bioinformatics/btac832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 12/12/2022] [Accepted: 12/26/2022] [Indexed: 12/29/2022] Open
Abstract
MOTIVATION: Missing data and incomplete lineage sorting (ILS) are two major obstacles to accurate species tree inference. Gene tree summary methods such as ASTRAL and ASTRID have been developed to account for ILS. However, they can be severely affected by high levels of missing data. RESULTS: We present Asteroid, a novel algorithm that infers an unrooted species tree from a set of unrooted gene trees. We show on both empirical and simulated datasets that Asteroid is substantially more accurate than ASTRAL and ASTRID for very high proportions (>80%) of missing data. Asteroid is several orders of magnitude faster than ASTRAL for datasets that contain thousands of genes. It offers advanced features such as parallelization, support value computation and support for multi-copy and multifurcating gene trees. AVAILABILITY AND IMPLEMENTATION: Asteroid is freely available at https://github.com/BenoitMorel/Asteroid. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol BS8, UK
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg 69118, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe 76131, Germany
10
Mahbub S, Sawmya S, Saha A, Reaz R, Rahman MS, Bayzid MS. Quartet Based Gene Tree Imputation Using Deep Learning Improves Phylogenomic Analyses Despite Missing Data. J Comput Biol 2022; 29:1156-1172. [PMID: 36048555 DOI: 10.1089/cmb.2022.0212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. Incomplete gene trees can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of imputing the quartet distribution induced by a set of incomplete gene trees, which involves adding the missing quartets back to the quartet distribution. We present Quartet based Gene tree Imputation using Deep Learning (QT-GILD), an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing, which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly. QT-GILD is a general-purpose technique needing no explicit modeling of the subject system or reasons for missing data or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical datasets suggest that QT-GILD can effectively impute the quartet distribution, which results in a dramatic improvement in the species tree accuracy. Remarkably, QT-GILD not only imputes the missing quartets but can also account for gene tree estimation error. Therefore, QT-GILD advances the state-of-the-art in species tree estimation from gene trees in the face of missing data.
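The notion of a quartet distribution induced by incomplete gene trees can be made concrete with a short Python sketch. It only tallies the quartets that the gene trees actually display and does not perform any imputation, so it is not QT-GILD. The nested-tuple tree encoding and the function names `clusters` and `quartet_distribution` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def clusters(tree):
    """Return (leaf set, leaf sets of all internal subtrees) of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clusters(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def quartet_distribution(gene_trees):
    """Count, over all gene trees, how often each quartet ab|cd is displayed.

    A tree displays ab|cd when some subtree contains exactly a and b
    out of the four taxa. Taxa missing from a gene tree contribute no
    quartets for that tree, which is what leaves the distribution incomplete.
    """
    counts = Counter()
    for tree in gene_trees:
        leaves, found = clusters(tree)
        displayed = set()
        for four in combinations(sorted(leaves), 4):
            taxa = frozenset(four)
            for cluster in found:
                inside = taxa & cluster
                if len(inside) == 2:
                    displayed.add(frozenset([inside, taxa - inside]))
                    break
        counts.update(displayed)
    return counts


# Toy gene trees (made up): the first lacks taxon E, the third has only three taxa.
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "C"), "B"),
]
distribution = quartet_distribution(gene_trees)
```

Here every quartet that involves taxon E is supported by a single gene tree, because E is absent from the other two; this is the kind of gap in the quartet distribution that the abstract describes filling in.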
Affiliation(s)
- Sazan Mahbub
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
  - Department of Computer Science, University of Maryland, College Park, Maryland, USA
- Shashata Sawmya
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Arpita Saha
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Rezwana Reaz
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- M Sohel Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
11
Astudillo-Clavijo V, Stiassny MLJ, Ilves KL, Musilova Z, Salzburger W, López-Fernández H. Exon-based phylogenomics and the relationships of African cichlid fishes: tackling the challenges of reconstructing phylogenies with repeated rapid radiations. Syst Biol 2022; 72:134-149. [PMID: 35880863 DOI: 10.1093/sysbio/syac051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/12/2021] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 11/13/2022] Open
Abstract
African cichlids (subfamily: Pseudocrenilabrinae) are among the most diverse vertebrates, and their propensity for repeated rapid radiation has made them a celebrated model system in evolutionary research. Nonetheless, despite numerous studies, phylogenetic uncertainty persists, and riverine lineages remain comparatively underrepresented in higher-level phylogenetic studies. Heterogeneous gene histories resulting from incomplete lineage sorting (ILS) and hybridization are likely sources of uncertainty, especially during episodes of rapid speciation. We investigate relationships of Pseudocrenilabrinae and its close relatives while accounting for multiple sources of genetic discordance using species tree and hybrid network analyses with hundreds of single-copy exons. We improve sequence recovery for distant relatives, thereby extending the taxonomic reach of our probes, with a hybrid reference guided/de novo assembly approach. Our analyses provide robust hypotheses for most higher-level relationships and reveal widespread gene heterogeneity, including in riverine taxa. ILS and past hybridization are identified as sources of genetic discordance in different lineages. Sampling of various Blenniiformes (formerly Ovalentaria) adds strong phylogenomic support for convict blennies (Pholidichthyidae) as sister to Cichlidae, and points to other potentially useful protein-coding markers across the order. A reliable phylogeny with representatives from diverse environments will support ongoing taxonomic and comparative evolutionary research in the cichlid model system.
Affiliation(s)
- Viviana Astudillo-Clavijo
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada; Department of Natural History, Royal Ontario Museum, Toronto, M5S 2C6, Canada; Department of Ecology and Evolutionary Biology and Museum of Zoology, University of Michigan, Ann Arbor, 48109, USA
- Melanie L J Stiassny
- Department of Ichthyology, American Museum of Natural History, New York, 10024-5102, USA
- Katriina L Ilves
- Research & Collections, Zoology, Canadian Museum of Nature, Ottawa, K1P 6P4, Canada
- Zuzana Musilova
- Department of Zoology, Charles University in Prague, Vinicna 7, Prague, CZ-128 44, Czech Republic
- Walter Salzburger
- Zoological Institute, University of Basel, Vesalgasse 1, CH-4051, Basel, Switzerland
- Hernán López-Fernández
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada; Department of Natural History, Royal Ontario Museum, Toronto, M5S 2C6, Canada; Department of Ecology and Evolutionary Biology and Museum of Zoology, University of Michigan, Ann Arbor, 48109, USA
|
12
|
Gatesy J, Springer MS. Phylogenomic Coalescent Analyses of Avian Retroelements Infer Zero-Length Branches at the Base of Neoaves, Emergent Support for Controversial Clades, and Ancient Introgressive Hybridization in Afroaves. Genes (Basel) 2022; 13:1167. [PMID: 35885951 PMCID: PMC9324441 DOI: 10.3390/genes13071167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 01/25/2023]
Abstract
Retroelement insertions (RIs) are low-homoplasy characters that are ideal data for addressing deep evolutionary radiations, where gene tree reconstruction errors can severely hinder phylogenetic inference with DNA and protein sequence data. Phylogenomic studies of Neoaves, a large clade of birds (>9000 species) that first diversified near the Cretaceous–Paleogene boundary, have yielded an array of robustly supported, contradictory relationships among deep lineages. Here, we reanalyzed a large RI matrix for birds using recently proposed quartet-based coalescent methods that enable inference of large species trees including branch lengths in coalescent units, clade-support, statistical tests for gene flow, and combined analysis with DNA-sequence-based gene trees. Genome-scale coalescent analyses revealed extremely short branches at the base of Neoaves, meager branch support, and limited congruence with previous work at the most challenging nodes. Despite widespread topological conflicts with DNA-sequence-based trees, combined analyses of RIs with thousands of gene trees show emergent support for multiple higher-level clades (Columbea, Passerea, Columbimorphae, Otidimorphae, Phaethoquornithes). RIs express asymmetrical support for deep relationships within the subclade Afroaves that hints at ancient gene flow involving the owl lineage (Strigiformes). Because DNA-sequence data are challenged by gene tree-reconstruction error, analysis of RIs represents one approach for improving gene tree-based methods when divergences are deep, internodes are short, terminal branches are long, and introgressive hybridization further confounds species-tree inference.
Affiliation(s)
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Mark S. Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA
|
13
|
Willson J, Roddur MS, Liu B, Zaharias P, Warnow T. DISCO: Species Tree Inference using Multicopy Gene Family Tree Decomposition. Syst Biol 2022; 71:610-629. [PMID: 34450658 PMCID: PMC9016570 DOI: 10.1093/sysbio/syab070] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/24/2021] [Revised: 08/18/2021] [Accepted: 08/23/2021] [Indexed: 11/21/2022]
Abstract
Species tree inference from gene family trees is a significant problem in computational biology. However, gene tree heterogeneity, which can be caused by several factors including gene duplication and loss, makes the estimation of species trees very challenging. While there have been several species tree estimation methods introduced in recent years to specifically address gene tree heterogeneity due to gene duplication and loss (such as DupTree, FastMulRFS, ASTRAL-Pro, and SpeciesRax), many incur high cost in terms of both running time and memory. We introduce a new approach, DISCO, that decomposes the multi-copy gene family trees into many single-copy trees, which allows for methods previously designed for species tree inference in a single-copy gene tree context to be used. We prove that using DISCO with ASTRAL (i.e., ASTRAL-DISCO) is statistically consistent under the GDL model, provided that ASTRAL-Pro correctly roots and tags each gene family tree. We evaluate DISCO paired with different methods for estimating species trees from single-copy genes (e.g., ASTRAL, ASTRID, and IQ-TREE) under a wide range of model conditions, and establish that high accuracy can be obtained even when ASTRAL-Pro is not able to correctly root and tag the gene family trees. We also compare results using MI, an alternative decomposition strategy from Yang Y. and Smith S.A. (2014), and find that DISCO provides better accuracy, most likely as a result of covering more of the gene family tree leafset in the output decomposition. [Concatenation analysis; gene duplication and loss; species tree inference; summary method.]
Affiliation(s)
- James Willson
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Mrinmoy Saha Roddur
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Baqiao Liu
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Paul Zaharias
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
14
|
Mai U, Mirarab S. Completing gene trees without species trees in sub-quadratic time. Bioinformatics 2022; 38:1532-1541. [PMID: 34978565 DOI: 10.1093/bioinformatics/btab875] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/23/2021] [Revised: 11/27/2021] [Accepted: 12/30/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it will be beneficial to algorithmically complete gene trees. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, those problems have not been studied in depth for the case where such a reference species tree is lacking. RESULTS We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulation studies and on a large empirical dataset, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased. AVAILABILITY AND IMPLEMENTATION Our method, tripVote, is available at https://github.com/uym2/tripVote. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Uyen Mai
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA 92093, USA
- Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
|
15
|
Liu B, Warnow T. Scalable Species Tree Inference with External Constraints. J Comput Biol 2022; 29:664-678. [PMID: 35196115 DOI: 10.1089/cmb.2021.0543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Species tree inference is a basic step in biological discovery, but discordance between gene trees creates analytical challenges and large data sets create computational challenges. Although there is generally some information available about the species trees that could be used to speed up the estimation, only one species tree estimation method that addresses gene tree discordance (ASTRAL-J, a recent development in the ASTRAL family of methods) is able to use this information. Here we describe two new methods, NJst-J and FASTRAL-J, that can estimate the species tree, given a partial knowledge of the species tree in the form of a nonbinary unrooted constraint tree. We show that both NJst-J and FASTRAL-J are much faster than ASTRAL-J and we prove that all three methods are statistically consistent under the multispecies coalescent model subject to this constraint. Our extensive simulation study shows that both FASTRAL-J and NJst-J provide advantages over ASTRAL-J: both are faster (and NJst-J is particularly fast), and FASTRAL-J is generally at least as accurate as ASTRAL-J. An analysis of the Avian Phylogenomics Project data set with 48 species and 14,446 genes presents additional evidence of the value of FASTRAL-J over ASTRAL-J (and both over ASTRAL), with dramatic reductions in running time (20 hours for default ASTRAL, and minutes or seconds for ASTRAL-J and FASTRAL-J, respectively).
Affiliation(s)
- Baqiao Liu
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Tandy Warnow
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
|
16
|
Morales-Briones DF, Gehrke B, Huang CH, Liston A, Ma H, Marx HE, Tank DC, Yang Y. Analysis of Paralogs in Target Enrichment Data Pinpoints Multiple Ancient Polyploidy Events in Alchemilla s.l. (Rosaceae). Syst Biol 2021; 71:190-207. [PMID: 33978764 PMCID: PMC8677558 DOI: 10.1093/sysbio/syab032] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 09/01/2020] [Revised: 04/28/2021] [Accepted: 05/03/2021] [Indexed: 12/16/2022]
Abstract
Target enrichment is becoming increasingly popular for phylogenomic studies. Although baits for enrichment are typically designed to target single-copy genes, paralogs are often recovered with increased sequencing depth, sometimes from a significant proportion of loci, especially in groups experiencing whole-genome duplication (WGD) events. Common approaches for processing paralogs in target enrichment data sets include random selection, manual pruning, and mainly, the removal of entire genes that show any evidence of paralogy. These approaches are prone to errors in orthology inference or removing large numbers of genes. By removing entire genes, valuable information that could be used to detect and place WGD events is discarded. Here, we used an automated approach for orthology inference in a target enrichment data set of 68 species of Alchemilla s.l. (Rosaceae), a widely distributed clade of plants primarily from temperate climate regions. Previous molecular phylogenetic studies and chromosome numbers both suggested ancient WGDs in the group. However, both the phylogenetic location and putative parental lineages of these WGD events remain unknown. By taking paralogs into consideration and inferring orthologs from target enrichment data, we identified four nodes in the backbone of Alchemilla s.l. with an elevated proportion of gene duplication. Furthermore, using a gene-tree reconciliation approach, we established the autopolyploid origin of the entire Alchemilla s.l. and the nested allopolyploid origin of four major clades within the group. Here, we showed the utility of automated tree-based orthology inference methods, previously designed for genomic or transcriptomic data sets, to study complex scenarios of polyploidy and reticulate evolution from target enrichment data sets. [Alchemilla; allopolyploidy; autopolyploidy; gene tree discordance; orthology inference; paralogs; Rosaceae; target enrichment; whole genome duplication.]
Affiliation(s)
- Diego F Morales-Briones
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA
- Berit Gehrke
- University Gardens, University Museum, University of Bergen, Mildeveien 240, 5259 Hjellestad, Norway
- Chien-Hsun Huang
- State Key Laboratory of Genetic Engineering and Collaborative Innovation Center of Genetics and Development, Ministry of Education Key Laboratory of Biodiversity and Ecological Engineering, Institute of Plant Biology, Center of Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Aaron Liston
- Department of Botany and Plant Pathology, Oregon State University, 2082 Cordley Hall, Corvallis, OR 97331, USA
- Hong Ma
- Department of Biology, the Huck Institute of the Life Sciences, the Pennsylvania State University, 510D Mueller Laboratory, University Park, PA 16802, USA
- Hannah E Marx
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109-1048, USA
- Museum of Southwestern Biology and Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- David C Tank
- Department of Biological Sciences and Institute for Bioinformatics and Evolutionary Studies, University of Idaho, 875 Perimeter Drive MS 3051, Moscow, ID 83844, USA
- Ya Yang
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
|
17
|
Mirarab S, Nakhleh L, Warnow T. Multispecies Coalescent: Theory and Applications in Phylogenetics. Annual Review of Ecology, Evolution, and Systematics 2021. [DOI: 10.1146/annurev-ecolsys-012121-095340] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
Species tree estimation is a basic part of many biological research projects, ranging from answering basic evolutionary questions (e.g., how did a group of species adapt to their environments?) to addressing questions in functional biology. Yet, species tree estimation is very challenging, due to processes such as incomplete lineage sorting, gene duplication and loss, horizontal gene transfer, and hybridization, which can make gene trees differ from each other and from the overall evolutionary history of the species. Over the last 10–20 years, there has been tremendous growth in methods and mathematical theory for estimating species trees and phylogenetic networks, and some of these methods are now in wide use. In this survey, we provide an overview of the current state of the art, identify the limitations of existing methods and theory, and propose additional research problems and directions.
Affiliation(s)
- Siavash Mirarab
- Electrical and Computer Engineering Department, University of California, San Diego, La Jolla, California 92093, USA
- Luay Nakhleh
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
- Tandy Warnow
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, USA
|
18
|
Tilic E, Stiller J, Campos E, Pleijel F, Rouse GW. Phylogenomics resolves ambiguous relationships within Aciculata (Errantia, Annelida). Mol Phylogenet Evol 2021; 166:107339. [PMID: 34751138 DOI: 10.1016/j.ympev.2021.107339] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/16/2021] [Revised: 10/23/2021] [Accepted: 10/27/2021] [Indexed: 10/20/2022]
Abstract
Aciculata (Eunicida + Phyllodocida) is among the largest clades of annelids, comprising almost half of the known diversity of all marine annelids. Despite the group's large size and biological importance, most phylogenomic studies on Annelida to date have had a limited sampling of this clade. The phylogenetic placement of many clades within Phyllodocida in particular has remained poorly understood. To resolve the relationships within Aciculata we conducted a large-scale phylogenomic analysis based on 24 transcriptomes (13 new), chosen to represent many family-ranked taxa that have never been included in a broad phylogenomic study. Our sampling also includes several enigmatic taxa with challenging phylogenetic placement, such as Histriobdella, Struwela, Lacydonia, Pilargis and the holopelagic worms Lopadorrhynchus, Travisiopsis and Tomopteris. Our robust phylogeny allows us to name and place some of these problematic clades and has significant implications on the systematics of the group. Within Eunicida we reinstate the names Eunicoidea and Oenonoidea. Within Phyllodocida we delineate Phyllodociformia, Glyceriformia, Nereidiformia, Nephtyiformia and Aphroditiformia. Phyllodociformia now includes: Lacydonia, Typhloscolecidae, Lopadorrhynchidae and Phyllodocidae. Nephtyiformia includes Nephtyidae and Pilargidae. We also broaden the delineation of Glyceriformia to include Sphaerodoridae, Tomopteridae and Glyceroidea (Glyceridae + Goniadidae). Furthermore, our study demonstrates and explores how conflicting, yet highly supported topologies can result from confounding signals in gene trees.
Affiliation(s)
- Ekin Tilic
- Scripps Institution of Oceanography, UC San Diego, La Jolla, CA, USA; Institute of Evolutionary Biology and Animal Ecology, University of Bonn, Germany; Marine Biological Section, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- Josefin Stiller
- Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- Ernesto Campos
- Facultad de Ciencias, Universidad Autónoma de Baja California, Ensenada, Baja California, México
- Fredrik Pleijel
- Department of Marine Sciences, University of Gothenburg, Tjärnö, Sweden
- Greg W Rouse
- Scripps Institution of Oceanography, UC San Diego, La Jolla, CA, USA
|
19
|
Molloy EK, Gatesy J, Springer MS. Theoretical and practical considerations when using retroelement insertions to estimate species trees in the anomaly zone. Syst Biol 2021; 71:721-740. [PMID: 34677617 DOI: 10.1093/sysbio/syab086] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Accepted: 10/11/2021] [Indexed: 11/13/2022]
Abstract
A potential shortcoming of concatenation methods for species tree estimation is their failure to account for incomplete lineage sorting. Coalescent methods address this problem but make various assumptions that, if violated, can result in worse performance than concatenation. Given the challenges of analyzing DNA sequences with both concatenation and coalescent methods, retroelement insertions (RIs) have emerged as powerful phylogenomic markers for species tree estimation. Here, we show that two recently proposed quartet-based methods, SDPquartets and ASTRAL_BP, are statistically consistent estimators of the unrooted species tree topology under the coalescent when RIs follow a neutral infinite-sites model of mutation and the expected number of new RIs per generation is constant across the species tree. The accuracy of these (and other) methods for inferring species trees from RIs has yet to be assessed on simulated data sets, where the true species tree topology is known. Therefore, we evaluated eight methods given RIs simulated from four model species trees, all of which have short branches and at least three of which are in the anomaly zone. In our simulation study, ASTRAL_BP and SDPquartets always recovered the correct species tree topology when given a sufficiently large number of RIs, as predicted. A distance-based method (ASTRID_BP) and Dollo parsimony also performed well in recovering the species tree topology. In contrast, unordered, polymorphism, and Camin-Sokal parsimony typically fail to recover the correct species tree topology in anomaly zone situations with more than four ingroup taxa. Of the methods studied, only ASTRAL_BP automatically estimates internal branch lengths (in coalescent units) and support values (i.e., local posterior probabilities). We examined the accuracy of branch length estimation, finding that estimated lengths were accurate for short branches but upwardly biased otherwise. This led us to derive the maximum likelihood (branch length) estimate for when RIs are given as input instead of binary gene trees; this corrected formula produced accurate estimates of branch lengths in our simulation study, provided that a sufficiently large number of RIs were given as input. Lastly, we evaluated the impact of data quantity on species tree estimation by repeating the above experiments with input sizes varying from 100 to 100 000 parsimony-informative RIs. We found that, when given just 1 000 parsimony-informative RIs as input, ASTRAL_BP successfully reconstructed major clades (i.e., clades separated by branches > 0.3 CUs) with high support and identified rapid radiations (i.e., shorter connected branches), although not their precise branching order. The local posterior probability was effective for controlling false positive branches in these scenarios.
Affiliation(s)
- Erin K Molloy
- Department of Computer Science, University of Maryland, College Park, College Park, 20742, USA
- John Gatesy
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, 10024, USA
- Mark S Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, Riverside, 92521, USA
|
20
|
Yardeni G, Viruel J, Paris M, Hess J, Groot Crego C, de La Harpe M, Rivera N, Barfuss MHJ, Till W, Guzmán-Jacob V, Krömer T, Lexer C, Paun O, Leroy T. Taxon-specific or universal? Using target capture to study the evolutionary history of rapid radiations. Mol Ecol Resour 2021; 22:927-945. [PMID: 34606683 PMCID: PMC9292372 DOI: 10.1111/1755-0998.13523] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/28/2021] [Revised: 09/09/2021] [Accepted: 09/22/2021] [Indexed: 12/20/2022]
Abstract
Target capture has emerged as an important tool for phylogenetics and population genetics in nonmodel taxa. Whereas developing taxon‐specific capture probes requires sustained efforts, available universal kits may have a lower power to reconstruct relationships at shallow phylogenetic scales and within rapidly radiating clades. We present here a newly developed target capture set for Bromeliaceae, a large and ecologically diverse plant family with highly variable diversification rates. The set targets 1776 coding regions, including genes putatively involved in key innovations, with the aim to empower testing of a wide range of evolutionary hypotheses. We compare the relative power of this taxon‐specific set, Bromeliad1776, to the universal Angiosperms353 kit. The taxon‐specific set results in higher enrichment success across the entire family; however, the overall performance of both kits to reconstruct phylogenetic trees is relatively comparable, highlighting the vast potential of universal kits for resolving evolutionary relationships. For more detailed phylogenetic or population genetic analyses, for example the exploration of gene tree concordance, nucleotide diversity or population structure, the taxon‐specific capture set presents clear benefits. We discuss the potential lessons that this comparative study provides for future phylogenetic and population genetic investigations, in particular for the study of evolutionary radiations.
Affiliation(s)
- Gil Yardeni
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Margot Paris
- Unit of Ecology & Evolution, Department of Biology, University of Fribourg, Fribourg, Switzerland
- Jaqueline Hess
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria; Department of Soil Ecology, Helmholtz Centre for Environmental Research, UFZ, Halle (Saale), Germany
- Clara Groot Crego
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria; Vienna Graduate School of Population Genetics, Vienna, Austria
- Marylaure de La Harpe
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Norma Rivera
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Michael H J Barfuss
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Walter Till
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Valeria Guzmán-Jacob
- Biodiversity, Macroecology and Biogeography, University of Goettingen, Göttingen, Germany
- Thorsten Krömer
- Centro de Investigaciones Tropicales, Universidad Veracruzana, Xalapa, Mexico
- Christian Lexer
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Ovidiu Paun
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Thibault Leroy
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
|
21
|
Cruaud A, Delvare G, Nidelet S, Sauné L, Ratnasingham S, Chartois M, Blaimer BB, Gates M, Brady SG, Faure S, van Noort S, Rossi JP, Rasplus JY. Ultra-Conserved Elements and morphology reciprocally illuminate conflicting phylogenetic hypotheses in Chalcididae (Hymenoptera, Chalcidoidea). Cladistics 2021; 37:1-35. [PMID: 34478176 DOI: 10.1111/cla.12416] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Accepted: 02/15/2020] [Indexed: 11/30/2022]
Abstract
Recent technical advances combined with novel computational approaches have promised the acceleration of our understanding of the tree of life. However, when it comes to hyperdiverse and poorly known groups of invertebrates, studies are still scarce. As published phylogenies will be rarely challenged by future taxonomists, careful attention must be paid to potential analytical bias. We present the first molecular phylogenetic hypothesis for the family Chalcididae, a group of parasitoid wasps, with a representative sampling (144 ingroups and seven outgroups) that covers all described subfamilies and tribes, and 82% of the known genera. Analyses of 538 Ultra-Conserved Elements (UCEs) with supermatrix (RAxML and IQTREE) and gene tree reconciliation approaches (ASTRAL, ASTRID) resulted in highly supported topologies in overall agreement with morphology but reveal conflicting topologies for some of the deepest nodes. To resolve these conflicts, we explored the phylogenetic tree space with clustering and gene genealogy interrogation methods, analyzed marker and taxon properties that could bias inferences and performed a thorough morphological analysis (130 characters encoded for 40 taxa representative of the diversity). This joint analysis reveals that UCEs enable attainment of resolution between ancestry and convergent/divergent evolution when morphology is not informative enough, but also shows that a systematic exploration of bias with different analytical methods and a careful analysis of morphological features is required to prevent publication of artifactual results. We highlight a GC content bias for maximum-likelihood approaches, an artifactual mid-point rooting of the ASTRAL tree and a deleterious effect of high percentage of missing data (>85% missing UCEs) on gene tree reconciliation methods. Based on the results we propose a new classification of the family into eight subfamilies and ten tribes that lay the foundation for future studies on the evolutionary history of Chalcididae.
Affiliation(s)
- Astrid Cruaud
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Gérard Delvare
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France; UMR CBGP, CIRAD, F-34398, Montpellier, France
- Sabine Nidelet
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Laure Sauné
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Marguerite Chartois
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Michael Gates
- USDA, ARS, SEL, c/o Smithsonian Institution, National Museum of Natural History, Washington, DC, USA
- Seán G Brady
- Department of Entomology, Smithsonian Institution, National Museum of Natural History, Washington, DC, USA
- Sariana Faure
- Department of Zoology and Entomology, Rhodes University, Grahamstown, South Africa
- Simon van Noort
- Research and Exhibitions Department, South African Museum, Iziko Museums of South Africa, PO Box 61, Cape Town, 8000, South Africa; Department of Biological Sciences, University of Cape Town, Private Bag, Rondebosch, 7701, Cape Town, South Africa
- Jean-Pierre Rossi
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Jean-Yves Rasplus
- CBGP, CIRAD, INRAe, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
|
22
|
Silva GSC, Melo BF, Roxo FF, Ochoa LE, Shibatta OA, Sabaj MH, Oliveira C. Phylogenomics of the bumblebee catfishes (Siluriformes: Pseudopimelodidae) using ultraconserved elements. J Zool Syst Evol Res 2021. [DOI: 10.1111/jzs.12513] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]
Affiliation(s)
- Gabriel S. C. Silva
- Instituto de Biociências, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP), Botucatu, Brazil
- Bruno F. Melo
- Instituto de Biociências, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP), Botucatu, Brazil
- Fábio F. Roxo
- Instituto de Biociências, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP), Botucatu, Brazil
- Luz E. Ochoa
- Museu de Zoologia, Universidade de São Paulo (USP), São Paulo, Brazil
- Oscar A. Shibatta
- Museu de Zoologia, Centro de Ciências Biológicas, Universidade Estadual de Londrina (UEL), Londrina, Brazil
- Mark H. Sabaj
- Department of Ichthyology, Academy of Natural Sciences of Drexel University, Philadelphia, PA, USA
- Claudio Oliveira
- Instituto de Biociências, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP), Botucatu, Brazil
|
23
|
Thomas AE, Igea J, Meudt HM, Albach DC, Lee WG, Tanentzap AJ. Using target sequence capture to improve the phylogenetic resolution of a rapid radiation in New Zealand Veronica. AMERICAN JOURNAL OF BOTANY 2021; 108:1289-1306. [PMID: 34173225 DOI: 10.1002/ajb2.1678] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 10/08/2020] [Accepted: 03/10/2021] [Indexed: 05/08/2023]
Abstract
PREMISE: Recent, rapid radiations present a challenge for phylogenetic reconstruction. Fast successive speciation events typically lead to low sequence divergence and poorly resolved relationships with standard phylogenetic markers. Target sequence capture of many independent nuclear loci has the potential to improve phylogenetic resolution for rapid radiations. METHODS: Here we applied target sequence capture with 353 protein-coding genes (Angiosperms353 bait kit) to Veronica sect. Hebe (common name hebe) to determine its utility for improving the phylogenetic resolution of rapid radiations. Veronica section Hebe originated 5-10 million years ago in New Zealand, forming a monophyletic radiation of ca. 130 extant species. RESULTS: We obtained approximately 150 kbp of 353 protein-coding exons and an additional 200 kbp of flanking noncoding sequences for each of 77 hebe and two outgroup species. When comparing coding, noncoding, and combined data sets, we found that the latter provided the best overall phylogenetic resolution. While some deep nodes in the radiation remained unresolved, our phylogeny provided broad and often improved support for subclades identified by both morphology and standard markers in previous studies. Gene-tree discordance was nonetheless widespread, indicating that additional methods are needed to disentangle fully the history of the radiation. CONCLUSIONS: Phylogenomic target capture data sets both increase phylogenetic signal and deliver new insights into the complex evolutionary history of rapid radiations as compared with traditional markers. Improving methods to resolve remaining discordance among loci from target sequence capture is now important to facilitate the further study of rapid radiations.
Affiliation(s)
- Anne E Thomas
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Javier Igea
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Heidi M Meudt
- Museum of New Zealand Te Papa Tongarewa, Wellington, New Zealand
- Dirk C Albach
- Carl von Ossietzky-University, Oldenburg, D-26111, Germany
- William G Lee
- Manaaki Whenua - Landcare Research Otago, Dunedin, New Zealand
- Andrew J Tanentzap
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Cambridge, UK
|
24
|
Pérez-Escobar OA, Dodsworth S, Bogarín D, Bellot S, Balbuena JA, Schley RJ, Kikuchi IA, Morris SK, Epitawalage N, Cowan R, Maurin O, Zuntini A, Arias T, Serna-Sánchez A, Gravendeel B, Torres Jimenez MF, Nargar K, Chomicki G, Chase MW, Leitch IJ, Forest F, Baker WJ. Hundreds of nuclear and plastid loci yield novel insights into orchid relationships. AMERICAN JOURNAL OF BOTANY 2021; 108:1166-1180. [PMID: 34250591 DOI: 10.1002/ajb2.1702] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 11/17/2020] [Accepted: 06/10/2021] [Indexed: 06/13/2023]
Abstract
PREMISE: The inference of evolutionary relationships in the species-rich family Orchidaceae has hitherto relied heavily on plastid DNA sequences and limited taxon sampling. Previous studies have provided a robust plastid phylogenetic framework, which was used to classify orchids and investigate the drivers of orchid diversification. However, the extent to which phylogenetic inference based on the plastid genome is congruent with the nuclear genome has been only poorly assessed. METHODS: We inferred higher-level phylogenetic relationships of orchids based on likelihood and ASTRAL analyses of 294 low-copy nuclear genes sequenced using the Angiosperms353 universal probe set for 75 species (representing 69 genera, 16 tribes, 24 subtribes) and a concatenated analysis of 78 plastid genes for 264 species (117 genera, 18 tribes, 28 subtribes). We compared phylogenetic informativeness and support for the nuclear and plastid phylogenetic hypotheses. RESULTS: Phylogenetic inference using nuclear data sets provides well-supported orchid relationships that are highly congruent between analyses. Comparisons of nuclear gene trees and a plastid supermatrix tree showed that the trees are mostly congruent, but revealed instances of strongly supported phylogenetic incongruence in both shallow and deep time. The phylogenetic informativeness of individual Angiosperms353 genes is in general better than that of most plastid genes. CONCLUSIONS: Our study provides the first robust nuclear phylogenomic framework for Orchidaceae and an assessment of intragenomic nuclear discordance, plastid-nuclear tree incongruence, and phylogenetic informativeness across the family. Our results also demonstrate what has long been known but rarely thoroughly documented: nuclear and plastid phylogenetic trees can contain strongly supported discordances, and this incongruence must be reconciled prior to interpretation in evolutionary studies, such as taxonomy, biogeography, and character evolution.
Affiliation(s)
- Steven Dodsworth
- School of Biological Sciences, University of Portsmouth, Portsmouth, PO1 2UP, UK
- Diego Bogarín
- Lankester Botanic Garden, University of Costa Rica, Cartago, Costa Rica
- Robyn Cowan
- Royal Botanic Gardens Kew, Richmond, TW9 3AE, UK
- Katharina Nargar
- Australian Tropical Herbarium, James Cook University, Australia
- National Research Collections, Commonwealth Industrial and Scientific Research Organization, Australia
- Guillaume Chomicki
- Department of Animal and Plant Sciences, Alfred Denny Building, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
- Mark W Chase
- Royal Botanic Gardens Kew, Richmond, TW9 3AE, UK
- Department of Environment and Agriculture, Curtin University, Bentley, Western Australia, 6102, Australia
- Félix Forest
- Royal Botanic Gardens Kew, Richmond, TW9 3AE, UK
|
25
|
Baker WJ, Dodsworth S, Forest F, Graham SW, Johnson MG, McDonnell A, Pokorny L, Tate JA, Wicke S, Wickett NJ. Exploring Angiosperms353: An open, community toolkit for collaborative phylogenomic research on flowering plants. AMERICAN JOURNAL OF BOTANY 2021; 108:1059-1065. [PMID: 34293179 DOI: 10.1002/ajb2.1703] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/12/2021] [Accepted: 05/14/2021] [Indexed: 06/13/2023]
Affiliation(s)
- Steven Dodsworth
- School of Life Sciences, University of Bedfordshire, University Square, Luton, LU1 3JU, UK
- Félix Forest
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Sean W Graham
- Department of Botany, University of British Columbia, 6270 University Boulevard, Vancouver, British Columbia, V6T 1Z4, Canada
- Matthew G Johnson
- Department of Biological Sciences, Texas Tech University, Lubbock, TX, 79409, USA
- Angela McDonnell
- Plant Science and Conservation, Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, IL, 60022, USA
- Lisa Pokorny
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Jennifer A Tate
- School of Fundamental Sciences, Massey University, Palmerston North, 4442, New Zealand
- Susann Wicke
- Plant Evolutionary Biology, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Plant Systematics and Biodiversity, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
- Norman J Wickett
- Plant Science and Conservation, Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, IL, 60022, USA
|
26
|
Unravelling hybridization in Phytophthora using phylogenomics and genome size estimation. IMA Fungus 2021; 12:16. [PMID: 34193315 PMCID: PMC8246709 DOI: 10.1186/s43008-021-00068-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/11/2020] [Accepted: 05/23/2021] [Indexed: 02/06/2023] Open
Abstract
The genus Phytophthora comprises many economically and ecologically important plant pathogens. Hybrid species have previously been identified in at least six of the 12 phylogenetic clades. These hybrids can potentially infect a wider host range and display enhanced vigour compared to their progenitors. Phytophthora hybrids therefore pose a serious threat to agriculture as well as to natural ecosystems. Early and correct identification of hybrids is therefore essential for adequate plant protection but this is hampered by the limitations of morphological and traditional molecular methods. Identification of hybrids is also important in evolutionary studies as the positioning of hybrids in a phylogenetic tree can lead to suboptimal topologies. To improve the identification of hybrids we have combined genotyping-by-sequencing (GBS) and genome size estimation on a genus-wide collection of 614 Phytophthora isolates. Analyses based on locus- and allele counts and especially on the combination of species-specific loci and genome size estimations allowed us to confirm and characterize 27 previously described hybrid species and discover 16 new hybrid species. Our method was also valuable for species identification at an unprecedented resolution and further allowed correct naming of misidentified isolates. We used both a concatenation- and a coalescent-based phylogenomic method to construct a reliable phylogeny using the GBS data of 140 non-hybrid Phytophthora isolates. Hybrid species were subsequently connected to their progenitors in this phylogenetic tree. In this study we demonstrate the application of two validated techniques (GBS and flow cytometry) for relatively low cost but high resolution identification of hybrids and their phylogenetic relations.
|
27
|
Silva GSC, Roxo FF, Melo BF, Ochoa LE, Bockmann FA, Sabaj MH, Jerep FC, Foresti F, Benine RC, Oliveira C. Evolutionary history of Heptapteridae catfishes using ultraconserved elements (Teleostei, Siluriformes). ZOOL SCR 2021. [DOI: 10.1111/zsc.12493] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Affiliation(s)
- Fábio F. Roxo
- Instituto de Biociências Universidade Estadual Paulista Botucatu Brazil
- Bruno F. Melo
- Instituto de Biociências Universidade Estadual Paulista Botucatu Brazil
- Luz E. Ochoa
- Museu de Zoologia Universidade de São Paulo São Paulo Brazil
- Flávio A. Bockmann
- Departamento de Biologia e Programa de Pós‐Graduação em Biologia Comparada Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto Universidade de São Paulo Ribeirão Preto Brazil
- Mark H. Sabaj
- Department of Ichthyology Academy of Natural Sciences of Drexel University Philadelphia PA USA
- Fernando C. Jerep
- Museu de Zoologia Centro de Ciências Biológicas Universidade Estadual de Londrina Londrina Brazil
- Fausto Foresti
- Instituto de Biociências Universidade Estadual Paulista Botucatu Brazil
- Ricardo C. Benine
- Instituto de Biociências Universidade Estadual Paulista Botucatu Brazil
- Claudio Oliveira
- Instituto de Biociências Universidade Estadual Paulista Botucatu Brazil
|
28
|
Arcila D, Hughes LC, Meléndez-Vazquez F, Baldwin CC, White W, Carpenter K, Williams JT, Santos MD, Pogonoski J, Miya M, Ortí G, Betancur-R R. Testing the utility of alternative metrics of branch support to address the ancient evolutionary radiation of tunas, stromateoids, and allies (Teleostei: Pelagiaria). Syst Biol 2021; 70:1123-1144. [PMID: 33783539 DOI: 10.1093/sysbio/syab018] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/23/2021] [Accepted: 03/13/2021] [Indexed: 12/19/2022] Open
Abstract
The use of high-throughput sequencing technologies to produce genome-scale datasets was expected to settle some long-standing controversies across the Tree of Life, particularly in areas where short branches occur at deep timescales. Instead, these datasets have often yielded many well-supported but conflicting topologies, and highly variable gene-tree distributions. A variety of branch-support metrics beyond the nonparametric bootstrap are now available to assess how robust a phylogenetic hypothesis may be, as well as new methods to quantify gene-tree discordance. We applied multiple branch support metrics to an ancient group of marine fishes (Teleostei: Pelagiaria) whose interfamilial relationships have proven difficult to resolve due to a rapid accumulation of lineages very early in its history. We analyzed hundreds of loci including published UCE data and newly generated exonic data along with their flanking regions to represent all 16 extant families for more than 150 out of 284 valid species in the group. Branch support was lower for interfamilial relationships (except the SH-like aLRT and aBayes methods) regardless of the type of marker used. Several nodes that were highly supported with bootstrap had very low site and gene-tree concordance, revealing underlying conflict. Despite this conflict, we were able to identify four consistent interfamilial clades, each comprised of two or three families. Combining exons with their flanking regions also produced increased branch lengths in the deep branches of the pelagiarian tree. Our results demonstrate the limitations of employing current metrics of branch support and species-tree estimation when assessing the confidence of ancient evolutionary radiations and emphasize the necessity to embrace alternative measurements to explore phylogenetic uncertainty and discordance in phylogenomic datasets.
Affiliation(s)
- Dahiana Arcila
- Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.
- Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Lily C Hughes
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.
- Department of Organismal Biology and Anatomy, The University of Chicago, Illinois, Chicago, U.S.A.
- Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- Fernando Meléndez-Vazquez
- Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.
- Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Carole C Baldwin
- Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- William White
- CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Kent Carpenter
- Department of Biological Sciences, Old Dominion University, Norfolk, Virginia, U.S.A.
- Jeffrey T Williams
- Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- John Pogonoski
- CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Masaki Miya
- Natural History Museum and Institute, Chiba, Aoba-cho, Chuo-ku, Chiba, Japan
- Guillermo Ortí
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.
- Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
|
29
|
Freitas FV, Branstetter MG, Griswold T, Almeida EAB. Partitioned Gene-Tree Analyses and Gene-Based Topology Testing Help Resolve Incongruence in a Phylogenomic Study of Host-Specialist Bees (Apidae: Eucerinae). Mol Biol Evol 2021; 38:1090-1100. [PMID: 33179746 PMCID: PMC7947843 DOI: 10.1093/molbev/msaa277] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 02/05/2023] Open
Abstract
Incongruence among phylogenetic results has become a common occurrence in analyses of genome-scale data sets. Incongruence originates from uncertainty in underlying evolutionary processes (e.g., incomplete lineage sorting) and from difficulties in determining the best analytical approaches for each situation. To overcome these difficulties, more studies are needed that identify incongruences and demonstrate practical ways to confidently resolve them. Here, we present results of a phylogenomic study based on the analysis of 197 taxa and 2,526 ultraconserved element (UCE) loci. We investigate evolutionary relationships of Eucerinae, a diverse subfamily of apid bees (relatives of honey bees and bumble bees) with >1,200 species. We sampled representatives of all tribes within the group and >80% of genera, including two mysterious South American genera, Chilimalopsis and Teratognatha. Initial analysis of the UCE data revealed two conflicting hypotheses for relationships among tribes. To resolve the incongruence, we tested concatenation and species tree approaches and used a variety of additional strategies including locus filtering, partitioned gene-tree searches, and gene-based topological tests. We show that within-locus partitioning improves gene tree and subsequent species-tree estimation, and that this approach confidently resolves the incongruence observed in our data set. After exploring our proposed analytical strategy on eucerine bees, we validated its efficacy to resolve hard phylogenetic problems by implementing it on a published UCE data set of Adephaga (Insecta: Coleoptera). Our results provide a robust phylogenetic hypothesis for Eucerinae and demonstrate a practical strategy for resolving incongruence in other phylogenomic data sets.
Affiliation(s)
- Felipe V Freitas
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
- Michael G Branstetter
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
- Terry Griswold
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
- Eduardo A B Almeida
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
|
30
|
New Approaches for Inferring Phylogenies in the Presence of Paralogs. Trends Genet 2021; 37:174-187. [DOI: 10.1016/j.tig.2020.08.012] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/08/2020] [Revised: 08/13/2020] [Accepted: 08/19/2020] [Indexed: 12/18/2022]
|
31
|
Jiang X, Edwards SV, Liu L. The Multispecies Coalescent Model Outperforms Concatenation Across Diverse Phylogenomic Data Sets. Syst Biol 2021; 69:795-812. [PMID: 32011711 PMCID: PMC7302055 DOI: 10.1093/sysbio/syaa008] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 04/01/2019] [Revised: 12/24/2019] [Accepted: 01/02/2020] [Indexed: 11/30/2022] Open
Abstract
A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. A set of statistical tests are here applied and developed to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across the tree of life. Tests for substitution models and the concatenation assumption of topologically congruent gene trees suggest that a poor fit of substitution models, rejected by 44% of loci, and concatenation models, rejected by 38% of loci, is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across six major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models. Although conducted on reduced data sets due to computational constraints, Bayesian model validation and comparison both strongly favor the MSC over concatenation across all data sets; the concatenation assumption of congruent gene trees rarely holds for phylogenomic data sets with more than 10 loci. Thus, for large phylogenomic data sets, model comparisons are expected to consistently and more strongly favor the coalescent model over the concatenation model. We also found that loci rejecting the MSC have little effect on species tree estimation. Our study reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference. [Bayes factor; Bayesian model validation; coalescent prior; congruent gene trees; independent prior; Metazoa; posterior predictive simulation.]
Affiliation(s)
- Xiaodong Jiang
- Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, USA
- Scott V Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard, 26 Oxford Street, Cambridge, MA 02138, USA
- Liang Liu
- Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, USA
- Institute of Bioinformatics, University of Georgia, 120 Green Street, Athens, GA 30602, USA
|
32
|
Kandziora M, Sklenář P, Kolář F, Schmickl R. How to Tackle Phylogenetic Discordance in Recent and Rapidly Radiating Groups? Developing a Workflow Using Loricaria (Asteraceae) as an Example. FRONTIERS IN PLANT SCIENCE 2021; 12:765719. [PMID: 35069621 PMCID: PMC8777076 DOI: 10.3389/fpls.2021.765719] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/27/2021] [Accepted: 11/22/2021] [Indexed: 05/17/2023]
Abstract
A major challenge in phylogenetics and -genomics is to resolve young rapidly radiating groups. The fast succession of species increases the probability of incomplete lineage sorting (ILS), and different topologies of the gene trees are expected, leading to gene tree discordance, i.e., not all gene trees represent the species tree. Phylogenetic discordance is common in phylogenomic datasets, and apart from ILS, additional sources include hybridization, whole-genome duplication, and methodological artifacts. Despite a high degree of gene tree discordance, species trees are often well supported and the sources of discordance are not further addressed in phylogenomic studies, which can eventually lead to incorrect phylogenetic hypotheses, especially in rapidly radiating groups. We chose the high-Andean Asteraceae genus Loricaria to shed light on the potential sources of phylogenetic discordance and generated a phylogenetic hypothesis. By accounting for paralogy during gene tree inference, we generated a species tree based on hundreds of nuclear loci, using Hyb-Seq, and a plastome phylogeny obtained from off-target reads during target enrichment. We observed a high degree of gene tree discordance, which we found implausible at first sight, because the genus did not show evidence of hybridization in previous studies. We used various phylogenomic analyses (trees and networks) as well as the D-statistics to test for ILS and hybridization, which we developed into a workflow on how to tackle phylogenetic discordance in recent radiations. We found strong evidence for ILS and hybridization within the genus Loricaria. Low genetic differentiation was evident between species located in different Andean cordilleras, which could be indicative of substantial introgression between populations, promoted during Pleistocene glaciations, when alpine habitats shifted creating opportunities for secondary contact and hybridization.
Affiliation(s)
- Martha Kandziora
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Petr Sklenář
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Filip Kolář
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Institute of Botany, The Czech Academy of Sciences, Průhonice, Czechia
- Roswitha Schmickl
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Institute of Botany, The Czech Academy of Sciences, Průhonice, Czechia
|
33
|
Rhodes JA. Topological Metrizations of Trees, and New Quartet Methods of Tree Inference. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:2107-2118. [PMID: 31095496 PMCID: PMC7650847 DOI: 10.1109/tcbb.2019.2917204] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 05/28/2023]
Abstract
Topological phylogenetic trees can be assigned edge weights in several natural ways, highlighting different aspects of the tree. Here, the rooted triple and quartet metrizations are introduced, and applied to formulate novel methods of inferring large trees from rooted triple and quartet data. These methods lead to new statistically consistent procedures for inference of a species tree from gene trees under the multispecies coalescent model.
|
34
|
Singhal S, Colston TJ, Grundler MR, Smith SA, Costa GC, Colli GR, Moritz C, Pyron RA, Rabosky DL. Congruence and Conflict in the Higher-Level Phylogenetics of Squamate Reptiles: An Expanded Phylogenomic Perspective. Syst Biol 2020; 70:542-557. [PMID: 32681800 DOI: 10.1093/sysbio/syaa054] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/16/2019] [Revised: 05/05/2020] [Accepted: 07/05/2020] [Indexed: 12/16/2022] Open
Abstract
Genome-scale data have the potential to clarify phylogenetic relationships across the tree of life but have also revealed extensive gene tree conflict. This seeming paradox, whereby larger data sets both increase statistical confidence and uncover significant discordance, suggests that understanding sources of conflict is important for accurate reconstruction of evolutionary history. We explore this paradox in squamate reptiles, the vertebrate clade comprising lizards, snakes, and amphisbaenians. We collected an average of 5103 loci for 91 species of squamates that span higher-level diversity within the clade, which we augmented with publicly available sequences for an additional 17 taxa. Using a locus-by-locus approach, we evaluated support for alternative topologies at 17 contentious nodes in the phylogeny. We identified shared properties of conflicting loci, finding that rate and compositional heterogeneity drive discordance between gene trees and species tree and that conflicting loci rarely overlap across contentious nodes. Finally, by comparing our tests of nodal conflict to previous phylogenomic studies, we confidently resolve 9 of the 17 problematic nodes. We suggest this locus-by-locus and node-by-node approach can build consensus on which topological resolutions remain uncertain in phylogenomic studies of other contentious groups. [Anchored hybrid enrichment (AHE); gene tree conflict; molecular evolution; phylogenomic concordance; target capture; ultraconserved elements (UCE).]
Affiliation(s)
- Sonal Singhal
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biology, CSU Dominguez Hills, Carson, CA 90747, USA
- Timothy J Colston
- Department of Biological Sciences, The George Washington University, Washington D.C. 20052, USA
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Maggie R Grundler
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Environmental Science, Policy, & Management, University of California Berkeley, Berkeley, CA 94720, USA
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Gabriel C Costa
- Department of Biology and Environmental Sciences, Auburn University at Montgomery, Montgomery, AL, USA
- Guarino R Colli
- Departamento de Zoologia, Universidade de Brasília, Brasília, DF, Brazil
- Craig Moritz
- Division of Ecology and Evolution, Research School of Biology, and Centre for Biodiversity Analysis, The Australian National University, 46 Sullivans Creek Road, Acton, ACT 2601, Australia
- R Alexander Pyron
- Department of Biological Sciences, The George Washington University, Washington D.C. 20052, USA
- Daniel L Rabosky
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA
|
35
|
Portik DM, Wiens JJ. Do Alignment and Trimming Methods Matter for Phylogenomic (UCE) Analyses? Syst Biol 2020; 70:440-462. [PMID: 32797207 DOI: 10.1093/sysbio/syaa064] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/15/2019] [Revised: 08/02/2020] [Accepted: 08/03/2020] [Indexed: 11/14/2022] Open
Abstract
Alignment is a crucial issue in molecular phylogenetics because different alignment methods can potentially yield very different topologies for individual genes. But it is unclear if the choice of alignment methods remains important in phylogenomic analyses, which incorporate data from hundreds or thousands of genes. For example, problematic biases in alignment might be multiplied across many loci, whereas alignment errors in individual genes might become irrelevant. The issue of alignment trimming (i.e., removing poorly aligned regions or missing data from individual genes) is also poorly explored. Here, we test the impact of 12 different combinations of alignment and trimming methods on phylogenomic analyses. We compare these methods using published phylogenomic data from ultraconserved elements (UCEs) from squamate reptiles (lizards and snakes), birds, and tetrapods. We compare the properties of alignments generated by different alignment and trimming methods (e.g., length, informative sites, missing data). We also test whether these data sets can recover well-established clades when analyzed with concatenated (RAxML) and species-tree methods (ASTRAL-III), using the full data (~5000 loci) and subsampled data sets (10% and 1% of loci). We show that different alignment and trimming methods can significantly impact various aspects of phylogenomic data sets (e.g., length, informative sites). However, these different methods generally had little impact on the recovery and support values for well-established clades, even across very different numbers of loci. Nevertheless, our results suggest several "best practices" for alignment and trimming. Intriguingly, the choice of phylogenetic methods impacted the phylogenetic results most strongly, with concatenated analyses recovering significantly more well-established clades (with stronger support) than the species-tree analyses.
[Alignment; concatenated analysis; phylogenomics; sequence length heterogeneity; species-tree analysis; trimming].
Affiliation(s)
- Daniel M Portik
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA; California Academy of Sciences, San Francisco, CA 94118, USA
| | - John J Wiens
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
| |
|
36
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Brown RM. Larger, unfiltered datasets are more effective at resolving phylogenetic conflict: Introns, exons, and UCEs resolve ambiguities in Golden-backed frogs (Anura: Ranidae; genus Hylarana). Mol Phylogenet Evol 2020; 151:106899. [PMID: 32590046 DOI: 10.1016/j.ympev.2020.106899] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/03/2020] [Revised: 05/18/2020] [Accepted: 06/17/2020] [Indexed: 01/01/2023]
Abstract
Using FrogCap, a recently-developed sequence-capture protocol, we obtained >12,000 highly informative exons, introns, and ultraconserved elements (UCEs), which we used to illustrate variation in evolutionary histories of these classes of markers, and to resolve long-standing systematic problems in Southeast Asian Golden-backed frogs of the genus-complex Hylarana. We also performed a comprehensive suite of analyses to assess the relative performance of different genetic markers, data filtering strategies, tree inference methods, and different measures of branch support. To reduce gene tree estimation error, we filtered the data using different thresholds of taxon completeness (missing data) and parsimony informative sites (PIS). We then estimated species trees using concatenated datasets and Maximum Likelihood (IQ-TREE) in addition to summary (ASTRAL-III), distance-based (ASTRID), and site-based (SVDQuartets) multispecies coalescent methods. Topological congruence and branch support were examined using traditional bootstrap, local posterior probabilities, gene concordance factors, quartet frequencies, and quartet scores. Our results did not yield a single concordant topology. Instead, introns, exons, and UCEs clearly possessed different phylogenetic signals, resulting in conflicting, yet strongly-supported phylogenetic estimates. However, a combined analysis comprising the most informative introns, exons, and UCEs converged on a similar topology across all analyses, with the exception of SVDQuartets. Bootstrap values were consistently high despite high levels of incongruence and high proportions of gene trees supporting conflicting topologies. Although low bootstrap values did indicate low heuristic support, high bootstrap support did not necessarily reflect congruence or support for the correct topology. 
This study reiterates findings of some previous studies, which demonstrated that traditional bootstrap values can produce positively misleading measures of support in large phylogenomic datasets. We also showed a remarkably strong positive relationship between branch length and topological congruence across all datasets, implying that very short internodes remain a challenge to resolve, even with orders of magnitude more data than ever before. Overall, our results demonstrate that more data from unfiltered or combined datasets produced superior results. Although data filtering reduced gene tree incongruence, decreased amounts of data also biased phylogenetic estimation. A point of diminishing returns was evident, at which higher congruence (from more stringent filtering) at the expense of amount of data led to topological error as assessed by comparison to more complete datasets across different genomic markers. Additionally, we showed that applying a parameter-rich model to a partitioned analysis of concatenated data produces better results compared to unpartitioned, or even partitioned analysis using model selection. Despite some lingering uncertainties, a combined analysis of our genomic data and sequences supplemented from GenBank (on the basis of a few gene regions) revealed highly supported novel systematic arrangements. Based on these new findings, we transfer Amnirana nicobariensis into the genus Indosylvirana; and I. milleti and Hylarana celebensis to the genus Papurana. We also provisionally place H. attigua in the genus Papurana pending verification from positively identified (voucher substantiated) samples.
Affiliation(s)
- Kin Onn Chan
- Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, 2 Conservatory Drive, 117377, Singapore
| | - Carl R Hutter
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA; Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Perry L Wood
- Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL 36849, USA
| | - L Lee Grismer
- Herpetology Laboratory, Department of Biology, La Sierra University, 4500 Riverwalk Parkway, Riverside, CA 92505, USA
| | - Rafe M Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
| |
|
37
|
Stiller J, Tilic E, Rousset V, Pleijel F, Rouse GW. Spaghetti to a Tree: A Robust Phylogeny for Terebelliformia (Annelida) Based on Transcriptomes, Molecular and Morphological Data. BIOLOGY 2020; 9:E73. [PMID: 32268525 PMCID: PMC7236012 DOI: 10.3390/biology9040073] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/14/2020] [Revised: 04/03/2020] [Accepted: 04/03/2020] [Indexed: 12/23/2022]
Abstract
Terebelliformia ("spaghetti worms" and their allies) are speciose and ubiquitous marine annelids, but our understanding of how their morphological and ecological diversity evolved is hampered by an uncertain delineation of lineages and their phylogenetic relationships. Here, we analyzed transcriptomes of 20 terebelliforms and an outgroup to build a robust phylogeny of the main lineages grounded on 12,674 orthologous genes. We then supplemented this backbone phylogeny with a denser sampling of 121 species using five genes and 90 morphological characters to elucidate fine-scale relationships. The monophyly of six major taxa was supported: Pectinariidae, Ampharetinae, Alvinellidae, Trichobranchidae, Terebellidae and Melinninae. The latter, traditionally a subfamily of Ampharetidae, was unexpectedly the sister to Terebellidae, and hence becomes Melinnidae, and Ampharetinae becomes Ampharetidae. We found no support for the recently proposed separation of Telothelepodidae, Polycirridae and Thelepodidae from Terebellidae. Telothelepodidae was nested within Thelepodinae and is accordingly made its junior synonym. Terebellidae contained the subfamily-ranked taxa Terebellinae and Thelepodinae. The placement of the simplified Polycirridae within Terebellinae differed from previous hypotheses, warranting the division of Terebellinae into Lanicini, Procleini, Terebellini and Polycirrini. Ampharetidae (excluding Melinnidae) were well-supported as the sister group to Alvinellidae and we recognize three clades: Ampharetinae, Amaginae and Amphicteinae. Our analysis found several paraphyletic genera and undescribed species. Morphological transformations on the phylogeny supported the hypothesis of an ancestor that possessed both branchiae and chaetae, which is at odds with proposals of a "naked" ancestor. Our study demonstrates how a robust backbone phylogeny can be combined with dense taxon coverage and morphological traits to give insights into the evolutionary history and transformation of traits.
Affiliation(s)
- Josefin Stiller
- Scripps Institution of Oceanography, University of California, San Diego, CA 92037, USA
- Centre for Biodiversity Genomics, Department of Biology, University of Copenhagen, 2100 Copenhagen, Denmark
| | - Ekin Tilic
- Scripps Institution of Oceanography, University of California, San Diego, CA 92037, USA
- Institute of Evolutionary Biology and Animal Ecology, University of Bonn, 53121 Bonn, Germany
| | - Vincent Rousset
- Scripps Institution of Oceanography, University of California, San Diego, CA 92037, USA
| | - Fredrik Pleijel
- Tjärnö Marine Laboratory, Department of Marine Sciences, University of Gothenburg, 405 30 Gothenburg, Sweden
| | - Greg W. Rouse
- Scripps Institution of Oceanography, University of California, San Diego, CA 92037, USA
| |
|
38
|
Nute M, Chou J, Molloy EK, Warnow T. Correction to: The performance of coalescent-based species tree estimation methods under models of missing data. BMC Genomics 2020; 21:133. [PMID: 32039710 PMCID: PMC7008544 DOI: 10.1186/s12864-020-6540-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023] Open
Affiliation(s)
- Michael Nute
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 S. Wright St., Champaign, IL, 61820, USA
| | - Jed Chou
- Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green St., Urbana, IL, 61801, USA
| | - Erin K Molloy
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL, 61801, USA
| | - Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL, 61801, USA
| |
|
39
|
Abstract
Green plants (Viridiplantae) include around 450,000-500,000 species [1,2] of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida), including green plants (Viridiplantae), glaucophytes (Glaucophyta) and red algae (Rhodophyta). Our analysis provides a robust phylogenomic framework for examining the evolution of green plants. Most inferred species relationships are well supported across multiple species tree and supermatrix analyses, but discordance among plastid and nuclear gene trees at a few important nodes highlights the complexity of plant genome evolution, including polyploidy, periods of rapid speciation, and extinction. Incomplete sorting of ancestral variation, polyploidization and massive expansions of gene families punctuate the evolutionary history of green plants. Notably, we find that large expansions of gene families preceded the origins of green plants, land plants and vascular plants, whereas whole-genome duplications are inferred to have occurred repeatedly throughout the evolution of flowering plants and ferns. The increasing availability of high-quality plant genome sequences and advances in functional genomics are enabling research on genome evolution across the green tree of life.
|
40
|
Olofsson JK, Cantera I, Van de Paer C, Hong-Wa C, Zedane L, Dunning LT, Alberti A, Christin PA, Besnard G. Phylogenomics using low-depth whole genome sequencing: A case study with the olive tribe. Mol Ecol Resour 2019; 19:877-892. [PMID: 30934146 DOI: 10.1111/1755-0998.13016] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 06/05/2018] [Revised: 03/19/2019] [Accepted: 03/25/2019] [Indexed: 12/20/2022]
Abstract
Species trees have traditionally been inferred from a few selected markers, and genome-wide investigations remain largely restricted to model organisms or small groups of species for which sampling of fresh material is available, leaving out most of the existing and historical species diversity. The genomes of an increasing number of species, including specimens extracted from natural history collections, are being sequenced at low depth. While these data sets are widely used to analyse organelle genomes, the nuclear fraction is generally ignored. Here we evaluate different reference-based methods to infer phylogenies of large taxonomic groups from such data sets. Using the example of the Oleeae tribe, a worldwide-distributed group, we build phylogenies based on single nucleotide polymorphisms (SNPs) obtained using two reference genomes (the olive and ash trees). The inferred phylogenies are overall congruent, yet present differences that might reflect the effect of distance to the reference on the amount of missing data. To limit this issue, genome complexity was reduced by using pairs of orthologous coding sequences as the reference, thus allowing us to combine SNPs obtained using two distinct references. Concatenated and coalescence trees based on these combined SNPs suggest events of incomplete lineage sorting and/or hybridization during the diversification of this large phylogenetic group. Our results show that genome-wide phylogenetic trees can be inferred from low-depth sequence data sets for eukaryote groups with complex genomes, and histories of reticulate evolution. This opens new avenues for large-scale phylogenomics and biogeographical analyses covering both the extant and the historical diversity stored in museum collections.
Affiliation(s)
- Jill K Olofsson
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
| | - Isabel Cantera
- Laboratoire Évolution and Diversité Biologique (EDB, UMR5174), CNRS, UPS, IRD, Université de Toulouse, Toulouse, France
| | - Céline Van de Paer
- Laboratoire Évolution and Diversité Biologique (EDB, UMR5174), CNRS, UPS, IRD, Université de Toulouse, Toulouse, France
| | - Cynthia Hong-Wa
- Claude E. Phillips Herbarium, Delaware State University, Dover, Delaware
| | - Loubab Zedane
- Laboratoire Évolution and Diversité Biologique (EDB, UMR5174), CNRS, UPS, IRD, Université de Toulouse, Toulouse, France
| | - Luke T Dunning
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
| | - Adriana Alberti
- Genoscope, CEA - Institut de biologie François-Jacob, Evry Cedex, France
| | | | - Guillaume Besnard
- Laboratoire Évolution and Diversité Biologique (EDB, UMR5174), CNRS, UPS, IRD, Université de Toulouse, Toulouse, France
| |
|
41
|
Montingelli GG, Grazziotin FG, Battilana J, Murphy RW, Zhang Y, Zaher H. Higher‐level phylogenetic affinities of the Neotropical genus Mastigodryas Amaral, 1934 (Serpentes: Colubridae), species‐group definition and description of a new genus for Mastigodryas bifossatus. J ZOOL SYST EVOL RES 2019. [DOI: 10.1111/jzs.12262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/01/2022]
Affiliation(s)
- Giovanna G. Montingelli
- Department of Life Sciences, Natural History Museum, London, UK
- Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil
| | | | | | - Robert W. Murphy
- Royal Ontario Museum, Centre for Biodiversity and Conservation Biology, Toronto, Ontario, Canada
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Kunming, China
| | - Ya‐Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Kunming, China
- Laboratory for Conservation and Utilization of Bio‐Resources, Yunnan University, Kunming, China
| | - Hussam Zaher
- Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil
| |
|