1
Wang Y, Wu X, Chen Y, Xu C, Wang Y, Wang Q. Phylogenomic analyses revealed widely occurring hybridization events across Elsholtzieae (Lamiaceae). Mol Phylogenet Evol 2024; 198:108112. [PMID: 38806075 DOI: 10.1016/j.ympev.2024.108112] [Received: 02/06/2024] [Revised: 05/14/2024] [Accepted: 05/22/2024] [Indexed: 05/30/2024]
Abstract
Obtaining a robust phylogeny proves challenging due to the intricate evolutionary history of species, where processes such as hybridization and incomplete lineage sorting can introduce conflicting signals, thereby complicating phylogenetic inference. In this study, we conducted comprehensive sampling of Elsholtzieae, with a particular focus on its largest genus, Elsholtzia. We utilized 503 nuclear loci and complete plastome sequences obtained from 99 whole-genome sequencing datasets to elucidate the interspecific relationships within the Elsholtzieae. Additionally, we explored various sources of conflicts between gene trees and species trees. Fully supported backbone phylogenies were recovered, and the monophyly of Elsholtzia and Keiskea was not supported. Significant gene tree heterogeneity was observed at numerous nodes, particularly regarding the placement of Vuhuangia and the E. densa clade. Further investigations into potential causes of this discordance revealed that incomplete lineage sorting (ILS), coupled with hybridization events, has given rise to substantial gene tree discordance. Several species, represented by multiple samples, exhibited a closer association with geographical distribution rather than following a strictly monophyletic pattern in plastid trees, suggesting chloroplast capture within Elsholtzieae and providing evidence of hybridization. In conclusion, this study provides phylogenomic insights to untangle taxonomic problems in the tribe Elsholtzieae, especially the genus Elsholtzia.
Affiliation(s)
- Yan Wang
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Xuexue Wu
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yanyi Chen
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Chao Xu
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- Yinghui Wang
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Qiang Wang
- State Key Laboratory of Plant Diversity and Specialty Crops, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; National Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
2
Veltman MA, Anthoons B, Schrøder-Nielsen A, Gravendeel B, de Boer HJ. Orchidinae-205: A new genome-wide custom bait set for studying the evolution, systematics, and trade of terrestrial orchids. Mol Ecol Resour 2024; 24:e13986. [PMID: 38899721 DOI: 10.1111/1755-0998.13986] [Received: 06/15/2023] [Revised: 05/16/2024] [Accepted: 05/30/2024] [Indexed: 06/21/2024]
Abstract
Terrestrial orchids are a group of genetically understudied, yet culturally and economically important plants. The Orchidinae tribe contains many species that produce edible tubers that are used for the production of traditional delicacies collectively called 'salep'. Overexploitation of wild orchids in the Eastern Mediterranean and Western Asia threatens to drive many of these species to extinction, but cost-effective tools for monitoring their trade are currently lacking. Here we present a custom bait kit for target enrichment and sequencing of 205 novel genetic markers that are tailored to phylogenomic applications in Orchidinae s.l. A subset of 31 markers capture genes putatively involved in the production of glucomannan, a water-soluble polysaccharide that gives salep its distinctive properties. We tested the kit on 73 taxa native to the area, demonstrating universally high locus recovery irrespective of species identity, that exceeds the total sequence length obtained with alternative kits currently available. Phylogenetic inference with concatenation and coalescent approaches was robust and showed high levels of support for most clades, including some which were previously unresolved. Resolution for hybridizing and recently radiated lineages remains difficult, but could be further improved by analysing multiple haplotypes and the non-exonic sequences captured by our kit, with the promise to shed new light on the evolution of enigmatic taxa with a complex speciation history. Offering a step-up from traditional barcoding and universal markers, the genome-wide custom loci targeted by Orchidinae-205 are a valuable new resource to study the evolution, systematics and trade of terrestrial orchids.
Affiliation(s)
- Margaretha A Veltman
- Natural History Museum, Oslo, Norway
- Naturalis Biodiversity Center, Leiden, Netherlands
- Barbara Gravendeel
- Naturalis Biodiversity Center, Leiden, Netherlands
- Radboud Institute for Biological and Environmental Sciences, Radboud University, Nijmegen, Netherlands
3
Feng H, Banerjee AK, Guo W, Yuan Y, Duan F, Ng WL, Zhao X, Liu Y, Li C, Liu Y, Li L, Huang Y. Origin and evolution of a new tetraploid mangrove species in an intertidal zone. Plant Diversity 2024; 46:476-490. [PMID: 39280974 PMCID: PMC11390703 DOI: 10.1016/j.pld.2024.04.007] [Received: 10/24/2023] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 09/18/2024]
Abstract
Polyploidy is a major factor in the evolution of plants, yet we know little about the origin and evolution of polyploidy in intertidal species. This study aimed to identify the evolutionary transitions in three true-mangrove species of the genus Acanthus distributed in the Indo-West Pacific region. For this purpose, we took an integrative approach that combined data on morphology, cytology, climatic niche, phylogeny, and biogeography of 493 samples from 42 geographic sites. Our results show that the Acanthus ilicifolius lineage distributed east of the Thai-Malay Peninsula possesses a tetraploid karyotype, which is morphologically distinct from that of the lineage on the west side. The haplotype networks and phylogenetic trees for the chloroplast genome and eight nuclear genes reveal that the tetraploid species has two sub-genomes, one each from A. ilicifolius and A. ebracteatus, the paternal and maternal parents, respectively. Population structure analysis also supports the hybrid speciation history of the new tetraploid species. The two sub-genomes of the tetraploid species diverged from their diploid progenitors during the Pleistocene. Environmental niche models revealed that the tetraploid species not only occupied the near-entire niche space of the diploids, but also expanded into novel environments. Our findings suggest that A. ilicifolius species distributed on the east side of the Thai-Malay Peninsula should be regarded as a new species, A. tetraploideus, which originated from hybridization between A. ilicifolius and A. ebracteatus, followed by chromosome doubling. This is the first report of a true-mangrove allopolyploid species that can reproduce both sexually and clonally, which explains the long-term adaptive potential of the species.
Affiliation(s)
- Hui Feng
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Achyut Kumar Banerjee
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Wuxia Guo
- Department of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519041, Guangdong, China
- Yang Yuan
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Fuyuan Duan
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Wei Lun Ng
- China-ASEAN College of Marine Sciences, Xiamen University Malaysia, Sepang 43900, Selangor, Malaysia
- Xuming Zhao
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yuting Liu
- School of Agriculture, Sun Yat-sen University, Shenzhen 518107, Guangdong, China
- Chunmei Li
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Ying Liu
- School of Ecology, Sun Yat-sen University, Shenzhen 518107, Guangdong, China
- Linfeng Li
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yelin Huang
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
4
Ahmed Shazib SU, Cote-L’Heureux A, Ahsan R, Muñoz-Gómez SA, Lee J, Katz LA, Shin MK. Phylogeny and species delimitation of ciliates in the genus Spirostomum (Class, Heterotrichea) using single-cell transcriptomes. bioRxiv 2024:2024.05.29.596006. [PMID: 38854132 PMCID: PMC11160781 DOI: 10.1101/2024.05.29.596006] [Indexed: 06/11/2024]
Abstract
Ciliates are single-celled microbial eukaryotes that diverged from other eukaryotic lineages over a billion years ago. The extensive evolutionary timespan of ciliates has led to enormous genetic and phenotypic changes, contributing significantly to their high level of diversity. Recent analyses based on molecular data have revealed numerous cases of cryptic species complexes in different ciliate lineages, demonstrating the need for a robust approach to delimit species boundaries and elucidate phylogenetic relationships. Heterotrich ciliate species of the genus Spirostomum are abundant in freshwater and brackish environments and are commonly used as biological indicators for assessing water quality. However, some Spirostomum species are difficult to identify due to a lack of distinguishable morphological characteristics, and the existence of cryptic species in this genus remains largely unexplored. Previous phylogenetic studies have focused on only a few loci, namely the ribosomal RNA genes, alpha-tubulin, and mitochondrial CO1. In this study, we obtained single-cell transcriptomes of 25 Spirostomum species populations (representing six morphospecies) sampled from South Korea and the USA, and used concatenation- and coalescent-based methods for species tree inference and delimitation. Phylogenomic analysis of 37 Spirostomum populations and 265 protein-coding genes provided robust insight into the evolutionary relationships among Spirostomum species and confirmed that species with moniliform and compact macronuclei each form a distinct monophyletic lineage. Furthermore, the multispecies coalescent (MSC) model suggests that there are at least nine cryptic species in the genus Spirostomum: three in S. minus and two each in S. ambiguum, S. subtilis, and S. teres.
Overall, our fine sampling of closely related Spirostomum populations and wide scRNA-seq allowed us to reveal cryptic species within the genus Spirostomum and to resolve the phylogeny of this important ciliate genus with much stronger support than previously available.
Affiliation(s)
- Shahed Uddin Ahmed Shazib
- Department of Biological Sciences, University of Ulsan, Ulsan 44610, South Korea
- Department of Biological Sciences, Smith College, Northampton, Massachusetts 01063, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Auden Cote-L’Heureux
- Department of Biological Sciences, Smith College, Northampton, Massachusetts 01063, USA
- Ragib Ahsan
- Department of Biological Sciences, University of Ulsan, Ulsan 44610, South Korea
- Department of Biological Sciences, Smith College, Northampton, Massachusetts 01063, USA
- University of Massachusetts Amherst, Program in Organismic and Evolutionary Biology, Amherst, Massachusetts, USA
- Sergio A. Muñoz-Gómez
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- JunMo Lee
- Department of Oceanography, Kyungpook National University, Daegu 41566, South Korea
- Kyungpook Institute of Oceanography, Kyungpook National University, Daegu 41566, South Korea
- Laura A. Katz
- Department of Biological Sciences, Smith College, Northampton, Massachusetts 01063, USA
- University of Massachusetts Amherst, Program in Organismic and Evolutionary Biology, Amherst, Massachusetts, USA
- Mann Kyoon Shin
- Department of Biological Sciences, University of Ulsan, Ulsan 44610, South Korea
5
Zhang Z, Liu G, Li M. Incomplete lineage sorting and gene flow within Allium (Amayllidaceae). Mol Phylogenet Evol 2024; 195:108054. [PMID: 38471599 DOI: 10.1016/j.ympev.2024.108054] [Received: 11/30/2023] [Revised: 02/01/2024] [Accepted: 03/07/2024] [Indexed: 03/14/2024]
Abstract
The phylogeny and systematics of the genus Allium have been studied with a variety of data types, including an increasing amount of molecular data. However, strong phylogenetic discordance and high levels of uncertainty have prevented the identification of a consistent phylogeny. The difficulty in establishing phylogenetic consensus and evidence for genealogical discordance make Allium a compelling test case to assess the relative contribution of incomplete lineage sorting (ILS), gene flow and gene tree estimation error on phylogenetic reconstruction. In this study, we obtained 75 transcriptomes of 38 Allium species across 10 subgenera. Whole plastid genomes, single-copy genes and consensus CDS were generated to estimate phylogenetic trees using both coalescence and concatenation methods. Multiple approaches including coalescence simulation, quartet sampling, reticulate network inference, sequence simulation, theta of ILS and reticulation index were carried out across the CDS gene trees to investigate the degrees of ILS, gene flow and gene tree estimation error. Afterward, a regression analysis was used to test the relative contributions of each of these forms of uncertainty to the final phylogeny. Despite extensive topological discordance among gene trees, we found a fully supported species tree that agrees with most of the well-accepted relationships and establishes monophyly of the genus Allium. We presented clear evidence for substantial ILS across the phylogeny of Allium. Further, we identified two ancient hybridization events for the formation of the second evolutionary line and subg. Butomissa as well as several introgression events between recently diverged species.
Our regression analysis revealed that gene tree inference error and gene flow were the two most dominant factors explaining the overall gene tree variation, although the effects of ILS and gene tree estimation error were difficult to disentangle because of a positive correlation between them. Based on our efforts to mitigate the methodological errors in reconstructing trees, we believe ILS and gene flow are the two principal reasons for the oft-reported phylogenetic heterogeneity of Allium. This study presents a strongly supported and well-resolved phylogenetic backbone for the sampled Allium species, and exemplifies how to untangle heterogeneity in phylogenetic signal and reconstruct the true evolutionary history of the target taxa.
Affiliation(s)
- ZengZhu Zhang
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China
- Gang Liu
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China
- Minjie Li
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China.
6
Pang XX, Zhang DY. Detection of Ghost Introgression Requires Exploiting Topological and Branch Length Information. Syst Biol 2024; 73:207-222. [PMID: 38224495 PMCID: PMC11129598 DOI: 10.1093/sysbio/syad077] [Received: 05/01/2023] [Revised: 12/17/2023] [Accepted: 12/27/2023] [Indexed: 01/17/2024]
Abstract
In recent years, the study of hybridization and introgression has made significant progress, with ghost introgression (the transfer of genetic material from extinct or unsampled lineages to extant species) emerging as a key area for research. Accurately identifying ghost introgression, however, presents a challenge. To address this issue, we focused on simple cases involving 3 species with a known phylogenetic tree. Using mathematical analyses and simulations, we evaluated the performance of popular phylogenetic methods, including HyDe and PhyloNet/MPL, and the full-likelihood method, Bayesian Phylogenetics and Phylogeography (BPP), in detecting ghost introgression. Our findings suggest that heuristic approaches relying on site-pattern counts or gene-tree topologies struggle to differentiate ghost introgression from introgression between sampled non-sister species, frequently leading to incorrect identification of donor and recipient species. By contrast, the full-likelihood method BPP, which uses multilocus sequence alignments directly and hence takes into account both gene-tree topologies and branch lengths, is capable of detecting ghost introgression in phylogenomic datasets. We analyzed a real-world phylogenomic dataset of 14 species of Jaltomata (Solanaceae) to showcase the potential of full-likelihood methods for accurate inference of introgression.
Affiliation(s)
- Xiao-Xu Pang
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Da-Yong Zhang
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
7
Leaché AD, Davis HR, Feldman CR, Fujita MK, Singhal S. Repeated patterns of reptile diversification in Western North America supported by the Northern Alligator Lizard (Elgaria coerulea). J Hered 2024; 115:57-71. [PMID: 37982433 PMCID: PMC10838131 DOI: 10.1093/jhered/esad073] [Received: 07/07/2023] [Accepted: 11/09/2023] [Indexed: 11/21/2023]
Abstract
Understanding the processes that shape genetic diversity by either promoting or preventing population divergence can help identify geographic areas that either facilitate or limit gene flow. Furthermore, broadly distributed species allow us to understand how biogeographic and ecogeographic transitions affect gene flow. We investigated these processes using genomic data in the Northern Alligator Lizard (Elgaria coerulea), which is widely distributed in Western North America across diverse ecoregions (California Floristic Province and Pacific Northwest) and mountain ranges (Sierra Nevada, Coastal Ranges, and Cascades). We collected single-nucleotide polymorphism data from 120 samples of E. coerulea. Biogeographic analyses of squamate reptiles with similar distributions have identified several shared diversification patterns that provide testable predictions for E. coerulea, including deep genetic divisions in the Sierra Nevada, demographic stability of southern populations, and recent post-Pleistocene expansion into the Pacific Northwest. We use genomic data to test these predictions by estimating the structure, connectivity, and phylogenetic history of populations. At least 10 distinct populations are supported, with mixed-ancestry individuals situated at most population boundaries. A species tree analysis provides strong support for the early divergence of populations in the Sierra Nevada Mountains and recent diversification into the Pacific Northwest. Admixture and migration analyses detect gene flow among populations in the Lower Cascades and Northern California, and a spatial analysis of gene flow identified significant barriers to gene flow across both the Sierra Nevada and Coast Ranges. The distribution of genetic diversity in E. coerulea is uneven, patchy, and interconnected at population boundaries. The biogeographic patterns seen in E. coerulea are consistent with predictions from co-distributed species.
Affiliation(s)
- Adam D Leaché
- Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, United States
- Hayden R Davis
- Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, United States
- Chris R Feldman
- Department of Biology and Program in Ecology, Evolution and Conservation Biology, University of Nevada, Reno, NV, United States
- Matthew K Fujita
- Department of Biology, The University of Texas at Arlington, Arlington, TX, United States
- Sonal Singhal
- Department of Biology, California State University – Dominguez Hills, Carson, CA, United States
8
Németh A, Mizsei E, Laczkó L, Czabán D, Hegyeli Z, Lengyel S, Csorba G, Sramkó G. Evolutionary history and systematics of European blind mole rats (Rodentia: Spalacidae: Nannospalax): Multilocus phylogeny and species delimitation in a puzzling group. Mol Phylogenet Evol 2024; 190:107958. [PMID: 37914032 DOI: 10.1016/j.ympev.2023.107958] [Received: 04/26/2023] [Revised: 10/25/2023] [Accepted: 10/26/2023] [Indexed: 11/03/2023]
Abstract
Species delimitation is a powerful approach to assist taxonomic decisions in challenging taxa where species boundaries are hard to establish. European taxa of the blind mole rats (genus Nannospalax) display small morphological differences and complex chromosomal evolution at a shallow evolutionary divergence level. Previous analyses led to the recognition of 25 'forms' in their distribution area. We provide a comprehensive framework to improve knowledge on the evolutionary history and revise the taxonomy of European blind mole rats based on samples from all but three of the 25 forms. We sequenced two nuclear-encoded genetic regions and the whole mitochondrial cytochrome b gene for phylogenetic tree reconstructions using concatenation and coalescence-based species-tree estimations. The phylogenetic analyses confirmed that Aegean N. insularis belongs to N. superspecies xanthodon, and that it represents the second known species of this superspecies in Europe. Mainland taxa reached Europe from Asia Minor in two colonisation events corresponding to two superspecies-level taxa: N. superspecies monticola (taxon established herewith) reached Europe c. 2.1 million years ago (Mya) and was followed by N. superspecies leucodon (re-defined herewith) c. 1.5 Mya. Species delimitation allowed the clarification of the taxonomic contents of the above superspecies. N. superspecies monticola contains three species geographically confined to the western periphery of the distribution of blind mole rats, whereas N. superspecies leucodon is more speciose with six species and several additional subspecies. The observed geographic pattern hints at a robust peripatric speciation process and rapid chromosomal evolution. The present treatment is thus regarded as the minimum taxonomic content of each lineage, which can be further refined based on other sources of information such as karyological traits, crossbreeding experiments, etc. 
The species delimitation models also allowed the recognition of a hitherto unnamed blind mole rat taxon from Albania, described here as a new subspecies.
Affiliation(s)
- Attila Németh
- Department of Nature Conservation, Zoology and Game Management, University of Debrecen, Böszörményi u. 138, H-4032 Debrecen, Hungary; BirdLife Hungary - Hungarian Ornithological and Nature Conservation Society, Költő u. 21, H-1121 Budapest, Hungary
- Edvárd Mizsei
- Department of Ecology, University of Debrecen, Egyetem tér 1, H-4032 Debrecen, Hungary; DRI Conservation Ecology Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Bem tér 18/C, H-4026 Debrecen, Hungary
- Levente Laczkó
- Evolutionary Genomics Research Group, Department of Botany, University of Debrecen, Egyetem tér 1, H-4032 Debrecen, Hungary; HUN-REN-UD Conservation Biology Research Group, Egyetem tér 1, H-4032 Debrecen, Hungary
- Zsolt Hegyeli
- Milvus Group Bird and Nature Protection Association, Crinului St. 22, 540343 Târgu Mureş, Romania
- Szabolcs Lengyel
- DRI Conservation Ecology Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Bem tér 18/C, H-4026 Debrecen, Hungary
- Gábor Csorba
- Hungarian Natural History Museum, Baross u. 13, H-1088 Budapest, Hungary.
- Gábor Sramkó
- Evolutionary Genomics Research Group, Department of Botany, University of Debrecen, Egyetem tér 1, H-4032 Debrecen, Hungary; HUN-REN-UD Conservation Biology Research Group, Egyetem tér 1, H-4032 Debrecen, Hungary
9
McGuire JA, Huang X, Reilly SB, Iskandar DT, Wang-Claypool CY, Werning S, Chong RA, Lawalata SZS, Stubbs AL, Frederick JH, Brown RM, Evans BJ, Arifin U, Riyanto A, Hamidy A, Arida E, Koo MS, Supriatna J, Andayani N, Hall R. Species Delimitation, Phylogenomics, and Biogeography of Sulawesi Flying Lizards: A Diversification History Complicated by Ancient Hybridization, Cryptic Species, and Arrested Speciation. Syst Biol 2023; 72:885-911. [PMID: 37074804 PMCID: PMC10405571 DOI: 10.1093/sysbio/syad020] [Received: 07/16/2022] [Revised: 03/14/2023] [Accepted: 04/13/2023] [Indexed: 04/20/2023]
Abstract
The biota of Sulawesi is noted for its high degree of endemism and for its substantial levels of in situ biological diversification. While the island's long period of isolation and dynamic tectonic history have been implicated as drivers of the regional diversification, this has rarely been tested in the context of an explicit geological framework. Here, we provide a tectonically informed biogeographical framework that we use to explore the diversification history of Sulawesi flying lizards (the Draco lineatus Group), a radiation that is endemic to Sulawesi and its surrounding islands. We employ a framework for inferring cryptic speciation that involves phylogeographic and genetic clustering analyses as a means of identifying potential species followed by population demographic assessment of divergence-timing and rates of bi-directional migration as means of confirming lineage independence (and thus species status). Using this approach, phylogenetic and population genetic analyses of mitochondrial sequence data obtained for 613 samples, a 50-SNP data set for 370 samples, and a 1249-locus exon-capture data set for 106 samples indicate that the current taxonomy substantially understates the true number of Sulawesi Draco species, that both cryptic and arrested speciations have taken place, and that ancient hybridization confounds phylogenetic analyses that do not explicitly account for reticulation. The Draco lineatus Group appears to comprise 15 species: 9 on Sulawesi proper and 6 on peripheral islands. The common ancestor of this group colonized Sulawesi ~11 Ma when proto-Sulawesi was likely composed of two ancestral islands, and began to radiate ~6 Ma as new islands formed and were colonized via overwater dispersal.
The enlargement and amalgamation of many of these proto-islands into modern Sulawesi, especially during the past 3 Ma, set in motion dynamic species interactions as once-isolated lineages came into secondary contact, some of which resulted in lineage merger, and others surviving to the present. [Genomics; Indonesia; introgression; mitochondria; phylogenetics; phylogeography; population genetics; reptiles.].
Affiliation(s)
- Jimmy A McGuire
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- Xiaoting Huang
- College of Marine Life Sciences, Ocean University of China, No. 5 Yushan Road, Qingdao, Shandong, 266003, PR China
- Sean B Reilly
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95060, USA
- Djoko T Iskandar
- School of Life Sciences and Technology, Institut Teknologi Bandung, Bandung, Indonesia
- Cynthia Y Wang-Claypool
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Sarah Werning
- Department of Anatomy, Des Moines University, 3200 Grand Avenue, Des Moines, IA 50312-4198, USA
| | - Rebecca A Chong
- Department of Biology, University of Hawaii at Manoa, Honolulu, HI 96822, USA
| | - Shobi Z S Lawalata
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- United in Diversity Foundation, Jalan Hayam Wuruk, Jakarta, Indonesia
| | - Alexander L Stubbs
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Jeffrey H Frederick
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Rafe M Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, 1345 Jayhawk Blvd., University of Kansas, Lawrence, KS 66045, USA
| | - Ben J Evans
- Biology Department, McMaster University, Hamilton, Ontario, Canada
| | - Umilaela Arifin
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
- School of Life Sciences and Technology, Institut Teknologi Bandung, Bandung, Indonesia
- Center for Taxonomy and Morphology, Zoologisches Museum Hamburg, Leibniz Institute for the Analysis of Biodiversity Change, Martin-Luther-King-Platz 3, R230 20146 Hamburg, Germany
| | - Awal Riyanto
- Laboratory of Herpetology, Museum Zoologicum Bogoriense, Research Center for Biosystematics and Evolution, National Research and Innovation Agency of Indonesia (BRIN), Cibinong 16911, Indonesia
| | - Amir Hamidy
- Laboratory of Herpetology, Museum Zoologicum Bogoriense, Research Center for Biosystematics and Evolution, National Research and Innovation Agency of Indonesia (BRIN), Cibinong 16911, Indonesia
| | - Evy Arida
- Research Center for Applied Zoology, National Research and Innovation Agency of Indonesia (BRIN), Cibinong 16911, Indonesia
| | - Michelle S Koo
- Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720, USA
| | - Jatna Supriatna
- Department of Biology, Institute for Sustainable Earth and Resources (I-SER), Gedung Laboratorium Multidisiplin, and Research Center for Climate Change (RCCC-UI), Gedung Laboratorium Multidisiplin, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
| | - Noviar Andayani
- Department of Biology, Institute for Sustainable Earth and Resources (I-SER), Gedung Laboratorium Multidisiplin, and Research Center for Climate Change (RCCC-UI), Gedung Laboratorium Multidisiplin, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
| | - Robert Hall
- SE Asia Research Group (SEARG), Department of Earth Sciences, Royal Holloway University of London, Egham, Surrey TW20 0EX, UK
10
Adams R, DeGiorgio M. Likelihood-Based Tests of Species Tree Hypotheses. Mol Biol Evol 2023; 40:msad159. [PMID: 37440530 PMCID: PMC10368450 DOI: 10.1093/molbev/msad159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 06/20/2023] [Accepted: 07/06/2023] [Indexed: 07/15/2023] Open
Abstract
Likelihood-based tests of phylogenetic trees are a foundation of modern systematics. Over the past decade, an enormous wealth and diversity of model-based approaches have been developed for phylogenetic inference of both gene trees and species trees. However, while many techniques exist for conducting formal likelihood-based tests of gene trees, such frameworks are comparatively underdeveloped and underutilized for testing species tree hypotheses. To date, widely used tests of tree topology are designed to assess the fit of classical models of molecular sequence data and individual gene trees and thus are not readily applicable to the problem of species tree inference. To address this issue, we derive several analogous likelihood-based approaches for testing topologies using modern species tree models and heuristic algorithms that use gene tree topologies as input for maximum likelihood estimation under the multispecies coalescent. For the purpose of comparing support for species trees, these tests leverage the statistical procedures of their original gene tree-based counterparts that have an extended history for testing phylogenetic hypotheses at a single locus. We discuss and demonstrate a number of applications, limitations, and important considerations of these tests using simulated and empirical phylogenomic data sets that include both bifurcating topologies and reticulate network models of species relationships. Finally, we introduce the open-source R package SpeciesTopoTestR (SpeciesTopology Tests in R) that includes a suite of functions for conducting formal likelihood-based tests of species topologies given a set of input gene tree topologies.
Affiliation(s)
- Richard Adams
- Agricultural Statistics Laboratory, University of Arkansas, Fayetteville, AR
- Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
11
Ji J, Jackson DJ, Leaché AD, Yang Z. Power of Bayesian and Heuristic Tests to Detect Cross-Species Introgression with Reference to Gene Flow in the Tamias quadrivittatus Group of North American Chipmunks. Syst Biol 2023; 72:446-465. [PMID: 36504374 PMCID: PMC10275556 DOI: 10.1093/sysbio/syac077] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/07/2021] [Revised: 11/15/2022] [Accepted: 12/01/2022] [Indexed: 10/25/2023] Open
Abstract
In the past two decades, genomic data have been widely used to detect historical gene flow between species in a variety of plants and animals. The Tamias quadrivittatus group of North American chipmunks, which originated through a series of rapid speciation events, are known to undergo massive amounts of mitochondrial introgression. Yet in a recent analysis of targeted nuclear loci from the group, no evidence for cross-species introgression was detected, indicating widespread cytonuclear discordance. The study used the heuristic method HYDE to detect gene flow, which may suffer from low power. Here we use the Bayesian method implemented in the program BPP to re-analyze these data. We develop a Bayesian test of introgression, calculating the Bayes factor via the Savage-Dickey density ratio using the Markov chain Monte Carlo (MCMC) sample under the model of introgression. We take a stepwise approach to constructing an introgression model by adding introgression events onto a well-supported binary species tree. The analysis detected robust evidence for multiple ancient introgression events affecting the nuclear genome, with introgression probabilities reaching 63%. We estimate population parameters and highlight the fact that species divergence times may be seriously underestimated if ancient cross-species gene flow is ignored in the analysis. We examine the assumptions and performance of HYDE and demonstrate that it lacks power if gene flow occurs between sister lineages or if the mode of gene flow does not match the assumed hybrid-speciation model with symmetrical population sizes. Our analyses highlight the power of likelihood-based inference of cross-species gene flow using genomic sequence data. [Bayesian test; BPP; chipmunks; introgression; MSci; multispecies coalescent; Savage-Dickey density ratio.].
Affiliation(s)
- Jiayi Ji
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Donavan J Jackson
- Department of Biology and Burke Museum of Natural History and Culture, University of Washington, Box 351800, Seattle, WA 98195-1800, USA
- Adam D Leaché
- Department of Biology and Burke Museum of Natural History and Culture, University of Washington, Box 351800, Seattle, WA 98195-1800, USA
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
12
Gorring PS, Farrell BD. Evaluating species boundaries using coalescent delimitation in pine-killing Monochamus (Coleoptera: Cerambycidae) sawyer beetles. Mol Phylogenet Evol 2023; 184:107777. [PMID: 36990304 DOI: 10.1016/j.ympev.2023.107777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2021] [Revised: 02/18/2023] [Accepted: 03/24/2023] [Indexed: 03/30/2023]
Abstract
Plant-feeding beetle species are diverse and often individually highly variable. Accurate classifications can be difficult to establish yet are essential for study of evolutionary patterns and processes. Molecular data are key to further characterizing morphologically difficult groups and defining genus and species boundaries. Monochamus Dejean species are ecologically and economically significant, and in coniferous forests they vector the nematode that causes Pine Wilt Disease. This study uses nuclear and mitochondrial genes to test the monophyly and relationships of Monochamus and applies coalescent methods to further delimit the conifer-feeding species. Monochamus has also included approximately 120 Old World species associated with diverse angiosperm tree species. We sample from these additional morphologically diverse species to determine their placement in the Lamiini. Through supermatrix and coalescent methods, the higher-level relationships of Monochamus show that conifer-feeders are a monophyletic group that includes the type species and has split into Nearctic and Palearctic clades. Molecular dating indicates a single dispersal of conifer-feeders to North America over the second Bering Land Bridge circa 5.3 Ma. All other Monochamus sampled fall in different parts of the Lamiini tree. Small-bodied angiosperm-feeding Monochamus group with the monotypic genus Microgoes Casey. The African Monochamus subgenera sampled are distantly related to the conifer-feeding clade. The multispecies coalescent delimitation methods BPP and STACEY delimit 17 conifer-feeding Monochamus species for a total of 18 species, and support the retention of all current species. An interrogation with nuclear gene allele phasing reveals that unphased data can be unreliable for accurate delimitations and divergence times. The delimited species are discussed with integrative evidence, highlighting real-world challenges in recognizing the completion of speciation trajectories.
Affiliation(s)
- Patrick S Gorring
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford St., Cambridge, MA, USA
- Brian D Farrell
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford St., Cambridge, MA, USA
13
Next-generation sequencing data show rapid radiation and several long-distance dispersal events in early Costaceae. Mol Phylogenet Evol 2023; 179:107664. [PMID: 36403710 DOI: 10.1016/j.ympev.2022.107664] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/17/2020] [Revised: 02/12/2022] [Accepted: 11/02/2022] [Indexed: 11/18/2022]
Abstract
The monocot family Costaceae Nakai consists of seven genera but their mutual relationships have not been satisfactorily resolved in previous studies employing classical molecular markers. Phylogenomic analyses of 365 nuclear genes and nearly-complete plastome data provide almost fully resolved insights into their diversification. Paracostus is identified as sister to all other taxa, followed by several very short branches leading to discrete lineages, suggesting an ancient rapid radiation of these early lineages and leaving the exact relationships among them unresolved. Relationships among Chamaecostus, Dimerocostus and Monocostus confirmed earlier findings that these genera form a monophyletic group. The Afro-American Costus is also monophyletic. By contrast, Tapeinochilos appeared as a well-supported crown lineage of Cheilocostus rendering it paraphyletic. As these two genera differ morphologically from one another owing to a shift from insect- to bird-pollination, we propose to keep both names. Divergence times within Costaceae, estimated using penalized likelihood with two fossils within Zingiberales, †Spirematospermum chandlerae and †Ensete oregonense, indicated a relatively recent diversification of Costaceae, between 18 and 9 Mya. Based on these data, the current pantropical distribution of the family is hypothesized to be the result of several long-distance intercontinental dispersal events, which do not correlate with global geoclimatic changes.
14
Stubbs RL, Theodoridis S, Mora‐Carrera E, Keller B, Yousefi N, Potente G, Léveillé‐Bourret É, Celep F, Kochjarová J, Tedoradze G, Eaton DAR, Conti E. Whole-genome analyses disentangle reticulate evolution of primroses in a biodiversity hotspot. The New Phytologist 2023; 237:656-671. [PMID: 36210520 PMCID: PMC10099377 DOI: 10.1111/nph.18525] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/06/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Biodiversity hotspots, such as the Caucasus mountains, provide unprecedented opportunities for understanding the evolutionary processes that shape species diversity and richness. Therefore, we investigated the evolution of Primula sect. Primula, a clade with a high degree of endemism in the Caucasus. We performed phylogenetic and network analyses of whole-genome resequencing data from the entire nuclear genome, the entire chloroplast genome, and the entire heterostyly supergene. The different characteristics of the genomic partitions and the resulting phylogenetic incongruences enabled us to disentangle evolutionary histories resulting from tokogenetic vs cladogenetic processes. We provide the first phylogeny inferred from the heterostyly supergene that includes all species of Primula sect. Primula. Our results identified recurrent admixture at deep nodes between lineages in the Caucasus as the cause of non-monophyly in Primula. Biogeographic analyses support the 'out-of-the-Caucasus' hypothesis, emphasizing the importance of this hotspot as a cradle for biodiversity. Our findings provide novel insights into causal processes of phylogenetic discordance, demonstrating that genome-wide analyses from partitions with contrasting genetic characteristics and broad geographic sampling are crucial for disentangling the diversification of species-rich clades in biodiversity hotspots.
Affiliation(s)
- Rebecca L. Stubbs
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
- Spyros Theodoridis
- Senckenberg Biodiversity and Climate Research Centre (SBiK‐F), Frankfurt am Main 60325, Germany
- Emiliano Mora‐Carrera
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
- Barbara Keller
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
- Narjes Yousefi
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
- Giacomo Potente
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
- Étienne Léveillé‐Bourret
- Département de Sciences Biologiques, Institut de Recherche en Biologie Végétale (IRBV), Université de Montréal, Québec H1X 2B2, Canada
- Ferhat Celep
- Department of Biology, Faculty of Arts and Sciences, Kırıkkale University, Kırıkkale 71450, Turkey
- Judita Kochjarová
- Department of Phytology, Faculty of Forestry, Technical University in Zvolen, Zvolen 96001, Slovak Republic
- Giorgi Tedoradze
- Department of Plant Systematics and Geography, Institute of Botany, Ilia State University, Tbilisi 0105, Georgia
- Deren A. R. Eaton
- Department of Ecology, Evolution and Environmental Biology, Columbia University, New York, NY 10027, USA
- Elena Conti
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich 8008, Switzerland
15
Flouri T, Huang J, Jiao X, Kapli P, Rannala B, Yang Z. Bayesian phylogenetic inference using relaxed-clocks and the multispecies coalescent. Mol Biol Evol 2022; 39:6652437. [PMID: 35907248 PMCID: PMC9366188 DOI: 10.1093/molbev/msac161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
The multispecies coalescent (MSC) model accommodates both species divergences and within-species coalescent and provides a natural framework for phylogenetic analysis of genomic data when the gene trees vary across the genome. The MSC model implemented in the program bpp assumes a molecular clock and the Jukes–Cantor model, and is suitable for analyzing genomic data from closely related species. Here we extend our implementation to more general substitution models and relaxed clocks to allow the rate to vary among species. The MSC-with-relaxed-clock model allows the estimation of species divergence times and ancestral population sizes using genomic sequences sampled from contemporary species when the strict clock assumption is violated, and provides a simulation framework for evaluating species tree estimation methods. We conducted simulations and analyzed two real datasets to evaluate the utility of the new models. We confirm that the clock-JC model is adequate for inference of shallow trees with closely related species, but it is important to account for clock violation for distant species. Our simulation suggests that there is valuable phylogenetic information in the gene-tree branch lengths even if the molecular clock assumption is seriously violated, and the relaxed-clock models implemented in bpp are able to extract such information. Our Markov chain Monte Carlo algorithms suffer from mixing problems when used for species tree estimation under the relaxed clock and we discuss possible improvements. We conclude that the new models are currently most effective for estimating population parameters such as species divergence times when the species tree is fixed.
Affiliation(s)
- Tomáš Flouri
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Jun Huang
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK; School of Biomedical Engineering, Capital Medical University, Beijing, 100069, China
- Xiyun Jiao
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Statistics and Data Science, China Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
- Paschalia Kapli
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Bruce Rannala
- Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Ziheng Yang
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
16
Out of chaos: Phylogenomics of Asian Sonerileae. Mol Phylogenet Evol 2022; 175:107581. [PMID: 35810973 DOI: 10.1016/j.ympev.2022.107581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Revised: 05/23/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
Sonerileae is a diverse Melastomataceae lineage comprising ca. 1000 species in 44 genera, with >70% of genera and species distributed in Asia. Asian Sonerileae are taxonomically intractable with obscure generic circumscriptions. The backbone phylogeny of this group remains poorly resolved, possibly due to complexity caused by rapid species radiation in early and middle Miocene, which hampers further systematic study. Here, we used genome resequencing data to reconstruct the phylogeny of Asian Sonerileae. Three parallel datasets, viz. single-copy ortholog (SCO), genomic SNPs, and whole plastome, were assembled from genome resequencing data of 205 species for this purpose. Based on these genome-scale data, we provided the first well resolved phylogeny of Asian Sonerileae, with 34 major clades identified and 74% of the interclade relationships consistently resolved by both SCO and genomic data. Meanwhile, widespread phylogenetic discordance was detected among SCO gene trees as well as species trees reconstructed using different tree estimation methods (concatenation/site-based coalescent method/summary method) or different datasets (SCO/genomic/plastome). We explored sources of discordance using multiple approaches and found that the observed discordance in Asian Sonerileae was mainly caused by a combination of biased distribution of missing data, random noise from uninformative genes, incomplete lineage sorting, and hybridization/introgression. Exploration of these sources can enable us to generate hypotheses for future testing, which is the first step towards understanding the evolution of Asian Sonerileae. We also detected high levels of homoplasy for some characters traditionally used in taxonomy, which explains current chaotic generic delimitations. The backbone phylogeny of Asian Sonerileae revealed in this study offers a solid basis for future taxonomic revision at the generic level.
17
Zhu T, Flouri T, Yang Z. A simulation study to examine the impact of recombination on phylogenomic inferences under the multispecies coalescent model. Mol Ecol 2022; 31:2814-2829. [PMID: 35313033 PMCID: PMC9321900 DOI: 10.1111/mec.16433] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/17/2021] [Revised: 01/25/2022] [Accepted: 02/28/2022] [Indexed: 11/28/2022]
Affiliation(s)
- Tianqi Zhu
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
18
Sanderson MJ, Búrquez A, Copetti D, McMahon MM, Zeng Y, Wojciechowski MF. Origin and diversification of the saguaro cactus (Carnegiea gigantea): a within-species phylogenomic analysis. Syst Biol 2022; 71:1178-1194. [PMID: 35244183 DOI: 10.1093/sysbio/syac017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2020] [Revised: 02/18/2022] [Accepted: 02/25/2022] [Indexed: 11/14/2022] Open
Abstract
Reconstructing accurate historical relationships within a species poses numerous challenges, not least in many plant groups in which gene flow is high enough to extend well beyond species boundaries. Nonetheless, the extent of tree-like history within a species is an empirical question on which it is now possible to bring large amounts of genome sequence to bear. We assess phylogenetic structure across the geographic range of the saguaro cactus, an emblematic member of Cactaceae, a clade known for extensive hybridization and porous species boundaries. Using 200 Gb of whole genome resequencing data from 20 individuals sampled from 10 localities, we assembled two data sets comprising 150,000 biallelic single nucleotide polymorphisms (SNPs) from protein coding sequences. From these we inferred within-species trees and evaluated their significance and robustness using five qualitatively different inference methods. Despite the low sequence diversity, large census population sizes, and presence of wide-ranging pollen and seed dispersal agents, phylogenetic trees were well resolved and highly consistent across both data sets and all methods. We inferred that the most likely root, based on marginal likelihood comparisons, is to the east and south of the region of highest genetic diversity, which lies along the coast of the Gulf of California in Sonora, Mexico. Together with striking decreases in marginal likelihood found to the north, this supports hypotheses that saguaro's current range reflects post-glacial expansion from the refugia in the south of its range. We conclude with observations about practical and theoretical issues raised by phylogenomic data sets within species, in which SNP-based methods must be used rather than gene tree methods that are widely used when sequence divergence is higher. These include computational scalability, inference of gene flow, and proper assessment of statistical support in the presence of linkage effects.
Affiliation(s)
- Michael J Sanderson
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Alberto Búrquez
- Instituto de Ecología, Unidad Hermosillo, Universidad Nacional Autónoma de México, Hermosillo, Sonora, Mexico
- Dario Copetti
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721 USA
- Yichao Zeng
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
19
McLean BS, Bell KC, Cook JA. SNP-based Phylogenomic Inference in Holarctic Ground Squirrels (Urocitellus). Mol Phylogenet Evol 2022; 169:107396. [PMID: 35031463 DOI: 10.1016/j.ympev.2022.107396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Revised: 12/02/2021] [Accepted: 12/08/2021] [Indexed: 11/24/2022]
Abstract
Resolution of rapid evolutionary radiations requires harvesting maximal signal from phylogenomic datasets. However, studies of non-model clades often target conserved loci that are characterized by reduced information content, which can negatively affect gene tree precision and species tree accuracy. Single nucleotide polymorphism (SNP)-based methods are an underutilized but potentially valuable tool for estimating phylogeny and divergence times because they do not rely on resolved gene trees, allowing information from many or all variant loci to be leveraged in species tree reconstruction. We evaluated the utility of SNP-based methods in resolving phylogeny of Holarctic ground squirrels (Urocitellus), a radiation that has been difficult to disentangle, even in prior phylogenomic studies. We inferred phylogeny from a dataset of >3,000 ultraconserved element loci (UCEs) using two methods (SNAPP, SVDquartets) and compared our results with a new mitogenome phylogeny. We also systematically evaluated how phasing of UCEs improves per-locus information content, and inference of topology and other parameters within each of these SNP-based methods. Phasing improved topological resolution and branch length estimation at shallow levels (within species complexes), but less so at deeper levels, likely reflecting true uncertainty due to ancestral polymorphisms segregating in these rapidly diverging lineages. We resolved several key clades in Urocitellus and present targeted opportunities for future phylogenomic inquiry. Our results extend the roadmap for use of SNPs to address vertebrate radiations and support comparative analyses at multiple temporal scales.
Affiliation(s)
- Bryan S McLean
- University of North Carolina Greensboro, Department of Biology, Greensboro, NC 27402 USA.
- Kayce C Bell
- Natural History Museum of Los Angeles County, Department of Mammalogy, Los Angeles, CA 90007 USA.
- Joseph A Cook
- University of New Mexico, Department of Biology and Museum of Southwestern Biology, Albuquerque, NM 87131 USA.
20
Jiao X, Flouri T, Yang Z. Multispecies coalescent and its applications to infer species phylogenies and cross-species gene flow. Natl Sci Rev 2022; 8:nwab127. [PMID: 34987842 PMCID: PMC8692950 DOI: 10.1093/nsr/nwab127] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/02/2021] [Revised: 07/10/2021] [Accepted: 07/11/2021] [Indexed: 02/06/2023] Open
Abstract
Multispecies coalescent (MSC) is the extension of the single-population coalescent model to multiple species. It integrates the phylogenetic process of species divergences and the population genetic process of coalescent, and provides a powerful framework for a number of inference problems using genomic sequence data from multiple species, including estimation of species divergence times and population sizes, estimation of species trees accommodating discordant gene trees, inference of cross-species gene flow and species delimitation. In this review, we introduce the major features of the MSC model, discuss full-likelihood and heuristic methods of species tree estimation and summarize recent methodological advances in inference of cross-species gene flow. We discuss the statistical and computational challenges in the field and research directions where breakthroughs may be likely in the next few years.
Affiliation(s)
- Xiyun Jiao
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
21
Borges R, Boussau B, Szöllősi GJ, Kosiol C. Nucleotide Usage Biases Distort Inferences of the Species Tree. Genome Biol Evol 2022; 14:6496956. [PMID: 34983052 PMCID: PMC8829901 DOI: 10.1093/gbe/evab290] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Accepted: 12/27/2021] [Indexed: 12/15/2022] Open
Abstract
Despite the importance of natural selection in species’ evolutionary history, phylogenetic methods that take into account population-level processes typically ignore selection. The assumption of neutrality is often based on the idea that selection occurs at a minority of loci in the genome and is unlikely to compromise phylogenetic inferences significantly. However, genome-wide processes like GC-bias and some variation segregating at the coding regions are known to evolve in the nearly neutral range. As we are now using genome-wide data to estimate species trees, it is natural to ask whether weak but pervasive selection is likely to blur species tree inferences. We developed a polymorphism-aware phylogenetic model tailored for measuring signatures of nucleotide usage biases to test the impact of selection in the species tree. Our analyses indicate that although the inferred relationships among species are not significantly compromised, the genetic distances are systematically underestimated in a node-height-dependent manner: that is, the deeper nodes tend to be more underestimated than the shallow ones. Such biases have implications for molecular dating. We dated the evolutionary history of 30 worldwide fruit fly populations, and we found signatures of GC-bias considerably affecting the estimated divergence times (up to 23%) in the neutral model. Our findings call for the need to account for selection when quantifying divergence or dating species evolution.
Affiliation(s)
- Rui Borges
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
| | - Bastien Boussau
- Université de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5558, LBBE, Villeurbanne, France
| | - Gergely J Szöllősi
- Department of Biological Physics, Eötvös University, Budapest, Hungary; MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary; Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany, Hungary
| | - Carolin Kosiol
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria; Centre for Biological Diversity, University of St Andrews, St Andrews, United Kingdom
| |
|
22
|
Thawornwattana Y, Seixas FA, Yang Z, Mallet J. Full-Likelihood Genomic Analysis Clarifies a Complex History of Species Divergence and Introgression: The Example of the erato-sara Group of Heliconius Butterflies. Syst Biol 2022; 71:1159-1177. [PMID: 35169847 PMCID: PMC9366460 DOI: 10.1093/sysbio/syac009] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/07/2021] [Revised: 02/01/2022] [Accepted: 02/08/2022] [Indexed: 11/21/2022]
Abstract
Introgressive hybridization plays a key role in adaptive evolution and species diversification in many groups of species. However, frequent hybridization and gene flow between species make estimation of the species phylogeny and key population parameters challenging. Here, we show that by accounting for phasing and using full-likelihood methods, introgression histories and population parameters can be estimated reliably from whole-genome sequence data. We employ the multispecies coalescent (MSC) model with and without gene flow to infer the species phylogeny and cross-species introgression events using genomic data from six members of the erato-sara clade of Heliconius butterflies. The methods naturally accommodate random fluctuations in genealogical history across the genome due to deep coalescence. To avoid heterozygote phasing errors in haploid sequences commonly produced by genome assembly methods, we process and compile unphased diploid sequence alignments and use analytical methods to average over uncertainties in heterozygote phase resolution. There is robust evidence for introgression across the genome, both among distantly related species deep in the phylogeny and between sister species in shallow parts of the tree. We obtain chromosome-specific estimates of key population parameters such as introgression directions, times and probabilities, as well as species divergence times and population sizes for modern and ancestral species. We confirm ancestral gene flow between the sara clade and an ancestral population of Heliconius telesiphe, a likely hybrid speciation origin for Heliconius hecalesia, and gene flow between the sister species Heliconius erato and Heliconius himera. Inferred introgression among ancestral species also explains the history of two chromosomal inversions deep in the phylogeny of the group. 
This study illustrates how a full-likelihood approach based on the MSC makes it possible to extract rich historical information of species divergence and gene flow from genomic data. [3s; bpp; gene flow; Heliconius; hybrid speciation; introgression; inversion; multispecies coalescent]
Affiliation(s)
- Yuttapong Thawornwattana
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA (corresponding author)
| | - Fernando A Seixas
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK (corresponding author)
| | - James Mallet
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA (corresponding author)
| |
|
23
|
How challenging RADseq data turned out to favor coalescent-based species tree inference. A case study in Aichryson (Crassulaceae). Mol Phylogenet Evol 2021; 167:107342. [PMID: 34785384 DOI: 10.1016/j.ympev.2021.107342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/03/2020] [Revised: 07/05/2021] [Accepted: 10/29/2021] [Indexed: 12/24/2022]
Abstract
Analysing multiple genomic regions while incorporating detection and qualification of discordance among regions has become standard for understanding phylogenetic relationships. In plants, which usually have comparatively large genomes, this is feasible by the combination of reduced-representation library (RRL) methods and high-throughput sequencing, enabling the cost-effective acquisition of genomic data for thousands of loci from hundreds of samples. One popular RRL method is RADseq. A major disadvantage of established RADseq approaches is the rather short fragment and sequencing range, leading to loci of little individual phylogenetic information. This issue hampers the application of coalescent-based species tree inference. The modified RADseq protocol presented here, which targets ca. 5,000 loci of 300-600 nt length sequenced with the latest short-read-sequencing (SRS) technology, has the potential to overcome this drawback. To illustrate the advantages of this approach we use the study group Aichryson Webb & Berthelot (Crassulaceae), a plant genus that diversified on the Canary Islands. The data analysis approach used here aims at a careful quality control of the long loci dataset. It involves an informed selection of thresholds for accurate clustering, a thorough exploration of locus properties, such as locus length, coverage and variability, to identify potentially biased data, and a comparative phylogenetic inference of filtered datasets, accompanied by an evaluation of resulting BS support, gene and site concordance factor values, to improve overall resolution of the resulting phylogenetic trees. The final dataset contains variable loci with an average length of 373 nt and facilitates species tree estimation using a coalescent-based summary approach. Additional improvements brought by the approach are critically discussed.
|
24
|
Unmack PJ, Adams M, Hammer MP, Johnson JB, Gruber B, Gilles A, Young M, Georges A. Plotting for change: an analytical framework to aid decisions on which lineages are candidate species in phylogenomic species discovery. Biol J Linn Soc Lond 2021. [DOI: 10.1093/biolinnean/blab095] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
A recent study argued that coalescent-based models of species delimitation mostly delineate population structure, not species, and called for the validation of candidate species using biological information additional to the genetic information, such as phenotypic or ecological data. Here, we introduce a framework to interrogate genomic datasets and coalescent-based species trees for the presence of candidate species in situations where additional biological data are unavailable, unobtainable or uninformative. For de novo genomic studies of species boundaries, we propose six steps: (1) visualize genetic affinities among individuals to identify both discrete and admixed genetic groups from first principles and to hold aside individuals involved in contemporary admixture for independent consideration; (2) apply phylogenetic techniques to identify lineages; (3) assess diagnosability of those lineages as potential candidate species; (4) interpret the diagnosable lineages in a geographical context (sympatry, parapatry, allopatry); (5) assess significance of difference or trends in the context of sampling intensity; and (6) adopt a holistic approach to available evidence to inform decisions on species status in the difficult cases of allopatry. We adopt this approach to distinguish candidate species from within-species lineages for a widespread species complex of Australian freshwater fishes (Retropinna spp.). Our framework addresses two cornerstone issues in systematics that are often not discussed explicitly in genomic species discovery: diagnosability and how to determine it, and what criteria should be used to decide whether diagnosable lineages are conspecific or represent different species.
Affiliation(s)
- Peter J Unmack
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Centre for Applied Water Science, Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Department of Biology, Brigham Young University, Provo, UT, USA
| | - Mark Adams
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
- Department of Biological Sciences, University of Adelaide, Adelaide, SA, Australia
| | - Michael P Hammer
- Museum & Art Gallery of the Northern Territory, Darwin, NT, Australia
| | - Jerald B Johnson
- Department of Biology, Brigham Young University, Provo, UT, USA
- Monte L. Bean Life Science Museum, Brigham Young University, Provo, UT, USA
| | - Bernd Gruber
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
| | - André Gilles
- UMR 1467 RECOVER, Aix Marseille Univ, INRAE, Centre St Charles, 3 place Victor Hugo, Marseille, France
| | - Matthew Young
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
| | - Arthur Georges
- Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia
| |
|
25
|
|
26
|
Vázquez-Miranda H, Barker FK. Autosomal, sex-linked and mitochondrial loci resolve evolutionary relationships among wrens in the genus Campylorhynchus. Mol Phylogenet Evol 2021; 163:107242. [PMID: 34224849 DOI: 10.1016/j.ympev.2021.107242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2020] [Revised: 06/14/2021] [Accepted: 06/29/2021] [Indexed: 01/18/2023]
Abstract
Although there is general consensus that sampling of multiple genetic loci is critical in accurate reconstruction of species trees, the exact numbers and the best types of molecular markers remain an open question. In particular, the phylogenetic utility of sex-linked loci is underexplored. Here, we sample all species and 70% of the named diversity of the New World wren genus Campylorhynchus using sequences from 23 loci, to evaluate the effects of linkage on efficiency in recovering a well-supported tree for the group. At a tree-wide level, we found that most loci supported fewer than half the possible clades and that sex-linked loci produced similar resolution to slower-coalescing autosomal markers, controlling for locus length. By contrast, we did find evidence that linkage affected the efficiency of recovery of individual relationships; as few as two sex-linked loci were necessary to resolve a selection of clades with long to medium subtending branches, whereas 4-6 autosomal loci were necessary to achieve comparable results. These results support an expanded role for sampling of the avian Z chromosome in phylogenetic studies, including target enrichment approaches. Our concatenated and species tree analyses represent significant improvements in our understanding of diversification in Campylorhynchus, and suggest a relatively complex scenario for its radiation across the Miocene/Pliocene boundary, with multiple invasions of South America.
Affiliation(s)
- Hernán Vázquez-Miranda
- Departamento de Zoología, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México C.P. 04510, Mexico
| | - F Keith Barker
- Department of Ecology, Evolution and Behavior, Bell Museum of Natural History, University of Minnesota, 40 Gortner Laboratory, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
| |
|
27
|
Gao Y, Zhang Y, Dietrich CH, Duan Y. Phylogenetic analyses and species delimitation of Nephotettix Matsumura (Hemiptera: Cicadellidae: Deltocephalinae: Chiasmini) in China based on molecular data. Zool Anz 2021. [DOI: 10.1016/j.jcz.2021.06.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
28
|
Doyle JJ. Defining coalescent genes: Theory meets practice in organelle phylogenomics. Syst Biol 2021; 71:476-489. [PMID: 34191012 DOI: 10.1093/sysbio/syab053] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 03/01/2021] [Revised: 06/24/2021] [Accepted: 06/28/2021] [Indexed: 11/13/2022]
Abstract
The species tree paradigm that dominates current molecular systematic practice infers species trees from collections of sequences under assumptions of the multispecies coalescent (MSC), i.e., that there is free recombination between the sequences and no (or very low) recombination within them. These coalescent genes (c-genes) are thus defined in an historical rather than molecular sense, and can in theory be as large as an entire genome or as small as a single nucleotide. A debate about how to define c-genes centers on the contention that nuclear gene sequences used in many coalescent analyses undergo too much recombination, such that their introns comprise multiple c-genes, violating a key assumption of the MSC. Recently a similar argument has been made for the genes of plastid (e.g., chloroplast) and mitochondrial genomes, which for the last 30 or more years have been considered to represent a single c-gene for the purposes of phylogeny reconstruction because they are non-recombining in a historical sense. Consequently, it has been suggested that these genomes should be analyzed using coalescent methods that treat their genes (over 70 protein-coding genes in the case of most plastid genomes, or plastomes) as independent estimates of species phylogeny, in contrast to the usual practice of concatenation, which is appropriate for generating gene trees. However, although recombination certainly occurs in the plastome, as has been recognized since the 1970s, it is unlikely to be phylogenetically relevant. This is because such historically effective recombination can only occur when plastomes with incongruent histories are brought together in the same plastid. However, plastids sort rapidly into different cell lineages and rarely fuse. Thus, because of plastid biology, the plastome is a more canonical c-gene than is the average multi-intron mammalian nuclear gene.
The plastome should thus continue to be treated as a single estimate of the underlying species phylogeny, as should the mitochondrial genome. The implications of this long-held insight of molecular systematics for studies in the phylogenomic era are explored.
Affiliation(s)
- Jeff J Doyle
- Plant Biology Section, Plant Breeding & Genetics Section, and L. H. Bailey Hortorium, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853 USA
| |
|
29
|
Kim A, Rosenberg NA, Degnan JH. Probabilities of Unranked and Ranked Anomaly Zones under Birth-Death Models. Mol Biol Evol 2020; 37:1480-1494. [PMID: 31860090 DOI: 10.1093/molbev/msz305] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/23/2023]
Abstract
A labeled gene tree topology that is more probable than the labeled gene tree topology matching a species tree is called "anomalous." Species trees that can generate such anomalous gene trees are said to be in the "anomaly zone." Here, probabilities of "unranked" and "ranked" gene tree topologies under the multispecies coalescent are considered. A ranked tree depicts not only the topological relationship among gene lineages, as an unranked tree does, but also the sequence in which the lineages coalesce. In this article, we study how the parameters of a species tree simulated under a constant-rate birth-death process can affect the probability that the species tree lies in the anomaly zone. We find that with more than five taxa, it is possible for species trees to have both anomalous unranked and ranked gene trees. The probability of being in either type of anomaly zone increases with more taxa. The probability of anomalous gene trees also increases with higher speciation rates. We observe that the probabilities of unranked anomaly zones are higher and grow much faster than those of ranked anomaly zones as the speciation rate increases. Our simulation shows that the most probable ranked gene tree is likely to have the same unranked topology as the species tree. We design the software PRANC, which computes probabilities of ranked gene tree topologies given a species tree under the coalescent model.
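The definition of an anomalous gene tree can be made concrete with the smallest case. The sketch below is not PRANC and does not reproduce the paper's birth-death simulations; it only evaluates the standard three-taxon result under the multispecies coalescent, in which the gene tree matching the species tree ((A,B),C) has probability 1 - (2/3)e^(-t) for an internal branch of length t in coalescent units. The function names are hypothetical and chosen for illustration.

```python
import math


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Each discordant topology arises only if the A and B lineages fail to
    # coalesce on the internal branch (probability e^-t) and then one of the
    # three equally likely pairs coalesces first in the root population.
    p_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }


def has_anomalous_gene_tree(probs, species_topology):
    """True if some non-matching topology is more probable than the matching one."""
    matching = probs[species_topology]
    return any(p > matching for topo, p in probs.items() if topo != species_topology)


if __name__ == "__main__":
    for t in (0.01, 0.1, 1.0, 5.0):
        probs = three_taxon_topology_probs(t)
        print(t, probs, has_anomalous_gene_tree(probs, "((A,B),C)"))
```

For any positive t the matching topology is the most probable one, so three-taxon species trees are never in the anomaly zone; the anomalies studied in the abstract require larger trees.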
Affiliation(s)
- Anastasiia Kim
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM
| | | | - James H Degnan
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM
| |
|
30
|
Flouri T, Jiao X, Rannala B, Yang Z. A Bayesian Implementation of the Multispecies Coalescent Model with Introgression for Phylogenomic Analysis. Mol Biol Evol 2020; 37:1211-1223. [PMID: 31825513 PMCID: PMC7086182 DOI: 10.1093/molbev/msz296] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 12/19/2022]
Abstract
Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in speciation. Here, we implement the multispecies-coalescent-with-introgression model, an extension of the multispecies-coalescent model to incorporate introgression, in our Bayesian Markov chain Monte Carlo program Bpp. The multispecies-coalescent-with-introgression model accommodates deep coalescence (or incomplete lineage sorting) and introgression and provides a natural framework for inference using genomic sequence data. Computer simulation confirms the good statistical properties of the method, although hundreds or thousands of loci are typically needed to estimate introgression probabilities reliably. Reanalysis of data sets from the purple cone spruce confirms the hypothesis of homoploid hybrid speciation. Using genomic sequence data from six mosquito species in the Anopheles gambiae species complex, we estimated the introgression probability, which varies considerably across the genome, likely driven by differential selection against introgressed alleles.
Affiliation(s)
- Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Xiyun Jiao
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Bruce Rannala
- Department of Evolution and Ecology, University of California, Davis, Davis, CA
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| |
|
31
|
Bober S, Glaubrecht M, Hausdorf B, Neiber MT. One, two or three? Integrative species delimitation of short-range endemic Hemicycla species (Gastropoda: Helicidae) from the Canary Islands based on morphology, barcoding, AFLP and ddRADseq data. Mol Phylogenet Evol 2021; 161:107153. [PMID: 33741537 DOI: 10.1016/j.ympev.2021.107153] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Revised: 03/03/2021] [Accepted: 03/08/2021] [Indexed: 11/26/2022]
Abstract
Hemicycla mascaensis and H. diegoi are short-range endemics that occur allopatrically in small areas in the Teno Mountains in the western part of Tenerife (Canary Islands). Both taxa have been recognised as distinct species based on differences in shell morphology and genital anatomy. Preliminary molecular analyses using mitochondrial markers suggested a potential paraphyly of H. diegoi with regard to H. mascaensis. We here use multilocus AFLP data and ddRADseq data as well as distribution data, data on shell morphology and genital anatomy to assess the status of these taxa using phylogenetic analyses, species tree reconstruction and molecular species delimitation based on the multispecies coalescent as implemented in BFD* and BPP in an integrative approach. Our analyses show that, based on the analysis of multilocus data, the two taxa are reciprocally monophyletic. Species delimitation methods, however, tend to recognise all investigated populations as distinct species, albeit neither lending unambiguous support to any of the species hypotheses. The comparison of the anatomy of distal genital organs further suggests differentiation within H. mascaensis. This highlights the need for a balanced weighting of arguments from different lines of evidence to determine species status and calls for cautious interpretations of the results of molecular species delimitation analyses, especially in organisms with low active dispersal capacities and expected distinct population structuring such as land snails. Taking all available evidence into account, we favour to recognise H. mascaensis and H. diegoi as distinct species, acknowledging, though, that the recognition of both taxa as subspecies (with possibly a third yet undescribed) would also be an option as morphological differentiation is within the limits of other land snail species that are traditionally subdivided into subspecies.
Affiliation(s)
- Simon Bober
- Center of Natural History, Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
| | - Matthias Glaubrecht
- Center of Natural History, Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
| | - Bernhard Hausdorf
- Center of Natural History, Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
| | - Marco T Neiber
- Center of Natural History, Zoological Museum, University of Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany.
| |
|
32
|
Rabiee M, Mirarab S. SODA: Multi-locus species delimitation using quartet frequencies. Bioinformatics 2021; 36:5623-5631. [PMID: 33555318 DOI: 10.1093/bioinformatics/btaa1010] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/18/2020] [Revised: 10/19/2020] [Accepted: 11/21/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Species delimitation, the process of deciding how to group a set of organisms into units called species, is one of the most challenging problems in evolutionary computational biology. While many methods exist for species delimitation, most based on the coalescent theory, few are scalable to very large datasets, and methods that scale tend not to be accurate. Species delimitation is closely related to species tree inference from discordant gene trees, a problem that has enjoyed rapid advances in recent years. RESULTS: In this paper, we build on the accuracy and scalability of recent quartet-based methods for species tree estimation and propose a new method called SODA for species delimitation. SODA relies heavily on a recently developed method for testing zero branch length in species trees. In extensive simulations, we show that SODA can easily scale to very large datasets while maintaining high accuracy. AVAILABILITY: The code and data presented here are available on https://github.com/maryamrabiee/SODA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maryam Rabiee
- Computer Science and Engineering, University of California, San Diego, US
| | - Siavash Mirarab
- Electrical and Computer Engineering, University of California, San Diego, US
| |
|
33
|
Zhu T, Yang Z. Complexity of the simplest species tree problem. Mol Biol Evol 2021; 38:3993-4009. [PMID: 33492385 PMCID: PMC8382899 DOI: 10.1093/molbev/msab009] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/18/2020] [Revised: 01/04/2021] [Accepted: 01/13/2021] [Indexed: 02/06/2023]
Abstract
The multispecies coalescent model provides a natural framework for species tree estimation accounting for gene-tree conflicts. Although a number of species tree methods under the multispecies coalescent have been suggested and evaluated using simulation, their statistical properties remain poorly understood. Here, we use mathematical analysis aided by computer simulation to examine the identifiability, consistency, and efficiency of different species tree methods in the case of three species and three sequences under the molecular clock. We consider four major species-tree methods including concatenation, two-step, independent-sites maximum likelihood, and maximum likelihood. We develop approximations that predict that the probit transform of the species tree estimation error decreases linearly with the square root of the number of loci. Even in this simplest case, major differences exist among the methods. Full-likelihood methods are considerably more efficient than summary methods such as concatenation and two-step. They also provide estimates of important parameters such as species divergence times and ancestral population sizes, whereas these parameters are not identifiable by summary methods. Our results highlight the need to improve the statistical efficiency of summary methods and the computational efficiency of full likelihood methods of species tree estimation.
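As a rough illustration of the two-step idea for three species, the sketch below computes exactly how often a majority vote over gene tree topologies picks the wrong species tree. It is not the authors' analysis: it assumes that each locus yields its true gene tree topology with no estimation error, that loci are independent, and that ties are broken uniformly at random. The function name `two_step_error` and the branch length 0.2 are hypothetical choices.

```python
import math
from statistics import NormalDist


def two_step_error(n_loci, t):
    """Probability that a majority vote over n_loci gene trees is wrong.

    Species tree ((A,B),C) with internal branch length t (coalescent units).
    Assumes error-free gene tree topologies and random tie-breaking.
    """
    p_disc = math.exp(-t) / 3.0      # each of the two discordant topologies
    p_match = 1.0 - 2.0 * p_disc     # topology matching the species tree
    error = 0.0
    # Enumerate all multinomial outcomes (n0 matching, n1 and n2 discordant).
    for n1 in range(n_loci + 1):
        for n2 in range(n_loci - n1 + 1):
            n0 = n_loci - n1 - n2
            prob = (math.comb(n_loci, n1) * math.comb(n_loci - n1, n2)
                    * p_match ** n0 * p_disc ** (n1 + n2))
            top = max(n0, n1, n2)
            if n0 < top:
                error += prob  # a discordant topology wins outright
            else:
                ties = [n0, n1, n2].count(top)
                error += prob * (ties - 1) / ties  # matching tree loses a tie
    return error


if __name__ == "__main__":
    for n in (10, 40, 160):
        err = two_step_error(n, 0.2)
        # Probit transform of the error, for comparison against sqrt(n).
        print(n, math.sqrt(n), err, NormalDist().inv_cdf(err))
```

Printing the probit-transformed error next to the square root of the number of loci lets a reader inspect, for this simplified setting only, the kind of relationship the abstract describes.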
Affiliation(s)
- Tianqi Zhu
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
| | - Ziheng Yang
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; Department of Genetics, University College London, Gower Street, London WC1E 6BT, UK
| |
|
34
|
Abstract
The phylogeny of Neoaves, the largest clade of extant birds, has remained unclear despite intense study. The difficulty associated with resolving the early branches in Neoaves is likely driven by the rapid radiation of this group. However, conflicts among studies may be exacerbated by the data type analyzed. For example, analyses of coding exons typically yield trees that place Strisores (nightjars and allies) sister to the remaining Neoaves, while analyses of non-coding data typically yield trees where Mirandornites (flamingos and grebes) is the sister of the remaining Neoaves. Our understanding of data type effects is hampered by the fact that previous analyses have used different taxa, loci, and types of non-coding data. Herein, we provide strong corroboration of the data type effects hypothesis for Neoaves by comparing trees based on coding and non-coding data derived from the same taxa and gene regions. A simple analytical method known to minimize biases due to base composition (coding nucleotides as purines and pyrimidines) resulted in coding exon data with increased congruence to the non-coding topology using concatenated analyses. These results improve our understanding of the resolution of neoavian phylogeny and point to a challenge—data type effects—that is likely to be an important factor in phylogenetic analyses of birds (and many other taxonomic groups). Using our results, we provide a summary phylogeny that identifies well-corroborated relationships and highlights specific nodes where future efforts should focus.
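The recoding step mentioned in the abstract (coding nucleotides as purines and pyrimidines, often called RY coding) is simple to sketch. The snippet below is a minimal illustration and not the authors' pipeline; the function name `ry_code` is hypothetical, and leaving gaps and ambiguity codes untouched is an assumption made here for simplicity.

```python
# Purines (A, G) become R; pyrimidines (C, T, and U for RNA) become Y.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(sequence):
    """Recode a nucleotide sequence as purines (R) and pyrimidines (Y).

    Gaps, N, and other ambiguity codes are left unchanged (an assumption).
    """
    return sequence.upper().translate(RY_MAP)


if __name__ == "__main__":
    # Two sequences with opposite GC content that differ only by transitions
    # collapse to the same RY string, which removes that compositional signal.
    at_rich = "AATTAATT"
    gc_rich = "GGCCGGCC"
    print(ry_code(at_rich), ry_code(gc_rich))
```

Because only transversions remain visible after recoding, differences in base composition that arise through transitions no longer contribute to the alignment, at the cost of discarding some phylogenetic signal.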
|
35
|
Schneider JV, Paule J, Jungcurt T, Cardoso D, Amorim AM, Berberich T, Zizka G. Resolving Recalcitrant Clades in the Pantropical Ochnaceae: Insights From Comparative Phylogenomics of Plastome and Nuclear Genomic Data Derived From Targeted Sequencing. Front Plant Sci 2021; 12:638650. [PMID: 33613613 PMCID: PMC7890083 DOI: 10.3389/fpls.2021.638650] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/07/2020] [Accepted: 01/15/2021] [Indexed: 05/13/2023]
Abstract
Plastid DNA sequence data have been traditionally widely used in plant phylogenetics because of the high copy number of plastids, their uniparental inheritance, and the blend of coding and non-coding regions with divergent substitution rates that allow the reconstruction of phylogenetic relationships at different taxonomic ranks. In the present study, we evaluate the utility of the plastome for the reconstruction of phylogenetic relationships in the pantropical plant family Ochnaceae (Malpighiales). We used the off-target sequence read fraction of a targeted sequencing study (targeting nuclear loci only) to recover more than 100 kb of the plastid genome from the majority of the more than 200 species of Ochnaceae and all but two genera using de novo and reference-based assembly strategies. Most of the recalcitrant nodes in the family's backbone were resolved by our plastome-based phylogenetic inference, corroborating the most recent classification system of Ochnaceae and findings from a phylogenomic study based on nuclear loci. Nonetheless, the phylogenetic relationships within the major clades of tribe Ochnineae, which comprise about two thirds of the family's species diversity, received mostly low support. Generally, the phylogenetic resolution was lowest at the infrageneric level. Overall there was little phylogenetic conflict compared to a recent analysis of nuclear loci. Effects of taxon sampling were invoked as the most likely reason for some of the few well-supported discords. Our study demonstrates the utility of the off-target fraction of a target enrichment study for assembling near-complete plastid genomes for a large proportion of samples.
Affiliation(s)
- Julio V. Schneider
- Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
- Entomology III, Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
| | - Juraj Paule
- Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
- Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
| | - Tanja Jungcurt
- Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
- Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
| | - Domingos Cardoso
- Instituto de Biologia, Universidade Federal da Bahia (UFBA), Salvador, Brazil
| | - André Márcio Amorim
- Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
- Herbário André Maurício Vieira de Carvalho, CEPEC, CEPLAC, Itabuna, Brazil
| | - Thomas Berberich
- Senckenberg Biodiversity and Climate Research Center, Lab-Center, Frankfurt am Main, Germany
| | - Georg Zizka
- Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
- Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- *Correspondence: Georg Zizka
| |
|
36
|
Bossert S, Murray EA, Pauly A, Chernyshov K, Brady SG, Danforth BN. Gene Tree Estimation Error with Ultraconserved Elements: An Empirical Study on Pseudapis Bees. Syst Biol 2020; 70:803-821. [PMID: 33367855 DOI: 10.1093/sysbio/syaa097] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/05/2019] [Revised: 11/18/2020] [Accepted: 12/02/2020] [Indexed: 11/12/2022] Open
Abstract
Summarizing individual gene trees to species phylogenies using two-step coalescent methods is now a standard strategy in the field of phylogenomics. However, practical implementations of summary methods suffer from gene tree estimation error, which is caused by various biological and analytical factors. Greatly understudied is the choice of gene tree inference method and downstream effects on species tree estimation for empirical data sets. To better understand the impact of this method choice on gene and species tree accuracy, we compare gene trees estimated through four widely used programs under different model-selection criteria: PhyloBayes, MrBayes, IQ-Tree, and RAxML. We study their performance in the phylogenomic framework of >800 ultraconserved elements from the bee subfamily Nomiinae (Halictidae). Our taxon sampling focuses on the genus Pseudapis, a distinct lineage with diverse morphological features, but contentious morphology-based taxonomic classifications and no molecular phylogenetic guidance. We approximate topological accuracy of gene trees by assessing their ability to recover two uncontroversial, monophyletic groups, and compare branch lengths of individual trees using the stemminess metric (the relative length of internal branches). We further examine different strategies of removing uninformative loci and the collapsing of weakly supported nodes into polytomies. We then summarize gene trees with ASTRAL and compare resulting species phylogenies, including comparisons to concatenation-based estimates. Gene trees obtained with the reversible jump model search in MrBayes were most concordant on average and all Bayesian methods yielded gene trees with better stemminess values. The only gene tree estimation approach whose ASTRAL summary trees consistently produced the most likely correct topology, however, was IQ-Tree with automated model designation (ModelFinder program).
We discuss these findings and provide practical advice on gene tree estimation for summary methods. Lastly, we establish the first phylogeny-informed classification for Pseudapis s. l. and map the distribution of distinct morphological features of the group. [ASTRAL; Bees; concordance; gene tree estimation error; IQ-Tree; MrBayes, Nomiinae; PhyloBayes; RAxML; phylogenomics; stemminess].
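The abstract above describes stemminess only as "the relative length of internal branches". As a minimal sketch, assuming this means summed internal branch length divided by total tree length (the exact formula used by the authors is not given here), the quantity could be computed as follows. The nested-tuple tree representation and the function name `stemminess` are illustrative assumptions, not taken from the study.

```python
def stemminess(tree):
    """Return summed internal branch length divided by total branch length.

    `tree` is a nested (branch_length, children) tuple; a leaf has an empty
    children list. The root's own branch length is ignored.
    This representation is a hypothetical choice made for the sketch.
    """
    internal = 0.0
    total = 0.0
    stack = list(tree[1])  # start below the root, skipping its own branch
    while stack:
        length, children = stack.pop()
        total += length
        if children:  # a node with descendants sits on an internal branch
            internal += length
            stack.extend(children)
    if total == 0:
        raise ValueError("tree has no branch length")
    return internal / total


# Hypothetical four-taxon tree ((A:1,B:1):2,(C:1,D:1):2)
example = (0.0, [
    (2.0, [(1.0, []), (1.0, [])]),
    (2.0, [(1.0, []), (1.0, [])]),
])
print(stemminess(example))
```

In this toy tree the internal branches sum to 4 and all branches to 8, so half of the tree length lies on internal branches.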
Affiliation(s)
- Silas Bossert
- Department of Entomology, Cornell University, Comstock Hall, Ithaca, NY 14853, USA
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
- Department of Entomology, Washington State University, Pullman, Washington 99164, USA
| | - Elizabeth A Murray
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
- Department of Entomology, Washington State University, Pullman, Washington 99164, USA
| | - Alain Pauly
- O.D. Taxonomy and Phylogeny, Royal Belgian Institute of Natural Sciences, Rue Vautier 29, 1000 Brussels, Belgium
| | - Kyrylo Chernyshov
- College of Arts and Sciences, Cornell University, Ithaca, NY 14853, USA
| | - Seán G Brady
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
| | - Bryan N Danforth
- Department of Entomology, Cornell University, Comstock Hall, Ithaca, NY 14853, USA
| |
|
37
|
Koch H, DeGiorgio M. Maximum Likelihood Estimation of Species Trees from Gene Trees in the Presence of Ancestral Population Structure. Genome Biol Evol 2020; 12:3977-3995. [PMID: 32022857 PMCID: PMC7061232 DOI: 10.1093/gbe/evaa022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 01/23/2020] [Indexed: 11/12/2022] Open
Abstract
Though large multilocus genomic data sets have led to overall improvements in phylogenetic inference, they have posed the new challenge of addressing conflicting signals across the genome. In particular, ancestral population structure, which has been uncovered in a number of diverse species, can skew gene tree frequencies, thereby hindering the performance of species tree estimators. Here we develop a novel maximum likelihood method, termed TASTI (Taxa with Ancestral structure Species Tree Inference), that can infer phylogenies under such scenarios, and find that it has increasing accuracy with increasing numbers of input gene trees, contrasting with the relatively poor performances of methods not tailored for ancestral structure. Moreover, we propose a supertree approach that allows TASTI to scale computationally with increasing numbers of input taxa. We use genetic simulations to assess TASTI's performance in the three- and four-taxon settings and demonstrate the application of TASTI on a six-species Afrotropical mosquito data set. Finally, we have implemented TASTI in an open-source software package for ease of use by the scientific community.
Affiliation(s)
- Hillary Koch
- Department of Statistics, Pennsylvania State University
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University
| |
|
38
|
Kubatko L. Book Review: A Mathematical Primer of Molecular Phylogenetics, by Xuhua Xia. Syst Biol 2020. [DOI: 10.1093/sysbio/syaa082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Laura Kubatko
- Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
| |
|
39
|
Cai L, Xi Z, Lemmon EM, Lemmon AR, Mast A, Buddenhagen CE, Liu L, Davis CC. The Perfect Storm: Gene Tree Estimation Error, Incomplete Lineage Sorting, and Ancient Gene Flow Explain the Most Recalcitrant Ancient Angiosperm Clade, Malpighiales. Syst Biol 2020; 70:491-507. [PMID: 33169797 DOI: 10.1093/sysbio/syaa083] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 03/19/2019] [Revised: 10/20/2020] [Accepted: 10/28/2020] [Indexed: 12/20/2022] Open
Abstract
The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order includes 9 of the top 10 most unstable nodes in angiosperms, which have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 10.0%, 34.8%, and 21.4% of the variation, respectively.
Together, our results suggest that a perfect storm of factors likely influence this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality, and obscures our ability to more clearly discern the process of evolution. [Coalescent; concatenation; flanking region; hybrid enrichment, introgression; phylogenomics; rapid radiation, triplet frequency.].
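The triplet-frequency check mentioned in the abstract above can be illustrated with a simplified stand-in. Under the multispecies coalescent with ILS alone, the two minority topologies of a rooted triplet are expected at equal frequency, so a strong imbalance between them is one commonly used hint of gene flow. The sketch below applies an exact two-sided binomial test to the two minority counts. It is not the authors' procedure, which according to the abstract also accounted for gene tree estimation error via simulation, and the function name and counts are hypothetical.

```python
from math import comb


def minor_topology_asymmetry_p(n1, n2):
    """Exact two-sided binomial p-value for counts n1 vs n2 under p = 0.5.

    n1 and n2 are the numbers of gene trees supporting each of the two
    minority triplet topologies (hypothetical inputs for this sketch).
    """
    n = n1 + n2
    if n == 0:
        return 1.0
    k = min(n1, n2)
    # One-tail probability of a split at least as uneven as the observed one
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1
    return min(1.0, 2 * tail)


# Hypothetical counts: 10 gene trees for one minority topology, 0 for the other
print(minor_topology_asymmetry_p(10, 0))
```

A small p-value only says the two minority topologies are unbalanced. Attributing that imbalance to gene flow rather than to estimation error needs the kind of additional modelling the abstract describes.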
Affiliation(s)
- Liming Cai
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Zhenxiang Xi
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Emily Moriarty Lemmon
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
| | - Alan R Lemmon
- Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA
| | - Austin Mast
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
| | - Christopher E Buddenhagen
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- AgResearch, 10 Bisley Road, Hamilton 3214, New Zealand
| | - Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - Charles C Davis
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
| |
|
40
|
Mello B, Tao Q, Barba-Montoya J, Kumar S. Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches. Mol Ecol Resour 2020; 21:122-136. [PMID: 32881388 DOI: 10.1111/1755-0998.13249] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/01/2019] [Revised: 08/14/2020] [Accepted: 08/19/2020] [Indexed: 12/11/2022]
Abstract
Simultaneous molecular dating of population and species divergences is essential in many biological investigations, including phylogeography, phylodynamics and species delimitation studies. In these investigations, multiple sequence alignments consist of both intra- and interspecies samples (mixed samples). As a result, the phylogenetic trees contain interspecies, interpopulation and within-population divergences. Bayesian relaxed clock methods are often employed in these analyses, but they assume the same tree prior for both inter- and intraspecies branching processes and require specification of a clock model for branch rates (independent vs. autocorrelated rates models). We evaluated the impact of a single tree prior on Bayesian divergence time estimates by analysing computer-simulated data sets. We also examined the effect of the assumption of independence of evolutionary rate variation among branches when the branch rates are autocorrelated. Bayesian approach with coalescent tree priors generally produced excellent molecular dates and highest posterior densities with high coverage probabilities. We also evaluated the performance of a non-Bayesian method, RelTime, which does not require the specification of a tree prior or a clock model. RelTime's performance was similar to that of the Bayesian approach, suggesting that it is also suitable to analyse data sets containing both populations and species variation when its computational efficiency is needed.
Affiliation(s)
- Beatriz Mello
- Department of Genetics, Federal University of Rio de Janeiro, Brazil
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
| | - Qiqing Tao
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Jose Barba-Montoya
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
41
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Das I, Brown RM. Gene flow creates a mirage of cryptic species in a Southeast Asian spotted stream frog complex. Mol Ecol 2020; 29:3970-3987. [PMID: 32808335 DOI: 10.1111/mec.15603] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 11/06/2019] [Revised: 07/29/2020] [Accepted: 08/13/2020] [Indexed: 02/06/2023]
Abstract
Most new cryptic species are described using conventional tree- and distance-based species delimitation methods (SDMs), which rely on phylogenetic arrangements and measures of genetic divergence. However, although numerous factors such as population structure and gene flow are known to confound phylogenetic inference and species delimitation, the influence of these processes is not frequently evaluated. Using large numbers of exons, introns, and ultraconserved elements obtained using the FrogCap sequence-capture protocol, we compared conventional SDMs with more robust genomic analyses that assess population structure and gene flow to characterize species boundaries in a Southeast Asian frog complex (Pulchrana picturata). Our results showed that gene flow and introgression can produce phylogenetic patterns and levels of divergence that resemble distinct species (up to 10% divergence in mitochondrial DNA). Hybrid populations were inferred as independent (singleton) clades that were highly divergent from adjacent populations (7%-10%) and unusually similar (<3%) to allopatric populations. Such anomalous patterns are not uncommon in Southeast Asian amphibians, which brings into question whether the high levels of cryptic diversity observed in other amphibian groups reflect distinct cryptic species or, instead, highly admixed and structured metapopulation lineages. Our results also provide an alternative explanation to the conundrum of divergent (sometimes nonsister) sympatric lineages, a pattern that has been celebrated as indicative of true cryptic speciation. Based on these findings, we recommend that species delimitation of continuously distributed "cryptic" groups should not rely solely on conventional SDMs, but should necessarily examine population structure and gene flow to avoid taxonomic inflation.
Affiliation(s)
- Kin O Chan
- Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, Singapore
| | - Carl R Hutter
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
- Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Perry L Wood
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
- Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL, USA
| | - L L Grismer
- Herpetology Laboratory, Department of Biology, La Sierra University, Riverside, CA, USA
| | - Indraneil Das
- Institute of Biodiversity and Environmental Conservation, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
| | - Rafe M Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
| |
|
42
|
Vasilikopoulos A, Gustafson GT, Balke M, Niehuis O, Beutel RG, Misof B. Resolving the phylogenetic position of Hygrobiidae (Coleoptera: Adephaga) requires objective statistical tests and exhaustive phylogenetic methodology: a response to Cai et al. (2020). Mol Phylogenet Evol 2020; 162:106923. [PMID: 32771549 DOI: 10.1016/j.ympev.2020.106923] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/31/2020] [Accepted: 08/03/2020] [Indexed: 12/20/2022]
Affiliation(s)
- Alexandros Vasilikopoulos
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53121 Bonn, Germany.
| | - Grey T Gustafson
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, 66045 KS, USA
| | - Michael Balke
- Department of Entomology, SNSB-Bavarian State Collections of Zoology, 81247 Munich, Germany
| | - Oliver Niehuis
- Department of Evolutionary Biology and Ecology, Institute of Biology I (Zoology), Albert Ludwig University of Freiburg, 79104 Freiburg, Germany
| | - Rolf G Beutel
- Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, 07743 Jena, Germany
| | - Bernhard Misof
- Zoological Research Museum Alexander Koenig, 53121 Bonn, Germany
| |
|
43
|
Molecular Clocks without Rocks: New Solutions for Old Problems. Trends Genet 2020; 36:845-856. [PMID: 32709458 DOI: 10.1016/j.tig.2020.06.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/14/2020] [Revised: 06/02/2020] [Accepted: 06/11/2020] [Indexed: 02/07/2023]
Abstract
Molecular data have been used to date species divergences ever since they were described as documents of evolutionary history in the 1960s. Yet, an inadequate fossil record and discordance between gene trees and species trees are persistently problematic. We examine how, by accommodating gene tree discordance and by scaling branch lengths to absolute time using mutation rate and generation time, multispecies coalescent (MSC) methods can potentially overcome these challenges. We find that time estimates can differ - in some cases, substantially - depending on whether MSC methods or traditional phylogenetic methods that apply concatenation are used, and whether the tree is calibrated with pedigree-based mutation rates or with fossils. We discuss the advantages and shortcomings of both approaches and provide practical guidance for data analysis when using these methods.
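The scaling step described in the abstract above, converting branch lengths to absolute time using a mutation rate and a generation time, comes down to simple arithmetic. The sketch below assumes the branch length is in expected substitutions per site and the mutation rate is per site per generation. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def branch_length_to_years(subs_per_site, mutation_rate_per_generation,
                           generation_time_years):
    """Convert a branch length to years.

    subs_per_site: branch length in expected substitutions per site
    mutation_rate_per_generation: mutations per site per generation
    generation_time_years: average generation time in years
    """
    if mutation_rate_per_generation <= 0 or generation_time_years <= 0:
        raise ValueError("rate and generation time must be positive")
    # Substitutions per site divided by the per-generation rate gives generations
    generations = subs_per_site / mutation_rate_per_generation
    return generations * generation_time_years


# Hypothetical values: 0.001 subs/site, 1e-8 mutations/site/generation, 5-year generations
print(branch_length_to_years(0.001, 1e-8, 5.0))
```

With these illustrative inputs the branch corresponds to about 100,000 generations, or roughly 500,000 years. The estimate scales linearly with the generation time and inversely with the mutation rate, so uncertainty in either carries over directly to the dates.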
|
44
|
Huang J, Flouri T, Yang Z. A Simulation Study to Examine the Information Content in Phylogenomic Data Sets under the Multispecies Coalescent Model. Mol Biol Evol 2020; 37:3211-3224. [DOI: 10.1093/molbev/msaa166] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023] Open
Abstract
We use computer simulation to examine the information content in multilocus data sets for inference under the multispecies coalescent model. Inference problems considered include estimation of evolutionary parameters (such as species divergence times, population sizes, and cross-species introgression probabilities), species tree estimation, and species delimitation based on Bayesian comparison of delimitation models. We found that the number of loci is the most influential factor for almost all inference problems examined. Although the number of sequences per species does not appear to be important to species tree estimation, it is very influential to species delimitation. Increasing the number of sites and the per-site mutation rate both increase the mutation rate for the whole locus and these have the same effect on estimation of parameters, but the sequence length has a greater effect than the per-site mutation rate for species tree estimation. We discuss the computational costs when the data size increases and provide guidelines concerning the subsampling of genomic data to enable the application of full-likelihood methods of inference.
Affiliation(s)
- Jun Huang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Department of Mathematics, Beijing Jiaotong University, Beijing, P.R. China
| | - Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| |
|
45
|
Jiao X, Yang Z. Defining Species When There is Gene Flow. Syst Biol 2020; 70:108-119. [PMID: 32617579 DOI: 10.1093/sysbio/syaa052] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 01/27/2020] [Revised: 06/23/2020] [Accepted: 06/23/2020] [Indexed: 12/20/2022] Open
Abstract
Whatever one's definition of species, it is generally expected that individuals of the same species should be genetically more similar to each other than they are to individuals of another species. Here, we show that in the presence of cross-species gene flow, this expectation may be incorrect. We use the multispecies coalescent model with continuous-time migration or episodic introgression to study the impact of gene flow on genetic differences within and between species and highlight a surprising but plausible scenario in which different population sizes and asymmetrical migration rates cause a genetic sequence to be on average more closely related to a sequence from another species than to a sequence from the same species. Our results highlight the extraordinary impact that even a small amount of gene flow may have on the genetic history of the species. We suggest that contrasting long-term migration rate and short-term hybridization rate, both of which can be estimated using genetic data, may be a powerful approach to detecting the presence of reproductive barriers and to define species boundaries. [Gene flow; introgression; migration; multispecies coalescent; species concept; species delimitation.]
Affiliation(s)
- Xiyun Jiao
- Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
| |
|
46
|
Morales-Briones DF, Kadereit G, Tefarikis DT, Moore MJ, Smith SA, Brockington SF, Timoneda A, Yim WC, Cushman JC, Yang Y. Disentangling Sources of Gene Tree Discordance in Phylogenomic Data Sets: Testing Ancient Hybridizations in Amaranthaceae s.l. Syst Biol 2020; 70:219-235. [PMID: 32785686 PMCID: PMC7875436 DOI: 10.1093/sysbio/syaa066] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Received: 10/04/2019] [Revised: 03/01/2020] [Accepted: 09/03/2020] [Indexed: 12/26/2022] Open
Abstract
Gene tree discordance in large genomic data sets can be caused by evolutionary processes such as incomplete lineage sorting and hybridization, as well as model violation, and errors in data processing, orthology inference, and gene tree estimation. Species tree methods that identify and accommodate all sources of conflict are not available, but a combination of multiple approaches can help tease apart alternative sources of conflict. Here, using a phylotranscriptomic analysis in combination with reference genomes, we test a hypothesis of ancient hybridization events within the plant family Amaranthaceae s.l. that was previously supported by morphological, ecological, and Sanger-based molecular data. The data set included seven genomes and 88 transcriptomes, 17 generated for this study. We examined gene-tree discordance using coalescent-based species trees and network inference, gene tree discordance analyses, site pattern tests of introgression, topology tests, synteny analyses, and simulations. We found that a combination of processes might have generated the high levels of gene tree discordance in the backbone of Amaranthaceae s.l. Furthermore, we found evidence that three consecutive short internal branches produce anomalous trees contributing to the discordance. Overall, our results suggest that Amaranthaceae s.l. might be a product of an ancient and rapid lineage diversification, and remains, and probably will remain, unresolved. This work highlights the potential problems of identifiability associated with the sources of gene tree discordance including, in particular, phylogenetic network methods. Our results also demonstrate the importance of thoroughly testing for multiple sources of conflict in phylogenomic analyses, especially in the context of ancient, rapid radiations. We provide several recommendations for exploring conflicting signals in such situations. 
[Amaranthaceae; gene tree discordance; hybridization; incomplete lineage sorting; phylogenomics; species network; species tree; transcriptomics.]
Affiliation(s)
- Diego F Morales-Briones
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
| | - Gudrun Kadereit
- Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
| | - Delphine T Tefarikis
- Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
| | - Michael J Moore
- Department of Biology, Oberlin College, Science Center K111, 119 Woodland Street, Oberlin, OH 44074-1097, USA
| | - Stephen A Smith
- Department of Ecology & Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109-1048, USA
| | - Samuel F Brockington
- Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
| | - Alfonso Timoneda
- Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
| | - Won C Yim
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
| | - John C Cushman
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
| | - Ya Yang
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
| |
|
47
|
Vasilikopoulos A, Misof B, Meusemann K, Lieberz D, Flouri T, Beutel RG, Niehuis O, Wappler T, Rust J, Peters RS, Donath A, Podsiadlowski L, Mayer C, Bartel D, Böhm A, Liu S, Kapli P, Greve C, Jepson JE, Liu X, Zhou X, Aspöck H, Aspöck U. An integrative phylogenomic approach to elucidate the evolutionary history and divergence times of Neuropterida (Insecta: Holometabola). BMC Evol Biol 2020; 20:64. [PMID: 32493355 PMCID: PMC7268685 DOI: 10.1186/s12862-020-01631-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/22/2019] [Accepted: 05/19/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The latest advancements in DNA sequencing technologies have facilitated the resolution of the phylogeny of insects, yet parts of the tree of Holometabola remain unresolved. The phylogeny of Neuropterida has been extensively studied, but no strong consensus exists concerning the phylogenetic relationships within the order Neuroptera. Here, we assembled a novel transcriptomic dataset to address previously unresolved issues in the phylogeny of Neuropterida and to infer divergence times within the group. We tested the robustness of our phylogenetic estimates by comparing summary coalescent and concatenation-based phylogenetic approaches and by employing different quartet-based measures of phylogenomic incongruence, combined with data permutations. RESULTS Our results suggest that the order Raphidioptera is sister to Neuroptera + Megaloptera. Coniopterygidae is inferred as sister to all remaining neuropteran families suggesting that larval cryptonephry could be a ground plan feature of Neuroptera. A clade that includes Nevrorthidae, Osmylidae, and Sisyridae (i.e. Osmyloidea) is inferred as sister to all other Neuroptera except Coniopterygidae, and Dilaridae is placed as sister to all remaining neuropteran families. Ithonidae is inferred as the sister group of monophyletic Myrmeleontiformia. The phylogenetic affinities of Chrysopidae and Hemerobiidae were dependent on the data type analyzed, and quartet-based analyses showed only weak support for the placement of Hemerobiidae as sister to Ithonidae + Myrmeleontiformia. Our molecular dating analyses suggest that most families of Neuropterida started to diversify in the Jurassic and our ancestral character state reconstructions suggest a primarily terrestrial environment of the larvae of Neuropterida and Neuroptera. 
CONCLUSION Our extensive phylogenomic analyses consolidate several key aspects in the backbone phylogeny of Neuropterida, such as the basal placement of Coniopterygidae within Neuroptera and the monophyly of Osmyloidea. Furthermore, they provide new insights into the timing of diversification of Neuropterida. Despite the vast amount of analyzed molecular data, we found that certain nodes in the tree of Neuroptera are not robustly resolved. Therefore, we emphasize the importance of integrating the results of morphological analyses with those of sequence-based phylogenomics. We also suggest that comparative analyses of genomic meta-characters should be incorporated into future phylogenomic studies of Neuropterida.
Affiliation(s)
- Alexandros Vasilikopoulos
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Bernhard Misof
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Karen Meusemann
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
  - Department of Evolutionary Biology and Ecology, Institute of Biology I (Zoology), Albert-Ludwigs-Universität Freiburg, 79104, Freiburg, Germany
  - Australian National Insect Collection, National Research Collections Australia, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, ACT 2601, Australia
- Doria Lieberz
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Tomáš Flouri
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
- Rolf G Beutel
  - Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, 07743, Jena, Germany
- Oliver Niehuis
  - Department of Evolutionary Biology and Ecology, Institute of Biology I (Zoology), Albert-Ludwigs-Universität Freiburg, 79104, Freiburg, Germany
- Torsten Wappler
  - Natural History Department, Hessisches Landesmuseum Darmstadt, 64283, Darmstadt, Germany
- Jes Rust
  - Steinmann-Institut für Geologie, Mineralogie und Paläontologie, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Ralph S Peters
  - Centre for Taxonomy and Evolutionary Research, Arthropoda Department, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Alexander Donath
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Lars Podsiadlowski
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Christoph Mayer
  - Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113, Bonn, Germany
- Daniela Bartel
  - Department of Evolutionary Biology, University of Vienna, 1090, Vienna, Austria
- Alexander Böhm
  - Department of Evolutionary Biology, University of Vienna, 1090, Vienna, Austria
- Shanlin Liu
  - Department of Entomology, China Agricultural University, 100193, Beijing, People's Republic of China
- Paschalia Kapli
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
- Carola Greve
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), 60325, Frankfurt, Germany
- James E Jepson
  - School of Biological, Earth and Environmental Sciences, University College Cork, Distillery Fields, North Mall, T23 N73K, Cork, Ireland
- Xingyue Liu
  - Department of Entomology, China Agricultural University, 100193, Beijing, People's Republic of China
- Xin Zhou
  - Department of Entomology, China Agricultural University, 100193, Beijing, People's Republic of China
- Horst Aspöck
  - Institute of Specific Prophylaxis and Tropical Medicine, Medical Parasitology, Medical University of Vienna (MUW), 1090, Vienna, Austria
- Ulrike Aspöck
  - Department of Evolutionary Biology, University of Vienna, 1090, Vienna, Austria
  - Zoological Department II, Natural History Museum of Vienna, 1010, Vienna, Austria
|
48
|
Abstract
Knowing phylogenetic relationships among species is fundamental for many studies in biology. An accurate phylogenetic tree underpins our understanding of the major transitions in evolution, such as the emergence of new body plans or metabolism, and is key to inferring the origin of new genes, detecting molecular adaptation, understanding morphological character evolution and reconstructing demographic changes in recently diverged species. Although data are ever more plentiful and powerful analysis methods are available, there remain many challenges to reliable tree building. Here, we discuss the major steps of phylogenetic analysis, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies. Understanding the different sources of errors and the strategies to mitigate them is essential for assembling an accurate tree of life.
|
49
|
Wascher M, Kubatko L. Consistency of SVDQuartets and Maximum Likelihood for Coalescent-Based Species Tree Estimation. Syst Biol 2020; 70:33-48. [PMID: 32415974 DOI: 10.1093/sysbio/syaa039] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/21/2018] [Revised: 05/06/2020] [Accepted: 05/07/2020] [Indexed: 11/14/2022]
Abstract
Numerous methods for inferring species-level phylogenies under the coalescent model have been proposed within the last 20 years, and debates continue about the relative strengths and weaknesses of these methods. One desirable property of a phylogenetic estimator is that of statistical consistency, which means intuitively that as more data are collected, the probability that the estimated tree has the same topology as the true tree goes to 1. To date, consistency results for species tree inference under the multispecies coalescent (MSC) have been derived only for summary statistics methods, such as ASTRAL and MP-EST. These methods have been found to be consistent given true gene trees but may be inconsistent when gene trees are estimated from data for loci of finite length. Here, we consider the question of statistical consistency for four taxa for SVDQuartets for general data types, as well as for the maximum likelihood (ML) method in the case in which the data are a collection of sites generated under the MSC model such that the sites are conditionally independent given the species tree (we call these data coalescent independent sites [CIS] data). We show that SVDQuartets is statistically consistent for all data types (i.e., for both CIS data and for multilocus data), and we derive its rate of convergence. We additionally show that ML is consistent for CIS data under the JC69 model and discuss why a proof for the more general multilocus case is difficult. Finally, we compare the performance of ML and SVDQuartets using simulation for both data types. [Consistency; gene tree; maximum likelihood; multilocus data; phylogenetic inference; species tree; SVDQuartets.]
Affiliation(s)
- Matthew Wascher
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
- Laura Kubatko
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
|
50
|
Olave M, Meyer A. Implementing Large Genomic Single Nucleotide Polymorphism Data Sets in Phylogenetic Network Reconstructions: A Case Study of Particularly Rapid Radiations of Cichlid Fish. Syst Biol 2020; 69:848-862. [DOI: 10.1093/sysbio/syaa005] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 07/15/2019] [Revised: 01/09/2020] [Accepted: 01/23/2020] [Indexed: 12/23/2022]
Abstract
The Midas cichlids of the Amphilophus citrinellus spp. species complex from Nicaragua (13 species) are an extraordinary example of adaptive and rapid radiation (<24,000 years old). These cichlids are a very challenging group to infer its evolutionary history in phylogenetic analyses, due to the apparent prevalence of incomplete lineage sorting (ILS), as well as past and current gene flow. Assuming solely a vertical transfer of genetic material from an ancestral lineage to new lineages is not appropriate in many cases of genes transferred horizontally in nature. Recently developed methods to infer phylogenetic networks under such circumstances might be able to circumvent these problems. These models accommodate not just ILS, but also gene flow, under the multispecies network coalescent (MSNC) model, processes that are at work in young, hybridizing, and/or rapidly diversifying lineages. There are currently only a few programs available that implement MSNC for estimating phylogenetic networks. Here, we present a novel way to incorporate single nucleotide polymorphism (SNP) data into the currently available PhyloNetworks program. Based on simulations, we demonstrate that SNPs can provide enough power to recover the true phylogenetic network. We also show that it can accurately infer the true network more often than other similar SNP-based programs (PhyloNet and HyDe). Moreover, our approach results in a faster algorithm compared to the original pipeline in PhyloNetworks, without losing power. We also applied our new approach to infer the phylogenetic network of Midas cichlid radiation. We implemented the most comprehensive genomic data set to date (RADseq data set of 679 individuals and >37K SNPs from 19 ingroup lineages) and present estimated phylogenetic networks for this extremely young and fast-evolving radiation of cichlid fish. We demonstrate that the MSNC is more appropriate than the multispecies coalescent alone for the analysis of this rapid radiation. [Genomics; multispecies network coalescent; phylogenetic networks; phylogenomics; RADseq; SNPs.]
Affiliation(s)
- Melisa Olave
  - Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Axel Meyer
  - Department of Biology, University of Konstanz, 78457 Konstanz, Germany
|