1
|
San Jose M, Doorenweerd C, Geib S, Barr N, Dupuis JR, Leblanc L, Kauwe A, Morris KY, Rubinoff D. Interspecific gene flow obscures phylogenetic relationships in an important insect pest species complex. Mol Phylogenet Evol 2023; 188:107892. [PMID: 37524217 DOI: 10.1016/j.ympev.2023.107892] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Revised: 07/07/2023] [Accepted: 07/28/2023] [Indexed: 08/02/2023]
Abstract
As genomic data proliferates, the prevalence of post-speciation gene flow is making species boundaries and relationships increasingly ambiguous. Although current approaches inferring fully bifurcating phylogenies based on concatenated datasets provide simple and robust answers to many species relationships, they may be inaccurate because the models ignore inter-specific gene flow and incomplete lineage sorting. To examine the potential error resulting from ignoring gene flow, we generated both a RAD-seq and a 500 protein-coding loci highly multiplexed amplicon (HiMAP) dataset for a monophyletic group of 12 species defined as the Bactrocera dorsalis sensu lato clade. With some of the world's worst agricultural pests, the taxonomy of the B. dorsalis s.l. clade is important for trade and quarantines. However, taxonomic confusion confounds resolution due to intra- and interspecific phenotypic variation and convergence, mitochondrial introgression across half of the species, and viable hybrids. We compared the topological convergence of our datasets using concatenated phylogenetic and various multispecies coalescent approaches, some of which account for gene flow. All analyses agreed on species delimitation, but there was incongruence between species relationships. Under concatenation, both datasets suggest identical species relationships with mostly high statistical support. However, multispecies coalescent and multispecies network approaches suggest markedly different hypotheses and detected significant gene flow. We suggest that the network approaches are likely more accurate because gene flow violates the assumptions of the concatenated phylogenetic analyses, but the data-reductive requirements of network approaches resulted in reduced statistical support and could not unambiguously resolve gene flow directions. Our study highlights the importance of testing for gene flow, particularly with phylogenomic datasets, even when concatenated approaches receive high statistical support.
Collapse
Affiliation(s)
- Michael San Jose
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA.
| | - Camiel Doorenweerd
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA
| | - Scott Geib
- Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Norman Barr
- United States Department of Agriculture, Animal and Plant Health Inspection Service, Plant Protection and Quarantine, Science & Technology, Insect Management and Molecular Diagnostics Laboratory, 22675 N. Moorefield Road, Edinburg, TX 78541, USA
| | - Julian R Dupuis
- University of Kentucky, Department of Entomology, S225 Ag Science Center North, 1100 South Limestone, Lexington, KY, 40546-0091, USA
| | - Luc Leblanc
- University of Idaho, Department of Entomology, Plant Pathology and Nematology, 875 Perimeter Drive, MS2329, Moscow, ID, 83844-2329, USA
| | - Angela Kauwe
- Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Kimberley Y Morris
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA; Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Daniel Rubinoff
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA
| |
Collapse
|
2
|
Zaharias P, Warnow T. Recent progress on methods for estimating and updating large phylogenies. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210244. [PMID: 35989607 PMCID: PMC9393559 DOI: 10.1098/rstb.2021.0244] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Accepted: 01/07/2022] [Indexed: 12/20/2022] Open
Abstract
With the increased availability of sequence data and even of fully sequenced and assembled genomes, phylogeny estimation of very large trees (even of hundreds of thousands of sequences) is now a goal for some biologists. Yet, the construction of these phylogenies is a complex pipeline presenting analytical and computational challenges, especially when the number of sequences is very large. In the past few years, new methods have been developed that aim to enable highly accurate phylogeny estimations on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes (e.g. incomplete lineage sorting and gene duplication and loss), and methods to add sequences into large gene trees or species trees. Here we present some of these recent advances and discuss opportunities for future improvements. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Collapse
Affiliation(s)
- Paul Zaharias
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Tandy Warnow
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| |
Collapse
|
3
|
Barley AJ, Nieto-Montes de Oca A, Manríquez-Morán NL, Thomson RC. The evolutionary network of whiptail lizards reveals predictable outcomes of hybridization. Science 2022; 377:773-777. [PMID: 35951680 DOI: 10.1126/science.abn1593] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Hybridization between diverging lineages is associated with the generation and loss of species diversity, introgression, adaptation, and changes in reproductive mode, but it is unknown when and why it results in these divergent outcomes. We estimate a comprehensive evolutionary network for the largest group of unisexual vertebrates and use it to understand the evolutionary outcomes of hybridization. Our results show that rates of introgression between species decrease with time since divergence and suggest that species must attain a threshold of evolutionary divergence before hybridization results in transitions to unisexuality. Rates of hybridization also predict genome-wide patterns of genetic diversity in whiptail lizards. These results distinguish among models for hybridization that have not previously been tested and suggest that the evolutionary outcomes can be predictable.
Collapse
Affiliation(s)
- Anthony J Barley
- Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA.,School of Life Sciences, University of Hawai'i, Honolulu, HI 96822, USA
| | - Adrián Nieto-Montes de Oca
- Laboratorio de Herpetología and Museo de Zoología Alfonso L. Herrera, Departamento de Biología Evolutiva, Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad Universitaria, Alcadía Coyoacán, Ciudad de México, México
| | - Norma L Manríquez-Morán
- Laboratorio de Sistemática Molecular, Centro de Investigaciones Biológicas, Universidad Autónoma del Estado de Hidalgo, Colonia Carboneras, Mineral de la Reforma, Hidalgo, México
| | - Robert C Thomson
- School of Life Sciences, University of Hawai'i, Honolulu, HI 96822, USA
| |
Collapse
|
4
|
Out of chaos: Phylogenomics of Asian Sonerileae. Mol Phylogenet Evol 2022; 175:107581. [PMID: 35810973 DOI: 10.1016/j.ympev.2022.107581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2022] [Revised: 05/23/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
Sonerileae is a diverse Melastomataceae lineage comprising ca. 1000 species in 44 genera, with >70% of genera and species distributed in Asia. Asian Sonerileae are taxonomically intractable with obscure generic circumscriptions. The backbone phylogeny of this group remains poorly resolved, possibly due to complexity caused by rapid species radiation in early and middle Miocene, which hampers further systematic study. Here, we used genome resequencing data to reconstruct the phylogeny of Asian Sonerileae. Three parallel datasets, viz. single-copy ortholog (SCO), genomic SNPs, and whole plastome, were assembled from genome resequencing data of 205 species for this purpose. Based on these genome-scale data, we provided the first well resolved phylogeny of Asian Sonerileae, with 34 major clades identified and 74% of the interclade relationships consistently resolved by both SCO and genomic data. Meanwhile, widespread phylogenetic discordance was detected among SCO gene trees as well as species trees reconstructed using different tree estimation methods (concatenation/site-based coalescent method/summary method) or different datasets (SCO/genomic/plastome). We explored sources of discordance using multiple approaches and found that the observed discordance in Asian Sonerileae was mainly caused by a combination of biased distribution of missing data, random noise from uninformative genes, incomplete lineage sorting, and hybridization/introgression. Exploration of these sources can enable us to generate hypotheses for future testing, which is the first step towards understanding the evolution of Asian Sonerileae. We also detected high levels of homoplasy for some characters traditionally used in taxonomy, which explains current chaotic generic delimitations. The backbone phylogeny of Asian Sonerileae revealed in this study offers a solid basis for future taxonomic revision at the generic level.
Collapse
|
5
|
Kong S, Pons JC, Kubatko L, Wicke K. Classes of explicit phylogenetic networks and their biological and mathematical significance. J Math Biol 2022; 84:47. [PMID: 35503141 DOI: 10.1007/s00285-022-01746-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2021] [Revised: 01/18/2022] [Accepted: 03/31/2022] [Indexed: 11/24/2022]
Abstract
The evolutionary relationships among organisms have traditionally been represented using rooted phylogenetic trees. However, due to reticulate processes such as hybridization or lateral gene transfer, evolution cannot always be adequately represented by a phylogenetic tree, and rooted phylogenetic networks that describe such complex processes have been introduced as a generalization of rooted phylogenetic trees. In fact, estimating rooted phylogenetic networks from genomic sequence data and analyzing their structural properties is one of the most important tasks in contemporary phylogenetics. Over the last two decades, several subclasses of rooted phylogenetic networks (characterized by certain structural constraints) have been introduced in the literature, either to model specific biological phenomena or to enable tractable mathematical and computational analyses. In the present manuscript, we provide a thorough review of these network classes, as well as provide a biological interpretation of the structural constraints underlying these networks where possible. In addition, we discuss how imposing structural constraints on the network topology can be used to address the scalability and identifiability challenges faced in the estimation of phylogenetic networks from empirical data.
Collapse
Affiliation(s)
- Sungsik Kong
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
| | - Joan Carles Pons
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, 07122, Spain
| | - Laura Kubatko
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA.,Department of Statistics, The Ohio State University, Columbus, OH, USA
| | - Kristina Wicke
- Department of Mathematics, The Ohio State University, Columbus, OH, USA.
| |
Collapse
|
6
|
Yan Z, Cao Z, Liu Y, Ogilvie HA, Nakhleh L. Maximum Parsimony Inference of Phylogenetic Networks in the Presence of Polyploid Complexes. Syst Biol 2021; 71:706-720. [PMID: 34605924 PMCID: PMC9017653 DOI: 10.1093/sysbio/syab081] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2020] [Revised: 09/26/2021] [Accepted: 09/29/2021] [Indexed: 12/18/2022] Open
Abstract
Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate
evolutionary histories. While polyploidy has been shown to be prevalent not only in plants
but also in other groups of eukaryotic species, most work done thus far on phylogenetic
network inference assumes diploid hybridization. These inference methods have been
applied, with varying degrees of success, to data sets with polyploid species, even though
polyploidy violates the mathematical assumptions underlying these methods. Statistical
methods were developed recently for handling specific types of polyploids and so were
parsimony methods that could handle polyploidy more generally yet while excluding
processes such as incomplete lineage sorting. In this article, we introduce a new method
for inferring most parsimonious phylogenetic networks on data that include polyploid
species. Taking gene tree topologies as input, the method seeks a phylogenetic network
that minimizes deep coalescences while accounting for polyploidy. We demonstrate the
performance of the method on both simulated and biological data. The inference method as
well as a method for evaluating evolutionary hypotheses in the form of phylogenetic
networks are implemented and publicly available in the PhyloNet software package.
[Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees;
multispecies network coalescent; phylogenetic networks; polyploidy.]
Collapse
Affiliation(s)
- Zhi Yan
- Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
| | - Zhen Cao
- Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
| | - Yushu Liu
- Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
| | - Huw A Ogilvie
- Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
| | - Luay Nakhleh
- Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
- Department of Biosciences, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
| |
Collapse
|
7
|
Debray K, Le Paslier MC, Bérard A, Thouroude T, Michel G, Marie-Magdelaine J, Bruneau A, Foucher F, Malécot V. Unveiling the Patterns of Reticulated Evolutionary Processes with Phylogenomics: Hybridization and Polyploidy in the genus Rosa. Syst Biol 2021; 71:547-569. [PMID: 34329460 DOI: 10.1093/sysbio/syab064] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 06/23/2021] [Accepted: 06/30/2021] [Indexed: 11/13/2022] Open
Abstract
Reticulation, caused by hybridization and allopolyploidization, is considered an important and frequent phenomenon in the evolution of numerous plant lineages. Although both processes represent important driving forces of evolution, they are mostly ignored in phylogenetic studies involving a large number of species. Indeed only a scattering of methods exists to recover a comprehensive reticulated evolutionary history for a broad taxon sampling. Among these methods, comparisons of topologies obtained from plastid markers with those from a few nuclear sequences are favored, even though they restrict in-depth studies of hybridization and polyploidization. The genus Rosa encompasses c. 150 species widely distributed throughout the northern hemisphere and represents a challenging taxonomic group in which hybridization and polyploidization are prominent. Our main objective was to develop a general framework that would take patterns of reticulation into account in the study of the phylogenetic relationships among Rosa species. Using amplicon sequencing we targeted allele variation in the nuclear genome as well as haploid sequences in the chloroplast genome. We successfully recovered robust plastid and nuclear phylogenies and performed in-depth tests for several scenarios of hybridization using a maximum pseudo-likelihood approach on taxon subsets. Our diploid-first approach followed by hybrid and polyploid grafting resolved most of the evolutionary relationships among Rosa subgenera, sections, and selected species. Based on these results, we provide new directions for a future revision of the infrageneric classification in Rosa. The stepwise strategy proposed here can be used to reconstruct the phylogenetic relationships of other challenging taxonomic groups with large numbers of hybrid and polyploid taxa.
Collapse
Affiliation(s)
- Kevin Debray
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
| | | | - Aurélie Bérard
- Etude du Polymorphisme des Génomes Végétaux (EPGV), INRA, Université Paris-Saclay, 91000 Evry, France
| | - Tatiana Thouroude
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
| | - Gilles Michel
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
| | | | - Anne Bruneau
- Institut de recherche en biologie végétale and Département de Sciences biologiques, Université de Montréal, 4101 Sherbrooke Est, Montréal, QC, H1X 2B2, Canada
| | - Fabrice Foucher
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
| | - Valéry Malécot
- Institut Agro, Univ Angers, INRAE, IRHS, SFR QUASAV, F-49000 Angers, France
| |
Collapse
|
8
|
Cai R, Ané C. Assessing the fit of the multi-species network coalescent to multi-locus data. Bioinformatics 2021; 37:634-641. [PMID: 33027508 DOI: 10.1093/bioinformatics/btaa863] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2020] [Revised: 09/14/2020] [Accepted: 09/22/2020] [Indexed: 01/25/2023] Open
Abstract
MOTIVATION With growing genome-wide molecular datasets from next-generation sequencing, phylogenetic networks can be estimated using a variety of approaches. These phylogenetic networks include events like hybridization, gene flow or horizontal gene transfer explicitly. However, the most accurate network inference methods are computationally heavy. Methods that scale to larger datasets do not calculate a full likelihood, such that traditional likelihood-based tools for model selection are not applicable to decide how many past hybridization events best fit the data. We propose here a goodness-of-fit test to quantify the fit between data observed from genome-wide multi-locus data, and patterns expected under the multi-species coalescent model on a candidate phylogenetic network. RESULTS We identified weaknesses in the previously proposed TICR test, and proposed corrections. The performance of our new test was validated by simulations on real-world phylogenetic networks. Our test provides one of the first rigorous tools for model selection, to select the adequate network complexity for the data at hand. The test can also work for identifying poorly inferred areas on a network. AVAILABILITY AND IMPLEMENTATION Software for the goodness-of-fit test is available as a Julia package at https://github.com/cecileane/QuartetNetworkGoodnessFit.jl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Ruoyi Cai
- Department of Statistics, University of Wisconsin - Madison, Madison, WI 53706, USA
| | - Cécile Ané
- Department of Statistics, University of Wisconsin - Madison, Madison, WI 53706, USA.,Department of Botany, University of Wisconsin - Madison, Madison, WI 53706, USA
| |
Collapse
|
9
|
Kandziora M, Sklenář P, Kolář F, Schmickl R. How to Tackle Phylogenetic Discordance in Recent and Rapidly Radiating Groups? Developing a Workflow Using Loricaria (Asteraceae) as an Example. FRONTIERS IN PLANT SCIENCE 2021; 12:765719. [PMID: 35069621 PMCID: PMC8777076 DOI: 10.3389/fpls.2021.765719] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/27/2021] [Accepted: 11/22/2021] [Indexed: 05/17/2023]
Abstract
A major challenge in phylogenetics and -genomics is to resolve young rapidly radiating groups. The fast succession of species increases the probability of incomplete lineage sorting (ILS), and different topologies of the gene trees are expected, leading to gene tree discordance, i.e., not all gene trees represent the species tree. Phylogenetic discordance is common in phylogenomic datasets, and apart from ILS, additional sources include hybridization, whole-genome duplication, and methodological artifacts. Despite a high degree of gene tree discordance, species trees are often well supported and the sources of discordance are not further addressed in phylogenomic studies, which can eventually lead to incorrect phylogenetic hypotheses, especially in rapidly radiating groups. We chose the high-Andean Asteraceae genus Loricaria to shed light on the potential sources of phylogenetic discordance and generated a phylogenetic hypothesis. By accounting for paralogy during gene tree inference, we generated a species tree based on hundreds of nuclear loci, using Hyb-Seq, and a plastome phylogeny obtained from off-target reads during target enrichment. We observed a high degree of gene tree discordance, which we found implausible at first sight, because the genus did not show evidence of hybridization in previous studies. We used various phylogenomic analyses (trees and networks) as well as the D-statistics to test for ILS and hybridization, which we developed into a workflow on how to tackle phylogenetic discordance in recent radiations. We found strong evidence for ILS and hybridization within the genus Loricaria. Low genetic differentiation was evident between species located in different Andean cordilleras, which could be indicative of substantial introgression between populations, promoted during Pleistocene glaciations, when alpine habitats shifted creating opportunities for secondary contact and hybridization.
Collapse
Affiliation(s)
- Martha Kandziora
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- *Correspondence: Martha Kandziora,
| | - Petr Sklenář
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
| | - Filip Kolář
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Institute of Botany, The Czech Academy of Sciences, Průhonice, Czechia
| | - Roswitha Schmickl
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Institute of Botany, The Czech Academy of Sciences, Průhonice, Czechia
| |
Collapse
|
10
|
Cai L, Xi Z, Lemmon EM, Lemmon AR, Mast A, Buddenhagen CE, Liu L, Davis CC. The Perfect Storm: Gene Tree Estimation Error, Incomplete Lineage Sorting, and Ancient Gene Flow Explain the Most Recalcitrant Ancient Angiosperm Clade, Malpighiales. Syst Biol 2020; 70:491-507. [PMID: 33169797 DOI: 10.1093/sysbio/syaa083] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2019] [Revised: 10/20/2020] [Accepted: 10/28/2020] [Indexed: 12/20/2022] Open
Abstract
The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order includes 9 of the top 10 most unstable nodes in angiosperms, which have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 10.0$\%$, 34.8$\%$, and 21.4$\%$ of the variation, respectively. Together, our results suggest that a perfect storm of factors likely influence this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality, and obscures our ability to more clearly discern the process of evolution. [Coalescent; concatenation; flanking region; hybrid enrichment, introgression; phylogenomics; rapid radiation, triplet frequency.].
Collapse
Affiliation(s)
- Liming Cai
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Zhenxiang Xi
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Emily Moriarty Lemmon
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
| | - Alan R Lemmon
- Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA
| | - Austin Mast
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
| | - Christopher E Buddenhagen
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- AgResearch, 10 Bisley Road, Hamilton 3214, New Zealand
| | - Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - Charles C Davis
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
| |
Collapse
|
11
|
Morales-Briones DF, Kadereit G, Tefarikis DT, Moore MJ, Smith SA, Brockington SF, Timoneda A, Yim WC, Cushman JC, Yang Y. Disentangling Sources of Gene Tree Discordance in Phylogenomic Data Sets: Testing Ancient Hybridizations in Amaranthaceae s.l. Syst Biol 2020; 70:219-235. [PMID: 32785686 PMCID: PMC7875436 DOI: 10.1093/sysbio/syaa066] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2019] [Revised: 03/01/2020] [Accepted: 09/03/2020] [Indexed: 12/26/2022] Open
Abstract
Gene tree discordance in large genomic data sets can be caused by evolutionary processes such as incomplete lineage sorting and hybridization, as well as model violation, and errors in data processing, orthology inference, and gene tree estimation. Species tree methods that identify and accommodate all sources of conflict are not available, but a combination of multiple approaches can help tease apart alternative sources of conflict. Here, using a phylotranscriptomic analysis in combination with reference genomes, we test a hypothesis of ancient hybridization events within the plant family Amaranthaceae s.l. that was previously supported by morphological, ecological, and Sanger-based molecular data. The data set included seven genomes and 88 transcriptomes, 17 generated for this study. We examined gene-tree discordance using coalescent-based species trees and network inference, gene tree discordance analyses, site pattern tests of introgression, topology tests, synteny analyses, and simulations. We found that a combination of processes might have generated the high levels of gene tree discordance in the backbone of Amaranthaceae s.l. Furthermore, we found evidence that three consecutive short internal branches produce anomalous trees contributing to the discordance. Overall, our results suggest that Amaranthaceae s.l. might be a product of an ancient and rapid lineage diversification, and remains, and probably will remain, unresolved. This work highlights the potential problems of identifiability associated with the sources of gene tree discordance including, in particular, phylogenetic network methods. Our results also demonstrate the importance of thoroughly testing for multiple sources of conflict in phylogenomic analyses, especially in the context of ancient, rapid radiations. We provide several recommendations for exploring conflicting signals in such situations. [Amaranthaceae; gene tree discordance; hybridization; incomplete lineage sorting; phylogenomics; species network; species tree; transcriptomics.]
Collapse
Affiliation(s)
- Diego F Morales-Briones
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
| | - Gudrun Kadereit
- Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
| | - Delphine T Tefarikis
- Institut für Molekulare Physiologie, Johannes Gutenberg-Universität Mainz, D-55099 Mainz, Germany
| | - Michael J Moore
- Department of Biology, Oberlin College, Science Center K111, 119 Woodland Street, Oberlin, OH 44074-1097, USA
| | - Stephen A Smith
- Department of Ecology & Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109-1048, USA
| | - Samuel F Brockington
- Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
| | - Alfonso Timoneda
- Department of Plant Sciences, University of Cambridge, Tennis Court Road, Cambridge CB2 3EA, UK
| | - Won C Yim
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
| | - John C Cushman
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89577, USA
| | - Ya Yang
- Department of Plant and Microbial Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. Paul, MN 55108, USA
| |
Collapse
|
12
|
Huynh S, Marcussen T, Felber F, Parisod C. Hybridization preceded radiation in diploid wheats. Mol Phylogenet Evol 2019; 139:106554. [PMID: 31288105 DOI: 10.1016/j.ympev.2019.106554] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2019] [Revised: 07/03/2019] [Accepted: 07/03/2019] [Indexed: 01/06/2023]
Abstract
Evolutionary relationships among the Aegilops-Triticum relatives of cultivated wheats have been difficult to resolve owing to incomplete lineage sorting and reticulate evolution. Recent studies have suggested that the wheat D-genome lineage (progenitor of Ae. tauschii) originated through homoploid hybridization between the A-genome lineage (progenitor of Triticum s.str.) and the B-genome lineage (progenitor of Ae. speltoides). This scenario of reticulation has been debated, calling for adequate phylogenetic analyses based on comprehensive sampling. To reconstruct the evolution of Aegilops-Triticum diploids, we here combined high-throughput sequencing of 38 nuclear low-copy loci of multiple accessions of all 13 species with inferences of the species phylogeny using the full-parameterized MCMC_SEQ method. Phylogenies recovered a monophyletic Aegilops-Triticum lineage that began diversifying ~6.6 Ma ago and gave rise to four sublineages, i.e. the A- (2 species), B- (1 species), D- (9 species) and T- (Ae. mutica) genome lineage. Full-parameterized phylogenies as well as patterns of tree dilation and tree compression supported a hybrid origin of the D-genome lineage from A and B ~3.0-4.0 Ma ago, and did not indicate additional hybridization events. Conflicting ABBA-BABA tests suggestive of further reticulation were shown here to result from ancestral population structure rather than hybridization. This comprehensive and dated phylogeny of wheat relatives indicates that the origin of the hybrid D-genome was followed by intense diversification into the majority of extant diploid as well as allopolyploid wild wheats.
Collapse
Affiliation(s)
- Stella Huynh
- Institute of Biology, University of Neuchâtel, Switzerland
| | - Thomas Marcussen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Norway
| | - François Felber
- Institute of Biology, University of Neuchâtel, Switzerland; Musée et Jardins botaniques cantonaux de Lausanne et Pont-de-Nant, Switzerland
| | | |
Collapse
|