1
Xie DF, Li J, Sun JH, Cheng RY, Wang Y, Song BN, He XJ, Zhou SD. Peering through the hedge: Multiple datasets yield insights into the phylogenetic relationships and incongruences in the tribe Lilieae (Liliaceae). Mol Phylogenet Evol 2024; 200:108182. [PMID: 39222738 DOI: 10.1016/j.ympev.2024.108182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 08/06/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
The increasing use of genome-scale data has significantly facilitated phylogenetic analyses, contributing to the dissection of the underlying evolutionary mechanisms that shape phylogenetic incongruences, such as incomplete lineage sorting (ILS) and hybridization. Lilieae, a prominent member of the Liliaceae family, comprises four genera and approximately 260 species, representing 43% of all species within Liliaceae. Its species have high ornamental, medicinal and edible value. Yet no study has explored the validity of various genome-scale data in phylogenetic analyses within this tribe, nor have the potential evolutionary mechanisms underlying its phylogenetic incongruences been investigated. Here, transcriptome, Angiosperms353, plastid and mitochondrial data were collected from 50 to 93 samples of Lilieae, covering all four recognized genera. Multiple datasets were created and used for phylogenetic analyses based on concatenated and coalescent-based methods. Evolutionary rates of different datasets were calculated, and divergence times were estimated. Various approaches, including coalescence simulation, Quartet Sampling (QS), calculation of concordance factors (gCF and sCF), as well as MSCquartets and reticulate network inference, were carried out on a reduced 33-taxon dataset to infer the phylogenetic discordances and analyze their underlying mechanisms. Despite extensive phylogenetic discordances among gene trees, more robust phylogenies were inferred from nuclear and plastid data than from mitochondrial data, with lower synonymous substitution rates detected in mitochondrial genes than in nuclear and plastid genes. Significant ILS was detected across the phylogeny of Lilieae, with clear evidence of reticulate evolution identified. Divergence time estimation indicated that most lineages in Lilieae diverged during a narrow time frame (ranging from 5.0 Ma to 10.0 Ma), consistent with the notion of a rapid evolutionary radiation.
Our results suggest that integrating transcriptomic and plastid data can serve as cost-effective and efficient tools for phylogenetic inference and evolutionary analysis within Lilieae, and Angiosperms353 data is also a favorable choice. Mitochondrial data are more suitable for phylogenetic analyses at higher taxonomic levels due to their stronger conservation and lower synonymous substitution rates. Significant phylogenetic incongruences detected in Lilieae were caused by both incomplete lineage sorting (ILS) and reticulate evolution, with hybridization and "ghost introgression" likely prevalent in the evolution of Lilieae species. Our findings provide new insights into the phylogeny of Lilieae, enhancing our understanding of the evolution of species in this tribe.
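The gene concordance factor (gCF) mentioned in this abstract is, in essence, the percentage of gene trees that contain a given branch of the reference tree. The following is only a minimal sketch of that idea, not the authors' pipeline: it assumes rooted gene trees written as nested tuples, treats a branch as a clade (the usual definition works on bipartitions of unrooted trees), and assumes every gene tree contains every taxon. The function names and the toy taxa are hypothetical.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested-tuple rooted tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # record the clade below this internal node
        return leaves

    walk(tree)
    return found


def gcf(clade, gene_trees):
    """Percentage of gene trees that contain `clade` (all trees share one taxon set)."""
    target = frozenset(clade)
    hits = sum(1 for tree in gene_trees if target in clades(tree))
    return 100.0 * hits / len(gene_trees)


# Toy data: three rooted gene trees on placeholder taxa A-D.
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
print(gcf({"A", "B"}, gene_trees))       # clade {A, B} is present in 2 of the 3 trees
print(gcf({"A", "B", "C"}, gene_trees))  # clade {A, B, C} is present in all 3 trees
```

A low gCF on a branch flags discordance among gene trees but does not by itself say whether ILS or reticulation caused it, which is why the abstract combines it with other approaches.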
Affiliation(s)
- Deng-Feng Xie
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
- Juan Li
  - Southwest Minzu University, Institute of Qinghai-Tibetan Plateau, 610225 Chengdu, Sichuan, PR China
- Jia-Hui Sun
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, PR China
- Rui-Yu Cheng
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
- Yuan Wang
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
- Bo-Ni Song
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
- Xing-Jin He
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
- Song-Dong Zhou
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, 610065 Chengdu, Sichuan, PR China
2
Tiley GP, Crowl AA, Manos PS, Sessa EB, Solís-Lemus C, Yoder AD, Burleigh JG. Benefits and Limits of Phasing Alleles for Network Inference of Allopolyploid Complexes. Syst Biol 2024; 73:666-682. [PMID: 38733563 DOI: 10.1093/sysbio/syae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 04/30/2024] [Accepted: 05/09/2024] [Indexed: 05/13/2024]
Abstract
Accurately reconstructing the reticulate histories of polyploids remains a central challenge for understanding plant evolution. Although phylogenetic networks can provide insights into relationships among polyploid lineages, inferring networks may be hindered by the complexities of homology determination in polyploid taxa. We use simulations to show that phasing alleles from allopolyploid individuals can improve phylogenetic network inference under the multispecies coalescent by obtaining the true network with fewer loci compared with haplotype consensus sequences or sequences with heterozygous bases represented as ambiguity codes. Phased allelic data can also improve divergence time estimates for networks, which is helpful for evaluating allopolyploid speciation hypotheses and proposing mechanisms of speciation. To achieve these outcomes in empirical data, we present a novel pipeline that leverages a recently developed phasing algorithm to reliably phase alleles from polyploids. This pipeline is especially appropriate for target enrichment data, where the depth of coverage is typically high enough to phase entire loci. We provide an empirical example in the North American Dryopteris fern complex that demonstrates insights from phased data as well as the challenges of network inference. We establish that our pipeline (PATÉ: Phased Alleles from Target Enrichment data) is capable of recovering a high proportion of phased loci from both diploids and polyploids. These data may improve network estimates compared with using haplotype consensus assemblies by accurately inferring the direction of gene flow, but statistical nonidentifiability of phylogenetic networks poses a barrier to inferring the evolutionary history of reticulate complexes.
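To make the contrast between phased alleles and ambiguity-coded sequences concrete, the sketch below collapses two phased haplotypes into a single sequence in which heterozygous sites become IUPAC ambiguity codes. It is a toy illustration only and is not part of the PATÉ pipeline; the function name is hypothetical, and it assumes two aligned haplotypes of equal length containing only A, C, G and T.

```python
# Standard IUPAC codes for sites where two different bases are observed.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def ambiguity_consensus(hap1, hap2):
    """Collapse two phased haplotypes into one ambiguity-coded sequence."""
    if len(hap1) != len(hap2):
        raise ValueError("haplotypes must be aligned to the same length")
    out = []
    for a, b in zip(hap1, hap2):
        # Homozygous sites keep their base; heterozygous sites get an IUPAC code.
        out.append(a if a == b else IUPAC[frozenset((a, b))])
    return "".join(out)


hap1 = "ACGTAC"  # phased allele 1: C at site 2 travels with A at site 5
hap2 = "ATGTGC"  # phased allele 2: T at site 2 travels with G at site 5
print(ambiguity_consensus(hap1, hap2))  # -> AYGTRC
```

The collapsed sequence records that sites 2 and 5 are heterozygous but no longer records which bases occur together on the same allele. That linkage is the information phasing preserves, and in an allopolyploid it is what ties each allele to a parental lineage.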
Affiliation(s)
- Andrew A Crowl
  - Department of Biology, Duke University, 130 Science Dr, Durham, NC 27708, USA
- Paul S Manos
  - Department of Biology, Duke University, 130 Science Dr, Durham, NC 27708, USA
- Emily B Sessa
  - Department of Biology, University of Florida, 220 Bartram Hall, PO Box 118525, Gainesville, FL 32611, USA
- Claudia Solís-Lemus
  - Department of Plant Pathology, Wisconsin Institute for Discovery, University of Wisconsin - Madison, 330 N Orchard St, Madison, WI 53706, USA
- Anne D Yoder
  - Department of Biology, Duke University, 130 Science Dr, Durham, NC 27708, USA
- J Gordon Burleigh
  - Department of Biology, University of Florida, 220 Bartram Hall, PO Box 118525, Gainesville, FL 32611, USA
3
Campbell MA, Hammer MP, Adams M, Raadik TA, Unmack PJ. Evolutionary relationships and fine-scale geographic structuring in the temperate percichthyid genus Gadopsis (blackfishes) to support fisheries and conservation management. Mol Phylogenet Evol 2024; 199:108159. [PMID: 39029548 DOI: 10.1016/j.ympev.2024.108159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/04/2024] [Accepted: 07/14/2024] [Indexed: 07/21/2024]
Abstract
Gadopsis (Percichthyidae) is a freshwater genus distributed in south-eastern Australia, including Tasmania, and comprises two recognized species. Previous molecular phylogenetic investigations of the genus, mostly conducted in the pre-genomics era and reflecting a range of geographic and molecular sampling intensities, have supported the recognition of up to seven candidate species. Here we analyze a genome-wide SNP dataset that provides comprehensive geographic and genomic coverage of Gadopsis to produce a robust hypothesis of species boundaries and evolutionary relationships. We then leverage the SNP dataset to characterize relationships within candidate species that lack clear intraspecific phylogenetic relationships. We find further support for the seven previously identified candidate species of Gadopsis and evidence that the Bass Strait centered candidate species (SBA) originated from ancient hybridization. The SNP dataset permits a high degree of intraspecific resolution, providing improvements over previous studies, with numerous candidate species showing intraspecific divisions in phylogenetic analysis. Further population genetic analysis of the Murray-Darling candidate species (NMD) and SBA finds support for K = 6 and K = 7 genetic clusters, respectively. The SNP data generated for this study have diverse applications in natural resource management for these fishes of conservation concern.
Affiliation(s)
- Matthew A Campbell
  - The University of California Davis, Davis, California, USA; University of Alaska Museum of the North, Fairbanks, Alaska, USA
- Michael P Hammer
  - Museum and Art Gallery of the Northern Territory, Darwin, Northern Territory, Australia
- Mark Adams
  - South Australian Museum, Adelaide, South Australia, Australia; School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia
- Tarmo A Raadik
  - Arthur Rylah Institute for Environmental Research, Department of Energy, Environment and Climate Action, Heidelberg, Victoria, Australia
- Peter J Unmack
  - University of Canberra, Canberra, Australian Capital Territory, Australia
4
Zhang Z, Liu G, Li M. Incomplete lineage sorting and gene flow within Allium (Amayllidaceae). Mol Phylogenet Evol 2024; 195:108054. [PMID: 38471599 DOI: 10.1016/j.ympev.2024.108054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 02/01/2024] [Accepted: 03/07/2024] [Indexed: 03/14/2024]
Abstract
The phylogeny and systematics of the genus Allium have been studied with a variety of data types, including an increasing amount of molecular data. However, strong phylogenetic discordance and high levels of uncertainty have prevented the identification of a consistent phylogeny. The difficulty in establishing phylogenetic consensus and the evidence for genealogical discordance make Allium a compelling test case for assessing the relative contributions of incomplete lineage sorting (ILS), gene flow and gene tree estimation error to phylogenetic reconstruction. In this study, we obtained 75 transcriptomes of 38 Allium species across 10 subgenera. Whole plastid genomes, single-copy genes and consensus CDS were generated to estimate phylogenetic trees using both coalescent and concatenation methods. Multiple approaches, including coalescence simulation, quartet sampling, reticulate network inference, sequence simulation, theta of ILS and reticulation index, were carried out across the CDS gene trees to investigate the degrees of ILS, gene flow and gene tree estimation error. Afterward, a regression analysis was used to test the relative contributions of each of these forms of uncertainty to the final phylogeny. Despite extensive topological discordance among gene trees, we found a fully supported species tree that agrees with most well-accepted relationships and establishes monophyly of the genus Allium. We present clear evidence for substantial ILS across the phylogeny of Allium. Further, we identified two ancient hybridization events for the formation of the second evolutionary line and subg. Butomissa, as well as several introgression events between recently diverged species.
Our regression analysis revealed that gene tree inference error and gene flow were the two most dominant factors explaining the overall gene tree variation, although the effects of ILS and gene tree estimation error were difficult to disentangle because the two were positively correlated. Based on our efforts to mitigate the methodological errors in reconstructing trees, we believe ILS and gene flow are two principal reasons for the oft-reported phylogenetic heterogeneity of Allium. This study presents a strongly supported and well-resolved phylogenetic backbone for the sampled Allium species, and exemplifies how to untangle heterogeneity in phylogenetic signal and reconstruct the true evolutionary history of the target taxa.
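As background for why ILS and gene flow can be told apart at all, the standard three-taxon result under the multispecies coalescent is a useful reference point. For a species tree ((A,B),C) whose internal branch has length t in coalescent units, a gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The sketch below simply evaluates these textbook formulas. It is not the regression analysis described in the abstract, and the function name is hypothetical.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t is the internal branch length in coalescent units (generations / 2Ne).
    """
    minor = math.exp(-t) / 3.0            # each discordant topology
    return {
        "((A,B),C)": 1.0 - 2.0 * minor,   # topology matching the species tree
        "((A,C),B)": minor,
        "((B,C),A)": minor,
    }


for t in (0.1, 1.0, 3.0):
    probs = three_taxon_gene_tree_probs(t)
    print(t, {topology: round(p, 3) for topology, p in probs.items()})
```

Under ILS alone the two discordant topologies are expected at equal frequency, and shorter internal branches give more discordance. A marked excess of one discordant topology over the other is therefore commonly read as a signal of gene flow rather than ILS.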
Affiliation(s)
- ZengZhu Zhang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China
- Gang Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China
- Minjie Li
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou 730000, People's Republic of China
5
Allman ES, Baños H, Mitchell JD, Rhodes JA. TINNiK: Inference of the Tree of Blobs of a Species Network Under the Coalescent. bioRxiv: The Preprint Server for Biology 2024:2024.04.20.590418. [PMID: 38712257 PMCID: PMC11071406 DOI: 10.1101/2024.04.20.590418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
The tree of blobs of a species network shows only the tree-like aspects of relationships of taxa on a network, omitting information on network substructures where hybridization or other types of lateral transfer of genetic information occur. By isolating such regions of a network, inference of the tree of blobs can serve as a starting point for a more detailed investigation, or indicate the limit of what may be inferrable without additional assumptions. Building on our theoretical work on the identifiability of the tree of blobs from gene quartet distributions under the Network Multispecies Coalescent model, we develop an algorithm, TINNiK, for statistically consistent tree of blobs inference. We provide examples of its application to both simulated and empirical datasets, utilizing an implementation in the MSCquartets 2.0 R package.
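The notion of a tree of blobs can be illustrated purely graph-theoretically. If the network is viewed as an undirected graph, a blob is a maximal connected piece with no cut edges, and contracting every blob to a single node leaves a tree. The sketch below does exactly that for a known network. It is emphatically not the TINNiK algorithm, which infers the tree of blobs statistically from gene quartet data without knowing the network; the function name, the edge-list input format and the toy network are all assumptions made for illustration.

```python
def tree_of_blobs(edges):
    """Contract every non-cut edge of an undirected network.

    Returns the edges of the resulting tree; each node of that tree is a
    frozenset holding the original nodes merged into one blob.
    """
    adj = {}
    for i, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, i))
        adj.setdefault(v, []).append((u, i))

    # Find cut edges (bridges) with a depth-first search and low-link values.
    disc, low, bridges = {}, {}, set()

    def dfs(u, parent_edge):
        disc[u] = low[u] = len(disc)
        for v, i in adj[u]:
            if i == parent_edge:
                continue
            if v in disc:
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, i)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:      # no back edge passes over edge i
                    bridges.add(i)

    for node in adj:
        if node not in disc:
            dfs(node, None)

    # Union-find: merge the endpoints of every non-bridge edge into one blob.
    parent = {n: n for n in adj}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, (u, v) in enumerate(edges):
        if i not in bridges:
            parent[find(u)] = find(v)

    members = {}
    for n in adj:
        members.setdefault(find(n), set()).add(n)
    label = {root: frozenset(nodes) for root, nodes in members.items()}
    return {
        frozenset((label[find(u)], label[find(v)]))
        for i, (u, v) in enumerate(edges)
        if i in bridges
    }


# Toy network: leaves a, b, c, d; internal nodes u, v, h form one cycle (a blob).
edges = [("w", "a"), ("w", "d"), ("w", "u"), ("u", "v"),
         ("u", "h"), ("v", "h"), ("v", "b"), ("h", "c")]
blob_tree = tree_of_blobs(edges)
for edge in blob_tree:
    print(sorted(sorted(node) for node in edge))
```

In the toy network the cycle through u, v and h collapses into a single node, so the output shows only the tree-like relationships around it, which is the information the abstract says a tree of blobs retains.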
Affiliation(s)
- Elizabeth S. Allman
  - Department of Mathematics and Statistics, University of Alaska, Fairbanks, AK, USA
- Hector Baños
  - Department of Mathematics, California State University San Bernardino, San Bernardino, CA, USA
- Jonathan D. Mitchell
  - School of Natural Sciences (Mathematics), University of Tasmania, Hobart, TAS, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, University of Tasmania, Hobart, TAS, Australia
- John A. Rhodes
  - Department of Mathematics and Statistics, University of Alaska, Fairbanks, AK, USA
6
Kim SH, Yang J, Cho MS, Stuessy TF, Crawford DJ, Kim SC. Chloroplast Genome Provides Insights into Molecular Evolution and Species Relationship of Fleabanes (Erigeron: Tribe Astereae, Asteraceae) in the Juan Fernández Islands, Chile. Plants (Basel, Switzerland) 2024; 13:612. [PMID: 38475459 DOI: 10.3390/plants13050612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 02/21/2024] [Accepted: 02/22/2024] [Indexed: 03/14/2024]
Abstract
Erigeron represents the third largest genus on the Juan Fernández Islands, with six endemic species, five of which occur exclusively on the younger Alejandro Selkirk Island with one species on both islands. While its continental sister species is unknown, Erigeron on the Juan Fernández Islands appears to be monophyletic and most likely evolved from South American progenitor species. We characterized the complete chloroplast genomes of five Erigeron species, including accessions of E. fernandezia and one each from Alejandro Selkirk and Robinson Crusoe Islands, with the purposes of elucidating molecular evolution and phylogenetic relationships. We found highly conserved chloroplast genomes in size, gene order and contents, and further identified several mutation hotspot regions. In addition, we found two positively selected chloroplast genes (ccsA and ndhF) among species in the islands. The complete plastome sequences confirmed the monophyly of Erigeron in the islands and corroborated previous phylogenetic relationships among species. New findings in the current study include (1) two major lineages, E. turricola-E. luteoviridis and E. fernandezia-E. ingae-E. rupicola, (2) the non-monophyly of E. fernandezia occurring on the two islands, and (3) the non-monophyly of the alpine species E. ingae complex.
Affiliation(s)
- Seon-Hee Kim
  - Department of Botany, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan
- JiYoung Yang
  - Research Institute for Dok-do and Ulleung-do Island, Kyungpook National University, Daegu 41566, Republic of Korea
- Myong-Suk Cho
  - Department of Biological Sciences, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Tod F Stuessy
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Daniel J Crawford
  - Department of Ecology and Evolutionary Biology and the Biodiversity Institute, The University of Kansas, Lawrence, KS 66045, USA
- Seung-Chul Kim
  - Department of Biological Sciences, Sungkyunkwan University, Suwon 16419, Republic of Korea
7
Haque MR, Kubatko L. A global test of hybrid ancestry from genome-scale data. Stat Appl Genet Mol Biol 2024; 23:sagmb-2022-0061. [PMID: 38366619 DOI: 10.1515/sagmb-2022-0061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 01/27/2024] [Indexed: 02/18/2024]
Abstract
Methods based on the multi-species coalescent have been widely used in phylogenetic tree estimation using genome-scale DNA sequence data to understand the underlying evolutionary relationship between the sampled species. Evolutionary processes such as hybridization, which creates new species through interbreeding between two different species, necessitate inferring a species network instead of a species tree. A species tree is strictly bifurcating and thus fails to incorporate hybridization events which require an internal node of degree three. Hence, it is crucial to decide whether a tree or network analysis should be performed given a DNA sequence data set, a decision that is based on the presence of hybrid species in the sampled species. Although many methods have been proposed for hybridization detection, it is rare to find a technique that does so globally while considering a data generation mechanism that allows both hybridization and incomplete lineage sorting. In this paper, we consider hybridization and coalescence in a unified framework and propose a new test that can detect whether there are any hybrid species in a set of species of arbitrary size. Based on this global test of hybridization, one can decide whether a tree or network analysis is appropriate for a given data set.
Affiliation(s)
- Md Rejuan Haque
  - Division of Biostatistics, College of Public Health, and Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Laura Kubatko
  - Department of Statistics and Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
8
Li Y, Li X, Nie S, Zhang M, Yang Q, Xu W, Duan Y, Wang X. Reticulate evolution of the tertiary relict Osmanthus. The Plant Journal: for Cell and Molecular Biology 2024; 117:145-160. [PMID: 37837261 DOI: 10.1111/tpj.16480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 09/10/2023] [Accepted: 09/13/2023] [Indexed: 10/15/2023]
Abstract
When interspecific gene flow is common, species relationships are more accurately represented by a phylogenetic network than by a bifurcating tree. This study aimed to uncover the role of introgression in the evolution of Osmanthus, the only genus of the subtribe Oleinae (Oleaceae) with its distribution center in East Asia. We built species trees, detected introgression, and constructed networks using multiple kinds of sequencing data (whole genome resequencing, transcriptome sequencing, and Sanger sequencing of nrDNA) combined with concatenation and coalescence approaches. Then, based on well-understood species relationships, historical biogeographic analyses and diversification rate estimates were employed to reveal the history of Osmanthus. Osmanthus originated in mid-Miocene Europe and dispersed to the eastern Tibetan Plateau in the late Miocene. Thereafter, it continued to spread eastwards. Phylogenetic conflict is common within the 'Core Osmanthus' clade and is seen at both early and late stages of diversification, leading to hypotheses of net-like species relationships. Incomplete lineage sorting proved ineffective in explaining phylogenetic conflicts and thus supported introgression as the main cause of conflicts. This study elucidates the diversification history of a relict genus in the subtropical regions of eastern Asia and reveals that introgression had profound effects on its evolutionary history.
Affiliation(s)
- Yongfu Li
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Xuan Li
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Shuai Nie
  - Rice Research Institute, Guangdong Academy of Agricultural Sciences & Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs & Guangdong Key Laboratory of New Technology in Rice Breeding, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640, Guangdong, China
- Min Zhang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Qinghua Yang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Wenbin Xu
  - Wuhan Botanical Garden, the Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Yifan Duan
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Xianrong Wang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, International Cultivar Registration Center for Osmanthus, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
9
Bernardini G, van Iersel L, Julien E, Stougie L. Constructing phylogenetic networks via cherry picking and machine learning. Algorithms Mol Biol 2023; 18:13. [PMID: 37717003 PMCID: PMC10505335 DOI: 10.1186/s13015-023-00233-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Accepted: 06/10/2023] [Indexed: 09/18/2023]
Abstract
BACKGROUND Combining a set of phylogenetic trees into a single phylogenetic network that explains all of them is a fundamental challenge in evolutionary studies. Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks. RESULTS In this paper, we apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees, for practical-size datasets consisting of binary trees. Some of the heuristics in this framework are based on the design and training of a machine learning model that captures essential information on the structure of the input trees and guides the algorithms towards better solutions. We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times. CONCLUSIONS Unlike the existing exact methods, our heuristics are applicable to datasets of practical size, and the experimental study we conducted on both simulated and real data shows that these solutions are qualitatively good, always within some small constant factor from the optimum. Moreover, our machine-learned heuristics are one of the first applications of machine learning to phylogenetics and show its promise.
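For readers unfamiliar with the term, a cherry is a pair of leaves that share a parent, and "picking" a cherry (x, y) removes leaf x while keeping y. A sequence of such picks that reduces every input tree to a single leaf is the basic object of the cherry-picking framework. The sketch below shows only this tree-reduction step on toy data. It is not the authors' heuristics, includes no machine learning, and does not build a network; the nested-tuple tree format, the function names and the convention of leaving a tree unchanged when the pair is not a cherry in it are assumptions made for illustration.

```python
def cherries(tree):
    """Return the unordered leaf pairs that share a parent in a nested-tuple binary tree."""
    if isinstance(tree, str):
        return set()
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return {frozenset((left, right))}
    return cherries(left) | cherries(right)


def pick(tree, x, y):
    """Reduce cherry (x, y): delete leaf x and keep y.

    The tree is returned unchanged if (x, y) is not a cherry in it.
    """
    if isinstance(tree, str):
        return tree
    left, right = tree
    if {left, right} == {x, y}:
        return y
    return (pick(left, x, y), pick(right, x, y))


def reduce_tree(tree, sequence):
    """Apply a sequence of ordered pairs (x, y) to one tree, in order."""
    for x, y in sequence:
        tree = pick(tree, x, y)
    return tree


# Two conflicting toy trees on the same four leaves.
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
sequence = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("b", "d")]

print(cherries(t1))
print(reduce_tree(t1, sequence), reduce_tree(t2, sequence))
```

Both toy trees are reduced to the same single leaf by this one sequence, which is the property a cherry-picking sequence must have. The sequence here was chosen by hand and is not claimed to be the shortest possible.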
Affiliation(s)
- Leo van Iersel
  - Delft Institute of Applied Mathematics, Delft, The Netherlands
- Esther Julien
  - Delft Institute of Applied Mathematics, Delft, The Netherlands
- Leen Stougie
  - CWI, Amsterdam, The Netherlands
  - Vrije Universiteit, Amsterdam, The Netherlands
  - INRIA-ERABLE, Lyon, France
10
Folk RA, Gaynor ML, Engle-Wrye NJ, O’Meara BC, Soltis PS, Soltis DE, Guralnick RP, Smith SA, Grady CJ, Okuyama Y. Identifying Climatic Drivers of Hybridization with a New Ancestral Niche Reconstruction Method. Syst Biol 2023; 72:856-873. [PMID: 37073863 PMCID: PMC10405357 DOI: 10.1093/sysbio/syad018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 03/23/2023] [Accepted: 04/17/2023] [Indexed: 04/20/2023]
Abstract
Applications of molecular phylogenetic approaches have uncovered evidence of hybridization across numerous clades of life, yet the environmental factors responsible for driving opportunities for hybridization remain obscure. Verbal models implicating geographic range shifts that brought species together during the Pleistocene have often been invoked, but quantitative tests using paleoclimatic data are needed to validate these models. Here, we produce a phylogeny for Heuchereae, a clade of 15 genera and 83 species in Saxifragaceae, with complete sampling of recognized species, using 277 nuclear loci and nearly complete chloroplast genomes. We then employ an improved framework with a coalescent simulation approach to test and confirm previous hybridization hypotheses and identify one new intergeneric hybridization event. Focusing on the North American distribution of Heuchereae, we introduce and implement a newly developed approach to reconstruct potential past distributions for ancestral lineages across all species in the clade and across a paleoclimatic record extending from the late Pliocene. Time calibration based on both nuclear and chloroplast trees recovers a mid- to late-Pleistocene date for most inferred hybridization events, a timeframe concomitant with repeated geographic range restriction into overlapping refugia. Our results indicate an important role for past episodes of climate change, and the contrasting responses of species with differing ecological strategies, in generating novel patterns of range contact among plant communities and therefore new opportunities for hybridization. The new ancestral niche method flexibly models the shape of niche while incorporating diverse sources of uncertainty and will be an important addition to the current comparative methods toolkit. [Ancestral niche reconstruction; hybridization; paleoclimate; pleistocene.].
Affiliation(s)
- Ryan A Folk
  - Department of Biological Sciences, Mississippi State University, Mississippi State, MS, USA
- Michelle L Gaynor
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
  - Department of Biology, University of Florida, Gainesville, FL, USA
- Nicholas J Engle-Wrye
  - Department of Biological Sciences, Mississippi State University, Mississippi State, MS, USA
- Brian C O’Meara
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
- Pamela S Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Biodiversity Institute, University of Florida, Gainesville, FL, USA
- Douglas E Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
  - Department of Biology, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Biodiversity Institute, University of Florida, Gainesville, FL, USA
- Robert P Guralnick
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
  - Biodiversity Institute, University of Florida, Gainesville, FL, USA
- Stephen A Smith
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Charles J Grady
  - Biodiversity Institute, University of Kansas, Lawrence, KS, 66045, USA
- Yudai Okuyama
  - Tsukuba Botanical Garden, National Museum of Nature and Science, Tsukuba, Japan
11
Casanellas M, Fernández-Sánchez J, Garrote-López M, Sabaté-Vidales M. Designing Weights for Quartet-Based Methods When Data are Heterogeneous Across Lineages. Bull Math Biol 2023; 85:68. [PMID: 37310552 DOI: 10.1007/s11538-023-01167-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2022] [Accepted: 05/15/2023] [Indexed: 06/14/2023]
Abstract
Homogeneity across lineages is a general assumption in phylogenetics according to which nucleotide substitution rates are common to all lineages. Many phylogenetic methods relax this hypothesis but keep a simple enough model to make the process of sequence evolution more tractable. On the other hand, dealing successfully with the general case (heterogeneity of rates across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets (ASAQ) based on algebraic and semi-algebraic tools, thus especially indicated to deal with data evolving under heterogeneous rates. This method combines the weights of two previous methods by means of a test based on the positivity of the branch lengths estimated with the paralinear distance. ASAQ is statistically consistent when applied to data generated under the general Markov model, considers rate and base composition heterogeneity among lineages and does not assume stationarity nor time-reversibility. Second, we test and compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely QFM, wQFM, quartet puzzling, weight optimization and Willson's method) in combination with several systems of weights, including ASAQ weights and other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data and support weight optimization with ASAQ weights as a reliable and successful reconstruction method that improves upon the accuracy of global methods (such as neighbor-joining or maximum likelihood) in the presence of long branches or on mixtures of distributions on trees.
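The paralinear distance referred to in this abstract is computed from the joint base-frequency matrix of two aligned sequences. In its usual form (Lake's definition; some related LogDet formulations rescale by a constant) it is d = -ln( det J / sqrt(det D_x · det D_y) ), where J holds the joint frequencies of base pairs and D_x, D_y are diagonal matrices of each sequence's base frequencies. The sketch below evaluates that formula in plain Python. It is not the ASAQ weighting scheme and does not estimate branch lengths on a quartet; the function names are hypothetical, and the input is assumed to be two gap-free aligned sequences over A, C, G and T.

```python
import math

BASES = "ACGT"


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, sign, result = len(m), 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                      # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


def paralinear_distance(seq_x, seq_y):
    """Paralinear distance between two aligned, gap-free DNA sequences."""
    n = len(seq_x)
    joint = [[0.0] * 4 for _ in range(4)]     # J[i][j]: frequency of base i in x with base j in y
    for a, b in zip(seq_x, seq_y):
        joint[BASES.index(a)][BASES.index(b)] += 1.0 / n
    freq_x = [sum(row) for row in joint]                          # diagonal of D_x
    freq_y = [sum(joint[i][j] for i in range(4)) for j in range(4)]  # diagonal of D_y
    denom = math.sqrt(math.prod(freq_x) * math.prod(freq_y))
    # math.log raises ValueError if the ratio is not positive (very divergent
    # or very short sequences); this sketch does not handle that case.
    return -math.log(det(joint) / denom)


x = "ACGT" * 5
y = "ACGT" * 4 + "CATG"
print(paralinear_distance(x, x))
print(paralinear_distance(x, y))
```

Because the formula uses the full joint frequency matrix rather than a single substitution model, it does not require the two lineages to share base composition or rates, which is the property the abstract relies on when data are heterogeneous across lineages.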
Affiliation(s)
- Marta Casanellas
- Institut de Matematiques de la UPC-BarcelonaTech (IMTech), Universitat Politècnica de Catalunya and Centre de Recerca Matemàtica, Av. Diagonal 647, 08028, Barcelona, Spain.
- Jesús Fernández-Sánchez
- Institut de Matematiques de la UPC-BarcelonaTech (IMTech), Universitat Politècnica de Catalunya and Centre de Recerca Matemàtica, Av. Diagonal 647, 08028, Barcelona, Spain
|
12
|
Raiyemo DA, Tranel PJ. Comparative analysis of dioecious Amaranthus plastomes and phylogenomic implications within Amaranthaceae s.s. BMC Ecol Evol 2023; 23:15. [PMID: 37149567 PMCID: PMC10164334 DOI: 10.1186/s12862-023-02121-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 04/28/2023] [Indexed: 05/08/2023] Open
Abstract
BACKGROUND The genus Amaranthus L. consists of 70-80 species distributed across temperate and tropical regions of the world. Nine species are dioecious and native to North America, two of which are agronomically important weeds of row crops. The genus has been described as taxonomically challenging, and relationships among species, including the dioecious ones, are poorly understood. In this study, we investigated the phylogenetic relationships among the dioecious amaranths and sought to gain insights into plastid tree incongruence. A total of 19 Amaranthus species' complete plastomes were analyzed. Among these, seven dioecious Amaranthus plastomes were newly sequenced and assembled, an additional two were assembled from previously published short-read sequences, and 10 other plastomes were obtained from a public repository (GenBank).
RESULTS Comparative analysis of the dioecious Amaranthus species' plastomes revealed sizes ranging from 150,011 to 150,735 bp, each consisting of 112 unique genes (78 protein-coding genes, 30 transfer RNAs and 4 ribosomal RNAs). Maximum likelihood trees, Bayesian inference trees and splits graphs support the monophyly of subgenera Acnida (7 dioecious species) and Amaranthus; however, the relationship of A. australis and A. cannabinus to the other dioecious species in Acnida could not be established, as it appears a chloroplast capture occurred from the lineage leading to the Acnida + Amaranthus clades. Our results also revealed intraplastome conflict at some tree branches that were in some cases alleviated with the use of whole chloroplast genome alignment, indicating non-coding regions contribute valuable phylogenetic signals toward shallow relationship resolution. Furthermore, we report a very low evolutionary distance between A. palmeri and A. watsonii, indicating that these two species are more genetically related than previously reported.
CONCLUSIONS Our study provides valuable plastome resources as well as a framework for further evolutionary analyses of the entire Amaranthus genus as more species are sequenced.
Affiliation(s)
- Damilola A Raiyemo
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Patrick J Tranel
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
|
13
|
Silva JJ, Fungaro MHP, Wang X, Larsen TO, Frisvad JC, Taniwaki MH, Iamanaka BT. Deep Genotypic Species Delimitation of Aspergillus Section Flavi Isolated from Brazilian Foodstuffs and the Description of Aspergillus annui sp. nov. and Aspergillus saccharicola sp. nov. J Fungi (Basel) 2022; 8:1279. [PMID: 36547612 PMCID: PMC9781283 DOI: 10.3390/jof8121279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 11/25/2022] [Accepted: 11/29/2022] [Indexed: 12/12/2022] Open
Abstract
Aspergillus section Flavi is a fungal group that is important in food because it contains spoilage and potentially aflatoxigenic species. Aflatoxins are metabolites that are harmful to human and animal health and have been recognized as the primary natural contaminant in food. Therefore, recognizing the biodiversity of this group in food is necessary to reduce risks to public health. Our study aimed to investigate the diversity of Aspergillus section Flavi isolated from Brazilian foodstuffs such as cassava, sugarcane, black pepper, paprika, Brazil nuts, yerba-mate, peanuts, rice, and corn. A polyphasic approach integrating phenotypic data and multilocus genotypic analyses (CaM, BenA, and RPB2) was performed for 396 strains. Two new species in the Aspergillus subgenus Circumdati section Flavi are proposed using maximum-likelihood analysis, Bayesian inference, and coalescence-based methods: Aspergillus saccharicola sp. nov. and Aspergillus annui sp. nov. A. saccharicola sp. nov. belongs to the series Flavi, is a potentially aflatoxigenic species (B1, B2, G1, and G2), closely related to Aspergillus arachidicola, and was found mostly in sugarcane. A. annui sp. nov. was isolated from samples of sweet paprika. To accommodate A. annui sp. nov., a new series Annuorum was proposed.
Affiliation(s)
- Josué J. Silva
- Centro de Ciência e Qualidade de Alimentos, Instituto de Tecnologia de Alimentos, Campinas 13070-178, São Paulo, Brazil
- Maria H. P. Fungaro
- Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina 86057-970, Paraná, Brazil
- Xinhui Wang
- Department of Biotechnology and Biomedicine, DTU-Bioengineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Thomas O. Larsen
- Department of Biotechnology and Biomedicine, DTU-Bioengineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Jens C. Frisvad
- Department of Biotechnology and Biomedicine, DTU-Bioengineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Marta H. Taniwaki
- Centro de Ciência e Qualidade de Alimentos, Instituto de Tecnologia de Alimentos, Campinas 13070-178, São Paulo, Brazil
- Beatriz T. Iamanaka
- Centro de Ciência e Qualidade de Alimentos, Instituto de Tecnologia de Alimentos, Campinas 13070-178, São Paulo, Brazil
|
14
|
Xiao TW, Ge XJ. Plastome structure, phylogenomics, and divergence times of tribe Cinnamomeae (Lauraceae). BMC Genomics 2022; 23:642. [PMID: 36076185 PMCID: PMC9461114 DOI: 10.1186/s12864-022-08855-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/29/2022] [Accepted: 08/26/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Tribe Cinnamomeae is a species-rich and ecologically important group in tropical and subtropical forests. Previous studies explored its phylogenetic relationships and historical biogeography using limited loci, which might result in biased molecular dating due to insufficient parsimony-informative sites. Thus, 15 plastomes were newly sequenced and combined with published plastomes to study plastome structural variations, gene evolution, phylogenetic relationships, and divergence times of this tribe.
RESULTS Among the 15 newly generated plastomes, 14 ranged from 152,551 bp to 152,847 bp, and the remaining one (Cinnamomum chartophyllum XTBGLQM0164) was 158,657 bp. The inverted repeat (IR) regions of XTBGLQM0164 contained complete ycf2, trnI-CAU, rpl32, and rpl2. Four hypervariable plastid loci (ycf1, ycf2, ndhF-rpl32-trnL-UAG, and petA-psbJ) were identified as candidate DNA barcodes. Divergence times based on a few loci were primarily determined by prior age constraints rather than by DNA data. In contrast, molecular dating using complete plastid protein-coding genes (PCGs) was determined by DNA data rather than by prior age constraints. Dating analyses using PCGs showed that Cinnamomum sect. Camphora diverged from C. sect. Cinnamomum in the late Oligocene (27.47 Ma).
CONCLUSIONS This study reports the first case of drastic IR expansion in tribe Cinnamomeae, and indicates that plastomes have sufficient parsimony-informative sites for molecular dating. In addition, the dating analyses provide preliminary insights into the divergence time within tribe Cinnamomeae and can facilitate future studies on its historical biogeography.
Affiliation(s)
- Tian-Wen Xiao
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Xue-Jun Ge
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, China
|
15
|
LeMay M, Libeskind-Hadas R, Wu YC. A Polynomial-Time Algorithm for Minimizing the Deep Coalescence Cost for Level-1 Species Networks. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2642-2653. [PMID: 34406946 DOI: 10.1109/tcbb.2021.3105922] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Phylogenetic analyses commonly assume that the species history can be represented as a tree. However, in the presence of hybridization, the species history is more accurately captured as a network. Despite several advances in modeling phylogenetic networks, there is no known polynomial-time algorithm for parsimoniously reconciling gene trees with species networks while accounting for incomplete lineage sorting. To address this issue, we present a polynomial-time algorithm for the case of level-1 networks, in which no hybrid species is the direct ancestor of another hybrid species. This work enables more efficient reconciliation of gene trees with species networks, which in turn, enables more efficient reconstruction of species networks.
|
16
|
Xia M, Cai M, Comes HP, Zheng L, Ohi-Toma T, Lee J, Qi Z, Konowalik K, Li P, Cameron KM, Fu C. An overlooked dispersal route of Cardueae (Asteraceae) from the Mediterranean to East Asia revealed by phylogenomic and biogeographical analyses of Atractylodes. Ann Bot 2022; 130:53-64. [PMID: 35533344 PMCID: PMC9295924 DOI: 10.1093/aob/mcac059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Accepted: 05/06/2022] [Indexed: 05/11/2023]
Abstract
BACKGROUND AND AIMS The East Asian-Tethyan disjunction pattern and its mechanisms of formation have long been of interest to researchers. Here, we studied the biogeographical history of Asteraceae tribe Cardueae, with a particular focus on the temperate East Asian genus Atractylodes DC., to understand the role of tectonic and climatic events in driving the diversification and disjunctions of the genus.
METHODS A total of 76 samples of Atractylodes from 36 locations were collected for RAD-sequencing. Three single nucleotide polymorphism (SNP) datasets based on different filtering strategies were used for phylogenetic analyses. Molecular dating and ancestral distribution reconstruction were performed using both chloroplast DNA sequences (127 Cardueae samples) and SNP (36 Atractylodes samples) datasets.
KEY RESULTS Six species of Atractylodes were well resolved as individually monophyletic, although some introgression was identified among accessions of A. chinensis, A. lancea and A. koreana. Dispersal of the subtribe Carlininae from the Mediterranean to East Asia occurred after divergence between Atractylodes and Carlina L. + Atractylis L. + Thevenotia DC. at ~31.57 Ma, resulting in an East Asian-Tethyan disjunction. Diversification of Atractylodes in East Asia mainly occurred from the Late Miocene to the Early Pleistocene.
CONCLUSIONS Aridification of Asia and the closure of the Turgai Strait in the Late Oligocene promoted the dispersal of Cardueae from the Mediterranean to East China. Subsequent uplift of the Qinghai-Tibet Plateau as well as changes in Asian monsoon systems resulted in an East Asian-Tethyan disjunction between Atractylodes and Carlina + Atractylis + Thevenotia. In addition, Late Miocene to Quaternary climates and sea level fluctuations played major roles in the diversification of Atractylodes. Through this study of different taxonomic levels using genomic data, we have revealed an overlooked dispersal route between the Mediterranean and far East Asia (Japan/Korea) via Central Asia and East China.
Affiliation(s)
- Hans Peter Comes
- Department of Biosciences, Salzburg University, Salzburg, Austria
- Li Zheng
- Systematic & Evolutionary Botany and Biodiversity Group, MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, China
- Key Laboratory of Jiaxing Second Hospital, Jiaxing, Zhejiang, China
- Tetsuo Ohi-Toma
- Nature Fieldwork Center, Okayama University of Science, Okayama, Japan
- Joongku Lee
- Department of Environment and Forest Resources, Chungnam National University, Daejeon, South Korea
- Zhechen Qi
- College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou, China
- Kamil Konowalik
- Department of Plant Biology, Institute of Environmental Biology, Wrocław University of Environmental and Life Sciences, Kożuchowska 5b, 51-631, Wroclaw, Poland
- Pan Li
- Corresponding author
- Chengxin Fu
- Systematic & Evolutionary Botany and Biodiversity Group, MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, China
|
17
|
Identifiability of species network topologies from genomic sequences using the logDet distance. J Math Biol 2022; 84:35. [PMID: 35385988 DOI: 10.1007/s00285-022-01734-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/03/2021] [Revised: 01/12/2022] [Accepted: 03/02/2022] [Indexed: 10/18/2022]
Abstract
Inference of network-like evolutionary relationships between species from genomic data must address the interwoven signals from both gene flow and incomplete lineage sorting. The heavy computational demands of standard approaches to this problem severely limit the size of datasets that may be analyzed, in both the number of species and the number of genetic loci. Here we provide a theoretical pointer to more efficient methods, by showing that logDet distances computed from genomic-scale sequences retain sufficient information to recover network relationships in the level-1 ultrametric case. This result is obtained under the Network Multispecies Coalescent model combined with a mixture of General Time-Reversible sequence evolution models across individual gene trees. It applies to both unlinked site data, such as for SNPs, and to sequence data in which many contiguous sites may have evolved on a common tree, such as concatenated gene sequences. Thus under standard stochastic models statistically justifiable inference of network relationships from sequences can be accomplished without consideration of individual genes or gene trees.
|
18
|
Singhal S, Derryberry GE, Bravo GA, Derryberry EP, Brumfield RT, Harvey MG. The dynamics of introgression across an avian radiation. Evol Lett 2021; 5:568-581. [PMID: 34917397 PMCID: PMC8645201 DOI: 10.1002/evl3.256] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/06/2021] [Revised: 07/11/2021] [Accepted: 08/31/2021] [Indexed: 01/20/2023] Open
Abstract
Hybridization and resulting introgression can play both a destructive and a creative role in the evolution of diversity. Thus, characterizing when and where introgression is most likely to occur can help us understand the causes of diversification dynamics. Here, we examine the prevalence of and variation in introgression using phylogenomic data from a large (1300+ species), geographically widespread avian group, the suboscine birds. We first examine patterns of gene tree discordance across the geographic distribution of the entire clade. We then evaluate the signal of introgression in a subset of 206 species triads using Patterson's D‐statistic and test for associations between introgression signal and evolutionary, geographic, and environmental variables. We find that gene tree discordance varies across lineages and geographic regions. The signal of introgression is highest in cases where species occur in close geographic proximity and in regions with more dynamic climates since the Pleistocene. Our results highlight the potential of phylogenomic datasets for examining broad patterns of hybridization and suggest that the degree of introgression between diverging lineages might be predictable based on the setting in which they occur.
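Patterson's D-statistic, used in the abstract above to evaluate introgression signal in species triads, compares the counts of two discordant site patterns (ABBA and BABA). The following is a minimal sketch only, assuming a single aligned sequence per taxon (P1, P2, P3, outgroup) and treating the outgroup allele as ancestral; the function name `patterson_d` and the toy sequences are hypothetical and are not taken from Singhal et al.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA) for four aligned sequences.

    Hypothetical helper for illustration: the outgroup allele is treated as
    ancestral ("A") and the other allele at the site as derived ("B").
    """
    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        # Skip gaps/ambiguity codes and sites that are not biallelic.
        if not {a, b, c, o} <= valid or len({a, b, c, o}) != 2:
            continue
        if a == o and b == c and b != o:      # P1=A, P2=B, P3=B, O=A
            abba += 1
        elif b == o and a == c and a != o:    # P1=B, P2=A, P3=B, O=A
            baba += 1
    total = abba + baba
    # D is undefined when there are no ABBA or BABA sites.
    return None if total == 0 else (abba - baba) / total


# Toy alignment with three ABBA sites and one BABA site.
d = patterson_d("ACGTAT", "GTATAC", "GTGTCC", "ACATAT")
print(d)  # 0.5
```

A positive D indicates excess allele sharing between P2 and P3, and a negative D between P1 and P3. Significance is usually assessed with a block jackknife over the genome, which this sketch omits.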
Affiliation(s)
- Sonal Singhal
- Department of Biology, California State University, Dominguez Hills, Carson, California 90747
- Graham E Derryberry
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee 37996
- Gustavo A Bravo
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138; Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138
- Elizabeth P Derryberry
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee 37996
- Robb T Brumfield
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803
- Michael G Harvey
- Department of Biological Sciences, The University of Texas at El Paso, El Paso, Texas 79968; Biodiversity Collections, The University of Texas at El Paso, El Paso, Texas 79968
|
19
|
Górniak M, Szlachetko DL, Olędrzyńska N, Naczk AM, Mieszkowska A, Boss L, Ziętara MS. Species Phylogeny versus Gene Trees: A Case Study of an Incongruent Data Matrix Based on Paphiopedilum Pfitz. (Orchidaceae). Int J Mol Sci 2021; 22:11393. [PMID: 34768824 PMCID: PMC8583834 DOI: 10.3390/ijms222111393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/14/2021] [Revised: 10/19/2021] [Accepted: 10/20/2021] [Indexed: 11/16/2022] Open
Abstract
The phylogeny of the genus Paphiopedilum based on the plastome is consistent with morphological analysis. However, to date, none of the analyzed nuclear markers has confirmed this. Topology incongruence among the trees of different nuclear markers concerns entire sections of the subgenus Paphiopedilum. The low-copy nuclear protein-coding gene PHYC was obtained for 22 species representing all sections and subgenera of Paphiopedilum. The nuclear-based phylogeny is supported by morphological characteristics and plastid data analysis. We assumed that an incongruence in nuclear gene trees is caused by ancestral homoploid hybridization. We present a model for inferring the phylogeny of the species despite the incongruence of the different tree topologies. Our analysis, based on six low-copy nuclear genes, is congruent with plastome phylogeny and has been confirmed by phylogenetic network analysis.
Affiliation(s)
- Marcin Górniak
- Department of Evolutionary Genetics and Biosystematics, University of Gdańsk, 80-309 Gdańsk, Poland
- Corresponding author
- Dariusz L. Szlachetko
- Department of Plant Taxonomy and Nature Conservation, University of Gdańsk, 80-309 Gdańsk, Poland
- Natalia Olędrzyńska
- Department of Plant Taxonomy and Nature Conservation, University of Gdańsk, 80-309 Gdańsk, Poland
- Aleksandra M. Naczk
- Department of Evolutionary Genetics and Biosystematics, University of Gdańsk, 80-309 Gdańsk, Poland
- Agata Mieszkowska
- Department of Evolutionary Genetics and Biosystematics, University of Gdańsk, 80-309 Gdańsk, Poland
- Lidia Boss
- Department of Bacterial Molecular Genetics, University of Gdańsk, 80-309 Gdańsk, Poland
- Marek S. Ziętara
- Department of Evolutionary Genetics and Biosystematics, University of Gdańsk, 80-309 Gdańsk, Poland
|
20
|
Rabier CE, Berry V, Stoltz M, Santos JD, Wang W, Glaszmann JC, Pardi F, Scornavacca C. On the inference of complex phylogenetic networks by Markov Chain Monte-Carlo. PLoS Comput Biol 2021; 17:e1008380. [PMID: 34478440 PMCID: PMC8445492 DOI: 10.1371/journal.pcbi.1008380] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/04/2020] [Revised: 09/16/2021] [Accepted: 07/13/2021] [Indexed: 11/19/2022] Open
Abstract
For various species, high-quality sequences and complete genomes are nowadays available for many individuals. This makes data analysis challenging, as methods need not only to be accurate, but also time efficient given the tremendous amount of data to process. In this article, we introduce an efficient method to infer the evolutionary history of individuals under the multispecies coalescent model in networks (MSNC). Phylogenetic networks are an extension of phylogenetic trees that can contain reticulate nodes, which make it possible to model complex biological events such as horizontal gene transfer, hybridization and introgression. We present a novel way to compute the likelihood of biallelic markers sampled along genomes whose evolution involved such events. This likelihood computation is at the heart of a Bayesian network inference method called SnappNet, as it extends the Snapp method, which infers evolutionary trees under the multispecies coalescent model, to networks. SnappNet is available as a package of the well-known BEAST 2 software. Recently, the MCMC_BiMarkers method, implemented in PhyloNet, also extended Snapp to networks. Both methods take biallelic markers as input, rely on the same model of evolution and sample networks in a Bayesian framework, though using different methods for computing priors. However, SnappNet relies on algorithms that are exponentially more time-efficient on non-trivial networks. Using simulations, we compare performances of SnappNet and MCMC_BiMarkers. We show that both methods enjoy similar abilities to recover simple networks, but SnappNet is more accurate than MCMC_BiMarkers on more complex network scenarios. Also, on complex networks, SnappNet is found to be far faster than MCMC_BiMarkers in terms of time required for the likelihood computation. We finally illustrate SnappNet performances on a rice data set. SnappNet infers a scenario that is consistent with previous results and provides additional understanding of rice evolution.
Affiliation(s)
- Charles-Elie Rabier
- Institut des Sciences de l’Evolution (ISEM), Université de Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier, CNRS, Montpellier, France
- Institut Montpelliérain Alexander Grothendieck (IMAG), Université de Montpellier, CNRS, Montpellier, France
- Vincent Berry
- Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier, CNRS, Montpellier, France
- Marnus Stoltz
- Institut des Sciences de l’Evolution (ISEM), Université de Montpellier, CNRS, EPHE, IRD, Montpellier, France
- João D. Santos
- CIRAD, UMR AGAP, Montpellier, France
- Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP), Université de Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Wensheng Wang
- Institute of Crop Sciences (ICS), Chinese Academy of Agricultural Sciences, Beijing, China
- Jean-Christophe Glaszmann
- CIRAD, UMR AGAP, Montpellier, France
- Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP), Université de Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Fabio Pardi
- Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier, CNRS, Montpellier, France
- Celine Scornavacca
- Institut des Sciences de l’Evolution (ISEM), Université de Montpellier, CNRS, EPHE, IRD, Montpellier, France
|
21
|
|
22
|
Suvorov A, Scornavacca C, Fujimoto MS, Bodily P, Clement M, Crandall KA, Whiting MF, Schrider DR, Bybee SM. Deep ancestral introgression shapes evolutionary history of dragonflies and damselflies. Syst Biol 2021; 71:526-546. [PMID: 34324671 PMCID: PMC9017697 DOI: 10.1093/sysbio/syab063] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/21/2021] [Revised: 07/20/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022] Open
Abstract
Introgression is an important biological process affecting at least 10% of the extant species in the animal kingdom. Introgression significantly impacts inference of phylogenetic species relationships where a strictly binary tree model cannot adequately explain reticulate net-like species relationships. Here we use phylogenomic approaches to understand patterns of introgression along the evolutionary history of a unique, non-model insect system: dragonflies and damselflies (Odonata). We demonstrate that introgression is a pervasive evolutionary force across various taxonomic levels within Odonata. In particular, we show that the morphologically "intermediate" species of Anisozygoptera (one of the three primary suborders within Odonata besides Zygoptera and Anisoptera), which retain phenotypic characteristics of the other two suborders, experienced high levels of introgression likely coming from zygopteran genomes. Additionally, we find evidence for multiple cases of deep inter-superfamilial ancestral introgression.
Affiliation(s)
- Anton Suvorov
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Celine Scornavacca
- Institut des Sciences de l'Evolution Université de Montpellier, CNRS, IRD, EPHE CC 064, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- M Stanley Fujimoto
- Department of Computer Science, Brigham Young University, Provo, UT, United States
- Paul Bodily
- Department of Computer Science, Idaho State University, Pocatello, ID, United States
- Mark Clement
- Department of Computer Science, Brigham Young University, Provo, UT, United States
- Keith A Crandall
- Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, United States
- Michael F Whiting
- Department of Biology, Brigham Young University, Provo, UT, United States; M.L. Bean Museum, Brigham Young University, Provo, UT, United States
- Daniel R Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Seth M Bybee
- Department of Biology, Brigham Young University, Provo, UT, United States; M.L. Bean Museum, Brigham Young University, Provo, UT, United States
|
23
|
Knope ML, Bellinger MR, Datlof EM, Gallaher TJ, Johnson MA. Insights into the Evolutionary History of the Hawaiian Bidens (Asteraceae) Adaptive Radiation Revealed Through Phylogenomics. J Hered 2021; 111:119-137. [PMID: 31953949 DOI: 10.1093/jhered/esz066] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/14/2019] [Accepted: 10/31/2019] [Indexed: 12/14/2022] Open
Abstract
Hawaiian plant radiations often result in lineages with exceptionally high species richness and extreme morphological and ecological differentiation. However, they typically display low levels of genetic variation, hindering the use of classic DNA markers to resolve their evolutionary histories. Here we utilize a phylogenomic approach to generate the first generally well-resolved phylogenetic hypothesis for the evolution of the Hawaiian Bidens (Asteraceae) adaptive radiation, including refined initial colonization and divergence time estimates. We sequenced the chloroplast genome (plastome) and nuclear ribosomal complex for 18 of the 19 endemic species of Hawaiian Bidens and 4 outgroup species. Phylogenomic analyses based on the concatenated dataset (plastome and nuclear) resulted in identical Bayesian and Maximum Likelihood trees with high statistical support at most nodes. Estimates from dating analyses were similar across datasets, with the crown group emerging ~1.76-1.82 Mya. Biogeographic analyses based on the nuclear and concatenated datasets indicated that colonization within the Hawaiian Islands generally followed the progression rule with 67-80% of colonization events from older to younger islands, while only 53% of events followed the progression rule in the plastome analysis. We find strong evidence for nuclear-plastome conflict indicating a potentially important role for hybridization in the evolution of the group. However, incomplete lineage sorting cannot be ruled out due to the small number of independent loci analyzed. This study contributes new insights into species relationships and the biogeographic history of the explosive Hawaiian Bidens adaptive radiation.
Affiliation(s)
- Matthew L Knope
- Department of Biology, University of Hawai'i at Hilo, Hilo, HI
- Erin M Datlof
- Department of Biology, University of Hawai'i at Hilo, Hilo, HI
- Timothy J Gallaher
- Department of Biology, University of Washington, Seattle, WA; Bernice Pauahi Bishop Museum, Honolulu, HI
- Melissa A Johnson
- USDA-ARS, Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center, Hilo, HI
|
24
|
Ferreira MS, Jones MR, Callahan CM, Farelo L, Tolesa Z, Suchentrunk F, Boursot P, Mills LS, Alves PC, Good JM, Melo-Ferreira J. The Legacy of Recurrent Introgression during the Radiation of Hares. Syst Biol 2021; 70:593-607. [PMID: 33263746 PMCID: PMC8048390 DOI: 10.1093/sysbio/syaa088] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/28/2020] [Revised: 11/06/2020] [Accepted: 11/13/2020] [Indexed: 12/30/2022] Open
Abstract
Hybridization may often be an important source of adaptive variation, but the extent and long-term impacts of introgression have seldom been evaluated in the phylogenetic context of a radiation. Hares (Lepus) represent a widespread mammalian radiation of 32 extant species characterized by striking ecological adaptations and recurrent admixture. To understand the relevance of introgressive hybridization during the diversification of Lepus, we analyzed whole exome sequences (61.7 Mb) from 15 species of hares (1-4 individuals per species), spanning the global distribution of the genus, and two outgroups. We used a coalescent framework to infer species relationships and divergence times, despite extensive genealogical discordance. We found high levels of allele sharing among species and show that this reflects extensive incomplete lineage sorting and temporally layered hybridization. Our results revealed recurrent introgression at all stages along the Lepus radiation, including recent gene flow between extant species since the last glacial maximum but also pervasive ancient introgression occurring since near the origin of the hare lineages. We show that ancient hybridization between northern hemisphere species has resulted in shared variation of potential adaptive relevance to highly seasonal environments, including genes involved in circadian rhythm regulation, pigmentation, and thermoregulation. Our results illustrate how the genetic legacy of ancestral hybridization may persist across a radiation, leaving a long-lasting signature of shared genetic variation that may contribute to adaptation. [Adaptation; ancient introgression; hybridization; Lepus; phylogenomics.].
Affiliation(s)
- Mafalda S Ferreira
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Matthew R Jones
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Colin M Callahan
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Liliana Farelo
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Zelalem Tolesa
  - Department of Biology, Hawassa University, Hawassa, Ethiopia
- Franz Suchentrunk
  - Department for Interdisciplinary Life Sciences, Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna, Vienna, Austria
- Pierre Boursot
  - Institut des Sciences de l’Évolution Montpellier (ISEM), Université de Montpellier, CNRS, IRD, EPHE, France
- L Scott Mills
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
  - Office of Research and Creative Scholarship, University of Montana, Missoula, Montana, United States of America
- Paulo C Alves
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
- Jeffrey M Good
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
- José Melo-Ferreira
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal

Jeffrey M. Good and José Melo-Ferreira shared the senior authorship.
|
25
|
Allman ES, Mitchell JD, Rhodes JA. Gene tree discord, simplex plots, and statistical tests under the coalescent. Syst Biol 2021; 71:929-942. [PMID: 33560348 DOI: 10.1093/sysbio/syab008] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/12/2020] [Revised: 01/31/2021] [Accepted: 02/03/2021] [Indexed: 02/06/2023] Open
Abstract
A simple graphical device, the simplex plot of quartet concordance factors, is introduced to aid in the exploration of a collection of gene trees on a common set of taxa. A single plot summarizes all gene tree discord, and allows for visual comparison to the expected discord from the multispecies coalescent model (MSC) of incomplete lineage sorting on a species tree. A formal statistical procedure is described that can quantify the deviation from expectation for each subset of four taxa, suggesting when the data is not in accord with the MSC, and thus that either gene tree inference error is substantial or a more complex model such as that on a network may be required. If the collection of gene trees is in accord with the MSC, the plots reveal when substantial incomplete lineage sorting is present. Applications to both simulated and empirical multilocus data sets illustrate the insights provided.
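As a rough illustration of the simplex plot described in this abstract, the sketch below normalizes the counts of the three unrooted topologies for one four-taxon set into a quartet concordance factor and maps it to 2D coordinates inside an equilateral triangle. The function names and the vertex placement are assumptions made for illustration, not the authors' implementation, and the formal hypothesis test is not reproduced.

```python
import math


def concordance_factor(n12_34, n13_24, n14_23):
    """Turn counts of the three unrooted quartet topologies into proportions.

    Each argument is the number of gene trees displaying that topology for
    one fixed set of four taxa (hypothetical input format).
    """
    total = n12_34 + n13_24 + n14_23
    if total == 0:
        raise ValueError("no gene trees resolve this quartet")
    return (n12_34 / total, n13_24 / total, n14_23 / total)


def simplex_xy(cf):
    """Map a concordance factor (p1, p2, p3) to a point in a triangle.

    Assumed vertex placement: topology 1 at (0, 0), topology 2 at (1, 0),
    topology 3 at (0.5, sqrt(3)/2). The point is the barycentric combination
    of the three vertices weighted by p1, p2, p3.
    """
    p1, p2, p3 = cf
    x = p2 + 0.5 * p3
    y = (math.sqrt(3) / 2) * p3
    return (x, y)


if __name__ == "__main__":
    # Toy counts: 60 gene trees support one topology, 25 and 15 the others.
    cf = concordance_factor(60, 25, 15)
    print(cf, simplex_xy(cf))
```

Under the multispecies coalescent on a species tree, the two minor topologies are expected to be equally frequent, so points for tree-like quartets should fall near the line segments joining the centroid of the triangle to its vertices.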
Affiliation(s)
- Elizabeth S Allman
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK 99709, USA
- Jonathan D Mitchell
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK 99709, USA
  - Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
- John A Rhodes
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK 99709, USA
|
26
|
Koch H, DeGiorgio M. Maximum Likelihood Estimation of Species Trees from Gene Trees in the Presence of Ancestral Population Structure. Genome Biol Evol 2020; 12:3977-3995. [PMID: 32022857 PMCID: PMC7061232 DOI: 10.1093/gbe/evaa022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 01/23/2020] [Indexed: 11/12/2022] Open
Abstract
Though large multilocus genomic data sets have led to overall improvements in phylogenetic inference, they have posed the new challenge of addressing conflicting signals across the genome. In particular, ancestral population structure, which has been uncovered in a number of diverse species, can skew gene tree frequencies, thereby hindering the performance of species tree estimators. Here we develop a novel maximum likelihood method, termed TASTI (Taxa with Ancestral structure Species Tree Inference), that can infer phylogenies under such scenarios, and find that it has increasing accuracy with increasing numbers of input gene trees, contrasting with the relatively poor performances of methods not tailored for ancestral structure. Moreover, we propose a supertree approach that allows TASTI to scale computationally with increasing numbers of input taxa. We use genetic simulations to assess TASTI's performance in the three- and four-taxon settings and demonstrate the application of TASTI on a six-species Afrotropical mosquito data set. Finally, we have implemented TASTI in an open-source software package for ease of use by the scientific community.
Affiliation(s)
- Hillary Koch
  - Department of Statistics, Pennsylvania State University
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University
|
27
|
Cai L, Xi Z, Lemmon EM, Lemmon AR, Mast A, Buddenhagen CE, Liu L, Davis CC. The Perfect Storm: Gene Tree Estimation Error, Incomplete Lineage Sorting, and Ancient Gene Flow Explain the Most Recalcitrant Ancient Angiosperm Clade, Malpighiales. Syst Biol 2020; 70:491-507. [PMID: 33169797 DOI: 10.1093/sysbio/syaa083] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 03/19/2019] [Revised: 10/20/2020] [Accepted: 10/28/2020] [Indexed: 12/20/2022] Open
Abstract
The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order includes 9 of the top 10 most unstable nodes in angiosperms, which have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 10.0%, 34.8%, and 21.4% of the variation, respectively. Together, our results suggest that a perfect storm of factors likely influence this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality, and obscures our ability to more clearly discern the process of evolution. [Coalescent; concatenation; flanking region; hybrid enrichment, introgression; phylogenomics; rapid radiation, triplet frequency.].
Affiliation(s)
- Liming Cai
  - Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Zhenxiang Xi
  - Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Emily Moriarty Lemmon
  - Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- Alan R Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA
- Austin Mast
  - Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- Christopher E Buddenhagen
  - Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
  - AgResearch, 10 Bisley Road, Hamilton 3214, New Zealand
- Liang Liu
  - Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Charles C Davis
  - Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
|
28
|
Campbell MA, Buser TJ, Alfaro ME, López JA. Addressing incomplete lineage sorting and paralogy in the inference of uncertain salmonid phylogenetic relationships. PeerJ 2020; 8:e9389. [PMID: 32685284 PMCID: PMC7337038 DOI: 10.7717/peerj.9389] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/03/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022] Open
Abstract
Recent and continued progress in the scale and sophistication of phylogenetic research has yielded substantial advances in knowledge of the tree of life; however, segments of that tree remain unresolved and continue to produce contradicting or unstable results. These poorly resolved relationships may be the product of methodological shortcomings or of an evolutionary history that did not generate the signal traits needed for its eventual reconstruction. Relationships within the euteleost fish family Salmonidae have proven challenging to resolve in molecular phylogenetics studies in part due to ancestral autopolyploidy contributing to conflicting gene trees. We examine a sequence capture dataset from salmonids and use alternative strategies to accommodate the effects of gene tree conflict based on aspects of salmonid genome history and the multispecies coalescent. We investigate in detail three uncertain relationships: (1) subfamily branching, (2) monophyly of Coregonus and (3) placement of Parahucho. Coregoninae and Thymallinae are resolved as sister taxa, although conflicting topologies are found across analytical strategies. We find inconsistent and generally low support for the monophyly of Coregonus, including in results of analyses with the most extensive dataset and complex model. The most consistent placement of Parahucho is as sister lineage of Salmo.
Affiliation(s)
- Matthew A. Campbell
  - University of Alaska Museum, University of Alaska—Fairbanks, Fairbanks, AK, USA
- Thaddaeus J. Buser
  - Department of Fisheries and Wildlife, Oregon State University, Corvallis, OR, USA
- Michael E. Alfaro
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, USA
- J. Andrés López
  - University of Alaska Museum, University of Alaska—Fairbanks, Fairbanks, AK, USA
  - College of Fisheries and Ocean Sciences, University of Alaska—Fairbanks, Fairbanks, AK, USA
|
29
|
Bernhardt N, Brassac J, Dong X, Willing EM, Poskar CH, Kilian B, Blattner FR. Genome-wide sequence information reveals recurrent hybridization among diploid wheat wild relatives. Plant J 2020; 102:493-506. [PMID: 31821649 DOI: 10.1111/tpj.14641] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/01/2019] [Revised: 11/13/2019] [Accepted: 11/28/2019] [Indexed: 05/07/2023]
Abstract
Many conflicting hypotheses regarding the relationships among crops and wild species closely related to wheat (the genera Aegilops, Amblyopyrum, and Triticum) have been postulated. The contribution of hybridization to the evolution of these taxa is intensely discussed. To determine possible causes for this, and provide a phylogeny of the diploid taxa based on genome-wide sequence information, independent data were obtained from genotyping-by-sequencing and a target-enrichment experiment that returned 244 low-copy nuclear loci. The data were analyzed using Bayesian, likelihood and coalescent-based methods. D statistics were used to test if incomplete lineage sorting alone or together with hybridization is the source for incongruent gene trees. Here we present the phylogeny of all diploid species of the wheat wild relatives. We hypothesize that most of the wheat-group species were shaped by a primordial homoploid hybrid speciation event involving the ancestral Triticum and Am. muticum lineages to form all other species except Ae. speltoides. This hybridization event was followed by multiple introgressions affecting all taxa except Triticum. Mostly progenitors of the extant species were involved in these processes, while recent interspecific gene flow seems insignificant. The composite nature of many genomes of wheat-group taxa results in complicated patterns of diploid contributions when these lineages are involved in polyploid formation, which is, for example, the case for tetraploid and hexaploid wheats. Our analysis provides phylogenetic relationships and a testable hypothesis for the genome compositions in the basic evolutionary units within the wheat group of Triticeae.
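The D statistic referred to in this abstract is commonly computed from ABBA and BABA site-pattern counts as D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below shows that calculation for four aligned sequences ordered (P1, P2, P3, outgroup). It is a minimal illustration, not the pipeline used in the study: the function name and toy sequences are hypothetical, input is assumed to be uppercase, and significance testing (typically a block jackknife) is omitted.

```python
def abba_baba_d(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences.

    The outgroup allele is treated as ancestral ("A"); the other allele at a
    biallelic site is derived ("B"). Sites with gaps, ambiguity codes, or
    more than two alleles are skipped.
    """
    n_abba = 0
    n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue  # skip gaps and ambiguous bases
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c:
            n_abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c:
            n_baba += 1  # P1 and P3 share the derived allele
    informative = n_abba + n_baba
    if informative == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (n_abba - n_baba) / informative, n_abba, n_baba


if __name__ == "__main__":
    # Toy alignment: two ABBA sites and one BABA site.
    print(abba_baba_d("AACG", "AGAA", "AGAG", "AACA"))
```

Under incomplete lineage sorting alone the two patterns are expected in equal numbers, so D near zero is consistent with ILS, while a consistent excess of one pattern points to gene flow between P3 and either P1 or P2.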
Affiliation(s)
- Nadine Bernhardt
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Jonathan Brassac
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Xue Dong
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
  - Plant Germplasm and Genomics Centre, Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, 650201, Kunming, Yunnan, China
- Eva-Maria Willing
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
- C Hart Poskar
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Benjamin Kilian
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
  - Global Crop Diversity Trust, 53113, Bonn, Germany
- Frank R Blattner
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103, Leipzig, Germany
|
30
|
Springer MS, Molloy EK, Sloan DB, Simmons MP, Gatesy J. ILS-Aware Analysis of Low-Homoplasy Retroelement Insertions: Inference of Species Trees and Introgression Using Quartets. J Hered 2019; 111:147-168. [DOI: 10.1093/jhered/esz076] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/13/2019] [Accepted: 12/12/2019] [Indexed: 12/20/2022] Open
Abstract
DNA sequence alignments have provided the majority of data for inferring phylogenetic relationships with both concatenation and coalescent methods. However, DNA sequences are susceptible to extensive homoplasy, especially for deep divergences in the Tree of Life. Retroelement insertions have emerged as a powerful alternative to sequences for deciphering evolutionary relationships because these data are nearly homoplasy-free. In addition, retroelement insertions satisfy the “no intralocus-recombination” assumption of summary coalescent methods because they are singular events and better approximate neutrality relative to DNA loci commonly sampled in phylogenomic studies. Retroelements have traditionally been analyzed with parsimony, distance, and network methods. Here, we analyze retroelement data sets for vertebrate clades (Placentalia, Laurasiatheria, Balaenopteroidea, Palaeognathae) with 2 ILS-aware methods that operate by extracting, weighting, and then assembling unrooted quartets into a species tree. The first approach constructs a species tree from retroelement bipartitions with ASTRAL, and the second method is based on split-decomposition with parsimony. We also develop a Quartet-Asymmetry test to detect hybridization using retroelements. Both ILS-aware methods recovered the same species-tree topology for each data set. The ASTRAL species trees for Laurasiatheria have consecutive short branch lengths in the anomaly zone whereas Palaeognathae is outside of this zone. For the Balaenopteroidea data set, which includes rorquals (Balaenopteridae) and gray whale (Eschrichtiidae), both ILS-aware methods resolved balaenopterids as paraphyletic. Application of the Quartet-Asymmetry test to this data set detected 19 different quartets of species for which historical introgression may be inferred. Evidence for introgression was not detected in the other data sets.
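The logic behind a quartet-asymmetry style test can be sketched as follows: under ILS alone, the two minority resolutions of a four-taxon set are expected to be supported by equal numbers of markers, so a strong imbalance between them suggests introgression. The abstract does not state which test statistic is used, so the exact two-sided binomial test below is a stand-in chosen for illustration, and the function name and input format are hypothetical.

```python
from math import comb


def minor_quartet_asymmetry_pvalue(counts):
    """Exact two-sided binomial p-value for imbalance of the two minor counts.

    `counts` holds the number of markers supporting each of the three
    possible resolutions of one four-taxon set. The best-supported resolution
    is set aside, and the other two are compared against a 50:50 expectation.
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three counts")
    _major, minor_a, minor_b = sorted(counts, reverse=True)
    n = minor_a + minor_b
    if n == 0:
        return 1.0  # nothing to compare
    k = min(minor_a, minor_b)
    # Lower-tail probability P(X <= k) for X ~ Binomial(n, 0.5).
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * lower_tail)


if __name__ == "__main__":
    print(minor_quartet_asymmetry_pvalue([50, 10, 10]))  # balanced minors
    print(minor_quartet_asymmetry_pvalue([50, 20, 0]))   # strongly skewed
```

When many quartets are screened this way, the resulting p-values would need a multiple-testing correction before any single quartet is read as evidence of introgression.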
Affiliation(s)
- Mark S Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA
- Erin K Molloy
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
- Daniel B Sloan
  - Department of Biology, Colorado State University, Fort Collins, CO
- Mark P Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO
- John Gatesy
  - Division of Vertebrate Zoology and Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY
|
31
|
Allman ES, Baños H, Rhodes JA. NANUQ: a method for inferring species networks from gene trees under the coalescent model. Algorithms Mol Biol 2019; 14:24. [PMID: 31827592 PMCID: PMC6896299 DOI: 10.1186/s13015-019-0159-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 06/04/2019] [Accepted: 11/07/2019] [Indexed: 01/07/2023] Open
Abstract
Species networks generalize the notion of species trees to allow for hybridization or other lateral gene transfer. Under the network multispecies coalescent model, individual gene trees arising from a network can have any topology, but arise with frequencies dependent on the network structure and numerical parameters. We propose a new algorithm for statistical inference of a level-1 species network under this model, from data consisting of gene tree topologies, and provide the theoretical justification for it. The algorithm is based on an analysis of quartets displayed on gene trees, combining several statistical hypothesis tests with combinatorial ideas such as a quartet-based intertaxon distance appropriate to networks, the NeighborNet algorithm for circular split systems, and the Circular Network algorithm for constructing a splits graph.
|
32
|
Hirano T, Saito T, Tsunamoto Y, Koseki J, Prozorova L, Do VT, Matsuoka K, Nakai K, Suyama Y, Chiba S. Role of ancient lakes in genetic and phenotypic diversification of freshwater snails. Mol Ecol 2019; 28:5032-5051. [DOI: 10.1111/mec.15272] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/22/2019] [Revised: 09/14/2019] [Accepted: 09/16/2019] [Indexed: 01/17/2023]
Affiliation(s)
- Takahiro Hirano
  - Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Takumi Saito
  - Department of Biology, Faculty of Science, Toho University, Funabashi, Japan
- Yoshihiro Tsunamoto
  - Tohoku Research Center, Forestry and Forest Products Research Institute, Morioka, Japan
- Joichiro Koseki
  - Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Larisa Prozorova
  - Federal Scientific Center of the East Asia Terrestrial Biodiversity, Far Eastern Branch, Russian Academy of Sciences, Vladivostok, Russia
- Van Tu Do
  - Institute of Ecology and Biological Resources, Vietnam Academy of Science and Technology, Hanoi, Vietnam
  - Graduate University of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Yoshihisa Suyama
  - Kawatabi Field Science Center, Graduate School of Agricultural Science, Tohoku University, Osaki, Japan
- Satoshi Chiba
  - Graduate School of Life Sciences, Tohoku University, Sendai, Japan
  - Center for Northeast Asian Studies, Tohoku University, Sendai, Japan
|
33
|
Karimi N, Grover CE, Gallagher JP, Wendel JF, Ané C, Baum DA. Reticulate Evolution Helps Explain Apparent Homoplasy in Floral Biology and Pollination in Baobabs (Adansonia; Bombacoideae; Malvaceae). Syst Biol 2019; 69:462-478. [DOI: 10.1093/sysbio/syz073] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/25/2019] [Revised: 10/24/2019] [Accepted: 10/26/2019] [Indexed: 12/17/2022] Open
Abstract
Baobabs (Adansonia) are a cohesive group of tropical trees with a disjunct distribution in Australia, Madagascar, and continental Africa, and diverse flowers associated with two pollination modes. We used custom-targeted sequence capture in conjunction with new and existing phylogenetic comparative methods to explore the evolution of floral traits and pollination systems while allowing for reticulate evolution. Our analyses suggest that relationships in Adansonia are confounded by reticulation, with network inference methods supporting at least one reticulation event. The best supported hypothesis involves introgression between Adansonia rubrostipa and core Longitubae, both of which are hawkmoth pollinated with yellow/red flowers, but there is also some support for introgression between the African lineage and Malagasy Brevitubae, which are both mammal-pollinated with white flowers. New comparative methods for phylogenetic networks were developed that allow maximum-likelihood inference of ancestral states and were applied to study the apparent homoplasy in floral biology and pollination mode seen in Adansonia. This analysis supports a role for introgressive hybridization in morphological evolution even in a clade with highly divergent and geographically widespread species. Our new comparative methods for discrete traits on species networks are implemented in the software PhyloNetworks. [Comparative methods; Hyb-Seq; introgression; network inference; population trees; reticulate evolution; species tree inference; targeted sequence capture.]
Affiliation(s)
- Nisa Karimi
  - Department of Botany, University of Wisconsin – Madison, 430 Lincoln Drive, Madison, WI 53706, USA
- Corrinne E Grover
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, 2200 Osborn Drive, Ames, IA 50011, USA
- Joseph P Gallagher
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, 2200 Osborn Drive, Ames, IA 50011, USA
  - Department of Biology, University of Massachusetts, 611 North Pleasant Street, Amherst, MA 01003, USA
- Jonathan F Wendel
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, 2200 Osborn Drive, Ames, IA 50011, USA
- Cécile Ané
  - Department of Botany, University of Wisconsin – Madison, 430 Lincoln Drive, Madison, WI 53706, USA
  - Department of Statistics, University of Wisconsin – Madison, 1300 University Ave, WI, 53706, USA
- David A Baum
  - Department of Botany, University of Wisconsin – Madison, 430 Lincoln Drive, Madison, WI 53706, USA
  - Wisconsin Institute for Discovery, 330 N Orchard Street, Madison, 430 Lincoln Drive, Madison, WI 53706, USA
|
34
|
Blanco-Pastor JL, Bertrand YJK, Liberal IM, Wei Y, Brummer EC, Pfeil BE. Evolutionary networks from RADseq loci point to hybrid origins of Medicago carstiensis and Medicago cretacea. Am J Bot 2019; 106:1219-1228. [PMID: 31535720 DOI: 10.1002/ajb2.1352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/21/2019] [Accepted: 07/12/2019] [Indexed: 06/10/2023]
Abstract
PREMISE: Although hybridization has played an important role in the evolution of many plant species, phylogenetic reconstructions that include hybridizing lineages have been historically constrained by the available models and data. Restriction-site-associated DNA sequencing (RADseq) has been a popular sequencing technique for the reconstruction of hybridization in the next-generation sequencing era. However, the utility of RADseq for the reconstruction of complex evolutionary networks has not been thoroughly investigated. Conflicting phylogenetic relationships in the genus Medicago have been mainly attributed to hybridization, but the specific hybrid origins of taxa have not yet been clarified. METHODS: We obtained new molecular data from diploid species of Medicago section Medicago using single-digest RADseq to reconstruct evolutionary networks from gene trees, an approach that is computationally tractable with data sets that include several species and complex hybridization patterns. RESULTS: Our analyses revealed that assembly filters to exclusively select a small set of loci with high phylogenetic information led to the most-divergent network topologies. Conversely, alternative clustering thresholds or filters on the number of samples per locus had a lower impact on networks. A strong hybridization signal was detected for M. carstiensis and M. cretacea, while signals were less clear for M. rugosa, M. rhodopea, M. suffruticosa, M. marina, M. scutellata, and M. sativa. CONCLUSIONS: Complex network reconstructions from RADseq gene trees were not robust under variations of the assembly parameters and filters. But when the most-divergent networks were discarded, all remaining analyses consistently supported a hybrid origin for M. carstiensis and M. cretacea.
Affiliation(s)
- José Luis Blanco-Pastor
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - INRA, Centre Nouvelle-Aquitaine-Poitiers, UR4 (URP3F), 86600, Lusignan, France
- Yann J K Bertrand
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - Institute of Botany, Czech Academy of Sciences, Zámek 1, 25243, Průhonice, Czech Republic
- Yanling Wei
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- E Charles Brummer
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- Bernard E Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
|
35
|
Van Iersel L, Janssen R, Jones M, Murakami Y, Zeh N. Polynomial-Time Algorithms for Phylogenetic Inference Problems involving duplication and reticulation. IEEE/ACM Trans Comput Biol Bioinform 2019; 17:14-26. [PMID: 31425045 DOI: 10.1109/tcbb.2019.2934957] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/10/2023]
Abstract
A common problem in phylogenetics is to try to infer a species phylogeny from gene trees. We consider different variants of this problem. The first variant, called Unrestricted Minimal Episodes Inference, aims at inferring a species tree based on a model with speciation and duplication where duplications are clustered in duplication episodes. The goal is to minimize the number of such episodes. The second variant, Parental Hybridization, aims at inferring a species network based on a model with speciation and reticulation. The goal is to minimize the number of reticulation events. It is a variant of the well-studied Hybridization Number problem with a more generous view on which gene trees are consistent with a given species network. We show that these seemingly different problems are in fact closely related and can, surprisingly, both be solved in polynomial time, using a structure we call "beaded trees". However, we also show that methods based on these problems have to be used with care because the optimal species phylogenies always have a restricted form. To mitigate this problem, we introduce a new variant of Unrestricted Minimal Episodes Inference that minimizes the duplication episode depth. We prove that this new variant of the problem can also be solved in polynomial time.
|
36
|
Tan M, Long H, Liao B, Cao Z, Yuan D, Tian G, Zhuang J, Yang J. QS-Net: Reconstructing Phylogenetic Networks Based on Quartet and Sextet. Front Genet 2019; 10:607. [PMID: 31396256 PMCID: PMC6667645 DOI: 10.3389/fgene.2019.00607] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/24/2018] [Accepted: 06/11/2019] [Indexed: 01/27/2023] Open
Abstract
Phylogenetic networks are used to estimate evolutionary relationships among biological entities or taxa involving reticulate events such as horizontal gene transfer, hybridization, recombination, and reassortment. In the past decade, many phylogenetic tree and network reconstruction methods have been proposed. Although they are highly accurate in reconstructing simple to moderately complex reticulate events, their performance decreases when several reticulate events are present simultaneously. In this paper, we proposed QS-Net, a phylogenetic network reconstruction method taking advantage of information on the relationship among six taxa. To evaluate the performance of QS-Net, we conducted experiments on three artificial sequence data sets simulated from an evolutionary tree, an evolutionary network involving three reticulate events, and a complex evolutionary network involving five reticulate events. Comparison with popular phylogenetic methods including Neighbor-Joining, Split-Decomposition, Neighbor-Net, and Quartet-Net suggests that QS-Net is comparable with other methods in reconstructing tree-like evolutionary histories, while it outperforms them in reconstructing reticulate events. In addition, we also applied QS-Net to real data, including a bacterial taxonomy data set consisting of 36 bacterial species and the whole genome sequences of 22 H7N9 influenza A viruses. The results indicate that QS-Net is capable of inferring commonly believed bacterial taxonomy and influenza evolution as well as identifying novel reticulate events. The software QS-Net is publicly available at https://github.com/Tmyiri/QS-Net.
Affiliation(s)
- Ming Tan
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Haixia Long
- School of Information Science and Technology, Hainan Normal University, Haikou, China
| | - Bo Liao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.,School of Information Science and Technology, Hainan Normal University, Haikou, China
| | - Zhi Cao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Dawei Yuan
- Geneis (Beijing) Co. Ltd., Beijing, China
| | - Geng Tian
- Geneis (Beijing) Co. Ltd., Beijing, China
| | - Jujuan Zhuang
- Department of Mathematics, Dalian Maritime University, Dalian, China
| | - Jialiang Yang
- School of Information Science and Technology, Hainan Normal University, Haikou, China.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
|
37
|
Abstract
Many methods exist for detecting introgression between nonsister species, but the most commonly used require either a single sequence from four or more taxa or multiple sequences from each of three taxa. Here, we present a test for introgression that uses only a single sequence from three taxa. This test, denoted D3, uses similar logic as the standard D-test for introgression, but by using pairwise distances instead of site patterns it is able to detect the same signal of introgression with fewer species. We use simulations to show that D3 has statistical power almost equal to D, demonstrating its use on a data set of wild bananas (Musa). The new test is easy to apply and easy to interpret, and should find wide use among currently available data sets.
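The distance-based logic described in this abstract can be illustrated with a minimal sketch. It assumes the statistic takes the form D3 = (d_BC - d_AC) / (d_BC + d_AC), where A and B are the two closest taxa and C is the third taxon, and it uses a simple proportion of differing sites as the pairwise distance. The function names `p_distance` and `d3` are hypothetical and are not taken from the authors' software; assessing significance (for example by resampling) is not shown.

```python
def p_distance(seq_x: str, seq_y: str) -> float:
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(seq_x, seq_y) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def d3(seq_a: str, seq_b: str, seq_c: str) -> float:
    """Assumed form of the statistic: (d_BC - d_AC) / (d_BC + d_AC).

    A value near 0 is expected when A and B are equally distant from C;
    a positive value means A is closer to C than B is.
    """
    d_ac = p_distance(seq_a, seq_c)
    d_bc = p_distance(seq_b, seq_c)
    total = d_bc + d_ac
    if total == 0:
        return 0.0  # A and B are both identical to C: no signal either way
    return (d_bc - d_ac) / total


# Toy aligned sequences (illustrative only, not real data).
taxon_c = "AAAAAAAAAA"
taxon_a = "AAAAAAAACC"  # 2 of 10 sites differ from C
taxon_b = "AAAAAACCCC"  # 4 of 10 sites differ from C
print(d3(taxon_a, taxon_b, taxon_c))
```

In this toy input A is closer to C than B is, so the statistic is positive; with real data a departure from zero would only be suggestive until tested against sampling variance.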
Affiliation(s)
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN
- Department of Computer Science, Indiana University, Bloomington, IN
| | - Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, IN
|
38
|
Hirano T, Saito T, Tsunamoto Y, Koseki J, Ye B, Do VT, Miura O, Suyama Y, Chiba S. Enigmatic incongruence between mtDNA and nDNA revealed by multi-locus phylogenomic analyses in freshwater snails. Sci Rep 2019; 9:6223. [PMID: 30996240 PMCID: PMC6470147 DOI: 10.1038/s41598-019-42682-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/28/2018] [Accepted: 04/05/2019] [Indexed: 02/08/2023]
Abstract
Phylogenetic incongruence has frequently been encountered among different molecular markers. Recent progress in molecular phylogenomics has provided detailed and important information for evolutionary biology and taxonomy. Here we focused on the freshwater viviparid snails (Cipangopaludina chinensis chinensis and C. c. laeta) of East Asia. We conducted phylogenetic analyses and divergence time estimation using two mitochondrial markers. We also performed population genetic analyses using genome-wide SNPs. We investigated how and which phylogenetic patterns reflect shell morphology. The results showed that these two species could be separated into four major mitochondrial clades, whereas the nuclear clusters supported two groups. The phylogenetic patterns of both mtDNA and nDNA largely reflected the geographical distribution. Shell morphology reflected the phylogenetic clusters based on nDNA. The findings also showed that these two species diversified in the Pliocene to early Pleistocene and that introgressive hybridisation occurred. The results also raise a taxonomic issue concerning the two species.
Affiliation(s)
- Takahiro Hirano
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA.
| | - Takumi Saito
- Graduate school of Life Sciences, Tohoku University, Miyagi, Japan
| | - Yoshihiro Tsunamoto
- Kawatabi Field Science Center, Graduate School of Agricultural Science, Tohoku University, Miyagi, Japan
| | - Joichiro Koseki
- Graduate school of Life Sciences, Tohoku University, Miyagi, Japan
| | - Bin Ye
- Graduate school of Life Sciences, Tohoku University, Miyagi, Japan
- Agricultural Experiment Station, Zhejiang University, Hangzhou, China
| | - Van Tu Do
- Institute of Ecology and Biological Resources, Vietnam Academy of Science and Technology, Hanoi, Vietnam
| | - Osamu Miura
- Faculty of Agriculture and Marine Science, Kochi University, Kochi, Japan
| | - Yoshihisa Suyama
- Kawatabi Field Science Center, Graduate School of Agricultural Science, Tohoku University, Miyagi, Japan
| | - Satoshi Chiba
- Graduate school of Life Sciences, Tohoku University, Miyagi, Japan
- Center for Northeast Asian Studies, Tohoku University, Miyagi, Japan
|
39
|
Zhang C, Ogilvie HA, Drummond AJ, Stadler T. Bayesian Inference of Species Networks from Multilocus Sequence Data. Mol Biol Evol 2019; 35:504-517. [PMID: 29220490 PMCID: PMC5850812 DOI: 10.1093/molbev/msx307] [Citation(s) in RCA: 103] [Impact Index Per Article: 20.6] [Indexed: 01/12/2023]
Abstract
Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data. A novel birth-hybridization process is used as the prior for the species network, and we assume a multispecies network coalescent prior for the embedded gene trees. We verify the ability of our method to correctly sample from the posterior distribution, and thus to infer a species network, through simulations. To quantify the power of our method, we reanalyze two large data sets of genes from spruces and yeasts. For the three closely related spruces, we verify the previously suggested homoploid hybridization event in this clade; for the yeast data, we find extensive hybridization events. Our method is available within the BEAST 2 add-on SpeciesNetwork, and thus provides an extensible framework for Bayesian inference of reticulate evolution.
Affiliation(s)
- Chi Zhang
- Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich, Basel, Switzerland.,Swiss Institute of Bioinformatics (SIB), Switzerland.,Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China
| | - Huw A Ogilvie
- Division of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australia.,Centre for Computational Evolution, University of Auckland, Auckland, New Zealand
| | - Alexei J Drummond
- Centre for Computational Evolution, University of Auckland, Auckland, New Zealand.,Department of Computer Science, University of Auckland, Auckland, New Zealand
| | - Tanja Stadler
- Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich, Basel, Switzerland.,Swiss Institute of Bioinformatics (SIB), Switzerland
|
40
|
Baños H. Identifying Species Network Features from Gene Tree Quartets Under the Coalescent Model. Bull Math Biol 2019; 81:494-534. [PMID: 30094772 PMCID: PMC6344282 DOI: 10.1007/s11538-018-0485-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/27/2017] [Accepted: 07/30/2018] [Indexed: 10/28/2022]
Abstract
We show that many topological features of level-1 species networks are identifiable from the distribution of the gene tree quartets under the network multi-species coalescent model. In particular, every cycle of size at least 4 and every hybrid node in a cycle of size at least 5 are identifiable. This is a step toward justifying the inference of such networks which was recently implemented by Solís-Lemus and Ané. We show additionally how to compute quartet concordance factors for a network in terms of simpler networks, and explore some circumstances in which cycles of size 3 and hybrid nodes in 4-cycles can be detected.
Affiliation(s)
- Hector Baños
- University of Alaska Fairbanks, P.O. Box 756660, Fairbanks, AK, 99775-6660, USA.
|
41
|
Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization. Bioinformatics and Phylogenetics 2019. [DOI: 10.1007/978-3-030-10837-3_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/12/2022]
|
42
|
Abstract
Most phylogenies are typically represented as purely bifurcating. However, as genomic data have become more common in phylogenetic studies, it is not unusual to find reticulation among terminal lineages or among internal nodes (deep time reticulation; DTR). In these situations, gene flow must have happened in the same or adjacent geographic areas for these DTRs to have occurred and therefore biogeographic reconstruction should provide similar area estimates for parental nodes, provided extinction or dispersal has not eroded these patterns. We examine the phylogeny of the widely distributed New World kingsnakes (Lampropeltis), determine if DTR is present in this group, and estimate the ancestral area for reticulation. Importantly, we develop a new method that uses coalescent simulations in a machine learning framework to show conclusively that this phylogeny is best represented as reticulating at deeper time. Using joint probabilities of ancestral area reconstructions on the bifurcating parental lineages from the reticulating node, we show that this reticulation likely occurred in northwestern Mexico/southwestern US, and subsequently, led to the diversification of the Mexican kingsnakes. This region has been previously identified as an area important for understanding speciation and secondary contact with gene flow in snakes and other squamates. This research shows that phylogenetic reticulation is common, even in well-studied groups, and that the geographic scope of ancient hybridization is recoverable.
Affiliation(s)
- Frank T Burbrink
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
| | - Marcelo Gehara
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
|
43
|
Blischak PD, Chifman J, Wolfe AD, Kubatko LS. HyDe: A Python Package for Genome-Scale Hybridization Detection. Syst Biol 2018; 67:821-829. [PMID: 29562307 DOI: 10.1093/sysbio/syy023] [Citation(s) in RCA: 123] [Impact Index Per Article: 20.5] [Received: 09/14/2017] [Accepted: 03/15/2018] [Indexed: 11/13/2022]
Abstract
The analysis of hybridization and gene flow among closely related taxa is a common goal for researchers studying speciation and phylogeography. Many methods for hybridization detection use simple site pattern frequencies from observed genomic data and compare them to null models that predict an absence of gene flow. The theory underlying the detection of hybridization using these site pattern probabilities exploits the relationship between the coalescent process for gene trees within population trees and the process of mutation along the branches of the gene trees. For certain models, site patterns are predicted to occur in equal frequency (i.e., their difference is 0), producing a set of functions called phylogenetic invariants. In this article, we introduce HyDe, a software package for detecting hybridization using phylogenetic invariants arising under the coalescent model with hybridization. HyDe is written in Python and can be used interactively or through the command line using pre-packaged scripts. We demonstrate the use of HyDe on simulated data, as well as on two empirical data sets from the literature. We focus in particular on identifying individual hybrids within population samples and on distinguishing between hybrid speciation and gene flow. HyDe is freely available as an open source Python package under the GNU GPL v3 on both GitHub (https://github.com/pblischak/HyDe) and the Python Package Index (PyPI: https://pypi.python.org/pypi/phyde).
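The idea that certain site patterns occur in equal frequency in the absence of gene flow can be illustrated with the classic ABBA-BABA (Patterson's D) statistic. The sketch below is not HyDe's own set of invariants or its API; it is a simpler, well-known member of the same family, shown for a four-taxon alignment ordered as (P1, P2, P3, outgroup). The function name `abba_baba_d` and the toy sequences are hypothetical.

```python
def abba_baba_d(p1: str, p2: str, p3: str, outgroup: str):
    """Count ABBA and BABA site patterns and return (D, n_abba, n_baba).

    With the outgroup allele called A and the derived allele B:
      ABBA: P1 = A, P2 = B, P3 = B
      BABA: P1 = B, P2 = A, P3 = A... written site-wise as P1 = P3 != P2 = outgroup
    Without gene flow the two counts are expected to be equal, so D is near 0.
    """
    n_abba = n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue  # skip gaps and ambiguous bases
        if a == o and b == c and b != o:
            n_abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            n_baba += 1  # P1 and P3 share the derived allele
    total = n_abba + n_baba
    d_stat = (n_abba - n_baba) / total if total else 0.0
    return d_stat, n_abba, n_baba


# Toy aligned sequences (illustrative only, not real data).
seq_p1 = "AAGAA"
seq_p2 = "GGAAA"
seq_p3 = "GGGAG"
seq_out = "AAAAA"
print(abba_baba_d(seq_p1, seq_p2, seq_p3, seq_out))
```

An excess of one pattern over the other is the signal of interest; judging whether the excess is significant requires a variance estimate, which this sketch leaves out.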
Affiliation(s)
- Paul D Blischak
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
| | - Julia Chifman
- Department of Mathematics and Statistics, American University, Washington, DC 20016, USA
| | - Andrea D Wolfe
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
| | - Laura S Kubatko
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA.,Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
|
44
|
Kang Q, Schardl CL, Moore N, Yoshida R. CURatio: Genome-wide phylogenomic analysis method using ratios of total branch lengths. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 17:10.1109/TCBB.2018.2878564. [PMID: 30387738 PMCID: PMC7372714 DOI: 10.1109/tcbb.2018.2878564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Evolutionary hypotheses provide important underpinnings of biological and medical sciences, and a comprehensive, genome-wide understanding of evolutionary relationships among organisms is needed to test and refine such hypotheses. Theory and empirical evidence clearly indicate that phylogenies (trees) of different genes (loci) should not display precisely matching topologies. The main reason for such phylogenetic incongruence is the reticulated evolutionary history of most species due to meiotic sexual recombination in eukaryotes, or horizontal transfers of genetic material in prokaryotes. Nevertheless, many genes should display topologically related phylogenies, and should group into one or more (for genetic hybrids) clusters in poly-dimensional "tree space". Unusual evolutionary histories or effects of selection may result in "outlier" genes with phylogenies that fall outside the main distribution(s) of trees in tree space. We present a new phylogenomic method, CURatio, which uses ratios of total branch lengths in gene trees to help identify phylogenetic outliers in a given set of ortholog groups from multiple genomes. An advantage of CURatio over other methods is that genes absent from and/or duplicated in some genomes can be included in the analysis. We conducted a simulation study under the coalescent model, and showed that, given sufficient species depth and topological difference, these ratios are significantly higher for the "outlier" gene phylogenies. Also, we applied CURatio to a set of annotated genomes of the fungal family Clavicipitaceae, and identified alkaloid biosynthesis genes as outliers, probably due to a history of duplication and loss. The source code is available at https://github.com/QiwenKang/CURatio, and the empirical data set on Clavicipitaceae and simulated data set are available at Mendeley https://data.mendeley.com/datasets/mrxts7wjrr/1.
|
45
|
Abstract
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Affiliation(s)
- Luay Nakhleh
- Computer Science.,BioSciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
46
|
Tang Q, Edwards SV, Rheindt FE. Rapid diversification and hybridization have shaped the dynamic history of the genus Elaenia. Mol Phylogenet Evol 2018; 127:522-533. [DOI: 10.1016/j.ympev.2018.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/11/2017] [Revised: 04/11/2018] [Accepted: 05/08/2018] [Indexed: 01/04/2023]
|
47
|
Degnan JH. Modeling Hybridization Under the Network Multispecies Coalescent. Syst Biol 2018; 67:786-799. [PMID: 29846734 PMCID: PMC6101600 DOI: 10.1093/sysbio/syy040] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 09/16/2017] [Revised: 05/13/2018] [Accepted: 05/16/2018] [Indexed: 11/13/2022]
Abstract
Simultaneously modeling hybridization and the multispecies coalescent is becoming increasingly common, and inference of species networks in this context is now implemented in several software packages. This article addresses some of the conceptual issues and decisions to be made in this modeling, including whether or not to use branch lengths and issues with model identifiability. This article is based on a talk given at a Spotlight Session at Evolution 2017 meeting in Portland, Oregon. This session included several talks about modeling hybridization and gene flow in the presence of incomplete lineage sorting. Other talks given at this meeting are also included in this special issue of Systematic Biology.
Affiliation(s)
- James H Degnan
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA
|
48
|
Beckman EJ, Benham PM, Cheviron ZA, Witt C. Detecting introgression despite phylogenetic uncertainty: The case of the South American siskins. Mol Ecol 2018; 27:4350-4367. [DOI: 10.1111/mec.14795] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/18/2017] [Revised: 05/21/2018] [Accepted: 05/23/2018] [Indexed: 12/25/2022]
Affiliation(s)
- Elizabeth J. Beckman
- Division of Biological Sciences University of Montana Missoula Montana
- Department of Biology and Museum of Southwestern Biology University of New Mexico Albuquerque New Mexico
| | - Phred M. Benham
- Division of Biological Sciences University of Montana Missoula Montana
| | - Christopher C. Witt
- Department of Biology and Museum of Southwestern Biology University of New Mexico Albuquerque New Mexico
|
49
|
De Maio N, Worby CJ, Wilson DJ, Stoesser N. Bayesian reconstruction of transmission within outbreaks using genomic variants. PLoS Comput Biol 2018; 14:e1006117. [PMID: 29668677 PMCID: PMC5927459 DOI: 10.1371/journal.pcbi.1006117] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/09/2017] [Revised: 04/30/2018] [Accepted: 04/03/2018] [Indexed: 01/19/2023]
Abstract
Pathogen genome sequencing can reveal details of transmission histories and is a powerful tool in the fight against infectious disease. In particular, within-host pathogen genomic variants identified through heterozygous nucleotide base calls are a potential source of information to identify linked cases and infer direction and time of transmission. However, using such data effectively to model disease transmission presents a number of challenges, including differentiating genuine variants from those observed due to sequencing error, as well as the specification of a realistic model for within-host pathogen population dynamics. Here we propose a new Bayesian approach to transmission inference, BadTrIP (BAyesian epiDemiological TRansmission Inference from Polymorphisms), that explicitly models evolution of pathogen populations in an outbreak, transmission (including transmission bottlenecks), and sequencing error. BadTrIP enables the inference of host-to-host transmission from pathogen sequencing data and epidemiological data. By assuming that genomic variants are unlinked, our method does not require the computationally intensive and unreliable reconstruction of individual haplotypes. Using simulations we show that BadTrIP is robust in most scenarios and can accurately infer transmission events by efficiently combining information from genetic and epidemiological sources; thanks to its realistic model of pathogen evolution and the inclusion of epidemiological data, BadTrIP is also more accurate than existing approaches. BadTrIP is distributed as an open source package (https://bitbucket.org/nicofmay/badtrip) for the phylogenetic software BEAST2. We apply our method to reconstruct transmission history at the early stages of the 2014 Ebola outbreak, showcasing the power of within-host genomic variants to reconstruct transmission events. We present a new tool to reconstruct transmission events within outbreaks. 
Our approach makes use of pathogen genetic information, notably genetic variants at low frequency within host that are usually discarded, and combines it with epidemiological information of host exposure to infection. This leads to accurate reconstruction of transmission even in cases where abundant within-host pathogen genetic variation and weak transmission bottlenecks (multiple pathogen units colonising a new host at transmission) would otherwise make inference difficult due to the transmission history differing from the pathogen evolution history inferred from pathogen isolates. Also, the use of within-host pathogen genomic variants increases the resolution of the reconstruction of the transmission tree even in scenarios with limited within-outbreak pathogen genetic diversity: within-host pathogen populations that appear identical at the level of consensus sequences can be discriminated using within-host variants. Our Bayesian approach provides a measure of the confidence in different possible transmission histories, and is published as open source software. We show with simulations and with an analysis of the beginning of the 2014 Ebola outbreak that our approach is applicable in many scenarios, improves our understanding of transmission dynamics, and will contribute to finding and limiting sources and routes of transmission, and therefore preventing the spread of infectious disease.
Affiliation(s)
- Nicola De Maio
- Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | - Colin J Worby
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, United States of America
| | - Daniel J Wilson
- Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.,Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
| | - Nicole Stoesser
- Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
|
50
|
Fungal species and their boundaries matter – Definitions, mechanisms and practical implications. Fungal Biol Rev 2018. [DOI: 10.1016/j.fbr.2017.11.002] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
|