1
Fogg J, Allman ES, Ané C. PhyloCoalSimulations: A Simulator for Network Multispecies Coalescent Models, Including a New Extension for the Inheritance of Gene Flow. Syst Biol 2023; 72:1171-1179. [PMID: 37254872 DOI: 10.1093/sysbio/syad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 05/03/2023] [Accepted: 05/15/2023] [Indexed: 06/01/2023] Open
Abstract
We consider the evolution of phylogenetic gene trees along phylogenetic species networks, according to the network multispecies coalescent process, and introduce a new network coalescent model with correlated inheritance of gene flow. This model generalizes two traditional versions of the network coalescent: with independent or common inheritance. At each reticulation, multiple lineages of a given locus are inherited from parental populations chosen at random, either independently across lineages or with positive correlation according to a Dirichlet process. This process may account for locus-specific probabilities of inheritance, for example. We implemented the simulation of gene trees under these network coalescent models in the Julia package PhyloCoalSimulations, which depends on PhyloNetworks and its powerful network manipulation tools. Input species phylogenies can be read in extended Newick format, either in numbers of generations or in coalescent units. Simulated gene trees can be written in Newick format, and in a way that preserves information about their embedding within the species network. This embedding can be used for downstream purposes, such as to simulate species-specific processes like rate variation across species, or for other scenarios as illustrated in this note. This package should be useful for simulation studies and simulation-based inference methods. The software is available open source with documentation and a tutorial at https://github.com/cecileane/PhyloCoalSimulations.jl.
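The correlated-inheritance idea in this abstract can be illustrated with a generic Pólya-urn scheme for a Dirichlet process over the parental populations of a single reticulation. The sketch below is written in Python for illustration only; it is not the PhyloCoalSimulations API (the package itself is written in Julia), and the function name `sample_parents`, the inheritance probabilities `gammas`, and the concentration parameter `alpha` are all assumptions. In this sketch, an infinite `alpha` gives independent choices across lineages, and `alpha = 0` makes every lineage follow the first one (common inheritance).

```python
import math
import random


def sample_parents(n_lineages, gammas, alpha, rng=None):
    """Assign each lineage at one reticulation to a parental population.

    Hypothetical helper: `gammas` are the inheritance probabilities of the
    parental populations and `alpha` is the Dirichlet-process concentration.
    """
    rng = rng or random.Random()
    counts = [0] * len(gammas)  # lineages already sent to each parent
    choices = []
    for i in range(n_lineages):
        if i == 0 or math.isinf(alpha):
            # First lineage, or independent inheritance: draw from gammas.
            weights = list(gammas)
        else:
            # Polya urn: base distribution plus lineages already placed.
            weights = [alpha * g + c for g, c in zip(gammas, counts)]
        k = rng.choices(range(len(gammas)), weights=weights)[0]
        counts[k] += 1
        choices.append(k)
    return choices


rng = random.Random(1)
common = sample_parents(5, [0.7, 0.3], alpha=0.0, rng=rng)
independent = sample_parents(5, [0.7, 0.3], alpha=math.inf, rng=rng)
```

Intermediate values of `alpha` give positive correlation between lineages: the more lineages have already gone to one parent, the more likely the next one is to follow them.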
Affiliation(s)
- John Fogg
  - Department of Statistics, University of Wisconsin - Madison, WI, 53706, USA
- Elizabeth S Allman
  - Department of Mathematics and Statistics, University of Alaska - Fairbanks, AK, 99775, USA
- Cécile Ané
  - Department of Statistics, University of Wisconsin - Madison, WI, 53706, USA
  - Department of Botany, University of Wisconsin - Madison, WI, 53706, USA
2
Everson KM, Donohue ME, Weisrock DW. A Pervasive History of Gene Flow in Madagascar's True Lemurs (Genus Eulemur). Genes (Basel) 2023; 14:1130. [PMID: 37372308 DOI: 10.3390/genes14061130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 05/16/2023] [Accepted: 05/19/2023] [Indexed: 06/29/2023] Open
Abstract
In recent years, it has become widely accepted that interspecific gene flow is common across the Tree of Life. Questions remain about how species boundaries can be maintained in the face of high levels of gene flow and how phylogeneticists should account for reticulation in their analyses. The true lemurs of Madagascar (genus Eulemur, 12 species) provide a unique opportunity to explore these questions, as they form a recent radiation with at least five active hybrid zones. Here, we present new analyses of a mitochondrial dataset with hundreds of individuals in the genus Eulemur, as well as a nuclear dataset containing hundreds of genetic loci for a small number of individuals. Traditional coalescent-based phylogenetic analyses of both datasets reveal that not all recognized species are monophyletic. Using network-based approaches, we also find that a species tree containing between one and three ancient reticulations is supported by strong evidence. Together, these results suggest that hybridization has been a prominent feature of the genus Eulemur in both the past and present. We also recommend that greater taxonomic attention should be paid to this group so that geographic boundaries and conservation priorities can be better established.
Affiliation(s)
- Kathryn M Everson
  - Department of Integrative Biology, Oregon State University, Corvallis, OR 97331, USA
  - Department of Biology, University of Kentucky, Lexington, KY 40506, USA
- Mariah E Donohue
  - Department of Biology, University of Kentucky, Lexington, KY 40506, USA
- David W Weisrock
  - Department of Biology, University of Kentucky, Lexington, KY 40506, USA
3
Jin X, Guo X, Chen J, Li J, Zhang S, Zheng S, Wang Y, Peng Y, Zhang K, Liu Y, Liu B. The complete mitochondrial genome of Hemigrapsus sinensis (Brachyura, Grapsoidea, Varunidae) and its phylogenetic position within Grapsoidea. Genes Genomics 2023; 45:377-391. [PMID: 36346542 DOI: 10.1007/s13258-022-01319-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/16/2022] [Accepted: 09/24/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND In this study, the complete mitogenome of Hemigrapsus sinensis was identified and analyzed for the first time. OBJECTIVE To characterize the complete mitochondrial genome of Hemigrapsus sinensis (Brachyura, Grapsoidea, Varunidae) and determine its phylogenetic position within Grapsoidea. METHODS A sample of Hemigrapsus sinensis was collected and DNA was extracted. After sequencing, NOVOPlasty was used for sequence assembly. Sequences were annotated with MITOS WebServer, tRNAscan-SE 2.0, and the NCBI database. MEGA was used for sequence analysis and PhyloSuite was used for phylogenetic tree construction. DnaSP was used to calculate Ka/Ks. RESULTS The mitochondrial genome was 15,900 bp long and encoded 13 PCGs, 22 tRNA genes, two rRNA genes, and one control region. The genome composition was biased toward A + T (74.34%), with a negative GC-skew (-0.22) and a negative AT-skew (-0.03). The PCG initiation codon was the typical ATN, and the termination codon was the typical TAN, an incomplete T, or missing. The ML and BI trees showed that H. sinensis was most closely related to Hemigrapsus and clustered together with the Varunidae. Our phylogenetic trees also provide evidence that Ocypodoidea and Grapsoidea may share a common origin. In addition, the mixing of Chiromantes and Orisarma in the phylogenetic tree raised doubts over the traditional classification system, and incomplete lineage sorting (ILS) was observed in Varunidae. In the subsequent analysis of evolutionary rates, we found that all of the PCGs (NAD4 was not calculated) had undergone negative selection, indicating that the mitochondrial genes of H. sinensis were conserved during evolution. CONCLUSION Research on the complete mitogenome of H. sinensis should therefore contribute to molecular taxonomy, the study of phylogenetic relationships, and breeding optimization within the superfamily Grapsoidea.
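The composition statistics quoted in this abstract are conventionally computed from base counts on one strand as AT-skew = (A - T) / (A + T) and GC-skew = (G - C) / (G + C). The abstract does not state its formulas, so the use of these standard definitions is an assumption, and the helper name `nucleotide_skews` is hypothetical. A minimal Python sketch:

```python
def nucleotide_skews(seq):
    """Return (A+T fraction, AT-skew, GC-skew) for a DNA sequence.

    Assumes the conventional definitions; ambiguous bases are ignored.
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    at_content = (a + t) / (a + t + g + c)
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_content, at_skew, gc_skew


# Toy sequence for illustration only (not H. sinensis data).
content, at_skew, gc_skew = nucleotide_skews("AATTTGCC")
```

A negative AT-skew means T outnumbers A on the analyzed strand, and a negative GC-skew means C outnumbers G.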
Affiliation(s)
- Xun Jin
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Xingle Guo
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Jian Chen
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Jiasheng Li
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Shufei Zhang
  - Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, South China Sea Fisheries Research Institute, Chinese Academy of Fisheries Sciences, Guangzhou, 510300, Guangdong, China
- Sixu Zheng
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Yunpeng Wang
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Ying Peng
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Kun Zhang
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Yifan Liu
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
- Bingjian Liu
  - National Engineering Laboratory of Marine Germplasm Resources Exploration and Utilization, Zhejiang Ocean University, Zhoushan, 316022, China
  - National Engineering Research Center for Facilitated Marine Aquaculture, Marine Science and Technology College, Zhejiang Ocean University, No. 1, Haida South Road, Zhoushan, 316022, Zhejiang, China
4
Coelho MAG, Pearson GA, Boavida JRH, Paulo D, Aurelle D, Arnaud‐Haond S, Gómez‐Gras D, Bensoussan N, López‐Sendino P, Cerrano C, Kipson S, Bakran‐Petricioli T, Ferretti E, Linares C, Garrabou J, Serrão EA, Ledoux J. Not out of the Mediterranean: Atlantic populations of the gorgonian Paramuricea clavata are a separate sister species under further lineage diversification. Ecol Evol 2023; 13:e9740. [PMID: 36789139 PMCID: PMC9912747 DOI: 10.1002/ece3.9740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 12/22/2022] [Accepted: 12/27/2022] [Indexed: 01/31/2023] Open
Abstract
The accurate delimitation of species boundaries in nonbilaterian marine taxa is notoriously difficult, with consequences for many studies in ecology and evolution. Anthozoans are a diverse group of key structural organisms worldwide, but the lack of reliable morphological characters and informative genetic markers hampers our ability to understand species diversification. We investigated population differentiation and species limits in Atlantic (Iberian Peninsula) and Mediterranean lineages of the octocoral genus Paramuricea previously identified as P. clavata. We used a diverse set of molecular markers (microsatellites, RNA-seq derived single-copy orthologues [SCO] and mt-mutS [mitochondrial barcode]) at 49 locations. Clear segregation of Atlantic and Mediterranean lineages was found with all markers. Species-tree estimations based on SCO strongly supported these two clades as distinct, recently diverged sister species with incomplete lineage sorting, P. cf. grayi and P. clavata, respectively. Furthermore, a second putative (or ongoing) speciation event was detected in the Atlantic between two P. cf. grayi color morphotypes (yellow and purple) using SCO and supported by microsatellites. While segregating P. cf. grayi lineages showed considerable geographic structure, dominating circalittoral communities in southern (yellow) and western (purple) Portugal, their occurrence in sympatry at some localities suggests a degree of reproductive isolation. Overall, our results show that previous molecular and morphological studies have underestimated species diversity in Paramuricea occurring in the Iberian Peninsula, which has important implications for conservation planning. Finally, our findings validate the usefulness of phylotranscriptomics for resolving evolutionary relationships in octocorals.
Affiliation(s)
- Márcio A. G. Coelho
  - Centre for Marine Sciences (CCMAR), University of Algarve, Faro, Portugal
  - MARE – Marine and Environmental Sciences Centre, ISPA‐Instituto Universitário, Lisboa, Portugal
- Diogo Paulo
  - Centre for Marine Sciences (CCMAR), University of Algarve, Faro, Portugal
- Didier Aurelle
  - Aix Marseille Univ., Université de Toulon, CNRS, IRD, MIO, Marseille, France
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, Paris, France
- Sophie Arnaud‐Haond
  - MARBEC (Marine Biodiversity, Exploitation and Conservation), Univ. Montpellier, IFREMER, CNRS, IRD, Sète Cedex, France
- Daniel Gómez‐Gras
  - Hawai‘i Institute of Marine Biology, University of Hawai‘i at Mānoa, Kaneohe, Hawaii, USA
  - Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona (UB), Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain
- Nathaniel Bensoussan
  - Aix Marseille Univ., Université de Toulon, CNRS, IRD, MIO, Marseille, France
  - Departament de Biologia Marina, Institut de Ciències del Mar (CSIC), Barcelona, Spain
- Paula López‐Sendino
  - Departament de Biologia Marina, Institut de Ciències del Mar (CSIC), Barcelona, Spain
- Carlo Cerrano
  - Dipartimento di Scienze della Vita e dell’Ambiente (DiSVA), Università Politecnica delle Marche, Ancona, Italy
  - Consorzio Nazionale Interuniversitario per le Scienze del Mare (CoNISMa), Rome, Italy
  - Stazione Zoologica Anton Dohrn, Naples, Italy
  - Fano Marine Center, Fano, Italy
- Silvija Kipson
  - Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
  - SEAFAN – Marine Research & Consultancy, Zagreb, Croatia
- Eliana Ferretti
  - Studio Associato GAIA s.n.c., Genoa, Italy
  - Institute of Marine Science, The University of Auckland, Auckland, New Zealand
- Cristina Linares
  - Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona (UB), Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain
- Joaquim Garrabou
  - Aix Marseille Univ., Université de Toulon, CNRS, IRD, MIO, Marseille, France
  - Departament de Biologia Marina, Institut de Ciències del Mar (CSIC), Barcelona, Spain
- Ester A. Serrão
  - Centre for Marine Sciences (CCMAR), University of Algarve, Faro, Portugal
  - CIBIO/InBIO‐Centro de Investigação em Biodiversidade e Recursos Genéticos, Vairão, Portugal
- Jean‐Baptiste Ledoux
  - CIIMAR/CIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal
5
Cheng L, Han Q, Chen F, Li M, Balbuena TS, Zhao Y. Phylogenomics as an effective approach to untangle cross-species hybridization event: A case study in the family Nymphaeaceae. Front Genet 2022; 13:1031705. [PMID: 36406110 PMCID: PMC9670182 DOI: 10.3389/fgene.2022.1031705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
Hybridization is common and is considered an important evolutionary force that increases intraspecific genetic diversity. Detecting hybridization events is crucial for understanding the evolutionary history of species and for improving molecular breeding. Studies that identify hybridization events through a phylogenomic approach are still limited. We propose a concept and method for identifying allopolyploidy events by phylogenomics. Our novel phylogenomic approach reconciles and summarizes nuclear multi-labeled gene family trees to untangle hybridization events from next-generation data. Given the relatively clear history of horticultural crossbreeding, the water lily family is a suitable case for examining recent allopolyploidy events. Here, we reconstructed and confirmed a well-resolved nuclear phylogeny for the order Nymphaeales in the context of geological time as a framework for identifying hybridization signals. We successfully identified two possible allopolyploidy events, with the parental lineages of the hybrids, in the family Nymphaeaceae based on a summary of multi-labeled gene family trees of Nymphaeales. The lineages containing Nymphaea colorata and Nymphaea caerulea may be the progenitors of the horticultural cultivars Nymphaea ‘midnight’ and Nymphaea ‘Woods blue goddess’. The proposed hybridization hypothesis is also supported by horticultural breeding records. Our methodology can be widely applied to identify hybridization events and can, in principle, facilitate the genomic breeding design of hybrid plants.
Affiliation(s)
- Lin Cheng
  - Henan International Joint Laboratory of Tea-oil Tree Biology and High-Value Utilization, Xinyang Normal University, Xinyang, Henan, China
- Qunwei Han
  - Henan International Joint Laboratory of Tea-oil Tree Biology and High-Value Utilization, Xinyang Normal University, Xinyang, Henan, China
- Fei Chen
  - College of Tropical Crops, Hainan University, Haikou, China
- Mengge Li
  - Henan International Joint Laboratory of Tea-oil Tree Biology and High-Value Utilization, Xinyang Normal University, Xinyang, Henan, China
- Tiago Santana Balbuena
  - Department of Agricultural, Livestock and Environmental Biotechnology, UNESP, São Paulo, Brazil
- Yiyong Zhao
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
  - College of Agriculture, Guizhou University, Guiyang, China
  - *Correspondence: Yiyong Zhao
6
Menet H, Daubin V, Tannier E. Phylogenetic reconciliation. PLoS Comput Biol 2022; 18:e1010621. [PMID: 36327227 PMCID: PMC9632901 DOI: 10.1371/journal.pcbi.1010621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022] Open
Affiliation(s)
- Hugo Menet
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
- Vincent Daubin
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
- Eric Tannier
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
  - Inria, centre de recherche de Lyon, Villeurbanne, France
7
LeMay M, Libeskind-Hadas R, Wu YC. A Polynomial-Time Algorithm for Minimizing the Deep Coalescence Cost for Level-1 Species Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2642-2653. [PMID: 34406946 DOI: 10.1109/tcbb.2021.3105922] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Phylogenetic analyses commonly assume that the species history can be represented as a tree. However, in the presence of hybridization, the species history is more accurately captured as a network. Despite several advances in modeling phylogenetic networks, there is no known polynomial-time algorithm for parsimoniously reconciling gene trees with species networks while accounting for incomplete lineage sorting. To address this issue, we present a polynomial-time algorithm for the case of level-1 networks, in which no hybrid species is the direct ancestor of another hybrid species. This work enables more efficient reconciliation of gene trees with species networks, which in turn, enables more efficient reconstruction of species networks.
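For orientation, the deep coalescence cost is easiest to see in the tree case, which is much simpler than the level-1 network problem this paper addresses. The Python sketch below is not the authors' algorithm: it maps each gene-tree node to the lowest common ancestor of its children in a species tree and counts, for every species-tree edge, the gene lineages that cross it, then sums the extra lineages (count minus one). The nested-tuple input format and the function name `deep_coalescence` are assumptions made for illustration.

```python
def deep_coalescence(gene_tree, species_tree):
    """Extra-lineage count of a gene tree mapped into a species tree.

    Trees are nested tuples whose leaves are species names (strings);
    this input format is an assumption made for the sketch.
    """
    parent, depth, leaf_id = {}, {}, {}

    def index(node, par, d):
        # Number species-tree nodes and record parent and depth.
        nid = len(depth)
        parent[nid], depth[nid] = par, d
        if isinstance(node, str):
            leaf_id[node] = nid
        else:
            for child in node:
                index(child, nid, d + 1)

    index(species_tree, None, 0)

    def lca(u, v):
        # Climb from the deeper node until the two nodes meet.
        while u != v:
            if depth[u] < depth[v]:
                u, v = v, u
            u = parent[u]
        return u

    lineages = {}  # species node -> gene lineages on the edge above it

    def map_gene(node):
        if isinstance(node, str):
            return leaf_id[node]
        child_maps = [map_gene(child) for child in node]
        here = child_maps[0]
        for m in child_maps[1:]:
            here = lca(here, m)
        for m in child_maps:
            # Each child lineage crosses every edge from m up to `here`.
            while m != here:
                lineages[m] = lineages.get(m, 0) + 1
                m = parent[m]
        return here

    map_gene(gene_tree)
    return sum(k - 1 for k in lineages.values())


species = (("A", "B"), "C")
concordant = deep_coalescence((("A", "B"), "C"), species)
discordant = deep_coalescence((("A", "C"), "B"), species)
```

In a network, a gene lineage reaching a hybrid node can continue into either parent, so the mapping is no longer unique; that choice is what makes the network version of the problem hard.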
8
Mostert‐O'Neill MM, Tate H, Reynolds SM, Mphahlele MM, van den Berg G, Verryn SD, Acosta JJ, Borevitz JO, Myburg AA. Genomic consequences of artificial selection during early domestication of a wood fibre crop. The New Phytologist 2022; 235:1944-1956. [PMID: 35657639 PMCID: PMC9541791 DOI: 10.1111/nph.18297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Accepted: 05/20/2022] [Indexed: 06/15/2023]
Abstract
From its origins in Australia, Eucalyptus grandis has spread to every continent, except Antarctica, as a wood crop. It has been cultivated and bred for over 100 yr in places such as South Africa. Unlike most annual crops and fruit trees, domestication of E. grandis is still in its infancy, representing a unique opportunity to interrogate the genomic consequences of artificial selection early in the domestication process. To determine how a century of artificial selection has changed the genome of E. grandis, we generated single nucleotide polymorphism genotypes for 1080 individuals from three advanced South African breeding programmes using the EUChip60K chip, and investigated population structure and genome-wide differentiation patterns relative to wild progenitors. Breeding and wild populations appeared genetically distinct. We found genomic evidence of evolutionary processes known to have occurred in other plant domesticates, including interspecific introgression and intraspecific infusion from wild material. Furthermore, we found genomic regions with increased linkage disequilibrium and genetic differentiation, putatively representing early soft sweeps of selection. This is, to our knowledge, the first study of genomic signatures of domestication in a timber species looking beyond the first few generations of cultivation. Our findings highlight the importance of intra- and interspecific hybridization during early domestication.
Affiliation(s)
- Marja M. Mostert‐O'Neill
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
- Hannah Tate
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
- S. Melissa Reynolds
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
- Makobatjatji M. Mphahlele
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
  - Mondi Forests, Tree Improvement Technology Programme, Trahar Technology Centre – TTC, Mountain Home Estate, Off Dennis Shepstone Dr., Hilton, 3245, South Africa
- Gert van den Berg
  - Sappi Forests Research, Shaw Research Centre, PO Box 473, Howick, 3290, South Africa
- Steve D. Verryn
  - Creation Breeding Innovations, 75 Kafue St., Lynnwood Glen, 0081, South Africa
- Juan J. Acosta
  - Camcore, Department of Forestry and Environmental Resources, North Carolina State University, PO Box 7626, Raleigh, NC, 27695, USA
- Justin O. Borevitz
  - Research School of Biology and Centre for Biodiversity Analysis, ARC Centre of Excellence in Plant Energy Biology, Australian National University, Canberra, ACT, 0200, Australia
- Alexander A. Myburg
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
9
Tomasello S, Oberprieler C. Reticulate Evolution in the Western Mediterranean Mountain Ranges: The Case of the Leucanthemopsis Polyploid Complex. Frontiers in Plant Science 2022; 13:842842. [PMID: 35783934 PMCID: PMC9247603 DOI: 10.3389/fpls.2022.842842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 05/18/2022] [Indexed: 06/15/2023]
Abstract
Polyploidization is one of the most common speciation mechanisms in plants. This is particularly relevant in high mountain environments and/or in areas heavily affected by climatic oscillations. Although the role of polyploidy and the temporal and geographical frameworks of polyploidization have been intensively investigated in the alpine regions of the temperate and arctic biomes, fewer studies are available with a specific focus on the Mediterranean region. Leucanthemopsis (Asteraceae) consists of six to ten species with several infraspecific entities, mainly distributed in the western Mediterranean Basin. It is a polyploid complex including montane, subalpine, and strictly alpine lineages, which are locally distributed in different mountain ranges of Western Europe and North Africa. We used a mixed approach including Sanger sequencing and (Roche-454) high throughput sequencing of amplicons to gather information from single-copy nuclear markers and plastid regions. Nuclear regions were carefully tested for recombinants/PCR artifacts and for paralogy. Coalescent-based methods were used to infer the number of polyploidization events and the age of formation of polyploid lineages, and to reconstruct the reticulate evolution of the genus. Whereas the polyploids within the widespread Leucanthemopsis alpina are autopolyploids, the situation is more complex among the taxa endemic to the western Mediterranean. While the hexaploid, L. longipectinata, confined to the northern Moroccan mountain ranges (north-west Africa), is an autopolyploid, the Iberian polyploids are clearly of allopolyploid origins. At least two different polyploidization events gave rise to L. spathulifolia and to all other tetraploid Iberian taxa, respectively. The formation of the Iberian allopolyploids took place in the early Pleistocene and was probably caused by latitudinal and elevational range shifts that brought into contact previously isolated Leucanthemopsis lineages. 
Our study thus highlights the importance of the Pleistocene climatic oscillations and connected polyploidization events for the high plant diversity in the Mediterranean Basin.
Affiliation(s)
- Salvatore Tomasello
  - Department of Systematics, Biodiversity and Evolution of Plants (With Herbarium), University of Göttingen, Göttingen, Germany
- Christoph Oberprieler
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
10
Wawerka M, Dąbkowski D, Rutecka N, Mykowiecka A, Górecki P. Embedding gene trees into phylogenetic networks by conflict resolution algorithms. Algorithms Mol Biol 2022; 17:11. [PMID: 35590416 PMCID: PMC9119282 DOI: 10.1186/s13015-022-00218-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 03/22/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Phylogenetic networks are mathematical models of evolutionary processes involving reticulate events such as hybridization, recombination, or horizontal gene transfer. One of the crucial notions in phylogenetic network modelling is the displayed tree, which is obtained from a network by removing a set of reticulation edges. Displayed trees may represent the evolutionary history of a gene family if the evolution is shaped by reticulation events. RESULTS We address the problem of inferring an optimal tree displayed by a network, given a gene tree G and a tree-child network N, under the deep coalescence and duplication costs. We propose an O(mn)-time dynamic programming algorithm (DP) to compute a lower bound of the optimal displayed tree cost, where m and n are the sizes of G and N, respectively. In addition, our algorithm can verify whether the solution is exact. Moreover, it provides a set of reticulation edges corresponding to the obtained cost. If the cost is exact, the set induces an optimal displayed tree. Otherwise, the set contains pairs of conflicting edges, i.e., edges sharing a reticulation node. Next, we show a conflict resolution algorithm that requires [Formula: see text] invocations of DP in the worst case, where r is the number of reticulations. We propose a similar [Formula: see text]-time algorithm for level-k tree-child networks and a branch and bound solution to compute lower and upper bounds of optimal costs. We also extend the algorithms to a broader class of phylogenetic networks. Based on simulated data, the average runtime is [Formula: see text] under the deep-coalescence cost and [Formula: see text] under the duplication cost. CONCLUSIONS Despite exponential complexity in the worst case, our algorithms perform very well on empirical and simulated datasets, due to the strategy of resolving internal dissimilarities between gene trees and networks. Therefore, the algorithms are efficient alternatives to enumeration strategies commonly proposed in the literature and enable analyses of complex networks with dozens of reticulations.
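The enumeration strategy that this abstract contrasts itself with can be sketched in a few lines: keep exactly one incoming edge at every reticulation node and list all combinations, which gives up to 2^r edge sets for r reticulations with two parents each. The Python sketch below illustrates only that baseline, not the paper's dynamic programming or conflict resolution; the edge-list representation and the function name `displayed_edge_sets` are assumptions. It also stops short of a full displayed tree, since unlabeled leaves and degree-2 nodes left behind would still need to be pruned and suppressed.

```python
from itertools import product


def displayed_edge_sets(edges):
    """Yield one edge list per way of keeping a single parent edge at
    every reticulation node (a node with more than one incoming edge)."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    retic = [v for v, ps in parents.items() if len(ps) > 1]
    for kept in product(*(parents[v] for v in retic)):
        keep = dict(zip(retic, kept))
        # Non-reticulation nodes keep their only incoming edge.
        yield [(u, v) for u, v in edges if keep.get(v, u) == u]


# Toy network with two reticulation nodes, h1 and h2 (illustration only).
network = [
    ("root", "a"), ("root", "b"),
    ("a", "h1"), ("b", "h1"),
    ("a", "h2"), ("b", "h2"),
    ("h1", "X"), ("h2", "Y"),
]
edge_sets = list(displayed_edge_sets(network))
```

Because the number of combinations doubles with every added reticulation, scoring each displayed tree separately quickly becomes impractical, which is the motivation for computing a lower bound first and resolving only the conflicting edges.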
11
Kong S, Pons JC, Kubatko L, Wicke K. Classes of explicit phylogenetic networks and their biological and mathematical significance. J Math Biol 2022; 84:47. [PMID: 35503141 DOI: 10.1007/s00285-022-01746-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/21/2021] [Revised: 01/18/2022] [Accepted: 03/31/2022] [Indexed: 11/24/2022]
Abstract
The evolutionary relationships among organisms have traditionally been represented using rooted phylogenetic trees. However, due to reticulate processes such as hybridization or lateral gene transfer, evolution cannot always be adequately represented by a phylogenetic tree, and rooted phylogenetic networks that describe such complex processes have been introduced as a generalization of rooted phylogenetic trees. In fact, estimating rooted phylogenetic networks from genomic sequence data and analyzing their structural properties is one of the most important tasks in contemporary phylogenetics. Over the last two decades, several subclasses of rooted phylogenetic networks (characterized by certain structural constraints) have been introduced in the literature, either to model specific biological phenomena or to enable tractable mathematical and computational analyses. In the present manuscript, we provide a thorough review of these network classes, as well as provide a biological interpretation of the structural constraints underlying these networks where possible. In addition, we discuss how imposing structural constraints on the network topology can be used to address the scalability and identifiability challenges faced in the estimation of phylogenetic networks from empirical data.
Affiliation(s)
- Sungsik Kong
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Joan Carles Pons
  - Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, 07122, Spain
- Laura Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
- Kristina Wicke
  - Department of Mathematics, The Ohio State University, Columbus, OH, USA
|
12
|
Markin A, Wagle S, Anderson TK, Eulenstein O. RF-Net 2: fast inference of virus reassortment and hybridization networks. Bioinformatics 2022; 38:2144-2152. [PMID: 35150239 PMCID: PMC9004648 DOI: 10.1093/bioinformatics/btac075] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/05/2021] [Revised: 01/26/2022] [Accepted: 02/07/2022] [Indexed: 02/04/2023]
Abstract
MOTIVATION A phylogenetic network is a powerful model to represent entangled evolutionary histories with both divergent (speciation) and convergent (e.g. hybridization, reassortment, recombination) evolution. The standard approach to inference of hybridization networks is to (i) reconstruct rooted gene trees and (ii) leverage gene tree discordance for network inference. Recently, we introduced a method called RF-Net for accurate inference of virus reassortment and hybridization networks from input gene trees in the presence of errors commonly found in phylogenetic trees. While RF-Net demonstrated the ability to accurately infer networks with up to four reticulations from erroneous input gene trees, its application was limited by the number of reticulations it could handle in a reasonable amount of time. This limitation is particularly restrictive in the inference of the evolutionary history of segmented RNA viruses such as influenza A virus (IAV), where reassortment is one of the major mechanisms shaping the evolution of these pathogens. RESULTS Here, we expand the functionality of RF-Net that makes it significantly more applicable in practice. Crucially, we introduce a fast extension to RF-Net, called Fast-RF-Net, that can handle large numbers of reticulations without sacrificing accuracy. In addition, we develop automatic stopping criteria to select the appropriate number of reticulations heuristically and implement a feature for RF-Net to output error-corrected input gene trees. We then conduct a comprehensive study of the original method and its novel extensions and confirm their efficacy in practice using extensive simulation and empirical IAV evolutionary analyses. AVAILABILITY AND IMPLEMENTATION RF-Net 2 is available at https://github.com/flu-crew/rf-net-2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alexey Markin
  - Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA 50010, USA
- Sanket Wagle
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
- Tavis K Anderson
  - Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA 50010, USA
- Oliver Eulenstein
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
|
13
|
Mirarab S, Nakhleh L, Warnow T. Multispecies Coalescent: Theory and Applications in Phylogenetics. Annual Review of Ecology, Evolution, and Systematics 2021. [DOI: 10.1146/annurev-ecolsys-012121-095340] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
Species tree estimation is a basic part of many biological research projects, ranging from answering basic evolutionary questions (e.g., how did a group of species adapt to their environments?) to addressing questions in functional biology. Yet, species tree estimation is very challenging, due to processes such as incomplete lineage sorting, gene duplication and loss, horizontal gene transfer, and hybridization, which can make gene trees differ from each other and from the overall evolutionary history of the species. Over the last 10–20 years, there has been tremendous growth in methods and mathematical theory for estimating species trees and phylogenetic networks, and some of these methods are now in wide use. In this survey, we provide an overview of the current state of the art, identify the limitations of existing methods and theory, and propose additional research problems and directions.
Affiliation(s)
- Siavash Mirarab
  - Electrical and Computer Engineering Department, University of California, San Diego, La Jolla, California 92093, USA
- Luay Nakhleh
  - Department of Computer Science, Rice University, Houston, Texas 77005, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, USA
|
14
|
Yan Z, Cao Z, Liu Y, Ogilvie HA, Nakhleh L. Maximum Parsimony Inference of Phylogenetic Networks in the Presence of Polyploid Complexes. Syst Biol 2021; 71:706-720. [PMID: 34605924 PMCID: PMC9017653 DOI: 10.1093/sysbio/syab081] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/02/2020] [Revised: 09/26/2021] [Accepted: 09/29/2021] [Indexed: 12/18/2022]
Abstract
Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate evolutionary histories. While polyploidy has been shown to be prevalent not only in plants but also in other groups of eukaryotic species, most work done thus far on phylogenetic network inference assumes diploid hybridization. These inference methods have been applied, with varying degrees of success, to data sets with polyploid species, even though polyploidy violates the mathematical assumptions underlying these methods. Statistical methods were developed recently for handling specific types of polyploids and so were parsimony methods that could handle polyploidy more generally yet while excluding processes such as incomplete lineage sorting. In this article, we introduce a new method for inferring most parsimonious phylogenetic networks on data that include polyploid species. Taking gene tree topologies as input, the method seeks a phylogenetic network that minimizes deep coalescences while accounting for polyploidy. We demonstrate the performance of the method on both simulated and biological data. The inference method as well as a method for evaluating evolutionary hypotheses in the form of phylogenetic networks are implemented and publicly available in the PhyloNet software package. [Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees; multispecies network coalescent; phylogenetic networks; polyploidy.]
Affiliation(s)
- Zhi Yan
  - Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
- Zhen Cao
  - Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
- Yushu Liu
  - Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
- Huw A Ogilvie
  - Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
- Luay Nakhleh
  - Department of Computer Science, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, Houston, 6100 Main Street, Houston, TX 77005, USA
|
15
|
Cai R, Ané C. Assessing the fit of the multi-species network coalescent to multi-locus data. Bioinformatics 2021; 37:634-641. [PMID: 33027508 DOI: 10.1093/bioinformatics/btaa863] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/26/2020] [Revised: 09/14/2020] [Accepted: 09/22/2020] [Indexed: 01/25/2023]
Abstract
MOTIVATION With growing genome-wide molecular datasets from next-generation sequencing, phylogenetic networks can be estimated using a variety of approaches. These phylogenetic networks include events like hybridization, gene flow or horizontal gene transfer explicitly. However, the most accurate network inference methods are computationally heavy. Methods that scale to larger datasets do not calculate a full likelihood, such that traditional likelihood-based tools for model selection are not applicable to decide how many past hybridization events best fit the data. We propose here a goodness-of-fit test to quantify the fit between data observed from genome-wide multi-locus data, and patterns expected under the multi-species coalescent model on a candidate phylogenetic network. RESULTS We identified weaknesses in the previously proposed TICR test, and proposed corrections. The performance of our new test was validated by simulations on real-world phylogenetic networks. Our test provides one of the first rigorous tools for model selection, to select the adequate network complexity for the data at hand. The test can also work for identifying poorly inferred areas on a network. AVAILABILITY AND IMPLEMENTATION Software for the goodness-of-fit test is available as a Julia package at https://github.com/cecileane/QuartetNetworkGoodnessFit.jl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ruoyi Cai
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI 53706, USA
- Cécile Ané
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI 53706, USA
  - Department of Botany, University of Wisconsin - Madison, Madison, WI 53706, USA
|
16
|
Tidwell H, Nakhleh L. Integrated likelihood for phylogenomics under a no-common-mechanism model. BMC Genomics 2020; 21:219. [PMID: 32299348 PMCID: PMC7161099 DOI: 10.1186/s12864-020-6608-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
BACKGROUND Multi-locus species phylogeny inference is based on models of sequence evolution on gene trees as well as models of gene tree evolution within the branches of species phylogenies. Almost all statistical methods for this inference task assume a common mechanism across all loci as captured by a single value of each branch length of the species phylogeny. RESULTS In this paper, we pursue a “no common mechanism” (NCM) model, where every gene tree evolves according to its own parameters of the species phylogeny. Based on this model, we derive an analytically integrated likelihood of both species trees and networks given the gene trees of multiple loci under an NCM model. We demonstrate the performance of inference under this integrated likelihood on both simulated and biological data. CONCLUSIONS The model presented here will afford opportunities for exploring connections among various criteria for estimating species phylogenies from multiple, independent loci. In addition, further development of this model could potentially result in more efficient methods for searching the space of species phylogenies by focusing solely on the topology of the phylogeny.
|
17
|
Granados-Aguilar X, Granados Mendoza C, Cervantes CR, Montes JR, Arias S. Unraveling Reticulate Evolution in Opuntia (Cactaceae) From Southern Mexico. Frontiers in Plant Science 2020; 11:606809. [PMID: 33519858 PMCID: PMC7838128 DOI: 10.3389/fpls.2020.606809] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/15/2020] [Accepted: 12/10/2020] [Indexed: 05/20/2023]
Abstract
The process of hybridization occurs in approximately 40% of vascular plants, and this exchange of genetic material between non-conspecific individuals occurs unequally among plant lineages, being more frequent in certain groups such as Opuntia (Cactaceae). This genus is known for multiple taxonomic controversies due to widespread polyploidy and probable hybrid origin of several of its species. Southern Mexico species of this genus have been poorly studied despite their great diversity in regions such as the Tehuacán-Cuicatlán Valley which contains around 12% of recognized Mexico's native Opuntia species. In this work, we focus on testing the hybrid status of two putative hybrids from this region, Opuntia tehuacana and Opuntia pilifera, and estimate if hybridization occurs among sampled southern opuntias using two newly identified nuclear intron markers to construct phylogenetic networks with HyDe and Dsuite and perform invariant analysis under the coalescent model with HyDe and Dsuite. For the test of hybrid origin in O. tehuacana, our results could not recover hybridization as proposed in the literature, but we found introgression into O. tehuacana individuals involving O. decumbens and O. huajuapensis. Regarding O. pilifera, we identified O. decumbens as probable parental species, supported by our analysis, which sustains the previous hybridization hypothesis between Nopalea and Basilares clades. Finally, we suggest new hybridization and introgression cases among southern Mexican species involving O. tehuantepecana and O. depressa as parental species of O. velutina and O. decumbens.
Affiliation(s)
- Xochitl Granados-Aguilar
  - Posgrado en Ciencias Biológicas, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - *Correspondence: Xochitl Granados-Aguilar
- Carolina Granados Mendoza
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Cristian Rafael Cervantes
  - Posgrado en Ciencias Biológicas, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- José Rubén Montes
  - Posgrado en Ciencias Biológicas, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Salvador Arias
  - Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
18
|
Abstract
Interspecific hybridization is the process where closely related species mate and produce offspring with admixed genomes. The genomic revolution has shown that hybridization is common, and that it may represent an important source of novel variation. Although most interspecific hybrids are sterile or less fit than their parents, some may survive and reproduce, enabling the transfer of adaptive variants across the species boundary, and even result in the formation of novel evolutionary lineages. There are two main variants of hybrid species genomes: allopolyploid, which have one full chromosome set from each parent species, and homoploid, which are a mosaic of the parent species genomes with no increase in chromosome number. The establishment of hybrid species requires the development of reproductive isolation against parental species. Allopolyploid species often have strong intrinsic reproductive barriers due to differences in chromosome number, and homoploid hybrids can become reproductively isolated from the parent species through assortment of genetic incompatibilities. However, both types of hybrids can become further reproductively isolated, gaining extrinsic isolation barriers, by exploiting novel ecological niches, relative to their parents. Hybrids represent the merging of divergent genomes and thus face problems arising from incompatible combinations of genes. Thus hybrid genomes are highly dynamic and undergo rapid evolutionary change, including genome stabilization in which selection against incompatible combinations results in fixation of compatible ancestry block combinations within the hybrid species. The potential for rapid adaptation or speciation makes hybrid genomes a particularly exciting subject in evolutionary biology. Here we summarize how introgressed alleles or hybrid species can establish and how the resulting hybrid genomes evolve.
Affiliation(s)
- Anna Runemark
  - Department of Biology, Lund University, Lund, Sweden
- Mario Vallejo-Marin
  - Biological and Environmental Sciences, University of Stirling, Stirling, Scotland, United Kingdom
- Joana I. Meier
  - St John's College, Cambridge, Cambridge, United Kingdom
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
|
19
|
Blanco-Pastor JL, Bertrand YJK, Liberal IM, Wei Y, Brummer EC, Pfeil BE. Evolutionary networks from RADseq loci point to hybrid origins of Medicago carstiensis and Medicago cretacea. American Journal of Botany 2019; 106:1219-1228. [PMID: 31535720 DOI: 10.1002/ajb2.1352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/21/2019] [Accepted: 07/12/2019] [Indexed: 06/10/2023]
Abstract
PREMISE Although hybridization has played an important role in the evolution of many plant species, phylogenetic reconstructions that include hybridizing lineages have been historically constrained by the available models and data. Restriction-site-associated DNA sequencing (RADseq) has been a popular sequencing technique for the reconstruction of hybridization in the next-generation sequencing era. However, the utility of RADseq for the reconstruction of complex evolutionary networks has not been thoroughly investigated. Conflicting phylogenetic relationships in the genus Medicago have been mainly attributed to hybridization, but the specific hybrid origins of taxa have not yet been clarified. METHODS We obtained new molecular data from diploid species of Medicago section Medicago using single-digest RADseq to reconstruct evolutionary networks from gene trees, an approach that is computationally tractable with data sets that include several species and complex hybridization patterns. RESULTS Our analyses revealed that assembly filters to exclusively select a small set of loci with high phylogenetic information led to the most-divergent network topologies. Conversely, alternative clustering thresholds or filters on the number of samples per locus had a lower impact on networks. A strong hybridization signal was detected for M. carstiensis and M. cretacea, while signals were less clear for M. rugosa, M. rhodopea, M. suffruticosa, M. marina, M. scutellata, and M. sativa. CONCLUSIONS Complex network reconstructions from RADseq gene trees were not robust under variations of the assembly parameters and filters. But when the most-divergent networks were discarded, all remaining analyses consistently supported a hybrid origin for M. carstiensis and M. cretacea.
Affiliation(s)
- José Luis Blanco-Pastor
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - INRA, Centre Nouvelle-Aquitaine-Poitiers, UR4 (URP3F), 86600, Lusignan, France
- Yann J K Bertrand
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
  - Institute of Botany, Czech Academy of Sciences, Zámek 1, 25243, Průhonice, Czech Republic
- Yanling Wei
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- E Charles Brummer
  - Plant Breeding Center, Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- Bernard E Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530, Göteborg, Sweden
|
20
|
Scholz GE, Popescu AA, Taylor MI, Moulton V, Huber KT. OSF-Builder: A New Tool for Constructing and Representing Evolutionary Histories Involving Introgression. Syst Biol 2019; 68:717-729. [PMID: 30668824 DOI: 10.1093/sysbio/syz004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/10/2017] [Revised: 01/15/2019] [Accepted: 01/15/2019] [Indexed: 11/13/2022]
Abstract
Introgression is an evolutionary process which provides an important source of innovation for evolution. Although various methods have been used to detect introgression, very few methods are currently available for constructing evolutionary histories involving introgression. In this article, we propose a new method for constructing such evolutionary histories whose starting point is a species forest (consisting of a collection of lineage trees, usually arising as a collection of clades or monophyletic groups in a species tree), and a gene tree for a specific allele of interest, or allele tree for short. Our method is based on representing introgression in terms of a certain "overlay" of the allele tree over the lineage trees, called an overlaid species forest (OSF). OSFs are similar to phylogenetic networks although a key difference is that they typically have multiple roots because each monophyletic group in the species tree has a different point of origin. Employing a new model for introgression, we derive an efficient algorithm for building OSFs called OSF-Builder that is guaranteed to return an optimal OSF in the sense that the number of potential introgression events is minimized. As well as using simulations to assess the performance of OSF-Builder, we illustrate its use on a butterfly data set in which introgression has been previously inferred. The OSF-Builder software is available for download from https://www.uea.ac.uk/computing/software/OSF-Builder.
Affiliation(s)
- Martin I Taylor
  - School of Biological Sciences, University of East Anglia, Norwich, UK
|
21
|
Huynh S, Marcussen T, Felber F, Parisod C. Hybridization preceded radiation in diploid wheats. Mol Phylogenet Evol 2019; 139:106554. [PMID: 31288105 DOI: 10.1016/j.ympev.2019.106554] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/10/2019] [Revised: 07/03/2019] [Accepted: 07/03/2019] [Indexed: 01/06/2023]
Abstract
Evolutionary relationships among the Aegilops-Triticum relatives of cultivated wheats have been difficult to resolve owing to incomplete lineage sorting and reticulate evolution. Recent studies have suggested that the wheat D-genome lineage (progenitor of Ae. tauschii) originated through homoploid hybridization between the A-genome lineage (progenitor of Triticum s.str.) and the B-genome lineage (progenitor of Ae. speltoides). This scenario of reticulation has been debated, calling for adequate phylogenetic analyses based on comprehensive sampling. To reconstruct the evolution of Aegilops-Triticum diploids, we here combined high-throughput sequencing of 38 nuclear low-copy loci of multiple accessions of all 13 species with inferences of the species phylogeny using the full-parameterized MCMC_SEQ method. Phylogenies recovered a monophyletic Aegilops-Triticum lineage that began diversifying ~6.6 Ma ago and gave rise to four sublineages, i.e. the A- (2 species), B- (1 species), D- (9 species) and T- (Ae. mutica) genome lineage. Full-parameterized phylogenies as well as patterns of tree dilation and tree compression supported a hybrid origin of the D-genome lineage from A and B ~3.0-4.0 Ma ago, and did not indicate additional hybridization events. Conflicting ABBA-BABA tests suggestive of further reticulation were shown here to result from ancestral population structure rather than hybridization. This comprehensive and dated phylogeny of wheat relatives indicates that the origin of the hybrid D-genome was followed by intense diversification into the majority of extant diploid as well as allopolyploid wild wheats.
Affiliation(s)
- Stella Huynh
  - Institute of Biology, University of Neuchâtel, Switzerland
- Thomas Marcussen
  - Centre for Ecological and Evolutionary Synthesis, University of Oslo, Norway
- François Felber
  - Institute of Biology, University of Neuchâtel, Switzerland
  - Musée et Jardins botaniques cantonaux de Lausanne et Pont-de-Nant, Switzerland
|
22
|
MacGuigan DJ, Near TJ. Phylogenomic Signatures of Ancient Introgression in a Rogue Lineage of Darters (Teleostei: Percidae). Syst Biol 2019; 68:329-346. [PMID: 30395332 DOI: 10.1093/sysbio/syy074] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/24/2018] [Accepted: 10/29/2018] [Indexed: 12/17/2022]
Abstract
Evolutionary history is typically portrayed as a branching phylogenetic tree, yet not all evolution proceeds in a purely bifurcating manner. Introgressive hybridization is one process that results in reticulate evolution. Most known examples of genome-wide introgression occur among closely related species with relatively recent common ancestry; however, we present evidence for ancient hybridization and genome-wide introgression between major stem lineages of darters, a species-rich clade of North American freshwater fishes. Previous attempts to resolve the relationships of darters have been confounded by the uncertain phylogenetic resolution of the lineage Allohistium. In this study, we investigate the phylogenomics of darters, specifically the relationships of Allohistium, through analyses of approximately 30,000 RADseq loci sampled from 112 species. Our phylogenetic inferences are based on traditional approaches in combination with strategies that accommodate reticulate evolution. These analyses result in a novel phylogenetic hypothesis for darters that includes ancient introgression between Allohistium and two other major darter lineages, minimally occurring 20 million years ago. Darters offer a compelling case for the necessity of incorporating phylogenetic networks in reconstructing the evolutionary history of diversification in species-rich lineages. We anticipate that the growing wealth of genomic data for clades of non-model organisms will reveal more examples of ancient hybridization, eventually requiring a re-evaluation of how evolutionary history is visualized and utilized in macroevolutionary investigations.
Affiliation(s)
- Daniel J MacGuigan
  - Department of Ecology and Evolutionary Biology, Yale University, P.O. Box 208106, New Haven, CT 06520, USA
- Thomas J Near
  - Department of Ecology and Evolutionary Biology, Yale University, P.O. Box 208106, New Haven, CT 06520, USA
  - Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA
|
23
|
Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization. Bioinformatics and Phylogenetics 2019. [DOI: 10.1007/978-3-030-10837-3_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/12/2022]
|
24
|
Blischak PD, Chifman J, Wolfe AD, Kubatko LS. HyDe: A Python Package for Genome-Scale Hybridization Detection. Syst Biol 2018; 67:821-829. [PMID: 29562307 DOI: 10.1093/sysbio/syy023] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 09/14/2017] [Accepted: 03/15/2018] [Indexed: 11/13/2022]
Abstract
The analysis of hybridization and gene flow among closely related taxa is a common goal for researchers studying speciation and phylogeography. Many methods for hybridization detection use simple site pattern frequencies from observed genomic data and compare them to null models that predict an absence of gene flow. The theory underlying the detection of hybridization using these site pattern probabilities exploits the relationship between the coalescent process for gene trees within population trees and the process of mutation along the branches of the gene trees. For certain models, site patterns are predicted to occur in equal frequency (i.e., their difference is 0), producing a set of functions called phylogenetic invariants. In this article, we introduce HyDe, a software package for detecting hybridization using phylogenetic invariants arising under the coalescent model with hybridization. HyDe is written in Python and can be used interactively or through the command line using pre-packaged scripts. We demonstrate the use of HyDe on simulated data, as well as on two empirical data sets from the literature. We focus in particular on identifying individual hybrids within population samples and on distinguishing between hybrid speciation and gene flow. HyDe is freely available as an open source Python package under the GNU GPL v3 on both GitHub (https://github.com/pblischak/HyDe) and the Python Package Index (PyPI: https://pypi.python.org/pypi/phyde).
Affiliation(s)
- Paul D Blischak
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Julia Chifman
  - Department of Mathematics and Statistics, American University, Washington, DC 20016, USA
- Andrea D Wolfe
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Laura S Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
|
25
|
Abstract
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Affiliation(s)
- Luay Nakhleh
  - Computer Science
  - BioSciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
26
|
Tang Q, Edwards SV, Rheindt FE. Rapid diversification and hybridization have shaped the dynamic history of the genus Elaenia. Mol Phylogenet Evol 2018; 127:522-533. [DOI: 10.1016/j.ympev.2018.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/11/2017] [Revised: 04/11/2018] [Accepted: 05/08/2018] [Indexed: 01/04/2023]
|
27
|
Folk RA, Visger CJ, Soltis PS, Soltis DE, Guralnick RP. Geographic Range Dynamics Drove Ancient Hybridization in a Lineage of Angiosperms. Am Nat 2018; 192:171-187. [DOI: 10.1086/698120] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/03/2022]
|
28
|
Van Iersel L, Jones M, Scornavacca C. Improved Maximum Parsimony Models for Phylogenetic Networks. Syst Biol 2018; 67:518-542. [PMID: 29272537 DOI: 10.1093/sysbio/syx094] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/03/2017] [Accepted: 12/11/2017] [Indexed: 11/13/2022] Open
Abstract
Phylogenetic networks are well suited to represent evolutionary histories comprising reticulate evolution. Several methods aiming at reconstructing explicit phylogenetic networks have been developed in the last two decades. In this article, we propose a new definition of maximum parsimony for phylogenetic networks that makes it possible to model biological scenarios that cannot be modeled by the definitions currently present in the literature (namely, the "hardwired" and "softwired" parsimony). Building on this new definition, we provide several algorithmic results that lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.
Affiliation(s)
- Leo Van Iersel
  - Delft Institute of Applied Mathematics, Delft University of Technology, P.O. Box 5, 2600 AA Delft, the Netherlands
- Mark Jones
  - Delft Institute of Applied Mathematics, Delft University of Technology, P.O. Box 5, 2600 AA Delft, the Netherlands
- Celine Scornavacca
  - Institut des Sciences de l'Évolution Université de Montpellier, CNRS, IRD, EPHE CC 064, Place Eugène Bataillon 34095 Montpellier Cedex 05, France; Institut de Biologie Computationnelle (IBC), Montpellier, France
|
29
|
Gregg WCT, Ather SH, Hahn MW. Gene-Tree Reconciliation with MUL-Trees to Resolve Polyploidy Events. Syst Biol 2018; 66:1007-1018. [PMID: 28419377 DOI: 10.1093/sysbio/syx044] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 08/18/2016] [Accepted: 03/30/2017] [Indexed: 11/13/2022] Open
Abstract
Polyploidy can have a huge impact on the evolution of species, and it is a common occurrence, especially in plants. The two types of polyploids (autopolyploids and allopolyploids) differ in the level of divergence between the genes that are brought together in the new polyploid lineage. Because allopolyploids are formed via hybridization, the homoeologous copies of genes within them are at least as divergent as orthologs in the parental species that came together to form them. This means that common methods for estimating the parental lineages of allopolyploidy events are not accurate, and can lead to incorrect inferences about the number of gene duplications and losses. Here, we have adapted an algorithm for topology-based gene-tree reconciliation to work with multi-labeled trees (MUL-trees). By definition, MUL-trees have some tips with identical labels, which makes them a natural representation of the genomes of polyploids. Using this new reconciliation algorithm we can: accurately place allopolyploidy events on a phylogeny, identify the parental lineages that hybridized to form allopolyploids, distinguish between allo-, auto-, and (in most cases) no polyploidy, and correctly count the number of duplications and losses in a set of gene trees. We validate our method using gene trees simulated with and without polyploidy, and revisit the history of polyploidy in data from the clades including both baker's yeast and bread wheat. Our re-analysis of the yeast data confirms the allopolyploid origin and parental lineages previously identified for this group. The method presented here should find wide use in the growing number of genomes from species with a history of polyploidy. [Polyploidy; reconciliation; whole-genome duplication.]
Affiliation(s)
- W C Thomas Gregg
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
- S Hussain Ather
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
|
30
|
Sousa F, Bertrand YJK, Doyle JJ, Oxelman B, Pfeil BE. Using Genomic Location and Coalescent Simulation to Investigate Gene Tree Discordance in Medicago L. Syst Biol 2018; 66:934-949. [PMID: 28177088 DOI: 10.1093/sysbio/syx035] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/05/2015] [Accepted: 02/01/2017] [Indexed: 12/28/2022] Open
Abstract
Several well-documented evolutionary processes are known to cause conflict between species-level phylogenies and gene-level phylogenies. Three of the most challenging processes for species tree inference are incomplete lineage sorting, hybridization and gene duplication, which may result in unwarranted comparisons of paralogous genes. Several existing methods have dealt with these processes but none has yet been able to untangle all three at once. Here, we propose a stepwise method by which these processes can be discerned using information on genomic location coupled with coalescent simulations. In the first step, highly discordant genes within genomic blocks (putative paralogs) are identified and excluded from the data set and, in the second step, blocks of linked genes are grouped according to their hybrid history. Existing multispecies coalescent software can then be applied to recover the principal tree(s) that make up the species tree/network without violating the underlying model. The potential of the approach is evaluated on simulated data derived from a species network composed of nine species, of which one is of hybrid origin, and displaying a single-gene duplication that leads to paralogous comparisons. We apply our method to an empirical set of 12 genes from 7 species sampled in the plant genus Medicago that display phylogenetic discordance. We identify the causes of the discordance and demonstrate that the Medicago orbicularis lineage experienced an episode of ancient hybridization. Our results show promise as a new way to explore phylogenetic sequence data that can significantly improve species tree inference in presence of hybridization and undetected paralogy or other causes leading to extremely discordant gene trees. [Coalescent simulation; gene tree; genomic location; hybridization; incomplete lineage sorting; paralogy; phylogenetic incongruence; principal tree; species tree.].
Affiliation(s)
- F Sousa
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
- Y J K Bertrand
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
- J J Doyle
  - Department of Plant Biology, Cornell University, 404 Mann Library Building, Ithaca, NY 14853, USA
- B Oxelman
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
- B E Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 40530 Gothenburg, Sweden
|
31
|
Nadeau NJ, Kawakami T. Population Genomics of Speciation and Admixture. Population Genomics 2018. [DOI: 10.1007/13836_2018_24] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
|
32
|
Wen D, Nakhleh L. Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data. Syst Biol 2017; 67:439-457. [DOI: 10.1093/sysbio/syx085] [Citation(s) in RCA: 90] [Impact Index Per Article: 12.9] [Received: 04/24/2017] [Accepted: 10/24/2017] [Indexed: 11/13/2022] Open
Affiliation(s)
- Luay Nakhleh
  - Department of Computer Science
  - Department of BioSciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
33
|
Pabijan M, Zieliński P, Dudek K, Stuglik M, Babik W. Isolation and gene flow in a speciation continuum in newts. Mol Phylogenet Evol 2017; 116:1-12. [PMID: 28797693 DOI: 10.1016/j.ympev.2017.08.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 05/12/2017] [Revised: 08/04/2017] [Accepted: 08/06/2017] [Indexed: 02/06/2023]
Abstract
Because reproductive isolation often evolves gradually, differentiating lineages may retain the potential for genetic exchange for prolonged periods, providing an opportunity to quantify and to understand the fundamental role of gene flow during speciation. Here we delimit evolutionary lineages, reconstruct the phylogeny and infer gene flow in newts of the Lissotriton vulgaris species complex based on 74 nuclear markers sampled from 127 localities. We demonstrate that distinct lineages along the speciation continuum in newts exchange nontrivial amounts of genes, affecting their evolutionary trajectories. By integrating a wide array of methods, we delimit nine evolutionary lineages and show that two principal factors have driven their genetic differentiation: time since the last common ancestor determining levels of shared ancestral polymorphism, and shifts in geographic distributions determining the extent of secondary contact. Post-divergence gene flow, indicative of evolutionary non-independence, has been most extensive in Central Europe, while four southern European lineages have acquired the population-genetic hallmarks of independent species (L. graecus, L. kosswigi, L. lantzi, L. schmidtleri). We obtained strong statistical support for widespread mtDNA introgression following secondary contact, previously suggested by discordance between mtDNA phylogeny and morphology. Our study reveals long-term evolutionary persistence of evolutionary lineages that may periodically exchange genes with one another: although some of these lineages may become extinct or fuse, others will acquire complete reproductive isolation and will carry signatures of this complex history in their genomes.
Affiliation(s)
- Maciej Pabijan
  - Institute of Environmental Sciences, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
- Piotr Zieliński
  - Institute of Environmental Sciences, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
- Katarzyna Dudek
  - Institute of Environmental Sciences, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
- Michał Stuglik
  - Institute of Environmental Sciences, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland; Scotland's Rural College, Integrative Animal Sciences, Easter Bush Campus, Midlothian EH25 9RG, Scotland, UK
- Wiesław Babik
  - Institute of Environmental Sciences, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
|
34
|
Kamneva OK, Syring J, Liston A, Rosenberg NA. Evaluating allopolyploid origins in strawberries (Fragaria) using haplotypes generated from target capture sequencing. BMC Evol Biol 2017; 17:180. [PMID: 28778145 PMCID: PMC5543553 DOI: 10.1186/s12862-017-1019-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 01/11/2017] [Accepted: 07/25/2017] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Hybridization is observed in many eukaryotic lineages and can lead to the formation of polyploid species. The study of hybridization and polyploidization faces challenges both in data generation and in accounting for population-level phenomena such as coalescence processes in phylogenetic analysis. Genus Fragaria is one example of a set of plant taxa in which a range of ploidy levels is observed across species, but phylogenetic origins are unknown. RESULTS Here, using 20 diploid and polyploid Fragaria species, we combine approaches from NGS data analysis and phylogenetics to infer evolutionary origins of polyploid strawberries, taking into account coalescence processes. We generate haplotype sequences for 257 low-copy nuclear markers assembled from Illumina target capture sequence data. We then identify putative hybridization events by analyzing gene tree topologies, and further test predicted hybridizations in a coalescence framework. This approach confirms the allopolyploid ancestry of F. chiloensis and F. virginiana, and provides new allopolyploid ancestry hypotheses for F. iturupensis, F. moschata, and F. orientalis. Evidence of gene flow between diploids F. bucharica and F. vesca is also detected, suggesting that it might be appropriate to consider these groups as conspecifics. CONCLUSIONS This study is one of the first in which target capture sequencing followed by computational deconvolution of individual haplotypes is used for tracing origins of polyploid taxa. The study also provides new perspectives on the evolutionary history of Fragaria.
Affiliation(s)
- Olga K Kamneva
  - Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305, USA
- John Syring
  - Department of Biology, Linfield College, McMinnville, OR, 97128, USA
- Aaron Liston
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, 97331, USA
- Noah A Rosenberg
  - Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305, USA
|
35
|
Rogers J, Fishberg A, Youngs N, Wu YC. Reconciliation feasibility in the presence of gene duplication, loss, and coalescence with multiple individuals per species. BMC Bioinformatics 2017; 18:292. [PMID: 28583091 PMCID: PMC5460407 DOI: 10.1186/s12859-017-1701-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/27/2016] [Accepted: 05/22/2017] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND In phylogenetics, we often seek to reconcile gene trees with species trees within the framework of an evolutionary model. While the most popular models for eukaryotic species allow for only gene duplication and gene loss or only multispecies coalescence, recent work has combined these phenomena through a reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes the duplication-loss and coalescent history of a gene family. However, the LCT makes the simplifying assumption that only one individual is sampled per species whereas, with advances in gene sequencing, we now have access to multiple samples per species. RESULTS We demonstrate that with these additional samples, there exist gene tree topologies that are impossible to reconcile with any species tree. In particular, the multiple samples enforce new constraints on the placement of duplications within a valid reconciliation. To model these constraints, we extend the LCT to a new structure, the partially labeled coalescent tree (PLCT) and demonstrate how to use the PLCT to evaluate the feasibility of a gene tree topology. We apply our algorithm to two clades of apes and flies to characterize possible sources of infeasibility. CONCLUSION Going forward, we believe that this model represents a first step towards understanding reconciliations in duplication-loss-coalescence models with multiple samples per species.
Affiliation(s)
- Jennifer Rogers
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Andrew Fishberg
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Nora Youngs
  - Department of Mathematics, Harvey Mudd College, Claremont, 91711, California, USA
  - Current Address: Department of Mathematics and Statistics, Colby College, Waterville, 04901, Maine, USA
- Yi-Chieh Wu
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
|
36
|
Leavitt DH, Marion AB, Hollingsworth BD, Reeder TW. Multilocus phylogeny of alligator lizards (Elgaria, Anguidae): Testing mtDNA introgression as the source of discordant molecular phylogenetic hypotheses. Mol Phylogenet Evol 2017; 110:104-121. [DOI: 10.1016/j.ympev.2017.02.010] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 07/22/2016] [Revised: 02/08/2017] [Accepted: 02/12/2017] [Indexed: 12/25/2022]
|
37
|
Bordewich M, Linz S, Semple C. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks. J Theor Biol 2017; 423:1-12. [PMID: 28414085 DOI: 10.1016/j.jtbi.2017.03.032] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/20/2016] [Revised: 03/18/2017] [Accepted: 03/20/2017] [Indexed: 10/19/2022]
Abstract
Over the last fifteen years, phylogenetic networks have become a popular tool to analyse relationships between species whose past includes reticulation events such as hybridisation or horizontal gene transfer. However, the space of phylogenetic networks is significantly larger than that of phylogenetic trees, and how to analyse and search this enlarged space remains a poorly understood problem. Inspired by the widely-used rooted subtree prune and regraft (rSPR) operation on rooted phylogenetic trees, we propose a new operation, called subnet prune and regraft (SNPR), that induces a metric on the space of all rooted phylogenetic networks on a fixed set of leaves. We show that the spaces of several popular classes of rooted phylogenetic networks (e.g. tree child, reticulation visible, and tree based) are connected under SNPR and that connectedness remains for the subclasses of these networks with a fixed number of reticulations. Lastly, we bound the distance between two rooted phylogenetic networks under the SNPR operation, show that it is computationally hard to compute this distance exactly, and analyse how the SNPR-distance between two such networks relates to the rSPR-distance between rooted phylogenetic trees that are embedded in these networks.
Affiliation(s)
- Magnus Bordewich
  - School of Engineering and Computing Sciences, Durham University, Durham DH1 3LE, United Kingdom
- Simone Linz
  - Department of Computer Science, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
- Charles Semple
  - School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand
|
38
|
Kamneva OK, Rosenberg NA. Simulation-Based Evaluation of Hybridization Network Reconstruction Methods in the Presence of Incomplete Lineage Sorting. Evol Bioinform Online 2017; 13:1176934317691935. [PMID: 28469378 PMCID: PMC5395256 DOI: 10.1177/1176934317691935] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/11/2016] [Accepted: 01/11/2017] [Indexed: 11/22/2022] Open
Abstract
Hybridization events generate reticulate species relationships, giving rise to species networks rather than species trees. We report a comparative study of consensus, maximum parsimony, and maximum likelihood methods of species network reconstruction using gene trees simulated assuming a known species history. We evaluate the role of the divergence time between species involved in a hybridization event, the relative contributions of the hybridizing species, and the error in gene tree estimation. When gene tree discordance is mostly due to hybridization and not due to incomplete lineage sorting (ILS), most of the methods can detect even highly skewed hybridization events between highly divergent species. For recent divergences between hybridizing species, when the influence of ILS is sufficiently high, likelihood methods outperform parsimony and consensus methods, which erroneously identify extra hybridizations. The more sophisticated likelihood methods, however, are affected by gene tree errors to a greater extent than are consensus and parsimony.
Affiliation(s)
- Olga K Kamneva
  - Department of Biology, Stanford University, Stanford, CA, USA
|
39
|
Smith JF, Clark JL, Amaya-Márquez M, Marín-Gómez OH. Resolving incongruence: Species of hybrid origin in Columnea (Gesneriaceae). Mol Phylogenet Evol 2017; 106:228-240. [DOI: 10.1016/j.ympev.2016.10.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/12/2016] [Revised: 09/30/2016] [Accepted: 10/03/2016] [Indexed: 01/19/2023]
|
40
|
Distribution of coalescent histories under the coalescent model with gene flow. Mol Phylogenet Evol 2016; 105:177-192. [DOI: 10.1016/j.ympev.2016.08.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/13/2016] [Revised: 08/16/2016] [Accepted: 08/31/2016] [Indexed: 12/19/2022]
|
41
|
Oberprieler C, Wagner F, Tomasello S, Konowalik K. A permutation approach for inferring species networks from gene trees in polyploid complexes by minimising deep coalescences. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12694] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/26/2022]
Affiliation(s)
- Christoph Oberprieler
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Universitätsstr. 31, D‐93053 Regensburg, Germany
- Florian Wagner
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Universitätsstr. 31, D‐93053 Regensburg, Germany
- Salvatore Tomasello
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Universitätsstr. 31, D‐93053 Regensburg, Germany
  - Systematic Botany and Mycology, Department of Biology, Ludwig‐Maximilians‐University Munich (LMU), Menzingerstr. 67, D‐80638 Munich, Germany
- Kamil Konowalik
  - Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Universitätsstr. 31, D‐93053 Regensburg, Germany
|
42
|
Hejase HA, Liu KJ. A scalability study of phylogenetic network inference methods using empirical datasets and simulations involving a single reticulation. BMC Bioinformatics 2016; 17:422. [PMID: 27737628 PMCID: PMC5064893 DOI: 10.1186/s12859-016-1277-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 02/17/2016] [Accepted: 09/22/2016] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND Branching events in phylogenetic trees reflect bifurcating and/or multifurcating speciation and splitting events. In the presence of gene flow, a phylogeny cannot be described by a tree but is instead a directed acyclic graph known as a phylogenetic network. Both phylogenetic trees and networks are typically reconstructed using computational analysis of multi-locus sequence data. The advent of high-throughput sequencing technologies has brought about two main scalability challenges: (1) dataset size in terms of the number of taxa and (2) the evolutionary divergence of the taxa in a study. The impact of both dimensions of scale on phylogenetic tree inference has been well characterized by recent studies; in contrast, the scalability limits of phylogenetic network inference methods are largely unknown. RESULTS In this study, we quantify the performance of state-of-the-art phylogenetic network inference methods on large-scale datasets using empirical data sampled from natural mouse populations and a range of simulations using model phylogenies with a single reticulation. We find that, as in the case of phylogenetic tree inference, the performance of leading network inference methods is negatively impacted by both dimensions of dataset scale. In general, we found that topological accuracy degrades as the number of taxa increases; a similar effect was observed with increased sequence mutation rate. The most accurate methods were probabilistic inference methods which maximize either likelihood under coalescent-based models or pseudo-likelihood approximations to the model likelihood. The improved accuracy obtained with probabilistic inference methods comes at a computational cost in terms of runtime and main memory usage, which become prohibitive as dataset size grows past twenty-five taxa. None of the probabilistic methods completed analyses of datasets with 30 taxa or more after many weeks of CPU runtime. 
CONCLUSIONS We conclude that the state of the art of phylogenetic network inference lags well behind the scope of current phylogenomic studies. New algorithmic development is critically needed to address this methodological gap.
Affiliation(s)
- Hussein A. Hejase
  - Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, MI, USA
- Kevin J. Liu
  - Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, MI, USA
|
43
|
Suh A. The phylogenomic forest of bird trees contains a hard polytomy at the root of Neoaves. Zool Scr 2016. [DOI: 10.1111/zsc.12213] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Indexed: 12/25/2022]
Affiliation(s)
- Alexander Suh
  - Department of Evolutionary Biology, Evolutionary Biology Centre (EBC), Uppsala University, SE‐752 36 Uppsala, Sweden
|
44
|
Payseur BA, Rieseberg LH. A genomic perspective on hybridization and speciation. Mol Ecol 2016; 25:2337-60. [PMID: 26836441 PMCID: PMC4915564 DOI: 10.1111/mec.13557] [Citation(s) in RCA: 292] [Impact Index Per Article: 36.5] [Received: 11/12/2015] [Revised: 01/18/2016] [Accepted: 01/25/2016] [Indexed: 12/13/2022]
Abstract
Hybridization among diverging lineages is common in nature. Genomic data provide a special opportunity to characterize the history of hybridization and the genetic basis of speciation. We review existing methods and empirical studies to identify recent advances in the genomics of hybridization, as well as issues that need to be addressed. Notable progress has been made in the development of methods for detecting hybridization and inferring individual ancestries. However, few approaches reconstruct the magnitude and timing of gene flow, estimate the fitness of hybrids or incorporate knowledge of recombination rate. Empirical studies indicate that the genomic consequences of hybridization are complex, including a highly heterogeneous landscape of differentiation. Inferred characteristics of hybridization differ substantially among species groups. Loci showing unusual patterns - which may contribute to reproductive barriers - are usually scattered throughout the genome, with potential enrichment in sex chromosomes and regions of reduced recombination. We caution against the growing trend of interpreting genomic variation in summary statistics across genomes as evidence of differential gene flow. We argue that converting genomic patterns into useful inferences about hybridization will ultimately require models and methods that directly incorporate key ingredients of speciation, including the dynamic nature of gene flow, selection acting in hybrid populations and recombination rate variation.
Affiliation(s)
- Bret A. Payseur
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Loren H. Rieseberg
  - Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
|
45
|
Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent. PLoS Genet 2016; 12:e1006006. [PMID: 27144273 PMCID: PMC4856265 DOI: 10.1371/journal.pgen.1006006] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 10/11/2015] [Accepted: 04/04/2016] [Indexed: 11/19/2022] Open
Abstract
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.
Trees have long formed in biology the basic structure with which to represent and understand evolutionary relationships. Mathematical models, computational methods, and software tools for inferring phylogenetic trees and studying their mathematical properties are currently the norm in biology. The availability of genomic data from closely related species, as well as from multiple individuals within species, has brought the two fields of phylogenetics and population genetics closer than ever. In particular, the last two decades have witnessed a great flourish in the development and implementation of phylogenetic methods based on the multispecies coalescent model to capture the intricate relationship between gene and genome evolution. However, when reticulation processes such as hybridization occur, the phylogenetic history is best represented by a network. In this work, we demonstrate how the multispecies coalescent model can be adapted to reticulate evolutionary histories and report on a Bayesian method for inference of such histories under this extended model. As networks subsume trees, the model and method provide a principled and unified statistical framework for inferring treelike and non-treelike evolutionary relationships.
|
46
|
Wen D, Yu Y, Hahn MW, Nakhleh L. Reticulate evolutionary history and extensive introgression in mosquito species revealed by phylogenetic network analysis. Mol Ecol 2016; 25:2361-72. [PMID: 26808290 DOI: 10.1111/mec.13544] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 10/03/2015] [Revised: 12/15/2015] [Accepted: 01/06/2016] [Indexed: 12/27/2022]
Abstract
The role of hybridization and subsequent introgression has been demonstrated in an increasing number of species. Recently, Fontaine et al. (Science, 347, 2015, 1258524) conducted a phylogenomic analysis of six members of the Anopheles gambiae species complex. Their analysis revealed a reticulate evolutionary history and pointed to extensive introgression on all four autosomal arms. The study further highlighted the complex evolutionary signals that the co-occurrence of incomplete lineage sorting (ILS) and introgression can give rise to in phylogenomic analyses. While tree-based methodologies were used in the study, phylogenetic networks provide a more natural model to capture reticulate evolutionary histories. In this work, we reanalyse the Anopheles data using a recently devised framework that combines the multispecies coalescent with phylogenetic networks. This framework allows us to capture ILS and introgression simultaneously, and forms the basis for statistical methods for inferring reticulate evolutionary histories. The new analysis reveals a phylogenetic network with multiple hybridization events, some of which differ from those reported in the original study. To elucidate the extent and patterns of introgression across the genome, we devise a new method that quantifies the use of reticulation branches in the phylogenetic network by each genomic region. Applying the method to the mosquito data set reveals the evolutionary history of all the chromosomes. This study highlights the utility of 'network thinking' and the new insights it can uncover, in particular in phylogenomic analyses of large data sets with extensive gene tree incongruence.
Affiliation(s)
- Dingqiao Wen: Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Yun Yu: Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Matthew W Hahn: Department of Biology, Indiana University, Bloomington, IN, 47405, USA; School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA
- Luay Nakhleh: Department of Computer Science, Rice University, Houston, TX, 77005, USA; Department of BioSciences, Rice University, Houston, TX, 77005, USA
|
47
|
Solís-Lemus C, Ané C. Inferring Phylogenetic Networks with Maximum Pseudolikelihood under Incomplete Lineage Sorting. PLoS Genet 2016; 12:e1005896. [PMID: 26950302 PMCID: PMC4780787 DOI: 10.1371/journal.pgen.1005896] [Citation(s) in RCA: 231] [Impact Index Per Article: 28.9] [Received: 09/25/2015] [Accepted: 02/03/2016] [Indexed: 11/23/2022] Open
Abstract
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations.
Phylogenetic networks display the evolutionary history of groups of individuals (species or populations) including reticulation events such as hybridization, horizontal gene transfer or migration. Here, we present a likelihood method to learn networks from molecular sequences at multiple genes. Our model accounts for several biological processes: mutations, incomplete lineage sorting of alleles in ancestral populations, and reticulations in the network. The likelihood is decomposed into 4-taxon subsets to make the analyses scale to many species and many genes. Our work makes it possible to learn large phylogenetic networks from large data sets, with a statistical approach and a biologically relevant model.
Affiliation(s)
- Claudia Solís-Lemus: Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Cécile Ané: Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America; Department of Botany, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
|
48
|
Morrison DA. Genealogies: Pedigrees and Phylogenies are Reticulating Networks Not Just Divergent Trees. Evol Biol 2016. [DOI: 10.1007/s11692-016-9376-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 01/01/2023]
|
49
|
Konowalik K, Wagner F, Tomasello S, Vogt R, Oberprieler C. Detecting reticulate relationships among diploid Leucanthemum Mill. (Compositae, Anthemideae) taxa using multilocus species tree reconstruction methods and AFLP fingerprinting. Mol Phylogenet Evol 2015; 92:308-28. [DOI: 10.1016/j.ympev.2015.06.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/22/2015] [Revised: 05/29/2015] [Accepted: 06/02/2015] [Indexed: 12/23/2022]
|
50
|
Abstract
Background: Several phylogenomic analyses have recently demonstrated the need to account simultaneously for incomplete lineage sorting (ILS) and hybridization when inferring a species phylogeny. A maximum likelihood approach was introduced recently for inferring species phylogenies in the presence of both processes, and showed very good results. However, computing the likelihood of a model in this case is computationally infeasible except for very small data sets.
Results: Inspired by recent work on the pseudo-likelihood of species trees based on rooted triples, we introduce the pseudo-likelihood of a phylogenetic network, which, when combined with a search heuristic, provides a statistical method for phylogenetic network inference in the presence of ILS. Unlike trees, networks are not always uniquely encoded by a set of rooted triples. Therefore, even when given sufficient data, the method might converge to a network that is equivalent under rooted triples to the true one, but not the true one itself. The method is computationally efficient and has produced very good results on the data sets we analyzed. The method is implemented in PhyloNet, which is publicly available in open source.
Conclusions: Maximum pseudo-likelihood allows for inferring species phylogenies in the presence of hybridization and ILS, while scaling to much larger data sets than is currently feasible under full maximum likelihood. The nonuniqueness of phylogenetic networks encoded by a system of rooted triples notwithstanding, the proposed method infers the correct network under certain scenarios, and provides candidates for further exploration under other criteria and/or data in other scenarios.
|